WorldWideScience

Sample records for audio feature space

  1. Emotion-based Music Retrieval on a Well-reduced Audio Feature Space

    DEFF Research Database (Denmark)

    Ruxanda, Maria Magdalena; Chua, Bee Yong; Nanopoulos, Alexandros

    2009-01-01

    Music expresses emotion. A number of features extracted from audio influence the perceived emotional expression of music. These audio features generate a high-dimensional space, on which music similarity retrieval can be performed effectively, with respect to human perception of the music......-emotion. However, real-time systems that retrieve music over large music databases can achieve an order-of-magnitude performance increase by applying multidimensional indexing over a dimensionally reduced audio feature space. To meet this performance achievement, in this paper, extensive studies are conducted...... on a number of dimensionality reduction algorithms, including both classic and novel approaches. The paper identifies which dimensionality reduction techniques, on the considered audio feature space, can preserve on average the accuracy of emotion-based music retrieval....

  2. Video salient event classification using audio features

    Science.gov (United States)

    Corchs, Silvia; Ciocca, Gianluigi; Fiori, Massimiliano; Gasparini, Francesca

    2014-03-01

    The aim of this work is to detect the events in video sequences that are salient with respect to the audio signal. In particular, we focus on the audio analysis of a video, with the goal of finding which are the significant features to detect audio-salient events. In our work we have extracted the audio tracks from videos of different sport events. For each video, we have manually labeled the salient audio-events using the binary markings. On each frame, features in both time and frequency domains have been considered. These features have been used to train different classifiers: Classification and Regression Trees, Support Vector Machine, and k-Nearest Neighbor. The classification performances are reported in terms of confusion matrices.
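
As a rough illustration of the evaluation described above, the following sketch trains a k-nearest-neighbour classifier on labelled frames and summarizes its output as a confusion matrix. It is a minimal sketch under stated assumptions, not the authors' pipeline: the function names, the two-dimensional feature vectors and the labels are all hypothetical, and only the Python standard library is used.

```python
from collections import Counter
import math


def knn_predict(train_x, train_y, query, k=3):
    """Label `query` by majority vote among its k nearest training frames."""
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def confusion_matrix(true_labels, predicted_labels, classes=(0, 1)):
    """Count outcomes; rows are true classes, columns are predicted classes."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for t, p in zip(true_labels, predicted_labels):
        matrix[index[t]][index[p]] += 1
    return matrix


# Made-up frame features (hypothetical): label 1 = salient, 0 = not salient.
train_x = [(0.1, 0.2), (0.2, 0.1), (0.15, 0.25), (0.9, 0.8), (0.8, 0.9), (0.85, 0.75)]
train_y = [0, 0, 0, 1, 1, 1]
test_x = [(0.12, 0.18), (0.88, 0.82), (0.2, 0.3), (0.7, 0.9)]
test_y = [0, 1, 0, 1]

predictions = [knn_predict(train_x, train_y, q, k=3) for q in test_x]
matrix = confusion_matrix(test_y, predictions)
print(matrix)
```

The binary labels mirror the binary markings mentioned in the abstract; real frame features would come from the time- and frequency-domain analysis, which is not reproduced here.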

  3. Fall Detection Using Smartphone Audio Features.

    Science.gov (United States)

    Cheffena, Michael

    2016-07-01

    An automated fall detection system based on smartphone audio features is developed. The spectrogram, mel frequency cepstral coefficients (MFCCs), linear predictive coding (LPC), and matching pursuit (MP) features of different fall and no-fall sound events are extracted from experimental data. Based on the extracted audio features, four different machine learning classifiers: k-nearest neighbor classifier (k-NN), support vector machine (SVM), least squares method (LSM), and artificial neural network (ANN) are investigated for distinguishing between fall and no-fall events. For each audio feature, the performance of each classifier in terms of sensitivity, specificity, accuracy, and computational complexity is evaluated. The best performance is achieved using spectrogram features with ANN classifier with sensitivity, specificity, and accuracy all above 98%. The classifier also has acceptable computational requirement for training and testing. The system is applicable in home environments where the phone is placed in the vicinity of the user.
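
The three quality measures named in this abstract follow directly from the counts of a binary confusion matrix. The sketch below uses the standard definitions; the function name and the example counts are hypothetical and are not taken from the paper.

```python
def detection_metrics(tp, fp, tn, fn):
    """Return (sensitivity, specificity, accuracy) for a binary detector.

    tp: falls detected as falls        fn: falls that were missed
    tn: no-fall events left alone      fp: no-fall events flagged as falls
    """
    sensitivity = tp / (tp + fn)                 # share of real falls detected
    specificity = tn / (tn + fp)                 # share of no-fall events rejected
    accuracy = (tp + tn) / (tp + fp + tn + fn)   # share of all events classified correctly
    return sensitivity, specificity, accuracy


# Hypothetical counts, for illustration only.
sens, spec, acc = detection_metrics(tp=48, fp=5, tn=45, fn=2)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} accuracy={acc:.2f}")
```

Reporting all three matters for fall detection because a detector that rarely raises alarms can score high on specificity and accuracy while still missing falls.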

  4. On the Use of Memory Models in Audio Features

    DEFF Research Database (Denmark)

    Jensen, Karl Kristoffer

    2011-01-01

    Audio feature estimation is potentially improved by including higher-level models. One such model is the Short Term Memory (STM) model. A new paradigm of audio feature estimation is obtained by adding the influence of notes in the STM. These notes are identified when the perceptual spectral flux...

  5. Simple Solutions for Space Station Audio Problems

    Science.gov (United States)

    Wood, Eric

    2016-01-01

    Throughout this summer, a number of different projects were supported relating to various NASA programs, including the International Space Station (ISS) and Orion. The primary project that was worked on was designing and testing an acoustic diverter which could be used on the ISS to increase sound pressure levels in Node 1, a module that does not have any Audio Terminal Units (ATUs) inside it. This acoustic diverter is not intended to be a permanent solution to providing audio to Node 1; it is simply intended to improve conditions while more permanent solutions are under development. One of the most exciting aspects of this project is that the acoustic diverter is designed to be 3D printed on the ISS, using the 3D printer that was set up earlier this year. Because of this, no new hardware needs to be sent up to the station, and no extensive hardware testing needs to be performed on the ground before sending it to the station. Instead, the 3D part file can simply be uploaded to the station's 3D printer, where the diverter will be made.

  6. EMOTION ANALYSIS OF SONGS BASED ON LYRICAL AND AUDIO FEATURES

    Directory of Open Access Journals (Sweden)

    Adit Jamdar

    2015-05-01

    In this paper, a method is proposed to detect the emotion of a song based on its lyrical and audio features. Lyrical features are generated by segmentation of lyrics during the process of data extraction. ANEW and WordNet knowledge is then incorporated to compute Valence and Arousal values. In addition to this, linguistic association rules are applied to ensure that the issue of ambiguity is properly addressed. Audio features are used to supplement the lyrical ones and include attributes like energy, tempo, and danceability. These features are extracted from The Echo Nest, a widely used music intelligence platform. Construction of training and test sets is done on the basis of social tags extracted from the last.fm website. The classification is done by applying feature weighting and stepwise threshold reduction on the k-Nearest Neighbors algorithm to provide fuzziness in the classification.

  7. Audio Environment Recognition using Zero Crossing Features and MPEG-7 Descriptors

    OpenAIRE

    Saleh Al-Zhrani; Mubarak AlQahtani

    2010-01-01

    Problem statement: This study investigated zero crossing features and selected MPEG-7 audio descriptors for environment sound recognition applications such as audio forensics. Approach: The study implemented several experiments focusing on the problems of environment recognition from audio, particularly for forensic applications. Results: The effect of the temporal zero crossing feature, as well as of selected MPEG-7 audio low-level descriptors, on environment sound recognition was investigated....

  8. Audio-visual synchrony and feature-selective attention co-amplify early visual processing.

    Science.gov (United States)

    Keitel, Christian; Müller, Matthias M

    2016-05-01

    Our brain relies on neural mechanisms of selective attention and converging sensory processing to efficiently cope with rich and unceasing multisensory inputs. One prominent assumption holds that audio-visual synchrony can act as a strong attractor for spatial attention. Here, we tested for a similar effect of audio-visual synchrony on feature-selective attention. We presented two superimposed Gabor patches that differed in colour and orientation. On each trial, participants were cued to selectively attend to one of the two patches. Over time, spatial frequencies of both patches varied sinusoidally at distinct rates (3.14 and 3.63 Hz), giving rise to pulse-like percepts. A simultaneously presented pure tone carried a frequency modulation at the pulse rate of one of the two visual stimuli to introduce audio-visual synchrony. Pulsed stimulation elicited distinct time-locked oscillatory electrophysiological brain responses. These steady-state responses were quantified in the spectral domain to examine individual stimulus processing under conditions of synchronous versus asynchronous tone presentation and when respective stimuli were attended versus unattended. We found that both, attending to the colour of a stimulus and its synchrony with the tone, enhanced its processing. Moreover, both gain effects combined linearly for attended in-sync stimuli. Our results suggest that audio-visual synchrony can attract attention to specific stimulus features when stimuli overlap in space.

  9. Audio segmentation using Flattened Local Trimmed Range for ecological acoustic space analysis

    Directory of Open Access Journals (Sweden)

    Giovany Vega

    2016-06-01

    The acoustic space in a given environment is filled with footprints arising from three processes: biophony, geophony and anthrophony. Bioacoustic research using passive acoustic sensors can result in thousands of recordings. An important component of processing these recordings is to automate signal detection. In this paper, we describe a new spectrogram-based approach for extracting individual audio events. Spectrogram-based audio event detection (AED) relies on separating the spectrogram into background (i.e., noise) and foreground (i.e., signal) classes using a threshold such as a global threshold, a per-band threshold, or one given by a classifier. These methods are either too sensitive to noise, designed for an individual species, or require prior training data. Our goal is to develop an algorithm that is not sensitive to noise, does not need any prior training data and works with any type of audio event. To do this, we propose: (1) a spectrogram filtering method, the Flattened Local Trimmed Range (FLTR) method, which models the spectrogram as a mixture of stationary and non-stationary energy processes and mitigates the effect of the stationary processes, and (2) an unsupervised algorithm that uses the filter to detect audio events. We measured the performance of the algorithm using a set of six thoroughly validated audio recordings and obtained a sensitivity of 94% and a positive predictive value of 89%. These sensitivity and positive predictive values are very high, given that the validated recordings are diverse and obtained from field conditions. The algorithm was then used to extract audio events in three datasets. Features of these audio events were plotted and showed the unique aspects of the three acoustic communities.

  10. Audio Environment Recognition using Zero Crossing Features and MPEG-7 Descriptors

    Directory of Open Access Journals (Sweden)

    Saleh Al-Zhrani

    2010-01-01

    Problem statement: This study investigated zero crossing features and selected MPEG-7 audio descriptors for environment sound recognition applications such as audio forensics. Approach: The study implemented several experiments focusing on the problems of environment recognition from audio, particularly for forensic applications. Results: The effect of the temporal zero crossing feature, as well as of selected MPEG-7 audio low-level descriptors, on environment sound recognition was investigated. The performance was evaluated against a varying number of training sounds and samples per training file. Conclusion/Recommendations: Experimental results showed that higher recognition accuracy is achieved by increasing the number of training files and by decreasing the number of samples per training file. This study presented an audio environment recognition approach using zero crossing features and MPEG-7 descriptors.
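
The temporal zero crossing feature mentioned here is commonly computed per frame as the fraction of adjacent sample pairs whose signs differ. The sketch below shows that common definition; the function names, frame length and toy signal are hypothetical, and the study's exact variant is not given in the abstract.

```python
def frame_signal(signal, frame_length, hop_length):
    """Split a sample list into fixed-length frames (trailing samples are dropped)."""
    last_start = len(signal) - frame_length
    return [signal[s:s + frame_length] for s in range(0, last_start + 1, hop_length)]


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs in the frame whose signs differ."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


# Toy signal (hypothetical): a rapidly alternating part followed by a flat part.
signal = [1, -1, 1, -1, 1, 1, 1, 1]
rates = [zero_crossing_rate(f) for f in frame_signal(signal, frame_length=4, hop_length=4)]
print(rates)
```

Noisy or high-frequency content tends to produce high rates and low-frequency or steady content low rates, which is what makes such a cheap feature usable for telling sound environments apart.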

  11. Automatic Segmentation of News Items Based on Video and Audio Features

    Institute of Scientific and Technical Information of China (English)

    王伟强; 高文

    2002-01-01

    The automatic segmentation of news items is a key for implementing the automatic cataloging system of news video. This paper presents an approach which manages audio and video feature information to automatically segment news items. The integration of audio and visual analyses can overcome the weakness of the approach using only image analysis techniques. It makes the approach more adaptable to various situations of news items. The proposed approach detects silence segments in accompanying audio, and integrates them with shot segmentation results, as well as anchor shot detection results, to determine the boundaries among news items. Experimental results show that the integration of audio and video features is an effective approach to solving the problem of automatic segmentation of news items.
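
One simple way to picture the silence-based step is short-time energy thresholding followed by an intersection with shot cuts. The sketch below is only that: the energy threshold, frame length, function names and the rule "keep a shot cut if it coincides with a silent audio frame" are all assumptions, and the anchor shot detection used in the paper is omitted.

```python
def short_time_energy(signal, frame_length):
    """Mean squared amplitude of consecutive non-overlapping frames."""
    last_start = len(signal) - frame_length
    return [
        sum(x * x for x in signal[s:s + frame_length]) / frame_length
        for s in range(0, last_start + 1, frame_length)
    ]


def silent_frames(energies, threshold):
    """Indices of frames whose energy falls below the threshold."""
    return [i for i, e in enumerate(energies) if e < threshold]


def candidate_boundaries(shot_cut_frames, silence, tolerance=0):
    """Keep shot cuts lying within `tolerance` frames of a silent audio frame."""
    silence_set = set(silence)
    return [
        cut for cut in shot_cut_frames
        if any((cut + d) in silence_set for d in range(-tolerance, tolerance + 1))
    ]


# Toy audio (hypothetical): loud, near-silent, loud.
signal = [0.5, -0.5, 0.5, -0.5, 0.0, 0.01, 0.0, -0.01, 0.4, -0.4, 0.4, -0.4]
energies = short_time_energy(signal, frame_length=4)
silence = silent_frames(energies, threshold=0.01)
boundaries = candidate_boundaries(shot_cut_frames=[1, 2], silence=silence)
print(silence, boundaries)
```

Here the shot cuts are expressed in audio-frame units for simplicity; a real system would first map video frame indices onto the audio time axis.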

  12. Bimodal Log-linear Regression for Fusion of Audio and Visual Features

    NARCIS (Netherlands)

    Rudovic, Ognjen; Petridis, Stavros; Pantic, Maja

    2013-01-01

    One of the most commonly used audiovisual fusion approaches is feature-level fusion where the audio and visual features are concatenated. Although this approach has been successfully used in several applications, it does not take into account interactions between the features, which can be a problem

  13. An Analysis of Audio Features to Develop a Human Activity Recognition Model Using Genetic Algorithms, Random Forests, and Neural Networks

    Directory of Open Access Journals (Sweden)

    Carlos E. Galván-Tejada

    2016-01-01

    This work presents a human activity recognition (HAR) model based on audio features. The use of sound as an information source for HAR models represents a challenge because sound wave analyses generate very large amounts of data. However, feature selection techniques may reduce the amount of data required to represent an audio signal sample. Some of the audio features that were analyzed include Mel-frequency cepstral coefficients (MFCC). Although MFCC are commonly used in voice and instrument recognition, their utility within HAR models is yet to be confirmed, and this work validates their usefulness. Additionally, statistical features were extracted from the audio samples to generate the proposed HAR model. The amount of information needed to build a HAR model directly affects the accuracy of the model. This problem was also tackled in the present work; our results indicate that we are capable of recognizing a human activity with an accuracy of 85% using the proposed HAR model. This means that minimum computational costs are needed, thus allowing portable devices to identify human activities using audio as an information source.

  14. Integrating Audio-Visual Features and Text Information for Story Segmentation of News Video

    Institute of Scientific and Technical Information of China (English)

    Liu Hua-yong; Zhou Dong-ru

    2003-01-01

    Video data are composed of multimodal information streams including visual, auditory and textual streams, so an approach of story segmentation for news video using multimodal analysis is described in this paper. The proposed approach detects the topic-caption frames, and integrates them with silence clips detection results, as well as shot segmentation results, to locate the news story boundaries. The integration of audio-visual features and text information overcomes the weakness of the approach using only image analysis techniques. On test data with 135,400 frames, the boundaries between news stories are detected with an accuracy rate of 85.8% and a recall rate of 97.5%. The experimental results show the approach is valid and robust.

  16. Online Learning for Classification of Low-rank Representation Features and Its Applications in Audio Segment Classification

    CERN Document Server

    Shi, Ziqiang; Zheng, Tieran; Deng, Shiwen

    2011-01-01

    In this paper, a novel framework based on trace norm minimization for audio segment classification is proposed. In this framework, both the feature extraction and classification are obtained by solving corresponding convex optimization problems with trace norm regularization. For feature extraction, robust principal component analysis (robust PCA) via minimization of a combination of the nuclear norm and the $\ell_1$-norm is used to extract low-rank features which are robust to white noise and gross corruption for audio segments. These low-rank features are fed to a linear classifier where the weight and bias are learned by solving similar trace norm constrained problems. For this classifier, most methods find the weight and bias in batch-mode learning, which makes them inefficient for large-scale problems. In this paper, we propose an online framework using accelerated proximal gradient method. This framework has a main advantage in memory cost. In addition, as a result of the regularization formulation of matrix classificatio...
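
For orientation, robust PCA is usually written as the convex program below (often called principal component pursuit). The symbols are assumptions introduced here, not taken from the abstract: $X$ is the matrix of audio segment features, $L$ its low-rank part, $S$ a sparse corruption term, and $\lambda > 0$ a trade-off weight. The paper's exact formulation may differ, for example by relaxing the equality constraint to tolerate white noise.

```latex
\min_{L,\,S} \; \|L\|_{*} + \lambda \, \|S\|_{1}
\quad \text{subject to} \quad X = L + S
```

Here $\|L\|_{*}$ is the nuclear (trace) norm, the sum of the singular values of $L$, and $\|S\|_{1}$ is the sum of the absolute values of the entries of $S$.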

  17. Intelligent audio analysis

    CERN Document Server

    Schuller, Björn W

    2013-01-01

    This book provides the reader with the knowledge necessary for comprehension of the field of Intelligent Audio Analysis. It first introduces standard methods and discusses the typical Intelligent Audio Analysis chain going from audio data to audio features to audio recognition. Further, an introduction to audio source separation, enhancement and robustness is given. After the introductory parts, the book shows several applications for the three types of audio: speech, music, and general sound. Each task is briefly introduced, followed by a description of the specific data and methods applied, experiments and results, and a conclusion for this specific task. The book provides benchmark results and standardized test-beds for a broader range of audio analysis tasks. The main focus thereby lies on the parallel advancement of realism in audio analysis, as too often today’s results are overly optimistic owing to idealized testing conditions, and it serves to stimulate synergies arising from transfer of ...

  18. Audio-Visual Classification of Sports Types

    DEFF Research Database (Denmark)

    Gade, Rikke; Abou-Zleikha, Mohamed; Christensen, Mads Græsbøll

    2015-01-01

    In this work we propose a method for classification of sports types from combined audio and visual features extracted from thermal video. From audio, Mel Frequency Cepstral Coefficients (MFCC) are extracted, and PCA is applied to reduce the feature space to 10 dimensions. From the visual modality...... short trajectories are constructed to represent the motion of players. From these, four motion features are extracted and combined directly with audio features for classification. A k-nearest neighbour classifier is applied for classification of 180 1-minute video sequences from three sports types...

  19. Unique features of space reactors

    Science.gov (United States)

    Buden, David

    Space reactors are designed to meet a unique set of requirements; they must be sufficiently compact to be launched in a rocket to their operational location, operate for many years without maintenance and servicing, operate in extreme environments, and reject heat by radiation to space. To meet these restrictions, operating temperatures are much greater than in terrestrial power plants, and the reactors tend to have a fast neutron spectrum. Currently, a new generation of space reactor power plants is being developed. The major effort is in the SP-100 program, where the power plant is being designed for seven years of full power, and no maintenance operation at a reactor outlet operating temperature of 1350 K.

  20. Audio Papers

    DEFF Research Database (Denmark)

    Groth, Sanne Krogh; Samson, Kristine

    2016-01-01

    With this special issue of Seismograf we are happy to present a new format of articles: Audio Papers. Audio papers resemble the regular essay or the academic text in that they deal with a certain topic of interest, but presented in the form of an audio production. The audio paper is an extension...

  1. Introduction to audio analysis a MATLAB approach

    CERN Document Server

    Giannakopoulos, Theodoros

    2014-01-01

    Introduction to Audio Analysis serves as a standalone introduction to audio analysis, providing theoretical background to many state-of-the-art techniques. It covers the essential theory necessary to develop audio engineering applications, but also uses programming techniques, notably MATLAB®, to take a more applied approach to the topic. Basic theory and reproducible experiments are combined to demonstrate theoretical concepts from a practical point of view and provide a solid foundation in the field of audio analysis. Audio feature extraction, audio classification, audio segmentation, au

  2. Audio Twister

    DEFF Research Database (Denmark)

    Cermak, Daniel; Moreno Garcia, Rodrigo; Monastiridis, Stefanos

    2015-01-01

    Daniel Cermak-Sassenrath, Rodrigo Moreno Garcia, Stefanos Monastiridis. Audio Twister. Installation. P-Hack Copenhagen 2015, Copenhagen, DK, Apr 24, 2015.

  3. Learning invariant features through local space contraction

    CERN Document Server

    Rifai, Salah; Glorot, Xavier; Mesnil, Gregoire; Bengio, Yoshua; Vincent, Pascal

    2011-01-01

    We present in this paper a novel approach for training deterministic auto-encoders. We show that by adding a well chosen penalty term to the classical reconstruction cost function, we can achieve results that equal or surpass those attained by other regularized auto-encoders as well as denoising auto-encoders on a range of datasets. This penalty term corresponds to the Frobenius norm of the Jacobian matrix of the encoder activations with respect to the input. We show that this penalty term results in a localized space contraction which in turn yields robust features on the activation layer. Furthermore, we show how this penalty term is related to both regularized auto-encoders and denoising encoders and how it can be seen as a link between deterministic and non-deterministic auto-encoders. We find empirically that this penalty helps to carve a representation that better captures the local directions of variation dictated by the data, corresponding to a lower-dimensional non-linear manifold, while being more i...
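
The penalty described in this abstract, the Frobenius norm of the Jacobian of the encoder activations with respect to the input, has a closed form when the encoder is a single sigmoid layer. The sketch below assumes that form, h = sigmoid(Wx + b), which the abstract does not state; the weights, biases and input are made-up values, and a finite-difference check is included so the closed form can be verified.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def encode(weights, biases, x):
    """Hidden activations h = sigmoid(W x + b) for one input vector x."""
    return [
        sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(weights, biases)
    ]


def contractive_penalty(weights, biases, x):
    """Squared Frobenius norm of dh/dx, using dh_j/dx_i = h_j (1 - h_j) W_ji."""
    h = encode(weights, biases, x)
    return sum(
        (hj * (1.0 - hj)) ** 2 * sum(w * w for w in row)
        for hj, row in zip(h, weights)
    )


def numerical_penalty(weights, biases, x, eps=1e-6):
    """Same quantity estimated with central finite differences."""
    total = 0.0
    for i in range(len(x)):
        plus = list(x)
        minus = list(x)
        plus[i] += eps
        minus[i] -= eps
        h_plus = encode(weights, biases, plus)
        h_minus = encode(weights, biases, minus)
        total += sum(((a - b) / (2 * eps)) ** 2 for a, b in zip(h_plus, h_minus))
    return total


# Hypothetical encoder parameters and input.
W = [[0.5, -0.3], [0.1, 0.8]]
b = [0.0, 0.1]
x = [1.0, 2.0]
print(contractive_penalty(W, b, x), numerical_penalty(W, b, x))
```

In training, this term would be added to the reconstruction cost with a weighting coefficient; it shrinks as hidden units saturate or weights get small, which is the localized contraction the abstract refers to.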

  4. Concept Framework for Audio Information Retrieval: ARF

    Institute of Scientific and Technical Information of China (English)

    LI GuoHui(李国辉); WU DeFeng(武德峰); ZHANG Jun(张军)

    2003-01-01

    Most research on content-based retrieval has focused on visual media. However, audio is also an important medium and information carrier from the viewpoint of human auditory perception, so retrieval over audio collections is needed. Conventional methods handle audio as an opaque stream medium, which is not suitable for information retrieval by its content. In fact, audio carries rich aural information in the form of speech, music, and sound effects, so it can be retrieved based on its aural content, such as acoustic features, musical melodies and associated semantics. In this paper, a concept framework (ARF) for content-based audio retrieval is proposed from a systematic perspective, which describes an audio content model, an audio retrieval architecture and audio query schemes. Audio contents are represented by a hierarchical model and a set of formal descriptions from the physical to the acoustic to the semantic level, which depict the acoustic features, logical structure and semantics of audio and audio objects. The architecture, consisting of an audio meta-database and populating and accessing modules, presents a system structure view of audio information retrieval. The query schemes give generalized approaches and modes concerning how users deliver audio information needs to audio collections. Finally, an implemented audio retrieval example is used to explain and specify the application of the components in the proposed ARF.

  5. Categorizing Video Game Audio

    DEFF Research Database (Denmark)

    Westerberg, Andreas Rytter; Schoenau-Fog, Henrik

    2015-01-01

    This paper dives into the subject of video game audio and how it can be categorized in order to deliver a message to a player in the most precise way. A new categorization, with a new take on the diegetic spaces, can be used as a tool of inspiration for sound- and game-designers to rethink how they ...

  6. Searching Fragment Spaces with feature trees.

    Science.gov (United States)

    Lessel, Uta; Wellenzohn, Bernd; Lilienthal, Markus; Claussen, Holger

    2009-02-01

    Virtual combinatorial chemistry easily produces billions of compounds, for which conventional virtual screening cannot be performed even with the fastest methods available. An efficient solution for such a scenario is the generation of Fragment Spaces, which encode huge numbers of virtual compounds by their fragments/reagents and rules of how to combine them. Similarity-based searches can be performed in such spaces without ever fully enumerating all virtual products. Here we describe the generation of a huge Fragment Space encoding about 5 × 10^11 compounds based on established in-house synthesis protocols for combinatorial libraries, i.e., we encode practically evaluated combinatorial chemistry protocols in a machine readable form, rendering them accessible to in silico search methods. We show how such searches in this Fragment Space can be integrated as a first step in an overall workflow. It reduces the extremely huge number of virtual products by several orders of magnitude so that the resulting list of molecules becomes more manageable for further more elaborated and time-consuming analysis steps. Results of a case study are presented and discussed, which lead to some general conclusions for an efficient expansion of the chemical space to be screened in pharmaceutical companies.

  7. Efficient audio signal processing for embedded systems

    Science.gov (United States)

    Chiu, Leung Kin

    As mobile platforms continue to pack on more computational power, electronics manufacturers start to differentiate their products by enhancing the audio features. However, consumers also demand smaller devices that could operate for a longer time, hence imposing design constraints. In this research, we investigate two design strategies that would allow us to efficiently process audio signals on embedded systems such as mobile phones and portable electronics. In the first strategy, we exploit properties of the human auditory system to process audio signals. We designed a sound enhancement algorithm to make piezoelectric loudspeakers sound "richer" and "fuller." Piezoelectric speakers have a small form factor but exhibit poor response in the low-frequency region. In the algorithm, we combine psychoacoustic bass extension and dynamic range compression to improve the perceived bass coming out from the tiny speakers. We also developed an audio energy reduction algorithm for loudspeaker power management. The perceptually transparent algorithm extends the battery life of mobile devices and prevents thermal damage in speakers. This method is similar to audio compression algorithms, which encode audio signals in such a way that the compression artifacts are not easily perceivable. Instead of reducing the storage space, however, we suppress the audio contents that are below the hearing threshold, therefore reducing the signal energy. In the second strategy, we use low-power analog circuits to process the signal before digitizing it. We designed an analog front-end for sound detection and implemented it on a field programmable analog array (FPAA). The system is an example of an analog-to-information converter. The sound classifier front-end can be used in a wide range of applications because programmable floating-gate transistors are employed to store classifier weights. Moreover, we incorporated a feature selection algorithm to simplify the analog front-end. A machine

  8. Multipurpose audio watermarking algorithm

    Institute of Scientific and Technical Information of China (English)

    Ning CHEN; Jie ZHU

    2008-01-01

    To make audio watermarking accomplish both copyright protection and content authentication with localization, a novel multipurpose audio watermarking scheme is proposed in this paper. The zero-watermarking idea is introduced into the design of the robust watermarking algorithm to ensure transparency and to avoid interference between the robust watermark and the semi-fragile watermark. The property of natural audio that the VQ indices of DWT-DCT coefficients among neighboring frames tend to be very similar is utilized to extract essential features from the host audio, which are then used for watermark extraction. The chaotic-mapping-based semi-fragile watermark is embedded in the detail wavelet coefficients based on the instantaneous mixing model of the independent component analysis (ICA) system. Both the robust and semi-fragile watermarks can be extracted blindly and the semi-fragile watermarking algorithm can localize the tampering accurately. Simulation results demonstrate the effectiveness of our algorithm in terms of transparency, security, robustness and tampering localization ability.

  9. pyAudioAnalysis: An Open-Source Python Library for Audio Signal Analysis.

    Directory of Open Access Journals (Sweden)

    Theodoros Giannakopoulos

    Audio information plays a rather important role in the increasing digital content that is available today, resulting in a need for methodologies that automatically analyze such content: audio event recognition for home automations and surveillance systems, speech recognition, music information retrieval, multimodal analysis (e.g. audio-visual analysis of online videos for content-based recommendation), etc. This paper presents pyAudioAnalysis, an open-source Python library that provides a wide range of audio analysis procedures including: feature extraction, classification of audio signals, supervised and unsupervised segmentation and content visualization. pyAudioAnalysis is licensed under the Apache License and is available at GitHub (https://github.com/tyiannak/pyAudioAnalysis/). Here we present the theoretical background behind the wide range of the implemented methodologies, along with evaluation metrics for some of the methods. pyAudioAnalysis has been already used in several audio analysis research applications: smart-home functionalities through audio event detection, speech emotion recognition, depression classification based on audio-visual features, music segmentation, multimodal content-based movie recommendation and health applications (e.g. monitoring eating habits). The feedback provided from all these particular audio applications has led to practical enhancement of the library.

  10. pyAudioAnalysis: An Open-Source Python Library for Audio Signal Analysis.

    Science.gov (United States)

    Giannakopoulos, Theodoros

    2015-01-01

    Audio information plays a rather important role in the increasing digital content that is available today, resulting in a need for methodologies that automatically analyze such content: audio event recognition for home automations and surveillance systems, speech recognition, music information retrieval, multimodal analysis (e.g. audio-visual analysis of online videos for content-based recommendation), etc. This paper presents pyAudioAnalysis, an open-source Python library that provides a wide range of audio analysis procedures including: feature extraction, classification of audio signals, supervised and unsupervised segmentation and content visualization. pyAudioAnalysis is licensed under the Apache License and is available at GitHub (https://github.com/tyiannak/pyAudioAnalysis/). Here we present the theoretical background behind the wide range of the implemented methodologies, along with evaluation metrics for some of the methods. pyAudioAnalysis has been already used in several audio analysis research applications: smart-home functionalities through audio event detection, speech emotion recognition, depression classification based on audio-visual features, music segmentation, multimodal content-based movie recommendation and health applications (e.g. monitoring eating habits). The feedback provided from all these particular audio applications has led to practical enhancement of the library.

  11. AC-3 audio coder

    Science.gov (United States)

    Todd, Craig

    1995-12-01

    AC-3 is a system for coding up to 5.1 channels of audio into a low bit-rate data stream. High quality may be obtained with compression ratios approaching 12:1 for multichannel audio programs. The high compression ratio is achieved by methods which do not increase decoder memory, and thus cost. The methods employed include: the transmission of a high frequency resolution spectral envelope; and a novel forward/backward adaptive bit allocation algorithm. In order to satisfy practical requirements of an emissions coder, the AC-3 syntax includes a number of features useful to broadcasters and consumers. These features include: loudness uniformity between programs; dynamic range control; and broadcaster control of downmix coefficients. The AC-3 coder has been formally selected for inclusion in the U.S. HDTV broadcast standard, and has been informally selected for several additional applications.

  12. Balancing Audio

    DEFF Research Database (Denmark)

    Walther-Hansen, Mads

    2016-01-01

    This paper explores the concept of balance in music production and examines the role of conceptual metaphors in reasoning about audio editing. Balance may be the most central concept in record production, however, the way we cognitively understand and respond meaningfully to a mix requiring balance...... is not thoroughly understood. In this paper I treat balance as a metaphor that we use to reason about several different actions in music production, such as adjusting levels, editing the frequency spectrum or the spatiality of the recording. This study is based on an exploration of a linguistic corpus of sound...

  13. Tournament screening cum EBIC for feature selection with high-dimensional feature spaces

    Institute of Scientific and Technical Information of China (English)

    2009-01-01

    The feature selection characterized by relatively small sample size and extremely high-dimensional feature space is common in many areas of contemporary statistics. The high dimensionality of the feature space causes serious difficulties: (i) the sample correlations between features become high even if the features are stochastically independent; (ii) the computation becomes intractable. These difficulties make conventional approaches either inapplicable or inefficient. The reduction of dimensionality of the feature space followed by low dimensional approaches appears the only feasible way to tackle the problem. Along this line, we develop in this article a tournament screening cum EBIC approach for feature selection with high dimensional feature space. The procedure of tournament screening mimics that of a tournament. It is shown theoretically that the tournament screening has the sure screening property, a necessary property which should be satisfied by any valid screening procedure. It is demonstrated by numerical studies that the tournament screening cum EBIC approach enjoys desirable properties such as having higher positive selection rate and lower false discovery rate than other approaches.

  14. Multiclass Bayes error estimation by a feature space sampling technique

    Science.gov (United States)

    Mobasseri, B. G.; Mcgillem, C. D.

    1979-01-01

    A general Gaussian M-class N-feature classification problem is defined. An algorithm is developed that requires the class statistics as its only input and computes the minimum probability of error through use of a combined analytical and numerical integration over a sequence of simplifying transformations of the feature space. The results are compared with those obtained by conventional techniques applied to a 2-class 4-feature discrimination problem with results previously reported and 4-class 4-feature multispectral scanner Landsat data classified by training and testing of the available data.

  15. Feature-space transformation improves supervised segmentation across scanners

    DEFF Research Database (Denmark)

    van Opbroek, Annegreet; Achterberg, Hakim C.; de Bruijne, Marleen

    2015-01-01

    Image-segmentation techniques based on supervised classification generally perform well on the condition that training and test samples have the same feature distribution. However, if training and test images are acquired with different scanners or scanning parameters, their feature distributions...... that our feature space transformation improved the Dice overlap of segmentations obtained with an SVM classifier from 0.36 to 0.85 when only 10 atlases were used and from 0.79 to 0.85 when around 100 atlases were used....

  16. Digital audio watermarking fundamentals, techniques and challenges

    CERN Document Server

    Xiang, Yong; Yan, Bin

    2017-01-01

    This book offers comprehensive coverage on the most important aspects of audio watermarking, from classic techniques to the latest advances, from commonly investigated topics to emerging research subdomains, and from the research and development achievements to date, to current limitations, challenges, and future directions. It also addresses key topics such as reversible audio watermarking, audio watermarking with encryption, and imperceptibility control methods. The book sets itself apart from the existing literature in three main ways. Firstly, it not only reviews classical categories of audio watermarking techniques, but also provides detailed descriptions, analysis and experimental results of the latest work in each category. Secondly, it highlights the emerging research topic of reversible audio watermarking, including recent research trends, unique features, and the potentials of this subdomain. Lastly, the joint consideration of audio watermarking and encryption is also reviewed. With the help of this...

  17. Audio Classification from Time-Frequency Texture

    CERN Document Server

    Yu, Guoshen

    2008-01-01

    Time-frequency representations of audio signals often resemble texture images. This paper derives a simple audio classification algorithm based on treating sound spectrograms as texture images. The algorithm is inspired by an earlier visual classification scheme particularly efficient at classifying textures. While solely based on time-frequency texture features, the algorithm achieves surprisingly good performance in musical instrument classification experiments.

  18. Registration of Standardized Histological Images in Feature Space

    CERN Document Server

    Bagci, Ulas; 10.1117/12.770219

    2009-01-01

    In this paper, we propose three novel and important methods for the registration of histological images for 3D reconstruction. First, possible intensity variations and nonstandardness in images are corrected by an intensity standardization process which maps the image scale into a standard scale where the similar intensities correspond to similar tissues meaning. Second, 2D histological images are mapped into a feature space where continuous variables are used as high confidence image features for accurate registration. Third, we propose an automatic best reference slice selection algorithm that improves reconstruction quality based on both image entropy and mean square error of the registration process. We demonstrate that the choice of reference slice has a significant impact on registration error, standardization, feature space and entropy information. After 2D histological slices are registered through an affine transformation with respect to an automatically chosen reference, the 3D volume is reconstruct...

  19. Audio Indexing for Efficiency

    Science.gov (United States)

    Rahnlom, Harold F.; Pedrick, Lillian

    1978-01-01

    This article describes Zimdex, an audio indexing system developed to solve the problem of indexing audio materials for individual instruction in the content area of the mathematics of life insurance. (Author)

  20. Detecting double compression of audio signal

    Science.gov (United States)

    Yang, Rui; Shi, Yun Q.; Huang, Jiwu

    2010-01-01

    MP3 is currently the most popular audio format in daily life; for example, music downloaded from the Internet and files saved in digital recorders are often in MP3 format. However, low-bitrate MP3s are often transcoded to a high bitrate, since high-bitrate files have higher commercial value. Audio recordings in digital recorders can also be doctored easily with pervasive audio editing software. This paper presents two methods for the detection of double MP3 compression. The methods are essential for identifying fake-quality MP3s and for audio forensics. The proposed methods use support vector machine classifiers with feature vectors formed by the distributions of the first digits of the quantized MDCT (modified discrete cosine transform) coefficients. Extensive experiments demonstrate the effectiveness of the proposed methods. To the best of our knowledge, this work is the first to detect double compression of audio signals.
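
    The feature described in this abstract, the distribution of first digits of quantized coefficients, can be sketched as follows. This is only an illustration under stated assumptions: the quantized MDCT coefficients are assumed to be already available as integers (MP3 decoding is not shown), the function name `first_digit_distribution` is hypothetical, and the support vector machine stage is omitted.

```python
def first_digit_distribution(coefficients):
    """Return the relative frequency of leading digits 1..9.

    Hypothetical helper: `coefficients` is any iterable of integer-valued
    quantized coefficients. Zeros have no leading digit and are skipped.
    """
    counts = [0] * 9
    for c in coefficients:
        c = abs(int(c))
        if c == 0:
            continue
        # Strip trailing digits until only the leading digit remains.
        while c >= 10:
            c //= 10
        counts[c - 1] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * 9
    return [n / total for n in counts]
```

    The resulting nine-element vector is the kind of input a classifier could be trained on to separate singly from doubly compressed files.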

  1. Back to basics audio

    CERN Document Server

    Nathan, Julian

    1998-01-01

    Back to Basics Audio is a thorough, yet approachable handbook on audio electronics theory and equipment. The first part of the book discusses electrical and audio principles. Those principles form a basis for understanding the operation of equipment and systems, covered in the second section. Finally, the author addresses planning and installation of a home audio system. Julian Nathan joined the audio service and manufacturing industry in 1954 and moved into motion picture engineering and production in 1960. He installed and operated recording theaters in Sydney, Austra

  2. Web Audio/Video Streaming Tool

    Science.gov (United States)

    Guruvadoo, Eranna K.

    2003-01-01

    To promote a NASA-wide educational outreach program that educates and informs the public about space exploration, NASA, at Kennedy Space Center, is seeking efficient ways to add more content to the web by streaming audio/video files. This project proposes a high-level overview of a framework for the creation, management, and scheduling of audio/video assets over the web. To support short-term goals, the prototype of a web-based tool is designed and demonstrated to automate the process of streaming audio/video files. The tool provides web-enabled user interfaces to manage video assets, create publishable schedules of video assets for streaming, and schedule the streaming events. These operations are performed on user-defined and system-derived metadata of audio/video assets stored in a relational database, while the assets reside on a separate repository. The prototype tool is designed using ColdFusion 5.0.

  3. Principles of Audio Watermarking

    Directory of Open Access Journals (Sweden)

    Martin Hrncar

    2008-01-01

    The article contains a brief overview of modern methods for embedding additional data in audio signals. Such embedding can serve many purposes, for example access control or identification of a particular type of audio. The secret information is not “visible” to the user. The concept exploits imperfections of the human auditory system. Simple data hiding in an audio file was demonstrated in MATLAB.

  4. Distributed Learning over Massive XML Documents in ELM Feature Space

    Directory of Open Access Journals (Sweden)

    Xin Bi

    2015-01-01

    With the exponentially increasing volume of XML data, centralized learning solutions are unable to meet the requirements of mining applications with massive training samples. In this paper, a solution to distributed learning over massive XML documents is proposed, which provides distributed conversion of XML documents into representation model in parallel based on MapReduce and a distributed learning component based on Extreme Learning Machine for mining tasks of classification or clustering. Within this framework, training samples are converted from raw XML datasets with better efficiency and information representation ability and taken to distributed learning algorithms in Extreme Learning Machine (ELM) feature space. Extensive experiments are conducted on massive XML documents datasets to verify the effectiveness and efficiency for both classification and clustering applications.

  5. Audio Papers - A Manifesto

    DEFF Research Database (Denmark)

    Krogh Groth, Sanne; Samson, Kristine

    2016-01-01

    Audio papers resemble the regular essay or the academic text in that they deal with a certain topic of interest, but presented in the form of an audio production. The audio paper is an extension of the written paper through its specific use of media, a sonic awareness of aesthetics and materiality......, and creative approach towards communication. The audio paper is a performative format working together with an affective and elaborate understanding of language. It is an experiment embracing intellectual arguments and creative work, papers and performances, written scholarship and sonic aesthetics....

  6. Digital Audio Legal Recorder

    Data.gov (United States)

    Department of Transportation — The Digital Audio Legal Recorder (DALR) provides the legal recording capability between air traffic controllers, pilots and ground-based air traffic control TRACONs...

  7. Nonlinear multidimensional scaling and visualization of earthquake clusters over space, time and feature space

    Directory of Open Access Journals (Sweden)

    W. Dzwinel

    2005-01-01

    We present a novel technique based on a multi-resolutional clustering and nonlinear multi-dimensional scaling of earthquake patterns to investigate observed and synthetic seismic catalogs. The observed data represent seismic activities around the Japanese islands during 1997-2003. The synthetic data were generated by numerical simulations for various cases of a heterogeneous fault governed by 3-D elastic dislocation and power-law creep. At the highest resolution, we analyze the local cluster structures in the data space of seismic events for the two types of catalogs by using an agglomerative clustering algorithm. We demonstrate that small magnitude events produce local spatio-temporal patches delineating neighboring large events. Seismic events, quantized in space and time, generate the multi-dimensional feature space characterized by the earthquake parameters. Using a non-hierarchical clustering algorithm and nonlinear multi-dimensional scaling, we explore the multitudinous earthquakes by real-time 3-D visualization and inspection of the multivariate clusters. At the spatial resolutions characteristic of the earthquake parameters, all of the ongoing seismicity both before and after the largest events accumulates to a global structure consisting of a few separate clusters in the feature space. We show that by combining the results of clustering in both low and high resolution spaces, we can recognize precursory events more precisely and unravel vital information that cannot be discerned at a single resolution.

  8. Robust audio hashing for audio authentication watermarking

    Science.gov (United States)

    Zmudzinski, Sascha; Steinebach, Martin

    2008-02-01

    Current systems and protocols based on cryptographic methods for integrity and authenticity verification of media data do not distinguish between legitimate signal transformation and malicious tampering that manipulates the content. Furthermore, they usually provide no localization or assessment of the relevance of such manipulations with respect to human perception or semantics. We present an algorithm for a robust message authentication code in the context of content fragile authentication watermarking to verify the integrity of audio recordings by means of robust audio fingerprinting. Experimental results show that the proposed algorithm provides both a high level of distinction between perceptually different audio data and a high robustness against signal transformations that do not change the perceived information. Furthermore, it is well suited for the integration in a content-based authentication watermarking system.

  9. The Study of Audio Watermarking

    Institute of Scientific and Technical Information of China (English)

    王景; 唐晟

    2011-01-01

    This paper mainly introduced the basic knowledge of the digital watermarking and digital audio watermarking, including the definition of digital watermarking and digital audio watermarking, the embedding algorithm of digital audio watermarking and the com

  10. Structure Learning in Audio

    DEFF Research Database (Denmark)

    Nielsen, Andreas Brinch

    By having information about the setting a user is in, a computer is able to make decisions proactively to facilitate tasks for the user. Two approaches are taken in this thesis to achieve more information about an audio environment. One approach is that of classifying audio, and a new approach...

  11. Roundtable Audio Discussion

    Directory of Open Access Journals (Sweden)

    Chris Bigum

    2007-01-01

    RoundTable on Technology, Teaching and Tools. This is a roundtable audio interview conducted by James Farmer, founder of Edublogs, with Anne Bartlett-Bragg (University of Technology Sydney) and Chris Bigum (Deakin University). Skype was used to make and record the audio conference and the resulting sound file was edited by Andrew McLauchlan.

  12. The Schema Features and Aesthetic Functions of the Foreign Language Teaching with Electric Audio-visual Aids

    Institute of Scientific and Technical Information of China (English)

    齐欣

    2015-01-01

    While foreign language teaching with electric audio-visual aids challenges traditional language teaching, it also faces many challenges of its own, and more studies on its theoretical basis and functions are encouraged. On the basis of Schema Theory and aesthetic education, this paper makes an innovative examination of the schema features of foreign language teaching with electric audio-visual aids and its implicit, emotional, and personalized aesthetic functions, further enriches its theoretical basis, and emphasizes the necessity of achieving its aesthetic functions.

  13. Watermarking-Based Digital Audio Data Authentication

    Directory of Open Access Journals (Sweden)

    Jana Dittmann

    2003-09-01

    Digital watermarking has become an accepted technology for enabling multimedia protection schemes. While most efforts concentrate on user authentication, recently interest in data authentication to ensure data integrity has been increasing. Existing concepts address mainly image data. Depending on the necessary security level and the sensitivity to detect changes in the media, we differentiate between fragile, semifragile, and content-fragile watermarking approaches for media authentication. Furthermore, invertible watermarking schemes exist in which each bit change can be recognized by the watermark, which can be extracted so that the original data can be reproduced for high-security applications. The latter approaches can be extended with cryptographic approaches like digital signatures. As we see from the literature, only a few audio approaches exist, and the audio domain requires additional strategies for time flow protection and resynchronization. To allow different security levels, we have to identify relevant audio features that can be used to determine content manipulations. Furthermore, in the field of invertible schemes, there are numerous publications for image and video data but no approaches for digital audio to ensure data authentication for high-security applications. In this paper, we introduce and evaluate two watermarking algorithms for digital audio data, addressing content integrity protection. In our first approach, we discuss possible features for a content-fragile watermarking scheme to allow several postproduction modifications. The second approach is designed for high-security applications to detect each bit change and reconstruct the original audio by introducing an invertible audio watermarking concept. Based on the invertible audio scheme, we combine digital signature schemes and digital watermarking to provide a publicly verifiable data authentication and a reproduction of the original, protected with a secret key.

  14. Feature Space Mapping as a universal adaptive system

    Science.gov (United States)

    Duch, Włodzisław; Diercksen, Geerd H. F.

    1995-06-01

    The most popular realizations of adaptive systems are based on the neural network type of algorithms, in particular feedforward multilayered perceptrons trained by backpropagation of error procedures. In this paper an alternative approach based on multidimensional separable localized functions centered at the data clusters is proposed. In comparison with the neural networks that use delocalized transfer functions this approach allows for full control of the basins of attractors of all stationary points. Slow learning procedures are replaced by the explicit construction of the landscape function followed by the optimization of adjustable parameters using gradient techniques or genetic algorithms. Retrieving information does not require searches in multidimensional subspaces but it is factorized into a series of one-dimensional searches. Feature Space Mapping is applicable to learning not only from facts but also from general laws and may be treated as a fuzzy expert system (neurofuzzy system). The number of nodes (fuzzy rules) is growing as the network creates new nodes for novel data but the search time is sublinear in the number of rules or data clusters stored. Such a system may work as a universal classifier, approximator and reasoning system. Examples of applications for the identification of spectra (classification), intelligent databases (association) and for the analysis of simple electrical circuits (expert system type) are given.

  15. Audio Video Compression Stream Synthesis and Implementation

    Institute of Scientific and Technical Information of China (English)

    徐燕凌; 方向忠; 周源华

    2004-01-01

    Multiplexing of digital streams is one of the key technologies in audio video communication, and determines audio-video quality. A design scheme for an MPEG2 compliant digital television system including audio-video encoding and multiplexing was implemented. The principles and elements of system layer stream synthesis were analyzed. The key technologies of video and audio PES packetization were discussed, such as stream structure, scheduling matching, audio-video synchronization, data flow and buffering. DSP and FPGA are combined to construct header information and packet structure. The substitution of traditional RAM or PLD results in high operational efficiency and saves memory space. A scheduling algorithm was introduced for PES coding, using the monitor information of PES buffers. DTS is generated by multiplexer to guarantee synchronization. The system is not only simple but also stable, and maintains synchronization constraints of the standard. It supports both analog and digital audio-video source input, and provides real-time MPEG2 compliant TS/PS output. It has perfect performance and meets the national broadcasting requirements.

  16. Forensic audio watermark detection

    Science.gov (United States)

    Steinebach, Martin; Zmudzinski, Sascha; Petrautzki, Dirk

    2012-03-01

    Digital audio watermarking detection is often computationally complex and requires at least as much audio information as is needed to embed a complete watermark. In some applications, especially real-time monitoring, this is an important drawback. The reason for this is the usage of sync sequences at the beginning of the watermark, allowing a decision about the presence only if at least the sync has been found and retrieved. We propose an alternative method for detecting the presence of a watermark. Based on the knowledge of the secret key used for embedding, we create a mark for all potential marking stages and then use a sliding window to test a given audio file for the presence of statistical characteristics caused by embedding. In this way we can detect a watermark in less than 1 second of audio.

  17. Introduction to AVS Audio

    Institute of Scientific and Technical Information of China (English)

    Hao-Jun Ai; Shui-Xian Chen; Rui-Min Hu

    2006-01-01

    This paper describes a general audio coding algorithm which has been recently standardized by AVS, China. The algorithm is based on a perceptual coding technique. The codec delivers near CD-quality audio at 128 kb/s. This paper describes the coder structure in detail and discusses the reasons for specific design methods. A summary of the subjective test results is presented for the prototype codec. A Comparison Mean Opinion Score (CMOS) test indicates that the quality of the AVS audio coder is comparable with that of the MPEG Layer-3 audio coder. A real-time decoder, based on a 16-bit fixed-point DSP, was used for the characterization test. The performance of the DSP solution was demonstrated, including computational complexity and storage characteristics.

  18. Similarity Calculation Method of Chinese Short Text Based on Semantic Feature Space

    OpenAIRE

    Liqiang Pan; Pu Zhang; Anping Xiong

    2015-01-01

    In order to improve the accuracy of short text similarity calculation, this paper proposes using the history of short text messages to construct a semantic feature space, then using vectors in that space to represent short texts and perform semantic extension, and finally calculating short text similarity from the corresponding vectors in the semantic feature space. This method can represent the semantic information of short text messages thoroughly so as to improve the accuracy of...

  19. Improving audio chord transcription by exploiting harmonic and metric knowledge

    NARCIS (Netherlands)

    de Haas, W.B.; Rodrigues Magalhães, J.P.; Wiering, F.

    2012-01-01

    We present a new system for chord transcription from polyphonic musical audio that uses domain-specific knowledge about tonal harmony and metrical position to improve chord transcription performance. Low-level pulse and spectral features are extracted from an audio source using the Vamp plugin archi

  20. Voice activity detection using audio-visual information

    DEFF Research Database (Denmark)

    Petsatodis, Theodore; Pnevmatikakis, Aristodemos; Boukis, Christos

    2009-01-01

    An audio-visual voice activity detector that uses sensors positioned distantly from the speaker is presented. Its constituting unimodal detectors are based on the modeling of the temporal variation of audio and visual features using Hidden Markov Models; their outcomes are fused using a post-deci...

  1. A Physiologically Inspired Method for Audio Classification

    Directory of Open Access Journals (Sweden)

    David V. Anderson

    2005-06-01

    We explore the use of physiologically inspired auditory features with both physiologically motivated and statistical audio classification methods. We use features derived from a biophysically defensible model of the early auditory system for audio classification using a neural network classifier. We also use a Gaussian-mixture-model (GMM)-based classifier for the purpose of comparison and show that the neural-network-based approach works better. Further, we use features from a more advanced model of the auditory system and show that the features extracted from this model of the primary auditory cortex perform better than the features from the early auditory stage. The features give good classification performance with only one-second data segments used for training and testing.

  2. Agency Video, Audio and Imagery Library

    Science.gov (United States)

    Grubbs, Rodney

    2015-01-01

    The purpose of this presentation was to inform the ISS International Partners of the new NASA Agency Video, Audio and Imagery Library (AVAIL) website. AVAIL is a new resource for the public to search for and download NASA-related imagery, and is not intended to replace the current process by which the International Partners receive their Space Station imagery products.

  3. Dynamic Bayesian Networks for Audio-Visual Speech Recognition

    Directory of Open Access Journals (Sweden)

    Liang Luhong

    2002-01-01

    The use of visual features in audio-visual speech recognition (AVSR) is justified by both the speech generation mechanism, which is essentially bimodal in audio and visual representation, and by the need for features that are invariant to acoustic noise perturbation. As a result, current AVSR systems demonstrate significant accuracy improvements in environments affected by acoustic noise. In this paper, we describe the use of two statistical models for audio-visual integration, the coupled HMM (CHMM) and the factorial HMM (FHMM), and compare the performance of these models with the existing models used in speaker-dependent audio-visual isolated word recognition. The statistical properties of both the CHMM and FHMM allow the state asynchrony of the audio and visual observation sequences to be modeled while preserving their natural correlation over time. In our experiments, the CHMM performs best overall, outperforming all the existing models and the FHMM.

  4. Limitations in 4-Year-Old Children's Sensitivity to the Spacing among Facial Features

    Science.gov (United States)

    Mondloch, Catherine J.; Thomson, Kendra

    2008-01-01

    Four-year-olds' sensitivity to differences among faces in the spacing of features was tested under 4 task conditions: judging distinctiveness when the external contour was visible and when it was occluded, simultaneous match-to-sample, and recognizing the face of a friend. In each task, the foil differed only in the spacing of features, and…

  5. The method of narrow-band audio classification based on universal noise background model

    Science.gov (United States)

    Rui, Rui; Bao, Chang-chun

    2013-03-01

    Audio classification is the basis of content-based audio analysis and retrieval. Conventional classification methods mainly depend on clip-level feature extraction, which increases the time required for classification. An approach for classifying narrow-band audio streams based on frame-level feature extraction is presented in this paper. The audio signals are divided into speech, instrumental music, song with accompaniment and noise using the Gaussian mixture model (GMM). To cope with changing real-world environments, a universal noise background model (UNBM) for white noise, street noise, factory noise and car interior noise is built. In addition, three feature schemes are considered to optimize feature selection. The experimental results show that the proposed algorithm achieves a high accuracy for audio classification, especially under the noise backgrounds tested, and keeps the classification time under one second.

  6. Fast scenario-based design space exploration using feature selection

    NARCIS (Netherlands)

    van Stralen, P.; Pimentel, A.; Mühl, G.; Richling, J.; Herkersdorf, A.

    2012-01-01

    This paper presents a novel approach to efficiently perform early system level design space exploration (DSE) of MultiProcessor System-on-Chip (MPSoC) based embedded systems. By modeling dynamic multi-application workloads using application scenarios, optimal designs can be quickly identified using

  7. Perceptual Audio Hashing Functions

    Directory of Open Access Journals (Sweden)

    Emin Anarım

    2005-07-01

    Full Text Available Perceptual hash functions provide a tool for fast and reliable identification of content. We present new audio hash functions based on summarization of the time-frequency spectral characteristics of an audio document. The proposed hash functions are based on the periodicity series of the fundamental frequency and on singular-value description of the cepstral frequencies. They are found, on one hand, to perform very satisfactorily in identification and verification tests, and on the other hand, to be very resilient to a large variety of attacks. Moreover, we address the issue of security of hashes and propose a keying technique, and thereby a key-dependent hash function.

  8. Embedded Audio Without Beeps

    DEFF Research Database (Denmark)

    Overholt, Daniel; Møbius, Nikolaj Friis

    2014-01-01

    software environments for audio processing) via innovative interfaces that send real-time inputs to such software running on a laptop, mobile device, or small Linux board (e.g., Raspberry Pi or Beagleboard). Basic hardware will be provided, but participants are also encouraged to bring related equipment...

  9. Editing Audio with Audacity

    Directory of Open Access Journals (Sweden)

    Brandon Walsh

    2016-08-01

    Full Text Available For those interested in audio, basic sound editing skills go a long way. Being able to handle and manipulate the materials can help you take control of your object of study: you can zoom in and extract particular moments to analyze, process the audio, and upload the materials to a server to complement a blog post on the topic. On a more practical level, these skills could also allow you to record and package recordings of yourself or others for distribution. That guest lecture taking place in your department? Record it and edit it yourself! Doing so is a lightweight way to distribute resources among various institutions, and it also helps make the materials more accessible for readers and listeners with a wide variety of learning needs. In this lesson you will learn how to use Audacity to load, record, edit, mix, and export audio files. Sound editing platforms are often expensive and offer extensive capabilities that can be overwhelming to the first-time user, but Audacity is a free and open source alternative that offers powerful capabilities for sound editing with a low barrier for entry. For this lesson we will work with two audio files: a recording of Bach’s Goldberg Variations available from MusOpen and another recording of your own voice that will be made in the course of the lesson. This tutorial uses Audacity 2.1.2, released January 2016.

  10. Audio Feedback -- Better Feedback?

    Science.gov (United States)

    Voelkel, Susanne; Mello, Luciane V.

    2014-01-01

    National Student Survey (NSS) results show that many students are dissatisfied with the amount and quality of feedback they get for their work. This study reports on two case studies in which we tried to address these issues by introducing audio feedback to one undergraduate (UG) and one postgraduate (PG) class, respectively. In case study one…

  11. The audio expert everything you need to know about audio

    CERN Document Server

    Winer, Ethan

    2012-01-01

    The Audio Expert is a comprehensive reference that covers all aspects of audio, with many practical, as well as theoretical, explanations. Providing in-depth descriptions of how audio really works, using common sense plain-English explanations and mechanical analogies with minimal math, the book is written for people who want to understand audio at the deepest, most technical level, without needing an engineering degree. It's presented in an easy-to-read, conversational tone, and includes more than 400 figures and photos augmenting the text. The Audio Expert takes th

  12. Generating Feature Spaces for Linear Algorithms with Regularized Sparse Kernel Slow Feature Analysis

    NARCIS (Netherlands)

    Böhmer, W.; Grünewälder, S.; Nickisch, H.; Obermayer, K.

    2013-01-01

    Without non-linear basis functions many problems can not be solved by linear algorithms. This article proposes a method to automatically construct such basis functions with slow feature analysis (SFA). Non-linear optimization of this unsupervised learning method generates an orthogonal basis on the

  13. Towards semantically sensitive text clustering: a feature space modeling technology based on dimension extension.

    Science.gov (United States)

    Liu, Yuanchao; Liu, Ming; Wang, Xin

    2015-01-01

    The objective of text clustering is to divide document collections into clusters based on the similarity between documents. In this paper, an extension-based feature modeling approach towards semantically sensitive text clustering is proposed along with the corresponding feature space construction and similarity computation method. By combining the similarity in traditional feature space and that in extension space, the adverse effects of the complexity and diversity of natural language can be addressed and clustering semantic sensitivity can be improved correspondingly. The generated clusters can be organized using different granularities. The experimental evaluations on well-known clustering algorithms and datasets have verified the effectiveness of our approach.

  14. An analysis of possible advanced space strategies featuring the role of space resource utilization

    Science.gov (United States)

    Cordell, Bruce; Steinbronn, Otto

    A major weakness of space planning in the U.S.A. has been the lack of clearly defined, major space goals within a coherent, politically palatable, long-range national space strategy. Unresolved issues include the Space Station's role, the most profitable space exploration strategies, and space resource use. We present an analysis of these factors with special emphasis on space resource utilization. Our performance modeling reveals that lunar oxygen is useful on or near the Moon and—if lunar hydrogen is available—lunar oxygen is also economical in LEO. Use of volatile materials from Phobos/Deimos is preferred or attractive in LEO, low lunar orbit, and—if lunar hydrogen is unavailable—on the Moon. Thus it appears that resource synergisms between operations in the Mars system and in Earth-Moon space could become commercially important.

  15. Neurophysiological Correlates of Featural and Spacing Processing for Face and Non-face Stimuli

    Science.gov (United States)

    Negrini, Marcello; Brkić, Diandra; Pizzamiglio, Sara; Premoli, Isabella; Rivolta, Davide

    2017-01-01

    The peculiar ability of humans to recognize hundreds of faces at a glance has been attributed to face-specific perceptual mechanisms known as holistic processing. Holistic processing includes the ability to discriminate individual facial features (i.e., featural processing) and their spatial relationships (i.e., spacing processing). Here, we aimed to characterize the spatio-temporal dynamics of featural- and spacing-processing of faces and objects. Nineteen healthy volunteers completed a newly created perceptual discrimination task for faces and objects (i.e., the “University of East London Face Task”) while their brain activity was recorded with a high-density (128 electrodes) electroencephalogram. Our results showed that early event related potentials at around 100 ms post-stimulus onset (i.e., P100) are sensitive to both facial features and spacing between the features. Spacing and features discriminability for objects occurred at circa 200 ms post-stimulus onset (P200). These findings indicate the existence of neurophysiological correlates of spacing vs. features processing in both face and objects, and demonstrate faster brain processing for faces. PMID:28348535

  16. Quality Enhancement of Compressed Audio Based on Statistical Conversion

    Directory of Open Access Journals (Sweden)

    Mouchtaris Athanasios

    2008-01-01

    Full Text Available Most audio compression formats are based on the idea of low bit rate transparent encoding. As these types of audio signals are starting to migrate from portable players with inexpensive headphones to higher quality home audio systems, it is becoming evident that higher bit rates may be required to maintain transparency. We propose a novel method that enhances low bit rate encoded audio segments by applying multiband audio resynthesis methods in a postprocessing stage. Our algorithm employs the highly flexible Generalized Gaussian mixture model which offers a more accurate representation of audio features than the Gaussian mixture model. A novel residual conversion technique is applied which proves to significantly improve the enhancement performance without excessive overhead. In addition, both cepstral and residual errors are dramatically decreased by a feature-alignment scheme that employs a sorting transformation. Some improvements regarding the quantization step are also described that enable us to further reduce the algorithm overhead. Signal enhancement examples are presented and the results show that the overhead size incurred by the algorithm is a fraction of the uncompressed signal size. Our results show that the resulting audio quality is comparable to that of a standard perceptual codec operating at approximately the same bit rate.

  17. Quality Enhancement of Compressed Audio Based on Statistical Conversion

    Directory of Open Access Journals (Sweden)

    Chris Kyriakakis

    2008-07-01

    Full Text Available Most audio compression formats are based on the idea of low bit rate transparent encoding. As these types of audio signals are starting to migrate from portable players with inexpensive headphones to higher quality home audio systems, it is becoming evident that higher bit rates may be required to maintain transparency. We propose a novel method that enhances low bit rate encoded audio segments by applying multiband audio resynthesis methods in a postprocessing stage. Our algorithm employs the highly flexible Generalized Gaussian mixture model which offers a more accurate representation of audio features than the Gaussian mixture model. A novel residual conversion technique is applied which proves to significantly improve the enhancement performance without excessive overhead. In addition, both cepstral and residual errors are dramatically decreased by a feature-alignment scheme that employs a sorting transformation. Some improvements regarding the quantization step are also described that enable us to further reduce the algorithm overhead. Signal enhancement examples are presented and the results show that the overhead size incurred by the algorithm is a fraction of the uncompressed signal size. Our results show that the resulting audio quality is comparable to that of a standard perceptual codec operating at approximately the same bit rate.

  18. Indexing spoken audio by LSA and SOMs

    OpenAIRE

    2000-01-01

    This paper presents an indexing system for spoken audio documents. The framework is indexing and retrieval of broadcast news. The proposed indexing system applies latent semantic analysis (LSA) and self-organizing maps (SOM) to map the documents into a semantic vector space and to display the semantic structures of the document collection. The SOM is also used to enhance the indexing of the documents that are difficult to decode. Relevant index terms and suitable index weights are computed by...

  19. Processing features of audio and video files

    Directory of Open Access Journals (Sweden)

    E. N. Vydalko

    2012-11-01

    Full Text Available Analog videotape recorders are now a thing of the past, and digital video recording has become relevant and attractive for users who put image quality above all else. It is important to record video in digital format without converting the digital signal to analog, since that conversion causes a significant loss of recording quality. This article describes a program that processes the video stream of digital cable TV. It can also convert that stream into a format that any computer or DVD player can readily use in digital form.

  20. Parametric Coding of Stereo Audio

    Directory of Open Access Journals (Sweden)

    Erik Schuijers

    2005-06-01

    Full Text Available Parametric-stereo coding is a technique to efficiently code a stereo audio signal as a monaural signal plus a small amount of parametric overhead to describe the stereo image. The stereo properties are analyzed, encoded, and reinstated in a decoder according to spatial psychoacoustical principles. The monaural signal can be encoded using any (conventional) audio coder. Experiments show that the parameterized description of spatial properties enables a highly efficient, high-quality stereo audio representation.
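
    The mono-plus-parameters idea can be illustrated with a minimal sketch. The abstract does not name the stereo parameters, so the two used here (an inter-channel level difference in dB and a normalized inter-channel correlation) are assumptions chosen for illustration, as are the function names. The decoder only restores the level difference, which is exact only when the two channels are in-phase scaled copies of each other.

```python
import math

def encode_parametric_stereo(left, right):
    """Downmix one stereo frame to mono and extract two illustrative parameters."""
    mono = [(l + r) / 2.0 for l, r in zip(left, right)]
    energy_l = sum(l * l for l in left)
    energy_r = sum(r * r for r in right)
    eps = 1e-12  # guards against division by zero on silent frames
    # Inter-channel level difference in dB (assumed parameter, not from the abstract).
    ild_db = 10.0 * math.log10((energy_l + eps) / (energy_r + eps))
    # Normalized cross-correlation between the channels (assumed parameter).
    cross = sum(l * r for l, r in zip(left, right))
    icc = cross / math.sqrt((energy_l + eps) * (energy_r + eps))
    return mono, ild_db, icc

def decode_parametric_stereo(mono, ild_db):
    """Rebuild left/right by rescaling the mono signal to restore the level difference."""
    ratio = 10.0 ** (ild_db / 20.0)        # amplitude ratio left/right
    gain_l = 2.0 * ratio / (1.0 + ratio)   # gains chosen so (left + right) / 2 == mono
    gain_r = 2.0 / (1.0 + ratio)
    return [gain_l * m for m in mono], [gain_r * m for m in mono]
```

    In this sketch the mono list is what a conventional audio coder would compress, while `ild_db` and `icc` stand in for the small parametric overhead.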

  1. An Efficient Audio Classification Approach Based on Support Vector Machines

    Directory of Open Access Journals (Sweden)

    Lhoucine Bahatti

    2016-05-01

    Full Text Available To classify audio by composer, the use of adequate and relevant features is important for performance, especially when the classification algorithm is based on support vector machines. As opposed to conventional approaches, which often use timbral features based on a time-frequency representation of the musical signal with a constant window, this paper presents a new audio classification method that improves feature extraction using the Constant Q Transform (CQT) approach and includes original audio features related to the musical context in which the notes appear. The work also proposes an optimal feature selection procedure that combines filter and wrapper strategies. Experimental results show the accuracy and efficiency of the adopted approach in both binary and multi-class classification.
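
    The contrast between a constant analysis window and the Constant Q Transform can be made concrete with a short sketch. It uses the standard CQT relations (geometrically spaced center frequencies and a window length inversely proportional to frequency); the function name and the parameter values in the usage note are illustrative assumptions, not settings taken from the paper.

```python
import math

def cqt_bins(f_min, bins_per_octave, n_bins, sample_rate):
    """Return the quality factor Q and a (center frequency, window length) pair per bin."""
    # Q is the same for every bin: center frequency divided by bandwidth.
    q = 1.0 / (2.0 ** (1.0 / bins_per_octave) - 1.0)
    bins = []
    for k in range(n_bins):
        f_k = f_min * 2.0 ** (k / bins_per_octave)   # geometric frequency spacing
        window = math.ceil(q * sample_rate / f_k)    # longer windows at low frequencies
        bins.append((f_k, window))
    return q, bins
```

    For example, `cqt_bins(55.0, 12, 25, 44100)` gives one bin per semitone over two octaves, with the window length shrinking as the center frequency rises, whereas a constant-window transform would use one length for all bins.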

  2. Portable Audio Design

    DEFF Research Database (Denmark)

    Groth, Sanne Krogh

    2014-01-01

    within online and physical institutional contexts. The approach focuses especially on the relationship to specific sites, and how an awareness of the relationship between the site and the production can be part of the design process. Such awareness entails several approaches: the necessity of paying...... attention to the specific genre; a grasping of the complex relationship between site and time, the actual and the virtual; and getting acquainted with the specific site’s soundscape by approaching it both intuitively and systematically. These steps will finally lead to an audio production that not only...

  3. Small signal audio design

    CERN Document Server

    Self, Douglas

    2014-01-01

    Learn to use inexpensive and readily available parts to obtain state-of-the-art performance in all the vital parameters of noise, distortion, crosstalk and so on. With ample coverage of preamplifiers and mixers and a new chapter on headphone amplifiers, this practical handbook provides an extensive repertoire of circuits that can be put together to make almost any type of audio system. A resource packed full of valuable information, with virtually every page revealing nuggets of specialized knowledge not found elsewhere. Essential points of theory that bear on practical performance are lucidly

  4. Large anterior temporal Virchow-Robin spaces: unique MR imaging features

    Energy Technology Data Exchange (ETDEWEB)

    Lim, Anthony T. [Monash University, Neuroradiology Service, Monash Imaging, Monash Health, Melbourne, Victoria (Australia); Chandra, Ronil V. [Monash University, Neuroradiology Service, Monash Imaging, Monash Health, Melbourne, Victoria (Australia); Monash University, Department of Surgery, Faculty of Medicine, Nursing and Health Sciences, Melbourne (Australia); Trost, Nicholas M. [St Vincent's Hospital, Neuroradiology Service, Melbourne (Australia); McKelvie, Penelope A. [St Vincent's Hospital, Anatomical Pathology, Melbourne (Australia); Stuckey, Stephen L. [Monash University, Neuroradiology Service, Monash Imaging, Monash Health, Melbourne, Victoria (Australia); Monash University, Southern Clinical School, Faculty of Medicine, Nursing and Health Sciences, Melbourne (Australia)

    2015-05-01

    Large Virchow-Robin (VR) spaces may mimic cystic tumor. The anterior temporal subcortical white matter is a recently described preferential location, with only 18 reported cases. Our aim was to identify unique MR features that could increase prospective diagnostic confidence. Thirty-nine cases were identified between November 2003 and February 2014. Demographic, clinical data and the initial radiological report were retrospectively reviewed. Two neuroradiologists reviewed all MR imaging; a neuropathologist reviewed histological data. Median age was 58 years (range 24-86 years); the majority (69 %) was female. There were no clinical symptoms that could be directly referable to the lesion. Two thirds were considered to be VR spaces on the initial radiological report. Mean maximal size was 9 mm (range 5-17 mm); majority (79 %) had perilesional T2 or fluid-attenuated inversion recovery (FLAIR) hyperintensity. The following were identified as potential unique MR features: focal cortical distortion by an adjacent branch of the middle cerebral artery (92 %), smaller adjacent VR spaces (26 %), and a contiguous cerebrospinal fluid (CSF) intensity tract (21 %). Surgery was performed in three asymptomatic patients; histopathology confirmed VR spaces. Unique MR features were retrospectively identified in all three patients. Large anterior temporal lobe VR spaces commonly demonstrate perilesional T2 or FLAIR signal and can be misdiagnosed as cystic tumor. Potential unique MR features that could increase prospective diagnostic confidence include focal cortical distortion by an adjacent branch of the middle cerebral artery, smaller adjacent VR spaces, and a contiguous CSF intensity tract. (orig.)

  5. Beyond podcasting: creative approaches to designing educational audio

    Directory of Open Access Journals (Sweden)

    Andrew Middleton

    2009-12-01

    Full Text Available This paper discusses a university-wide pilot designed to encourage academics to creatively explore learner-centred applications for digital audio. Participation in the pilot was diverse in terms of technical competence, confidence and contextual requirements and there was little prior experience of working with digital audio. Many innovative approaches were taken to using audio in a blended context including student-generated vox pops, audio feedback models, audio conversations and task-setting. A podcast was central to the pilot itself, providing a common space for the 25 participants, who were also supported by materials in several other formats. An analysis of podcast interviews involving pilot participants provided the data informing this case study. This paper concludes that audio has the potential to promote academic creativity in engaging students through media intervention. However, institutional scalability is dependent upon the availability of suitable timely support mechanisms that can address the lack of technical confidence evident in many staff. If that is in place, audio can be widely adopted by anyone seeking to add a new layer of presence and connectivity through the use of voice.

  6. Structuring feature space: a non-parametric method for volumetric transfer function generation.

    Science.gov (United States)

    Maciejewski, Ross; Woo, Insoo; Chen, Wei; Ebert, David S

    2009-01-01

    The use of multi-dimensional transfer functions for direct volume rendering has been shown to be an effective means of extracting materials and their boundaries for both scalar and multivariate data. The most common multi-dimensional transfer function consists of a two-dimensional (2D) histogram with axes representing a subset of the feature space (e.g., value vs. value gradient magnitude), with each entry in the 2D histogram being the number of voxels at a given feature space pair. Users then assign color and opacity to the voxel distributions within the given feature space through the use of interactive widgets (e.g., box, circular, triangular selection). Unfortunately, such tools lead users through a trial-and-error approach as they assess which data values within the feature space map to a given area of interest within the volumetric space. In this work, we propose the addition of non-parametric clustering within the transfer function feature space in order to extract patterns and guide transfer function generation. We apply a non-parametric kernel density estimation to group voxels of similar features within the 2D histogram. These groups are then binned and colored based on their estimated density, and the user may interactively grow and shrink the binned regions to explore feature boundaries and extract regions of interest. We also extend this scheme to temporal volumetric data in which time steps of 2D histograms are composited into a histogram volume. A three-dimensional (3D) density estimation is then applied, and users can explore regions within the feature space across time without adjusting the transfer function at each time step. Our work enables users to effectively explore the structures found within a feature space of the volume and provides a context in which the user can understand how these structures relate to their volumetric data. We provide tools for enhanced exploration and manipulation of the transfer function, and we show that the initial

  7. Efficient audio power amplification - challenges

    Energy Technology Data Exchange (ETDEWEB)

    Andersen, Michael A.E.

    2005-07-01

    For more than a decade efficient audio power amplification has evolved, and today switch-mode audio power amplification in its various forms is the state of the art. The technical steps that led to this evolution are described, and many of the challenges still to be faced, where extensive research and development is needed, are covered. (au)

  8. Digital audio and video broadcasting by satellite

    Science.gov (United States)

    Yoshino, Takehiko

    In parallel with the progress of the practical use of satellite broadcasting and Hi-Vision or high-definition television technologies, research activities are also in progress to replace the conventional analog broadcasting services with a digital version. What we call 'digitalization' is not a mere technical matter but an important subject which will help promote multichannel or multimedia applications and, accordingly, can change the old concept of mass media, such as television or radio. NHK Science and Technical Research Laboratories has promoted studies of digital bandwidth compression, transmission, and application techniques. The following topics are covered: the trend of digital broadcasting; features of Integrated Services Digital Broadcasting (ISDB); compression encoding and transmission; transmission bit rate in 12 GHz band; number of digital TV transmission channels; multichannel pulse code modulation (PCM) audio broadcasting system via communication satellite; digital Hi-Vision broadcasting; and development of digital audio broadcasting (DAB) for mobile reception in Japan.

  9. Audio Watermarking with Error Correction

    Directory of Open Access Journals (Sweden)

    Aman Chadha

    2011-09-01

    Full Text Available In recent times, communication through the internet has tremendously facilitated the distribution of multimedia data. Although this is indubitably a boon, one of its repercussions is that it has also given impetus to the notorious issue of online music piracy. Unethical attempts can also be made to deliberately alter such copyrighted data and thus, misuse it. Copyright violation by means of unauthorized distribution, as well as unauthorized tampering of copyrighted audio data is an important technological and research issue. Audio watermarking has been proposed as a solution to tackle this issue. The main purpose of audio watermarking is to protect against possible threats to the audio data and in case of copyright violation or unauthorized tampering, authenticity of such data can be disputed by virtue of audio watermarking.

  10. Audio Watermarking with Error Correction

    CERN Document Server

    Chadha, Aman; Goel, Rishabh; Dave, Hiren; Roja, M Mani

    2011-01-01

    In recent times, communication through the internet has tremendously facilitated the distribution of multimedia data. Although this is indubitably a boon, one of its repercussions is that it has also given impetus to the notorious issue of online music piracy. Unethical attempts can also be made to deliberately alter such copyrighted data and thus, misuse it. Copyright violation by means of unauthorized distribution, as well as unauthorized tampering of copyrighted audio data is an important technological and research issue. Audio watermarking has been proposed as a solution to tackle this issue. The main purpose of audio watermarking is to protect against possible threats to the audio data and in case of copyright violation or unauthorized tampering, authenticity of such data can be disputed by virtue of audio watermarking.

  11. Advances in audio watermarking based on singular value decomposition

    CERN Document Server

    Dhar, Pranab Kumar

    2015-01-01

    This book introduces audio watermarking methods for copyright protection, which has drawn extensive attention for securing digital data from unauthorized copying. The book is divided into two parts. First, an audio watermarking method in discrete wavelet transform (DWT) and discrete cosine transform (DCT) domains using singular value decomposition (SVD) and quantization is introduced. This method is robust against various attacks and provides good imperceptible watermarked sounds. Then, an audio watermarking method in fast Fourier transform (FFT) domain using SVD and Cartesian-polar transformation (CPT) is presented. This method has high imperceptibility and high data payload and it provides good robustness against various attacks. These techniques allow media owners to protect copyright and to show authenticity and ownership of their material in a variety of applications. · Features new methods of audio watermarking for copyright protection and ownership protection · Outl...
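
    The quantization step mentioned in the first method can be sketched in isolation. The blurb does not spell out the embedding rule, so the sketch below assumes plain quantization index modulation applied to a single scalar, which could for instance be the largest singular value of a block of transform coefficients; the function names and the step size are hypothetical.

```python
def embed_bit(value, bit, step):
    """Move value to the nearest point of the quantization lattice assigned to bit."""
    # Bit 0 uses multiples of step; bit 1 uses the same lattice shifted by step / 2.
    offset = 0.0 if bit == 0 else step / 2.0
    return round((value - offset) / step) * step + offset

def extract_bit(value, step):
    """Decide which of the two lattices lies closer to the received value."""
    d0 = abs(value - embed_bit(value, 0, step))
    d1 = abs(value - embed_bit(value, 1, step))
    return 0 if d0 <= d1 else 1
```

    Under this assumption a larger step tolerates stronger distortion before a bit flips (anything below a quarter of the step is harmless) at the cost of a larger change to the host value, which is the usual robustness versus imperceptibility trade-off.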

  12. Digital Audio Collections

    Directory of Open Access Journals (Sweden)

    Jason Tenter

    2010-11-01

    Full Text Available This paper is about the possibility of libraries creating digital music or audio collections based on the current state of the digital music industry, and in comparison with the difficulties librarians have found in adding e-books to collections. In comparing the e-book and digital music markets, factors such as digital rights management (DRM) and the differences in both markets’ relationships with customers are examined. This juxtaposition suggests that where e-books have been difficult to include in library collections because publishers want to maintain control over their content, music publishers have had to resign some of the control over their products because of file-sharing, and so may work with libraries to develop these collections in a more constructive way than e-book vendors. At the end of the paper, some models are suggested for developing these collections.

  13. A linear feature space for simultaneous learning of spatio-spectral filters in BCI

    NARCIS (Netherlands)

    Farquhar, J.D.R.

    2009-01-01

    It is shown how two of the most common types of feature mapping used for classification of single trial Electroencephalography (EEG), i.e. spatial and frequency filtering, can be equivalently performed as linear operations in the space of frequency-specific detector covariance tensors. Thus by first

  14. An alternative to scale-space representation for extracting local features in image recognition

    DEFF Research Database (Denmark)

    Andersen, Hans Jørgen; Nguyen, Phuong Giang

    2012-01-01

    In image recognition, the common approach for extracting local features using a scale-space representation has usually three main steps; first interest points are extracted at different scales, next from a patch around each interest point the rotation is calculated with corresponding orientation...

  15. Digital Audio Watermarking: An Overview

    Directory of Open Access Journals (Sweden)

    Bhuvnesh Kumar Singh

    2013-10-01

    Full Text Available Digital watermarking is a very recent research area. Digital audio watermarking is a method to embed or hide a watermark (information signal) in a digital signal, i.e., image, audio, text, or video data. The watermark is difficult to remove from the audio signal. If the signal is copied, the watermark is also carried in the copy. A signal may carry several different watermarks at the same time. Watermarking is used to protect multimedia data against unauthorized copying and piracy, and to establish ownership, inventions, authentication, etc. In this paper we present watermarking methods and applications.

  16. A steganalysis method of AAC audio based on statistical features of MDCT quantized coefficients

    Institute of Scientific and Technical Information of China (English)

    王昱洁; 杨萍; 蒋薇薇

    2015-01-01

    A steganalysis method for AAC audio based on statistical features of MDCT quantized coefficients is proposed. First, the AAC audio is partly decoded to obtain the MDCT quantized coefficients. Then the parameters of a generalized Gaussian distribution (GGD) model, the frequency-domain statistical moments of the distribution histogram of the quantized coefficients, and part of the data of the intra-frame and inter-frame Markov transition matrices of the MDCT quantized coefficients are extracted as steganalysis features. Finally, a support vector machine is used as the classifier. Experimental results on AAC audio at different bitrates show that the proposed method detects direct spread spectrum hiding in the MDCT quantized coefficients well, and that for AAC audio at 128 kb/s the detection accuracy is high even at low embedding capacity.
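
    One of the feature families named above, the Markov transition matrix of the MDCT quantized coefficients, can be sketched for the intra-frame case. The clipping threshold and the function name are assumptions for illustration; the abstract does not say which part of the matrix the authors keep or how they bound the coefficient range.

```python
def markov_transition_matrix(coeffs, threshold=2):
    """Estimate P(next = j | current = i) for adjacent coefficients in one frame."""
    size = 2 * threshold + 1
    # Clip to [-threshold, threshold] so the matrix stays small (assumed convention).
    clipped = [max(-threshold, min(threshold, c)) for c in coeffs]
    counts = [[0] * size for _ in range(size)]
    for current, following in zip(clipped, clipped[1:]):
        counts[current + threshold][following + threshold] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        # Rows for values that never occur are left as zeros.
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix
```

    The flattened matrix entries would then be concatenated with the other features and passed to the classifier. An inter-frame variant would pair each coefficient with the coefficient at the same index in the next frame instead of its neighbor.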

  17. Technical Evaluation Report 52: Audio/ Videoconferencing Packages: High cost

    Directory of Open Access Journals (Sweden)

    Urel Sawyers

    2005-11-01

    Full Text Available This report compares two integrated course delivery packages: Centra 6 and WebEx. Both applications feature asynchronous and synchronous audio communications for online education and training. They are relatively costly products, and provide useful comparisons with the two less expensive products to be evaluated in the following report #53. The criteria used in the current evaluation include capacity, interactivity features, integration with learning management systems, technical specifications, and cost. The report ends with a short analysis of the currently emerging audio-conferencing software, Google Talk.

  18. Modeling Audio Fingerprints: Structure, Distortion, Capacity

    NARCIS (Netherlands)

    Doets, P.J.O.

    2010-01-01

    An audio fingerprint is a compact low-level representation of a multimedia signal. An audio fingerprint can be used to identify audio files or fragments in a reliable way. The use of audio fingerprints for identification consists of two phases. In the enrollment phase known content is fingerprinted,

  19. Virtual environment interaction through 3D audio by blind children.

    Science.gov (United States)

    Sánchez, J; Lumbreras, M

    1999-01-01

    Interactive software is actively used for learning, cognition, and entertainment purposes. Educational entertainment software is not very popular among blind children because most computer games and electronic toys have interfaces that are only accessible through visual cues. This work applies the concept of interactive hyperstories to blind children. Hyperstories are implemented in a 3D acoustic virtual world. In past studies we have conceptualized a model to design hyperstories. This study illustrates the feasibility of the model. It also provides researchers with an introduction to the field of entertainment software for blind children. As a result, we have designed and field tested AudioDoom, a virtual environment with which blind children interact through 3D audio. AudioDoom is also software that enables testing nontrivial interfaces and cognitive tasks with blind children. We explored the construction of cognitive spatial structures in the minds of blind children through audio-based entertainment and spatial sound navigable experiences. Children playing AudioDoom were exposed to first person experiences by exploring highly interactive virtual worlds through the use of 3D aural representations of the space. This experience was structured in several cognitive tasks where they had to build concrete models of their spatial representations constructed through the interaction with AudioDoom by using Lego™ blocks. We analyze our preliminary results after testing AudioDoom with Chilean children from a school for blind children. We discuss issues such as interactivity in software without visual cues, the representation of spatial sound navigable experiences, and entertainment software such as computer games for blind children. We also evaluate the feasibility of constructing virtual environments through the design of dynamic learning materials with audio cues.

  20. Tag Based Audio Search Engine

    Directory of Open Access Journals (Sweden)

    Parameswaran Vellachu

    2012-03-01

    Full Text Available The volume of music databases is increasing day by day, and finding the song a listener wants is a big challenge. It is hard to manage this huge quantity of music in terms of searching and filtering through the database. Surprisingly, the audio and music industry still relies on very simplistic metadata to describe music files. An efficient "Tag Based Audio Search Engine" is therefore necessary for searching audio resources. The current research focuses on two aspects of musical databases: (1) tag-based semantic annotation generation, and (2) an audio search engine with which users can retrieve songs based on their own choices. The proposed method can be used to annotate and retrieve songs based on the musical instruments used, mood of the song, theme of the song, singer, music director, artist, film director, genre or style, and so on.

  1. A centralized audio presentation manager

    Energy Technology Data Exchange (ETDEWEB)

    Papp, A.L. III; Blattner, M.M.

    1994-05-16

    The centralized audio presentation manager addresses the problems which occur when multiple programs running simultaneously attempt to use the audio output of a computer system. Time dependence of sound means that certain auditory messages must be scheduled simultaneously, which can lead to perceptual problems due to psychoacoustic phenomena. Furthermore, the combination of speech and nonspeech audio is examined; each presents its own problems of perceptibility in an acoustic environment composed of multiple auditory streams. The centralized audio presentation manager receives abstract parameterized message requests from the currently running programs, and attempts to create and present a sonic representation in the most perceptible manner through the use of a theoretically and empirically designed rule set.

  2. Tourism research and audio methods

    DEFF Research Database (Denmark)

    Jensen, Martin Trandberg

    2016-01-01

    • Audio methods enrich sensuous tourism ethnographies. • The note suggests five research avenues for future auditory scholarship. • Sensuous tourism research has neglected the role of sounds in embodied tourism experiences.

  3. Semantic Analysis of Multimedial Information Using Both Audio and Visual Clues

    Directory of Open Access Journals (Sweden)

    Andrej Lukac

    2008-01-01

    Full Text Available Nowadays, databases hold a great deal of information (text, audio/video, etc.). It is important to be able to describe these data so that users can orient themselves in them. This requires audio/video properties, which are used for metadata management, segmenting a document into semantically meaningful units, classifying each unit into a predefined scene type, indexing, and summarizing the document for efficient retrieval and browsing. The data can be used in a system that automatically searches for a specific person in a sequence, and also for special video sequences. Audio/video properties are represented by descriptors and description schemes. Many features can be used to characterize multimedia signals. Audio and video sequences can be analyzed jointly or considered completely separately. Our aim is to explore the possibilities of combining multimedia features. The focus is on discussion programs, because they offer more options for combining audio features with video sequences.

  4. Modeling Audio Fingerprints: Structure, Distortion, Capacity

    OpenAIRE

    Doets, P.J.O.

    2010-01-01

    An audio fingerprint is a compact low-level representation of a multimedia signal. An audio fingerprint can be used to identify audio files or fragments in a reliable way. The use of audio fingerprints for identification consists of two phases. In the enrollment phase known content is fingerprinted, and ingested into a database, together with all relevant metadata. In the identification phase, unknown audio content is fingerprinted, and the fingerprints form the query to the database. The que...

  5. A Comparative Study on Two Techniques of Reducing the Dimension of Text Feature Space

    Institute of Scientific and Technical Information of China (English)

    2002-01-01

    With the development of large-scale text processing, the dimension of text feature space has become larger and larger, which has added a lot of difficulties to natural language processing. How to reduce the dimension has become a practical problem in the field. Here we present two clustering methods, i.e. concept association and concept abstract, to achieve the goal. The first refers to the keyword clustering based on the co-occurrence of keywords in the same text, and the second refers to that in the same category. Then we compare the difference between them. Our experiment results show that they are efficient to reduce the dimension of text feature space.

  6. Audio-Visual Speech Recognition Using Lip Information Extracted from Side-Face Images

    Directory of Open Access Journals (Sweden)

    Koji Iwano

    2007-03-01

    Full Text Available This paper proposes an audio-visual speech recognition method using lip information extracted from side-face images as an attempt to increase noise robustness in mobile environments. Our proposed method assumes that lip images can be captured using a small camera installed in a handset. Two different kinds of lip features, lip-contour geometric features and lip-motion velocity features, are used individually or jointly, in combination with audio features. Phoneme HMMs modeling the audio and visual features are built based on the multistream HMM technique. Experiments conducted using Japanese connected digit speech contaminated with white noise in various SNR conditions show effectiveness of the proposed method. Recognition accuracy is improved by using the visual information in all SNR conditions. These visual features were confirmed to be effective even when the audio HMM was adapted to noise by the MLLR method.

  7. Audio-Visual Speech Recognition Using Lip Information Extracted from Side-Face Images

    Directory of Open Access Journals (Sweden)

    Iwano Koji

    2007-01-01

    Full Text Available This paper proposes an audio-visual speech recognition method using lip information extracted from side-face images as an attempt to increase noise robustness in mobile environments. Our proposed method assumes that lip images can be captured using a small camera installed in a handset. Two different kinds of lip features, lip-contour geometric features and lip-motion velocity features, are used individually or jointly, in combination with audio features. Phoneme HMMs modeling the audio and visual features are built based on the multistream HMM technique. Experiments conducted using Japanese connected digit speech contaminated with white noise in various SNR conditions show effectiveness of the proposed method. Recognition accuracy is improved by using the visual information in all SNR conditions. These visual features were confirmed to be effective even when the audio HMM was adapted to noise by the MLLR method.

  8. A Model of Distraction in an Audio-on-Audio Interference Situation with Music Program Material

    DEFF Research Database (Denmark)

    Francombe, J.; Mason, R.; Dewhirst, M.

    2015-01-01

    There are many situations in which multiple audio programs are replayed over loudspeakers in the same acoustic environment, allowing listeners to focus on their desired target program. Where this situation is deliberately created and the different program items are centrally controlled, each listener can be viewed as having a personal sound zone system. In order to evaluate and optimize such situations in a perceptually relevant manner, the authors created a predictive model using the features that contribute to the distraction from unwanted sounds. Feature extraction was motivated ... separation, and frequency content of the interferer. The model was found to predict accurately for the training and validation datasets.

  9. A New Ensemble Method with Feature Space Partitioning for High-Dimensional Data Classification

    Directory of Open Access Journals (Sweden)

    Yongjun Piao

    2015-01-01

    Full Text Available Ensemble data mining methods, also known as classifier combination, are often used to improve the performance of classification. Various classifier combination methods such as bagging, boosting, and random forest have been devised and have received considerable attention in the past. However, data dimensionality increases rapidly day by day. Such a trend poses various challenges as these methods are not suitable to directly apply to high-dimensional datasets. In this paper, we propose an ensemble method for classification of high-dimensional data, with each classifier constructed from a different set of features determined by partitioning of redundant features. In our method, the redundancy of features is considered to divide the original feature space. Then, each generated feature subset is trained by a support vector machine, and the results of each classifier are combined by majority voting. The efficiency and effectiveness of our method are demonstrated through comparisons with other ensemble techniques, and the results show that our method outperforms other methods.

  10. Nanoscale Analysis of Space-Weathering Features in Soils from Itokawa

    Science.gov (United States)

    Thompson, M. S.; Christoffersen, R.; Zega, T. J.; Keller, L. P.

    2014-01-01

    Space weathering alters the spectral properties of airless body surface materials by reddening and darkening their spectra and attenuating characteristic absorption bands, making it challenging to characterize them remotely [1,2]. It also causes a discrepancy between laboratory analysis of meteorites and remotely sensed spectra from asteroids, making it difficult to associate meteorites with their parent bodies. The mechanisms driving space weathering include micrometeorite impacts and the interaction of surface materials with solar energetic ions, particularly the solar wind. These processes continuously alter the microchemical and structural characteristics of exposed grains on airless bodies. The change of these properties is caused predominantly by the vapor deposition of reduced Fe and FeS nanoparticles (npFe⁰ and npFeS, respectively) onto the rims of surface grains [3]. Sample-based analysis of space weathering has traditionally been limited to lunar soils and select asteroidal and lunar regolith breccias [3-5]. With the return of samples from the Hayabusa mission to asteroid Itokawa [6], for the first time we are able to compare space-weathering features on returned surface soils from a known asteroidal body. Analysis of these samples will contribute to a more comprehensive model for how space weathering varies across the inner solar system. Here we report detailed microchemical and microstructural analysis of surface grains from Itokawa.

  11. Location audio simplified capturing your audio and your audience

    CERN Document Server

    Miles, Dean

    2014-01-01

    From the basics of using camera, handheld, lavalier, and shotgun microphones to camera calibration and mixer set-ups, Location Audio Simplified unlocks the secrets to clean and clear broadcast quality audio no matter what challenges you face. Author Dean Miles applies his twenty-plus years of experience as a professional location operator to teach the skills, techniques, tips, and secrets needed to produce high-quality production sound on location. Humorous and thoroughly practical, the book covers a wide array of topics, such as:* location selection* field mixing* boo

  12. EEMD Independent Extraction for Mixing Features of Rotating Machinery Reconstructed in Phase Space

    Directory of Open Access Journals (Sweden)

    Zaichao Ma

    2015-04-01

    Full Text Available Empirical Mode Decomposition (EMD), due to its adaptive decomposition property for non-linear and non-stationary signals, has been widely used in vibration analyses for rotating machinery. However, EMD suffers from mode mixing, which makes it difficult to extract features independently. Although an improved EMD, known as ensemble EMD (EEMD), has been proposed, it alleviates mode mixing only to a certain degree. Moreover, EEMD requires the amplitude of the added noise to be determined. In this paper, we propose Phase Space Ensemble Empirical Mode Decomposition (PSEEMD), which integrates Phase Space Reconstruction (PSR) and Manifold Learning (ML) to modify EEMD. We also provide the principle and detailed procedure of PSEEMD, and analyze a simulation signal and an actual vibration signal derived from a rubbing rotor. The results show that PSEEMD is more efficient and convenient than EEMD in extracting the mixing features from the investigated signal and in optimizing the amplitude of the necessary added noise. Additionally, PSEEMD can extract weak features that are interfered with by a certain amount of noise.

  13. Technical Evaluation Report 56: Video-Conferencing with Audio Software

    Directory of Open Access Journals (Sweden)

    Jon Baggaley

    2006-06-01

    Full Text Available An online conference is illustrated using the format of a TV talk show. The conference combined live audio discussion with visual images spontaneously selected by the moderator in the manner of a TV control-room director. A combination of inexpensive online collaborative tools was used for the event, based on the browser-based audio-conferencing software, iVocalize. The exercise illustrates how an impression of a fully featured online video-conference can be created without the need for complex video-conferencing software and high bandwidth.

  14. Evaluation of Audio Compression Artifacts

    Directory of Open Access Journals (Sweden)

    M. Herrera Martinez

    2007-01-01

    Full Text Available This paper deals with subjective evaluation of audio-coding systems. From this evaluation, it is found that, depending on the type of signal and the algorithm of the audio-coding system, different types of audible errors arise. These errors are called coding artifacts. Although three kinds of artifacts are perceivable in the auditory domain, the author proposes that in the coding domain there is only one common cause for the appearance of the artifact: inefficient tracking of transient-stochastic signals. For this purpose, state-of-the-art audio coding systems use a wide range of signal processing techniques, including application of the wavelet transform, which is described here.

  15. Amplificador de audio Clase D (Class D audio amplifier)

    OpenAIRE

    2012-01-01

    This project develops a class D audio amplifier based on two types of modulation, PWM modulation and Sigma-Delta modulation, both with an H-bridge inverter. Both the PWM modulator and the Sigma-Delta modulator will be developed as digital circuits implemented in an FPGA. The input audio signal will be digitized by an analog-to-digital converter (ADC), which will also be controlled by digital circuitry implemented in the same ...

  16. Audio power amplifier design handbook

    CERN Document Server

    Self, Douglas

    2013-01-01

    This book is essential for audio power amplifier designers and engineers for one simple reason...it enables you as a professional to develop reliable, high-performance circuits. The Author Douglas Self covers the major issues of distortion and linearity, power supplies, overload, DC-protection and reactive loading. He also tackles unusual forms of compensation and distortion produced by capacitors and fuses. This completely updated fifth edition includes four NEW chapters including one on The XD Principle, invented by the author, and used by Cambridge Audio. Cro

  17. Feature-space clustering for fMRI meta-analysis

    DEFF Research Database (Denmark)

    Goutte, C.; Hansen, L.K.; Liptrot, Matthew George

    2001-01-01

    Clustering functional magnetic resonance imaging (fMRI) time series has emerged in recent years as a possible alternative to parametric modeling approaches. Most of the work so far has been concerned with clustering raw time series. In this contribution we investigate the applicability of a clustering method applied to features extracted from the data. This approach is extremely versatile and encompasses previously published results [Goutte et al., 1999] as special cases. A typical application is in data reduction: as the increase in temporal resolution of fMRI experiments routinely yields f...-voxel analyses. In particular this allows the checking of the differences and agreements between different methods of analysis. Both approaches are illustrated on an fMRI data set involving visual stimulation, and we show that the feature space clustering approach yields nontrivial results and, in particular...

  18. Features of human skin in HSV color space and new recognition parameter

    Institute of Scientific and Technical Information of China (English)

    2007-01-01

    Features of human skin in HSV color space are widely applied in the area of image retrieval based on content. H is selected as the basic recognition parameter because its value has a narrow range for the skin color and can keep stable while the illumination intensity or the curvature of skin surface is changing. Rules of parameters with the change of illumination in HSV color space are studied. It is firstly found that the mean of saturation and value (S+V)/2 can keep stable when the illumination intensity is changed or the skin surface is inflected, and (S+V)/2 changes with skin color, but the tendency of change is contrary to that of H. Therefore, (S+V)/H can be used as a new recognition parameter which can enhance HSV ability to recognize human skin.
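
    The parameters discussed in this abstract, H, (S+V)/2 and (S+V)/H, can be computed per pixel as in the sketch below. The abstract does not state the numeric scale used for H, so the [0, 1) hue range of Python's standard `colorsys` module is an assumption here, as are the function name `skin_parameters` and the sample pixel; no skin-detection threshold is implied.

```python
import colorsys

def skin_parameters(r, g, b):
    """Return H, (S+V)/2 and (S+V)/H for one 8-bit RGB pixel.

    H, S and V are all on a 0..1 scale (assumed; the source does not
    say which scale it uses).
    """
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    mean_sv = (s + v) / 2.0
    # Hue is 0 for pure red and for gray pixels, so guard the division.
    ratio = (s + v) / h if h > 0 else float("inf")
    return h, mean_sv, ratio

# Hypothetical sample pixel, chosen only for illustration.
h, mean_sv, ratio = skin_parameters(224, 172, 105)
```

    Because H falls while (S+V)/2 rises (or the reverse) as skin color changes, dividing one by the other widens the spread between colors, which is the motivation the abstract gives for the combined parameter.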

  19. Optimizing view/illumination geometry for terrestrial features using Space Shuttle and aerial polarimetry

    Science.gov (United States)

    Israel, Steven A.; Holly, Mark H.; Whitehead, Victor S.

    1992-01-01

    This paper describes the relationship of polarimetric observations from orbital and aerial platforms and the determination of optimum sun-target-sensor geometry. Polarimetric observations were evaluated for feature discrimination. The Space Shuttle experiment was performed using two boresighted Hasselblad 70 mm cameras with identical settings and with linear polarizing filters aligned orthogonally about the optic axis. The aerial experiment was performed using a single 35 mm Nikon FE2 and rotating the linear polarizing filter 90 deg to acquire both minimum and maximum photographs. Characteristic curves were created by cover type and waveband for both aerial and Space Shuttle imagery. Though significant differences existed between the two datasets, the observed polarimetric signatures were unique and separable.

  20. Between technical features and analytic capabilities: Charting a relational affordance space for digital social analytics

    Directory of Open Access Journals (Sweden)

    Anders Koed Madsen

    2015-01-01

    Full Text Available Digital social analytics is a subset of Big Data methods that is used to understand the social environment in which people and organizations have to act. This paper presents an analysis of eight projects that are experimenting with the use of these methods for various purposes. It shows that two specific technological features influence the work with such methods in all the cases. The first concerns the need to distribute choices about the structure of data to third-party actors and the second concerns the need to balance machine intelligence and human intuition when automating the analysis. These features set specific conditions for knowledge production, and the paper identifies two opposite approaches for engaging with each of these conditions. These features and approaches are finally combined into a two-dimensional affordance space that illustrates how there is flexibility in the way project leaders interact with the features of the data environment. It thereby also shows how digital social analytics come to have different affordances for different projects.

  1. A Joint Audio-Visual Approach to Audio Localization

    DEFF Research Database (Denmark)

    Jensen, Jesper Rindom; Christensen, Mads Græsbøll

    2015-01-01

    Localization of audio sources is an important research problem, e.g., to facilitate noise reduction. In the recent years, the problem has been tackled using distributed microphone arrays (DMA). A common approach is to apply direction-of-arrival (DOA) estimation on each array (denoted as nodes...

  2. Supervised pixel classification using a feature space derived from an artificial visual system

    Science.gov (United States)

    Baxter, Lisa C.; Coggins, James M.

    1991-01-01

    Image segmentation involves labelling pixels according to their membership in image regions. This requires an understanding of what a region is. Using supervised pixel classification, the paper investigates how groups of pixels labelled manually according to perceived image semantics map onto the feature space created by an Artificial Visual System. The multiscale structure of regions is investigated and it is shown that pixels form clusters based on their geometric roles in the image intensity function, not by image semantics. A tentative abstract definition of a 'region' is proposed based on this behavior.

  3. Audio watermark a comprehensive foundation using Matlab

    CERN Document Server

    Lin, Yiqing

    2015-01-01

    This book illustrates commonly used and novel approaches to audio watermarking for copyright protection. It gives a theoretical and practical step-by-step guide to data hiding in audio signals such as music, speech, and broadcast. New techniques developed by the authors are fully explained, with MATLAB programs for audio watermarking and audio quality assessment, and the book also discusses methods for objectively predicting the perceptual quality of watermarked audio signals. The book explains the theoretical basics of commonly used audio watermarking techniques, discusses the methods used to assess the quality of audio signals objectively and subjectively, and provides comprehensive, well-tested MATLAB programs that can be used to watermark any audio media.

  4. Audio Watermarking Using Lsb With Adjustment Method

    Directory of Open Access Journals (Sweden)

    Ansith.S, Priyanka Udayabhanu

    2013-05-01

    Full Text Available This paper discusses watermarking of audio signals. In this method the recorded audio data is first sampled at a sampling frequency of 22050 Hz. The watermark message is then embedded into the sampled data of the audio signal. An adjustment is applied to increase the accuracy of the watermarked signal. Finally, the message is extracted from the audio data.
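
    The embed-and-extract cycle described in this abstract can be illustrated with a plain least-significant-bit (LSB) scheme. This is a minimal sketch under stated assumptions: the abstract does not describe its adjustment step, so none is reproduced here, and the 16-bit test tone, the function names `embed_lsb` and `extract_lsb`, and the message are hypothetical. Only the 22050 Hz sampling rate comes from the source.

```python
import numpy as np

SAMPLE_RATE = 22050  # Hz, the sampling frequency named in the abstract

def embed_lsb(samples, message):
    """Write the bits of `message` (bytes) into the LSBs of int16 samples."""
    bits = np.unpackbits(np.frombuffer(message, dtype=np.uint8))
    if bits.size > samples.size:
        raise ValueError("message does not fit in the audio buffer")
    out = samples.copy()
    # Clear the lowest bit of each carrier sample, then set it to the payload bit.
    out[:bits.size] = (out[:bits.size] & ~1) | bits.astype(samples.dtype)
    return out

def extract_lsb(samples, n_bytes):
    """Read `n_bytes` of payload back from the sample LSBs."""
    bits = (samples[:n_bytes * 8] & 1).astype(np.uint8)
    return np.packbits(bits).tobytes()

# One second of a 440 Hz tone as stand-in audio (assumed test signal).
t = np.arange(SAMPLE_RATE) / SAMPLE_RATE
samples = (0.5 * np.sin(2 * np.pi * 440 * t) * 32767).astype(np.int16)

marked = embed_lsb(samples, b"hi")
recovered = extract_lsb(marked, 2)
```

    Each carrier sample changes by at most one quantization step, which is why plain LSB embedding is hard to hear but also fragile under re-encoding.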

  5. Engaging Students with Audio Feedback

    Science.gov (United States)

    Cann, Alan

    2014-01-01

    Students express widespread dissatisfaction with academic feedback. Teaching staff perceive a frequent lack of student engagement with written feedback, much of which goes uncollected or unread. Published evidence shows that audio feedback is highly acceptable to students but is underused. This paper explores methods to produce and deliver audio…

  6. Digital Augmented Reality Audio Headset

    Directory of Open Access Journals (Sweden)

    Jussi Rämö

    2012-01-01

    Full Text Available Augmented reality audio (ARA) combines virtual sound sources with the real sonic environment of the user. An ARA system can be realized with a headset containing binaural microphones. Ideally, the ARA headset should be acoustically transparent, that is, it should not cause audible modification to the surrounding sound. A practical implementation of an ARA mixer requires a low-latency headphone reproduction system with additional equalization to compensate for the attenuation and the modified ear canal resonances caused by the headphones. This paper proposes digital IIR filters to realize the required equalization and evaluates a real-time prototype ARA system. Measurements show that the throughput latency of the digital prototype ARA system can be less than 1.4 ms, which is sufficiently small in practice. When the direct and processed sounds are combined in the ear, a comb filtering effect is brought about and appears as notches in the frequency response. The comb filter effect in speech and music signals was studied in a listening test and it was found to be inaudible when the attenuation is 20 dB. Insert ARA headphones have a sufficient attenuation at frequencies above about 1 kHz. The proposed digital ARA system enables several immersive audio applications, such as a virtual audio tourist guide and audio teleconferencing.
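
    The comb filtering mentioned in this abstract can be illustrated with a simple two-path model, in which a sound and a delayed copy of it at relative gain g are summed, giving the response H(f) = 1 + g·exp(-j2πfτ). This model is an assumption made for illustration and is not the paper's measurement procedure; the function names below are hypothetical, and only the 1.4 ms latency and 20 dB attenuation figures come from the source.

```python
import math

def comb_notch_frequencies(delay_s, count=3):
    """First `count` notch frequencies (Hz) of 1 + g*exp(-j*2*pi*f*delay).

    Notches fall where the two paths are half a period out of phase.
    """
    return [(2 * k + 1) / (2.0 * delay_s) for k in range(count)]

def comb_ripple_db(attenuation_db):
    """Peak-to-notch ripple (dB) when one path is `attenuation_db` below the other."""
    if attenuation_db <= 0:
        raise ValueError("attenuation must be positive for a finite ripple")
    g = 10.0 ** (-attenuation_db / 20.0)
    # Peak magnitude is 1 + g, notch magnitude is 1 - g.
    return 20.0 * math.log10((1.0 + g) / (1.0 - g))

notches = comb_notch_frequencies(1.4e-3)
ripple = comb_ripple_db(20.0)
```

    Under this model a 1.4 ms delay puts the first notch near 357 Hz, and a 20 dB level difference between the two paths leaves a ripple of about 1.7 dB, which is in line with the abstract's finding that the effect is inaudible at that attenuation.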

  7. Haptic and Audio Interaction Design

    DEFF Research Database (Denmark)

    This book constitutes the refereed proceedings of the 5th International Workshop on Haptic and Audio Interaction Design, HAID 2010 held in Copenhagen, Denmark, in September 2010. The 21 revised full papers presented were carefully reviewed and selected for inclusion in the book. The papers are or...

  8. The Audio-Visual Man.

    Science.gov (United States)

    Babin, Pierre, Ed.

    A series of twelve essays discuss the use of audiovisuals in religious education. The essays are divided into three sections: one which draws on the ideas of Marshall McLuhan and other educators to explore the newest ideas about audiovisual language and faith, one that describes how to learn and use the new language of audio and visual images, and…

  9. Audio-Visual Aids: Historians in Blunderland.

    Science.gov (United States)

    Decarie, Graeme

    1988-01-01

    A history professor relates his experiences producing and using audio-visual material and warns teachers not to rely on audio-visual aids for classroom presentations. Includes examples of popular audio-visual aids on Canada that communicate unintended, inaccurate, or unclear ideas. Urges teachers to exercise caution in the selection and use of…

  10. [Audio-visual aids and tropical medicine].

    Science.gov (United States)

    Morand, J J

    1989-01-01

    The author presents a list of the audio-visual productions about Tropical Medicine, as well as of their main characteristics. He thinks that the audio-visual educational productions are often dissociated from their promotion; therefore, he invites the future creator to forward his work to the Audio-Visual Health Committee.

  11. Spatial audio quality perception (part 1)

    DEFF Research Database (Denmark)

    Conetta, R.; Brookes, T.; Rumsey, F.;

    2015-01-01

    Spatial audio processes (SAPs) commonly encountered in consumer audio reproduction systems are known to produce a range of impairments to spatial quality. By way of two listening tests, this paper investigated the degree of degradation of the spatial quality of six 5-channel audio recordings resu...

  12. Development of Sensitivity to Spacing Versus Feature Changes in Pictures of Houses: Evidence for Slow Development of a General Spacing Detection Mechanism?

    Science.gov (United States)

    Robbins, Rachel A.; Shergill, Yaadwinder; Maurer, Daphne; Lewis, Terri L.

    2011-01-01

    Adults are expert at recognizing faces, in part because of exquisite sensitivity to the spacing of facial features. Children are poorer than adults at recognizing facial identity and less sensitive to spacing differences. Here we examined the specificity of the immaturity by comparing the ability of 8-year-olds, 14-year-olds, and adults to…

  13. Diffraction of SH-waves by topographic features in a layered transversely isotropic half-space

    Science.gov (United States)

    Ba, Zhenning; Liang, Jianwen; Zhang, Yanju

    2017-01-01

    The scattering of plane SH-waves by topographic features in a layered transversely isotropic (TI) half-space is investigated by using an indirect boundary element method (IBEM). Firstly, the anti-plane dynamic stiffness matrix of the layered TI half-space is established and the free fields are solved by using the direct stiffness method. Then, Green's functions are derived for uniformly distributed loads acting on an inclined line in a layered TI half-space and the scattered fields are constructed with the deduced Green's functions. Finally, the free fields are added to the scattered ones to obtain the global dynamic responses. The method is verified by comparing results with the published isotropic ones. Both the steady-state and transient dynamic responses are evaluated and discussed. Numerical results in the frequency domain show that surface motions for the TI media can be significantly different from those for the isotropic case, which are strongly dependent on the anisotropy property, incident angle and incident frequency. Results in the time domain show that the material anisotropy has important effects on the maximum duration and maximum amplitudes of the time histories.

  14. Using Touch Screen Audio-CASI to Obtain Data on Sensitive Topics.

    Science.gov (United States)

    Cooley, Philip C; Rogers, Susan M; Turner, Charles F; Al-Tayyib, Alia A; Willis, Gordon; Ganapathi, Laxminarayana

    2001-05-01

    This paper describes a new interview data collection system that uses a laptop personal computer equipped with a touch-sensitive video monitor. The touch-screen-based audio computer-assisted self-interviewing system, or touch screen audio-CASI, enhances the ease of use of conventional audio CASI systems while simultaneously providing the privacy of self-administered questionnaires. We describe touch screen audio-CASI design features and operational characteristics. In addition, we present data from a recent clinic-based experiment indicating that the touch audio-CASI system is stable, robust, and suitable for administering relatively long and complex questionnaires on sensitive topics, including drug use and sexual behaviors associated with HIV and other sexually transmitted diseases.

  15. Design of an Audio Interface for Patmos

    OpenAIRE

    Ausin, Daniel Sanz; Goerge, Fabian

    2017-01-01

    This paper describes the design and implementation of an audio interface for the Patmos processor, which runs on an Altera DE2-115 FPGA board. This board includes an audio codec, the WM8731. The interface described in this work makes it possible to receive audio from and send audio to the WM8731, and to synthesize, store or manipulate audio signals by writing C programs for Patmos. The audio interface described in this paper is intended to be used with the Patmos processor. Patmos is an open source RISC ...

  16. Space weathering effects in Diviner Lunar Radiometer multispectral infrared measurements of the lunar Christiansen Feature: Characteristics and mitigation

    Science.gov (United States)

    Lucey, Paul G.; Greenhagen, Benjamin T.; Song, Eugenie; Arnold, Jessica A.; Lemelin, Myriam; Hanna, Kerri Donaldson; Bowles, Neil E.; Glotch, Timothy D.; Paige, David A.

    2017-02-01

    Multispectral infrared measurements by the Diviner Lunar Radiometer Experiment on the Lunar Reconnaissance Orbiter enable the characterization of the position of the Christiansen Feature, a thermal infrared spectral feature that laboratory work has shown is proportional to the bulk silica content of lunar surface materials. Diviner measurements show that the position of this feature is also influenced by the changes in optical and physical properties of the lunar surface with exposure to space, the process known as space weathering. Large rayed craters and lunar swirls show corresponding Christiansen Feature anomalies. The space weathering effect is likely due to differences in thermal gradients in the optical surface imposed by the space weathering control of albedo. However, inspected at high resolution, locations with extreme compositions and Christiansen Feature wavelength positions (silica-rich and olivine-rich areas) do not have extreme albedos, and fall off the albedo-Christiansen Feature wavelength position trend occupied by most of the Moon. These areas demonstrate that the Christiansen Feature wavelength position contains compositional information and is not solely dictated by albedo. An optical maturity parameter derived from near-IR measurements is used to partly correct Diviner data for space weathering influences.

  17. Two-dimensional audio watermark for MPEG AAC audio

    Science.gov (United States)

    Tachibana, Ryuki

    2004-06-01

    Since digital music is often stored in a compressed file, it is desirable that an audio watermarking method in a content management system handles compressed files. Using an audio watermarking method that directly manipulates compressed files makes it unnecessary to decompress the files before embedding or detection, so more files can be processed per unit time. However, it is difficult to detect a watermark in a compressed file that has been compressed after the file was watermarked. This paper proposes an MPEG Advanced Audio Coding (AAC) bitstream watermarking method using a two-dimensional pseudo-random array. Detection is done by correlating the absolute values of the recovered MDCT coefficients and the pseudo-random array. Since the embedding algorithm uses the same pseudo-random values for two adjacent overlapping frames and the detection algorithm selects the better frame in the two by comparing detected watermark strengths, it is possible to detect a watermark from a compressed file that was compressed after the watermark was embedded in the original uncompressed file. Though the watermark is not detected as clearly in this case, the watermark can still be detected even when the watermark was embedded in a compressed file and the file was then decompressed, trimmed, and compressed again.
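    The correlation-based detection mentioned in this abstract can be illustrated with a toy sketch. It is not the paper's AAC bitstream method: the multiplicative embedding rule, the strength `alpha`, the array size and the function names are all assumptions made for illustration. Only the idea of correlating coefficient magnitudes with a two-dimensional pseudo-random array comes from the abstract.

```python
import random


def make_pattern(rows, cols, seed):
    """Two-dimensional pseudo-random array of +1/-1 values (seeded, so the detector can rebuild it)."""
    rng = random.Random(seed)
    return [[rng.choice((-1, 1)) for _ in range(cols)] for _ in range(rows)]


def embed(coeffs, pattern, alpha=0.2):
    """Hypothetical embedding: scale each coefficient's magnitude up or down by alpha."""
    return [
        [c * (1.0 + alpha * p) for c, p in zip(c_row, p_row)]
        for c_row, p_row in zip(coeffs, pattern)
    ]


def detect(coeffs, pattern):
    """Correlate mean-removed coefficient magnitudes with the pseudo-random array."""
    mags = [abs(c) for row in coeffs for c in row]
    signs = [p for row in pattern for p in row]
    mean = sum(mags) / len(mags)
    return sum((m - mean) * p for m, p in zip(mags, signs)) / len(mags)


# Stand-in "MDCT coefficients": 32 frames x 32 bands of random signed values.
rng = random.Random(1)
host = [[rng.choice((-1, 1)) * rng.uniform(0.5, 1.5) for _ in range(32)] for _ in range(32)]
pattern = make_pattern(32, 32, seed=7)

marked_score = detect(embed(host, pattern), pattern)
clean_score = detect(host, pattern)
```

    In this sketch the score for the marked coefficients is close to `alpha` times the mean magnitude, while the score for the unmarked coefficients stays near zero, so a threshold between the two separates them.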

  18. Adaptive Quantization Index Modulation Audio Watermarking based on Fuzzy Inference System

    Directory of Open Access Journals (Sweden)

    Sunita V. Dhavale

    2014-02-01

    Many of the adaptive watermarking schemes reported in the literature consider only local audio signal properties. Many schemes require complex computation along with manual parameter settings. In this paper, we propose a novel, fuzzy, adaptive audio watermarking algorithm based on both global and local audio signal properties. The algorithm performs well for a dynamic range of audio signals without requiring manual initial parameter selection. Here, the mean value of energy (MVE) and the variance of spectral flux (VSF) of a given audio signal constitute the global components, while the energy of each audio frame acts as the local component. The Quantization Index Modulation (QIM) step size Δ is made adaptive to both the global and local features. The global component automates the initial selection of Δ using the fuzzy inference system, while the local component controls its variation based on the energy of the individual audio frame. Hence Δ adaptively controls the strength of the watermark to meet both the robustness and inaudibility requirements, making the system independent of the nature of the audio. Experimental results reveal that our adaptive scheme outperforms other fixed-step-size QIM schemes and adaptive schemes and is highly robust against general attacks.
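    For readers unfamiliar with QIM, the role of the step size Δ can be seen in a minimal sketch of standard binary scalar QIM. This is the textbook scheme with a fixed Δ, not the paper's fuzzy adaptive selection of Δ, which the abstract does not specify; the function names and sample values are illustrative assumptions.

```python
def qim_embed(x, bit, delta):
    """Quantize x onto the lattice for `bit`; the two lattices are offset by delta / 2."""
    offset = bit * delta / 2.0
    return delta * round((x - offset) / delta) + offset


def qim_detect(y, delta):
    """Return the bit whose lattice lies closest to the received value y."""
    d0 = abs(y - qim_embed(y, 0, delta))
    d1 = abs(y - qim_embed(y, 1, delta))
    return 0 if d0 <= d1 else 1


# Illustrative values only: four host samples, four watermark bits, fixed step size.
samples = [0.37, -0.12, 0.05, 0.91]
bits = [0, 1, 1, 0]
delta = 0.1
marked = [qim_embed(x, b, delta) for x, b in zip(samples, bits)]
recovered = [qim_detect(y + 0.02, delta) for y in marked]  # 0.02 of added noise
```

    A larger Δ tolerates more noise (up to about Δ/4 per sample) but changes each sample by up to Δ/2, which is the robustness versus inaudibility trade-off that an adaptive Δ is meant to balance.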

  19. AudioRegent: Exploiting SimpleADL and SoX for Digital Audio Delivery

    Directory of Open Access Journals (Sweden)

    Nitin Arora

    2010-06-01

    AudioRegent is a command-line Python script currently being used by the University of Alabama Libraries’ Digital Services to create web-deliverable MP3s from regions within archival audio files. In conjunction with a small-footprint XML file called SimpleADL and SoX, an open-source command-line audio editor, AudioRegent batch processes archival audio files, allowing for one or many user-defined regions, particular to each audio file, to be extracted with additional audio processing in a transparent manner that leaves the archival audio file unaltered. Doing so has alleviated many of the tensions of cumbersome workflows, complicated documentation, preservation concerns, and reliance on expensive closed-source GUI audio applications.

  20. Haptic and Audio Interaction Design

    DEFF Research Database (Denmark)

    This book constitutes the refereed proceedings of the 5th International Workshop on Haptic and Audio Interaction Design, HAID 2010, held in Copenhagen, Denmark, in September 2010. The 21 revised full papers presented were carefully reviewed and selected for inclusion in the book. The papers are organized in topical sections on multimodal integration, tactile and sonic explorations, walking and navigation interfaces, prototype design and evaluation, and gestures and emotions.

  1. Joint spatial-spectral feature space clustering for speech activity detection from ECoG signals.

    Science.gov (United States)

    Kanas, Vasileios G; Mporas, Iosif; Benz, Heather L; Sgarbas, Kyriakos N; Bezerianos, Anastasios; Crone, Nathan E

    2014-04-01

    Brain-machine interfaces for speech restoration have been extensively studied for more than two decades. The success of such a system will depend in part on selecting the best brain recording sites and signal features corresponding to speech production. The purpose of this study was to detect speech activity automatically from electrocorticographic signals based on joint spatial-frequency clustering of the ECoG feature space. For this study, the ECoG signals were recorded while a subject performed two different syllable repetition tasks. We found that the optimal frequency resolution to detect speech activity from ECoG signals was 8 Hz, achieving 98.8% accuracy by employing support vector machines as a classifier. We also defined the cortical areas that held the most information about the discrimination of speech and nonspeech time intervals. Additionally, the results shed light on the distinct cortical areas associated with the two syllable repetition tasks and may contribute to the development of portable ECoG-based communication.

  2. [The new method monitoring crop water content based on NIR-Red spectrum feature space].

    Science.gov (United States)

    Cheng, Xiao-juan; Xu, Xin-gang; Chen, Tian-en; Yang, Gui-jun; Li, Zhen-hai

    2014-06-01

    Moisture content is an important index of crop water stress, and timely and effective monitoring of crop water content is of great significance for evaluating crop water deficit balance and guiding agricultural irrigation. This paper builds a new crop water index for the vegetation water content of winter wheat based on the NIR-Red spectral space. First, narrow-band canopy spectra of winter wheat were resampled according to the relative spectral response functions of HJ-CCD and ZY-3. Then, a new index (PWI) was set up to estimate the vegetation water content of winter wheat by improving the PDI (perpendicular drought index) and PVI (perpendicular vegetation index) based on the NIR-Red spectral feature space. The results showed that the relationship between PWI and VWC (vegetation water content) was stable based on simulation of the wide-band multispectral data of HJ-CCD and ZY-3, with R2 being 0.684 and 0.683, respectively. VWC was then estimated using PWI, with R2 being 0.764 and 0.764 and RMSE being 3.837% and 3.840%, respectively. The results indicate that PWI is, to a certain extent, feasible for estimating crop water content. At the same time, it provides a new method for monitoring crop water content using the remote sensing data of HJ-CCD and ZY-3.

  3. Local Control of Audio Environment: A Review of Methods and Applications

    Directory of Open Access Journals (Sweden)

    Jussi Kuutti

    2014-02-01

    The concept of a local audio environment is to have sound playback locally restricted such that, ideally, adjacent regions of an indoor or outdoor space could exhibit their own individual audio content without interfering with each other. This would enable people to listen to their content of choice without disturbing others next to them, yet without any headphones to block conversation. In practice, perfect sound containment in free air cannot be attained, but a local audio environment can still be satisfactorily approximated using directional speakers. Directional speakers may be based on regular audible frequencies or they may employ modulated ultrasound. Planar, parabolic, and array form factors are commonly used. The directivity of a speaker improves as its surface area and sound frequency increase, making these the main design factors for directional audio systems. Even directional speakers radiate some sound outside the main beam, and sound can also reflect from objects. Therefore, directional speaker systems perform best when there is enough ambient noise to mask the leaking sound. Possible areas of application for local audio include information and advertisement audio feed in commercial facilities, guiding and narration in museums and exhibitions, office space personalization, control room messaging, rehabilitation environments, and entertainment audio systems.

  4. SNR-adaptive stream weighting for audio-MES ASR.

    Science.gov (United States)

    Lee, Ki-Seung

    2008-08-01

    Myoelectric signals (MESs) from the speaker's mouth region have been successfully shown to improve the noise robustness of automatic speech recognizers (ASRs), thus promising to extend their usability in implementing noise-robust ASR. In the recognition system presented herein, extracted audio and facial MES features were integrated by a decision fusion method, where the likelihood score of the audio-MES observation vector was given by a linear combination of class-conditional observation log-likelihoods of two classifiers, using appropriate weights. We developed a weighting process adaptive to SNRs. The main objective of the paper involves determining the optimal SNR classification boundaries and constructing a set of optimum stream weights for each SNR class. These two parameters were determined by a method based on a maximum mutual information criterion. Acoustic and facial MES data were collected from five subjects, using a 60-word vocabulary. Four types of acoustic noise, including babble, car, aircraft, and white noise, were acoustically added to clean speech signals with SNR ranging from -14 to 31 dB. The classification accuracy of the audio ASR was as low as 25.5%, whereas the classification accuracy of the MES ASR was 85.2%. The classification accuracy could be further improved by employing the proposed audio-MES weighting method, reaching as high as 89.4% in the case of babble noise. A similar result was also found for the other types of noise.
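    The decision fusion described in this abstract can be sketched as follows. The SNR class boundaries, the per-class weights and the constraint that the two stream weights sum to one are all hypothetical; the paper derives its boundaries and weights with a maximum mutual information criterion, which is not reproduced here.

```python
from bisect import bisect_right

# Hypothetical values: two boundaries split the SNR axis into three classes,
# and each class has its own audio stream weight (the MES weight is 1 - w).
SNR_BOUNDARIES_DB = [0.0, 15.0]
AUDIO_WEIGHTS = [0.1, 0.5, 0.8]


def audio_weight(snr_db):
    """Look up the audio stream weight for the SNR class containing snr_db."""
    return AUDIO_WEIGHTS[bisect_right(SNR_BOUNDARIES_DB, snr_db)]


def fuse(audio_loglik, mes_loglik, snr_db):
    """Linear combination of the two classifiers' class-conditional log-likelihoods."""
    w = audio_weight(snr_db)
    return {c: w * audio_loglik[c] + (1.0 - w) * mes_loglik[c] for c in audio_loglik}


def recognize(audio_loglik, mes_loglik, snr_db):
    """Pick the class with the highest fused score."""
    scores = fuse(audio_loglik, mes_loglik, snr_db)
    return max(scores, key=scores.get)


# Toy two-word example in which the two streams disagree.
audio = {"yes": -2.0, "no": -1.0}
mes = {"yes": -1.0, "no": -3.0}
noisy_choice = recognize(audio, mes, snr_db=-10.0)  # MES stream dominates
clean_choice = recognize(audio, mes, snr_db=30.0)   # audio stream dominates
```

    With these toy numbers the fused decision follows the MES classifier at low SNR and the audio classifier at high SNR, which is the behaviour an SNR-adaptive weighting is intended to produce.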

  5. Video genre categorization and representation using audio-visual information

    Science.gov (United States)

    Ionescu, Bogdan; Seyerlehner, Klaus; Rasche, Christoph; Vertan, Constantin; Lambert, Patrick

    2012-04-01

    We propose an audio-visual approach to video genre classification using content descriptors that exploit audio, color, temporal, and contour information. Audio information is extracted at block-level, which has the advantage of capturing local temporal information. At the temporal structure level, we consider action content in relation to human perception. Color perception is quantified using statistics of color distribution, elementary hues, color properties, and relationships between colors. Further, we compute statistics of contour geometry and relationships. The main contribution of our work lies in harnessing the descriptive power of the combination of these descriptors in genre classification. Validation was carried out on over 91 h of video footage encompassing 7 common video genres, yielding average precision and recall ratios of 87% to 100% and 77% to 100%, respectively, and an overall average correct classification of up to 97%. Also, experimental comparison as part of the MediaEval 2011 benchmarking campaign demonstrated the efficiency of the proposed audio-visual descriptors over other existing approaches. Finally, we discuss a 3-D video browsing platform that displays movies using feature-based coordinates and thus regroups them according to genre.

  6. Review of AVS Audio Coding Standard

    Institute of Scientific and Technical Information of China (English)

    ZHANG Tao; ZHANG Caixia; ZHAO Xin

    2016-01-01

    Audio Video Coding Standard (AVS) is a second-generation source coding standard and the first standard for audio and video coding in China with independent intellectual property rights. Its performance has reached the international standard. Its coding efficiency is 2 to 3 times greater than that of MPEG-2. This technical solution is simpler, and it can greatly save channel resources. After more than ten years' development, AVS has achieved great success. The latest version of the AVS audio coding standard is ongoing and mainly aims at the increasing demand for low bitrate and high quality audio services. The paper reviews the history and recent development of the AVS audio coding standard in terms of basic features, key techniques and performance. Finally, the future development of the AVS audio coding standard is discussed.

  7. Audio-magnetotelluric methods in reconnaissance geothermal exploration

    Science.gov (United States)

    Hoover, D.B.; Long, C.L.

    1976-01-01

    An audio-magnetotelluric (AMT) system has been developed by the U.S. Geological Survey for low-cost reconnaissance exploration of geothermal regions. This is an electromagnetic sounding technique in which the scalar or Cagniard resistivity is computed at 12 frequencies logarithmically spaced from 7.5 to 18 600 Hz. Our system uses natural source fields except at the two upper frequencies of 10 200

  8. Distortion Estimation in Compressed Music Using Only Audio Fingerprints

    NARCIS (Netherlands)

    Doets, P.J.O.; Lagendijk, R.L.

    2008-01-01

    An audio fingerprint is a compact yet very robust representation of the perceptually relevant parts of an audio signal. It can be used for content-based audio identification, even when the audio is severely distorted. Audio compression changes the fingerprint slightly. We show that these small finge

  9. [The new method monitoring agricultural drought based on SWIR-Red spectrum feature space].

    Science.gov (United States)

    Feng, Hai-Xia; Qin, Qi-Ming; Li, Bin-Yong; Liu, Fang; Jiang, Hong-Bo; Dong, Heng; Wang, Jin-Liang; Liu, Ming-Chao; Zhang, Ning

    2011-11-01

    Drought is a chronic natural disaster, and remote sensing drought monitoring has become a promising research field. In the present study, short-wave infrared (SWIR) and red bands, which are sensitive to moisture variation, were selected to monitor farmland drought conditions by analyzing the spectral characteristics of vegetation and soil. The goal of this paper was to provide a new method of drought monitoring, the normalized drought monitoring index (NPDI), based on a newly constructed spectrum feature space formed by the difference of SWIR and Red and the sum of SWIR and Red. The NPDI model was verified against field-surveyed soil moisture, and the results showed that the NPDI and MPDI models could effectively monitor agricultural drought and were highly correlated with soil moisture. The R2 values were 0.583 and 0.438 with soil water at 10 cm. The monitoring effect of the NPDI model was better than that of the MPDI. This model is a further improvement on PDI and MPDI, and it can monitor drought conditions under different vegetation coverage and throughout the whole growing season. It has high application potential and popularization value.

  10. Optimization of audio - ultrasonic plasma system parameters

    Science.gov (United States)

    Haleem, N. A.; Abdelrahman, M. M.; Ragheb, M. S.

    2016-10-01

    The present plasma is a special glow plasma type generated by an audio ultrasonic discharge voltage. A definite discharge frequency using a gas at a narrow band pressure creates and stabilizes this plasma type. The plasma cell is a self-extracted ion beam; it features high output intensity and small size. The influence of the plasma column length on the output beam due to the variation of both the audio discharge frequency and the power applied to the plasma electrodes is investigated. Accordingly, the aim of the present work is to identify the parameters that influence the self-extracted collected ion beam and to optimize the conditions that enhance it. The experimental parameters studied are the nitrogen gas, the applied frequency from 10 to 100 kHz, the plasma length from 8 to 14 cm at a gas pressure of ≈ 0.25 Torr, and the discharge power from 50 to 500 W. A 5-micrometer polyethylene sheet covers the collector electrode in order to determine how many ions from the beam can pass through the polymer and reach the collector. To diagnose the effects of the beam on the collector, the polymer used is analyzed by means of the FTIR and XRF techniques. Optimization of the plasma cell parameters succeeded in enhancing the output ion beam and identifying the parameters that influence it, and showed that the particles reaching the collector are multi-energetic.

  11. On the comparison of audio fingerprints for extracting quality parameters of compressed audio

    NARCIS (Netherlands)

    Doets, P.J.O.; Menot Gisbert, M.; Lagendijk, R.L.

    2006-01-01

    Audio fingerprints can be seen as hashes of the perceptual content of an audio excerpt. Applications include linking metadata to unlabeled audio, watermark support, and broadcast monitoring. Existing systems identify a song by comparing its fingerprint to pre-computed fingerprints in a database. Sma

  12. Audio-visual Feature Fusion Person Identification Based on SVM and Score Normalization

    Institute of Scientific and Technical Information of China (English)

    丁辉; 安今朝

    2012-01-01

    To address the low recognition rates of face recognition and speaker recognition under severe noise conditions, this paper presents a novel model for fused recognition of face and speech features, building on feature-level fusion theory combined with normalization and SVM theory. First, face features are extracted using the discrete cosine transform and the locality preserving projection algorithm, and speech features are extracted correspondingly. The two are then fused at the feature level to obtain the fusion feature. After the distance between the test identity and the template is calculated, the matching distance is normalized to reduce the computational load and improve recognition accuracy. Finally, the normalized matching distance is fed into an SVM to obtain the recognition result. Experiments show that the fusion system performs well in both response time and accuracy, especially against a noisy background: as the signal-to-noise ratio decreases, the fused recognition rate is clearly higher than that of either single system.

  13. Infrared Space Observatory Spectra of R Coronae Borealis Stars; 1, Emission Features in the Interval 3 - 25 microns

    CERN Document Server

    Lambert, David L.; Pandey, Gajendra; Ivans, Inese I.

    2001-01-01

    Infrared Space Observatory 3 - 25 $\mu$m spectra of the R Coronae Borealis stars V854 Cen, R CrB, and RY Sgr are presented and discussed. Sharp emission features coincident in wavelength with the well-known Unidentified Emission Features are present in the spectrum of V854 Cen but not of R CrB or RY Sgr. Since V854 Cen is not particularly H-poor and has 1000 times more H than the other stars, the emission features are probably from a carrier containing hydrogen. There is a correspondence between the features and emission from laboratory samples of hydrogenated amorphous carbon. A search for C$_{60}$ in emission or absorption proved negative. Amorphous carbon particles account for the broad emission features seen between 6 - 14 $\mu$m in the spectrum of each star.

  14. Functionality of system components: Conservation of protein function in protein feature space

    DEFF Research Database (Denmark)

    Jensen, Lars Juhl; Ussery, David; Brunak, Søren

    2003-01-01

    Many protein features useful for prediction of protein function can be predicted from sequence, including posttranslational modifications, subcellular localization, and physical/chemical properties. We show here that such protein features are more conserved among orthologs than paralogs, indicati...

  15. Vascular lesions of the lumbar epidural space: magnetic resonance imaging features of epidural cavernous hemangioma and epidural hematoma

    Directory of Open Access Journals (Sweden)

    Basile Júnior Roberto

    1999-01-01

    The authors report the magnetic resonance imaging diagnostic features in two cases, one of lumbar epidural hematoma and one of cavernous hemangioma of the lumbar epidural space. Enhanced T1-weighted MRI scans showed a hyperintense signal rim surrounding the vascular lesion. Non-enhanced T2-weighted scans showed a hyperintense signal.

  16. Presence and the utility of audio spatialization

    DEFF Research Database (Denmark)

    Bormann, Karsten

    2005-01-01

    or not, while the presence questionnaire used by Slater and coworkers (see Tromp et al., 1998) was more sensitive to whether audio was fully spatialized or not. Finally, having the sound source active positively impacts the assessment of the audio while negatively impacting subjects' assessment...

  17. A listening test system for automotive audio

    DEFF Research Database (Denmark)

    Christensen, Flemming; Geoff, Martin; Minnaar, Pauli;

    2005-01-01

    This paper describes a system for simulating automotive audio through headphones for the purposes of conducting listening experiments in the laboratory. The system is based on binaural technology and consists of a component for reproducing the sound of the audio system itself and a component...

  18. Audio-Visual Aids in Universities

    Science.gov (United States)

    Douglas, Jackie

    1970-01-01

    A report on the proceedings and ideas expressed at a one day seminar on "Audio-Visual Equipment--Its Uses and Applications for Teaching and Research in Universities." The seminar was organized by England's National Committee for Audio-Visual Aids in Education in conjunction with the British Universities Film Council. (LS)

  19. Digital Advances in Contemporary Audio Production.

    Science.gov (United States)

    Shields, Steven O.

    Noting that a revolution in sonic high fidelity occurred during the 1980s as digital-based audio production methods began to replace traditional analog modes, this paper offers both an overview of digital audio theory and descriptions of some of the related digital production technologies that have begun to emerge from the mating of the computer…

  20. “Wrapping” X3DOM around Web Audio API

    Directory of Open Access Journals (Sweden)

    Andreas Stamoulias

    2015-12-01

    Spatial sound has a conceptual role in Web3D environments because of the highly realistic scenes it can provide. Lately, efforts have concentrated on extending X3D/X3DOM with spatial sound attributes. This paper presents a novel method for introducing spatial sound components into the X3DOM framework, based on the X3D specification and the Web Audio API. The proposed method introduces enhanced sound nodes for X3DOM, derived by implementing the X3D standard components and enriched with additional features of the Web Audio API. Moreover, several example scenarios were developed to evaluate our approach. The implemented examples established the feasibility of the newly registered nodes in X3DOM for spatial sound characteristics in Web3D virtual worlds.

  1. A Robust Zero-Watermarking Algorithm for Audio

    Directory of Open Access Journals (Sweden)

    Jie Zhu

    2008-03-01

    In traditional watermarking algorithms, the insertion of a watermark into the host signal inevitably introduces some perceptible quality degradation. Another problem is the inherent conflict between imperceptibility and robustness. The zero-watermarking technique can solve these problems successfully. Instead of embedding a watermark, the zero-watermarking technique extracts some essential characteristics from the host signal and uses them for watermark detection. However, most of the available zero-watermarking schemes are designed for still images and their robustness is not satisfactory. In this paper, an efficient and robust zero-watermarking technique for audio signals is presented. The multiresolution characteristic of the discrete wavelet transform (DWT), the energy compression characteristic of the discrete cosine transform (DCT), and the Gaussian noise suppression property of higher-order cumulants are combined to extract essential features from the host audio signal, which are then used for watermark recovery. Simulation results demonstrate the effectiveness of our scheme in terms of inaudibility, detection reliability, and robustness.

  2. Audio Sensing Aid based Wireless Microphone Emulation Attacks Detection

    Directory of Open Access Journals (Sweden)

    Wang Shan-shan

    2013-10-01

    The wireless microphone network is an important PU network for CRN, but there is no effective technology to solve the problem of microphone emulation attacks. Therefore, this paper proposes the ASA algorithm, which uses three devices to detect MUs: the loudspeaker audio sensor (LAS), the environment audio sensor (EAS), and the radio frequency fingerprint detector (RFFD). LASs are installed near loudspeakers and serve two main purposes: one is to sense the loudspeakers' output, and the other is to broadcast warning information to all SUs through the common control channel when valid output is detected. EASs are pocket voice capture devices provided to SUs and are used to sense loudspeaker sound at the SU's location. Using EASs and energy detection at the SUs allows primary user emulation attacks (PUEA) to be detected quickly. However, to acquire information about the attacked channels, RFFDs are needed to analyze the features of PU transmitters. The results show that the proposed algorithm can detect PUEA well.

  3. Stego-audio Using Genetic Algorithm Approach

    Directory of Open Access Journals (Sweden)

    V. Santhi

    2014-06-01

    With the rapid development of digital multimedia applications, secure data transmission has become the main issue in data communication systems. Multimedia data hiding techniques have therefore been developed to ensure secure data transfer. Steganography is the art of hiding a secret message within an image/audio/video file in such a way that the secret message cannot be perceived by a hacker/intruder. In this study, we use the RSA encryption algorithm to encrypt the message and a Genetic Algorithm (GA) to encode the message in the audio file. This study presents a method to access the negative audio bytes and include them in the message encoding and position embedding process. This increases the capacity for encoding the message in the audio file. The use of GA operators reduces the noise distortions.

  4. Exploiting Acoustic Similarity of Propagating Paths for Audio Signal Separation

    Directory of Open Access Journals (Sweden)

    Yin Bin

    2003-01-01

    Blind signal separation can easily find its position in audio applications where mutually independent sources need to be separated from their microphone mixtures while both room acoustics and sources are unknown. However, the conventional separation algorithms can hardly be implemented in real time due to the high computational complexity. The computational load is mainly caused by either direct or indirect estimation of thousands of acoustic parameters. Aiming at the complexity reduction, in this paper, the acoustic paths are investigated through an acoustic similarity index (ASI). Then a new mixing model is proposed. With closely spaced microphones (5–10 cm apart), the model relieves the computational load of the separation algorithm by reducing the number and length of the filters to be adjusted. To cope with real situations, a blind audio signal separation algorithm (BLASS) is developed on the proposed model. BLASS only uses the second-order statistics (SOS) and performs efficiently in the frequency domain.

  5. Multiple-output support vector machine regression with feature selection for arousal/valence space emotion assessment.

    Science.gov (United States)

    Torres-Valencia, Cristian A; Álvarez, Mauricio A; Orozco-Gutiérrez, Alvaro A

    2014-01-01

    Human emotion recognition (HER) allows the assessment of an affective state of a subject. Until recently, such emotional states were described in terms of discrete emotions, like happiness or contempt. In order to cover a wide range of emotions, researchers in the field have introduced different dimensional spaces for emotion description that allow the characterization of affective states in terms of several variables or dimensions that measure distinct aspects of the emotion. One of the most common of such dimensional spaces is the bidimensional Arousal/Valence space. To the best of our knowledge, all HER systems so far have modelled the dimensions in these dimensional spaces independently. In this paper, we study the effect of modelling the output dimensions simultaneously and show experimentally the advantages of modelling them in this way. We consider a multimodal approach by including features from the Electroencephalogram and a few physiological signals. For modelling the multiple outputs, we employ a multiple output regressor based on support vector machines. We also include a stage of feature selection that is developed within an embedded approach known as Recursive Feature Elimination (RFE), proposed initially for SVM. The results show that several features can be eliminated using the multiple output support vector regressor with RFE without affecting the performance of the regressor. From the analysis of the features selected in smaller subsets via RFE, it can be observed that the signals that are most informative for discrimination in the arousal and valence space are the EEG, Electrooculogram/Electromyogram (EOG/EMG) and the Galvanic Skin Response (GSR).

  6. The HDTV digital audio matrix

    Science.gov (United States)

    Mason, A. J.

    Multichannel sound systems are being studied as part of the Eureka 95 and Radio-communication Bureau TG10-1 investigations into high definition television. One emerging sound system has five channels; three at the front and two at the back. This raises some compatibility issues. The listener might have only, say, two loudspeakers or the material to be broadcast may have fewer than five channels. The problem is how best to produce a set of signals to be broadcast, which is suitable for all listeners, from those that are available. To investigate this area, a device has been designed and built which has six input channels and six output channels. Each output signal is a linear combination of the input signals. The inputs and outputs are in AES/EBU digital audio format using BBC-designed AESIC chips. The matrix operation, to produce the six outputs from the six inputs, is performed by a Motorola DSP56001. The user interface and 'housekeeping' is managed by a T222 transputer. The operator of the matrix uses a VDU to enter sets of coefficients and a rotary switch to select which set to use. A set of analog controls is also available and is used to control operations other than the simple compatibility matrixing. The matrix has been very useful for simple tasks: mixing a stereo signal into mono, creating a stereo signal from a mono signal, applying a fixed gain or attenuation to a signal, exchanging the A and B channels of an AES/EBU bitstream, and so on. These are readily achieved using simple sets of coefficients. Additions to the user interface software have led to several more sophisticated applications which still consist of a matrix operation. Different multichannel panning laws have been evaluated. The analog controls adjust the panning; the audio signals are processed digitally using a matrix operation. A digital SoundField microphone decoder has also been implemented. digital audio matrix is such that it can be applied to a wide variety of signal processing
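    The matrix operation described above, in which each output signal is a linear combination of the input signals, can be sketched in a few lines. This is only an illustration for a two-channel case: the function name and the coefficient sets below are assumptions chosen for clarity, not the coefficients or the DSP56001 code used in the device.

```python
def apply_matrix(coeffs, frame):
    """Return one output frame; each output is a weighted sum of the inputs.

    coeffs is a list of rows (one row per output channel), and frame holds
    one sample per input channel.
    """
    return [sum(c * x for c, x in zip(row, frame)) for row in coeffs]


# Hypothetical coefficient sets for the simple tasks named in the abstract.
STEREO_TO_MONO = [[0.5, 0.5]]            # average the A and B channels
MONO_TO_STEREO = [[1.0], [1.0]]          # copy mono to both outputs
SWAP_A_B = [[0.0, 1.0], [1.0, 0.0]]      # exchange the A and B channels
FIXED_GAIN = [[0.5, 0.0], [0.0, 0.5]]    # attenuate both channels by half

frame = [0.5, -0.25]                     # one stereo sample pair (A, B)
mono = apply_matrix(STEREO_TO_MONO, frame)
swapped = apply_matrix(SWAP_A_B, frame)
```

    Selecting a different task then amounts to selecting a different coefficient set, which mirrors the rotary-switch selection mentioned in the abstract.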

  7. C Implementation & comparison of companding & silence audio compression techniques

    CERN Document Server

    Dangarwala, Kruti

    2010-01-01

    Just about all the newest living room audio-video electronics and PC multimedia products being designed today will incorporate some form of compressed digitized-audio processing capability. Audio compression reduces the bit rate required to represent an analog audio signal while maintaining the perceived audio quality. Discarding inaudible data reduces the storage, transmission and compute requirements of handling high-quality audio files. This paper covers the wave audio file format and the algorithms of the silence compression and companding methods for compressing and decompressing wave audio files. It then compares the results of the two methods.
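    As a rough illustration of the two techniques compared in this record, the sketch below shows mu-law companding and a run-length form of silence compression. The abstract does not say which companding law or silence threshold the paper uses, so the constant MU, the threshold, the minimum run length and all function names are assumptions.

```python
import math

MU = 255.0  # assumption: the common mu-law constant, not taken from the paper


def compress(x, mu=MU):
    """Mu-law compress a sample in [-1, 1]; quiet samples are boosted."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)


def expand(y, mu=MU):
    """Invert compress(): recover the original sample from its companded value."""
    return math.copysign(math.expm1(abs(y) * math.log1p(mu)) / mu, y)


def encode_silence(samples, threshold=0.01, min_run=3):
    """Replace runs of at least min_run near-silent samples with a token."""
    out, i = [], 0
    while i < len(samples):
        j = i
        while j < len(samples) and abs(samples[j]) < threshold:
            j += 1
        if j - i >= min_run:
            out.append(("silence", j - i))  # store only the run length
            i = j
        else:
            out.append(samples[i])          # loud sample or short quiet run
            i += 1
    return out


def decode_silence(tokens):
    """Expand silence tokens back into zero-valued samples (lossy)."""
    out = []
    for t in tokens:
        if isinstance(t, tuple):
            out.extend([0.0] * t[1])
        else:
            out.append(t)
    return out
```

    Companding keeps every sample but spends resolution where the ear is more sensitive, while silence compression removes samples altogether; the second is lossy here because near-silent values are restored as exact zeros.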

  8. Content-based audio authentication using a hierarchical patchwork watermark embedding

    Science.gov (United States)

    Gulbis, Michael; Müller, Erika

    2010-05-01

    Content-based audio authentication watermarking techniques extract perceptually relevant audio features, which are robustly embedded into the audio file to be protected. Manipulations of the audio file are detected on the basis of changes between the originally embedded feature information and the features newly extracted during verification. The main challenges of content-based watermarking are, on the one hand, the identification of a suitable audio feature to distinguish between content-preserving and malicious manipulations and, on the other hand, the development of a watermark that is robust against content-preserving modifications and able to carry the whole authentication information. The payload requirements are significantly higher compared to transaction watermarking or copyright protection. Finally, the watermark embedding should not influence the feature extraction, to avoid false alarms. Current systems still lack a sufficient alignment of watermarking algorithm and feature extraction. In previous work we developed a content-based audio authentication watermarking approach. The feature is based on changes in the DCT domain over time. A patchwork-algorithm-based watermark was used to embed multiple one-bit watermarks. The embedding process uses the feature domain without inflicting distortions on the feature. The watermark payload is limited by the feature extraction, more precisely the critical bands. The payload is inversely proportional to the segment duration of the audio file segmentation. Transparency behavior was analyzed as a function of segment size and thus of the watermark payload. At a segment duration of about 20 ms the transparency shows an optimum (measured in units of Objective Difference Grade). Transparency and/or robustness decrease quickly for working points beyond this area. Therefore, these working points are unsuitable for gaining the further payload needed to embed the whole authentication information. In this paper we present a hierarchical extension

  9. Authenticity examination of compressed audio recordings using detection of multiple compression and encoders' identification.

    Science.gov (United States)

    Korycki, Rafal

    2014-05-01

    Since the appearance of digital audio recordings, audio authentication has become increasingly difficult. Currently available technologies and free editing software allow a forger to cut or paste any single word without audible artifacts. Nowadays, the only method for digital audio files commonly approved by forensic experts is the ENF criterion. It consists of fluctuation analysis of the mains frequency induced in the electronic circuits of recording devices. Its effectiveness therefore depends strictly on the presence of a mains signal in the recording, which is a rare occurrence. Recently, much attention has been paid to authenticity analysis of compressed multimedia files, and several solutions have been proposed for detection of double compression in both digital video and digital audio. This paper addresses the problem of tampering detection in compressed audio files and discusses new methods that can be used for authenticity analysis of digital recordings. The presented approaches consist of evaluating statistical features extracted from the MDCT coefficients as well as other parameters that may be obtained from compressed audio files. The calculated feature vectors are used for training selected machine learning algorithms. The detection of multiple compression covers tampering activities as well as identification of traces of montage in digital audio recordings. To enhance the methods' robustness, an encoder identification algorithm was developed and applied, based on analysis of inherent parameters of compression. The effectiveness of the tampering detection algorithms is tested on a predefined large music database consisting of nearly one million compressed audio files. The influence of the compression algorithms' parameters on the classification performance is discussed, based on the results of the current study.

  10. Implementing Audio-CASI on Windows' Platforms.

    Science.gov (United States)

    Cooley, Philip C; Turner, Charles F

    1998-01-01

    Audio computer-assisted self interviewing (Audio-CASI) technologies have recently been shown to provide important and sometimes dramatic improvements in the quality of survey measurements. This is particularly true for measurements requiring respondents to divulge highly sensitive information such as their sexual, drug use, or other sensitive behaviors. However, DOS-based Audio-CASI systems that were designed and adopted in the early 1990s have important limitations. Most salient is the poor control they provide for manipulating the video presentation of survey questions. This article reports our experiences adapting Audio-CASI to Microsoft Windows 3.1 and Windows 95 platforms. Overall, our Windows-based system provided the desired control over video presentation and afforded other advantages including compatibility with a much wider array of audio devices than our DOS-based Audio-CASI technologies. These advantages came at the cost of increased system requirements, including the need for both more RAM and larger hard disks. While these costs will be an issue for organizations converting large inventories of PCs to Windows Audio-CASI today, this will not be a serious constraint for organizations and individuals with small inventories of machines to upgrade or those purchasing new machines today.

  11. Implementing Audio-CASI on Windows’ Platforms

    Science.gov (United States)

    Cooley, Philip C.; Turner, Charles F.

    2011-01-01

    Audio computer-assisted self interviewing (Audio-CASI) technologies have recently been shown to provide important and sometimes dramatic improvements in the quality of survey measurements. This is particularly true for measurements requiring respondents to divulge highly sensitive information such as their sexual, drug use, or other sensitive behaviors. However, DOS-based Audio-CASI systems that were designed and adopted in the early 1990s have important limitations. Most salient is the poor control they provide for manipulating the video presentation of survey questions. This article reports our experiences adapting Audio-CASI to Microsoft Windows 3.1 and Windows 95 platforms. Overall, our Windows-based system provided the desired control over video presentation and afforded other advantages including compatibility with a much wider array of audio devices than our DOS-based Audio-CASI technologies. These advantages came at the cost of increased system requirements, including the need for both more RAM and larger hard disks. While these costs will be an issue for organizations converting large inventories of PCs to Windows Audio-CASI today, this will not be a serious constraint for organizations and individuals with small inventories of machines to upgrade or those purchasing new machines today. PMID:22081743

  12. Quantization of wavelet packet audio coding

    Institute of Scientific and Technical Information of China (English)

    Tan Jianguo; Zhang Wenjun; Liu Peilin

    2006-01-01

    A method for controlling quantization noise in audio coding in the wavelet domain is proposed. Using the inverse Discrete Fourier Transform (DFT), it converts the masking threshold produced by the MPEG psycho-acoustic model in the frequency domain into a signal in the time domain. The Discrete Wavelet Packet Transform (DWPT) is then applied to this signal, and the energy in each subband is regarded as the maximum allowed quantization noise energy. The experimental results show that the proposed method can attain nearly transparent audio quality below 64 kbps for most of the test audio signals.

  13. Audio-visual voice activity detection

    Institute of Scientific and Technical Information of China (English)

    LIU Peng; WANG Zuo-ying

    2006-01-01

    In speech signal processing systems, frame-energy based voice activity detection (VAD) may be disturbed by background noise and by the non-stationary characteristics of the frame energy within voice segments. The purpose of this paper is to improve the performance and robustness of VAD by introducing visual information. A data-driven linear transformation is adopted for visual feature extraction, and a general statistical VAD model is designed. Using the general model and a two-stage fusion strategy presented in this paper, a concrete multimodal VAD system is built. Experiments show that a 55.0% relative reduction in frame error rate and a 98.5% relative reduction in sentence-breaking error rate are obtained when using multimodal VAD, compared to frame-energy based audio VAD. The results show that with the multimodal method, sentence-breaking errors are almost avoided and frame-detection performance is clearly improved, which proves the effectiveness of the visual modality in VAD.
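    For orientation, the frame-energy baseline that the multimodal system is compared against can be sketched as below, together with the arithmetic behind a "relative reduction" figure. The frame length, the threshold, the function names and the error rates used in the usage lines are illustrative assumptions; the paper's statistical model and fusion strategy are not reproduced here.

```python
def frame_energies(samples, frame_len):
    """Mean squared amplitude of each non-overlapping frame."""
    return [
        sum(s * s for s in samples[i:i + frame_len]) / frame_len
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]


def energy_vad(samples, frame_len=4, threshold=0.01):
    """Mark a frame as speech when its energy exceeds a fixed threshold."""
    return [e > threshold for e in frame_energies(samples, frame_len)]


def relative_reduction(baseline_error, new_error):
    """Fraction of the baseline error that the new system removes."""
    return (baseline_error - new_error) / baseline_error


# Toy signal: a silent frame, a loud frame, then a faint-noise frame.
signal = [0.0] * 4 + [0.5, -0.5, 0.5, -0.5] + [0.01] * 4
decisions = energy_vad(signal)

# Made-up error rates: going from 20% to 9% is a 55% relative reduction.
gain = relative_reduction(0.20, 0.09)
```

    A fixed energy threshold is exactly what background noise defeats: raising the noise floor above the threshold turns every frame into "speech", which is the weakness the visual information is meant to offset.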

  14. The Effects of Audio-Visual Recorded and Audio Recorded Listening Tasks on the Accuracy of Iranian EFL Learners' Oral Production

    Science.gov (United States)

    Drood, Pooya; Asl, Hanieh Davatgari

    2016-01-01

    The ways in which tasks in classrooms have developed and proceeded have received great attention in the field of language teaching and learning, in the sense that they draw learners' attention to competing features such as accuracy, fluency, and complexity. English audiovisual and audio recorded materials have been widely used by teachers and…

  15. Extraction of ions and electrons from audio frequency plasma source

    Directory of Open Access Journals (Sweden)

    N. A. Haleem

    2016-09-01

    Full Text Available Herein, the extraction of high ion/electron current from an audio frequency (AF) nitrogen gas discharge (10–100 kHz) is studied. The system features a small size (L = 20 cm, inner diameter = 3.4 cm), capacitive discharge electrodes inside the tube, and a high discharge pressure of ∼0.3 Torr, without the need for a high-vacuum system or magnetic fields. The extraction system for the ion/electron current from the plasma is a very simple electrode that allows self-focusing of the beam by adjusting its position relative to the source exit. The working discharge conditions were a frequency of 10 to 100 kHz, a power of 50–500 W, and a gap between the plasma meniscus surface and the extractor electrode of 3 to 13 mm. The extracted ion/electron current is found to depend mainly on the discharge power, the extraction gap width and the frequency of the audio supply. The SIMION 3D version 7.0 package is used to simulate ion trajectories as a reference for comparing and optimizing the experimental extraction beam from the present audio frequency plasma source under identical operating conditions. The focal point as well as the beam diameter at the collector area is deduced. The simulations showed reasonable agreement with the experimental results; together they provide a basis for optimizing the construction of the extraction electrode and its parameters for beam production.

  16. Temporal structure and complexity affect audio-visual correspondence detection

    Directory of Open Access Journals (Sweden)

    Rachel N Denison

    2013-01-01

    Full Text Available Synchrony between events in different senses has long been considered the critical temporal cue for multisensory integration. Here, using rapid streams of auditory and visual events, we demonstrate how humans can use temporal structure (rather than mere temporal coincidence) to detect multisensory relatedness. We find psychophysically that participants can detect matching auditory and visual streams via shared temporal structure for crossmodal lags of up to 200 ms. Performance on this task reproduced features of past findings based on explicit timing judgments but did not show any special advantage for perfectly synchronous streams. Importantly, the complexity of temporal patterns influences sensitivity to correspondence. Stochastic, irregular streams – with richer temporal pattern information – led to higher audio-visual matching sensitivity than predictable, rhythmic streams. Our results reveal that temporal structure and its complexity are key determinants for human detection of audio-visual correspondence. The distinctive emphasis of our new paradigms on temporal patterning could be useful for studying special populations with suspected abnormalities in audio-visual temporal perception and multisensory integration.

  17. Implementation of Audio signal by using wavelet transform

    Directory of Open Access Journals (Sweden)

    Chakresh Kumar

    2010-10-01

    Full Text Available Audio coding is the technology for representing audio in digital form with as few bits as possible while maintaining the intelligibility and quality required for a particular application. Interest in audio coding is motivated by the evolution to digital communications and the requirement to minimize bit rate, and hence conserve bandwidth. There is always a tradeoff between compression ratio and maintaining the delivered audio quality and intelligibility. Audio coding is widely used in applications such as digital broadcasting, Internet audio and music databases to reduce the bit rate of high-quality audio signals without compromising the perceptual quality. In this dissertation work, the design and implementation of an MPEG lossless audio codec using the wavelet transform is proposed. The major issues in developing the audio codec are choosing optimal wavelets for audio signals, the decomposition level in the digital wavelet transform, and the thresholding criteria for coefficient truncation, which is the basis for providing a compression ratio for audio with a suitable peak signal-to-noise ratio (PSNR). The wavelet packet compression technique has also been used to compare the performance of the audio codec using the wavelet transform. A psychoacoustic model is used to improve the quality of the audio signal. The proposed audio codec has been implemented on the DSK6713 Starter Kit using MATLAB-7.3 and Link to Code Composer Studio, and various audio signals of different durations have been tested. The results obtained show that the proposed codec improves the quality of the reconstructed audio signal.

  18. Audio Indexing on the Web: a Preliminary Study of Some Audio Descriptors

    OpenAIRE

    Parlangeau-Vallès, Nathalie; Farinas, Jérôme; Fohr, Dominique; Illina, Irina; Magrin-Chagnolleau, Ivan; Mella, Odile; PINQUIER, Julien; Rouas, Jean-Luc; Sénac, Christine

    2003-01-01

    Conference with proceedings and peer review; international audience. The "Invisible Web" is composed of documents which cannot currently be accessed by Web search engines, because they have a dynamic URL or are not textual, like video or audio documents. For audio documents, one solution is automatic indexing. It consists of finding good descriptors of audio documents which can be used as indexes for archiving and search. This paper presents an overview and recent results of th...

  19. Augmenting Environmental Interaction in Audio Feedback Systems

    Directory of Open Access Journals (Sweden)

    Seunghun Kim

    2016-04-01

    Full Text Available Audio feedback is defined as a positive feedback of acoustic signals where an audio input and output form a loop, and may be utilized artistically. This article presents new context-based controls over audio feedback, leading to the generation of desired sonic behaviors by enriching the influence of existing acoustic information such as room response and ambient noise. This ecological approach to audio feedback emphasizes mutual sonic interaction between signal processing and the acoustic environment. Mappings from analyses of the received signal to signal-processing parameters are designed to emphasize this specificity as an aesthetic goal. Our feedback system presents four types of mappings: approximate analyses of room reverberation to tempo-scale characteristics, ambient noise to amplitude and two different approximations of resonances to timbre. These mappings are validated computationally and evaluated experimentally in different acoustic conditions.

  20. Virtual Microphones for Multichannel Audio Resynthesis

    Directory of Open Access Journals (Sweden)

    Athanasios Mouchtaris

    2003-09-01

    Full Text Available Multichannel audio offers significant advantages for music reproduction, including the ability to provide better localization and envelopment, as well as reduced imaging distortion. On the other hand, multichannel audio is a demanding media type in terms of transmission requirements. Often, bandwidth limitations prohibit transmission of multiple audio channels. In such cases, an alternative is to transmit only one or two reference channels and recreate the rest of the channels at the receiving end. Here, we propose a system capable of synthesizing the required signals from a smaller set of signals recorded in a particular venue. These synthesized “virtual” microphone signals can be used to produce multichannel recordings that accurately capture the acoustics of that venue. Applications of the proposed system include transmission of multichannel audio over the current Internet infrastructure and, as an extension of the methods proposed here, remastering existing monophonic and stereophonic recordings for multichannel rendering.

  1. Audio-visual affective expression recognition

    Science.gov (United States)

    Huang, Thomas S.; Zeng, Zhihong

    2007-11-01

    Automatic affective expression recognition has attracted increasing attention from researchers in different disciplines. It will contribute significantly to a new paradigm for human-computer interaction (affect-sensitive interfaces, socially intelligent environments) and advance research in affect-related fields including psychology, psychiatry, and education. Multimodal information integration is a process that enables humans to assess affective states robustly and flexibly. In order to understand the richness and subtlety of human emotional behavior, the computer should be able to integrate information from multiple sensors. In this paper we introduce our efforts toward machine understanding of audio-visual affective behavior, based on both deliberate and spontaneous displays. Some promising methods are presented to integrate information from both audio and visual modalities. Our experiments show the advantage of audio-visual fusion in affective expression recognition over audio-only or visual-only approaches.

  2. Spatial audio reproduction with primary ambient extraction

    CERN Document Server

    He, JianJun

    2017-01-01

    This book first introduces the background of spatial audio reproduction, with different types of audio content and for different types of playback systems. A literature study on the classical and emerging Primary Ambient Extraction (PAE) techniques is presented. The emerging techniques aim to improve the extraction performance and also enhance the robustness of PAE approaches in dealing with more complex signals encountered in practice. The in-depth theoretical study helps readers to understand the rationales behind these approaches. Extensive objective and subjective experiments validate the feasibility of applying PAE in spatial audio reproduction systems. These experimental results, together with some representative audio examples and MATLAB codes of the key algorithms, illustrate clearly the differences among various approaches and also help readers gain insights on selecting different approaches for different applications.

  3. Definición de audio

    OpenAIRE

    Montañez, Luis A.; Cabrera, Juan G.

    2015-01-01

    Description of the meaning of Audio as an object of study according to different authors, and its differentiation from the meaning of Sound. Audio is thus defined as an electrical signal whose waveform has characteristics similar to those of a sound signal, bearing in mind that the sound signal corresponds to pressure in a physical medium, whereas the Audio signal is a voltage defined as an analog signal. Along these lines, Audio is conceived as a signal...

  4. Post-Production: "Sweeting" the Final Audio.

    Science.gov (United States)

    Beasley, Augie

    1995-01-01

    Knowing how to use audio mixers in the postproduction of student videos is necessary for high-quality sound. Equipment and techniques are described, along with the use of background sound, sound effects, and music. (AEF)

  5. CERN automatic audio-conference service

    CERN Multimedia

    Sierra Moral, R

    2009-01-01

    Scientists from all over the world need to collaborate with CERN on a daily basis. They must be able to communicate effectively on their joint projects at any time; as a result telephone conferences have become indispensable and widely used. Managed by 6 operators, CERN already has more than 20000 hours and 5700 audio-conferences per year. However, the traditional telephone based audio-conference system needed to be modernized in three ways. Firstly, to provide the participants with more autonomy in the organization of their conferences; secondly, to eliminate the constraints of manual intervention by operators; and thirdly, to integrate the audio-conferences into a collaborative working framework. The large number, and hence cost, of the conferences prohibited externalization and so the CERN telecommunications team drew up a specification to implement a new system. It was decided to use a new commercial collaborative audio-conference solution based on the SIP protocol. The system was tested as the first Euro...

  6. Audio watermarking for live performance

    Science.gov (United States)

    Tachibana, Ryuki

    2003-06-01

    Audio watermarking has been used mainly for digitally stored content. Using real-time watermark embedding, its coverage can be extended to live broadcasts and live performances. In general, a conventional embedding algorithm receives a host signal (HS) and outputs the summation of the HS and a watermark signal (WS). However, when applied to real-time embedding, there are two problems: (1) delay of the HS, and (2) possible interruption of the broadcast. To solve these problems, we propose a watermark generation algorithm that outputs only a WS, and a system composition method in which a mixer outside the computer mixes the WS generated by the algorithm and the HS. In addition, we propose a new composition method, "sonic watermarking." In this composition method, the sound of the HS and the sound of the WS are played separately by two speakers, and the sounds are mixed in the air. Using this composition method, it would be possible to generate a watermarking sound in a concert hall so that the watermark could be detected from content recorded by audience members who have recording devices at their seats. We report on the results of experiments and discuss the merits and flaws of various real-time watermarking composition methods.

  7. Audio description as an accessibility enhancer

    OpenAIRE

    Martins, Cláudia Susana Nunes

    2012-01-01

    Audio description for the blind and visually-impaired has been around since people have described what is seen. Throughout time, it has evolved and developed in different contexts, starting with daily life, moving into the cinema and television, then across other performing arts, museums and galleries, historical sites and public places. Audio description is above all an issue of accessibility and of providing visually-impaired people with the same rights to have access to culture, e...

  8. Image quality assessment method based on nonlinear feature extraction in kernel space

    Institute of Scientific and Technical Information of China (English)

    Yong DING; Nan LI; Yang ZHAO; Kai HUANG

    2016-01-01

    To match human perception, extracting perceptual features effectively plays an important role in image quality assessment. In contrast to most existing methods that use linear transformations or models to represent images, we employ a complex mathematical expression of high dimensionality to reveal the statistical characteristics of the images. Furthermore, by introducing kernel methods to transform the linear problem into a nonlinear one, a full-reference image quality assessment method is proposed based on high-dimensional nonlinear feature extraction. Experiments on the LIVE, TID2008, and CSIQ databases demonstrate that nonlinear features offer competitive performance for image inherent quality representation and the proposed method achieves a promising performance that is consistent with human subjective evaluation.

  9. PENGGUNAAN MEDIA AUDIO DALAM PEMBELAJARAN STENOGRAFI

    Directory of Open Access Journals (Sweden)

    S Martono

    2011-06-01

    Full Text Available The objective of this study is to determine the effectiveness of using audio media in learning stenography typing. The population of this research was 30 students, divided into two groups, an experimental group and a control group, each consisting of 15 students. Based on their initial scores in the stenography subject, the two groups had the same ability, but they were given different treatments. The experimental group was taught with audio media, whereas the control group was not. The data were collected using documentation and experimental techniques. The instrument was a stenography speed-typing test. The final results showed that using audio media was more effective and improved learning outcomes more than the approach used with the control group. This result is expected to encourage stenography teachers to apply audio media in learning, and to remind students that stenography is not a memorizing subject but a skill subject that must be trained by attending lessons. People can thus use stenography typing to record any talk. Keywords: Learning, Audio Media, Stenography

  10. Optimized Audio Classification and Segmentation Algorithm by Using Ensemble Methods

    OpenAIRE

    Saadia Zahid; Fawad Hussain; Muhammad Rashid; Muhammad Haroon Yousaf; Hafiz Adnan Habib

    2015-01-01

    Audio segmentation is a basis for multimedia content analysis, which is among the most important and widely used applications nowadays. An optimized audio classification and segmentation algorithm is presented in this paper that segments a superimposed audio stream on the basis of its content into four main audio types: pure-speech, music, environment sound, and silence. An algorithm is proposed that preserves important audio content and reduces the misclassification rate without using large amount o...

  11. Study on the construction of multi-dimensional Remote Sensing feature space for hydrological drought

    Science.gov (United States)

    Xiang, Daxiang; Tan, Debao; Cui, Yuanlai; Wen, Xiongfei; Shen, Shaohong; Li, Zhe

    2014-03-01

    Hydrological drought refers to an abnormal water shortage caused by precipitation and surface water shortages or a groundwater imbalance. Hydrological drought is reflected in a drop in surface water, a decrease in vegetation productivity, an increase in the temperature difference between day and night, and so on. Remote sensing permits the observation of surface water, vegetation, temperature and other information from a macro perspective. This paper first selects and extracts a series of representative remote sensing characteristic parameters, such as vegetation index, surface temperature and surface water from HJ-1A/B CCD/IRS data, according to the spectral characterization of surface features in remote sensing imagery, and then analyzes the correlation and differentiation of the remote sensing and surface-measured indicators. Finally, multi-dimensional remote sensing features for hydrological drought are built on an intelligent collaborative model. Further, for the Dong-ting lake area, two drought events are analyzed to verify the multi-dimensional features using remote sensing data from different phases and field observation data. The experimental results showed that multi-dimensional features are a good method for monitoring hydrological drought.

  12. Geometry and dimensionality reduction of feature spaces in primary visual cortex

    Science.gov (United States)

    Barbieri, Davide

    2015-09-01

    Some geometric properties of the wavelet analysis performed by visual neurons are discussed and compared with experimental data. In particular, several relationships between the cortical morphologies and the parametric dependencies of extracted features are formalized and considered from a harmonic analysis point of view.

  13. Developmental features of the neonatal brain: MR imaging. Part II. Ventricular size and extracerebral space.

    Science.gov (United States)

    McArdle, C B; Richardson, C J; Nicholas, D A; Mirfakhraee, M; Hayden, C K; Amparo, E G

    1987-01-01

    Magnetic resonance (MR) imaging with a 0.6-T magnet was performed on 51 neonates, aged 29-42 weeks postconception. In 45 neonates, the ventricular/brain ratio (V/B) at the level of the frontal horns and midbody of the lateral ventricles ranged from 0.26 to 0.34. In six other infants a V/B of 0.36 or greater was associated with either cerebral atrophy or obstructive hydrocephalus. The width of the extracerebral space measured along specified points varied little in the neonatal period and ranged from 0 to 4 mm in 48 infants. Extracerebral space widths of 5-6 mm were seen in three other infants with severe asphyxia. Prominence of the subarachnoid space overlying the posterior parietal lobes is normal in neonates and should not be confused with cerebral atrophy. The authors conclude that V/B ratios of 0.26-0.34 and extracerebral space widths of 0-4 mm represent the normal range, and that neonates whose measurements exceed these values should be followed up.

  14. Non-retinotopic feature processing in the absence of retinotopic spatial layout and the construction of perceptual space from motion.

    Science.gov (United States)

    Ağaoğlu, Mehmet N; Herzog, Michael H; Oğmen, Haluk

    2012-10-15

    The spatial representation of a visual scene in the early visual system is well known. The optics of the eye map the three-dimensional environment onto two-dimensional images on the retina. These retinotopic representations are preserved in the early visual system. Retinotopic representations and processing are among the most prevalent concepts in visual neuroscience. However, it has long been known that a retinotopic representation of the stimulus is neither sufficient nor necessary for perception. Saccadic Stimulus Presentation Paradigm and the Ternus-Pikler displays have been used to investigate non-retinotopic processes with and without eye movements, respectively. However, neither of these paradigms eliminates the retinotopic representation of the spatial layout of the stimulus. Here, we investigated how stimulus features are processed in the absence of a retinotopic layout and in the presence of retinotopic conflict. We used anorthoscopic viewing (slit viewing) and pitted a retinotopic feature-processing hypothesis against a non-retinotopic feature-processing hypothesis. Our results support the predictions of the non-retinotopic feature-processing hypothesis and demonstrate the ability of the visual system to operate non-retinotopically at a fine feature processing level in the absence of a retinotopic spatial layout. Our results suggest that perceptual space is actively constructed from the perceptual dimension of motion. The implications of these findings for normal ecological viewing conditions are discussed.

  15. Features of Virchow-Robin spaces in newly diagnosed multiple sclerosis patients

    Energy Technology Data Exchange (ETDEWEB)

    Etemadifar, Masoud [Department of Clinical and Biological Sciences, Division of Neurology, San Luigi Gonzaga School of Medicine, Orbassano (Torino), Turin (Italy); Department of Neurology, Isfahan University of Medical Sciences, Isfahan (Iran, Islamic Republic of); Isfahan Research Committee of Multiple Sclerosis (IRCOMS), Isfahan University of Medical Sciences, Isfahan (Iran, Islamic Republic of); Hekmatnia, Ali; Tayari, Nazila [Department of Radiology, Isfahan University of Medical Sciences, Isfahan (Iran, Islamic Republic of); Kazemi, Mojtaba [Department of Neurology, Isfahan University of Medical Sciences, Isfahan (Iran, Islamic Republic of); Ghazavi, Amirhossein [Department of Radiology, Isfahan University of Medical Sciences, Isfahan (Iran, Islamic Republic of); Akbari, Mojtaba [Department of Epidemiology and Statistics, Isfahan University of Medical Sciences, Isfahan (Iran, Islamic Republic of); Maghzi, Amir-Hadi, E-mail: maghzi@edc.mui.ac.ir [Isfahan Research Committee of Multiple Sclerosis (IRCOMS), Isfahan University of Medical Sciences, Isfahan (Iran, Islamic Republic of); Neuroimmunology Unit, Centre for Neuroscience and Trauma, Blizard Institute of Cell and Molecular Science, Barts and the London School of Medicine and Dentistry, London (United Kingdom); Isfahan Neurosciences Research Center, Isfahan University of Medical Sciences, Isfahan (Iran, Islamic Republic of)

    2011-11-15

    Background: Virchow-Robin spaces (VRSs) are perivascular pia-lined extensions of the subarachnoid space around the arteries and veins as they enter the brain parenchyma. These spaces are responsible for inflammatory processes within the brain. Objectives: This study was designed to shed more light on the location, size and shape of VRSs on 3 mm slice thickness, 1.5 Tesla MRI scans of newly diagnosed MS patients in Isfahan, Iran and compare the results with healthy age- and sex-matched controls. Methods: We evaluated MRI scans of 73 MS patients obtained within 3 months of MS onset and compared them with MRI scans from 73 age- and sex-matched healthy volunteers. Three mm section proton density, T2W and FLAIR MR images were obtained for all subjects. The location, size and shape of VRSs were compared between the two groups. Results: The total number of VRSs was significantly higher in the MS group (p < 0.001). VRSs were significantly more often located in the high convexity areas in the MS group (p < 0.001), while there were no significant differences in other regions. Round VRSs were detected significantly more often on MRI scans of MS patients, and curvilinear shapes were observed significantly more frequently in healthy volunteers; however, there were no significant differences for oval VRSs between the two groups. VRSs larger than 2 mm were observed significantly more often in the MS group than in controls. We also observed some differences in the characteristics of VRSs between the genders in the MS group. Conclusion: The results of this study shed more light on the usefulness of VRSs as an MRI marker for the disease. In addition, according to our results, VRSs might also have implications for determining the prognosis of the disease. However, larger studies with more advanced MRI techniques are required to confirm our results.

  16. On the scaling features of magnetic field fluctuations at non-MHD scales in turbulent space plasmas

    Science.gov (United States)

    Consolini, G.; Giannattasio, F.; Yordanova, E.; Vörös, Z.; Marcucci, M. F.; Echim, M.; Chang, T.

    2016-11-01

    In several different contexts space plasmas display intermittent turbulence at magneto-hydro-dynamic (MHD) scales, which manifests in anomalous scaling features of the structure functions of the magnetic field increments. Moving to smaller scales, i.e. below the ion-cyclotron and/or ion inertial length, these scaling features are still observed, even though it is not clear whether they are still anomalous. Here, we investigate the nature of the scaling properties of magnetic field increments at non-MHD scales for a period of fast solar wind, examining whether multifractal features and collapsing of probability distribution functions (PDFs) occur, using the novel Rank-Ordered Multifractal Analysis (ROMA) method, which is more sensitive than the traditional structure function approach. We find strong evidence for the occurrence of a near mono-scaling behavior, which suggests that the observed turbulent regime at non-MHD scales mainly displays a mono-fractal nature of magnetic field increments. The results are discussed in terms of a non-compact fractal structure of the dissipation field.

  17. EFFECTS OF ELECTRODE SPACING AND INVERSION TECHNIQUES ON THE EFFICACY OF 2D RESISTIVITY IMAGING TO DELINEATE SUBSURFACE FEATURES

    Directory of Open Access Journals (Sweden)

    Adiat Kola Abdul-Nafiu

    2013-01-01

    Full Text Available In this study, the effect of the choice of appropriate electrode spacing and inversion algorithms on the efficacy of 2D imaging to map subsurface features was investigated. The target being investigated was a concrete drainage pipe buried approximately 0.3 m below the surface. A profile perpendicular to the strike of the pipe was established. 2D resistivity data were collected separately with electrode spacings of 1.5 m and 0.5 m using the Dipole-Dipole, the Wenner and the Wenner-Schlumberger array configurations. The results obtained showed that when the electrode spacing of 1.5 m was used for the investigations, none of the three array types was able to map the target with either of the two inversion techniques. The results further show that the attainment of an RMS error of less than about 10%, which usually indicates a good subsurface model, is not a guarantee that subsurface features are successfully mapped. On the other hand, when the electrode spacing of 0.5 m was used for the data collection, the results obtained with the standard constraint inversion technique showed that all three array configurations mapped the target; however, only the dipole-dipole array was able to resolve the boundary between the concrete pipe and the entrapped air. With the robust constraint inversion technique, the target was also successfully mapped by all three array types. In addition, the boundary between the entrapped air and the concrete pipe was resolved by all three array types. This suggests that if there is a significant contrast in the subsurface layers’ resistivities, the robust constraint inversion algorithm gives better boundary resolution irrespective of the array type used for the survey. The inversion of the 3D data gave 3D resistivity sections which were presented as horizontal depth slices. The result obtained from the inversion of the 3D data has assisted us in getting information about the

  18. Audio Journal in an ELT Context

    Directory of Open Access Journals (Sweden)

    Neşe Aysin Siyli

    2012-09-01

    Full Text Available It is widely acknowledged that one of the most serious problems students of English as a foreign language face is the lack of opportunity to practice the language outside the classroom. Generally, the classroom is the sole environment where they can practice English, and by its nature it does not provide a rich setting to help students develop their competence by putting the language into practice. Motivated by this need, this descriptive study investigated the impact of audio dialog journals on students’ speaking skills. It also aimed to gain insights into students’ and teachers’ opinions on keeping audio dialog journals outside the class. The data of the study came from student and teacher audio dialog journals, written student feedback, interviews held with the students, and teacher observations. The descriptive analysis of the data revealed that audio dialog journals served a number of functions ranging from cognitive to linguistic, and from pedagogical to psychological and social. The findings and pedagogical implications of the study are discussed in detail. Key words: audio dialog journal, speaking skills, and student-teacher communication

  19. A High-Voltage Class D Audio Amplifier for Dielectric Elastomer Transducers

    DEFF Research Database (Denmark)

    Nielsen, Dennis; Knott, Arnold; Andersen, Michael A. E.

    2014-01-01

    Dielectric Elastomer (DE) transducers have emerged as a very interesting alternative to the traditional electrodynamic transducer. Lightweight, small size and high maneuverability are some of the key features of the DE transducer. An amplifier for the DE transducer suitable for audio applications...

  20. Unsupervised Learning of Structural Representation of Percussive Audio Using a Hierarchical Dirichlet Process Hidden Markov Model

    DEFF Research Database (Denmark)

    Antich, Jose Luis Diez; Paterna, Mattia; Marxer, Richard

    2016-01-01

    A method is proposed that extracts a structural representation of percussive audio in an unsupervised manner. It consists of two parts: 1) The input signal is segmented into blocks of approximately even duration, aligned to a metrical grid, using onset and timbre feature extraction, agglomerative...

  1. [Monitoring of farmland drought based on LST-LAI spectral feature space].

    Science.gov (United States)

    Sui, Xin-Xin; Qin, Qi-Ming; Dong, Heng; Wang, Jin-Liang; Meng, Qing-Ye; Liu, Ming-Chao

    2013-01-01

    Farmland drought is widespread and seriously affects agricultural production, so its real-time dynamic monitoring has been a challenging problem. Using MODIS land products to construct the spectral feature space of LST and LAI, the temperature LAI drought index (TLDI) was put forward and validated using ground-measured 0-10 cm averaged soil moisture of Ningxia farmland. The results show that the coefficient of determination (R²) between the two varies from 0.43 to 0.86. Compared to TVDI, the TLDI has higher accuracy for farmland moisture monitoring and overcomes the saturation of NDVI during the late development phases of the crop. Furthermore, directly using the MODIS land products LST and LAI avoids the complicated processing of the original MODIS data and provides a new technical workflow for the routine operation of farmland drought monitoring.

  2. Near-field Localization of Audio

    DEFF Research Database (Denmark)

    Jensen, Jesper Rindom; Christensen, Mads Græsbøll

    2014-01-01

    Localization of audio sources using microphone arrays has been an important research problem for more than two decades. Many traditional methods for solving the problem are based on a two-stage procedure: first, information about the audio source, such as time differences-of-arrival (TDOAs......) and gain ratios-of-arrival (GROAs) between microphones is estimated, and, second, this knowledge is used to localize the audio source. These methods often have a low computational complexity, but this comes at the cost of a limited estimation accuracy. Therefore, we propose a new localization approach......, where the desired signal is modeled using TDOAs and GROAs, which are determined by the source location. This facilitates the derivation of one-stage, maximum likelihood methods under a white Gaussian noise assumption that is applicable in both near- and far-field scenarios. Simulations show...

  3. Nonlinear dynamic macromodeling techniques for audio systems

    Science.gov (United States)

    Ogrodzki, Jan; Bieńkowski, Piotr

    2015-09-01

    This paper develops a modelling method and a model identification technique for nonlinear dynamic audio systems. Identification is performed by means of a behavioral approach based on a polynomial approximation. This approach makes use of the Discrete Fourier Transform and the Harmonic Balance Method. A model of an audio system is first created and identified, and then it is simulated in real time using an algorithm of low computational complexity. The algorithm consists of real-time emulation of the system response rather than simulation of the system itself. The proposed software is written in Python using object-oriented programming techniques. The code is optimized for a multithreaded environment.

  4. Information Security using Audio Steganography -A Survey

    Directory of Open Access Journals (Sweden)

    B. Santhi

    2012-07-01

    Full Text Available The most important application of the internet is data transmission. Unfortunately, this is less secure because of advanced hacking technologies. So, for secure data transmission, we make use of steganography. This is the art of hiding information so that the existence of the data is unknown. Any medium, such as music, video, text, or speech, can be used. In this study, the selected medium is audio. This study discusses the existing audio steganographic techniques along with their advantages and limitations. An algorithm implementing parity and LSB methods is also proposed. This mitigates the limitations of the existing methods discussed, thus increasing security and reducing computational load and code complexity.
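
    As an illustration of the LSB idea named in this abstract, the following is a minimal sketch of plain least-significant-bit embedding in integer PCM samples. It is not the combined parity and LSB algorithm the authors propose, whose details the abstract does not give; the function names `embed_lsb` and `extract_lsb` are hypothetical.

```python
def embed_lsb(samples, payload):
    """Hide payload bytes in the least significant bit of integer PCM samples.

    Assumption: `samples` is a list of Python ints (e.g. 16-bit PCM values).
    Each sample carries one payload bit, most significant bit of each byte first.
    """
    bits = [(byte >> shift) & 1 for byte in payload for shift in range(7, -1, -1)]
    if len(bits) > len(samples):
        raise ValueError("payload too large for cover audio")
    stego = list(samples)
    for i, bit in enumerate(bits):
        # Clear the lowest bit, then set it to the payload bit.
        stego[i] = (stego[i] & ~1) | bit
    return stego


def extract_lsb(samples, n_bytes):
    """Recover n_bytes of payload from the lowest bit of the first samples."""
    out = bytearray()
    for b in range(n_bytes):
        byte = 0
        for s in samples[8 * b: 8 * b + 8]:
            byte = (byte << 1) | (s & 1)
        out.append(byte)
    return bytes(out)
```

    Each sample changes by at most one quantization step, which is why LSB embedding is hard to hear but also easy to destroy by re-encoding.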

  5. Frequency Hopping Method for Audio Watermarking

    Directory of Open Access Journals (Sweden)

    A. Anastasijević

    2012-11-01

    Full Text Available This paper evaluates the degradation of audio content for a perceptible removable watermark. Two different approaches to embedding the watermark in the spectral domain were investigated. The frequencies for watermark embedding are chosen according to a pseudorandom sequence making the methods robust. Consequentially, the lower quality audio can be used for promotional purposes. For a fee, the watermark can be removed with a secret watermarking key. Objective and subjective testing was conducted in order to measure degradation level for the watermarked music samples and to examine residual distortion for different parameters of the watermarking algorithm and different music genres.

  6. Synchronization and comparison of Lifelog audio recordings

    DEFF Research Database (Denmark)

    Nielsen, Andreas Brinch; Hansen, Lars Kai

    2008-01-01

    We investigate concurrent ‘Lifelog’ audio recordings to locate segments from the same environment. We compare two techniques earlier proposed for pattern recognition in extended audio recordings, namely cross-correlation and a fingerprinting technique. If successful, such alignment can be used...... as a preprocessing step to select and synchronize recordings before further processing. The two methods perform similarly in classification, but fingerprinting scales better with the number of recordings, while cross-correlation can offer sample resolution synchronization. We propose and investigate the benefits...
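
    To make the cross-correlation technique mentioned in this abstract concrete, here is a minimal brute-force sketch that estimates the sample offset between two recordings. It is only an illustration under simple assumptions (short integer or float lists, a bounded search range); the name `best_lag` is hypothetical and this is not the authors' implementation.

```python
def best_lag(a, b, max_lag):
    """Return the lag (in samples) at which b best matches a.

    A positive result means b is delayed relative to a by that many samples.
    The score is the raw cross-correlation sum over the overlapping region.
    """
    best, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, x in enumerate(a):
            j = i + lag
            if 0 <= j < len(b):
                score += x * b[j]
        if score > best_score:
            best, best_score = lag, score
    return best
```

    This direct form costs on the order of N times the number of lags per pair of recordings, which is consistent with the abstract's remark that fingerprinting scales better as the number of recordings grows.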

  7. Hierarchical structure for audio-video based semantic classification of sports video sequences

    Science.gov (United States)

    Kolekar, M. H.; Sengupta, S.

    2005-07-01

    A hierarchical structure for sports event classification based on audio and video content analysis is proposed in this paper. Compared to the event classifications in other games, those of cricket are very challenging and as yet unexplored. We have successfully solved the cricket video classification problem using a six-level hierarchical structure. The first level performs event detection based on the audio energy and Zero Crossing Rate (ZCR) of the short-time audio signal. In the subsequent levels, we classify the events based on video features using a Hidden Markov Model implemented through Dynamic Programming (HMM-DP) using color or motion as a likelihood function. For some of the game-specific decisions, a rule-based classification is also performed. Our proposed hierarchical structure can easily be applied to any other sport. Our results are very promising and we have moved a step forward towards addressing semantic classification problems in general.
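
    The two first-level audio features named in this abstract, short-time energy and Zero Crossing Rate, can be sketched as follows. This is a minimal illustration with hypothetical function names and an assumed framing scheme (fixed frame length and hop size); the paper's actual frame sizes and thresholds are not given in the abstract.

```python
def short_time_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ (zero counts as positive)."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def frame_features(signal, frame_len, hop):
    """Return a list of (energy, zcr) pairs, one per complete frame."""
    feats = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        feats.append((short_time_energy(frame), zero_crossing_rate(frame)))
    return feats
```

    A detector along these lines would typically flag frames whose energy exceeds a threshold as candidate events and use ZCR to help separate noisy or speech-like audio from tonal audio.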

  8. A Perceptually Reweighted Mixed-Norm Method for Sparse Approximation of Audio Signals

    DEFF Research Database (Denmark)

    Christensen, Mads Græsbøll; Sturm, Bob L.

    2011-01-01

    In this paper, we consider the problem of finding sparse representations of audio signals for coding purposes. In doing so, it is of utmost importance that when only a subset of the present components of an audio signal are extracted, it is the perceptually most important ones. To this end, we...... propose a new iterative algorithm based on two principles: 1) a reweighted l1-norm based measure of sparsity; and 2) a reweighted l2-norm based measure of perceptual distortion. Using these measures, the considered problem is posed as a constrained convex optimization problem that can be solved optimally...... using standard software. A prominent feature of the new method is that it solves a problem that is closely related to the objective of coding, namely rate-distortion optimization. In computer simulations, we demonstrate the properties of the algorithm and its application to real audio signals....

  9. High-energy electromagnetic cascades in extragalactic space: Physics and features

    Science.gov (United States)

    Berezinsky, V.; Kalashev, O.

    2016-07-01

    Using the analytic modeling of the electromagnetic cascades compared with more precise numerical simulations, we describe the physical properties of electromagnetic cascades developing in the universe on cosmic microwave background and extragalactic background light radiations. A cascade is initiated by a very-high-energy photon or electron, and the remnant photons at large distance have a two-component energy spectrum, $\propto E^{-2}$ ($\propto E^{-1.9}$ in numerical simulations) produced at the cascade multiplication stage and $\propto E^{-3/2}$ from inverse Compton electron cooling at low energies. The most noticeable property of the cascade spectrum in analytic modeling is "strong universality," which includes the standard energy spectrum and the energy density of the cascade $\omega_{\rm cas}$ as its only numerical parameter. Using numerical simulations of the cascade spectrum and comparing it with the recent Fermi LAT spectrum, we obtained an upper limit on $\omega_{\rm cas}$ stronger than in previous works. The new feature of the analysis is the "$E_{\max}$ rule." We investigate the dependence of $\omega_{\rm cas}$ on the distribution of sources, distinguishing two cases of universality: the strong and weak ones.

  10. The Visualization and Analysis of POI Features under Network Space Supported by Kernel Density Estimation

    Directory of Open Access Journals (Sweden)

    YU Wenhao

    2015-01-01

    Full Text Available The distribution pattern and the distribution density of urban facility POIs are of great significance in the fields of infrastructure planning and urban spatial analysis. Kernel density estimation, which is commonly used to express these spatial characteristics, is superior to other density estimation methods (such as Quadrat analysis and the Voronoi-based method) because it considers the regional impact based on the first law of geography. However, traditional kernel density estimation is mainly based on Euclidean space, ignoring the fact that the service functions and interrelations of urban facilities operate over network path distance rather than conventional Euclidean distance. Hence, this research proposes a computational model of network kernel density estimation and an extension of the model for the case of added constraints. This work also discusses the impacts of the distance attenuation threshold and the height extreme on the representation of kernel density. A large-scale experiment on real data, analyzing the distribution patterns of different POIs (random type, sparse type, regional-intensive type, linear-intensive type), discusses the spatial distribution characteristics, influence factors, and service functions of POI infrastructure in the city.
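
    The core idea of this abstract, replacing Euclidean distance with shortest-path distance inside the kernel, can be sketched as below. This is a simplified illustration, not the authors' model: it assumes POIs sit exactly on graph nodes, uses an Epanechnikov kernel as an arbitrary choice, and ignores the constraint extensions the abstract mentions. The names `network_distances` and `network_kde` and the adjacency-list format are hypothetical.

```python
import heapq


def network_distances(graph, source, cutoff):
    """Dijkstra shortest-path distances from source, limited to paths <= cutoff.

    `graph` maps each node to a list of (neighbor, edge_length) pairs.
    """
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in graph[u]:
            nd = d + w
            if nd <= cutoff and nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def network_kde(graph, poi_nodes, node, bandwidth):
    """Kernel density at `node`, summing over POIs within `bandwidth` along the network."""
    dist = network_distances(graph, node, bandwidth)
    density = 0.0
    for p in poi_nodes:
        d = dist.get(p)
        if d is not None:
            u = d / bandwidth
            density += 0.75 * (1.0 - u * u) / bandwidth  # Epanechnikov kernel
    return density
```

    The bandwidth here plays the role of the distance attenuation threshold discussed in the abstract: POIs farther than that along the network contribute nothing.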

  11. Audio wiring guide how to wire the most popular audio and video connectors

    CERN Document Server

    Hechtman, John

    2012-01-01

    Whether you're a pro or an amateur, a musician or into multimedia, you can't afford to guess about audio wiring. The Audio Wiring Guide is a comprehensive, easy-to-use guide that explains exactly what you need to know. No matter the size of your wiring project or installation, this handy tool provides you with the essential information you need and the techniques to use it. Using The Audio Wiring Guide is like having an expert at your side. By following the clear, step-by-step directions, you can do professional-level work at a fraction of the cost.

  12. Feature to space conversion during target selection in the dorsolateral and ventrolateral prefrontal cortex of monkeys.

    Science.gov (United States)

    Inoue, Masato; Mikami, Akichika

    2010-03-01

    To investigate the neuronal mechanism of the process of selection of a target from an array of stimuli, we analysed neuronal activity of the lateral prefrontal cortex during the response period of a serial probe reproduction task. During the response period of this task, monkeys were trained to select a memorized target object from an array of three objects and make a saccadic eye movement toward it. Of 611 neurons, 74 neurons showed visual response and 56 neurons showed presaccadic activity during the response period. Among visual neurons, 27 showed array- and target-selectivity. All of these array- and target-selective visual responses were recorded from the ventrolateral prefrontal cortex (VLPFC). Among 56 neurons with presaccadic activity, nine showed target-selective activity, 17 showed target- and direction-selective activity, and 23 showed direction-selective activity. The target-selective, and the target- and direction-selective activities were recorded from the VLPFC, and the direction-selective activities were recorded from VLPFC and dorsolateral prefrontal cortex (DLPFC). The starting time of the activity was earlier for the target-selective, and target- and direction-selective activities in VLPFC, intermediate for the direction-selective activities in VLPFC, and later for the direction-selective activities in DLPFC. These results suggest that VLPFC plays a role in the process of selection of a target object from an array of stimuli, VLPFC and DLPFC play a role in determining the location of the target in space, and DLPFC plays a role in selecting a direction and making a decision to generate a saccadic eye movement.

  13. Music information retrieval in compressed audio files: a survey

    Science.gov (United States)

    Zampoglou, Markos; Malamos, Athanasios G.

    2014-07-01

    In this paper, we present an organized survey of the existing literature on music information retrieval systems in which descriptor features are extracted directly from the compressed audio files, without prior decompression to pulse-code modulation format. Avoiding the decompression step and utilizing the readily available compressed-domain information can significantly lighten the computational cost of a music information retrieval system, allowing application to large-scale music databases. We identify a number of systems relying on compressed-domain information and form a systematic classification of the features they extract, the retrieval tasks they tackle and the degree to which they achieve an actual increase in the overall speed, as well as any resulting loss in accuracy. Finally, we discuss recent developments in the field, and the potential research directions they open toward ultra-fast, scalable systems.

  14. Audio Mining with emphasis on Music Genre Classification

    DEFF Research Database (Denmark)

    Meng, Anders

    2004-01-01

    Audio is an important part of our daily life; it enriches our impression of the world around us, whether through communication, music, danger detection, etc. Currently the field of Audio Mining, which here includes areas of music genre, music recognition / retrieval, playlist generation...... in searching / retrieving audio effectively is needed. Currently, search engines such as Google, AltaVista, etc. do not search into audio files, but use either the textual information attached to the audio file or the textual information around the audio. Also in the hearing aid industries around...... the world the problem of detecting environments from the input audio is researched in order to increase the quality of life of the hearing-impaired. In short, there is a lot of work within the field of audio mining. The presentation will mainly focus on music genre classification where we have a fixed amount of genres...

  15. Extracting meaning from audio signals - a machine learning approach

    DEFF Research Database (Denmark)

    Larsen, Jan

    2007-01-01

    * Machine learning framework for sound search * Genre classification * Music and audio separation * Wind noise suppression...

  16. Cross-modal retrieval of scripted speech audio

    Science.gov (United States)

    Owen, Charles B.; Makedon, Fillia

    1997-12-01

    This paper describes an approach to the problem of searching speech-based digital audio using cross-modal information retrieval. Audio containing speech (speech-based audio) is difficult to search. Open vocabulary speech recognition is advancing rapidly, but cannot yield high accuracy in either search or transcription modalities. However, text can be searched quickly and efficiently with high accuracy. Script-light digital audio is audio that has an available transcription. This is a surprisingly large class of content including legal testimony, broadcasting, dramatic productions and political meetings and speeches. An automatic mechanism for deriving the synchronization between the transcription and the audio allows for very accurate retrieval of segments of that audio. The mechanism described in this paper is based on building a transcription graph from the text and computing biphone probabilities for the audio. A modified beam search algorithm is presented to compute the alignment.

  17. Modeling of the ground-to-SSFMB link networking features using SPW

    Science.gov (United States)

    Watson, John C.

    1993-01-01

    This report describes the modeling and simulation of the networking features of the ground-to-Space Station Freedom manned base (SSFMB) link using COMDISCO signal processing work-system (SPW). The networking features modeled include the implementation of Consultative Committee for Space Data Systems (CCSDS) protocols in the multiplexing of digitized audio and core data into virtual channel data units (VCDU's) in the control center complex and the demultiplexing of VCDU's in the onboard baseband signal processor. The emphasis of this work has been placed on techniques for modeling the CCSDS networking features using SPW. The objectives for developing the SPW models are to test the suitability of SPW for modeling networking features and to develop SPW simulation models of the control center complex and space station baseband signal processor for use in end-to-end testing of the ground-to-SSFMB S-band single access forward (SSAF) link.

  18. Relevant Research on Audio-Tutorial Methods

    Science.gov (United States)

    Novak, Joseph D.

    1970-01-01

    Reviews two aspects of research related to audio-tutorial instructional methods. First, the learning theory of David P. Ausubel is summarized and applied to instructional procedures. Secondly, learning time for attainment of concept and knowledge levels is discussed. Concludes that studies are needed on designs based on Ausubel's theory,…

  19. All About Audio Equalization: Solutions and Frontiers

    Directory of Open Access Journals (Sweden)

    Vesa Välimäki

    2016-05-01

    Full Text Available Audio equalization is a vast and active research area. The extent of research means that one often cannot identify the preferred technique for a particular problem. This review paper bridges those gaps, systemically providing a deep understanding of the problems and approaches in audio equalization, their relative merits and applications. Digital signal processing techniques for modifying the spectral balance in audio signals and applications of these techniques are reviewed, ranging from classic equalizers to emerging designs based on new advances in signal processing and machine learning. Emphasis is placed on putting the range of approaches within a common mathematical and conceptual framework. The application areas discussed herein are diverse, and include well-defined, solvable problems of filter design subject to constraints, as well as newly emerging challenges that touch on problems in semantics, perception and human computer interaction. Case studies are given in order to illustrate key concepts and how they are applied in practice. We also recommend preferred signal processing approaches for important audio equalization problems. Finally, we discuss current challenges and the uncharted frontiers in this field. The source code for methods discussed in this paper is made available at https://code.soundsoftware.ac.uk/projects/allaboutaudioeq.

  20. Transparency benchmarking on audio watermarks and steganography

    Science.gov (United States)

    Kraetzer, Christian; Dittmann, Jana; Lang, Andreas

    2006-02-01

    The evaluation of transparency plays an important role in the context of watermarking and steganography algorithms. This paper introduces a general definition of the term transparency in the context of steganography, digital watermarking and attack based evaluation of digital watermarking algorithms. For this purpose the term transparency is first considered individually for each of the three application fields (steganography, digital watermarking and watermarking algorithm evaluation). From the three results a general definition for the overall context is derived in a second step. The relevance and applicability of the definition given is evaluated in practice using existing audio watermarking and steganography algorithms (which work in time, frequency and wavelet domain) as well as an attack based evaluation suite for audio watermarking benchmarking - StirMark for Audio (SMBA). For this purpose selected attacks from the SMBA suite are modified by adding transparency enhancing measures using a psychoacoustic model. The transparency and robustness of the evaluated audio watermarking algorithms under the original and modified attacks are compared. The results of this paper show that transparency benchmarking will lead to new information regarding the algorithms under observation and their usage. This information can result in concrete recommendations for modification, like the ones resulting from the tests performed here.

  1. Spatial audio quality perception (part 2)

    DEFF Research Database (Denmark)

    Conetta, R.; Brookes, T.; Rumsey, F.;

    2015-01-01

    location, envelopment, coverage angle, ensemble width, and spaciousness. They can also impact timbre, and changes to timbre can then influence spatial perception. Previously obtained data was used to build a regression model of perceived spatial audio quality in terms of spatial and timbral metrics...

  2. An ESL Audio-Script Writing Workshop

    Science.gov (United States)

    Miller, Carla

    2012-01-01

    The roles of dialogue, collaborative writing, and authentic communication have been explored as effective strategies in second language writing classrooms. In this article, the stages of an innovative, multi-skill writing method, which embeds students' personal voices into the writing process, are explored. A 10-step ESL Audio Script Writing Model…

  3. Audible Aliasing Distortion in Digital Audio Synthesis

    Directory of Open Access Journals (Sweden)

    J. Schimmel

    2012-04-01

    Full Text Available This paper deals with aliasing distortion in digital audio signal synthesis of classic periodic waveforms with infinite Fourier series, for electronic musical instruments. When these waveforms are generated in the digital domain, aliasing appears due to their unlimited bandwidth. There are several techniques for the synthesis of these signals that have been designed to avoid or reduce the aliasing distortion. However, these techniques have high computing demands. One can say that today's computers have enough computing power to use these methods. However, we have to realize that today's computer-aided music production requires tens of multi-timbre voices generated simultaneously by software synthesizers, and most of the computing power must be reserved for the hard-disc recording subsystem and real-time audio processing of many audio channels with many audio effects. Trivially generated classic analog synthesizer waveforms are therefore still effective for sound synthesis. We cannot avoid the aliasing distortion, but spectral components produced by the aliasing can be masked by harmonic components and thus made inaudible if a sufficient oversampling ratio is used. This paper deals with the assessment of audible aliasing distortion with the help of a psychoacoustic model of simultaneous masking and compares the computing demands of trivial generation using oversampling with those of other methods.
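
    A minimal sketch of why trivially generated waveforms alias and how oversampling shifts the problem: it lists which harmonics of an ideal sawtooth (harmonic k at k times the fundamental, relative amplitude 1/k) lie above the Nyquist frequency and where they fold back. The function names and the example values (1 kHz fundamental, 48 kHz sample rate) are illustrative assumptions, not taken from the paper, and the psychoacoustic masking model is not reproduced.

```python
def folded_frequency(f, fs):
    """Frequency in [0, fs/2] at which a sinusoid of frequency f appears
    after sampling at rate fs (spectral folding)."""
    r = f % fs
    return fs - r if r > fs / 2 else r


def aliased_harmonics(f0, fs, max_harmonic):
    """Return (harmonic number, folded frequency, relative amplitude 1/k)
    for every sawtooth harmonic up to max_harmonic that exceeds fs/2."""
    result = []
    for k in range(1, max_harmonic + 1):
        f = k * f0
        if f > fs / 2:
            result.append((k, folded_frequency(f, fs), 1.0 / k))
    return result


if __name__ == "__main__":
    # Hypothetical example: 1 kHz sawtooth, harmonics considered up to 30 kHz.
    for fs in (48000, 4 * 48000):
        print(fs, aliased_harmonics(1000, fs, 30))
```

    At 48 kHz the harmonics from 25 kHz upward fold back below the Nyquist frequency, whereas at four times that rate none of the first 30 harmonics alias. In practice the oversampled signal would still have to be band-limited before being converted back to the target rate.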

  4. Structuring Broadcast Audio for Information Access

    Directory of Open Access Journals (Sweden)

    Gauvain Jean-Luc

    2003-01-01

    Full Text Available One rapidly expanding application area for state-of-the-art speech recognition technology is the automatic processing of broadcast audiovisual data for information access. Since much of the linguistic information is found in the audio channel, speech recognition is a key enabling technology which, when combined with information retrieval techniques, can be used for searching large audiovisual document collections. Audio indexing must take into account the specificities of audio data such as needing to deal with the continuous data stream and an imperfect word transcription. Other important considerations are dealing with language specificities and facilitating language portability. At Laboratoire d'Informatique pour la Mécanique et les Sciences de l'Ingénieur (LIMSI), broadcast news transcription systems have been developed for seven languages: English, French, German, Mandarin, Portuguese, Spanish, and Arabic. The transcription systems have been integrated into prototype demonstrators for several application areas such as audio data mining, structuring audiovisual archives, selective dissemination of information, and topic tracking for media monitoring. As examples, this paper addresses the spoken document retrieval and topic tracking tasks.

  5. Audio-visual integration in schizophrenia

    NARCIS (Netherlands)

    Gelder, B.L.M.F. de; Vroomen, J.; Annen, L.; Masthoff, E.D.M.; Hodiamont, P.P.G.

    2003-01-01

    Integration of information provided simultaneously by audition and vision was studied in a group of 18 schizophrenic patients. They were compared to a control group, consisting of 12 normal adults of comparable age and education. By administering two tasks, each focusing on one aspect of audio-visua

  6. Audio-visual integration in schizophrenia.

    NARCIS (Netherlands)

    Gelder, B. de; Vroomen, J.; Annen, L.; Masthof, E.; Hodiamont, P.P.G.

    2003-01-01

    Integration of information provided simultaneously by audition and vision was studied in a group of 18 schizophrenic patients. They were compared to a control group, consisting of 12 normal adults of comparable age and education. By administering two tasks, each focusing on one aspect of audio-visua

  7. Building Digital Audio Preservation Infrastructure and Workflows

    Science.gov (United States)

    Young, Anjanette; Olivieri, Blynne; Eckler, Karl; Gerontakos, Theodore

    2010-01-01

    In 2009 the University of Washington (UW) Libraries special collections received funding for the digital preservation of its audio indigenous language holdings. The university libraries, where the authors work in various capacities, had begun digitizing image and text collections in 1997. Because of this, at the onset of the project, workflows (a…

  8. Calibration of an audio frequency noise generator

    DEFF Research Database (Denmark)

    Diamond, Joseph M.

    1966-01-01

    A noise generator of known output is very convenient in noise measurement. At low audio frequencies, however, all devices, including noise sources, may be affected by excess noise (1/f noise). It is therefore very desirable to be able to check the spectral density of a noise source before it is u...

  9. Debugging of Class-D Audio Power Amplifiers

    DEFF Research Database (Denmark)

    Crone, Lasse; Pedersen, Jeppe Arnsdorf; Mønster, Jakob Døllner;

    2012-01-01

    Determining and optimizing the performance of a Class-D audio power amplifier can be very difficult without knowledge of the use of audio performance measuring equipment and of how the various noise and distortion sources influence the audio performance. This paper gives an introduction on how to measu...

  10. Progressive Syntax-Rich Coding of Multichannel Audio Sources

    Directory of Open Access Journals (Sweden)

    Dai Yang

    2003-09-01

    Full Text Available Being able to transmit the audio bitstream progressively is a highly desirable property for network transmission. MPEG-4 version 2 audio supports fine grain bit rate scalability in the generic audio coder (GAC). It has a bit-sliced arithmetic coding (BSAC) tool, which provides scalability in steps of 1 Kbps per audio channel. There are also several other scalable audio coding methods, which have been proposed in recent years. However, these scalable audio tools are only available for mono and stereo audio material. Little work has been done on progressive coding of multichannel audio sources. MPEG advanced audio coding (AAC) is one of the most distinguished multichannel digital audio compression systems. Based on AAC, we develop in this work a progressive syntax-rich multichannel audio codec (PSMAC). It not only supports fine grain bit rate scalability for the multichannel audio bitstream but also provides several other desirable functionalities. A formal subjective listening test shows that the proposed algorithm achieves an excellent performance at several different bit rates when compared with MPEG AAC.

  11. Enhancement of LSB based Steganography for Hiding Image in Audio

    OpenAIRE

    Pradeep Kumar Singh; R.K.Aggrawal

    2010-01-01

    In this paper we take an in-depth look at steganography by proposing a new method of audio steganography. Emphasis will be on the proposed scheme of image hiding in audio and its comparison with the simple Least Significant Bit insertion method for data hiding in audio.
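
    The baseline that the abstract compares against, simple least-significant-bit insertion, can be sketched as follows. This is a generic illustration on integer PCM samples, not the authors' proposed image-hiding scheme; the function names and the MSB-first bit order are assumptions made for the example.

```python
def embed_lsb(samples, payload):
    """Return a copy of integer PCM samples with the payload bytes written
    into the least significant bits, one bit per sample, MSB first."""
    bits = [(byte >> (7 - i)) & 1 for byte in payload for i in range(8)]
    if len(bits) > len(samples):
        raise ValueError("payload too large for the cover audio")
    stego = list(samples)
    for i, bit in enumerate(bits):
        # Clear the lowest bit, then set it to the payload bit.
        stego[i] = (stego[i] & ~1) | bit
    return stego


def extract_lsb(samples, n_bytes):
    """Read n_bytes back from the least significant bits of the samples."""
    out = bytearray()
    for b in range(n_bytes):
        value = 0
        for s in samples[8 * b: 8 * b + 8]:
            value = (value << 1) | (s & 1)
        out.append(value)
    return bytes(out)


if __name__ == "__main__":
    cover = [100, -101, 7, 0, 32767, -32768, 12, 13] * 2  # made-up samples
    stego = embed_lsb(cover, b"\xa5\x3c")
    print(extract_lsb(stego, 2))
```

    Each sample changes by at most one quantization step, which is why plain LSB insertion is hard to hear but also easy to destroy or detect.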

  12. 47 CFR 10.520 - Common audio attention signal.

    Science.gov (United States)

    2010-10-01

    ... 47 Telecommunication 1 2010-10-01 2010-10-01 false Common audio attention signal. 10.520 Section... Equipment Requirements § 10.520 Common audio attention signal. A Participating CMS Provider and equipment manufacturers may only market devices for public use under part 10 that include an audio attention signal...

  13. Switching-mode Audio Power Amplifiers with Direct Energy Conversion

    DEFF Research Database (Denmark)

    Ljusev, Petar; Andersen, Michael Andreas E.

    2005-01-01

    This paper presents a new class of switching-mode audio power amplifiers, which are capable of direct energy conversion from the AC mains to the audio output. They represent an ultimate integration of a switching-mode power supply and a Class D audio power amplifier, where the intermediate DC bus...

  14. 47 CFR 73.403 - Digital audio broadcasting service requirements.

    Science.gov (United States)

    2010-10-01

    ... 47 Telecommunication 4 2010-10-01 2010-10-01 false Digital audio broadcasting service requirements. 73.403 Section 73.403 Telecommunication FEDERAL COMMUNICATIONS COMMISSION (CONTINUED) BROADCAST RADIO SERVICES RADIO BROADCAST SERVICES Digital Audio Broadcasting § 73.403 Digital audio broadcasting...

  15. Elicitation of attributes for the evaluation of audio-on-audio interference

    DEFF Research Database (Denmark)

    Francombe, Jon; Mason, R.; Dewhirst, M.;

    2014-01-01

    , annoyance, balance and blend, and confusion. Ratings using these attributes were collected in the fourth stage, and a principal component analysis performed. This suggested two dimensions underlying the perception of an audio-on-audio interference situation: The first dimension was labeled “distraction......” and accounted for 89% of the variance; the second dimension, accounting for 10% of the variance, was labeled “balance and blend.” © 2014 Acoustical Society of America...

  16. Video equipment of tele dosimetry and audio; Video equipo de teledosimetria y audio

    Energy Technology Data Exchange (ETDEWEB)

    Ojeda R, M.A.; Padilla C, I. [CFE, Central Laguna Verde, Subgerencia General de Operacion, Proteccion Radiologica, Veracruz (Mexico)]. e-mail: aojega@cfe.gob.mx

    2007-07-01

    Working in a high-radiation area requires detailed knowledge of the work surroundings, effective communication and vision, and close dosimetric control. Where spaces are variable and access is restricted, where noise hinders communication, and where operating conditions, radiation fields and decision making are demanding, tools are needed that give full control of the environment so that timely and effective decisions can be made at the place where the task is carried out. Based on this elementary concept, a project was developed at the Laguna Verde plant to provide an interactive control mechanism for complex spaces: to see, to hear, to speak, to measure. This concept led to a system equipped with closed-circuit television (CCTV), wireless communication systems, wireless tele-dosimetry systems, VHS and DVD recording equipment, and uninterruptible power units. The system requires one electric power socket and the installation of two cables per CCTV camera. It can be moved by one person and is put into operation in 5 minutes using a checklist. The concept was developed in the project named VETA-1 (Video Equipment of Tele dosimetry and Audio). The objective of this work is to present the development of the VETA-1 tool, whose first prototype was completed in May of the present year. The VETA-1 project arose from the need to optimize dose; it is an ALARA tool with countless applications, as was demonstrated in the 12th refueling outage of Unit 1. The VETA-1 project integrates a recording system whose primary purpose is to analyze, at the place where the task is carried out, the details needed for an effective and timely decision, but the resulting information is also useful for personnel training and the planning of future work. The VETA-1 system is a quick-response ALARA control tool. (Author)

  17. Anthropomorphic Coding of Speech and Audio: A Model Inversion Approach

    Directory of Open Access Journals (Sweden)

    W. Bastiaan Kleijn

    2005-06-01

    Full Text Available Auditory modeling is a well-established methodology that provides insight into human perception and that facilitates the extraction of signal features that are most relevant to the listener. The aim of this paper is to provide a tutorial on perceptual speech and audio coding using an invertible auditory model. In this approach, the audio signal is converted into an auditory representation using an invertible auditory model. The auditory representation is quantized and coded. Upon decoding, it is then transformed back into the acoustic domain. This transformation converts a complex distortion criterion into a simple one, thus facilitating quantization with low complexity. We briefly review past work on auditory models and describe in more detail the components of our invertible model and its inversion procedure, that is, the method to reconstruct the signal from the output of the auditory model. We summarize attempts to use the auditory representation for low-bit-rate coding. Our approach also allows the exploitation of the inherent redundancy of the human auditory system for the purpose of multiple description (joint source-channel) coding.

  18. The KUSC Classical Music Dataset for Audio Key Finding

    Directory of Open Access Journals (Sweden)

    Ching-Hua Chuan

    2014-08-01

    Full Text Available In this paper, we present a benchmark dataset based on the KUSC classical music collection and provide baseline key-finding comparison results. Audio key finding is a basic music information retrieval task; it forms an essential component of systems for music segmentation, similarity assessment, and mood detection. Due to copyright restrictions and a labor-intensive annotation process, audio key finding algorithms have only been evaluated using small proprietary datasets to date. To create a common base for systematic comparisons, we have constructed a dataset comprising more than 3,000 excerpts of classical music. The excerpts are made publicly accessible via commonly used acoustic features such as pitch-based spectrograms and chromagrams. We introduce a hybrid annotation scheme that combines the use of title keys with expert validation and correction of only the challenging cases. The expert musicians also provide ratings of key recognition difficulty. Other meta-data include instrumentation. As a demonstration of the use of the dataset, and to provide initial benchmark comparisons for evaluating new algorithms, we conduct a series of experiments reporting key determination accuracy of four state-of-the-art algorithms. We further show the importance of considering factors such as estimated tuning frequency, key strength or confidence value, and key recognition difficulty in key finding. In the future, we plan to expand the dataset to include meta-data for other music information retrieval tasks.

  19. Head Tracking of Auditory, Visual, and Audio-Visual Targets.

    Science.gov (United States)

    Leung, Johahn; Wei, Vincent; Burgess, Martin; Carlile, Simon

    2015-01-01

    The ability to actively follow a moving auditory target with our heads remains unexplored even though it is a common behavioral response. Previous studies of auditory motion perception have focused on the condition where the subjects are passive. The current study examined head tracking behavior to a moving auditory target along a horizontal 100° arc in the frontal hemisphere, with velocities ranging from 20 to 110°/s. By integrating high fidelity virtual auditory space with a high-speed visual presentation we compared tracking responses of auditory targets against visual-only and audio-visual "bisensory" stimuli. Three metrics were measured: onset, RMS, and gain error. The results showed that tracking accuracy (RMS error) varied linearly with target velocity, with a significantly higher rate in audition. Also, when the target moved faster than 80°/s, onset and RMS error were significantly worse in audition than in the other modalities, while responses in the visual and bisensory conditions were statistically identical for all metrics measured. Lastly, audio-visual facilitation was not observed when tracking bisensory targets.

  20. Head Tracking of Auditory, Visual and Audio-Visual Targets

    Directory of Open Access Journals (Sweden)

    Johahn eLeung

    2016-01-01

    Full Text Available The ability to actively follow a moving auditory target with our heads remains unexplored even though it is a common behavioral response. Previous studies of auditory motion perception have focused on the condition where the subjects are passive. The current study examined head tracking behavior to a moving auditory target along a horizontal 100° arc in the frontal hemisphere, with velocities ranging from 20°/s to 110°/s. By integrating high fidelity virtual auditory space with a high-speed visual presentation we compared tracking responses of auditory targets against visual-only and audio-visual bisensory stimuli. Three metrics were measured: onset, RMS and gain error. The results showed that tracking accuracy (RMS error) varied linearly with target velocity, with a significantly higher rate in audition. Also, when the target moved faster than 80°/s, onset and RMS error were significantly worse in audition than in the other modalities, while responses in the visual and bisensory conditions were statistically identical for all metrics measured. Lastly, audio-visual facilitation was not observed when tracking bisensory targets.

  1. Extrastriate Visual Areas Integrate Form Features over Space and Time to Construct Representations of Stationary and Rigidly Rotating Objects.

    Science.gov (United States)

    McCarthy, J Daniel; Kohler, Peter J; Tse, Peter U; Caplovitz, Gideon Paul

    2015-11-01

    When an object moves behind a bush, for example, its visible fragments are revealed at different times and locations across the visual field. Nonetheless, a whole moving object is perceived. Unlike traditional modal and amodal completion mechanisms known to support spatial form integration when all parts of a stimulus are simultaneously visible, relatively little is known about the neural substrates of the spatiotemporal form integration (STFI) processes involved in generating coherent object representations from a succession of visible fragments. We used fMRI to identify brain regions involved in two mechanisms supporting the representation of stationary and rigidly rotating objects whose form features are shown in succession: STFI and position updating. STFI allows past and present form cues to be integrated over space and time into a coherent object even when the object is not visible in any given frame. STFI can occur whether or not the object is moving. Position updating allows us to perceive a moving object, whether rigidly rotating or translating, even when its form features are revealed at different times and locations in space. Our results suggest that STFI is mediated by visual regions beyond V1 and V2. Moreover, although widespread cortical activation has been observed for other motion percepts derived solely from form-based analyses [Tse, P. U. Neural correlates of transformational apparent motion. Neuroimage, 31, 766-773, 2006; Krekelberg, B., Vatakis, A., & Kourtzi, Z. Implied motion from form in the human visual cortex. Journal of Neurophysiology, 94, 4373-4386, 2005], increased responses for the position updating that leads to rigidly rotating object representations were only observed in visual areas KO and possibly hMT+, indicating that this is a distinct and highly specialized type of processing.

  2. Key features of human episodic recollection in the cross-episode retrieval of rat hippocampus representations of space.

    Directory of Open Access Journals (Sweden)

    Eduard Kelemen

    2013-07-01

    Full Text Available Neurophysiological studies focus on memory retrieval as a reproduction of what was experienced and have established that neural discharge is replayed to express memory. However, cognitive psychology has established that recollection is not a verbatim replay of stored information. Recollection is constructive, the product of memory retrieval cues, the information stored in memory, and the subject's state of mind. We discovered key features of constructive recollection embedded in the rat CA1 ensemble discharge during an active avoidance task. Rats learned two task variants, one with the arena stable, the other with it rotating; each variant defined a distinct behavioral episode. During the rotating episode, the ensemble discharge of CA1 principal neurons was dynamically organized to concurrently represent space in two distinct codes. The code for spatial reference frame switched rapidly between representing the rat's current location in either the stationary spatial frame of the room or the rotating frame of the arena. The code for task variant switched less frequently between a representation of the current rotating episode and the stable episode from the rat's past. The characteristics and interplay of these two hippocampal codes revealed three key properties of constructive recollection. (1) Although the ensemble representations of the stable and rotating episodes were distinct, ensemble discharge during rotation occasionally resembled the stable condition, demonstrating cross-episode retrieval of the representation of the remote, stable episode. (2) This cross-episode retrieval at the level of the code for task variant was more likely when the rotating arena was about to match its orientation in the stable episode. (3) The likelihood of cross-episode retrieval was influenced by preretrieval information that was signaled at the level of the code for spatial reference frame. Thus key features of episodic recollection manifest in rat hippocampal

  3. ANALYSIS OF MULTIMODAL FUSION TECHNIQUES FOR AUDIO-VISUAL SPEECH RECOGNITION

    Directory of Open Access Journals (Sweden)

    D.V. Ivanko

    2016-05-01

    Full Text Available The paper presents an analytical review covering the latest achievements in the field of audio-visual (AV) fusion (integration) of multimodal information. We discuss the main challenges and report on approaches to address them. One of the most important tasks of AV integration is to understand how the modalities interact and influence each other. The paper addresses this problem in the context of AV speech processing and speech recognition. In the first part of the review we set out the basic principles of AV speech recognition and give a classification of audio and visual features of speech. Special attention is paid to the systematization of the existing techniques and AV data fusion methods. In the second part we provide a consolidated list of tasks and applications that use AV fusion, based on our analysis of the research area. We also indicate the methods, techniques, and audio and video features used. We propose a classification of AV integration and discuss the advantages and disadvantages of different approaches. We draw conclusions and offer our assessment of the future of the field of AV fusion. In further research we plan to implement a system for audio-visual recognition of continuous Russian speech using advanced methods of multimodal fusion.

  4. Practical Design of Delta-Sigma Multiple Description Audio Coding

    DEFF Research Database (Denmark)

    Leegaard, Jack Højholt; Østergaard, Jan; Jensen, Søren Holdt;

    2014-01-01

    framework is suitable for practical low-delay MD audio coding. In particular, we design a practical MD audio coder with two descriptions and provide simulations on real audio data. The simulations demonstrate that even when using low-dimensional noise-shaping, prediction, and resampling filters......, it is possible to obtain good quality audio in the presence of packet losses. Simulations on real audio reveal that, contrary to existing designs, it is straightforward to obtain a large number of trade-off points between side distortion and central distortion, which makes the proposed coder suitable for a wide...

  5. Automatic Speech Segmentation Based On Audio and Optical Flow Visual Classification

    Directory of Open Access Journals (Sweden)

    Behnam Torabi

    2014-10-01

    Full Text Available Automatic speech segmentation, an important part of automatic speech recognition (ASR) systems, is highly noise dependent. Noise is caused by changes in the communication channel, background, level of speaking, etc. In recent years, many researchers have proposed noise cancellation techniques and have added visual features from the speaker's face to reduce the effect of noise on ASR systems. Removing noise from audio signals depends on the type of the noise, so it cannot be used as a general solution. Adding visual features compensates for this lack of efficiency, but advanced methods of this type need manual extraction of visual features. In this paper we propose a completely automatic system which uses optical flow vectors from the speaker's image sequence to obtain visual features. Then, Hidden Markov Models are trained to segment audio signals from image sequences and audio features based on the extracted optical flow. The segmentation system developed with this method is fully automatic and more robust to noise.

  6. Audio Steganography Techniques-A Survey

    Directory of Open Access Journals (Sweden)

    Navneet Kaur

    2014-06-01

    Full Text Available We can communicate with each other by passing messages, which is not secure, but a communication can be kept secret by embedding the message into a carrier or by using special tools such as invisible ink, microdots, etc. Steganography is the science of communicating secret data in an appropriate carrier, and it has been used for hundreds of years. In the digital age, new techniques for hiding data inside a carrier have been invented, which are known as digital steganography. Nowadays, the carrier of the message can be an image, audio, video or a text file. In this paper we propose a method to enhance the security level in audio steganography and also improve the quality by using 2-level steganography.

  7. Using Audio-Derived Affective Offset to Enhance TV Recommendation

    DEFF Research Database (Denmark)

    Shepstone, Sven Ewan; Tan, Zheng-Hua; Jensen, Søren Holdt

    2014-01-01

    This paper introduces the concept of affective offset, which is the difference between a user's perceived affective state and the affective annotation of the content they wish to see. We show how this affective offset can be used within a framework for providing recommendations for TV programs....... First a user's mood profile is determined using 12-class audio-based emotion classifications. An initial TV content item is then displayed to the user based on the extracted mood profile. The user has the option to either accept the recommendation, or to critique the item once or several times......, by navigating the emotion space to request an alternative match. The final match is then compared to the initial match, in terms of the difference in the items' affective parameterization. This offset is then utilized in future recommendation sessions. The system was evaluated by eliciting three different...

  8. Haptic and Visual feedback in 3D Audio Mixing Interfaces

    DEFF Research Database (Denmark)

    Gelineck, Steven; Overholt, Daniel

    2015-01-01

    in order to augment the perception of the 3D space. We compare different interaction paradigms implemented using these interfaces, aiming to increase speed and accuracy and reduce the need for constant visual feedback. While the LEAP Motion relies upon visual perception and proprioception, users can forego......This paper describes the implementation and informal evaluation of a user interface that explores haptic feedback for 3D audio mixing. The implementation compares different approaches using either the LEAP Motion for mid-air hand gesture control, or the Novint Falcon for active haptic feedback...... visual feedback with interfaces such as the Novint Falcon and rely primarily on haptic cues, allowing more focus on the spatial sound elements. Results of the evaluation support this claim, as users preferred the interaction paradigm using the Falcon with no visual feedback. Furthermore, users disliked...

  9. Museum audio guides as an accessibility enhancer

    OpenAIRE

    Martins, Cláudia Susana Nunes

    2012-01-01

    Accessibility to museums is enhanced by various types of cultural mediation, such as the use of audio guides, which consist of a means for innovative mediation put forth to make the museum visit more autonomous and simultaneously replace the traditional guided visit. Their use is integrated in the tendency for museum democratisation felt in Europe between the 60s and the 80s of the 20th century, especially with the development of educational services at museums and their opening to schools. I...

  10. PDE-SVD Based Audio Denoising

    OpenAIRE

    Baravdish, George; Evangelista, Gianpaolo; Svensson, Olof; Sofya, Faten

    2012-01-01

    In this paper we present a new method for denoising audio signals. The method is based on the Singular Value Decomposition (SVD) of the frame matrix representing the signal in the Overlap Add decomposition. Denoising is performed by modifying both the singular values, using a tapering model, and the singular vectors of the representation, using a nonlinear PDE method. The performance of the method is evaluated and compared with denoising obtained by filtering.

  11. Digitisation of the CERN Audio Archives

    CERN Multimedia

    Maximilien Brice

    2006-01-01

    From the creation of CERN in 1954 until the mid-1980s, the audiovisual service recorded hundreds of hours of moments of life at CERN on audio tapes. These moments range from inaugurations of new facilities to VIP speeches and general-interest cultural seminars. The preservation process started in June 2005. In these pictures, we see Waltraud Hug working on an open-reel tape.

  12. Capacity-optimized mp2 audio watermarking

    Science.gov (United States)

    Steinebach, Martin; Dittmann, Jana

    2003-06-01

    Today a number of audio watermarking algorithms have been proposed, some of them at a quality making them suitable for commercial applications. The focus of most of these algorithms is copyright protection. Therefore, transparency and robustness are the most discussed and optimised parameters. But other applications for audio watermarking can also be identified stressing other parameters like complexity or payload. In our paper, we introduce a new mp2 audio watermarking algorithm optimised for high payload. Our algorithm uses the scale factors of an mp2 file for watermark embedding. They are grouped and masked based on a pseudo-random pattern generated from a secret key. In each group, we embed one bit. Depending on the bit to embed, we change the scale factors by adding 1 where necessary until it includes either more even or uneven scale factors. An uneven group has a 1 embedded, an even group a 0. The same rule is later applied to detect the watermark. The group size can be increased or decreased for transparency/payload trade-off. We embed 160 bits or more in an mp2 file per second without reducing perceived quality. As an application example, we introduce a prototypic Karaoke system displaying song lyrics embedded as a watermark.
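
    The parity rule described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the use of Python's `random` module to derive groups from the key, and the plain integer list standing in for mp2 scale factors are all assumptions made for illustration. Real scale-factor indices have a bounded range, which this sketch ignores.

```python
import random

def group_indices(n, group_size, key):
    """Pseudo-randomly partition indices 0..n-1 into groups (hypothetical keying scheme)."""
    idx = list(range(n))
    random.Random(key).shuffle(idx)  # the same key always yields the same grouping
    return [idx[i:i + group_size] for i in range(0, n - group_size + 1, group_size)]

def read_bit(scale_factors, group):
    """A group with a majority of odd scale factors carries a 1, otherwise a 0."""
    odd = sum(scale_factors[i] % 2 for i in group)
    return 1 if odd > len(group) - odd else 0

def embed(scale_factors, bits, group_size, key):
    """Add 1 to scale factors of the wrong parity until each group reads as its bit."""
    out = list(scale_factors)
    groups = group_indices(len(out), group_size, key)
    for bit, group in zip(bits, groups):  # extra bits beyond the last group are dropped
        for i in group:
            if read_bit(out, group) == bit:
                break
            if out[i] % 2 != bit:
                out[i] += 1
    return out

def detect(scale_factors, n_bits, group_size, key):
    """Apply the same grouping and majority rule to recover the embedded bits."""
    groups = group_indices(len(scale_factors), group_size, key)
    return [read_bit(scale_factors, g) for g in groups[:n_bits]]
```

    An odd group size avoids ties between odd and even counts. Larger groups change a smaller fraction of the scale factors per embedded bit, which corresponds to the transparency/payload trade-off mentioned in the abstract.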

  13. Segmentation of perivascular spaces in 7T MR image using auto-context model with orientation-normalized features.

    Science.gov (United States)

    Park, Sang Hyun; Zong, Xiaopeng; Gao, Yaozong; Lin, Weili; Shen, Dinggang

    2016-07-01

    Quantitative study of perivascular spaces (PVSs) in brain magnetic resonance (MR) images is important for understanding the brain lymphatic system and its relationship with neurological diseases. One of the major challenges is the accurate extraction of PVSs that have very thin tubular structures with various directions in three-dimensional (3D) MR images. In this paper, we propose a learning-based PVS segmentation method to address this challenge. Specifically, we first determine a region of interest (ROI) by using the anatomical brain structure and the vesselness information derived from eigenvalues of image derivatives. Then, in the ROI, we extract a number of randomized Haar features which are normalized with respect to the principal directions of the underlying image derivatives. The classifier is trained by the random forest model that can effectively learn both discriminative features and classifier parameters to maximize the information gain. Finally, a sequential learning strategy is used to further enforce various contextual patterns around the thin tubular structures into the classifier. For evaluation, we apply our proposed method to the 7T brain MR images scanned from 17 healthy subjects aged from 25 to 37. The performance is measured by voxel-wise segmentation accuracy, cluster-wise classification accuracy, and similarity of geometric properties, such as volume, length, and diameter distributions between the predicted and the true PVSs. Moreover, the accuracies are also evaluated on the simulation images with motion artifacts and lacunes to demonstrate the potential of our method in segmenting PVSs from elderly and patient populations. The experimental results show that our proposed method outperforms all existing PVS segmentation methods.

  14. The audio recordings of the Luigi Nono Archive in Venice

    Directory of Open Access Journals (Sweden)

    Luca Cossettini

    2009-11-01

    The audio recordings of the Luigi Nono Archive in Venice: guidelines for preservation and critical edition of audio documents. Studying audio recordings brings us back to ancient source verification problems that too often one thinks are overcome by the technical reproduction of sound. The audio signal is "fixed" on a specific carrier (tape, disc, etc.) with a specific audio format (speed, number of tracks, etc.); the choice of support and format during the first "memorizing" process and the following copying processes is a subjective and, in the case of copying, an interpretative operation conducted within a continuously evolving audio technology. What we listen to today is the result of a transmission process that unavoidably transforms the original acoustic event and the documents that memorize it. Audio recording is in no way a timeless and immutable fixing process. It is therefore necessary to study the transmission processes and to reconstruct the audio document tradition. The re-recording of the tapes of the Archivio Luigi Nono, conducted by the Audio Labs of the DAMS Musica of the University of Udine, offers clear examples of the technical and musicological interpretative problems one can encounter when working with audio recordings.

  15. Multi-semantic audio classification method based on tensor neural network

    Institute of Scientific and Technical Information of China (English)

    邢玲; 贺梅; 马强; 朱敏

    2012-01-01

    Research on audio classification has involved various types of vector features. However, the multiple semantics of audio information not only have their own properties, but are also correlated with one another. A simple vector representation cannot fully represent these multiple semantics and ignores their relations. The Tensor Uniform Content Locator (TUCL) is proposed to express the semantic information of audio, and a three-order Tensor Semantic Space (TSS) is constructed according to the semantic tensor. Tensor Semantic Dispersion (TSD) can aggregate audio resources with the same semantics, and automatic audio classification can be accomplished by calculating their TSD. A Radial Basis Function Tensor Neural Network (RBFTNN) is constructed and used to train an intelligent learning model. For the problem of multi-semantic audio classification, the experimental results show that our method significantly improves classification precision in comparison with the typical Gaussian Mixture Model (GMM) method, and that the classification precision of the RBFTNN model is clearly better than that of the Support Vector Machine (SVM).

  16. Object-Based Change Detection in Urban Areas: The Effects of Segmentation Strategy, Scale, and Feature Space on Unsupervised Methods

    Directory of Open Access Journals (Sweden)

    Lei Ma

    2016-09-01

    Object-based change detection (OBCD) has recently been receiving increasing attention as a result of rapid improvements in the resolution of remote sensing data. However, some OBCD issues relating to the segmentation of high-resolution images remain to be explored. For example, segmentation units derived using different segmentation strategies, segmentation scales, feature space, and change detection methods have rarely been assessed. In this study, we have tested four common unsupervised change detection methods using different segmentation strategies and a series of segmentation scale parameters on two WorldView-2 images of urban areas. We have also evaluated the effect of adding extra textural and Normalized Difference Vegetation Index (NDVI) information instead of using only spectral information. Our results indicated that change detection methods performed better at a medium scale than at a fine scale close to the pixel size. Multivariate Alteration Detection (MAD) always outperformed the other methods tested, at the same confidence level. The overall accuracy appeared to benefit from using a two-date segmentation strategy rather than single-date segmentation. Adding textural and NDVI information appeared to reduce detection accuracy, but the magnitude of this reduction was not consistent across the different unsupervised methods and segmentation strategies. We conclude that a two-date segmentation strategy is useful for change detection in high-resolution imagery, but that the optimization of thresholds is critical for unsupervised change detection methods. Advanced methods need to be explored that can take advantage of additional textural or other parameters.
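
    For readers unfamiliar with the NDVI term used in this abstract, the following minimal sketch shows the standard definition, NDVI = (NIR - Red) / (NIR + Red), and one way it could be appended to a per-segment spectral feature vector. The `segment_features` helper and its (red, nir) pixel layout are hypothetical and are not taken from the paper.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index for one pixel; 0.0 when both bands are zero."""
    total = nir + red
    if total == 0:
        return 0.0
    return (nir - red) / total

def segment_features(pixels):
    """Mean red, mean NIR and mean NDVI over a segment given as (red, nir) pairs.

    Hypothetical feature layout: two spectral means followed by the extra NDVI feature.
    """
    n = len(pixels)
    mean_red = sum(red for red, _ in pixels) / n
    mean_nir = sum(nir for _, nir in pixels) / n
    mean_ndvi = sum(ndvi(nir, red) for red, nir in pixels) / n
    return (mean_red, mean_nir, mean_ndvi)
```

    NDVI lies between -1 and 1, with dense vegetation giving values towards 1 because it reflects strongly in the near infrared and absorbs red light.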

  17. Differences in Human Audio Localization Performance between a HRTF- and a non-HRTF Audio System

    DEFF Research Database (Denmark)

    Larsen, Camilla Horne; Lauritsen, David Skødt; Larsen, Jacob Junker

    2013-01-01

    Spatial audio solutions have been around for a long time in real-time applications, but yielding spatial cues that more closely simulate real life accuracy has been a computational issue, and has often been solved by hardware solutions. This has long been a restriction, but now with more powerful computers this is becoming a lesser and lesser concern and software solutions are now applicable. Most current virtual environment applications do not take advantage of these implementations of accurate spatial cues, however. This paper compares a common implementation of spatial audio and a head-related transfer function (HRTF) system implementation in a study in relation to precision, speed and navigational performance in localizing audio sources in a virtual environment. We found that a system using HRTFs is significantly better at all three performance tasks than a system using panning.

  18. A new alley in Opinion Mining using Senti Audio Visual Algorithm

    Directory of Open Access Journals (Sweden)

    Mukesh Rawat

    2016-02-01

    People share their views about products and services over social media, blogs, forums, etc. Anyone willing to spend resources and money on these products and services will want to learn about them from the past experiences of their peers. Opinion mining plays a vital role in tracking the growing interests of a particular community, social and political events, business strategies, marketing campaigns, etc. This data exists in unstructured form on the internet but, analyzed properly, can be of great use. Sentiment analysis focuses on polarity detection of emotions such as happy, sad or neutral. In this paper we propose an algorithm, Senti Audio Visual, for examining video as well as audio sentiments. A review in the form of video/audio may contain several opinions/emotions; this algorithm classifies the reviews with the help of Bayes classifiers into three classes: positive, negative or neutral. The algorithm uses smiles, cries, gazes, pauses, pitch, and intensity as relevant audio-visual features.

  19. SCALABLE PERCEPTUAL AUDIO REPRESENTATION WITH AN ADAPTIVE THREE TIME-SCALE SINUSOIDAL SIGNAL MODEL

    Institute of Scientific and Technical Information of China (English)

    Al-Moussawy Raed; Yin Junxun; Song Shaopeng

    2004-01-01

    This work is concerned with the development and optimization of a signal model for scalable perceptual audio coding at low bit rates. A complementary two-part signal model consisting of Sines plus Noise (SN) is described. The paper essentially presents a fundamental enhancement to the sinusoidal modeling component. The enhancement involves an audio signal scheme based on carrying out overlap-add sinusoidal modeling at three successive time scales: large, medium, and small. The sinusoidal modeling is done in an analysis-by-synthesis overlap-add manner across the three scales by using a psychoacoustically weighted matching pursuit. The sinusoidal modeling residual at the first scale is passed to the smaller scales to allow for the modeling of various signal features at appropriate resolutions. This approach greatly helps to correct the pre-echo inherent in the sinusoidal model. This improves the perceptual audio quality over our previous work on sinusoidal modeling while using the same number of sinusoids. The most obvious application for the SN model is in scalable, high fidelity audio coding and signal modification.

  20. High-performance combination method of electric network frequency and phase for audio forgery detection in battery-powered devices.

    Science.gov (United States)

    Savari, Maryam; Abdul Wahab, Ainuddin Wahid; Anuar, Nor Badrul

    2016-09-01

    Audio forgery is any criminal act of tampering with, illegally copying, or faking the quality of audio. In the last decade, audio forgery detection has received increasing attention because of a significant increase in the number of forgeries in different types of audio. Of the methods available for forgery detection, electric network frequency (ENF) is one of the most powerful in terms of accuracy. Although ENF achieves suitable accuracy for the majority of plug-in powered devices, its weak accuracy for battery-powered devices, especially laptops and mobile phones, can be considered one of its main obstacles. To solve this accuracy problem in battery-powered devices, a combination method of ENF and phase feature is proposed. In the experiments conducted, ENF alone gives 50% and 60% accuracy for forgery detection in mobile phones and laptops respectively, while the proposed method shows 88% and 92% accuracy respectively. The combination of ENF and phase feature thus leads to higher accuracy for forgery detection.

  1. Feature-space assessment of electrical impedance tomography coregistered with computed tomography in detecting multiple contrast targets

    Energy Technology Data Exchange (ETDEWEB)

    Krishnan, Kalpagam; Liu, Jeff; Kohli, Kirpal [Department of Physics, BC Cancer Agency, Fraser Valley Centre, 13750 96th Avenue, Surrey, British Columbia V3V 1Z2 (Canada)

    2014-06-15

    Purpose: Fusion of electrical impedance tomography (EIT) with computed tomography (CT) can be useful as a clinical tool for providing additional physiological information about tissues, but requires suitable fusion algorithms and validation procedures. This work explores the feasibility of fusing EIT and CT images using an algorithm for coregistration. The imaging performance is validated through feature space assessment on phantom contrast targets. Methods: EIT data were acquired by scanning a phantom using a circuit, configured for injecting current through 16 electrodes, placed around the phantom. A conductivity image of the phantom was obtained from the data using electrical impedance and diffuse optical tomography reconstruction software (EIDORS). A CT image of the phantom was also acquired. The EIT and CT images were fused using a region of interest (ROI) coregistration fusion algorithm. Phantom imaging experiments were carried out on objects of different contrasts, sizes, and positions. The conductive medium of the phantoms was made of a tissue-mimicking bolus material that is routinely used in clinical radiation therapy settings. To validate the imaging performance in detecting different contrasts, the ROI of the phantom was filled with distilled water and normal saline. Spatially separated cylindrical objects of different sizes were used for validating the imaging performance in multiple target detection. Analyses of the CT, EIT and the EIT/CT phantom images were carried out based on the variations of contrast, correlation, energy, and homogeneity, using a gray level co-occurrence matrix (GLCM). A reference image of the phantom was simulated using EIDORS, and the performances of the CT and EIT imaging systems were evaluated and compared against the performance of the EIT/CT system using various feature metrics, detectability, and structural similarity index measures. Results: In detecting distilled and normal saline water in bolus medium, EIT as a stand
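
    The four texture measures named in this abstract (contrast, correlation, energy, homogeneity) are commonly computed from a gray level co-occurrence matrix as in the minimal sketch below. The formulas are the usual textbook ones, not necessarily those used by the authors; the function names, the single pixel offset, and the convention of returning a correlation of 1.0 for a constant image are assumptions. Energy is taken here as the sum of squared probabilities, while some toolkits report the square root of that sum.

```python
from math import sqrt

def glcm(image, levels, dx=1, dy=0):
    """Normalized, directed co-occurrence matrix for the pixel offset (dx, dy)."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    total = 0
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
                total += 1
    return [[v / total for v in row] for row in counts]

def glcm_features(p):
    """Contrast, correlation, energy and homogeneity of a normalized GLCM."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    contrast = sum(v * (i - j) ** 2 for i, j, v in cells)
    energy = sum(v * v for _, _, v in cells)
    homogeneity = sum(v / (1 + (i - j) ** 2) for i, j, v in cells)
    mu_i = sum(i * v for i, _, v in cells)
    mu_j = sum(j * v for _, j, v in cells)
    sd_i = sqrt(sum((i - mu_i) ** 2 * v for i, _, v in cells))
    sd_j = sqrt(sum((j - mu_j) ** 2 * v for _, j, v in cells))
    if sd_i == 0 or sd_j == 0:
        correlation = 1.0  # assumed convention for a constant image
    else:
        correlation = sum((i - mu_i) * (j - mu_j) * v for i, j, v in cells) / (sd_i * sd_j)
    return {"contrast": contrast, "correlation": correlation,
            "energy": energy, "homogeneity": homogeneity}
```

    The image is expected as a list of rows of integer gray levels in the range 0 to levels - 1, so a real CT or EIT image would first need to be quantized to a small number of levels.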

  2. Content-Based Hierarchical Analysis of News Video Using Audio and Visual Information

    Institute of Scientific and Technical Information of China (English)

    2001-01-01

    A scheme for content-based analysis of broadcast news video is presented. First, we separate commercials from news using audiovisual features. Then, we automatically organize news programs into a content hierarchy at various levels of abstraction via effective integration of the video, audio, and text data available from the news programs. Based on these news video structure and content analysis technologies, a TV news video library is generated, from which users can retrieve specific news stories according to their demands.

  3. Origin, Development and Trend of Audio Book: Coping Strategies of Library in the Face of New Audio Resources

    Institute of Scientific and Technical Information of China (English)

    张鹏; 王铮

    2016-01-01

    Traditional audio book patterns have changed in the environment of the Internet, digitization and mobile access. This article reviews the evolution of the concept of the audio book, and then analyzes the history of the audio book, the connection between the audio book and its recording medium, and the popularization, marketization and resource-oriented features of the audio book. Based on this research, the article analyzes the new pattern of the audio book and the transformation it brings to audio content resources, and discusses library strategies for coping with new audio content resources.

  4. Performance Improvement of Threshold based Audio Steganography using Parallel Computation

    OpenAIRE

    Muhammad Shoaib; Zakir Khan; Danish Shehzad; Tamer Dag; Arif Iqbal Umar; Noor Ul Amin

    2016-01-01

    Audio steganography is used to hide secret information inside an audio signal for the secure and reliable transfer of information. Various steganography techniques have been proposed and implemented to ensure an adequate security level. The existing techniques focus either on payload or on security, but none of them has ensured both security and payload at the same time. Data dependency in the existing solutions restricted the steganography mechanism to serial execution. The audio data and secret ...

  5. Stuttering and speech naturalness: audio and audiovisual judgments.

    Science.gov (United States)

    Martin, R R; Haroldson, S K

    1992-06-01

    Unsophisticated raters, using 9-point interval scales, judged speech naturalness and stuttering severity of recorded stutterer and nonstutterer speech samples. Raters judged separately the audio-only and audiovisual presentations of each sample. For speech naturalness judgments of stutterer samples, raters invariably judged the audiovisual presentation more unnatural than the audio presentation of the same sample; but for the nonstutterer samples, there was no difference between audio and audiovisual naturalness ratings. Stuttering severity ratings did not differ significantly between audio and audiovisual presentations of the same samples. Rater reliability, interrater agreement, and intrarater agreement for speech naturalness judgments were assessed.

  6. Standardization Promotes the Quality of Meteorological Audio & Video Service

    Institute of Scientific and Technical Information of China (English)

    2011-01-01

    As an important part of the meteorological sector and a critical basis for enhancing the capability of meteorological disaster prevention and mitigation and climate change response, meteorological standardization is a significant support for facilitating the good and quick development of the meteorological sector. Huafeng Group, as a leading enterprise of meteorological audio & video service, has for years attached much importance to employing the standardization of meteorological audio & video service to improve its management level and quality of programs, enhance the quality of meteorological audio & video service, build the brand image, cultivate high-level backbone personnel, and facilitate the sustainable development of meteorological audio & video service.

  7. A content-based digital audio watermarking algorithm

    Science.gov (United States)

    Zhang, Liping; Zhao, Yi; Xu, Wen Li

    2015-12-01

    Digital audio watermarking embeds inaudible information into digital audio data for the purposes of copyright protection, ownership verification, covert communication, and/or auxiliary data carrying. In this paper, we present a novel watermarking scheme to embed a meaningful gray image into digital audio by quantizing the wavelet coefficients (using integer lifting wavelet transform) of audio samples. Our audio-dependent watermarking procedure directly exploits temporal and frequency perceptual masking of the human auditory system (HAS) to guarantee that the embedded watermark image is inaudible and robust. The watermark is constructed by utilizing still image compression technique, breaking each audio clip into smaller segments, selecting the perceptually significant audio segments to wavelet transform, and quantizing the perceptually significant wavelet coefficients. The proposed watermarking algorithm can extract the watermark image without the help from the original digital audio signals. We also demonstrate the robustness of that watermarking procedure to audio degradations and distortions, e.g., those that result from noise adding, MPEG compression, low pass filtering, resampling, and requantization.
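
    The combination of an integer lifting wavelet transform with coefficient quantization, and the blind extraction it allows, can be illustrated with a minimal sketch. It is not the authors' scheme: it uses a one-level integer Haar lifting step, embeds one bit per approximation coefficient by forcing the quantization index to be even or odd, and omits the perceptual masking, segment selection and image compression steps of the paper. All function names and the `step` parameter are assumptions.

```python
def lift_forward(samples):
    """One-level integer Haar lifting on an even-length list of integer samples."""
    approx, detail = [], []
    for a, b in zip(samples[0::2], samples[1::2]):
        d = a - b                    # detail coefficient
        approx.append(b + d // 2)    # approximation: floor of the pair mean
        detail.append(d)
    return approx, detail

def lift_inverse(approx, detail):
    """Exact integer inverse of lift_forward."""
    samples = []
    for s, d in zip(approx, detail):
        b = s - d // 2
        samples.extend((b + d, b))
    return samples

def quantize_bit(coeff, bit, step):
    """Move coeff to a multiple of step whose index parity equals bit."""
    q = (coeff + step // 2) // step  # nearest quantization index
    if q % 2 != bit:
        q += 1
    return q * step

def embed(samples, bits, step=8):
    """Embed bits into the first len(bits) approximation coefficients."""
    approx, detail = lift_forward(samples)
    for k, bit in enumerate(bits):   # needs len(bits) <= len(samples) // 2
        approx[k] = quantize_bit(approx[k], bit, step)
    return lift_inverse(approx, detail)

def extract(samples, n_bits, step=8):
    """Recover the bits from the watermarked samples alone (no original needed)."""
    approx, _ = lift_forward(samples)
    return [((c + step // 2) // step) % 2 for c in approx[:n_bits]]
```

    Because the lifting steps use only integer arithmetic, the transform is exactly invertible, so the quantized coefficients are recovered unchanged from the watermarked samples when no distortion is applied. A larger `step` tolerates more distortion before a bit flips, at the cost of a larger change to the audio.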

  8. Newnes audio and Hi-Fi engineer's pocket book

    CERN Document Server

    Capel, Vivian

    2013-01-01

    Newnes Audio and Hi-Fi Engineer's Pocket Book, Second Edition provides concise discussion of several audio topics. The book is comprised of 10 chapters that cover different audio equipment. The coverage of the text includes microphones, gramophones, compact discs, and tape recorders. The book also covers high-quality radio, amplifiers, and loudspeakers. The book then reviews the concepts of sound and acoustics, and presents some facts and formulas relevant to audio. The text will be useful to sound engineers and other professionals whose work involves sound systems.

  9. A Novel Algorithm for Robust Audio Watermarking in Wavelet Domain

    Institute of Scientific and Technical Information of China (English)

    FU Yu; WANG Bao-bao; LI Chun-ru; QUAN Ning-qiang

    2004-01-01

    A novel algorithm for digital audio watermarking in the wavelet domain is proposed. First, an original audio signal is decomposed by discrete wavelet transform at three levels. Then, a discrete watermark is embedded into the coefficients of its intermediate frequencies. Finally, the watermarked audio signal is obtained by wavelet reconstruction. The proposed algorithm makes good use of the multiresolution characteristics of the wavelet transform. The original audio signal is not needed when detecting the watermark by correlation. Simulation results show that the watermark is inaudible and robust to noise, filtering and resampling.

  10. On Steganography in Lost Audio Packets

    CERN Document Server

    Mazurczyk, Wojciech; Szczypiorski, Krzysztof

    2011-01-01

    The paper presents a new hidden data insertion procedure, based on the estimated probability of the remaining time of the call, for the steganographic method called LACK (Lost Audio PaCKets steganography). LACK provides hidden communication for real-time services like Voice over IP. The analytical results presented in this paper concern the influence of LACK's hidden data insertion procedures on the method's impact on the quality of voice transmission and its resistance to steganalysis. The proposed hidden data insertion procedure is also compared to the previous steganogram insertion approach based on estimated remaining average call duration.

  11. Non Audio-Video gesture recognition system

    DEFF Research Database (Denmark)

    Craciunescu, Razvan; Mihovska, Albena Dimitrova; Kyriazakos, Sofoklis

    2016-01-01

    Gesture recognition is a topic in computer science and language technology with the goal of interpreting human gestures via mathematical algorithms. Gestures can originate from any bodily motion or state but commonly originate from the face or hand. Current research focuses include emotion recognition from the face and hand gesture recognition. Gesture recognition enables humans to communicate with the machine and interact naturally without any mechanical devices. This paper investigates the possibility of using non-audio/video sensors in order to design a low-cost gesture recognition device...

  12. Audio frequency in vivo optical coherence elastography

    Energy Technology Data Exchange (ETDEWEB)

    Adie, Steven G; Kennedy, Brendan F; Armstrong, Julian J; Alexandrov, Sergey A; Sampson, David D [Optical-Biomedical Engineering Laboratory (OBEL), School of Electrical, Electronic and Computer Engineering, University of Western Australia, 35 Stirling Highway, Crawley, Western Australia 6009 (Australia)], E-mail: dsampson@ee.uwa.edu.au

    2009-05-21

    We present a new approach to optical coherence elastography (OCE), which probes the local elastic properties of tissue by using optical coherence tomography to measure the effect of an applied stimulus in the audio frequency range. We describe the approach, based on analysis of the Bessel frequency spectrum of the interferometric signal detected from scatterers undergoing periodic motion in response to an applied stimulus. We present quantitative results of sub-micron excitation at 820 Hz in a layered phantom and the first such measurements in human skin in vivo.

  13. Mixing audio concepts, practices and tools

    CERN Document Server

    Izhaki, Roey

    2013-01-01

    Your mix can make or break a record, and mixing is an essential catalyst for a record deal. Professional engineers with exceptional mixing skills can earn vast amounts of money and find that they are in demand by the biggest acts. To develop such skills, you need to master both the art and science of mixing. The new edition of this bestselling book offers all you need to know and put into practice in order to improve your mixes. Covering the entire process --from fundamental concepts to advanced techniques -- and offering a multitude of audio samples, tips and tricks, this boo

  14. Audio marketing in the Czech Republic

    OpenAIRE

    Timanov, Vladimir

    2015-01-01

    The aim of the work is the processing and evaluation of an investment project. The project involves the establishment of a firm in the Czech Republic. The branch of entrepreneurship is sensory marketing, or audio-visual marketing. The essence of this field of marketing is the encouragement of sales through influence on the emotional side of the client. Components of the work are market research, analysis of the competitors in this sphere, and the financial plan. As a result, the work will be stru...

  15. Investigating the impact of audio instruction and audio-visual biofeedback for lung cancer radiation therapy

    Science.gov (United States)

    George, Rohini

    Lung cancer accounts for 13% of all cancers in the United States and is the leading cause of cancer deaths among both men and women. The five-year survival for lung cancer patients is approximately 15% (ACS facts & figures). Respiratory motion decreases the accuracy of thoracic radiotherapy during imaging and delivery. To account for respiration, margins are generally added during radiation treatment planning, which may cause substantial dose delivery to normal tissues and increase normal tissue toxicity. To alleviate these effects of respiratory motion, several motion management techniques are available that can reduce the doses to normal tissues, thereby reducing treatment toxicity and allowing dose escalation to the tumor. This may increase the survival probability of patients who have lung cancer and are receiving radiation therapy. However, the accuracy of these motion management techniques is limited by respiration irregularity. The rationale of this thesis was to study the improvement in regularity of respiratory motion by breathing coaching for lung cancer patients using audio instructions and audio-visual biofeedback. A total of 331 patient respiratory motion traces, each four minutes in length, were collected from 24 lung cancer patients enrolled in an IRB-approved breathing-training protocol. It was determined that audio-visual biofeedback significantly improved the regularity of respiratory motion compared to free breathing and audio instruction, thus improving the accuracy of respiratory gated radiotherapy. It was also observed that duty cycles below 30% showed insignificant reduction in residual motion, while above 50% there was a sharp increase in residual motion. The reproducibility of exhale-based gating was higher than that of inhale-based gating. Modeling the respiratory cycles, it was found that cosine and cosine 4 models had the best correlation with individual respiratory cycles. The overall respiratory motion probability distribution

  16. Semantic Labeling of Nonspeech Audio Clips

    Directory of Open Access Journals (Sweden)

    Xiaojuan Ma

    2010-01-01

    Human communication about entities and events is primarily linguistic in nature. While visual representations of information are shown to be highly effective as well, relatively little is known about the communicative power of auditory nonlinguistic representations. We created a collection of short nonlinguistic auditory clips encoding familiar human activities, objects, animals, natural phenomena, machinery, and social scenes. We presented these sounds to a broad spectrum of anonymous human workers using Amazon Mechanical Turk and collected verbal sound labels. We analyzed the human labels in terms of their lexical and semantic properties to ascertain that the audio clips do evoke the information suggested by their pre-defined captions. We then measured the agreement with the semantically compatible labels for each sound clip. Finally, we examined which kinds of entities and events, when captured by nonlinguistic acoustic clips, appear to be well-suited to elicit information for communication, and which ones are less discriminable. Our work is set against the broader goal of creating resources that facilitate communication for people with some types of language loss. Furthermore, our data should prove useful for future research in machine analysis/synthesis of audio, such as computational auditory scene analysis, and annotating/querying large collections of sound effects.

  17. Features of motivation of the crewmembers in an enclosed space at atmospheric pressure changes during breathing inert gases.

    Science.gov (United States)

    Komarevcev, Sergey

    Since the 1960s, our psychologists have been experimenting with small groups in isolation. This work was associated with the beginning of spaceflight and the need to study human behavior in conditions different from the natural human habitat. Those who study human behavior, especially in isolation, know that behavior in isolation differs markedly from behavior in natural situations. This is associated with the development of new, more adaptive behaviors (1). What are the differences? First of all, isolation is achieved by placing the group in a closed space. As experiments show, the crew members' basic personality traits, such as motivation, change. Statement of the problem and methods. In our experiments we were interested in changes in the features of human motivation (strength, stability and direction of motivation) in a closed group under modified atmospheric pressure while breathing inert gases. We were also interested in the external and internal motivation of the individual under these circumstances. To conduct the experiments, we used the experimental barocomplex GVK-250, which housed a group of six men. The task was to spend fifteen days in isolation in the barocomplex while breathing an oxygen-xenon mixture, fifteen days in isolation in the same complex while breathing an oxygen-helium mixture, and fifteen days in isolation in the same complex while breathing normal air. Throughout this time, the subjects were isolated under changing atmospheric pressure, close to the conditions that divers normally deal with. We assumed that breathing inert mixtures can change the strength and stability and, with them, the direction and stability of motivation. To check our results, we planned to use a battery of psychological techniques: 1. the Schwartz technique, which measures personal values and behavior in society, the DORS procedure (measurement of fatigue, monotony, satiety and stress) and riffs, with testing once a week. Our assumption is

  18. Person identification for mobile robot using audio-visual modality

    Science.gov (United States)

    Kim, Young-Ouk; Chin, Sehoon; Lee, Jihoon; Paik, Joonki

    2005-10-01

    Recently, we have experienced significant advances in intelligent service robots. The remarkable features of an intelligent robot include tracking and identification of persons using biometric features. Human-robot interaction is very important because it is one of the final goals of an intelligent service robot. Much research concentrates on two fields: one is self-navigation of a mobile robot and the other is human-robot interaction in natural environments. In this paper we present an effective person identification method for HRI (Human Robot Interaction) using two different types of expert systems. However, most mobile robots run in uncontrolled and complicated environments. This means that face and speech information cannot be guaranteed under varying conditions, such as lighting, noisy sound, and the orientation of the robot. According to the illumination and signal-to-noise ratio around the mobile robot, our proposed fuzzy rule produces a reasonable person identification result. Two embedded HMMs (Hidden Markov Models) are used, one each for the visual and audio modalities, to identify a person. The performance of our proposed system and its experimental results are compared with single-modality identification and a simply mixed method of the two modalities.

  19. Technical Evaluation Report. 65. Video-Conferencing with Audio Software

    Science.gov (United States)

    Baggaley, Jon; Klaas, Jim

    2006-01-01

    An online conference is illustrated using the format of a TV talk show. The conference combined live audio discussion with visual images spontaneously selected by the moderator in the manner of a TV control-room director. A combination of inexpensive online collaborative tools was used for the event, based on the browser-based audio-conferencing…

  20. Minimizing Crosstalk in Self Oscillating Switch Mode Audio Power Amplifiers

    DEFF Research Database (Denmark)

    Knott, Arnold; Ploug, Rasmus Overgaard

    2012-01-01

    The varying switching frequencies of self oscillating switch mode audio amplifiers have been known to cause interchannel intermodulation disturbances in multi channel configurations. This crosstalk phenomenon has a negative impact on the audio performance. The goal of this paper is to present a m...

  1. Multilevel inverter based class D audio amplifier for capacitive transducers

    DEFF Research Database (Denmark)

    Nielsen, Dennis; Knott, Arnold; Andersen, Michael A. E.

    2014-01-01

    The reduced semiconductor voltage stress makes the multilevel inverters especially interesting, when driving capacitive transducers for audio applications. A ± 300 V flying capacitor class D audio amplifier driving a 100 nF load in the midrange region of 0.1-3.5 kHz with Total Harmonic Distortion...

  2. DOA Estimation of Audio Sources in Reverberant Environments

    DEFF Research Database (Denmark)

    Jensen, Jesper Rindom; Nielsen, Jesper Kjær; Heusdens, Richard;

    2016-01-01

    Reverberation is well-known to have a detrimental impact on many localization methods for audio sources. We address this problem by imposing a model for the early reflections as well as a model for the audio source itself. Using these models, we propose two iterative localization methods that est...

  3. Parametric Audio Based Decoder and Music Synthesizer for Mobile Applications

    NARCIS (Netherlands)

    Oomen, A.W.J.; Szczerba, M.Z.; Therssen, D.

    2011-01-01

    This paper reviews parametric audio coders and discusses novel technologies introduced in a low-complexity, low-power consumption audio decoder and music synthesizer platform developed by the authors. The decoder uses a parametric coding scheme based on the MPEG-4 Parametric Audio standard. In order to

  4. Effect of Audio vs. Video on Aural Discrimination of Vowels

    Science.gov (United States)

    McCrocklin, Shannon

    2012-01-01

    Despite the growing use of media in the classroom, the effect of using audio versus video in pronunciation teaching has been largely ignored. To analyze the impact of the use of audio or video training on aural discrimination of vowels, 61 participants (all students at a large American university) took a pre-test followed by two training…

  5. Performance Analysis of Data Hiding in MPEG-4 AAC Audio

    Institute of Scientific and Technical Information of China (English)

    XU Shuzheng; ZHANG Peng; WANG Pengjun; YANG Huazhong

    2009-01-01

    A high capacity data hiding technique was developed for compressed digital audio. As perceptual audio coding has become the accepted technology for storage and transmission of audio signals, compressed audio information hiding enables robust, imperceptible transmission of data within audio signals, thus allowing valuable information to be attached to the content, such as the song title, lyrics, composer's name, and artist or property rights related data. This paper describes simultaneous low bitrate encoding and information hiding for highly compressed audio signals. The information hiding is implemented in the quantization process of the audio content which improves robustness, signal quality, and security. The imperceptibility of the embedded data is ensured based on the masking property of the human auditory system (HAS). The robustness and security are evaluated by various attacking algorithms. Tests with an extended MPEG4 advanced audio coding (AAC) encoder confirm that the method is robust to the regular and singular groups method (RS) and sample pair analysis (SPA) attacks as well as other statistical steganalysis method attacks.

  6. Multi Carrier Modulation Audio Power Amplifier with Programmable Logic

    DEFF Research Database (Denmark)

    Christiansen, Theis; Andersen, Toke Meyer; Knott, Arnold

    2009-01-01

    While switch-mode audio power amplifiers allow compact implementations and high output power levels due to their high power efficiency, they are very well known for creating electromagnetic interference (EMI) with other electronic equipment. To lower the EMI of switch-mode (class D) audio power a...

  7. Decision-level fusion for audio-visual laughter detection

    NARCIS (Netherlands)

    Reuderink, B.; Poel, M.; Truong, K.; Poppe, R.; Pantic, M.

    2008-01-01

    Laughter is a highly variable signal, which can be caused by a spectrum of emotions. This makes the automatic detection of laughter a challenging, but interesting task. We perform automatic laughter detection using audio-visual data from the AMI Meeting Corpus. Audio-visual laughter detection is per

  8. An Audio Stream Redirector for the Ethernet Speaker

    Science.gov (United States)

    Mandrekar, Ishan; Prevelakis, Vassilis; Turner, David Michael

    2004-01-01

    The authors have developed the "Ethernet Speaker" (ES), a network-enabled single board computer embedded into a conventional audio speaker. Audio streams are transmitted in the local area network using multicast packets, and the ES can select any one of them and play it back. A key requirement for the ES is that it must be capable of playing any…

  9. Circular microphone array for multi channel audio recording

    NARCIS (Netherlands)

    Hulsebos, E.M.; De Vries, D.; Boone, M.M.; Schuurmans, T.J.G.

    2004-01-01

    An audio system has a circular microphone array with a number of microphones arranged on a circle for receiving a sound field. A digital signal processor is provided for processing output signals from these microphones. To establish well controlled and sharp directivity patterns the audio system per

  10. Decision-Level Fusion for Audio-Visual Laughter Detection

    NARCIS (Netherlands)

    Reuderink, Boris; Poel, Mannes; Truong, Khiet; Poppe, Ronald; Pantic, Maja; Popescu-Belis, Andrei; Stiefelhagen, Rainer

    2008-01-01

    Laughter is a highly variable signal, which can be caused by a spectrum of emotions. This makes the automatic detection of laughter a challenging, but interesting task. We perform automatic laughter detection using audio-visual data from the AMI Meeting Corpus. Audio-visual laughter detection is

  11. Effect of downsampling and compressive sensing on audio-based continuous cough monitoring.

    Science.gov (United States)

    Casaseca-de-la-Higuera, Pablo; Lesso, Paul; McKinstry, Brian; Pinnock, Hilary; Rabinovich, Roberto; McCloughan, Lucy; Monge-Álvarez, Jesús

    2015-01-01

    This paper presents an efficient cough detection system based on simple decision-tree classification of spectral features from a smartphone audio signal. Preliminary evaluation on voluntary coughs shows that the system can achieve 98% sensitivity and 97.13% specificity when the audio signal is sampled at full rate. With this baseline system, we study possible efficiency optimisations by evaluating the effect of downsampling below the Nyquist rate and how the system performance at low sampling frequencies can be improved by incorporating compressive sensing reconstruction schemes. Our results show that undersampling down to 400 Hz can still keep sensitivity and specificity values above 90% despite aliasing. Furthermore, the sparsity of cough signals in the time domain allows keeping performance figures close to 90% when sampling at 100 Hz using compressive sensing schemes.
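
    The aliasing mentioned in the abstract above can be illustrated with a minimal sketch. It is not taken from the paper: the function name and the example tone frequencies are assumptions chosen for illustration, and only the 400 Hz sampling rate comes from the abstract. It computes where a pure tone appears in the spectrum after sampling below the Nyquist rate.

```python
def aliased_frequency(tone_hz: float, sample_rate_hz: float) -> float:
    """Return the apparent frequency of a pure tone after sampling.

    Hypothetical helper for illustration; not part of the cited system.
    """
    # Wrap the tone into the interval [0, sample_rate).
    folded = tone_hz % sample_rate_hz
    # Frequencies above sample_rate / 2 are mirrored back below it.
    if folded > sample_rate_hz / 2:
        folded = sample_rate_hz - folded
    return folded


if __name__ == "__main__":
    # 400 Hz is the undersampling rate quoted in the abstract;
    # the tone frequencies are arbitrary examples.
    for tone in (150.0, 250.0, 1000.0):
        print(tone, "Hz appears at", aliased_frequency(tone, 400.0), "Hz")
```

    Any spectral feature computed after such undersampling therefore mixes energy from several original bands, which is why it is notable that the reported classifier still performs well.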

  12. Integration of top-down and bottom-up information for audio organization and retrieval

    DEFF Research Database (Denmark)

    Jensen, Bjørn Sand

    The increasing availability of digital audio and music calls for methods and systems to analyse and organize these digital objects. This thesis investigates three elements related to such systems focusing on the ability to represent and elicit the user's view on the multimedia object and the system...... sources based on latent Dirichlet allocation (LDA). The model is used to integrate bottom-up features (reflecting timbre, loudness, tempo and chroma), meta-data aspects (lyrics) and top-down aspects, namely user generated open vocabulary tags. The model and representation is evaluated on the auxiliary...... task of genre and style classification. Eliciting the subjective representation and opinion of users is an important aspect in building personalized systems. The thesis contributes with a setup for modelling and elicitation of preference and other cognitive aspects with focus on audio applications...

  13. El tratamiento documental del mensaje audiovisual Documentary treatment of the audio-visual message

    Directory of Open Access Journals (Sweden)

    Blanca Rodríguez Bravo

    2005-06-01

    Full Text Available Peculiarities of the audio-visual document and the treatment it undergoes in TV broadcasting stations are analyzed. The particular features of images condition their analysis and recovery; this paper establishes stages and procedures for the representation of audio-visual messages with a view to their re-usability. Also, some considerations about the automatic processing of video and the changes introduced by digital TV are made.

  14. High Capacity and Resistance to Additive Noise Audio Steganography Algorithm

    Directory of Open Access Journals (Sweden)

    Haider Ismael Shahadi

    2011-09-01

    Full Text Available Steganography is the art of hiding a message in a cover signal without attracting attention. The requirements of a good steganography algorithm are security, capacity, robustness and imperceptibility. These requirements conflict with one another, so satisfying all of them together is not easy, especially for audio cover signals, because the human auditory system (HAS) is highly sensitive to audio modification. In this paper, we propose a high capacity audio steganography algorithm with good resistance to additive noise. The proposed algorithm is based on the wavelet packet transform and block matching. It has a capacity above 35% of the input audio file size with an acceptable signal to noise ratio. It is also resistant to additive Gaussian noise to about 25 dB. Furthermore, the reconstruction of actual secret messages does not require the original cover audio signal.

  15. A Dither Modulation Audio Watermarking Algorithm Based on HAS

    Directory of Open Access Journals (Sweden)

    Yi-bo Huang

    2012-11-01

    Full Text Available In this study, we propose a dither modulation audio watermarking algorithm based on the human auditory system, applying the theory of dither modulation. The algorithm first converts the binary image watermark into a one-dimensional digital sequence and transforms that sequence using the Fibonacci transform. The audio is then divided into data segments, a discrete wavelet transform is applied to each segment, and each segment adaptively chooses its quantization step. Finally, the watermark is embedded in the transformed low-frequency coefficients using dither modulation. The watermark is extracted without the original audio, so extraction is blind. The experimental results show that the algorithm is robust against noise addition, compression, low-pass filtering and re-sampling attacks.
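
    As a rough illustration of the dither modulation step named in the abstract above, the following sketch shows generic binary dither modulation on a single coefficient. It is not the authors' algorithm: the function names, the fixed step size, and the dither offsets of plus or minus a quarter step are assumptions, and the wavelet transform, Fibonacci transform, and adaptive step selection are omitted.

```python
def embed_bit(coefficient: float, bit: int, step: float) -> float:
    """Quantize a coefficient onto the lattice that encodes `bit`.

    Assumed convention: bit 0 uses dither -step/4, bit 1 uses +step/4,
    so the two lattices are offset from each other by step/2.
    """
    dither = step / 4 if bit else -step / 4
    return step * round((coefficient - dither) / step) + dither


def extract_bit(coefficient: float, step: float) -> int:
    """Blind extraction: pick the bit whose lattice point is nearest."""
    distance_to_0 = abs(coefficient - embed_bit(coefficient, 0, step))
    distance_to_1 = abs(coefficient - embed_bit(coefficient, 1, step))
    return 1 if distance_to_1 < distance_to_0 else 0


if __name__ == "__main__":
    step = 0.5
    coefficients = [0.37, -1.42, 2.05, 10.0]   # made-up coefficients
    bits = [1, 0, 1, 0]
    marked = [embed_bit(c, b, step) for c, b in zip(coefficients, bits)]
    # Perturbations smaller than step/4 do not change the decoded bit.
    noisy = [m + 0.1 for m in marked]
    print([extract_bit(n, step) for n in noisy])
```

    Extraction needs only the step size, not the original coefficient, which matches the blind extraction property the abstract describes.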

  16. Stream/Bounce Event Perception Reveals a Temporal Limit of Motion Correspondence Based on Surface Feature over Space and Time

    Directory of Open Access Journals (Sweden)

    Yousuke Kawachi

    2011-06-01

    Full Text Available We examined how stream/bounce event perception is affected by motion correspondence based on the surface features of moving objects passing behind an occlusion. In the stream/bounce display two identical objects moving across each other in a two-dimensional display can be perceived as either streaming through or bouncing off each other at coincidence. Here, surface features such as colour (Experiments 1 and 2) or luminance (Experiment 3) were switched between the two objects at coincidence. The moment of coincidence was invisible to observers due to an occluder. Additionally, the presentation of the moving objects was manipulated in duration after the feature switch at coincidence. The results revealed that a postcoincidence duration of approximately 200 ms was required for the visual system to stabilize judgments of stream/bounce events by determining motion correspondence between the objects across the occlusion on the basis of the surface feature. The critical duration was similar across motion speeds of objects and types of surface features. Moreover, controls (Experiments 4a–4c) showed that cognitive bias based on feature (colour/luminance) congruency across the occlusion could not fully account for the effects of surface features on the stream/bounce judgments. We discuss the roles of motion correspondence, visual feature processing, and attentive tracking in the stream/bounce judgments.

  17. Stream/bounce event perception reveals a temporal limit of motion correspondence based on surface feature over space and time.

    Science.gov (United States)

    Kawachi, Yousuke; Kawabe, Takahiro; Gyoba, Jiro

    2011-01-01

    We examined how stream/bounce event perception is affected by motion correspondence based on the surface features of moving objects passing behind an occlusion. In the stream/bounce display two identical objects moving across each other in a two-dimensional display can be perceived as either streaming through or bouncing off each other at coincidence. Here, surface features such as colour (Experiments 1 and 2) or luminance (Experiment 3) were switched between the two objects at coincidence. The moment of coincidence was invisible to observers due to an occluder. Additionally, the presentation of the moving objects was manipulated in duration after the feature switch at coincidence. The results revealed that a postcoincidence duration of approximately 200 ms was required for the visual system to stabilize judgments of stream/bounce events by determining motion correspondence between the objects across the occlusion on the basis of the surface feature. The critical duration was similar across motion speeds of objects and types of surface features. Moreover, controls (Experiments 4a-4c) showed that cognitive bias based on feature (colour/luminance) congruency across the occlusion could not fully account for the effects of surface features on the stream/bounce judgments. We discuss the roles of motion correspondence, visual feature processing, and attentive tracking in the stream/bounce judgments.

  18. ESA personal communications and digital audio broadcasting systems based on non-geostationary satellites

    Science.gov (United States)

    Logalbo, P.; Benedicto, J.; Viola, R.

    1993-01-01

    Personal Communications and Digital Audio Broadcasting are two new services that the European Space Agency (ESA) is investigating for future European and Global Mobile Satellite systems. ESA is active in promoting these services in their various mission options including non-geostationary and geostationary satellite systems. A Medium Altitude Global Satellite System (MAGSS) for global personal communications at L and S-band, and a Multiregional Highly inclined Elliptical Orbit (M-HEO) system for multiregional digital audio broadcasting at L-band are described. Both systems are being investigated by ESA in the context of future programs, such as Archimedes, which are intended to demonstrate the new services and to develop the technology for future non-geostationary mobile communication and broadcasting satellites.

  19. ESA personal communications and digital audio broadcasting systems based on non-geostationary satellites

    Science.gov (United States)

    Logalbo, P.; Benedicto, J.; Viola, R.

    Personal Communications and Digital Audio Broadcasting are two new services that the European Space Agency (ESA) is investigating for future European and Global Mobile Satellite systems. ESA is active in promoting these services in their various mission options including non-geostationary and geostationary satellite systems. A Medium Altitude Global Satellite System (MAGSS) for global personal communications at L and S-band, and a Multiregional Highly inclined Elliptical Orbit (M-HEO) system for multiregional digital audio broadcasting at L-band are described. Both systems are being investigated by ESA in the context of future programs, such as Archimedes, which are intended to demonstrate the new services and to develop the technology for future non-geostationary mobile communication and broadcasting satellites.

  20. Horatio Audio-Describes Shakespeare's "Hamlet": Blind and Low-Vision Theatre-Goers Evaluate an Unconventional Audio Description Strategy

    Science.gov (United States)

    Udo, J. P.; Acevedo, B.; Fels, D. I.

    2010-01-01

    Audio description (AD) has been introduced as one solution for providing people who are blind or have low vision with access to live theatre, film and television content. However, there is little research to inform the process, user preferences and presentation style. We present a study of a single live audio-described performance of Hart House…

  1. Subjective and Objective Assessment of Perceived Audio Quality of Current Digital Audio Broadcasting Systems and Web-Casting Applications

    NARCIS (Netherlands)

    Pocta, P.; Beerends, J.G.

    2015-01-01

    This paper investigates the impact of different audio codecs typically deployed in current digital audio broadcasting (DAB) systems and web-casting applications, which represent a main source of quality impairment in these systems and applications, on the quality perceived by the end user. Both subj

  2. Availability of feature-oriented scanning probe microscopy for remote-controlled measurements on board a space laboratory or planet exploration Rover.

    Science.gov (United States)

    Lapshin, Rostislav V

    2009-06-01

    Prospects for a feature-oriented scanning (FOS) approach to investigations of sample surfaces, at the micrometer and nanometer scales, with the use of scanning probe microscopy under space laboratory or planet exploration rover conditions, are examined. The problems discussed include decreasing sensitivity of the onboard scanning probe microscope (SPM) to temperature variations, providing autonomous operation, implementing the capabilities for remote control, self-checking, self-adjustment, and self-calibration. A number of topical problems of SPM measurements in outer space or on board a planet exploration rover may be solved via the application of recently proposed FOS methods.

  3. AUDIO CRYPTANALYSIS- AN APPLICATION OF SYMMETRIC KEY CRYPTOGRAPHY AND AUDIO STEGANOGRAPHY

    Directory of Open Access Journals (Sweden)

    Smita Paira

    2016-09-01

    Full Text Available In the recent trend of network and technology, “Cryptography” and “Steganography” have emerged out as the essential elements of providing network security. Although Cryptography plays a major role in the fabrication and modification of the secret message into an encrypted version yet it has certain drawbacks. Steganography is the art that meets one of the basic limitations of Cryptography. In this paper, a new algorithm has been proposed based on both Symmetric Key Cryptography and Audio Steganography. The combination of a randomly generated Symmetric Key along with LSB technique of Audio Steganography sends a secret message unrecognizable through an insecure medium. The Stego File generated is almost lossless giving a 100 percent recovery of the original message. This paper also presents a detailed experimental analysis of the algorithm with a brief comparison with other existing algorithms and a future scope. The experimental verification and security issues are promising.
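
    The combination of a symmetric key with LSB embedding described in the abstract above can be sketched as follows. This is a toy illustration, not the paper's algorithm: all function names are hypothetical, the audio is represented as a plain list of integer samples, and a repeating-key XOR stands in for a real symmetric cipher (it is not secure).

```python
def xor_with_key(data: bytes, key: bytes) -> bytes:
    """Toy symmetric step: XOR with a repeating key (same call decrypts)."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


def to_bits(data: bytes) -> list:
    """Expand bytes into a list of bits, most significant bit first."""
    return [(byte >> shift) & 1 for byte in data for shift in range(7, -1, -1)]


def from_bits(bits: list) -> bytes:
    """Pack a list of bits (MSB first) back into bytes."""
    out = bytearray()
    for i in range(0, len(bits), 8):
        value = 0
        for bit in bits[i:i + 8]:
            value = (value << 1) | bit
        out.append(value)
    return bytes(out)


def embed(samples: list, message: bytes, key: bytes) -> list:
    """Write the encrypted message into the least significant bits."""
    bits = to_bits(xor_with_key(message, key))
    if len(bits) > len(samples):
        raise ValueError("cover audio is too short for this message")
    stego = list(samples)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & ~1) | bit   # changes each sample by at most 1
    return stego


def extract(stego: list, message_length: int, key: bytes) -> bytes:
    """Read the LSBs back and undo the XOR step."""
    bits = [sample & 1 for sample in stego[:message_length * 8]]
    return xor_with_key(from_bits(bits), key)


if __name__ == "__main__":
    cover = list(range(-50, 50))            # made-up 100-sample cover signal
    secret_key = b"\x5a\xc3"                # made-up key
    stego = embed(cover, b"hi", secret_key)
    print(extract(stego, 2, secret_key))
```

    Because only the lowest bit of each sample changes, the stego signal differs from the cover by at most one quantization level per sample, and the message is recovered exactly when the stego file is transmitted without modification.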

  4. On Building Immersive Audio Applications Using Robust Adaptive Beamforming and Joint Audio-Video Source Localization

    Directory of Open Access Journals (Sweden)

    Beracoechea JA

    2006-01-01

    Full Text Available This paper deals with some of the different problems, strategies, and solutions of building true immersive audio systems oriented to future communication applications. The aim is to build a system where the acoustic field of a chamber is recorded using a microphone array and then is reconstructed or rendered again, in a different chamber using loudspeaker array-based techniques. Our proposal explores the possibility of using recent robust adaptive beamforming techniques for effectively estimating the original sources of the emitting room. A joint audio-video localization method needed in the estimation process as well as in the rendering engine is also presented. The estimated source signal and the source localization information drive a wave field synthesis engine that renders the acoustic field again at the receiving chamber. The system performance is tested using MUSHRA-based subjective tests.

  5. Enhanced Audio LSB Steganography for Secure Communication

    Directory of Open Access Journals (Sweden)

    Muhammad Junaid Hussain

    2016-01-01

    Full Text Available The ease with which data can be sent across the globe via the Internet has made it an obvious choice of medium for online data transmission and communication. This salient trait, however, is constrained by related issues of privacy, the veracity of the information being exchanged over it, and the legitimacy of its sender, together with its availability when needed. Although cryptography is used to address confidentiality concerns, for many it is somewhat limited in scope because encrypted information is discernible. Further, through restrictions imposed on their citizens' personal use of cryptography, various governments have also steered the research arena toward another discipline of information hiding called steganography, whose sole purpose is to make the information being exchanged inaudible. This research focuses on the evolution of a model-based secure LSB steganographic scheme for the digital audio wave file format to withstand passive attack by Warden Wendy.

  6. Particle Filtering on the Audio Localization Manifold

    CERN Document Server

    Ettinger, Evan

    2010-01-01

    We present a novel particle filtering algorithm for tracking a moving sound source using a microphone array. If there are N microphones in the array, we track all $\binom{N}{2}$ delays with a single particle filter over time. Since it is known that tracking in high dimensions is rife with difficulties, we instead integrate into our particle filter a model of the low dimensional manifold that these delays lie on. Our manifold model is based off of work on modeling low dimensional manifolds via random projection trees [1]. In addition, we also introduce a new weighting scheme to our particle filtering algorithm based on recent advancements in online learning. We show that our novel TDOA tracking algorithm that integrates a manifold model can greatly outperform standard particle filters on this audio tracking task.
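
    To make the pairwise delay count concrete, here is a minimal sketch of the geometry only. It does not implement the particle filter or the random projection tree model, and the function names, the microphone layout, and the speed of sound value are assumptions. It computes one time difference of arrival per microphone pair for a given source position.

```python
from itertools import combinations
from math import comb, dist

SPEED_OF_SOUND_M_S = 343.0  # assumed value for air at room temperature


def microphone_pairs(n_mics: int) -> list:
    """All unordered microphone index pairs; there are C(n_mics, 2) of them."""
    return list(combinations(range(n_mics), 2))


def tdoa_vector(mic_positions, source_position, speed=SPEED_OF_SOUND_M_S):
    """Time difference of arrival for every microphone pair, in seconds."""
    delays = []
    for i, j in microphone_pairs(len(mic_positions)):
        distance_i = dist(source_position, mic_positions[i])
        distance_j = dist(source_position, mic_positions[j])
        delays.append((distance_i - distance_j) / speed)
    return delays


if __name__ == "__main__":
    # Made-up four-microphone layout, coordinates in metres.
    mics = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
    delays = tdoa_vector(mics, (2.0, 1.0, 0.5))
    assert len(delays) == comb(len(mics), 2)
    print(len(delays), "delays for", len(mics), "microphones")
```

    Every delay in the vector is a function of the same three source coordinates, which is one way to see why the delays occupy a low dimensional subset of the full delay space.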

  7. A direct broadcast satellite-audio experiment

    Science.gov (United States)

    Vaisnys, Arvydas; Abbe, Brian; Motamedi, Masoud

    1992-03-01

    System studies have been carried out over the past three years at the Jet Propulsion Laboratory (JPL) on digital audio broadcasting (DAB) via satellite. The thrust of the work to date has been on designing power and bandwidth efficient systems capable of providing reliable service to fixed, mobile, and portable radios. It is very difficult to predict performance in an environment which produces random periods of signal blockage, such as encountered in mobile reception where a vehicle can quickly move from one type of terrain to another. For this reason, some signal blockage mitigation techniques were built into an experimental DAB system and a satellite experiment was conducted to obtain both qualitative and quantitative measures of performance in a range of reception environments. This paper presents results from the experiment and some conclusions on the effectiveness of these blockage mitigation techniques.

  8. Time-Scale Invariant Audio Data Embedding

    Directory of Open Access Journals (Sweden)

    Mansour Mohamed F

    2003-01-01

    Full Text Available We propose a novel algorithm for high-quality data embedding in audio. The algorithm is based on changing the relative length of the middle segment between two successive maximum and minimum peaks to embed data. Spline interpolation is used to change the lengths. To ensure smooth monotonic behavior between peaks, a hybrid orthogonal and nonorthogonal wavelet decomposition is used prior to data embedding. The possible data embedding rates are between 20 and 30 bps. However, for practical purposes, we use repetition codes, and the effective embedding data rate is around 5 bps. The algorithm is invariant after time-scale modification, time shift, and time cropping. It gives high-quality output and is robust to mp3 compression.
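
    The repetition coding mentioned in the abstract above trades data rate for reliability. The sketch below is a generic repetition code with majority-vote decoding, not the paper's embedding scheme; the function names and the repetition factor of 5 are assumptions (a factor in that range would be consistent with a raw rate of 20 to 30 bps dropping to roughly 5 bps).

```python
def repeat_encode(bits: list, factor: int = 5) -> list:
    """Repeat every payload bit `factor` times."""
    return [bit for bit in bits for _ in range(factor)]


def repeat_decode(coded: list, factor: int = 5) -> list:
    """Majority vote over each group of `factor` received bits."""
    decoded = []
    for start in range(0, len(coded), factor):
        group = coded[start:start + factor]
        decoded.append(1 if 2 * sum(group) > len(group) else 0)
    return decoded


if __name__ == "__main__":
    payload = [1, 0, 1, 1]
    received = repeat_encode(payload)
    # Flip two of the five copies of the first bit; the vote still recovers it.
    received[0] ^= 1
    received[3] ^= 1
    print(repeat_decode(received) == payload)
```

    With an odd repetition factor the vote never ties, and up to two errors per group of five are corrected.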

  9. An inconclusive digital audio authenticity examination: a unique case.

    Science.gov (United States)

    Koenig, Bruce E; Lacey, Douglas S

    2012-01-01

    This case report sets forth an authenticity examination of 35 encrypted, proprietary-format digital audio files containing recorded telephone conversations between two codefendants in a criminal matter. The codefendant who recorded the conversations did so on a recording system he developed; additionally, he was both a forensic audio authenticity examiner, who had published and presented in the field, and was the head of a professional audio society's writing group for authenticity standards. The authors conducted the examination of the recordings following nine laboratory steps of the peer-reviewed and published 11-step digital audio authenticity protocol. Based considerably on the codefendant's direct involvement with the development of the encrypted audio format, his experience in the field of forensic audio authenticity analysis, and the ease with which the audio files could be accessed, converted, edited in the gap areas, and reconstructed in such a way that the processes were undetected, the authors concluded that the recordings could not be scientifically authenticated through accepted forensic practices.

  10. Sampling Function of Degree 2 for DVD-Audio

    Science.gov (United States)

    Toraichi, Kazuo; Nakamura, Koji

    The authors have been studying Fluency Information Theory, which generalizes Shannon’s sampling theorem, and its applications. Among the practical applications of this research, the Fluency DAC, developed as a digital-to-analog converter for CD audio, received objective recognition, including the Golden Sound Award in 1988. Recently, DVD-Audio, which handles a maximum sampling rate of 192 kHz, has appeared. Because DVD-Audio requires four times the sampling rate of current CD audio, a new Fluency DAC for DVD-Audio was requested, and research to develop it was started. The results of this research received awards in local audio equipment contests in Japan in 2000 and 2001. As the initial report on our project to develop a Fluency DAC capable of handling a maximum sampling rate of 192 kHz, this paper derives the sampling function that acts as the impulse response of such a D/A converter.

  11. Optimized Audio Classification and Segmentation Algorithm by Using Ensemble Methods

    Directory of Open Access Journals (Sweden)

    Saadia Zahid

    2015-01-01

    Full Text Available Audio segmentation is a basis for multimedia content analysis, which is the most important and widely used application nowadays. An optimized audio classification and segmentation algorithm is presented in this paper that segments a superimposed audio stream on the basis of its content into four main audio types: pure-speech, music, environment sound, and silence. The proposed algorithm preserves important audio content and reduces the misclassification rate without using a large amount of training data; it handles noise and is suitable for real-time applications. Noise in an audio stream is segmented out as environment sound. A hybrid classification approach is used: bagged support vector machines (SVMs) with artificial neural networks (ANNs). The audio stream is first classified into speech and nonspeech segments by using bagged support vector machines; the nonspeech segment is further classified into music and environment sound by using artificial neural networks; and lastly, the speech segment is classified into silence and pure-speech segments by a rule-based classifier. Minimal data is used for training the classifier; ensemble methods are used to minimize the misclassification rate, and approximately 98% accurate segments are obtained. A fast and efficient algorithm is designed that can be used with real-time multimedia applications.
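
The three-stage cascade described in this abstract can be sketched as follows. This is a minimal illustration only: the function names and the toy decision rules are hypothetical stand-ins, not the bagged SVMs, neural network, or rule-based classifier trained in the paper.

```python
from typing import Callable, Sequence

Frame = Sequence[float]


def classify_segment(
    frame: Frame,
    is_speech: Callable[[Frame], bool],
    is_music: Callable[[Frame], bool],
    is_silence: Callable[[Frame], bool],
) -> str:
    """Route one segment through the speech/nonspeech cascade."""
    if is_speech(frame):
        # Stage 1 said "speech" (bagged SVMs in the paper).
        # Stage 3: a rule separates silence from pure speech.
        return "silence" if is_silence(frame) else "pure-speech"
    # Stage 2: nonspeech is split into music and environment sound
    # (an ANN in the paper).
    return "music" if is_music(frame) else "environment sound"


# Toy stand-ins (hypothetical rules chosen only to exercise the cascade).
def energy(frame: Frame) -> float:
    return sum(x * x for x in frame) / len(frame)


def toy_is_silence(frame: Frame) -> bool:
    return energy(frame) < 1e-4


def toy_is_speech(frame: Frame) -> bool:
    return frame[0] >= 0  # placeholder decision, not a real speech detector


def toy_is_music(frame: Frame) -> bool:
    return energy(frame) > 0.5
```

Passing the classifiers in as callables keeps the routing logic separate from the models, so each stage could be swapped for a trained model without touching the cascade.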

  12. Characteristics of Abductive Inquiry in Earth and Space Science: An Undergraduate Teacher Prospective Case Study

    Science.gov (United States)

    Ramalis, T. R.; Liliasari; Herdiwidjaya, D.

    2016-08-01

    The purpose of this case study was to describe the characteristic features of learning activities in the domain of earth and space science. The context of this study is earth and space learning activities in three groups of prospective student teachers, on the subjects of the shape and size of Earth, land and sea breezes, and the Moon's orbit, respectively. The analysis was conducted qualitatively from activity data, including analysis of students doing project work, student worksheets, group project report documents, and notes and audio recordings of discussions. The research findings identified three types of abduction during the learning process: theoretical model abduction, factual abduction, and law abduction. Implications for science inquiry learning as well as relevant research were suggested.

  13. Semi-fragile Audio Watermarking Scheme Based on the Approximate Components Energy

    Institute of Scientific and Technical Information of China (English)

    宁超魁; 和红杰; 陈帆; 尹忠科

    2013-01-01

    In order to improve the security of semi-fragile audio watermarking, a semi-fragile audio watermarking algorithm based on the approximate components energy (ACE) is proposed. Every audio frame is divided into two sections. One section is used to extract the feature of the audio frame, and the other is used to hide the watermark data of another audio frame. The feature of an audio frame is the logarithm, to the base α, of the ACE of the chosen audio section from the audio frame. For each audio frame, the feature is encrypted and randomly embedded in the hybrid domain of another audio frame based on the secret key. The validity of an audio frame is determined by the inconsistency between its watermark and those of its neighboring audio frames. This paper also discusses the value of α from the viewpoint of the robustness and distribution of the audio frame feature. Experimental results demonstrate that the proposed scheme can localize the tampered regions accurately and resist collage attacks effectively.
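
The frame feature described in this abstract, the base-α logarithm of the approximate-component energy, can be sketched as below. The abstract does not name the transform, so this sketch assumes a Haar wavelet and a single decomposition level; `haar_approximation` and `frame_feature` are hypothetical names.

```python
import math


def haar_approximation(samples):
    """One level of Haar approximation coefficients (scaled pairwise sums)."""
    return [
        (samples[i] + samples[i + 1]) / math.sqrt(2)
        for i in range(0, len(samples) - 1, 2)
    ]


def frame_feature(section, alpha, levels=1):
    """log_alpha of the energy of the approximation coefficients.

    Assumption: Haar wavelet, `levels` decomposition steps. An all-zero
    section has zero energy, for which the logarithm is undefined.
    """
    approx = list(section)
    for _ in range(levels):
        approx = haar_approximation(approx)
    ace = sum(c * c for c in approx)  # approximate components energy
    if ace <= 0:
        raise ValueError("section has zero approximate-component energy")
    return math.log(ace, alpha)
```

With this definition, a larger α compresses the same energy range into a smaller range of feature values, which is one way to read the trade-off between robustness and feature distribution that the abstract mentions.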

  14. Lattice Vector Quantization Applied to Speech and Audio Coding

    Institute of Scientific and Technical Information of China (English)

    Minjie Xie

    2012-01-01

    Lattice vector quantization (LVQ) has been used for real-time speech and audio coding systems. Compared with conventional vector quantization, LVQ has two main advantages: It has a simple and fast encoding process, and it significantly reduces the amount of memory required. Therefore, LVQ is suitable for use in low-complexity speech and audio coding. In this paper, we describe the basic concepts of LVQ and its advantages over conventional vector quantization. We also describe some LVQ techniques that have been used in speech and audio coding standards of international standards developing organizations (SDOs).
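
The claim that LVQ encodes quickly and needs little memory can be illustrated with nearest-point search in two simple lattices. This is a sketch using the integer lattice Z^n and the checkerboard lattice D_n as examples; the abstract does not say which lattices the cited coding standards use.

```python
import math


def nearest_zn(x):
    """Nearest point of the integer lattice Z^n: round every coordinate."""
    return [math.floor(v + 0.5) for v in x]


def nearest_dn(x):
    """Nearest point of D_n (integer vectors whose coordinates sum to an even number)."""
    rounded = nearest_zn(x)
    if sum(rounded) % 2 == 0:
        return rounded
    # Odd sum: take the coordinate with the largest rounding error and
    # round it the other way, which is the cheapest way to fix the parity.
    k = max(range(len(x)), key=lambda i: abs(x[i] - rounded[i]))
    rounded[k] += 1 if x[k] > rounded[k] else -1
    return rounded
```

Neither function stores or searches a codebook: the codeword is computed directly from the input vector, which is where the speed and memory savings over conventional vector quantization come from.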

  15. A Review on Audio-visual Translation Studies

    Institute of Scientific and Technical Information of China (English)

    李瑶

    2008-01-01

    This paper is dedicated to a thorough review of audio-visual translation studies both at home and abroad. Reviewing the foreign achievements in this specific field of translation studies can shed some light on our national audio-visual practice and research. The review of Chinese scholars' audio-visual translation studies aims to offer potential directions and guidelines for development and to identify neglected aspects as well. Based on the summary of relevant studies, possible topics for further studies are proposed.

  16. Aurally Aided Visual Search Performance Comparing Virtual Audio Systems

    DEFF Research Database (Denmark)

    Larsen, Camilla Horne; Lauritsen, David Skødt; Larsen, Jacob Junker;

    2014-01-01

    Due to increased computational power, reproducing binaural hearing in real-time applications, through usage of head-related transfer functions (HRTFs), is now possible. This paper addresses the differences in aurally-aided visual search performance between an HRTF enhanced audio system (3D...... with white dots. The results indicate that 3D audio yields faster search latencies than panning audio, especially with larger amounts of distractors. The applications of this research could fit virtual environments such as video games or virtual simulations....

  17. A New Steganographic Method for Embedded Image In Audio File

    Directory of Open Access Journals (Sweden)

    Mohammed S. Altaei

    2012-04-01

    Full Text Available Because secure transaction of information is increasing day by day, steganography has become very important and uses modern strategies. Steganography is a strategy in which the required information is concealed in other information such that the second information does not change significantly and appears the same as the original. This work presents a new approach to concealing an encrypted mobile image in an audio file. The proposed method replaces the two LSBs of each selected byte in the audio file, and these bytes are chosen at random locations. It becomes very difficult for an intruder to guess that an image is hidden in the audio.
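
The two-LSB replacement at random byte locations described in this abstract can be sketched as below. Everything here is an assumption made for illustration: the function names, the use of a seeded `random.Random` as the shared key (it is not cryptographically secure), and the treatment of the cover as raw sample bytes with no file header. The encryption of the image that the abstract mentions is omitted.

```python
import random


def _positions(cover_len, count, key):
    """Key-dependent, non-repeating byte positions (hypothetical scheme)."""
    return random.Random(key).sample(range(cover_len), count)


def embed(cover: bytes, payload: bytes, key: int) -> bytearray:
    """Hide payload in the two least significant bits of selected cover bytes."""
    # Split every payload byte into four 2-bit chunks, most significant first.
    chunks = [(byte >> shift) & 0b11 for byte in payload for shift in (6, 4, 2, 0)]
    if len(chunks) > len(cover):
        raise ValueError("cover is too small for this payload")
    stego = bytearray(cover)
    for pos, chunk in zip(_positions(len(cover), len(chunks), key), chunks):
        stego[pos] = (stego[pos] & 0b11111100) | chunk
    return stego


def extract(stego: bytes, payload_len: int, key: int) -> bytes:
    """Recover payload_len bytes using the same key."""
    positions = _positions(len(stego), payload_len * 4, key)
    out = bytearray()
    for i in range(0, len(positions), 4):
        byte = 0
        for pos in positions[i:i + 4]:
            byte = (byte << 2) | (stego[pos] & 0b11)
        out.append(byte)
    return bytes(out)
```

Each modified byte changes by at most 3, which is why the cover audio "does not change significantly"; the receiver needs both the key and the payload length to read the message back.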

  18. Robust message authentication code algorithm for digital audio recordings

    Science.gov (United States)

    Zmudzinski, Sascha; Steinebach, Martin

    2007-02-01

    Current systems and protocols for integrity and authenticity verification of media data do not distinguish between legitimate signal transformation and malicious tampering that manipulates the content. Furthermore, they usually provide no localization or assessment of the relevance of such manipulations with respect to human perception or semantics. We present an algorithm for a robust message authentication code (RMAC) to verify the integrity of audio recordings by means of robust audio fingerprinting and robust perceptual hashing. Experimental results show that the proposed algorithm provides both a high level of distinction between perceptually different audio data and a high robustness against signal transformations that do not change the perceived information.

  19. Musical examination to bridge audio data and sheet music

    Science.gov (United States)

    Pan, Xunyu; Cross, Timothy J.; Xiao, Liangliang; Hei, Xiali

    2015-03-01

    The digitalization of audio is commonly implemented for the purpose of convenient storage and transmission of music and songs in today's digital age. Analyzing digital audio for an insightful look at a specific musical characteristic, however, can be quite challenging for various types of applications. Many existing musical analysis techniques can examine a particular piece of audio data. For example, the frequency of digital sound can be easily read and identified at a specific section in an audio file. Based on this information, we could determine the musical note being played at that instant, but what if you want to see a list of all the notes played in a song? While most existing methods help to provide information about a single piece of the audio data at a time, few of them can analyze the available audio file on a larger scale. The research conducted in this work considers how to further utilize the examination of audio data by storing more information from the original audio file. In practice, we develop a novel musical analysis system Musicians Aid to process musical representation and examination of audio data. Musicians Aid solves the previous problem by storing and analyzing the audio information as it reads it rather than tossing it aside. The system can provide professional musicians with an insightful look at the music they created and advance their understanding of their work. Amateur musicians could also benefit from using it solely for the purpose of obtaining feedback about a song they were attempting to play. By comparing our system's interpretation of traditional sheet music with their own playing, a musician could ensure what they played was correct. More specifically, the system could show them exactly where they went wrong and how to adjust their mistakes. In addition, the application could be extended over the Internet to allow users to play music with one another and then review the audio data they produced. This would be particularly

  20. Dynamically-Loaded Hardware Libraries (HLL) Technology for Audio Applications

    DEFF Research Database (Denmark)

    Esposito, A.; Lomuscio, A.; Nunzio, L. Di

    2016-01-01

    In this work, we apply hardware acceleration to embedded systems running audio applications. We present a new framework, Dynamically-Loaded Hardware Libraries or HLL, to dynamically load hardware libraries on reconfigurable platforms (FPGAs). Provided a library of application-specific processors......, we load on-the-fly the specific processor in the FPGA, and we transfer the execution from the CPU to the FPGA-based accelerator. The proposed architecture provides excellent flexibility with respect to the different audio applications implemented, high quality audio, and an energy efficient solution....

  1. Aurally Aided Visual Search Performance Comparing Virtual Audio Systems

    DEFF Research Database (Denmark)

    Larsen, Camilla Horne; Lauritsen, David Skødt; Larsen, Jacob Junker;

    2014-01-01

    Due to increased computational power, reproducing binaural hearing in real-time applications, through usage of head-related transfer functions (HRTFs), is now possible. This paper addresses the differences in aurally-aided visual search performance between a HRTF enhanced audio system (3D......) and an amplitude panning audio system (panning) in a virtual environment. We present a performance study involving 33 participants locating aurally-aided visual targets placed at fixed positions, under different audio conditions. A varying amount of visual distractors were present, represented as black circles...

  2. Audio Effects Based on Biorthogonal Time-Varying Frequency Warping

    Directory of Open Access Journals (Sweden)

    Cavaliere Sergio

    2001-01-01

    Full Text Available We illustrate the mathematical background and musical use of a class of audio effects based on frequency warping. These effects alter the frequency content of a signal via spectral mapping. They can be implemented in dispersive tapped delay lines based on a chain of all-pass filters. In a homogeneous line with first-order all-pass sections, the signal formed by the output samples at a given time is related to the input via the Laguerre transform. However, most musical signals require a time-varying frequency modification in order to be properly processed. Vibrato in musical instruments or voice intonation in the case of vocal sounds may be modeled as small and slow pitch variations. Simulation of these effects requires techniques for time-varying pitch and/or brightness modification that are very useful for sound processing. The basis for time-varying frequency warping is a time-varying version of the Laguerre transformation. The corresponding implementation structure is obtained as a dispersive tapped delay line, where each frequency-dependent delay element has its own phase response. Thus, time-varying warping results in a space-varying, inhomogeneous propagation structure. We show that time-varying frequency warping is associated with an expansion over biorthogonal sets generalizing the discrete Laguerre basis. Slow time-varying characteristics lead to slowly varying parameter sequences. The corresponding sound transformation does not suffer from discontinuities typical of delay lines based on unit delays.
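
The spectral mapping produced by a first-order all-pass section can be checked numerically. The sketch below assumes the common form A(z) = (z^-1 - b) / (1 - b z^-1) with a real parameter |b| < 1; the abstract does not state the paper's sign convention, and the function names are hypothetical.

```python
import cmath
import math


def allpass_response(omega, b):
    """Frequency response of A(z) = (z^-1 - b) / (1 - b z^-1) at z = e^{j omega}."""
    z_inv = cmath.exp(-1j * omega)
    return (z_inv - b) / (1 - b * z_inv)


def warped_frequency(omega, b):
    """Closed-form warping map: the negated phase of the all-pass section."""
    return omega + 2 * math.atan2(b * math.sin(omega), 1 - b * math.cos(omega))
```

The magnitude of `allpass_response` is 1 at every frequency, so a section changes only the phase. With b = 0 the section is a plain unit delay and the map is the identity; a nonzero b bends the frequency axis, which is the dispersive behaviour the abstract describes.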

  3. Towards Structural Analysis of Audio Recordings in the Presence of Musical Variations

    Directory of Open Access Journals (Sweden)

    Frank Kurth

    2007-01-01

    Full Text Available One major goal of structural analysis of an audio recording is to automatically extract the repetitive structure or, more generally, the musical form of the underlying piece of music. Recent approaches to this problem work well for music where the repetitions largely agree with respect to instrumentation and tempo, as is typically the case for popular music. For other classes of music such as Western classical music, however, musically similar audio segments may exhibit significant variations in parameters such as dynamics, timbre, execution of note groups, modulation, articulation, and tempo progression. In this paper, we propose a robust and efficient algorithm for audio structure analysis, which makes it possible to identify musically similar segments even in the presence of large variations in these parameters. To account for such variations, our main idea is to incorporate invariance at various levels simultaneously: we design a new type of statistical features to absorb microvariations, introduce an enhanced local distance measure to account for local variations, and describe a new strategy for structure extraction that can cope with the global variations. Our experimental results with classical and popular music show that our algorithm performs successfully even in the presence of significant musical variations.

  4. A Perceptual Model for Sinusoidal Audio Coding Based on Spectral Integration

    Directory of Open Access Journals (Sweden)

    Jensen Søren Holdt

    2005-01-01

    Full Text Available Psychoacoustical models have been used extensively within audio coding applications over the past decades. Recently, parametric coding techniques have been applied to general audio and this has created the need for a psychoacoustical model that is specifically suited for sinusoidal modelling of audio signals. In this paper, we present a new perceptual model that predicts masked thresholds for sinusoidal distortions. The model relies on signal detection theory and incorporates more recent insights about spectral and temporal integration in auditory masking. As a consequence, the model is able to predict the distortion detectability. In fact, the distortion detectability defines a (perceptually relevant) norm on the underlying signal space which is beneficial for optimisation algorithms such as rate-distortion optimisation or linear predictive coding. We evaluate the merits of the model by combining it with a sinusoidal extraction method and compare the results with those obtained with the ISO MPEG-1 Layer I-II recommended model. Listening tests show a clear preference for the new model. More specifically, the model presented here leads to a reduction of more than 20% in terms of number of sinusoids needed to represent signals at a given quality level.

  5. Research and Development Platform for Multimedia Streaming of MP3 Audio Content

    Directory of Open Access Journals (Sweden)

    Andrei Novak

    2006-07-01

    Full Text Available In recent years, the MPEG Layer III (MP3) music compression format has become an extremely popular choice for digital audio compression. Its high compression ratio and near-CD-quality sound make it a natural choice for storing and distributing music, especially over the internet, where space and bandwidth are important considerations. For example, using MPEG Layer-3 compression, 40 MByte audio files have been compressed to approximately 3.5 MBytes. As a result of MP3's popularity, a variety of portable MP3 players entered the market. We decided to design and implement a hard-disk-based MP3 player similar to products currently available (e.g. Creative Labs Nomad, Archos Jukebox 6000, Apple iPod, etc.). Our goal was to design the player with minimal cost and to implement an FM stereo radio transmitter module for ease of connectivity. This module resolves the compatibility problems with currently available car audio systems. At the same time, system flexibility and scalability, as well as system evolution to more advanced architectures, were the main principles that drove the development of this platform. The primary enhancement of the platform will be to switch the communication module from an analog FM radio transmitter to digital wired or wireless communication solutions.
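
The file sizes quoted in this abstract imply a compression ratio that is easy to work out. The sketch below also estimates the MP3 bit rate this would correspond to, under the assumption (not stated in the abstract) that the 40 MByte original was CD-quality PCM: 44.1 kHz, 16 bits, stereo.

```python
def compression_ratio(original_mb: float, compressed_mb: float) -> float:
    """Ratio of original size to compressed size."""
    return original_mb / compressed_mb


# Assumed source format: CD-quality PCM (44.1 kHz, 16 bit, 2 channels).
CD_BITRATE_KBPS = 44_100 * 16 * 2 / 1000  # 1411.2 kbit/s

ratio = compression_ratio(40.0, 3.5)   # about 11.4 : 1
implied_kbps = CD_BITRATE_KBPS / ratio  # about 123 kbit/s

print(f"compression ratio ~ {ratio:.1f}:1, implied bit rate ~ {implied_kbps:.0f} kbit/s")
```

Under that assumption the figures are consistent with an MP3 stream in the region of 128 kbit/s.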

  6. Implementation of ETAS (Embedding Text in Audio Signal) Model To Ensure Secrecy

    Directory of Open Access Journals (Sweden)

    K. GEETHA

    2010-07-01

    Full Text Available Steganography is the art of hiding information, and it is evolving as a new secret communication technology. For a long time, information hiding was done using plain text, still images, video and IP datagrams. Embedding secret messages in digital audio signals is now the area of focus. Numerous steganography techniques exist for hiding information in the audio medium. In this work we propose a new model, ETAS (Embedding Text in Audio Signal), that embeds text like existing systems but adds encryption, gaining the full advantages of cryptography. Using steganography it is possible to conceal the very existence of the original text. The results obtained from the proposed model are compared with other existing techniques and shown to be efficient for textual messages larger than 12 KB, as the size of the embedded text is approximately the same as that of the encrypted text. This emphasizes the fact that we are able to ensure secrecy without the additional cost of extra space consumed for the text to be communicated.

  7. The ventriloquist in periphery: impact of eccentricity-related reliability on audio-visual localization.

    Science.gov (United States)

    Charbonneau, Geneviève; Véronneau, Marie; Boudrias-Fournier, Colin; Lepore, Franco; Collignon, Olivier

    2013-10-28

    The relative reliability of separate sensory estimates influences the way they are merged into a unified percept. We investigated how eccentricity-related changes in reliability of auditory and visual stimuli influence their integration across the entire frontal space. First, we surprisingly found that despite a strong decrease in auditory and visual unisensory localization abilities in periphery, the redundancy gain resulting from the congruent presentation of audio-visual targets was not affected by stimuli eccentricity. This result therefore contrasts with the common prediction that a reduction in sensory reliability necessarily induces an enhanced integrative gain. Second, we demonstrate that the visual capture of sounds observed with spatially incongruent audio-visual targets (ventriloquist effect) steadily decreases with eccentricity, paralleling a lowering of the relative reliability of unimodal visual over unimodal auditory stimuli in periphery. Moreover, at all eccentricities, the ventriloquist effect positively correlated with a weighted combination of the spatial resolution obtained in unisensory conditions. These findings support and extend the view that the localization of audio-visual stimuli relies on an optimal combination of auditory and visual information according to their respective spatial reliability. All together, these results evidence that the external spatial coordinates of multisensory events relative to an observer's body (e.g., eyes' or head's position) influence how this information is merged, and therefore determine the perceptual outcome.

  8. Thermal and neutron-physical features of the nuclear reactor for a power pulsation plant for space applications

    Science.gov (United States)

    Gordeev, É. G.; Kaminskii, A. S.; Konyukhov, G. V.; Pavshuk, V. A.; Turbina, T. A.

    2012-05-01

    We have explored the possibility of creating small-size reactors with a high power output with the provision of thermal stability and nuclear safety under standard operating conditions and in emergency situations. The neutron-physical features of such a reactor have been considered and variants of its designs preserving the main principles and approaches of nuclear rocket engine technology are presented.

  9. Mixed-Signal Architectures for High-Efficiency and Low-Distortion Digital Audio Processing and Power Amplification

    Directory of Open Access Journals (Sweden)

    Pierangelo Terreni

    2010-01-01

    Full Text Available The paper addresses the algorithmic and architectural design of digital input power audio amplifiers. A modelling platform, based on a meet-in-the-middle approach between top-down and bottom-up design strategies, allows a fast but still accurate exploration of the mixed-signal design space. Different amplifier architectures are configured and compared to find optimal trade-offs among different cost-functions: low distortion, high efficiency, low circuit complexity and low sensitivity to parameter changes. A novel amplifier architecture is derived; its prototype implements digital processing IP macrocells (oversampler, interpolating filter, PWM cross-point deriver, noise shaper, multilevel PWM modulator, dead time compensator) on a single low-complexity FPGA while off-chip components are used only for the power output stage (LC filter and power MOS bridge); no heatsink is required. The resulting digital input amplifier features a power efficiency higher than 90% and a total harmonic distortion down to 0.13% at power levels of tens of Watts. Discussions towards the full-silicon integration of the mixed-signal amplifier in embedded devices, using BCD technology and targeting power levels of few Watts, are also reported.

  10. Materials Science Research Hardware for Application on the International Space Station: an Overview of Typical Hardware Requirements and Features

    Science.gov (United States)

    Schaefer, D. A.; Cobb, S.; Fiske, M. R.; Srinivas, R.

    2000-01-01

    NASA's Marshall Space Flight Center (MSFC) is the lead center for Materials Science Microgravity Research. The Materials Science Research Facility (MSRF) is a key development effort underway at MSFC. The MSRF will be the primary facility for microgravity materials science research on board the International Space Station (ISS) and will implement the NASA Materials Science Microgravity Research Program. It will operate in the U.S. Laboratory Module and support U.S. Microgravity Materials Science Investigations. This facility is being designed to maintain the momentum of the U.S. role in microgravity materials science and support NASA's Human Exploration and Development of Space (HEDS) Enterprise goals and objectives for Materials Science. The MSRF as currently envisioned will consist of three Materials Science Research Racks (MSRR), which will be deployed to the International Space Station (ISS) in phases. Each rack is being designed to accommodate various Experiment Modules, which comprise processing facilities for peer selected Materials Science experiments. Phased deployment will enable early opportunities for the U.S. and International Partners, and support the timely incorporation of technology updates to the Experiment Modules and sensor devices.

  11. TNO at TRECVID 2008, Combining Audio and Video Fingerprinting for Robust Copy Detection

    NARCIS (Netherlands)

    Doets, P.J.; Eendebak, P.T.; Ranguelova, E.; Kraaij, W.

    2009-01-01

    TNO has evaluated a baseline audio and a video fingerprinting system based on robust hashing for the TRECVID 2008 copy detection task. We participated in the audio, the video and the combined audio-video copy detection task. The audio fingerprinting implementation clearly outperformed the video fing

  12. 37 CFR 201.27 - Initial notice of distribution of digital audio recording devices or media.

    Science.gov (United States)

    2010-07-01

    ... distribution of digital audio recording devices or media. 201.27 Section 201.27 Patents, Trademarks, and... Initial notice of distribution of digital audio recording devices or media. (a) General. This section..., any digital audio recording device or digital audio recording medium in the United States....

  13. Reception of infrasound and audio current in derma nerves

    Institute of Scientific and Technical Information of China (English)

    Jianwen Li; Ziyu Li; Xuezong Ma

    2010-01-01

    Determining the frequency range of derma nerve that responds to audio current is fundamental for the development of skin-hearing technology. Previous studies have shown that the range of derma nerve responding to audio current is 15-15 000 Hz, because audio amplification is not separated from the step-up transformer. Therefore, the present study used a signal generator which directly drives plane electrodes, simplified the original experimental environment for skin-hearing, measured the lower limit voltage of frequency for derma nerve receiving pulse current signals, and revealed that the frequency range of human derma nerve response was as wide as 0.1-30 000 Hz. Results demonstrate that human derma nerve receives audio signals and infrasound within a wide frequency range.

  14. Audio CAPTCHA for SIP-Based VoIP

    Science.gov (United States)

    Soupionis, Yannis; Tountas, George; Gritzalis, Dimitris

    Voice over IP (VoIP) introduces new ways of communication, while utilizing existing data networks to provide inexpensive voice communications worldwide as a promising alternative to the traditional PSTN telephony. SPam over Internet Telephony (SPIT) is one potential source of future annoyance in VoIP. A common way to launch a SPIT attack is the use of an automated procedure (bot), which generates calls and produces audio advertisements. In this paper, our goal is to design appropriate CAPTCHA to fight such bots. We focus on and develop audio CAPTCHA, as the audio format is more suitable for VoIP environments and we implement it in a SIP-based VoIP environment. Furthermore, we suggest and evaluate the specific attributes that audio CAPTCHA should incorporate in order to be effective, and test it against an open source bot implementation.

  15. Perancangan Sistem Audio Mobil berbasiskan Sistem Pakar dan Web

    Directory of Open Access Journals (Sweden)

    Djunaidy Santoso

    2011-11-01

    Full Text Available Designing car audio that fits users' needs is a fun activity. However, the design often consumes more time and is costly, since experts must be consulted several times. For easy access to information in designing a car audio system as well as error prevention, a car audio system based on an expert system and the web is designed for those who do not have sufficient time and money to consult experts directly. This system consists of tutorial modules designed using HyperText Preprocessor (PHP), with MySQL as the database. This car audio system design is evaluated using the black-box testing method, which focuses on the functional needs of the application. Tests are performed by providing inputs and producing outputs corresponding to the function of each module. The test results prove the correspondence between input and output, which means that the program meets the initial goals of the design.

  16. Effectiveness of 3-D audio for warnings in the cockpit

    NARCIS (Netherlands)

    Oving, A.B.; Veltman, J.A.; Bronkhorst, A.W.

    2004-01-01

    Two flight simulator experiments showed that pilots responded faster to the auditory warnings of the TCAS system in the civil cockpit when these warnings were presented with 3D audio than when they were presented with mono sound.

  17. Proper Use of Audio-Visual Aids: Essential for Educators.

    Science.gov (United States)

    Dejardin, Conrad

    1989-01-01

    Criticizes educators as the worst users of audio-visual aids and among the worst public speakers. Offers guidelines for the proper use of an overhead projector and the development of transparencies. (DMM)

  18. Can audio recording improve patients' recall of outpatient consultations?

    DEFF Research Database (Denmark)

    Wolderslund, Maiken; Kofoed, Poul-Erik; Axboe, Mette

    Introduction In order to give patients the possibility to listen to their consultation again, we have designed a system which gives the patients access to digital audio recordings of their consultations. An Interactive Voice Response platform enables the audio recording and gives the patients access...... to replay their consultation. The intervention is evaluated in a randomised controlled trial with 5,460 patients in order to determine whether providing patients with a digital audio recording of the consultation affects the patients' overall perception of their consultation. In addition to this primary...... objective we want to investigate if replay of the consultations improves the patients’ recall of the information given. Methods Interviews are carried out with 40 patients whose consultations have been audio recorded. Patients are divided into two groups, those who have listened to their consultation...

  19. Audio-Visual Integration of Emotional Information

    Directory of Open Access Journals (Sweden)

    Penny Bergman

    2011-10-01

    Full Text Available Emotions are central to our perception of the environment surrounding us (Berlyne, 1971). An important aspect in the emotional response to a sound is dependent on the meaning of the sound, i.e., it is not the physical parameter per se that determines our emotional response to the sound but rather the source of the sound (Genell, 2008), and the relevance it has to the self (Tajadura-Jiménez et al 2010). When exposed to sound together with visual information, the information from both modalities is integrated, altering the perception of each modality, in order to generate a coherent experience. In emotional information this integration is rapid and without requirements of attentional processes (De Gelder, 1999). The present experiment investigates perception of pink noise in two visual settings in a within-subjects design. Nineteen participants rated the same sound twice in terms of pleasantness and arousal in either a pleasant or an unpleasant visual setting. The results showed that pleasantness of the sound decreased in the negative visual setting, thus suggesting an audio-visual integration, where the affective information in the visual modality is translated to the auditory modality when information-markers are lacking in it. The results are discussed in relation to theories of emotion perception.

  20. Multi-Level Audio Classification Architecture

    Directory of Open Access Journals (Sweden)

    Jozef Vavrek

    2015-01-01

    Full Text Available A multi-level classification architecture for solving a binary discrimination problem is proposed in this paper. The main idea of the proposed solution is derived from the fact that solving one binary discrimination problem multiple times can reduce the overall misclassification error. We aimed our effort towards building a classification architecture that combines multiple binary SVM (Support Vector Machine) classifiers for solving a two-class discrimination problem. Therefore, we developed a binary discrimination architecture employing the SVM classifier (BDASVM) with the intention of using it for classification of broadcast news (BN) audio data. The fundamental element of BDASVM is the binary decision (BD) algorithm that performs discrimination between each pair of acoustic classes utilizing a decision function modeled by a separating hyperplane. The overall classification accuracy is conditioned by finding the optimal parameters for the discrimination function, resulting in higher computational complexity. The final form of the proposed BDASVM is created by combining four BDSVM discriminators supplemented by a decision table. Experimental results show that the proposed classification architecture can decrease the overall classification error in comparison with the binary decision trees SVM (BDTSVM) architecture.
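
    The idea of deciding one two-class problem several times and combining the answers can be illustrated with a minimal sketch. It assumes plain linear decision functions and a simple majority vote; the record does not give the actual BDASVM decision table, so `Hyperplane`, `decide` and `vote` are hypothetical names used only for illustration.

```python
from typing import Sequence, Tuple

# Hypothetical representation of a separating hyperplane: (weights w, bias b).
Hyperplane = Tuple[Sequence[float], float]


def decide(plane: Hyperplane, x: Sequence[float]) -> int:
    """Return +1 or -1 depending on the side of the hyperplane x falls on."""
    w, b = plane
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score >= 0 else -1


def vote(planes: Sequence[Hyperplane], x: Sequence[float]) -> int:
    """Combine several binary decisions by majority vote (assumed rule)."""
    total = sum(decide(p, x) for p in planes)
    return 1 if total >= 0 else -1
```

    With three discriminators, a single wrong decision is outvoted by the other two, which is the intuition behind repeating the binary discrimination.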

  1. IELTS speaking instruction through audio/voice conferencing

    Directory of Open Access Journals (Sweden)

    Hamed Ghaemi

    2012-02-01

    Full Text Available The current study aims at investigating the impact of Audio/Voice conferencing, as a new approach to teaching speaking, on the speaking performance and/or speaking band score of IELTS candidates. Experimental group subjects participated in an audio conferencing class while those of the control group enjoyed attending in a traditional IELTS Speaking class. At the end of the study, all subjects participated in an IELTS Examination held on November fourth in Tehran, Iran. To compare the group means for the study, an independent t-test analysis was employed. The difference between experimental and control group was considered to be statistically significant (P<0.01). That is, the candidates in experimental group have outperformed the ones in control group in IELTS Speaking test scores.
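
    For readers unfamiliar with the independent t-test mentioned above, the following is a minimal sketch of the pooled-variance (Student) form. Whether the study used this form or Welch's variant is not stated, so that choice is an assumption, and the function name `independent_t` is illustrative.

```python
import math
from statistics import mean, variance


def independent_t(a, b):
    """Pooled-variance t statistic and degrees of freedom for two groups."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    # Pooled sample variance, weighting each group by its degrees of freedom.
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / math.sqrt(pooled * (1 / na + 1 / nb))
    return t, df
```

    The returned statistic would then be compared against the t distribution with `df` degrees of freedom to obtain a P value.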

  2. Ferrite bead effect on Class-D amplifier audio quality

    OpenAIRE

    Haddad, Kevin El; Mrad, Roberto; Morel, Florent; Pillonnet, Gael; Vollaire, Christian; Nagari, Angelo

    2014-01-01

    International audience; This paper studies the effect of ferrite beads on the audio quality of Class-D audio amplifiers. The latter is a switching circuit which creates high frequency harmonics. Generally, a filter is used at the amplifier output for the sake of electromagnetic compatibility (EMC). Often, in integrated solutions, this filter contains ferrite beads, which are magnetic components and present nonlinear behavior. Time domain measurements and their equivalence in frequency do...

  3. Audio Arduino - an ALSA (Advanced Linux Sound Architecture) audio driver for FTDI-based Arduinos

    DEFF Research Database (Denmark)

    Dimitrov, Smilen; Serafin, Stefania

    2011-01-01

    Technology Devices International Ltd [FTDI] company) can be demonstrated to behave as a full-duplex, mono, 8-bit 44.1 kHz soundcard, through an implementation of: a PC audio driver for ALSA (Advanced Linux Sound Architecture); a matching program for the Arduino's ATmega microcontroller - and nothing more...... than headphones (and a couple of capacitors). The main contribution of this paper is to bring a holistic aspect to the discussion on the topic of implementation of soundcards - also by referring to open-source driver, microcontroller code and test methods; and outline a complete implementation...

  4. Automated processing of massive audio/video content using FFmpeg

    Directory of Open Access Journals (Sweden)

    Kia Siang Hock

    2014-01-01

    Full Text Available Audio and video content forms an integral, important and expanding part of the digital collections in libraries and archives world-wide. While these memory institutions are familiar and well-versed in the management of more conventional materials such as books, periodicals, ephemera and images, the handling of audio (e.g., oral history recordings) and video content (e.g., audio-visual recordings, broadcast content) requires additional toolkits. In particular, a robust and comprehensive tool that provides a programmable interface is indispensable when dealing with tens of thousands of hours of audio and video content. FFmpeg is comprehensive and well-established open source software that is capable of the full range of audio/video processing tasks (such as encode, decode, transcode, mux, demux, stream and filter). It is also capable of handling a wide range of audio and video formats, a unique challenge in memory institutions. It comes with a command line interface, as well as a set of developer libraries that can be incorporated into applications.
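
    As a rough illustration of driving the FFmpeg command line from a program, the sketch below builds one transcode command per file and runs it. The folder layout, the WAV-to-MP3 choice and the 128k bitrate are assumptions made for the example, not details from the record; the `libmp3lame` encoder is only available if the local FFmpeg build includes it.

```python
import subprocess
from pathlib import Path
from typing import List


def build_transcode_command(src: Path, dst_dir: Path) -> List[str]:
    """Build an ffmpeg argument list that converts src to an MP3 in dst_dir."""
    dst = dst_dir / (src.stem + ".mp3")
    return [
        "ffmpeg",
        "-n",                 # never overwrite an existing output file
        "-i", str(src),       # input file
        "-vn",                # drop any video stream
        "-c:a", "libmp3lame", # audio encoder (assumed to be compiled in)
        "-b:a", "128k",       # audio bitrate (illustrative value)
        str(dst),
    ]


def transcode_all(src_dir: Path, dst_dir: Path) -> None:
    """Transcode every .wav file in src_dir; raises if ffmpeg reports an error."""
    dst_dir.mkdir(parents=True, exist_ok=True)
    for src in sorted(src_dir.glob("*.wav")):
        subprocess.run(build_transcode_command(src, dst_dir), check=True)
```

    Passing the arguments as a list rather than a shell string avoids quoting problems with file names that contain spaces.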

  5. A Generic Method of Recognizing Specific Type Audio Stream

    Institute of Scientific and Technical Information of China (English)

    罗森林; 李金玉; 潘丽敏

    2011-01-01

    To meet the security demand of audio information, a generic method of recognizing a specific type of audio stream based on Mel-frequency cepstral coefficients (MFCC) and AdaBoost is proposed in this paper, which can accurately detect and locate fragments of the specific type within the audio stream. The commonalities and differences between subcategories of the audio type are analyzed, and the common features are used to achieve generic recognition. Gunshots of multiple types are taken as the specific audio type. A generic gunshot template is obtained by extracting the common features of the gunshot subcategories and weakening the differences between them, so that a single template supports accurate identification of multiple subcategories. The experiments show that the recognition accuracy of the proposed method is 87.6% and the recall rate reaches 91.8%.

  6. Determination of over current protection thresholds for class D audio amplifiers

    DEFF Research Database (Denmark)

    Nyboe, Flemming; Risbo, L; Andreani, Pietro

    2005-01-01

    Monolithic class-D audio amplifiers typically feature built-in over current protection circuitry that shuts down the amplifier in case of a short circuit on the output speaker terminals. To minimize cost, the threshold at which the device shuts down must be set just above the maximum current...... that can flow in the loudspeaker during normal operation. The current required is determined by the complex loudspeaker impedance and properties of the music signals played. This work presents a statistical analysis of peak output currents when playing music on typical loudspeakers for home entertainment....

  7. The Citation Patterns on the Papers' Feature Space of Being Cited

    Institute of Scientific and Technical Information of China (English)

    徐建中; 王名扬

    2013-01-01

    A new concept, the feature space of papers being cited, is proposed by considering the knowledge flow properties hidden behind papers' citation counts. The papers' citation patterns on this feature space are then discussed over two time periods: (a) the first five years after publication; (b) the whole citation life cycle. The effect of the distribution over the feature space on a paper's final citation count is analyzed in depth. We found that papers with a wide distribution on the feature space in the early stage after publication have a higher probability of becoming highly cited papers. The results are helpful for a better understanding of how highly cited papers form, and for more accurate prediction of future highly cited papers.

  8. Biomedical image representation approach using visualness and spatial information in a concept feature space for interactive region-of-interest-based retrieval.

    Science.gov (United States)

    Rahman, Md Mahmudur; Antani, Sameer K; Demner-Fushman, Dina; Thoma, George R

    2015-10-01

    This article presents an approach to biomedical image retrieval by mapping image regions to local concepts where images are represented in a weighted entropy-based concept feature space. The term "concept" refers to perceptually distinguishable visual patches that are identified locally in image regions and can be mapped to a glossary of imaging terms. Further, the visual significance (e.g., visualness) of concepts is measured as the Shannon entropy of pixel values in image patches and is used to refine the feature vector. Moreover, the system can assist the user in interactively selecting a region-of-interest (ROI) and searching for similar image ROIs. Further, a spatial verification step is used as a postprocessing step to improve retrieval results based on location information. The hypothesis that such approaches would improve biomedical image retrieval is validated through experiments on two different data sets, which are collected from open access biomedical literature.
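
    The entropy-based "visualness" measure described above can be sketched as the Shannon entropy of the pixel-value histogram of a patch. The flat list of intensities and the function name `patch_entropy` are assumptions for illustration; the article's exact weighting of the feature vector is not reproduced here.

```python
import math
from collections import Counter
from typing import Sequence


def patch_entropy(pixels: Sequence[int]) -> float:
    """Shannon entropy (in bits) of the pixel values in one image patch."""
    n = len(pixels)
    if n == 0:
        return 0.0
    counts = Counter(pixels)
    # H = -sum(p * log2(p)) over the observed pixel values.
    return -sum((c / n) * math.log2(c / n) for c in counts.values())
```

    A uniform patch gives zero entropy, while a patch whose values are spread evenly over many levels gives a high value, which is why entropy can serve as a proxy for visual significance.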

  9. Features of Dongjing’s Commercial Space in the Northern Song Dynasty: An Interpretation Based on Riverside Scene at Qingming Festival

    Institute of Scientific and Technical Information of China (English)

    Zhu Jin; Pan Jiahong; Zhu Xiaofeng; Li Min

    2015-01-01

    With an analysis on the city image presented by the painting of Riverside Scene at Qingming Festival, as well as other relevant documents, this paper explores the factors that caused the market system reform in the Northern Song Dynasty. It also explores the features of Dongjing, the capital city’s commercial space prompted by the reform, revealing that the growth of urban population, the rise of its commercial status and the emergence of citizen class were the essential factors contributing to the market system reform. It concludes that Dongjing’s commercial space shows the following characteristics: developing in a linear form, gradually forming a commercial network system by integrating various shops, markets, and warehouses, expanding to the Outer City to combine the prosperous grassroots markets, and hosting commercial activities with longer business time.

  10. On the Features of "Travel through Space-time" Network Fiction

    Institute of Scientific and Technical Information of China (English)

    李玉萍

    2012-01-01

    "Travel through space-time" network fiction is the textual form taken by the "travel through space-time" literary motif in the era of network media. Starting from an analysis of the characteristics of that literary motif and of network media, this paper examines the literary-motif features and network-art features of such fiction, and then analyzes its spatial and temporal characteristics. The conclusion is that, in the literary sense, "travel through space-time" network fiction is a brand-new form of the novel; in the aesthetic sense, it realizes people's virtual existence in the digital environment to the maximum extent.

  11. Extraction of Subject-Specific Facial Expression Categories and Generation of Facial Expression Feature Space using Self-Mapping

    Directory of Open Access Journals (Sweden)

    Masaki Ishii

    2008-06-01

    Full Text Available This paper proposes a generation method of a subject-specific Facial Expression Map (FEMap) using the Self-Organizing Maps (SOM) of unsupervised learning and Counter Propagation Networks (CPN) of supervised learning together. The proposed method consists of two steps. In the first step, the topological change of a face pattern in the expressional process of facial expression is learned hierarchically using the SOM of a narrow mapping space, and the number of subject-specific facial expression categories and the representative images of each category are extracted. Psychological significance based on the neutral and six basic emotions (anger, sadness, disgust, happiness, surprise, and fear) is assigned to each extracted category. In the latter step, the categories and the representative images described above are learned using the CPN of a large mapping space, and a category map that expresses the topological characteristics of facial expression is generated. This paper defines this category map as an FEMap. Experimental results for six subjects show that the proposed method can generate a subject-specific FEMap based on the topological characteristics of facial expression appearing on face images.

  12. Deep Feature Representation Based on Granular Transaction of Quotient Space Theory

    Institute of Scientific and Technical Information of China (English)

    陈洁; 张燕平

    2014-01-01

    At present, the big data problem urgently needs to be solved, and the key is the feature representation of problems. The most popular theory of feature representation is deep learning. But how many layers are needed, and how many features are needed in each layer? These are the most urgent questions in deep learning. In this paper, the authors introduce Quotient Space Theory (QST) to improve deep learning theory. Features are represented in depth according to the granular transaction principle, which overcomes the shortcomings that the depth is uncertain and the feature description is indefinite. First, following the granular transaction principle of QST, the features of a problem are described hierarchically in multi-granular spaces to form a multilayer deep feature representation. Next, the problem is solved in different granular spaces. Finally, an experiment on the Letter-recognition data set shows that the proposed deep feature representation automatically divides the problem into a multilayer structure, describes its features hierarchically, and raises the solution precision.

  13. Spaces

    Directory of Open Access Journals (Sweden)

    Maziar Nekovee

    2010-01-01

    Full Text Available Cognitive radio is being intensively researched as the enabling technology for license-exempt access to the so-called TV White Spaces (TVWS), large portions of spectrum in the UHF/VHF bands which become available on a geographical basis after digital switchover. Both in the US and, more recently, in the UK the regulators have given conditional endorsement to this new mode of access. This paper reviews the state-of-the-art in technology, regulation, and standardisation of cognitive access to TVWS. It examines the spectrum opportunity and commercial use cases associated with this form of secondary access.

  14. Digital Audio Radio Broadcast Systems Laboratory Testing Nearly Complete

    Science.gov (United States)

    2005-01-01

    Radio history continues to be made at the NASA Lewis Research Center with the completion of phase one of the digital audio radio (DAR) testing conducted by the Consumer Electronics Group of the Electronic Industries Association. This satellite, satellite/terrestrial, and terrestrial digital technology will open up new audio broadcasting opportunities both domestically and worldwide. It will significantly improve the current quality of amplitude-modulated/frequency-modulated (AM/FM) radio with a new digitally modulated radio signal and will introduce true compact-disc-quality (CD-quality) sound for the first time. Lewis is hosting the laboratory testing of seven proposed digital audio radio systems and modes. Two of the proposed systems operate in two modes each, making a total of nine systems being tested. The nine systems are divided into the following types of transmission: in-band on-channel (IBOC), in-band adjacent-channel (IBAC), and new bands. The laboratory testing was conducted by the Consumer Electronics Group of the Electronic Industries Association. Subjective assessments of the audio recordings for each of the nine systems were conducted by the Communications Research Center in Ottawa, Canada, under contract to the Electronic Industries Association. The Communications Research Center has the only CCIR-qualified (Consultative Committee for International Radio) audio testing facility in North America. The main goals of the U.S. testing process are to (1) provide technical data to the Federal Communications Commission (FCC) so that it can establish a standard for digital audio receivers and transmitters and (2) provide the receiver and transmitter industries with the proper standards upon which to build their equipment. In addition, the data will be forwarded to the International Telecommunication Union to help in the establishment of international standards for digital audio receivers and transmitters, thus allowing U.S. manufacturers to compete in the

  15. An Adaptive Robust Watermarking Algorithm for Audio Signals Using SVD

    Science.gov (United States)

    Dutta, Malay Kishore; Pathak, Vinay K.; Gupta, Phalguni

    This paper proposes an efficient watermarking algorithm which embeds watermark data adaptively in the audio signal. The algorithm embeds the watermark in the host audio signal in such a way that the degree of embedding (DOE) is adaptive in nature and is chosen in a justified manner according to the localized content of the audio. The watermark embedding regions are selectively chosen in the high energy regions of the audio signal which make the embedding process robust to synchronization attacks. Synchronization codes are added along with the watermark in the wavelet domain and hence the embedded data can be subjected to self synchronization and the synchronization code can be used as a check to combat false alarm that results from data modification due to watermark embedding. The watermark is embedded by quantization of the singular value decompositions in the wavelet domain which makes the process perceptually transparent. The experimental results suggest that the proposed algorithm maintains a good perceptual quality of the audio signal and maintains good robustness against signal processing attacks. Comparative analysis indicates that the proposed algorithm of adaptive DOE has superior performance in comparison to existing uniform DOE.
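
    Embedding by quantization, as mentioned above, can be illustrated with a minimal quantization index modulation sketch applied to a single scalar that stands in for one singular value. The parity rule and the step size `delta` are assumptions for the example; the paper's adaptive choice of embedding degree and its wavelet and SVD stages are not reproduced.

```python
import math


def embed_bit(value: float, bit: int, delta: float) -> float:
    """Move value to the centre of a quantization cell whose index parity equals bit."""
    q = math.floor(value / delta)
    if q % 2 != bit:
        q += 1  # step to the neighbouring cell with the required parity
    return q * delta + delta / 2


def extract_bit(value: float, delta: float) -> int:
    """Recover the bit from the parity of the quantization cell index."""
    return math.floor(value / delta) % 2
```

    Because the marked value sits in the middle of its cell, the bit survives any later perturbation smaller than `delta / 2`; a larger `delta` is more robust but changes the host value more.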

  16. Lossless Audio Watermarking Based on the Alpha Statistic Modulation

    Directory of Open Access Journals (Sweden)

    Sunita V. Dhavale

    2012-09-01

    Full Text Available In this paper, we propose a high capacity, self-synchronized, lossless audio watermarking algorithm based on the alpha (‘α’) statistic modulation. Here ‘α’ is related to the correlation among any given sequence, i.e., audio samples, and it is modulated according to the watermark bit stream. The embedding scheme is tested in both the time domain and DWT domain. Though the time domain embedding reduces the computational time in searching the synchronization codes, the time-frequency localization capability of DWT provides a good trade-off between the computational complexity and robustness of synchronization codes. In case of DWT, ‘α’ related to the 2nd level DWT coarse wavelet components is used for embedding the watermark. The offset value used for embedding is made adaptive to the required SNR for the final watermarked audio signal. After extraction of the embedded watermark using a watermark key, original audio can be recovered with minimal distortion. The watermarking method presented here does not require the use of the original signal for watermark detection. Also, high embedding capacity is achieved by using small-sized audio frames. Experimental results reveal that the proposed watermarking scheme maintains high audio quality and is simultaneously highly robust to pirate attacks, including MP3 compression, cropping, filtering, re-sampling, and re-quantization.

  17. The Fungible Audio-Visual Mapping and its Experience

    Directory of Open Access Journals (Sweden)

    Adriana Sa

    2014-12-01

    Full Text Available This article draws a perceptual approach to audio-visual mapping. Clearly perceivable cause and effect relationships can be problematic if one desires the audience to experience the music. Indeed perception would bias those sonic qualities that fit previous concepts of causation, subordinating other sonic qualities, which may form the relations between the sounds themselves. The question is, how can an audio-visual mapping produce a sense of causation, and simultaneously confound the actual cause-effect relationships? We call this a fungible audio-visual mapping. Our aim here is to glean its constitution and aspect. We will report a study, which draws upon methods from experimental psychology to inform audio-visual instrument design and composition. The participants are shown several audio-visual mapping prototypes, after which we pose quantitative and qualitative questions regarding their sense of causation, and their sense of understanding the cause-effect relationships. The study shows that a fungible mapping requires both synchronized and seemingly non-related components – sufficient complexity to be confusing. As the specific cause-effect concepts remain inconclusive, the sense of causation embraces the whole.

  18. Study on the Planting Configuration Features in Macau's Green Space

    Institute of Scientific and Technical Information of China (English)

    傅嘉维; 李敏; 梁敏如

    2011-01-01

    Macau's urban green spaces have been shaped by the city's unique location, climate, history and culture, and a distinctive garden character has gradually formed. Based on detailed investigation and analysis, this paper identifies three features of plant configuration in Macau's green space: intensive use of south subtropical plants, a planting style that combines Western and Eastern cultures, and highly diverse spatial forms of urban green space.

  19. ARC Code TI: SLAB Spatial Audio Renderer

    Data.gov (United States)

    National Aeronautics and Space Administration — SLAB is a software-based, real-time virtual acoustic environment rendering system being developed as a tool for the study of spatial hearing. SLAB is designed to...

  20. Audio Watermarking Based on HAS and Neural Networks in DCT Domain

    Directory of Open Access Journals (Sweden)

    Cheng Ji-Shiung

    2003-01-01

    Full Text Available We propose a new intelligent audio watermarking method based on the characteristics of the HAS and the techniques of neural networks in the DCT domain. The method makes the watermark imperceptible by using the audio masking characteristics of the HAS. Moreover, the method exploits a neural network for memorizing the relationships between the original audio signals and the watermarked audio signals. Therefore, the method is capable of extracting watermarks without the original audio signals. Finally, experimental results are included to illustrate that the method is robust against common attacks, supporting copyright protection of digital audio.

  1. Performance Improvement of Threshold based Audio Steganography using Parallel Computation

    Directory of Open Access Journals (Sweden)

    Muhammad Shoaib

    2016-10-01

    Full Text Available Audio steganography is used to hide secret information inside an audio signal for the secure and reliable transfer of information. Various steganography techniques have been proposed and implemented to ensure an adequate security level. The existing techniques focus on either payload or security, but none of them ensures both at the same time. Data dependency in the existing solutions forced the steganography mechanism to execute serially. The audio data and secret data were pre-processed, and the existing techniques were tested experimentally in Matlab, which confirmed the problem of inefficient execution. The efficient least significant bit steganography scheme removes the pipelining hazard and computes the steganography in parallel on distributed memory systems. This scheme ensures security and addresses payload while providing an efficient solution. The results show that it not only ensures an adequate security level but also provides a better and more efficient solution.
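
    Plain least significant bit embedding, the basic technique this record builds on, can be sketched as follows. This is the textbook form operating on integer PCM samples, not the threshold-based or parallel scheme of the paper; `embed_lsb` and `extract_lsb` are illustrative names.

```python
from typing import List, Sequence


def embed_lsb(samples: Sequence[int], bits: Sequence[int]) -> List[int]:
    """Write one secret bit into the least significant bit of each leading sample."""
    if len(bits) > len(samples):
        raise ValueError("not enough samples to carry the payload")
    out = list(samples)
    for i, bit in enumerate(bits):
        # Clear the lowest bit, then set it to the payload bit.
        out[i] = (out[i] & ~1) | bit
    return out


def extract_lsb(samples: Sequence[int], n_bits: int) -> List[int]:
    """Read back the first n_bits least significant bits."""
    return [s & 1 for s in samples[:n_bits]]
```

    Each sample is modified independently of the others, so in this basic form the loop could be split into chunks and processed in parallel without any data dependency between chunks.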

  2. Audio system using binaural synthesis for multimodal telepresence applications

    DEFF Research Database (Denmark)

    Madsen, Esben; Markovic, Milos; Olesen, Søren Krarup;

    2013-01-01

    of microphones, headphones and loudspeakers as well as measurements of network latency and bandwidth requirements of the system. Furthermore, measurements were made to determine whether the level of echo and cross talk cause any issues. The overall system employs multiple modalities to virtually transport......An audio system was developed as part of a multimodal system aiming to go beyond current state of the art in telepresence. This paper provides an overview of how the audio was implemented and documents measurements that were performed on the audio system. The measurements include equalization...... a person (the visitor) to a different physical location (the destination). The goal is that both the visitor and the people physically at the destination (the locals) should be provided with a sensation that the visitor is really there. Both the general multimodal system and the auditory part...

  3. Technical Evaluation Report 31: Internet Audio Products (3/3)

    Directory of Open Access Journals (Sweden)

    Jim Rudolph

    2004-08-01

    Full Text Available Two contrasting additions to the online audio market are reviewed: iVocalize, a browser-based audio-conferencing software, and Skype, a PC-to-PC Internet telephone tool. These products are selected for review on the basis of their success in gaining rapid popular attention and usage during 2003-04. The iVocalize review emphasizes the product’s role in the development of a series of successful online audio communities – notably several serving visually impaired users. The Skype review stresses the ease with which the product may be used for simultaneous PC-to-PC communication among up to five users. Editor’s Note: This paper serves as an introduction to reports about online community building, and reviews of online products for disabled persons, in the next ten reports in this series. JPB, Series Ed.

  4. A novel audio watermarking scheme using multiscale wavelet modulation

    Institute of Scientific and Technical Information of China (English)

    JI Bing; ZHANG De; JI Xiaoyong

    2004-01-01

    A novel audio watermarking scheme to embed robust and inaudible watermarks for the purpose of copyright protection is proposed. The key innovation is to add time-frequency redundancy into watermark signals by multiscale wavelet modulation. In order to maximize the watermarking strength within perceptual constraints, the signals synthesized from different scales are each masked using a frequency auditory model and then integrated to form the final watermark signal. The detection structure is built using the redundancy in watermark signals, and the performance is further enhanced by modeling the statistical behaviors of wavelet coefficients as a generalized Gaussian distribution. The original audio signal is not required in watermark detection. The experimental results show that our approach can achieve not only good transparency but also satisfying robustness to common audio manipulations.

  5. Audio acquisition and processing system

    OpenAIRE

    Pérez Segurado, Rubén

    2015-01-01

    The objective of this project is the design and implementation of a platform for an audio processing system. The system will receive an analog audio signal from an audio source, allow digital processing of that signal, and generate a processed signal that will be sent to external speakers. The processing system will be built using: - A Lattice FPGA device, model MachX02-7000-HE, which will contain all the...

  6. Can audio recording of outpatient consultations improve patient outcome?

    DEFF Research Database (Denmark)

    Wolderslund, Maiken; Kofoed, Poul-Erik; Axboe, Mette

    different departments: Orthopedics, Urology, Internal Medicine and Pediatrics. A total of 5,460 patients will be included from the outpatient clinics. All patients randomized to an intervention group are offered audio recording of their consultation. An Interactive Voice Response platform enables an audio......Introduction Information provided in an outpatient consultation concerns medication, diagnostic tests, treatment and rehabilitation which is crucial knowledge in regards of patient compliance, decision making and general patient satisfaction. Despite good communication skills among clinicians...... the communication is challenged by the fact that patients tend to forget or misunderstand a great deal of the information given. The primary objective of this study is to investigate the effects of providing patients with an audio recording of the consultation. Methods A randomized controlled trial involving four...

  7. Improving Security of Audio Watermarking in Image using Selector Keys

    Directory of Open Access Journals (Sweden)

    Amir Reza Fazli

    2012-06-01

    Full Text Available This study presents a novel watermarking algorithm for improving the security and robustness of hiding audio data in an image. The multiresolution discrete wavelet transform is used for embedding the audio watermark in an image. In this context, security is quantified from an information theoretic point of view by means of the equivocation and information leakage of the secret parameters. The selector keys are used as a criterion to determine the location of appropriate wavelet blocks and wavelet coefficients for embedding the watermark. Also, simulations assess the security levels derived in the theoretical part of the paper. The experimental results demonstrate that using the selector keys enhances the security level of the watermark embedding for a variety of scenarios. The level of the algorithm's robustness is shown by considering the Normalized Correlation (NC) between the original audio watermark and the extracted watermark.
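
    The record does not state which definition of Normalized Correlation it uses. The sketch below assumes the common form, the inner product of the two watermarks divided by the product of their Euclidean norms, and the function name is illustrative.

```python
import math
from typing import Sequence


def normalized_correlation(original: Sequence[float], extracted: Sequence[float]) -> float:
    """NC = sum(w * w') / (||w|| * ||w'||); 1.0 means identical up to positive scaling."""
    if len(original) != len(extracted):
        raise ValueError("watermarks must have the same length")
    num = sum(a * b for a, b in zip(original, extracted))
    den = math.sqrt(sum(a * a for a in original) * sum(b * b for b in extracted))
    if den == 0:
        raise ValueError("NC is undefined for an all-zero watermark")
    return num / den
```

    Values close to 1 after an attack indicate that the extracted watermark still resembles the embedded one.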

  8. Error-correcting output codes based on feature space transformation

    Institute of Scientific and Technical Information of China (English)

    雷蕾; 王晓丹; 罗玺; 宋亚飞; 薛爱军

    2015-01-01

    The independence between the dichotomizers trained on the coding matrix's bipartitions is the key to using error-correcting output codes (ECOC) to solve multiclass problems. Therefore, an ECOC method based on feature space transformation (FST) is proposed. Inspired by ensemble learning theory, a third dimension, the feature space, is introduced into the coding matrix. Different feature subspaces are then obtained by feature space transformation based on the different positive and negative subclasses, so that the diversity between the binary classifiers is increased and classification performance improves. Experimental results on UCI datasets show that the codes based on FST outperform the original codes. In addition, the proposed method can be applied to any kind of coding matrix, and its short training time and simplicity offer a new approach to classifying large datasets.
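
    As background for the coding matrix mentioned above, the following minimal sketch shows standard ECOC decoding by Hamming distance. The four-class, five-bit matrix is made up for the example, and the feature space transformation that is the paper's contribution is not modeled.

```python
from typing import Dict, Sequence

# Hypothetical coding matrix: one codeword per class, one column per dichotomizer.
# Every pair of codewords differs in at least three positions.
CODE_MATRIX: Dict[str, Sequence[int]] = {
    "A": (0, 0, 0, 0, 0),
    "B": (1, 1, 1, 0, 0),
    "C": (0, 0, 1, 1, 1),
    "D": (1, 1, 0, 1, 1),
}


def hamming(u: Sequence[int], v: Sequence[int]) -> int:
    """Number of positions in which two bit vectors differ."""
    return sum(a != b for a, b in zip(u, v))


def decode(outputs: Sequence[int]) -> str:
    """Return the class whose codeword is closest to the dichotomizer outputs."""
    return min(CODE_MATRIX, key=lambda label: hamming(CODE_MATRIX[label], outputs))
```

    With a minimum distance of three between codewords, one wrong dichotomizer output is still decoded to the correct class, which is why diverse, independent base classifiers matter.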

  9. GK Per (Nova Persei 1901): HUBBLE SPACE TELESCOPE IMAGERY AND SPECTROSCOPY OF THE EJECTA, AND FIRST SPECTRUM OF THE JET-LIKE FEATURE

    Energy Technology Data Exchange (ETDEWEB)

    Shara, Michael M.; Zurek, David; Mizusawa, Trisha [Department of Astrophysics, American Museum of Natural History, Central Park West and 79th street, New York, NY 10024-5192 (United States); De Marco, Orsola [Department of Physics, Macquarie University, Sydney (Australia); Williams, Robert; Livio, Mario [Space Telescope Science Institute, 3700 San Martin Drive, Baltimore, MD 21218 (United States)

    2012-06-15

    We have imaged the ejecta of GK Persei (Nova Persei 1901 A.D.) with the Hubble Space Telescope (HST), whose 0.1 arcsec resolution reveals hundreds of cometary-like structures with long axes aligned toward GK Per. One or both ends of the structures often show a brightness enhancement relative to the structures' middle sections, but there is no simple regularity to their morphologies (in contrast with, for example, the Helix nebula). Some of the structures' morphologies suggest the presence of slow-moving or stationary material with which the ejecta is colliding, while others suggest shaping from a wind emanating from GK Per itself. The most detailed expansion map of any classical nova's ejecta was created by comparing HST images taken in successive years. Wide Field and Planetary Camera 2 narrowband images and Space Telescope Imaging Spectrograph spectra demonstrate that the physical conditions in this nova's ejecta vary strongly on spatial scales much smaller than those of the ejecta. Directly measuring accurate densities and compositions, and hence masses of this and other nova shells, will demand data at least as resolved spatially as those presented here. The filling factor of the ejecta is 1% or less, and the nova ejecta mass must be less than $10^{-4}\,M_{\odot}$. A modest fraction of the emission nebulosities vary in brightness by up to a factor of two on timescales of one year. Finally, we present the deepest images yet obtained of a jet-like feature outside the main body of GK Per nebulosity, and the first spectrum of that feature. Dominated by strong, narrow emission lines of [N II], [O II], [O III], and [S II], this feature is probably a shock due to ejected material running into stationary interstellar matter, slowly moving ejecta from a previous nova episode, or circumbinary matter present before 1901. An upper limit to the mass of the jet is of order a few times $10^{-6}\,M_{\odot}$. If the jet mass is close to this limit then the

  10. Evaluation of robustness and transparency of multiple audio watermark embedding

    Science.gov (United States)

    Steinebach, Martin; Zmudzinski, Sascha

    2008-02-01

    As digital watermarking becomes an accepted and widely applied technology, a number of concerns regarding its reliability in typical application scenarios come up. One important and often discussed question is the robustness of digital watermarks against multiple embedding. This means that one cover is marked several times by various users with the same watermarking algorithm but with different keys and different watermark messages. In our paper we discuss the behavior of our PCM audio watermarking algorithm when applying multiple watermark embedding. This includes evaluation of robustness and transparency. Test results for multiple hours of audio content ranging from spoken words to music are provided.

  11. Audio Steganography Coding Using the Discrete Wavelet Transforms

    Directory of Open Access Journals (Sweden)

    Siwar Rekik

    2012-02-01

    Full Text Available The performance of an audio steganography compression system using the discrete wavelet transform (DWT) is investigated. Audio steganography coding is the technology of transforming stego-speech into an efficiently encoded version that can be decoded at the receiver side to produce a close representation of the initial (uncompressed) signal. Experimental results demonstrate the efficiency of the compression technique used, since the compressed stego-speech is perceptually intelligible and indistinguishable from the equivalent initial signal, while the initial stego-speech can be recovered with slight degradation in quality.
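    To make the DWT step concrete, here is a minimal one-level discrete wavelet transform and its inverse. The abstract does not say which wavelet or how many decomposition levels the authors use, so the Haar wavelet, the single level, and the sample values are assumptions chosen for brevity.

```python
import math

SCALE = 1.0 / math.sqrt(2.0)  # orthonormal Haar scaling factor


def haar_dwt(signal):
    """One-level Haar DWT: returns (approximation, detail) coefficients."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    approx = [(signal[i] + signal[i + 1]) * SCALE for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * SCALE for i in range(0, len(signal), 2)]
    return approx, detail


def haar_idwt(approx, detail):
    """Inverse of haar_dwt: interleaves the reconstructed sample pairs."""
    signal = []
    for a, d in zip(approx, detail):
        signal.append((a + d) * SCALE)
        signal.append((a - d) * SCALE)
    return signal


# Hypothetical speech-like samples.
samples = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
approx, detail = haar_dwt(samples)
restored = haar_idwt(approx, detail)
```

    A wavelet coder would typically quantize or discard small detail coefficients before storage. With all coefficients kept, as here, the inverse transform recovers the input up to rounding.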

  12. Audio engineering 101 a beginner's guide to music production

    CERN Document Server

    Dittmar, Tim

    2013-01-01

    Audio Engineering 101 is a real world guide for starting out in the recording industry. If you have the dream, the ideas, the music and the creativity but don't know where to start, then this book is for you! Filled with practical advice on how to navigate the recording world, from an author with first-hand, real-life experience, Audio Engineering 101 will help you succeed in the exciting, but tough and confusing, music industry. Covering all you need to know about the recording process, from the characteristics of sound to a guide to microphones to analog versus digital

  13. DOA Estimation of Audio Sources in Reverberant Environments

    DEFF Research Database (Denmark)

    Jensen, Jesper Rindom; Nielsen, Jesper Kjær; Heusdens, Richard

    2016-01-01

    Reverberation is well known to have a detrimental impact on many localization methods for audio sources. We address this problem by imposing a model for the early reflections as well as a model for the audio source itself. Using these models, we propose two iterative localization methods that est...... bias. Our simulation results show that we can estimate the DOA of the desired signal more accurately with this procedure compared to state-of-the-art estimators in both synthetic and real data experiments with reverberation....

  14. Design of a WAV audio player based on K20

    Directory of Open Access Journals (Sweden)

    Xu Yu

    2016-01-01

    Full Text Available The designed player uses Freescale's MK20DX128VLH7 as the core control chip, and its hardware platform is equipped with a VS1003 audio decoder, an OLED display interface, a USB interface and an SD card slot. The player uses the open source embedded real-time operating system μC/OS-II, Freescale USB Stack V4.1.1 and FATFS, and a graphical user interface based on CGUI is developed to improve the user experience. In general, the designed WAV audio player has strong applicability and good practical value.

  15. Cover signal specific steganalysis: the impact of training on the example of two selected audio steganalysis approaches

    Science.gov (United States)

    Kraetzer, Christian; Dittmann, Jana

    2008-02-01

    The main goals of this paper are to show the impact of the basic assumptions for the cover channel characteristics as well as the impact of different training/testing set generation strategies on the statistical detectability of selected audio hiding approaches known from steganography and watermarking. Here we have selected five steganography algorithms and four watermarking algorithms as examples. The channel characteristics for two different audio cover channels (an application-specific example scenario of VoIP steganography and universal audio steganography) are formalised, and their impact on decisions in the steganalysis process, especially on the strategies applied for training/testing set generation, is shown. Following the assumptions on the cover channel characteristics, either cover dependent or cover independent training and testing can be performed, using either correlated or non-correlated training and test sets. In comparison to previous work, additional frequency domain features are introduced for steganalysis, and the performance (in terms of classification accuracy) of Bayesian classifiers and multinomial logistic regression models is compared with the results of SVM classification. We show that the newly implemented frequency domain features increase the classification accuracy achieved in SVM classification. Furthermore, it is shown on the example of VoIP steganalysis that evaluation specific to the channel characteristics performs better than tests without focus on a specific channel (i.e. universal steganalysis). A comparison of test results for cover dependent and independent training and testing shows that the latter performs better for all nine algorithms evaluated here and the SVM-based classifier used.

  16. Multi-modal gesture recognition using integrated model of motion, audio and video

    Science.gov (United States)

    Goutsu, Yusuke; Kobayashi, Takaki; Obara, Junya; Kusajima, Ikuo; Takeichi, Kazunari; Takano, Wataru; Nakamura, Yoshihiko

    2015-07-01

    Gesture recognition is used in many practical applications such as human-robot interaction, medical rehabilitation and sign language. With increasing motion sensor development, multiple data sources have become available, which leads to the rise of multi-modal gesture recognition. Since our previous approach to gesture recognition depends on a unimodal system, it is difficult to classify similar motion patterns. In order to solve this problem, a novel approach which integrates motion, audio and video models is proposed by using dataset captured by Kinect. The proposed system can recognize observed gestures by using three models. Recognition results of three models are integrated by using the proposed framework and the output becomes the final result. The motion and audio models are learned by using Hidden Markov Model. Random Forest which is the video classifier is used to learn the video model. In the experiments to test the performances of the proposed system, the motion and audio models most suitable for gesture recognition are chosen by varying feature vectors and learning methods. Additionally, the unimodal and multi-modal models are compared with respect to recognition accuracy. All the experiments are conducted on dataset provided by the competition organizer of MMGRC, which is a workshop for Multi-Modal Gesture Recognition Challenge. The comparison results show that the multi-modal model composed of three models scores the highest recognition rate. This improvement of recognition accuracy means that the complementary relationship among three models improves the accuracy of gesture recognition. The proposed system provides the application technology to understand human actions of daily life more precisely.

  17. Multi-modal Gesture Recognition using Integrated Model of Motion, Audio and Video

    Institute of Scientific and Technical Information of China (English)

    GOUTSU Yusuke; KOBAYASHI Takaki; OBARA Junya; KUSAJIMA Ikuo; TAKEICHI Kazunari; TAKANO Wataru; NAKAMURA Yoshihiko

    2015-01-01

    Gesture recognition is used in many practical applications such as human-robot interaction, medical rehabilitation and sign language. With increasing motion sensor development, multiple data sources have become available, which leads to the rise of multi-modal gesture recognition. Since our previous approach to gesture recognition depends on a unimodal system, it is difficult to classify similar motion patterns. In order to solve this problem, a novel approach which integrates motion, audio and video models is proposed by using dataset captured by Kinect. The proposed system can recognize observed gestures by using three models. Recognition results of three models are integrated by using the proposed framework and the output becomes the final result. The motion and audio models are learned by using Hidden Markov Model. Random Forest which is the video classifier is used to learn the video model. In the experiments to test the performances of the proposed system, the motion and audio models most suitable for gesture recognition are chosen by varying feature vectors and learning methods. Additionally, the unimodal and multi-modal models are compared with respect to recognition accuracy. All the experiments are conducted on dataset provided by the competition organizer of MMGRC, which is a workshop for Multi-Modal Gesture Recognition Challenge. The comparison results show that the multi-modal model composed of three models scores the highest recognition rate. This improvement of recognition accuracy means that the complementary relationship among three models improves the accuracy of gesture recognition. The proposed system provides the application technology to understand human actions of daily life more precisely.

  18. Audio steganalysis based on "negative resonance phenomenon" caused by steganographic tools

    Institute of Scientific and Technical Information of China (English)

    2006-01-01

    Research on the impact that different steganographic software tools have on audio statistical features revealed the following phenomenon: when messages are embedded in a WAV file using a certain tool, the variation of the statistical features is abruptly smaller in a WAV file that already contains messages embedded by the same tool than in one in which no messages have been embedded. We provisionally call this the "negative resonance phenomenon". Using this phenomenon and Support Vector Machines (SVMs), we can detect the existence of hidden messages and also identify the tools used to hide them. As shown by the experimental results, the proposed method can be used very effectively to detect hidden messages embedded by Hide4PGP, Stegowav and S-Tools4.

  19. Deep Complementary Bottleneck Features for Visual Speech Recognition

    NARCIS (Netherlands)

    Petridis, Stavros; Pantic, Maja

    2016-01-01

    Deep bottleneck features (DBNFs) have been used successfully in the past for acoustic speech recognition from audio. However, research on extracting DBNFs for visual speech recognition is very limited. In this work, we present an approach to extract deep bottleneck visual features based on deep auto

  20. Online feature selection with streaming features.

    Science.gov (United States)

    Wu, Xindong; Yu, Kui; Ding, Wei; Wang, Hao; Zhu, Xingquan

    2013-05-01

    We propose a new online feature selection framework for applications with streaming features where the knowledge of the full feature space is unknown in advance. We define streaming features as features that flow in one by one over time whereas the number of training examples remains fixed. This is in contrast with traditional online learning methods that only deal with sequentially added observations, with little attention being paid to streaming features. The critical challenges for Online Streaming Feature Selection (OSFS) include 1) the continuous growth of feature volumes over time, 2) a large feature space, possibly of unknown or infinite size, and 3) the unavailability of the entire feature set before learning starts. In the paper, we present a novel Online Streaming Feature Selection method to select strongly relevant and nonredundant features on the fly. An efficient Fast-OSFS algorithm is proposed to improve feature selection performance. The proposed algorithms are evaluated extensively on high-dimensional datasets and also with a real-world case study on impact crater detection. Experimental results demonstrate that the algorithms achieve better compactness and higher prediction accuracy than existing streaming feature selection algorithms.

  1. Space Feature Based Spectral Clustering for Noisy Image Segmentation

    Institute of Scientific and Technical Information of China (English)

    刘汉强; 赵凤

    2012-01-01

    To overcome the problem that traditional spectral clustering is easily influenced by image noise when applied to noisy image segmentation, a space feature based spectral clustering algorithm for noisy image segmentation is proposed. In this method, the gray value, local spatial information and non-local spatial information of each pixel are used to construct a 3-dimensional feature dataset. Then, a space compactness function is introduced to compute the similarity between each feature point and its K nearest neighbors. Finally, the image segmentation result is obtained by the spectral clustering algorithm. In the experiments, noisy artificial images, natural images and synthetic aperture radar images are used, and the proposed method is compared with normalized cut, FCM_S and the Nyström method. The experimental results show that the proposed method is robust to image noise and obtains satisfactory segmentation results.

  2. Tech-Assisted Language Learning Tasks in an EFL Setting: Use of Hand phone Recording Feature

    Directory of Open Access Journals (Sweden)

    Alireza Shakarami

    2014-09-01

    Full Text Available Technology, with its rapid leaps forward, has had an undeniable impact on every aspect of our lives in the new millennium. It supplies us with new affordances almost daily or, more precisely, in a matter of hours. Technology and computers appear to be a breakthrough in terms of their roles in the twenty-first century educational system. Examples are numerous, among which CALL, CMC, and virtual learning spaces come to mind instantly. Among the newly developed gadgets of today are sophisticated smart hand phones, which go far beyond the communication tool they were once designed to be. The development of the hand phone into a widespread multi-tasking gadget has urged researchers to investigate its effect on every aspect of the learning process, including language learning. This study attempts to explore the effects of Iranian EFL learners' use of the cell phone audio recording feature on the development of their speaking skills. Thirty-five sophomore students were enrolled in a study with a pre-test/post-test design. Data on their English speaking experience using the audio-recording features of their hand phones were collected. At the end of the semester, the performance of both groups, treatment and control, was observed, evaluated, and analyzed, and was then examined qualitatively in the next phase. The quantitative outcome lent support to integrating hand phones into the language learning curriculum. Keywords: Hand phone, Recording, Audio, Language learning, Enhancement, EFL

  3. Audio-Described Educational Materials: Ugandan Teachers' Experiences

    Science.gov (United States)

    Wormnaes, Siri; Sellaeg, Nina

    2013-01-01

    This article describes and discusses a qualitative, descriptive, and exploratory study of how 12 visually impaired teachers in Uganda experienced audio-described educational video material for teachers and student teachers. The study is based upon interviews with these teachers and observations while they were using the material either…

  4. Audio-Visual Perception System for a Humanoid Robotic Head

    Directory of Open Access Journals (Sweden)

    Raquel Viciana-Abad

    2014-05-01

    Full Text Available One of the main issues within the field of social robotics is to endow robots with the ability to direct attention to people with whom they are interacting. Different approaches follow bio-inspired mechanisms, merging audio and visual cues to localize a person using multiple sensors. However, most of these fusion mechanisms have been used in fixed systems, such as those used in video-conference rooms, and thus, they may incur difficulties when constrained to the sensors with which a robot can be equipped. Besides, within the scope of interactive autonomous robots, there is a lack in terms of evaluating the benefits of audio-visual attention mechanisms, compared to only audio or visual approaches, in real scenarios. Most of the tests conducted have been within controlled environments, at short distances and/or with off-line performance measurements. With the goal of demonstrating the benefit of fusing sensory information with a Bayes inference for interactive robotics, this paper presents a system for localizing a person by processing visual and audio data. Moreover, the performance of this system is evaluated and compared via considering the technical limitations of unimodal systems. The experiments show the promise of the proposed approach for the proactive detection and tracking of speakers in a human-robot interactive framework.
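    The idea of fusing audio and visual cues with Bayesian inference can be sketched as follows. This is not the authors' system: it assumes a discrete grid of candidate azimuth angles, a uniform prior, and independent Gaussian likelihoods for a coarse audio estimate and a sharper visual estimate. All names and numbers are hypothetical.

```python
import math


def gaussian_likelihood(angles, measured, sigma):
    """Unnormalized Gaussian likelihood of each candidate angle (degrees)."""
    return [math.exp(-0.5 * ((a - measured) / sigma) ** 2) for a in angles]


def fuse(prior, *likelihoods):
    """Posterior over the grid: prior times all likelihoods, normalized."""
    posterior = list(prior)
    for likelihood in likelihoods:
        posterior = [p * l for p, l in zip(posterior, likelihood)]
    total = sum(posterior)
    return [p / total for p in posterior]


angles = list(range(-90, 91))                # candidate speaker azimuths
prior = [1.0 / len(angles)] * len(angles)    # no preferred direction

# Hypothetical measurements: sound localization is coarse (sigma = 15 deg),
# face detection is sharper (sigma = 5 deg).
audio = gaussian_likelihood(angles, measured=20.0, sigma=15.0)
visual = gaussian_likelihood(angles, measured=10.0, sigma=5.0)

posterior = fuse(prior, audio, visual)
best_angle = angles[posterior.index(max(posterior))]
```

    With these numbers the fused estimate lies between the two measurements but much closer to the more certain visual cue, which is the behaviour one would expect from precision-weighted fusion.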

  5. Audio and Video Reflections to Promote Social Justice

    Science.gov (United States)

    Boske, Christa

    2011-01-01

    Purpose: The purpose of this paper is to examine how 15 graduate students enrolled in a US school leadership preparation program understand issues of social justice and equity through a reflective process utilizing audio and/or video software. Design/methodology/approach: The study is based on the tradition of grounded theory. The researcher…

  6. Efficiency Optimization in Class-D Audio Amplifiers

    DEFF Research Database (Denmark)

    Yamauchi, Akira; Knott, Arnold; Jørgensen, Ivan Harald Holger;

    2015-01-01

    This paper presents a new power efficiency optimization routine for designing Class-D audio amplifiers. The proposed optimization procedure finds design parameters for the power stage and the output filter, and the optimum switching frequency such that the weighted power losses are minimized unde...

  7. Multi Carrier Modulator for Switch-Mode Audio Power Amplifiers

    DEFF Research Database (Denmark)

    Knott, Arnold; Pfaffinger, Gerhard; Andersen, Michael Andreas E.

    2008-01-01

    While switch-mode audio power amplifiers allow compact implementations and high output power levels due to their high power efficiency, they are very well known for creating electromagnetic interference (EMI) with other electronic equipment, in particular radio receivers. Lowering the EMI of swit...

  8. Current-Driven Switch-Mode Audio Power Amplifiers

    DEFF Research Database (Denmark)

    Knott, Arnold; Buhl, Niels Christian; Andersen, Michael A. E.

    2012-01-01

    The conversion of electrical energy into sound waves by electromechanical transducers is proportional to the current through the coil of the transducer. However virtually all audio power amplifiers provide a controlled voltage through the interface to the transducer. This paper is presenting a sw...

  9. Deep learning, audio adversaries, and music content analysis

    DEFF Research Database (Denmark)

    Kereliuk, Corey Mose; Sturm, Bob L.; Larsen, Jan

    2015-01-01

    We present the concept of adversarial audio in the context of deep neural networks (DNNs) for music content analysis. An adversary is an algorithm that makes minor perturbations to an input that cause major repercussions to the system response. In particular, we design an adversary for a DNN...

  10. Objective assessment of speech and audio quality - Technology and applications

    NARCIS (Netherlands)

    Rix, A.W.; Beerends, J.G.; Kim, D.-S.; Kroon, P.; Ghitza, O.

    2006-01-01

    In the past few years, objective quality assessment models have become increasingly used for assessing or monitoring speech and audio quality. By measuring perceived quality on an easily-understood subjective scale, such as listening quality (excellent, good, fair, poor, bad), these methods provide

  11. Audio-visual perception system for a humanoid robotic head.

    Science.gov (United States)

    Viciana-Abad, Raquel; Marfil, Rebeca; Perez-Lorenzo, Jose M; Bandera, Juan P; Romero-Garces, Adrian; Reche-Lopez, Pedro

    2014-01-01

    One of the main issues within the field of social robotics is to endow robots with the ability to direct attention to people with whom they are interacting. Different approaches follow bio-inspired mechanisms, merging audio and visual cues to localize a person using multiple sensors. However, most of these fusion mechanisms have been used in fixed systems, such as those used in video-conference rooms, and thus, they may incur difficulties when constrained to the sensors with which a robot can be equipped. Besides, within the scope of interactive autonomous robots, there is a lack in terms of evaluating the benefits of audio-visual attention mechanisms, compared to only audio or visual approaches, in real scenarios. Most of the tests conducted have been within controlled environments, at short distances and/or with off-line performance measurements. With the goal of demonstrating the benefit of fusing sensory information with a Bayes inference for interactive robotics, this paper presents a system for localizing a person by processing visual and audio data. Moreover, the performance of this system is evaluated and compared via considering the technical limitations of unimodal systems. The experiments show the promise of the proposed approach for the proactive detection and tracking of speakers in a human-robot interactive framework.

  12. Output impedance and stability of audio power amplifiers

    NARCIS (Netherlands)

    Schaink, T.

    2006-01-01

    This report is about the design of an audio amplifier which is stable for all passive loads. If stability analysis of an opamp is done, the ‘classical’ approach is to derive its transfer function. Investigation of the open loop gain and a phase/gain margin determine the stability of the opamp. Desig

  13. Comparative study of Audio-lingual method and CLT

    Institute of Scientific and Technical Information of China (English)

    2013-01-01

    For language teaching, various teaching methods and approaches have been proposed. But no single teaching approach is a one-for-all solution that is good enough to be used as the standard of teaching. Among the many methods, this paper mainly concerns the audio-lingual method and CLT.

  14. A Multimedia Application: Spatial Perceptual Entropy of Multichannel Audio Signals

    Directory of Open Access Journals (Sweden)

    Chen Shuixian

    2010-01-01

    Full Text Available Usually multimedia data have to be compressed before transmission, and a higher compression rate, or equivalently a lower bitrate, relieves the load on communication channels but negatively impacts quality. We investigate the bitrate lower bound for perceptually lossless compression of a major type of multimedia: multichannel audio signals. This bound equals the perceptible information rate of the signals. Traditionally, Perceptual Entropy (PE), based primarily on monaural hearing, measures the perceptual information rate of individual channels. But PE cannot measure the spatial information captured by binaural hearing and is thus not suitable for estimating the Spatial Audio Coding (SAC) bitrate bound. To measure this spatial information, we build a Binaural Cue Physiological Perception Model (BCPPM) on the basis of binaural hearing, which represents spatial information in the physical and physiological layers. This model enables computing Spatial Perceptual Entropy (SPE), the lower bitrate bound for SAC. For real-world stereo audio signals of various types, our experiments indicate that SPE reliably estimates their spatial information rate. Therefore, "SPE plus PE" gives lower bitrate bounds for communicating multichannel audio signals with transparent quality.

  15. A Multimedia Application: Spatial Perceptual Entropy of Multichannel Audio Signals

    Directory of Open Access Journals (Sweden)

    Shuixian Chen

    2010-01-01

    Full Text Available Usually multimedia data have to be compressed before transmission, and a higher compression rate, or equivalently a lower bitrate, relieves the load on communication channels but negatively impacts quality. We investigate the bitrate lower bound for perceptually lossless compression of a major type of multimedia: multichannel audio signals. This bound equals the perceptible information rate of the signals. Traditionally, Perceptual Entropy (PE), based primarily on monaural hearing, measures the perceptual information rate of individual channels. But PE cannot measure the spatial information captured by binaural hearing and is thus not suitable for estimating the Spatial Audio Coding (SAC) bitrate bound. To measure this spatial information, we build a Binaural Cue Physiological Perception Model (BCPPM) on the basis of binaural hearing, which represents spatial information in the physical and physiological layers. This model enables computing Spatial Perceptual Entropy (SPE), the lower bitrate bound for SAC. For real-world stereo audio signals of various types, our experiments indicate that SPE reliably estimates their spatial information rate. Therefore, “SPE plus PE” gives lower bitrate bounds for communicating multichannel audio signals with transparent quality.

  16. An Audio-Visual Lecture Course in Russian Culture

    Science.gov (United States)

    Leighton, Lauren G.

    1977-01-01

    An audio-visual course in Russian culture is given at Northern Illinois University. A collection of 4-5,000 color slides is the basis for the course, with lectures focussed on literature, philosophy, religion, politics, art and crafts. Acquisition, classification, storage and presentation of slides, and organization of lectures are discussed. (CHK)

  17. Towards a universal representation for audio information retrieval and analysis

    DEFF Research Database (Denmark)

    Jensen, Bjørn Sand; Troelsgaard, Rasmus; Larsen, Jan;

    2013-01-01

    A fundamental and general representation of audio and music which integrates multi-modal data sources is important for both application and basic research purposes. In this paper we address this challenge by proposing a multi-modal version of the Latent Dirichlet Allocation model which provides a...

  18. The Single- and Multichannel Audio Recordings Database (SMARD)

    DEFF Research Database (Denmark)

    Nielsen, Jesper Kjær; Jensen, Jesper Rindom; Jensen, Søren Holdt

    2014-01-01

    A new single- and multichannel audio recordings database (SMARD) is presented in this paper. The database contains recordings from a box-shaped listening room for various loudspeaker and array types. The recordings were made for 48 different configurations of three different loudspeakers and four...

  19. Audio-Visual Aid in Teaching "Fatty Liver"

    Science.gov (United States)

    Dash, Sambit; Kamath, Ullas; Rao, Guruprasad; Prakash, Jay; Mishra, Snigdha

    2016-01-01

    The use of audio-visual tools to aid medical education is ever on the rise. Our study intends to find the efficacy of a video prepared on "fatty liver," a topic that is often a challenge for pre-clinical teachers, in enhancing cognitive processing and ultimately learning. We prepared a video presentation of 11:36 min, incorporating various…

  20. Auteur Description: From the Director's Creative Vision to Audio Description

    Science.gov (United States)

    Szarkowska, Agnieszka

    2013-01-01

    In this report, the author follows the suggestion that a film director's creative vision should be incorporated into Audio description (AD), a major technique for making films, theater performances, operas, and other events accessible to people who are blind or have low vision. The author presents a new type of AD for auteur and artistic films:…

  1. Real-time Loudspeaker Distance Estimation with Stereo Audio

    DEFF Research Database (Denmark)

    Nielsen, Jesper Kjær; Gaubitch, Nikolay; Heusdens, Richard;

    2015-01-01

    . In this paper, we propose to use the desired audio signal instead. Specifically, we treat the case of estimating the distance between two loudspeakers playing back a stereo music or speech signal. In this connection, we develop a real-time maximum likelihood estimator and demonstrate that it has a variance...

  2. The relationship between basic audio quality and overall listening experience.

    Science.gov (United States)

    Schoeffler, Michael; Herre, Jürgen

    2016-09-01

    Basic audio quality (BAQ) is a well-known perceptual attribute, which is rated in various listening test methods to measure the performance of audio systems. Unfortunately, when it comes to purchasing audio systems, BAQ might not have a significant influence on the customers' buying decisions since other factors, like brand loyalty, might be more important. In contrast to BAQ, overall listening experience (OLE) is an affective attribute which incorporates all aspects that are important to an individual assessor, including his or her preference for music genre and audio quality. In this work, the relationship between BAQ and OLE is investigated in more detail. To this end, an experiment was carried out, in which participants rated the BAQ and the OLE of music excerpts with different timbral and spatial degradations. In a between-group-design procedure, participants were assigned into two groups, in each of which a different set of stimuli was rated. The results indicate that rating of both attributes, BAQ and OLE, leads to similar rankings, even if a different set of stimuli is rated. In contrast to the BAQ ratings, which were more influenced by timbral than spatial degradations, the OLE ratings were almost equally influenced by timbral and spatial degradations.

  3. Market potential for interactive audio-visual media

    NARCIS (Netherlands)

    Leurdijk, A.; Limonard, S.

    2005-01-01

    NM2 (New Media for a New Millennium) develops tools for interactive, personalised and non-linear audio-visual content that will be tested in seven pilot productions. This paper looks at the market potential for these productions from a technological, a business and a users' perspective. It shows tha

  4. Digital audio recordings improve the outcomes of patient consultations

    DEFF Research Database (Denmark)

    Wolderslund, Maiken; Kofoed, Poul-Erik; Axboe, Mette;

    2016-01-01

    OBJECTIVES: To investigate the effects on patients' outcome of the consultations when provided with: a Digital Audio Recording (DAR) of the consultation and a Question Prompt List (QPL). METHODS: This is a three-armed randomised controlled cluster trial. One group of patients received standard care...

  5. Possible technical solutions to reduce energy consumption in audio products

    Energy Technology Data Exchange (ETDEWEB)

    Nielsen, K.; Andersen, M.A.E.

    1999-07-01

    In common audio products nearly all the supplied power is dissipated as heat. The major consumers are, almost without exception, the power supply and the audio amplifier. This paper is divided into two parts, concentrating on typical efficiency measures for the concepts of today and on possible technical solutions by which the overall efficiency can be considerably improved in the future. Traditional power supplies are made using a transformer operating at the mains frequency followed by a linear regulator. These are bulky and the efficiency is only around 40%. Using high-frequency switch-mode power supplies, the size of the power supply can be reduced and the efficiency can be increased to 80-90%. Construction of optimal amplifiers with regard to total energy consumption over their lifetime can only be accomplished by considering both the general volume control distribution and the general spectral amplitude distribution of audio signals. The traditional efficiency measure, specified at the maximum efficiency level, says very little about the real energy consumption of the audio amplifier. As an example, the theoretical efficiency of a traditional class B amplifier is 78%. Using a new efficiency measure defined on the basis of the approximate volume control distribution, a 50 W amplifier example shows an overall efficiency of only 1%. The paper gives possible solutions and guidelines for increasing the real amplifier efficiency. (au)
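
    The 78% figure quoted above is the ideal class B efficiency at full output, pi/4. The sketch below shows why a measure taken only at maximum level is optimistic: for an ideal class B stage driving a sine wave, efficiency falls linearly with output amplitude. This is a textbook idealisation, not the paper's measure; the function names and the list of volume settings are hypothetical, and quiescent losses are ignored, so it will not reproduce the 1% result.

```python
import math

def class_b_efficiency(level):
    """Ideal class B efficiency for a sine wave.

    `level` is peak output voltage divided by the supply rail (0..1).
    """
    if not 0.0 <= level <= 1.0:
        raise ValueError("level must be in [0, 1]")
    return math.pi / 4.0 * level

def energy_weighted_efficiency(levels):
    """Total output energy over total supply energy for equal-duration settings.

    Output power is proportional to level**2 and supply power to
    (4 / pi) * level in the ideal class B model (assumption: no idle losses).
    """
    p_out = sum(l * l for l in levels)
    p_in = sum(4.0 / math.pi * l for l in levels)
    return p_out / p_in if p_in else 0.0

# Hypothetical listening profile: mostly low volume settings.
profile = [0.05, 0.05, 0.1, 0.1, 0.2]
overall = energy_weighted_efficiency(profile)
```

    Even in this lossless model the weighted efficiency for the hypothetical profile is far below the full-output value, which is the qualitative point the abstract makes.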

  6. Real-Time Audio-Visual Analysis for Multiperson Videoconferencing

    Directory of Open Access Journals (Sweden)

    Petr Motlicek

    2013-01-01

    We describe the design of a system consisting of several state-of-the-art real-time audio and video processing components enabling multimodal stream manipulation (e.g., automatic online editing) for multiparty videoconferencing applications in open, unconstrained environments. The underlying algorithms are designed to allow multiple people to enter, interact, and leave the observable scene with no constraints. They comprise continuous localisation of audio objects and its application for spatial audio object coding, detection and tracking of faces, estimation of head poses and visual focus of attention, detection and localisation of verbal and paralinguistic events, and the association and fusion of these different events. Combined, they represent multimodal streams with audio objects and semantic video objects and provide semantic information for stream manipulation systems (like a virtual director). Various experiments have been performed to evaluate the performance of the system. The obtained results demonstrate the effectiveness of the proposed design, the various algorithms, and the benefit of fusing different modalities in this scenario.

  7. SPEECH/MUSIC CLASSIFICATION USING WAVELET BASED FEATURE EXTRACTION TECHNIQUES

    Directory of Open Access Journals (Sweden)

    Thiruvengatanadhan Ramalingam

    2014-01-01

    Audio classification is a fundamental step in handling the rapid growth in audio data volume. Due to the increasing size of multimedia sources, speech and music classification is one of the most important issues for multimedia information retrieval. In this work a speech/music discrimination system is developed which uses the Discrete Wavelet Transform (DWT) as the acoustic feature. Multi-resolution analysis is the most significant statistical way to extract the features from the input signal, and in this study a method is deployed to model the extracted wavelet feature. Support Vector Machines (SVM) are based on the principle of structural risk minimization. SVM is applied to classify audio into the classes speech and music by learning from training data. The proposed method then extends the application of Gaussian Mixture Models (GMM) to estimate the probability density function using maximum likelihood decision methods. The system shows significant results with an accuracy of 94.5%.
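
    A minimal sketch of wavelet-based feature extraction, assuming a Haar wavelet and per-band mean energy as the feature (the abstract does not name the wavelet or the exact feature, so both are illustrative choices). The resulting vector is the kind of input one would pass to an SVM or GMM classifier; the classifier itself is not shown.

```python
import math

def haar_dwt_step(signal):
    """One level of the orthonormal Haar DWT: returns (approximation, detail)."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    s = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal), 2)]
    return approx, detail

def wavelet_energy_features(frame, levels=3):
    """Mean energy of each detail band, then of the final approximation band.

    The frame length must be divisible by 2**levels.
    """
    features = []
    approx = list(frame)
    for _ in range(levels):
        approx, detail = haar_dwt_step(approx)
        features.append(sum(d * d for d in detail) / len(detail))
    features.append(sum(a * a for a in approx) / len(approx))
    return features

# A constant frame has no detail energy; an alternating frame has only
# finest-scale detail energy.
flat = wavelet_energy_features([1.0] * 8)
alternating = wavelet_energy_features([1.0, -1.0] * 4)
```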

  8. Spectacular Attractions: Museums, Audio-Visuals and the Ghosts of Memory

    Directory of Open Access Journals (Sweden)

    Mandelli Elisa

    2015-12-01

    In the last decades, moving images have become a common feature not only in art museums, but also in a wide range of institutions devoted to the conservation and transmission of memory. This paper focuses on the role of audio-visuals in the exhibition design of history and memory museums, arguing that they are privileged means to achieve the spectacular effects and the visitors’ emotional and “experiential” engagement that constitute the main objective of contemporary museums. I will discuss this topic through the concept of “cinematic attraction,” claiming that when embedded in displays, films and moving images often produce spectacular mises en scène with immersive effects, creating wonder and astonishment, and involving visitors on an emotional, visceral and physical level. Moreover, I will consider the diffusion of audio-visual witnesses of real or imaginary historical characters, presented in Phantasmagoria-like displays that simulate ghostly and uncanny apparitions, creating an ambiguous and often problematic coexistence of truth and illusion, subjectivity and objectivity, facts and imagination.

  9. Modulation of visual responses in the superior temporal sulcus by audio-visual congruency.

    Science.gov (United States)

    Dahl, Christoph D; Logothetis, Nikos K; Kayser, Christoph

    2010-01-01

    Our ability to identify or recognize visual objects is often enhanced by evidence provided by other sensory modalities. Yet, where and how visual object processing benefits from the information received by the other senses remains unclear. One candidate region is the temporal lobe, which features neural representations of visual objects, and in which previous studies have provided evidence for multisensory influences on neural responses. In the present study we directly tested whether visual representations in the lower bank of the superior temporal sulcus (STS) benefit from acoustic information. To this end, we recorded neural responses in alert monkeys passively watching audio-visual scenes, and quantified the impact of simultaneously presented sounds on responses elicited by the presentation of naturalistic visual scenes. Using methods of stimulus decoding and information theory, we then asked whether the responses of STS neurons become more reliable and informative in multisensory contexts. Our results demonstrate that STS neurons are indeed sensitive to the modality composition of the sensory stimulus. Importantly, information provided by STS neurons' responses about the particular visual stimulus being presented was highest during congruent audio-visual and unimodal visual stimulation, but was reduced during incongruent bimodal stimulation. Together, these findings demonstrate that higher visual representations in the STS not only convey information about the visual input but also depend on the acoustic context of a visual scene.

  10. Modulation of visual responses in the superior temporal sulcus by audio-visual congruency

    Directory of Open Access Journals (Sweden)

    Christoph Dahl

    2010-04-01

    Our ability to identify or recognize visual objects is often enhanced by evidence provided by other sensory modalities. Yet, where and how visual object processing benefits from the information received by the other senses remains unclear. One candidate region is the temporal lobe, which features neural representations of visual objects, and in which previous studies have provided evidence for multisensory influences on neural responses. In the present study we directly tested whether visual representations in the lower bank of the superior temporal sulcus (STS) benefit from acoustic information. To this end, we recorded neural responses in alert monkeys passively watching audio-visual scenes, and quantified the impact of simultaneously presented sounds on responses elicited by the presentation of naturalistic visual scenes. Using methods of stimulus decoding and information theory, we then asked whether the responses of STS neurons become more reliable and informative in multisensory contexts. Our results demonstrate that STS neurons are indeed sensitive to the modality composition of the sensory stimulus. Importantly, information provided by STS neurons’ responses about the particular visual stimulus being presented was highest during congruent audio-visual and unimodal visual stimulation, but was reduced during incongruent bimodal stimulation. Together, these findings demonstrate that higher visual representations in the STS not only convey information about the visual input but also depend on the acoustic context of a visual scene.

  11. Deutsch Durch Audio-Visuelle Methode: An Audio-Lingual-Oral Approach to the Teaching of German.

    Science.gov (United States)

    Dickinson Public Schools, ND. Instructional Media Center.

    This teaching guide, designed to accompany Chilton's "Deutsch Durch Audio-Visuelle Methode" for German 1 and 2 in a three-year secondary school program, focuses major attention on the operational plan of the program and a student orientation unit. A section on teaching a unit discusses four phases: (1) presentation, (2) explanation, (3)…

  12. An introduction to audio content analysis applications in signal processing and music informatics

    CERN Document Server

    Lerch, Alexander

    2012-01-01

    "With the proliferation of digital audio distribution over digital media, audio content analysis is fast becoming a requirement for designers of intelligent signal-adaptive audio processing systems. Written by a well-known expert in the field, this book provides quick access to different analysis algorithms and allows comparison between different approaches to the same task, making it useful for newcomers to audio signal processing and industry experts alike. A review of relevant fundamentals in audio signal processing, psychoacoustics, and music theory, as well as downloadable MATLAB files are also included"--

  13. A Single Core Hardware Approach of MPEG Audio Decoder for Real-Time Transmission

    Directory of Open Access Journals (Sweden)

    M.B.I. Reaz

    2012-04-01

    Decoding the voice audio bit stream is an issue for real-time transmission of high-quality voice audio over the Internet. A stand-alone chip that performs the decoding is a better solution than a software approach. MPEG audio compression provides high compression with minimal loss. This study describes a VHDL model of an MPEG audio layer 1 decoder that performs concurrent processing while receiving a voice-quality audio input bit stream at a constant bit rate and simultaneously producing a stream of 8-bit monopole PCM samples at a constant sampling frequency in real time.

  14. Maintaining high-quality IP audio services in lossy IP network environments

    Science.gov (United States)

    Barton, Robert J., III; Chodura, Hartmut

    2000-07-01

    In this paper we present our research activities in the area of digital audio processing and transmission. Today's available teleconference audio solutions are lacking in flexibility, robustness and fidelity. There was a need to enhance the quality of audio for IP-based applications to guarantee optimal services under varying conditions. Multiple tests and user evaluations have shown that a reliable audio communication toolkit is essential for any teleconference application. This paper summarizes our research activities and gives an overview of the applications developed. In a first step, the parameters that influence audio quality were evaluated. All of these parameters have to be optimized to achieve the best possible quality. It was therefore necessary to enhance existing schemes or develop new methods. Applications were developed for Internet telephony, broadcast of live music and spatial audio for Virtual Reality environments. This paper describes these applications and the issues of delivering high-quality digital audio services over lossy IP networks.

  15. From ITU-T G.722.1 to ITU-T G.722.1 Annex C: A New Low-Complexity 14kHz Bandwidth Audio Coding Standard

    Directory of Open Access Journals (Sweden)

    Minjie Xie

    2007-04-01

    This paper describes the low-complexity 14 kHz bandwidth audio coding algorithm which has recently been standardized by ITU-T as Recommendation G.722.1 Annex C (“G.722.1C”). The algorithm is an extension to ITU-T Recommendation G.722.1 and a doubled form of the G.722.1 algorithm to permit 14 kHz audio bandwidth using a 32 kHz audio sample rate, at 24, 32, and 48 kbit/s. The G.722.1C codec features very high audio quality, extremely low computational complexity, and low algorithmic delay compared to other state-of-the-art audio coding algorithms. This codec is suitable for use in video conferencing, teleconferencing, and Internet streaming applications, as well as a general-purpose 14 kHz audio codec. Subjective test results from the Characterization phase of G.722.1C are also presented in the paper.
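
    The sample rate and bit rates above translate directly into a per-frame budget. The arithmetic below assumes a 20 ms frame, the frame length used by G.722.1; the abstract does not state it, so treat it as an assumption, and the function name is illustrative.

```python
def frame_budget(sample_rate_hz, bit_rate_bps, frame_ms=20):
    """Return (samples per frame, coded bits per frame).

    frame_ms=20 is an assumption taken from G.722.1, not from the abstract.
    """
    samples = sample_rate_hz * frame_ms // 1000
    bits = bit_rate_bps * frame_ms // 1000
    return samples, bits

# The three bit rates named in the abstract, at a 32 kHz sample rate.
budgets = {rate: frame_budget(32000, rate) for rate in (24000, 32000, 48000)}
```

    Under that assumption each frame holds 640 samples, coded with 480, 640 or 960 bits, that is, between 0.75 and 1.5 bits per sample.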

  16. A novel fiber audio transmission system for secure communication

    Institute of Scientific and Technical Information of China (English)

    SU Ke; JIA Bo

    2005-01-01

    A new, simple and efficient fiber audio transmission method for long-distance secure communication is presented, which performs signal modulation by the strain-optic effect and signal demodulation by an all-fiber interferometer. The interferometer is a truly path-matched device, which eliminates much of the undesirable noise by combining the reference and the sensing arms within the same optical fiber. The sinusoidal signals adopted in the experiment are multi-frequency signals in the range 300 Hz-3400 Hz, and the system shows good capabilities, robust security and maintenance of audio integrity. The device may be applicable to point-to-point secure communication over a transmission range of 40 kilometers.

  17. Adaptive audio watermarking based on SNR in localized regions

    Institute of Scientific and Technical Information of China (English)

    WU Guo-min; ZHUANG Yue-ting; WU Fei; PAN Yun-he

    2005-01-01

    In this paper, a novel localized audio watermarking scheme is proposed, in which the signal-to-noise ratio (SNR) is used to determine a scaling parameter α. The basic idea is to embed the watermark in selected high-inflexion regions and to modify the intensity of the embedded watermark by adaptively adjusting α. These high-inflexion local regions usually correspond to music edges, such as the sound of percussion instruments, explosions or transitions in mixed music, which represent the music rhythm or tempo and are very important to human auditory perception, so the embedded watermark is expected to escape the distortions caused by time-domain synchronization attacks. Taking advantage of localization and SNR, the method shows strong robustness against common audio signal processing operations, random cropping, time scale modification, etc.
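
    One generic way to tie a scaling parameter to SNR is to solve for the α that makes an additive watermark hit a target SNR in a given region. The sketch below does that for a single region; it is an illustration of the idea, not the authors' adaptation rule, and the additive embedding model and all names are assumptions.

```python
import math

def scaling_for_target_snr(host, mark, target_snr_db):
    """Alpha such that host + alpha * mark has the requested SNR in dB."""
    host_energy = sum(x * x for x in host)
    mark_energy = sum(w * w for w in mark)
    if mark_energy == 0:
        raise ValueError("watermark has zero energy")
    return math.sqrt(host_energy / (mark_energy * 10 ** (target_snr_db / 10.0)))

def embed(host, mark, alpha):
    """Additive embedding in one region (assumed model)."""
    return [x + alpha * w for x, w in zip(host, mark)]

def snr_db(host, marked):
    """SNR of the marked region, treating the embedded mark as noise."""
    noise = sum((y - x) ** 2 for x, y in zip(host, marked))
    return 10.0 * math.log10(sum(x * x for x in host) / noise)

host_region = [1.0, -2.0, 3.0, -4.0]
mark = [1.0, -1.0, 1.0, -1.0]
alpha = scaling_for_target_snr(host_region, mark, 20.0)
achieved = snr_db(host_region, embed(host_region, mark, alpha))
```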

  18. Evaluation of embedded audio feedback on writing assignments.

    Science.gov (United States)

    Graves, Janet K; Goodman, Joely T; Hercinger, Maribeth; Minnich, Margo; Murcek, Christina M; Parks, Jane M; Shirley, Nancy

    2015-01-01

    The purpose of this pilot study was to compare embedded audio feedback (EAF), which faculty provided using the iPad® application iAnnotate® PDF to insert audio comments, and written feedback (WF), inserted electronically on student papers in a series of writing assignments. Goals included determining whether EAF provides more useful guidance to students than WF and whether EAF promotes connectedness among students and faculty. An additional goal was to ascertain the efficiency and acceptance of EAF as a grading tool by nursing faculty. The pilot study was a quasi-experimental, cross-over, posttest-only design. The project was completed in an Informatics in Health Care course. Faculty alternated the two feedback methods on four papers written by each student. Results of surveys and focus groups revealed that students and faculty had mixed feelings about this technology. Student preferences were equally divided between EAF and WF, with 35% for each, and 28% were undecided.

  19. Audio-visual interactions in product sound design

    Science.gov (United States)

    Özcan, Elif; van Egmond, René

    2010-02-01

    Consistent product experience requires congruity between product properties such as visual appearance and sound. Therefore, for designing appropriate product sounds by manipulating their spectral-temporal structure, product sounds should preferably not be considered in isolation but as an integral part of the main product concept. Because visual aspects of a product are considered to dominate the communication of the desired product concept, sound is usually expected to fit the visual character of a product. We argue that this can be accomplished successfully only on basis of a thorough understanding of the impact of audio-visual interactions on product sounds. Two experimental studies are reviewed to show audio-visual interactions on both perceptual and cognitive levels influencing the way people encode, recall, and attribute meaning to product sounds. Implications for sound design are discussed defying the natural tendency of product designers to analyze the "sound problem" in isolation from the other product properties.

  20. Dynamic range control of audio signals by digital signal processing

    Science.gov (United States)

    Gilchrist, N. H. C.

    It is often necessary to reduce the dynamic range of musical programs, particularly those comprising orchestral and choral music, for them to be received satisfactorily by listeners to conventional FM and AM broadcasts. With the arrival of DAB (Digital Audio Broadcasting) a much wider dynamic range will become available for radio broadcasting, although some listeners may prefer to have a signal with a reduced dynamic range. This report describes a digital processor developed by the BBC to control the dynamic range of musical programs in a manner similar to that of a trained Studio Manager. It may be used prior to transmission in conventional broadcasting, replacing limiters or other compression equipment. In DAB, it offers the possibility of providing a dynamic range control signal to be sent to the receiver via an ancillary data channel, simultaneously with the uncompressed audio, giving the listener the option of the full dynamic range or a reduced dynamic range.
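
    A minimal sketch of a static compression curve, to make concrete what a "dynamic range control signal" could carry: a gain in dB that the receiver may apply or ignore. This is a generic threshold-and-ratio compressor, not the BBC processor described above; the threshold, ratio and function names are hypothetical.

```python
def compressed_level_db(level_db, threshold_db=-20.0, ratio=3.0):
    """Output level for a given input level; levels above threshold are reduced."""
    if level_db <= threshold_db:
        return level_db
    return threshold_db + (level_db - threshold_db) / ratio

def control_gain_db(level_db, threshold_db=-20.0, ratio=3.0):
    """Gain the receiver would apply to obtain the reduced dynamic range."""
    return compressed_level_db(level_db, threshold_db, ratio) - level_db

# A loud passage at -5 dB is pulled down; a quiet one at -30 dB is untouched.
loud_gain = control_gain_db(-5.0)
quiet_gain = control_gain_db(-30.0)
```

    Sending the gain alongside uncompressed audio, rather than compressing before transmission, is what leaves the choice to the listener.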

  1. Random Numbers Generated from Audio and Video Sources

    Directory of Open Access Journals (Sweden)

    I-Te Chen

    2013-01-01

    Random numbers are very useful in simulation, chaos theory, game theory, information theory, pattern recognition, probability theory, quantum mechanics, statistics, and statistical mechanics. They are especially helpful in cryptography. In this work, the proposed random number generators use the white noise of audio and video (A/V) sources extracted from high-resolution IPCAM, WEBCAM, and MPEG-1 video files. Applied to live sources from an IPCAM or a WEBCAM with microphone, the proposed generator is a true random number generator; applied to MPEG-1 video files, it is a pseudorandom number generator. In addition, when the 15 statistical tests of NIST SP 800-22 Rev.1a are applied to the random numbers generated by the proposed generator, around 98% of them pass the 15 tests. Furthermore, audio and video sources are easy to find; hence, the proposed generator is a qualified, convenient, and efficient random number generator.

  2. Literary Genres in Social Life: A Narrative, Audio-visual and Poetic Approach

    Directory of Open Access Journals (Sweden)

    Luis Felipe González Gutiérrez

    2008-05-01

    The proposal, "Literary Genres in Social Life: a Narrative, Audio-visual and Poetic Approach", aims to present to the academic psychology community and related social science disciplines the main contributions of literary genre theory through a social constructionist understanding of narrations and daily stories, and by means of an interactive construction of narrative collage. This work, supported by an investigation financed by the University Santo Tomás in Bogota, Colombia, "Understanding of structuralist literary theories in the development of the narrative 'I' within the social constructionist approach", proposes alternative spaces for the presentation of its research results through the expression of metaphors, visual narrative sequences and interactive artistic forms, which invite the spectator to share in and understand important concepts in the consolidation of social forms of construction of the quotidian. URN: urn:nbn:de:0114-fqs0802373

  3. Audio Quality Assurance : An Application of Cross Correlation

    DEFF Research Database (Denmark)

    Jurik, Bolette Ammitzbøll; Nielsen, Jesper Asbjørn Sindahl

    2012-01-01

    We describe algorithms for automated quality assurance on content of audio files in context of preservation actions and access. The algorithms use cross correlation to compare the sound waves. They are used to do overlap analysis in an access scenario, where preserved radio broadcasts are used...... in research and annotated. They have been applied in a migration scenario, where radio broadcasts are to be migrated for long term preservation....
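
    A minimal sketch of cross correlation used to locate where two recordings overlap, assuming short lists of samples and a brute-force search over lags. A production tool would typically use an FFT-based correlation on long files; the function names and the `min_overlap` guard are illustrative, not taken from the paper.

```python
import math

def normalized_correlation(x, y):
    """Correlation of two equal-length windows, scaled to [-1, 1]."""
    energy = math.sqrt(sum(v * v for v in x) * sum(v * v for v in y))
    return sum(p * q for p, q in zip(x, y)) / energy if energy else 0.0

def best_lag(a, b, max_lag, min_overlap=3):
    """Find the lag for which a[i + lag] best matches b[i].

    Returns (lag, score). Lags with fewer than `min_overlap` overlapping
    samples are skipped, since very short overlaps give meaningless scores.
    """
    best_score, best = float("-inf"), 0
    for lag in range(-max_lag, max_lag + 1):
        idx = [i for i in range(len(b)) if 0 <= i + lag < len(a)]
        if len(idx) < min_overlap:
            continue
        score = normalized_correlation([a[i + lag] for i in idx],
                                       [b[i] for i in idx])
        if score > best_score:
            best_score, best = score, lag
    return best, best_score

recording = [0, 0, 0, 1, 2, 3, 2, 1, 0, 0]
clip = [1, 2, 3, 2, 1]
lag, score = best_lag(recording, clip, max_lag=5)
```

    A score near 1 at the best lag suggests the two files share content at that offset; a low best score suggests they do not overlap or that one of them is damaged.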

  4. Audio Signal Generator System Based On State Machines

    Institute of Scientific and Technical Information of China (English)

    王维喜

    2009-01-01

    A state machine can make program design quicker, simpler and more efficient. This paper describes in detail the model of a state machine and the idea behind its design, and presents the design process of a state machine through the example of an audio signal generator system based on LabVIEW. The result shows that introducing the state machine makes complex design processes clearer and program revision easier.

  5. Amplitude Modulated Sinusoidal Signal Decomposition for Audio Coding

    DEFF Research Database (Denmark)

    Christensen, M. G.; Jacobson, A.; Andersen, S. V.;

    2006-01-01

    In this paper, we present a decomposition for sinusoidal coding of audio, based on an amplitude modulation of sinusoids via a linear combination of arbitrary basis vectors. The proposed method, which incorporates a perceptual distortion measure, is based on a relaxation of a nonlinear least-squar......-squares minimization. Rate-distortion curves and listening tests show that, compared to a constant-amplitude sinusoidal coder, the proposed decomposition offers perceptually significant improvements in critical transient signals....

  6. Quality and Distortion Evaluation of Audio Signal by Spectrum

    OpenAIRE

    Er. Niranjan Singh; Dr. Bhupendra Verma

    2012-01-01

    Information hiding in digital audio can be used for such diverse applications as proof of ownership, authentication, integrity, secret communication, broadcast monitoring and event annotation. To achieve secure and undetectable communication, stego-objects (documents containing a secret message) should be indistinguishable from cover-objects (documents not containing any secret message). In this respect, steganalysis is the set of techniques that aim to distinguish between cover...

  7. Comparison of Linear Prediction Models for Audio Signals

    Directory of Open Access Journals (Sweden)

    van Waterschoot Toon

    2008-01-01

    While linear prediction (LP) has become immensely popular in speech modeling, it does not seem to provide a good approach for modeling audio signals. This is somewhat surprising, since a tonal signal consisting of a number of sinusoids can be perfectly predicted based on an (all-pole) LP model with a model order that is twice the number of sinusoids. We provide an explanation why this result cannot simply be extrapolated to LP of audio signals. If noise is taken into account in the tonal signal model, a low-order all-pole model appears to be only appropriate when the tonal components are uniformly distributed in the Nyquist interval. Based on this observation, different alternatives to the conventional LP model can be suggested. Either the model should be changed to a pole-zero, a high-order all-pole, or a pitch prediction model, or the conventional LP model should be preceded by an appropriate frequency transform, such as a frequency warping or downsampling. By comparing these alternative LP models to the conventional LP model in terms of frequency estimation accuracy, residual spectral flatness, and perceptual frequency resolution, we obtain several new and promising approaches to LP-based audio modeling.
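
    The claim that N noise-free sinusoids are perfectly predicted by an all-pole model of order 2N can be checked numerically. Each sinusoid of frequency ω satisfies x[n] = 2cos(ω)x[n-1] - x[n-2], so the predictor for a sum of sinusoids comes from multiplying the factors 1 - 2cos(ω)z^-1 + z^-2. The sketch below builds the coefficients from known frequencies; it does not estimate them from data, and the function names are illustrative.

```python
import math

def sinusoid_predictor(omegas):
    """Coefficients a_1..a_2N with x[n] = sum_k a_k * x[n - k]."""
    poly = [1.0]  # A(z) = product of (1 - 2cos(w) z^-1 + z^-2)
    for w in omegas:
        factor = [1.0, -2.0 * math.cos(w), 1.0]
        new = [0.0] * (len(poly) + 2)
        for i, p in enumerate(poly):
            for j, f in enumerate(factor):
                new[i + j] += p * f
        poly = new
    return [-c for c in poly[1:]]

def prediction_residual(signal, coeffs):
    """Prediction error for every sample that has a full history."""
    order = len(coeffs)
    return [signal[n] - sum(c * signal[n - 1 - k] for k, c in enumerate(coeffs))
            for n in range(order, len(signal))]

# Two sinusoids with different amplitudes and phases, no noise.
omegas = [0.3, 1.1]
tone = [math.cos(0.3 * n + 0.7) + 0.5 * math.cos(1.1 * n - 0.2) for n in range(64)]
coeffs = sinusoid_predictor(omegas)
residual = prediction_residual(tone, coeffs)
```

    Adding even a little noise to `tone` makes the residual non-zero, which is the situation the paper goes on to analyse.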

  8. Comparison of Linear Prediction Models for Audio Signals

    Directory of Open Access Journals (Sweden)

    2009-03-01

    While linear prediction (LP) has become immensely popular in speech modeling, it does not seem to provide a good approach for modeling audio signals. This is somewhat surprising, since a tonal signal consisting of a number of sinusoids can be perfectly predicted based on an (all-pole) LP model with a model order that is twice the number of sinusoids. We provide an explanation why this result cannot simply be extrapolated to LP of audio signals. If noise is taken into account in the tonal signal model, a low-order all-pole model appears to be only appropriate when the tonal components are uniformly distributed in the Nyquist interval. Based on this observation, different alternatives to the conventional LP model can be suggested. Either the model should be changed to a pole-zero, a high-order all-pole, or a pitch prediction model, or the conventional LP model should be preceded by an appropriate frequency transform, such as a frequency warping or downsampling. By comparing these alternative LP models to the conventional LP model in terms of frequency estimation accuracy, residual spectral flatness, and perceptual frequency resolution, we obtain several new and promising approaches to LP-based audio modeling.

  9. DAB: Multiplex and system support features

    Science.gov (United States)

    Riley, J. L.

    This Report describes the multiplex and system support features of the Eureka 147/DAB digital audio system. It sets out the requirements of all users along the broadcast chain, from service providers and broadcasters through to the listener. The contents of the transmission frame are examined, drawing the distinction between the main service multiplex and the provision of control information in a separate fast data channel. The concept of the DAB service structure is introduced and the inherent system flexibility for altering the service arrangement is explained. A wide range of service information features builds on those provided in earlier systems, such as RDS (Radio Data System), and is intended to make it easier for a listener to find any required service and to add a further dimension to audio broadcasting. The choices available to users in all of these areas are examined.

  10. An Analysis of Translation Strategies for Professional English for Audio Recording Techniques

    Institute of Scientific and Technical Information of China (English)

    黄艺平

    2013-01-01

    With the development of modern media technology, audio recording technology is widely integrated into artistic forms and increasingly influences people's lives. The specialized nature of professional English for audio recording technology has become something of an obstacle to cross-cultural exchange between the West and the East. This paper attempts to find applicable translation strategies for professional English for audio recording techniques by analyzing its linguistic features from the perspective of Eugene A. Nida's Functional Equivalence Theory.

  11. Study on the Design of Hall Space with Features of Fitting Place in the Ancient Village

    Institute of Scientific and Technical Information of China (English)

    贺鹏飞; 闫芳; 衡苛

    2016-01-01

    This article proposes a design strategy for the hall space of ancient villages from the perspective of fitting place features, taking the design of the waterfront hall space of Ding Li Bay ancient village in southern Henan as an example. By analyzing the key place features of the village, including the natural water-town elements of Ding Li Bay, the village space elements and the waterfront hall space elements, the paper describes the design of the waterfront hall space in terms of the extraction of tea culture, the abstraction of mountain forms and the functional design of the waterfront hall space. It summarizes the general idea and implementation methods of village hall space design based on fitting place features, in order to provide a reference for ancient village hall space design.

  12. High capacity reversible watermarking for audio by histogram shifting and predicted error expansion.

    Science.gov (United States)

    Wang, Fei; Xie, Zhaoxin; Chen, Zuo

    2014-01-01

    In reversible watermarking, the information embedded in audio signals can be extracted while the original audio data is recovered losslessly. Currently, the few reversible audio watermarking algorithms face the following problems: a relatively low signal-to-noise ratio (SNR) of the embedded audio; a large amount of auxiliary embedded location information; and the absence of accurate capacity control. In this paper, we present a novel reversible audio watermarking scheme based on improved prediction error expansion and histogram shifting. First, we use a differential evolution algorithm to optimize the prediction coefficients and then apply prediction error expansion to output the stego data. Second, to reduce the length of the location map, we introduce a histogram shifting scheme. In addition, the proposed scheme can compute the prediction error modification threshold for a given embedding capacity. Experiments show that this algorithm improves the SNR of embedded audio signals and the embedding capacity, drastically reduces the length of the location map, and enhances capacity control.
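
    A minimal sketch of histogram shifting, the reversible step named above, applied directly to integer samples. The paper applies it to prediction errors with optimized coefficients; that part is not reproduced here. The sketch also ignores overflow at the maximum sample value, and all names are illustrative.

```python
from collections import Counter

def hs_embed(samples, bits):
    """Embed bits at the most frequent value; returns (marked, peak, n_bits)."""
    peak = Counter(samples).most_common(1)[0][0]
    if len(bits) > samples.count(peak):
        raise ValueError("payload exceeds capacity")
    payload = iter(bits)
    marked = []
    for s in samples:
        if s > peak:
            marked.append(s + 1)                 # shift right to free bin peak + 1
        elif s == peak:
            marked.append(s + next(payload, 0))  # unused peak samples carry 0
        else:
            marked.append(s)
    return marked, peak, len(bits)

def hs_extract(marked, peak, n_bits):
    """Recover the payload and restore the original samples exactly."""
    bits, restored = [], []
    for s in marked:
        if s in (peak, peak + 1):
            bits.append(s - peak)
            restored.append(peak)
        elif s > peak + 1:
            restored.append(s - 1)
        else:
            restored.append(s)
    return bits[:n_bits], restored

original = [0, 1, 1, 2, 1, 3, 1, -1]
marked, peak, n_bits = hs_embed(original, [1, 0, 1])
payload, restored = hs_extract(marked, peak, n_bits)
```

    The capacity equals the height of the peak bin, which is why sharper histograms such as those of prediction errors carry more payload.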

  13. A Detailed look of Audio Steganography Techniques using LSB and Genetic Algorithm Approach

    Directory of Open Access Journals (Sweden)

    Gunjan Nehru

    2012-01-01

    This paper studies various audio steganography techniques using different algorithms, such as the genetic algorithm approach and the LSB approach. We have tried some approaches that help in audio steganography. Steganography is the art and science of writing hidden messages in such a way that no one, apart from the sender and intended recipient, suspects the existence of the message, a form of security through obscurity. In steganography, the message used to hide the secret message is called the host message or cover message. Once the contents of the host message or cover message are modified, the resultant message is known as the stego message. In other words, the stego message is a combination of the host message and the secret message. Audio steganography requires a text or audio secret message to be embedded within a cover audio message. Because redundancy is available for information hiding, the cover audio message before steganography and the stego message after steganography remain the same.
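
    As a rough illustration of the LSB approach mentioned in this abstract (not the paper's own implementation), the sketch below writes each secret bit into the least significant bit of an integer cover sample and reads it back. The function names and sample values are hypothetical.

```python
def embed_lsb(samples, bits):
    """Return a copy of samples with each secret bit stored in a sample's LSB."""
    if len(bits) > len(samples):
        raise ValueError("secret message is longer than the cover")
    stego = list(samples)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & ~1) | bit  # clear the LSB, then set it to the bit
    return stego


def extract_lsb(stego, n_bits):
    """Read the first n_bits least significant bits back out."""
    return [s & 1 for s in stego[:n_bits]]


cover = [1000, -2001, 37, 4, -5, 32767]   # made-up 16-bit PCM sample values
secret = [1, 0, 1, 1]
stego = embed_lsb(cover, secret)
```

    Each stego sample differs from its cover sample by at most one quantization step, which is why the change is hard to hear.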

  14. Efficient Query-by-Content Audio Retrieval by Locality Sensitive Hashing and Partial Sequence Comparison

    Science.gov (United States)

    Yu, Yi; Joe, Kazuki; Downie, J. Stephen

    This paper investigates suitable indexing techniques to enable efficient content-based audio retrieval in large acoustic databases. To make an index-based retrieval mechanism applicable to audio content, we investigate the design of Locality Sensitive Hashing (LSH) and partial sequence comparison. We propose a fast and efficient audio retrieval framework of query-by-content and develop an audio retrieval system. Based on this framework, four different audio retrieval schemes, LSH-Dynamic Programming (DP), LSH-Sparse DP (SDP), Exact Euclidean LSH (E2LSH)-DP, and E2LSH-SDP, are introduced and evaluated in order to better understand the performance of audio retrieval algorithms. The experimental results indicate that, compared with the traditional DP and the other three competitive schemes, E2LSH-SDP exhibits the best tradeoff in terms of response time, retrieval accuracy and computation cost.

  15. High Capacity Reversible Watermarking for Audio by Histogram Shifting and Predicted Error Expansion

    Directory of Open Access Journals (Sweden)

    Fei Wang

    2014-01-01

    Being reversible, the watermarking information embedded in audio signals can be extracted while the original audio data can achieve lossless recovery. Currently, the few reversible audio watermarking algorithms are confronted with the following problems: relatively low SNR (signal-to-noise ratio) of embedded audio; a large amount of auxiliary embedded location information; and the absence of accurate capacity control capability. In this paper, we present a novel reversible audio watermarking scheme based on improved prediction error expansion and histogram shifting. First, we use a differential evolution algorithm to optimize prediction coefficients and then apply prediction error expansion to output stego data. Second, in order to reduce the location map bit length, we introduce a histogram shifting scheme. Meanwhile, the prediction error modification threshold for a given embedding capacity can be computed by our proposed scheme. Experiments show that this algorithm improves the SNR of embedded audio signals and embedding capacity, drastically reduces the location map bit length, and enhances capacity control capability.

  16. Design and realization of digital audio equalizer based on MCU and FPAA

    Institute of Scientific and Technical Information of China (English)

    Zhou Ping; Liu Zhuo; Xia Liang

    2008-01-01

    In analog audio equalizer, the filters are constructed by op-amplifiers and discrete components. Being influenced by its discrete capabilities, audio equalizer has many disadvantages. Meanwhile, pure digital audio equalizer has got better performance and stability, but its cost and price are too high. So digital audio equalizer only has its application in upscale domain. A new design method for audio equalizer is proposed, which attempts to design and realize a high precision and high SNR (signal noise ratio) digital audio equalizer system based on field programmable analog array (FPAA) and micro-controller unit. This design confirms that design speed and performance will be greatly enhanced when FPAA technology is applied to analog design domain.

  17. Design and Research on Sigma-Delta Digital-to-Analog Converters for Audio Power Amplifiers

    OpenAIRE

    Puidokas, Vytenis

    2011-01-01

    The dissertation investigates the issues of analyzing a digital Sigma-Delta digital-to-analog converter (DAC) for audio power amplifiers. The main objects of research include a digital Sigma-Delta audio power DAC, improvement of its structure, and experimental research. The primary purpose of the dissertation is to suggest methods for improving the structure of the digital Sigma-Delta audio power DAC interpolator and for analyzing the converter.

  18. Direct-conversion switching-mode audio power amplifier with active capacitive voltage clamp

    DEFF Research Database (Denmark)

    Ljusev, Petar; Andersen, Michael Andreas E.

    2005-01-01

    This paper discusses the advantages and problems when implementing direct energy conversion switching-mode audio power amplifiers. It is shown that the total integration of the power supply and Class D audio power amplifier into one compact direct converter can simplify the design, increase efficiency, reduce the product volume and lower its cost. As an example, the principle of operation and the measurements made on a direct-conversion switching-mode audio power amplifier with active capacitive voltage clamp are presented.

  19. An Analog I/O Interface Board for Audio Arduino Open Sound Card System

    DEFF Research Database (Denmark)

    Dimitrov, Smilen; Serafin, Stefania

    2011-01-01

    AudioArduino [1] is a system consisting of an ALSA (Advanced Linux Sound Architecture) audio driver and corresponding microcontroller code that can demonstrate full-duplex, mono, 8-bit, 44.1 kHz soundcard behavior on an FTDI based Arduino. While the basic operation as a soundcard can be demonstr...

  20. Acoustic contrast sensitivity to transfer function errors in the design of a personal audio system.

    Science.gov (United States)

    Park, Jin-Young; Choi, Jung-Woo; Kim, Yang-Hann

    2013-07-01

    An analytic means to evaluate the error sensitivity of a personal audio system is proposed. The personal audio system, which focuses acoustic energy into a zone of interest using multiple loudspeakers, is subject to various errors when implemented. The performance of a personal audio system, defined as an energy ratio between the zone of interest and the rest, is inevitably influenced by errors. Thus the ability to predict performance change at the design stage is crucial when building a robust personal audio system. The dependence of the energy ratio change on various types of errors is formulated.

  1. Realization of guitar audio effects using methods of digital signal processing

    Science.gov (United States)

    Buś, Szymon; Jedrzejewski, Konrad

    2015-09-01

    The paper is devoted to studies on possibilities of realization of guitar audio effects by means of methods of digital signal processing. As a result of research, some selected audio effects corresponding to the specifics of guitar sound were realized as the real-time system called Digital Guitar Multi-effect. Before implementation in the system, the selected effects were investigated using the dedicated application with a graphical user interface created in Matlab environment. In the second stage, the real-time system based on a microcontroller and an audio codec was designed and realized. The system is designed to perform audio effects on the output signal of an electric guitar.

  2. Music and audio - oh how they can stress your network

    Science.gov (United States)

    Fletcher, R.

    Nearly ten years ago a paper written by the Audio Engineering Society (AES)[1] made a number of interesting statements. The current Internet is inadequate for transmitting music and professional audio. Performance and collaboration across a distance stress the quality of service beyond acceptable bounds. Audio and music provide test cases in which the bounds of the network are quickly reached and through which the defects in a network are readily perceived. Given these key points, where are we now? Have we started to solve any of the problems from the musician's point of view? What is it that a musician would like to do that can cause the network so many problems? To understand this we need to appreciate that a trained musician's ears are extremely sensitive to very subtle shifts in temporal materials and localisation information. A shift of a few milliseconds can cause difficulties. So, can modern networks provide the temporal accuracy demanded at this level? The sample and bit rates needed to represent music in the digital domain are still contentious, but a general consensus in the professional world is for 96 kHz and IEEE 64-bit floating point. If this were to be run between two points on the network across 24 channels in near real time to allow for collaborative composition/production/performance, with QoS settings to allow as near to zero latency and jitter as possible, it can be seen that the network indeed has to perform very well. Lighting the Blue Touchpaper for UK e-Science - Closing Conference of ESLEA Project, The George Hotel, Edinburgh, UK, 26-28 March, 200
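
    To make the network demand in this abstract concrete, the following sketch works out the raw payload bit rate for the figures it quotes (96 kHz, 64 bits per sample, 24 channels). It is only a back-of-the-envelope calculation: packet headers, redundancy, and any compression are deliberately left out, and the function name is hypothetical.

```python
SAMPLE_RATE_HZ = 96_000   # from the abstract
BITS_PER_SAMPLE = 64      # IEEE 64-bit floating point
CHANNELS = 24


def raw_bitrate_bps(sample_rate_hz, bits_per_sample, channels):
    """Uncompressed audio payload rate in bits per second (no protocol overhead)."""
    return sample_rate_hz * bits_per_sample * channels


bitrate = raw_bitrate_bps(SAMPLE_RATE_HZ, BITS_PER_SAMPLE, CHANNELS)
print(f"{bitrate / 1e6:.3f} Mbit/s")  # prints "147.456 Mbit/s"
```

    At roughly 147 Mbit/s of payload in each direction, a 100 Mbit/s link is already insufficient before latency and jitter are considered.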

  3. On the relevance of spectral features for instrument classification

    DEFF Research Database (Denmark)

    Nielsen, Andreas Brinch; Sigurdsson, Sigurdur; Hansen, Lars Kai

    2007-01-01

    Automatic knowledge extraction from music signals is a key component for most music organization and music information retrieval systems. In this paper, we consider the problem of instrument modelling and instrument classification from the rough audio data. Existing systems for automatic instrument classification operate normally on a relatively large number of features, from which those related to the spectrum of the audio signal are particularly relevant. In this paper, we confront two different models about the spectral characterization of musical instruments. The first assumes a constant envelope...

  4. The Carina spiral feature: Strömgren-β photometry approach. II. Distances and space distribution of the O and B stars

    Science.gov (United States)

    Kaltcheva, N.; Scorcio, M.

    2010-05-01

    Aims: In recent years a significant development has become evident in the study of the stellar structure of the Galactic disk. This is especially true for the 3rd Galactic quadrant, where the stellar population was extensively investigated beyond 10 kpc, revealing details about the warped geometry of the thin and thick disks and outer arm. The 4th Galactic quadrant offers even better opportunity to follow the distribution of the young stellar populace to a large distance, since the line of sight is parallel to the largest single segment of a spiral arm seen from our position in the Galaxy: the Carina spiral feature. This paper further contributes to the study of the structure of the Galactic disk in the direction of Carina field utilizing homogeneous photometric distances of a sample of about 600 bright early-type stars seen in this direction up to 6 kpc. Methods: The derived stellar distances are based on uvbyβ photometry. All O and B type stars with uvbyβ data presently available are included in the study. Results: The photometry-derived parameters allow us to study the structure and characteristics of this segment of the Carina arm. We find that the stellar distribution is consistent with a location of the apparent edge of the arm at l = 287°. Toward the edge of the arm the warp of the Galactic plane can be traced up to 6 kpc where it reaches negative 200 pc. The field toward the edge seems to be much more complex than harboring just one OB association, and it is likely that some of the apparent concentrations in this field represent parts of long segments of the edge. In the 284° longitude range an interarm space about 1 kpc wide is found beyond 850 pc from the Sun. The giant molecular clouds and open clusters do not follow the edge of the arm as defined by the OB stars and indicate a possible presence of an age gradient in a direction perpendicular to the formal Galactic plane. Table 4 is only available in electronic form at the CDS via anonymous ftp to

  5. Application of Feature Space in Extraction of Asparagus Planting Area Using Remote Sensing

    Institute of Scientific and Technical Information of China (English)

    王猛; 隋学艳; 梁守真; 姚慧敏; 侯学会

    2016-01-01

    Remote sensing has been widely used to extract the planting area of cash crops, but the traditional methods are not suitable for extracting the planting area of asparagus. To address this deficiency, this paper takes the characteristics of asparagus cultivation into account. Taking Caoxian County in Shandong Province as the study area, it develops a method for extracting the asparagus planting area from Landsat 8 images. By comparing the NDVI (normalized difference vegetation index) of the asparagus planting area with that of other objects, water and wheat fields are first removed by NDVI threshold segmentation. The two-dimensional feature space of the asparagus planting area, buildings, and roads is then analyzed to find the distribution of the soil line, and the thresholds for the asparagus planting area are determined from band-math results. Finally, the asparagus planting area is extracted using the defined thresholds. The results show that the asparagus planting area in Caoxian County is 14626.55 ha, with an overall accuracy of 84.85%.
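
    The NDVI threshold segmentation step can be sketched as follows. NDVI is (NIR - Red) / (NIR + Red); for Landsat 8 the near-infrared and red bands are bands 5 and 4. The reflectance values and the two thresholds below are made up for illustration and are not the values used in the paper; the function names are hypothetical.

```python
def ndvi(nir, red):
    """Normalized difference vegetation index for one pixel."""
    total = nir + red
    return 0.0 if total == 0 else (nir - red) / total


def mask_by_ndvi(nir_band, red_band, low, high):
    """Keep pixels whose NDVI lies in [low, high]; everything else is masked out."""
    return [low <= ndvi(n, r) <= high for n, r in zip(nir_band, red_band)]


# Hypothetical reflectances for three pixels: water, dense wheat, sparse canopy.
nir = [0.02, 0.50, 0.30]
red = [0.05, 0.05, 0.15]

# Assumed thresholds: drop water (negative NDVI) and dense wheat (very high NDVI).
candidates = mask_by_ndvi(nir, red, low=0.2, high=0.6)
```

    Pixels that survive this mask would then go on to the feature-space analysis described in the abstract, which this sketch does not attempt.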

  6. Space space space

    CERN Document Server

    Trembach, Vera

    2014-01-01

    Space is an introduction to the mysteries of the Universe. Included are Task Cards for independent learning, Journal Word Cards for creative writing, and Hands-On Activities for reinforcing skills in Math and Language Arts. Space is a perfect introduction to further research of the Solar System.

  7. Sinusoidal Analysis-Synthesis of Audio Using Perceptual Criteria

    Directory of Open Access Journals (Sweden)

    Ted Painter

    2003-01-01

    This paper presents a new method for the selection of sinusoidal components for use in compact representations of narrowband audio. The method consists of ranking and selecting the most perceptually relevant sinusoids. The idea behind the method is to maximize the matching between the auditory excitation pattern associated with the original signal and the corresponding auditory excitation pattern associated with the modeled signal that is being represented by a small set of sinusoidal parameters. The proposed component-selection methodology is shown to outperform the maximum signal-to-mask ratio selection strategy in terms of subjective quality.

  8. Tools for signal compression applications to speech and audio coding

    CERN Document Server

    Moreau, Nicolas

    2013-01-01

    This book presents tools and algorithms required to compress/uncompress signals such as speech and music. These algorithms are largely used in mobile phones, DVD players, HDTV sets, etc. In a first rather theoretical part, this book presents the standard tools used in compression systems: scalar and vector quantization, predictive quantization, transform quantization, entropy coding. In particular we show the consistency between these different tools. The second part explains how these tools are used in the latest speech and audio coders. The third part gives Matlab programs simulating t

  9. Digital video and audio broadcasting technology a practical engineering guide

    CERN Document Server

    Fischer, Walter

    2010-01-01

    'Digital Video and Audio Broadcasting Technology - A Practical Engineering Guide' deals with all the most important digital television, sound radio and multimedia standards such as MPEG, DVB, DVD, DAB, ATSC, T-DMB, DMB-T, DRM and ISDB-T. The book provides an in-depth look at these subjects in terms of practical experience. In addition it contains chapters on the basics of technologies such as analog television, digital modulation, COFDM or mathematical transformations between time and frequency domains. The attention in the respective field under discussion is focussed on aspects of measuring t

  10. Inexpensive Audio Activities: Earbud-based Sound Experiments

    Science.gov (United States)

    Allen, Joshua; Boucher, Alex; Meggison, Dean; Hruby, Kate; Vesenka, James

    2016-11-01

    Inexpensive alternatives to a number of classic introductory physics sound laboratories are presented including interference phenomena, resonance conditions, and frequency shifts. These can be created using earbuds, economical supplies such as Giant Pixie Stix® wrappers, and free software available for PCs and mobile devices. We describe two interference laboratories (beat frequency and two-speaker interference) and two resonance laboratories (quarter- and half-wavelength). Lastly, a Doppler laboratory using rotating earbuds is explained. The audio signal captured by all experiments is analyzed on free spectral analysis software and many of the experiments incorporate the unifying theme of measuring the speed of sound in air.
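
    The relations behind the resonance and beat laboratories named in this abstract are standard textbook formulas, sketched below under idealized assumptions (no end correction, and made-up tube lengths and frequencies rather than values from the article). The function names are hypothetical.

```python
def speed_from_quarter_wave(tube_length_m, fundamental_hz):
    """Tube closed at one end: L = wavelength / 4, so v = 4 * L * f."""
    return 4.0 * tube_length_m * fundamental_hz


def speed_from_half_wave(tube_length_m, fundamental_hz):
    """Tube open at both ends: L = wavelength / 2, so v = 2 * L * f."""
    return 2.0 * tube_length_m * fundamental_hz


def beat_frequency(f1_hz, f2_hz):
    """Two nearby tones beat at the difference of their frequencies."""
    return abs(f1_hz - f2_hz)


v_quarter = speed_from_quarter_wave(0.20, 429.0)  # illustrative 20 cm tube
v_half = speed_from_half_wave(0.40, 429.0)        # illustrative 40 cm tube
beats = beat_frequency(440.0, 444.0)
```

    With these illustrative inputs both resonance estimates come out near 343 m/s, the usual room-temperature value for the speed of sound in air.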

  11. Lost Audio Packets Steganography: The First Practical Evaluation

    CERN Document Server

    Mazurczyk, Wojciech

    2011-01-01

    This paper presents first experimental results for an IP telephony-based steganographic method called LACK (Lost Audio PaCKets steganography). This method utilizes the fact that in typical multimedia communication protocols like RTP (Real-Time Transport Protocol), excessively delayed packets are not used for the reconstruction of transmitted data at the receiver, i.e. these packets are considered useless and discarded. The results presented in this paper were obtained basing on a functional LACK prototype and show the method's impact on the quality of voice transmission. Achievable steganographic bandwidth for the different IP telephony codecs is also calculated.

  12. Audio Hijack Pro: The Universal Recorder

    Institute of Scientific and Technical Information of China (English)

    2004-01-01

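    Audio Hijack Pro is audio software developed by Rogue Amoeba. It is very powerful: it can record any sound your Mac can play, from streaming radio to DVD audio. It can also apply digital audio effects to any application, noticeably improving the sound of iTunes and QuickTime radio stations.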

  13. Differences between the Audio-lingual Method and the Communicative Approach

    Institute of Scientific and Technical Information of China (English)

    涂艳; 刘俊

    2016-01-01

    There are some differences between the two kinds of foreign language teaching methods. The Audio-lingual Method can help students gain control over grammatical structures as well as develop their oral ability, but its teaching focus is often on forms rather than functions, so students learn many structures or patterns without knowing how to use them appropriately in real situations. The aim of the Communicative Approach, by contrast, is to develop students' communicative competence, which includes both knowledge about the language and knowledge about how to use the language appropriately in communicative situations.

  14. Audio-haptic interaction in simulated walking experiences

    DEFF Research Database (Denmark)

    Serafin, Stefania

    2011-01-01

    In this paper an overview of the work conducted on audio-haptic physically based simulation and evaluation of walking is provided. This work has been performed in the context of the Natural Interactive Walking (NIW) project, whose goal is to investigate possibilities for the integrated and interchangeable use of the haptic and auditory modality in floor interfaces, and for the synergy of perception and action in capturing and guiding human walking. We describe the technology developed in the context of this project, together with some experiments performed to evaluate the role of auditory...

  15. The complete guide to high-end audio

    CERN Document Server

    Harley, Robert

    2015-01-01

    An updated edition of what many consider the "bible of high-end audio". In this newly revised and updated fifth edition, Robert Harley, editor in chief of the Absolute Sound magazine, tells you everything you need to know about buying and enjoying high-quality hi-fi. With this book, discover how to get the best sound for your money, how to identify the weak links in your system and upgrade where it will do the most good, how to set up and tweak your system for maximum performance, and how to become a more perceptive and appreciative listener. Just a few of the secrets you will learn cover hi

  16. A listening test system for automotive audio - listeners

    DEFF Research Database (Denmark)

    Choisel, Sylvain; Hegarty, Patrick; Christensen, Flemming;

    2007-01-01

    A series of experiments was conducted in order to validate an experimental procedure to perform listening tests on car audio systems in a simulation of the car environment in a laboratory, using binaural synthesis with head-tracking. Seven experts and 40 non-expert listeners rated a range of stimuli for 15 sound-quality attributes developed by the experts. This paper presents a comparison between the attribute ratings from the two groups of participants. Overall preference of the non-experts was also measured using direct ratings as well as indirect scaling based on paired comparisons...

  17. Automatic processing of CERN video, audio and photo archives

    Energy Technology Data Exchange (ETDEWEB)

    Kwiatek, M [CERN, Geneva (Switzerland)], E-mail: Michal.Kwiatek@cern.ch

    2008-07-15

    The digitalization of CERN audio-visual archives, a major task currently in progress, will generate over 40 TB of video, audio and photo files. Storing these files is one issue, but a far more important challenge is to provide long-time coherence of the archive and to make these files available on-line with minimum manpower investment. An infrastructure, based on standard CERN services, has been implemented, whereby master files, stored in the CERN Distributed File System (DFS), are discovered and scheduled for encoding into lightweight web formats based on predefined profiles. Changes in master files, conversion profiles or in the metadata database (read from CDS, the CERN Document Server) are automatically detected and the media re-encoded whenever necessary. The encoding processes are run on virtual servers provided on-demand by the CERN Server Self Service Centre, so that new servers can be easily configured to adapt to higher load. Finally, the generated files are made available from the CERN standard web servers with streaming implemented using Windows Media Services.

  18. Speech and audio processing for coding, enhancement and recognition

    CERN Document Server

    Togneri, Roberto; Narasimha, Madihally

    2015-01-01

    This book describes the basic principles underlying the generation, coding, transmission and enhancement of speech and audio signals, including advanced statistical and machine learning techniques for speech and speaker recognition with an overview of the key innovations in these areas. Key research undertaken in speech coding, speech enhancement, speech recognition, emotion recognition and speaker diarization is also presented, along with recent advances and new paradigms in these areas. Offers readers a single-source reference on the significant applications of speech and audio processing to speech coding, speech enhancement and speech/speaker recognition. Enables readers involved in algorithm development and implementation issues for speech coding to understand the historical development and future challenges in speech coding research. Discusses speech coding methods yielding bit-streams that are multi-rate and scalable for Voice-over-IP (VoIP) Networks...

  19. Information-Driven Active Audio-Visual Source Localization.

    Directory of Open Access Journals (Sweden)

    Niclas Schult

    We present a system for sensorimotor audio-visual source localization on a mobile robot. We utilize a particle filter for the combination of audio-visual information and for the temporal integration of consecutive measurements. Although the system only measures the current direction of the source, the position of the source can be estimated because the robot is able to move and can therefore obtain measurements from different directions. These actions by the robot successively reduce uncertainty about the source's position. An information gain mechanism is used for selecting the most informative actions in order to minimize the number of actions required to achieve accurate and precise position estimates in azimuth and distance. We show that this mechanism is an efficient solution to the action selection problem for source localization, and that it is able to produce precise position estimates despite simplified unisensory preprocessing. Because of the robot's mobility, this approach is suitable for use in complex and cluttered environments. We present qualitative and quantitative results of the system's performance and discuss possible areas of application.

  20. Audio Editing Skills

    Institute of Scientific and Technical Information of China (English)

    范炜; 杨澍彬; 谭忠凯

    2015-01-01

    With the vigorous development of film and television drama in recent years, related fields that were once overlooked have gradually received more attention, and dedicated research is needed to better serve film and television production. Drawing on many years of production experience, this paper analyzes the audio production techniques used in several outstanding film and television dramas to illustrate the importance of audio editing in film and television production.

  1. Audio-tactile integration and the influence of musical training.

    Directory of Open Access Journals (Sweden)

    Anja Kuchenbuch

    Perception of our environment is a multisensory experience; information from different sensory systems like the auditory, visual and tactile is constantly integrated. Complex tasks that require high temporal and spatial precision of multisensory integration put strong demands on the underlying networks but it is largely unknown how task experience shapes multisensory processing. Long-term musical training is an excellent model for brain plasticity because it shapes the human brain at functional and structural levels, affecting a network of brain areas. In the present study we used magnetoencephalography (MEG) to investigate how audio-tactile perception is integrated in the human brain and if musicians show enhancement of the corresponding activation compared to non-musicians. Using a paradigm that allowed the investigation of combined and separate auditory and tactile processing, we found a multisensory incongruency response, generated in frontal, cingulate and cerebellar regions, an auditory mismatch response generated mainly in the auditory cortex and a tactile mismatch response generated in frontal and cerebellar regions. The influence of musical training was seen in the audio-tactile as well as in the auditory condition, indicating enhanced higher-order processing in musicians, while the sources of the tactile MMN were not influenced by long-term musical training. Consistent with the predictive coding model, more basic, bottom-up sensory processing was relatively stable and less affected by expertise, whereas areas for top-down models of multisensory expectancies were modulated by training.

  2. Audio watermarking technologies for automatic cue sheet generation systems

    Science.gov (United States)

    Caccia, Giuseppe; Lancini, Rosa C.; Pascarella, Annalisa; Tubaro, Stefano; Vicario, Elena

    2001-08-01

    A watermark is usually used as a way of hiding information in digital media. The watermarked information may be used to allow copyright protection or user and media identification. In this paper we propose a watermarking scheme for digital audio signals that allows automatic identification of musical pieces transmitted in TV broadcasting programs. In our application the watermark must, obviously, be imperceptible to the users, should be robust to standard TV and radio editing, and should have a very low complexity. This last item is essential to allow a real-time software implementation of the insertion and detection of watermarks using only a minimum amount of the computation power of a modern PC. In the proposed method the input audio sequence is subdivided into frames. For each frame a spread-spectrum watermark sequence is added to the original data. A two-step filtering procedure is used to generate the watermark from a Pseudo-Noise (PN) sequence. The filters approximate, respectively, the threshold and the frequency masking of the Human Auditory System (HAS). In the paper we first discuss the watermark embedding system and then the detection approach. The results of a large set of subjective tests are also presented to demonstrate the quality and robustness of the proposed approach.
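
    The frame-wise spread-spectrum idea in this abstract can be sketched in a few lines. This is a simplified illustration, not the authors' system: the PN sequence is added at a fixed gain, the two-step perceptual filtering is omitted entirely, and detection is a plain correlation against a threshold. The function names, the key, the gain `alpha`, and the synthetic host frame are all assumptions.

```python
import random


def pn_sequence(key, n):
    """Pseudo-noise sequence of +1/-1 values, reproducible from a shared key."""
    rng = random.Random(key)
    return [rng.choice((-1.0, 1.0)) for _ in range(n)]


def embed(frame, key, alpha):
    """Add the scaled PN sequence to one audio frame."""
    pn = pn_sequence(key, len(frame))
    return [x + alpha * p for x, p in zip(frame, pn)]


def correlate(frame, key):
    """Normalized correlation between a frame and the PN sequence for this key."""
    pn = pn_sequence(key, len(frame))
    return sum(x * p for x, p in zip(frame, pn)) / len(frame)


def detect(frame, key, alpha):
    """Declare the watermark present if the correlation exceeds half the gain."""
    return correlate(frame, key) > alpha / 2


host_rng = random.Random(0)
host = [host_rng.uniform(-1.0, 1.0) for _ in range(4096)]  # synthetic host frame
marked = embed(host, key=42, alpha=0.2)  # gain chosen for clarity, not inaudibility
```

    Since each PN value squares to one, embedding raises the correlation by exactly `alpha`, while the host signal alone correlates only weakly with the sequence. A practical system would shape the watermark with a masking model so that a much smaller effective gain still survives editing.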

  3. Preservation and Digital Backup Transfer of Archived Audio-visual Materials

    Institute of Scientific and Technical Information of China (English)

    李浚

    2011-01-01

    Audio-visual materials are classified by carrier form and by the technical characteristics of the storage medium, and a corresponding preservation method is proposed for each type of archived material. Because carriers cannot be stored indefinitely and much early playback equipment is about to be phased out, many valuable audio-visual materials face becoming unusable; the paper therefore argues that digitizing them is urgent. Finally, it provides detailed methods for the digital transfer of audio-visual materials.

  4. Interactive 3D audio: Enhancing awareness of details in immersive soundscapes?

    DEFF Research Database (Denmark)

    Schmidt, Mikkel Nørgaard; Schwartz, Stephen; Larsen, Jan

    2012-01-01

    and presented in either mono, stereo, 3D, or interactive 3D, and performance was evaluated by asking factual questions about details in the audio. Results show that spatial cues can increase attention to background sounds while reducing attention to narrated text, indicating that spatial audio can...

  5. LiveDescribe: Can Amateur Describers Create High-Quality Audio Description?

    Science.gov (United States)

    Branje, Carmen J.; Fels, Deborah I.

    2012-01-01

    Introduction: The study presented here evaluated the usability of the audio description software LiveDescribe and explored the acceptance rates of audio description created by amateur describers who used LiveDescribe to facilitate the creation of their descriptions. Methods: Twelve amateur describers with little or no previous experience with…

  6. Investigating Expectations and Experiences of Audio and Written Assignment Feedback in First-Year Undergraduate Students

    Science.gov (United States)

    Fawcett, Hannah; Oldfield, Jeremy

    2016-01-01

    Previous research suggests that audio feedback may be an important mechanism for facilitating effective and timely assignment feedback. The present study examined expectations and experiences of audio and written feedback provided through "turnitin for iPad®" from students within the same cohort and assignment. The results showed that…

  7. Overview of the 2015 Workshop on Speech, Language and Audio in Multimedia

    NARCIS (Netherlands)

    Gravier, Guillaume; Jones, Gareth J.F.; Larson, Martha; Ordelman, Roeland

    2015-01-01

    The Workshop on Speech, Language and Audio in Multimedia (SLAM) positions itself at the crossroads of multiple scientific fields - music and audio processing, speech processing, natural language processing and multimedia - to discuss and stimulate research results, projects, datasets and benchmark

  8. A Preliminary Investigation into the Search Behaviour of Users in a Collection of Digitized Broadcast Audio

    DEFF Research Database (Denmark)

    Lund, Haakon; Skov, Mette; Larsen, Birger;

    2014-01-01

    An increasing number of large digitized audio-visual collections within digital humanities have recently been made available for users. Often access to digitized audio-visual collections is hampered by little and inconsistent metadata. This paper presents the preliminary findings from a study of ...

  9. 47 CFR Figure 2 to Subpart N of... - Typical Audio Wave

    Science.gov (United States)

    2010-10-01

    ... 47 Telecommunication 1 2010-10-01 2010-10-01 false Typical Audio Wave 2 Figure 2 to Subpart N of Part 2 Telecommunication FEDERAL COMMUNICATIONS COMMISSION GENERAL FREQUENCY ALLOCATIONS AND RADIO... Audio Wave EC03JN91.006...

  10. An Exploratory Evaluation of User Interfaces for 3D Audio Mixing

    DEFF Research Database (Denmark)

    Gelineck, Steven; Korsgaard, Dannie Michael

    2015-01-01

    The paper presents an exploratory evaluation comparing different versions of a mid-air gesture based interface for mixing 3D audio exploring: (1) how such an interface generally compares to a more traditional physical interface, (2) methods for grabbing/releasing audio channels in mid-air and (3)...

  11. Changes of the Prefrontal EEG (Electroencephalogram) Activities According to the Repetition of Audio-Visual Learning.

    Science.gov (United States)

    Kim, Yong-Jin; Chang, Nam-Kee

    2001-01-01

    Investigates the changes of neuronal response according to a four time repetition of audio-visual learning. Obtains EEG data from the prefrontal (Fp1, Fp2) lobe from 20 subjects at the 8th grade level. Concludes that the habituation of neuronal response shows up in repetitive audio-visual learning and brain hemisphericity can be changed by…

  12. 47 CFR 73.4275 - Tone clusters; audio attention-getting devices.

    Science.gov (United States)

    2010-10-01

    ... 47 Telecommunication 4 2010-10-01 2010-10-01 false Tone clusters; audio attention-getting devices. 73.4275 Section 73.4275 Telecommunication FEDERAL COMMUNICATIONS COMMISSION (CONTINUED) BROADCAST... clusters; audio attention-getting devices. See Public Notice, FCC 76-610, dated July 2, 1976. 60 FCC 2d...

  13. Audio Feedback: Richer Language but No Measurable Impact on Student Performance

    Science.gov (United States)

    Chalmers, Charlotte; MacCallum, Janis; Mowat, Elaine; Fulton, Norma

    2014-01-01

    Audio feedback has been shown to be popular and well received by students. However, there is little published work to indicate how effective audio feedback is in improving student performance. Sixty students from a first year science degree agreed to take part in the study; thirty were randomly assigned to receive written feedback on coursework,…

  14. Audio-video decision support for patients: the documentary genre as a basis for decision aids

    NARCIS (Netherlands)

    Volandes, A.E.; Barry, M.J.; Wood, F.; Elwyn, G.

    2013-01-01

    Objective Decision support tools are increasingly using audio-visual materials. However, disagreement exists about the use of audio-visual materials as they may be subjective and biased. Methods This is a literature review of the major texts for documentary film studies to extrapolate issues of obje

  15. A Management Review and Analysis of Purdue University Libraries and Audio-Visual Center.

    Science.gov (United States)

    Baaske, Jan; And Others

    A management review and analysis was conducted by the staff of the libraries and audio-visual center of Purdue University. Not only were the study team and the eight task forces drawn from all levels of the libraries and audio-visual center staff, but a systematic effort was sustained through inquiries, draft reports and open meetings to involve…

  16. DOUBLE-BOOST DC-AC CONVERTER WITH SLIDING-MODE CONTROL FOR PORTABLE AUDIO

    DEFF Research Database (Denmark)

    Bolten Maizonave, Gert; Andersen, Michael Andreas E.; Kjærgaard, Claus;

    2009-01-01

    The double-boost topology is studied for operation as a dc-ac converter and single stage audio amplifier. A sliding-mode controller is designed in order to achieve fast enough response for the whole audio frequency range. Symmetric, asymmetric and interleaved operation modes are analyzed....

  17. Conflicting audio-haptic feedback in physically based simulation of walking sounds

    DEFF Research Database (Denmark)

    Turchet, Luca; Serafin, Stefania; Dimitrov, Smilen

    2010-01-01

    We describe an audio-haptic experiment conducted using a system which simulates in real-time the auditory and haptic sensation of walking on different surfaces. The system is based on physical models, that drive both the haptic and audio synthesizers, and a pair of shoes enhanced with sensors...

  18. Approaches to building single-stage AC/AC conversion switch-mode audio power amplifiers

    DEFF Research Database (Denmark)

    Ljusev, Petar; Andersen, Michael Andreas E.

    2004-01-01

    This paper discusses the possible topologies and promising approaches towards direct single-phase AC-AC conversion of the mains voltage for audio applications. When compared to standard Class-D switching audio power amplifiers with a separate power supply, it is expected that direct conversion...

  19. Passive Guaranteed Simulation of Analog Audio Circuits: A Port-Hamiltonian Approach

    Directory of Open Access Journals (Sweden)

    Antoine Falaize

    2016-09-01

    We present a method that generates passive-guaranteed stable simulations of analog audio circuits from electronic schematics for real-time purposes. On the one hand, this method is based on a continuous-time power-balanced state-space representation structured into its energy-storing parts, dissipative parts, and external sources. On the other hand, a numerical scheme is specially designed to preserve this structure and the power balance. These state-space structures define the class of port-Hamiltonian systems. The derivation of this structured system associated with the electronic circuit is achieved by an automated analysis of the interconnection network combined with a dictionary of models for each elementary component. The numerical scheme is based on the combination of finite differences applied to the state (with respect to the time variable) and to the total energy (with respect to the state). This combination provides a discrete-time version of the power balance. This set of algorithms is valid for both the linear and the nonlinear case. Finally, three applications of increasing complexity are given: a diode clipper, a common-emitter bipolar-junction transistor amplifier, and a wah pedal. The results are compared to offline simulations obtained from a popular circuit simulator.

  20. Perceptual Hashing Algorithm for Multi-Format Audio

    Institute of Scientific and Technical Information of China (English)

    张秋余; 省鹏飞; 黄羿博; 董瑞洪; 杨仲平

    2016-01-01

    A novel multi-format audio perceptual hashing algorithm based on the dual-tree complex wavelet transform (DT-CWT) is proposed. It addresses the shortcomings of existing audio authentication algorithms, which handle only a single audio format, are not general, and are inefficient. The algorithm first applies a global DT-CWT to the pre-processed audio signal to obtain the real and complex wavelet coefficients, which are each partitioned into the same number of frames. For the real wavelet coefficients, the modulus of the Teager energy operator of each frame is computed as the inter-frame feature; each frame is then divided again into sub-frames, and the short-time energy of the sub-frames is extracted as the intra-frame feature. For the complex wavelet coefficients, the entropy of each frame is computed as the inter-frame feature. Finally, a hash is constructed from each of these features to generate the perceptual hash sequence. Experiments show that the algorithm is strongly robust for audio in five different formats, discriminates well, is efficient, and can detect small-scale tampering.
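
    Two of the per-frame features named above, the Teager energy operator and the short-time energy, can be sketched as follows. This is an illustration under stated assumptions only: the DT-CWT step is omitted (any list of coefficients can be passed in), the short-time energy is computed per frame rather than per sub-frame, and the `binarize` hash step is a hypothetical placeholder, since the abstract does not say how the hash is built.

```python
import math

def teager_energy(x):
    """Discrete Teager energy operator: psi[n] = x[n]^2 - x[n-1] * x[n+1]."""
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]

def frames(x, size):
    """Split x into consecutive non-overlapping frames, dropping any remainder."""
    return [x[i:i + size] for i in range(0, len(x) - size + 1, size)]

def frame_features(coeffs, frame_size):
    """Per-frame mean absolute Teager energy and per-frame short-time energy."""
    teo, ste = [], []
    for f in frames(coeffs, frame_size):
        psi = teager_energy(f)
        teo.append(sum(abs(v) for v in psi) / len(psi))
        ste.append(sum(v * v for v in f))
    return teo, ste

def binarize(feature):
    """Hypothetical hash step: bit is 1 where the feature rises between frames."""
    return [1 if b > a else 0 for a, b in zip(feature, feature[1:])]

# Stand-in for wavelet coefficients: a pure cosine with 0.3 rad/sample frequency.
signal = [math.cos(0.3 * n) for n in range(64)]
teo, ste = frame_features(signal, 16)
bits = binarize(ste)
```

    For a pure sinusoid of amplitude A and angular frequency w, the Teager operator returns the constant A^2 sin^2(w), so it responds to both amplitude and frequency, unlike the short-time energy.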

  1. Situative Space Tracking within Smart Environments

    DEFF Research Database (Denmark)

    Surie, Dipak; Jäckel, Florian; Janlert, Lars-Erik

    2010-01-01

    that positions objects within individual situative spaces (without tracking their absolute positions) distributed across multiple modalities like vision, audio, and touch is presented. As a proof-of-concept, a preliminary evaluation of the tracking system was performed by two subjects within a living...

  2. The Present Space-Time Motion and Deformation Features of the Northeastern Margin of the Qinghai-Xizang(Tibet) Block and Its Adjacent Area

    Institute of Scientific and Technical Information of China (English)

    Zhang Xiaoliang; Jiang Zaisen; Wang Shuangxu; Zhang Xi; Wang Qi; Chen Bing

    2004-01-01

    On the basis of Discontinuous Deformation Analysis (DDA), and considering the moderate intrusion of specific block boundaries to different extents, a first-order block motion model is established for the northeastern margin of the Qinghai-Xizang (Tibet) block, together with a kinematic model depicting the deformation of small regions, using GPS observations from three periods (1991, 1999 and 2001). By simulation, we obtained the motion features of the first-order blocks between the large WWN faults on the sides of the studied region, the distribution features of the principal strain rate field, and the inhomogeneous space-time motion features of the faults on the northern boundary of the Qinghai-Xizang (Tibet) block.

  3. Real-time Covert Communications Channel for Audio Signals

    Directory of Open Access Journals (Sweden)

    Ashraf Seleym

    2012-09-01

    A covert communications channel is a type of secure communication that allows information to be transferred between entities while hiding the contents of the channel. Multimedia data hiding techniques can be used to establish a covert channel for secret communications within a media carrier. In this paper, a high-rate covert communications channel is developed that exploits an audio stream as the carrier signal, using multiple embedding in the Quantization Index Modulation framework. The proposed approach uses multiple quantization vectors to increase the data transmission rate. The embedding algorithms treat the embedding process as a communications problem and use a structured scheme of Multiple Trellis-Coded Quantization combined with Multiple Trellis-Coded Modulation. Trellis coding based on convolutional codes enables real-time communication, because the data can be encoded and decoded continuously. The proposed approach exhibits a high channel capacity owing to the increased data embedding rate, without a severe increase in embedding distortion.
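
    The Quantization Index Modulation framework mentioned above can be illustrated in its simplest scalar form. This is a sketch under stated assumptions: it embeds one bit per sample with two interleaved uniform quantizers and does not implement the trellis-coded quantization or modulation of the paper; the step size, sample values, and noise levels are hypothetical.

```python
def qim_embed(x, bit, step=0.1):
    """Quantize x onto the lattice for `bit`: multiples of step, shifted by step/2 for bit 1."""
    offset = bit * step / 2.0
    return round((x - offset) / step) * step + offset

def qim_extract(y, step=0.1):
    """Decode the bit whose lattice has the point nearest to y."""
    d0 = abs(y - qim_embed(y, 0, step))
    d1 = abs(y - qim_embed(y, 1, step))
    return 0 if d0 <= d1 else 1

# Hypothetical audio samples and payload bits.
samples = [0.123, -0.456, 0.789, 0.012, -0.333, 0.501]
payload = [1, 0, 1, 1, 0, 0]
marked = [qim_embed(x, b) for x, b in zip(samples, payload)]

# Channel noise smaller than step/4 cannot move a sample to the other lattice.
noise = [0.02, -0.02, 0.01, -0.015, 0.02, -0.01]
received = [m + n for m, n in zip(marked, noise)]
decoded = [qim_extract(y) for y in received]
```

    The step size sets the trade-off the abstract refers to: a larger step tolerates more channel noise but increases embedding distortion, which is bounded by half a step per sample.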

  4. Unsupervised incremental online learning and prediction of musical audio signals

    DEFF Research Database (Denmark)

    Marxer, Richard; Purwins, Hendrik

    2016-01-01

    Guided by the idea that musical human-computer interaction may become more effective, intuitive, and creative when basing its computer part on cognitively more plausible learning principles, we employ unsupervised incremental online learning (i.e. clustering) to build a system that predicts the next event in a musical sequence, given as audio input. The flow of the system is as follows: 1) segmentation by onset detection, 2) timbre representation of each segment by Mel frequency cepstrum coefficients, 3) discretization by incremental clustering, yielding a tree of different sound classes (e.g. timbre categories/instruments) that can grow or shrink on the fly driven by the instantaneous sound events, resulting in a discrete symbol sequence, 4) extraction of statistical regularities of the symbol sequence, using hierarchical N-grams and the newly introduced conceptual Boltzmann machine...
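
    The last stage of the pipeline, predicting the next symbol from statistical regularities, can be sketched with an online bigram counter. This is a simplification under stated assumptions: it stands in for the hierarchical N-grams only, does not model the conceptual Boltzmann machine, and takes the symbol sequence as given rather than deriving it from onset detection and clustering. The class name and the symbols are hypothetical.

```python
from collections import Counter, defaultdict

class BigramPredictor:
    """Online next-symbol predictor that counts symbol-to-symbol transitions."""

    def __init__(self):
        self.counts = defaultdict(Counter)  # counts[a][b] = times b followed a
        self.prev = None

    def update(self, symbol):
        """Consume one symbol and update the transition counts incrementally."""
        if self.prev is not None:
            self.counts[self.prev][symbol] += 1
        self.prev = symbol

    def predict(self):
        """Return the most frequent follower of the last symbol, or None if unseen."""
        followers = self.counts.get(self.prev)
        if not followers:
            return None
        return followers.most_common(1)[0][0]

# Hypothetical symbol stream, e.g. cluster labels for kick, snare, hi-hat.
predictor = BigramPredictor()
for symbol in ["K", "S", "H", "K", "S", "H", "K", "S"]:
    predictor.update(symbol)
next_symbol = predictor.predict()
```

    Since the counts are updated one event at a time, the model can keep learning while it predicts, which matches the online setting described in the abstract.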

  5. Real Time Recognition Of Speakers From Internet Audio Stream

    Directory of Open Access Journals (Sweden)

    Weychan Radoslaw

    2015-09-01

    In this paper we present an automatic speaker recognition technique that uses lossy (encoded) speech signal streams from Internet radio. We show the influence of the audio encoder (e.g., bitrate) on speaker model quality. The model of each speaker was calculated using the Gaussian mixture model (GMM) approach. Both the speaker recognition and the further analysis were carried out on short utterances to facilitate real-time processing. The neighborhoods of the speaker models were analyzed using the ISOMAP algorithm. The experiments were based on four 1-hour public debates with 7–8 speakers (including the moderator), acquired from Polish radio Internet services. The presented software was developed in the MATLAB environment.

  6. Detection of vibrations in the audio range using photorefractive polymers

    Science.gov (United States)

    Mansurova, S.; Espinosa, M.; Rodriguez, P.; Gather, M.; Meerholz, K.

    2006-08-01

    We report on the use of a photorefractive polymer composite as the active material for a planar photo-EMF detector suitable for the adaptive detection of optical phase-modulated signals in the audio range (10 Hz-10 kHz). The composite is based on a conjugated triphenyldiamine-phenylenevinylene polymer (TPD-PPV) and is sensitized with a highly soluble fullerene derivative (PCBM). We demonstrate experimentally that the responsivity of such polymer-based detectors can be remarkably enhanced if the polymer sample is biased by an external dc field. This effect is theoretically explained by the strong dependence of the charge carrier generation rate on the external dc field, which is an inherent property of organic photoconductors.

  7. Audio collection in the SASA Institute of Musicology

    Directory of Open Access Journals (Sweden)

    Lajić-Mihajlović Danka

    2010-01-01

    The paper concerns the audio collection of the Institute of Musicology SASA, an extremely important part of the institution’s holdings. The collection comprises valuable sound materials, notably significant collections of field recordings of traditional folk and church music, as well as recordings of works by 19th- and 20th-century Serbian composers. Information on the sound carriers, the methodologies and circumstances in which the recordings were made, and their preservation and further treatment with modern technologies forms part of the history of ethnomusicology and musicology in Serbia. In terms of the number of sound recordings, the time span and geographical areas they cover, and their genre diversity, this collection is one of the most important scholarly sound collections in Serbia.

  8. Audio- and TV-products. Power consumption reduction in audio- and TV-products. Final report

    Energy Technology Data Exchange (ETDEWEB)

    Kierkegaard, P.

    1998-10-01

    The project on audio products resulted in energy savings of 90-97% at efficiencies of 91-96% at full power, with stand-by losses of 0.4-3 W. The breakthrough came chiefly from new methods for pulse modulation (Controlled Oscillation Modulator, COM, and Phase Shifted Carrier Pulse Width Modulation, PSCPWM) and for error correction in the power conversion (Multivariable Enhanced Cascade Control, MECC, and Pulse Edge Delay Error Correction, PEDEC). Two patents have been applied for, and new digital amplifiers will be introduced in all the relevant products. The project on TV products showed that a loss reduction of about 20% in the deflection circuits may be obtained. (EHS)

  9. The audio-visual revolution: do we really need it?

    Science.gov (United States)

    Townsend, I

    1979-03-01

    In the United Kingdom, the audio-visual revolution has steadily gained converts in the nursing profession. Nurse tutor courses now contain information on the techniques of educational technology, and schools of nursing increasingly own (or wish to own) many of the sophisticated electronic teaching aids that abound. This is taking place at a time of previously unexperienced crisis and change. Funds have been or are being made available to buy audio-visual equipment, but its purchase and use relies on satisfying personal whim, prejudice or educational fashion, not on considerations of educational efficiency. In the rush of enthusiasm, the overwhelmed teacher (everywhere; the phenomenon is not confined to nursing) forgets to ask the searching, critical questions: 'Why should we use this aid?', 'How effective is it?', 'And, at what?'. Influential writers in this profession have repeatedly called for a more responsible attitude towards the published research work of other fields. In an attempt to discover what is known about the answers to this group of questions, an eclectic look at media research is taken and the widespread dissatisfaction existing amongst international educational technologists is noted. The paper isolates from the literature several causative factors responsible for the present state of affairs. Findings from the field of educational television are cited as representative of an aid which has had a considerable amount of time and research directed at it. The concluding part of the paper shows that the decisions involved in using or not using educational media are more complicated than might at first appear.

  10. Effects of virtual speaker density and room reverberation on spatiotemporal thresholds of audio-visual motion coherence.

    Directory of Open Access Journals (Sweden)

    Narayan Sankaran

    The present study examined the effects of spatial sound-source density and reverberation on the spatiotemporal window for audio-visual motion coherence. Three different acoustic stimuli were generated in Virtual Auditory Space: two acoustically "dry" stimuli via the measurement of anechoic head-related impulse responses recorded at either 1° or 5° spatial intervals (Experiment 1), and a reverberant stimulus rendered from binaural room impulse responses recorded at 5° intervals in situ in order to capture reverberant acoustics in addition to head-related cues (Experiment 2). A moving visual stimulus with invariant localization cues was generated by sequentially activating LEDs along the same radial path as the virtual auditory motion. Stimuli were presented at 25°/s, 50°/s and 100°/s with a random spatial offset between audition and vision. In a 2AFC task, subjects made a judgment of the leading modality (auditory or visual). No significant differences were observed in the spatial threshold based on the point of subjective equivalence (PSE) or the slope of psychometric functions (β) across all three acoustic conditions. Additionally, neither the PSE nor β differed significantly across velocity, suggesting a fixed spatial window of audio-visual separation. Findings suggest that there was no loss in spatial information accompanying the reduction in spatial cues and reverberation levels tested, and establish a perceptual measure for assessing the veracity of motion generated from discrete locations and in echoic environments.

  11. Development of an audio-based virtual gaming environment to assist with navigation skills in the blind.

    Science.gov (United States)

    Connors, Erin C; Yazzolino, Lindsay A; Sánchez, Jaime; Merabet, Lotfi B

    2013-03-27

    Audio-based Environment Simulator (AbES) is virtual environment software designed to improve real world navigation skills in the blind. Using only audio based cues and set within the context of a video game metaphor, users gather relevant spatial information regarding a building's layout. This allows the user to develop an accurate spatial cognitive map of a large-scale three-dimensional space that can be manipulated for the purposes of a real indoor navigation task. After game play, participants are then assessed on their ability to navigate within the target physical building represented in the game. Preliminary results suggest that early blind users were able to acquire relevant information regarding the spatial layout of a previously unfamiliar building as indexed by their performance on a series of navigation tasks. These tasks included path finding through the virtual and physical building, as well as a series of drop off tasks. We find that the immersive and highly interactive nature of the AbES software appears to greatly engage the blind user to actively explore the virtual environment. Applications of this approach may extend to larger populations of visually impaired individuals.

  12. Study on infrared radiation feature of near space hypersonic missile

    Institute of Scientific and Technical Information of China (English)

    张海林; 周林; 左文博; 范奇; 谭西江

    2015-01-01

    Studying the infrared radiation characteristics of near-space hypersonic missiles is important for enabling counter-near-space weapon systems to detect and monitor near-space targets. Based on a study of the test flight of the near-space hypersonic missile X-51A, the infrared radiation characteristics of hypersonic missiles are analyzed in depth and an infrared radiation model is established. The missile skin, engine, and exhaust plume are treated as the primary infrared radiation sources of a hypersonic missile. Taking the X-51A test vehicle as a reference, the infrared radiation intensity of near-space hypersonic missiles is calculated in different directions in the 3-5 μm and 8-14 μm wavebands, and the results are analyzed.
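
    The kind of band-limited calculation referred to above can be illustrated by integrating Planck's law over the 3-5 μm and 8-14 μm wavebands. This is a sketch under strong assumptions and is not the paper's radiation model: the source is treated as an ideal blackbody (emissivity 1), atmospheric attenuation and viewing geometry are ignored, the result is a radiance rather than an intensity, and the two temperatures are hypothetical values chosen only to contrast a hot and a cool surface.

```python
import math

H = 6.62607015e-34   # Planck constant, J s
C = 2.99792458e8     # speed of light, m/s
KB = 1.380649e-23    # Boltzmann constant, J/K

def planck_radiance(wavelength_m, temp_k):
    """Blackbody spectral radiance in W / (m^2 sr m) at one wavelength."""
    a = 2.0 * H * C ** 2 / wavelength_m ** 5
    b = math.expm1(H * C / (wavelength_m * KB * temp_k))
    return a / b

def band_radiance(lo_um, hi_um, temp_k, steps=2000):
    """Integrate spectral radiance over a waveband (trapezoidal rule), W / (m^2 sr)."""
    lo, hi = lo_um * 1e-6, hi_um * 1e-6
    h = (hi - lo) / steps
    total = 0.5 * (planck_radiance(lo, temp_k) + planck_radiance(hi, temp_k))
    for i in range(1, steps):
        total += planck_radiance(lo + i * h, temp_k)
    return total * h

# Hypothetical temperatures: a hot surface (1000 K) and a cool one (300 K).
hot_mid = band_radiance(3.0, 5.0, 1000.0)
hot_long = band_radiance(8.0, 14.0, 1000.0)
cool_mid = band_radiance(3.0, 5.0, 300.0)
cool_long = band_radiance(8.0, 14.0, 300.0)
```

    Under these assumptions the hot surface radiates more in the 3-5 μm band and the cool surface more in the 8-14 μm band, which is one reason both wavebands are commonly considered together.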

  13. APPLICATION OF PARTIAL LEAST SQUARES REGRESSION FOR AUDIO-VISUAL SPEECH PROCESSING AND MODELING

    Directory of Open Access Journals (Sweden)

    A. L. Oleinik

    2015-09-01

    Subject of Research. The paper deals with the problem of reconstructing lip region images from a speech signal by means of Partial Least Squares regression. Such problems arise in the development of audio-visual speech processing methods. Audio-visual speech consists of acoustic and visual components (called modalities). Applications of audio-visual speech processing methods include joint modeling of voice and lip movement dynamics, synchronization of audio and video streams, emotion recognition, and liveness detection. Method. Partial Least Squares regression was applied to solve the posed problem. This method extracts components of the initial data with high covariance, and these components are used to build a regression model. The advantage of this approach is that it achieves two goals: identification of latent interrelations between initial data components (e.g. speech signal and lip region image) and approximation of one initial data component as a function of another. Main Results. Experiments on reconstructing lip region images from speech signals were carried out on the VidTIMIT audio-visual speech database. The results showed that Partial Least Squares regression is capable of solving the reconstruction problem. Practical Significance. The findings indicate that Partial Least Squares regression is applicable to a wide variety of audio-visual speech processing problems, from synchronization of audio and video streams to liveness detection.

  14. Approach of vehicle plate extraction based on HSV color space and SIFT feature

    Institute of Scientific and Technical Information of China (English)

    杨涛; 张森林

    2011-01-01

    To overcome the long execution time and high mismatch rate that arise when the SIFT algorithm is applied directly to vehicle plate extraction, a two-stage plate extraction algorithm based on the HSV color space and SIFT features is proposed. The HSV color space is first used to determine candidate plate regions for fast coarse localization; the SIFT algorithm is then applied to the candidate regions for precise localization and tilt correction, and the Chinese character on the plate is recognized in the same step. This approach not only reduces the computation required for SIFT features but also avoids interference from complex backgrounds in SIFT feature matching, greatly improving matching accuracy. Programming experiments confirm that the algorithm performs well.
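
    The coarse localization stage can be sketched as a color threshold in HSV space followed by a bounding box around the matching pixels. This is an illustration only: the abstract does not give the plate color or the thresholds, so the "plate blue" hue, saturation, and value limits below are hypothetical, and the SIFT stage is not shown.

```python
import colorsys

def is_plate_blue(r, g, b, h_range=(0.55, 0.72), s_min=0.4, v_min=0.3):
    """True if an 8-bit RGB pixel falls inside a hypothetical 'plate blue' HSV box."""
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    return h_range[0] <= h <= h_range[1] and s >= s_min and v >= v_min

def candidate_box(image):
    """Bounding box (top, left, bottom, right) of matching pixels, or None if no match."""
    hits = [(y, x) for y, row in enumerate(image)
            for x, px in enumerate(row) if is_plate_blue(*px)]
    if not hits:
        return None
    ys = [y for y, _ in hits]
    xs = [x for _, x in hits]
    return min(ys), min(xs), max(ys), max(xs)

# Hypothetical 6x8 image: gray background with a blue patch at rows 2-3, columns 3-6.
gray, blue = (128, 128, 128), (0, 0, 200)
image = [[blue if 2 <= y <= 3 and 3 <= x <= 6 else gray for x in range(8)]
         for y in range(6)]
box = candidate_box(image)
```

    Restricting the expensive feature matching to such a box is what reduces the SIFT workload in a two-stage design; a practical version would also need connected-component analysis to separate several blue regions.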

  15. Digital Audio Technology Editing Software Design Construction

    Institute of Scientific and Technical Information of China (English)

    赵美玉

    2014-01-01

    With the rapid development of computer technology and digital audio technology, computer and electronic technology have penetrated all walks of life, and a large amount of convenient software for different purposes has emerged; music production is no exception. In today's fast-moving digital age, MIDI functionality has already reached a very mature level with existing technology. Computer music gives imaginative people a broad creative space. As the digitization of radio stations advances rapidly, broadcast audio editing software, which combines computer technology with digital audio technology, has been widely adopted by radio stations in China. We therefore believe that future digital broadcasting can provide higher-quality multi-channel digital audio systems with more channels. This paper studies and analyzes the special effects of audio editing software and the design and construction of digital audio production software.

  16. Using listener-based perceptual features as intermediate representations in music information retrieval.

    Science.gov (United States)

    Friberg, Anders; Schoonderwaldt, Erwin; Hedblad, Anton; Fabiani, Marco; Elowsson, Anders

    2014-10-01

    The notion of perceptual features is introduced for describing general music properties based on human perception. This is an attempt at rethinking the concept of features, aiming to approach the underlying human perception mechanisms. Instead of using concepts from music theory such as tones, pitches, and chords, a set of nine features describing overall properties of the music was selected. They were chosen from qualitative measures used in psychology studies and motivated from an ecological approach. The perceptual features were rated in two listening experiments using two different data sets. They were modeled both from symbolic and audio data using different sets of computational features. Ratings of emotional expression were predicted using the perceptual features. The results indicate that (1) at least some of the perceptual features are reliable estimates; (2) emotion ratings could be predicted by a small combination of perceptual features with an explained variance from 75% to 93% for the emotional dimensions activity and valence; (3) the perceptual features could only to a limited extent be modeled using existing audio features. Results clearly indicated that a small number of dedicated features were superior to a "brute force" model using a large number of general audio features.

  17. Method for Reading Sensors and Controlling Actuators Using Audio Interfaces of Mobile Devices

    Science.gov (United States)

    Aroca, Rafael V.; Burlamaqui, Aquiles F.; Gonçalves, Luiz M. G.

    2012-01-01

    This article presents a novel closed loop control architecture based on audio channels of several types of computing devices, such as mobile phones and tablet computers, but not restricted to them. The communication is based on an audio interface that relies on the exchange of audio tones, allowing sensors to be read and actuators to be controlled. As an application example, the presented technique is used to build a low cost mobile robot, but the system can also be used in a variety of mechatronics applications and sensor networks, where smartphones are the basic building blocks. PMID:22438726
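
    Reading a value that is signaled as an audio tone requires deciding which tone is present in a block of samples. The sketch below uses the Goertzel algorithm for that step; this is an assumption, since the abstract does not say how tones are detected, and the sample rate, block length, and candidate frequencies are hypothetical.

```python
import math

def goertzel_power(samples, sample_rate, freq):
    """Signal power near `freq` (Hz), computed with the Goertzel recurrence."""
    n = len(samples)
    k = round(n * freq / sample_rate)            # nearest DFT bin
    coeff = 2.0 * math.cos(2.0 * math.pi * k / n)
    s_prev, s_prev2 = 0.0, 0.0
    for x in samples:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev2 ** 2 + s_prev ** 2 - coeff * s_prev * s_prev2

def detect_tone(samples, sample_rate, candidates):
    """Return the candidate frequency with the highest Goertzel power."""
    return max(candidates, key=lambda f: goertzel_power(samples, sample_rate, f))

# Hypothetical setup: a 1000 Hz tone sampled at 8000 Hz for 400 samples.
RATE = 8000
tone = [math.sin(2.0 * math.pi * 1000 * n / RATE) for n in range(400)]
detected = detect_tone(tone, RATE, [500, 1000, 2000])
```

    Goertzel evaluates one frequency bin at a time, so checking a handful of known tones costs far less than a full FFT, which suits low-cost devices.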

  18. When the third party observer of a neuropsychological evaluation is an audio-recorder.

    Science.gov (United States)

    Constantinou, Marios; Ashendorf, Lee; McCaffrey, Robert J

    2002-08-01

    The presence of third parties during neuropsychological evaluations is an issue of concern for contemporary neuropsychologists. Previous studies have reported that the presence of an observer during neuropsychological testing alters the performance of individuals under evaluation. The present study sought to investigate whether audio-recording affects the neuropsychological test performance of individuals in the same way that third party observation does. In the presence of an audio-recorder the performance of the participants on memory tests declined. Performance on motor tests, on the other hand, was not affected by the presence of an audio-recorder. The implications of these findings in forensic neuropsychological evaluations are discussed.

  19. Audio watermarking based on psychoacoustic model and critical band wavelet transform

    Institute of Scientific and Technical Information of China (English)

    TAO Zhi; ZHAO Heming; GU Jihua; WU Di

    2007-01-01

    A watermark embedding algorithm based on the critical band wavelet transform of digital audio signals is proposed in this paper. The masking threshold for each audio signal segment was calculated on the basis of a psychoacoustic model. Exploiting the similarity between the critical bands of the human auditory system and the critical band wavelet transform, a watermark was embedded into the low-band and mid-band wavelet coefficients. The embedding strength was adaptively controlled by the masking threshold. The experimental results show that the embedded watermark signal is inaudible and that the watermarked audio signal is robust against many attacks such as compression, noise, re-sampling, and low-pass filtering.

  20. The temporal window of audio-tactile integration in speech perception

    Science.gov (United States)

    Gick, Bryan; Ikegami, Yoko; Derrick, Donald

    2010-01-01

    Asynchronous cross-modal information is integrated asymmetrically in audio-visual perception. To test whether this asymmetry generalizes across modalities, auditory (aspirated “pa” and unaspirated “ba” stops) and tactile (slight, inaudible, cutaneous air puffs) signals were presented synchronously and asynchronously. Results were similar to previous AV studies: the temporal window of integration for the enhancement effect (but not the interference effect) was asymmetrical, allowing up to 200 ms of asynchrony when the puff followed the audio signal, but only up to 50 ms when the puff preceded the audio signal. These findings suggest that perceivers accommodate differences in physical transmission speed of different multimodal signals. PMID:21110549