WorldWideScience

Sample records for audio feature space

  1. Emotion-based Music Retrieval on a Well-reduced Audio Feature Space

    DEFF Research Database (Denmark)

    Ruxanda, Maria Magdalena; Chua, Bee Yong; Nanopoulos, Alexandros

    2009-01-01

    Music expresses emotion. A number of audio extracted features have influence on the perceived emotional expression of music. These audio features generate a high-dimensional space, on which music similarity retrieval can be performed effectively, with respect to human perception of the music-emotion. However, real-time systems that retrieve music over large music databases can achieve an order-of-magnitude performance increase by applying multidimensional indexing over a dimensionally reduced audio feature space. To meet this performance achievement, in this paper, extensive studies are conducted on a number of dimensionality reduction algorithms, including both classic and novel approaches. The paper clearly envisages which dimensionality reduction techniques on the considered audio feature space can preserve, on average, the accuracy of the emotion-based music retrieval.

  2. Fall Detection Using Smartphone Audio Features.

    Science.gov (United States)

    Cheffena, Michael

    2016-07-01

    An automated fall detection system based on smartphone audio features is developed. The spectrogram, mel frequency cepstral coefficients (MFCCs), linear predictive coding (LPC), and matching pursuit (MP) features of different fall and no-fall sound events are extracted from experimental data. Based on the extracted audio features, four different machine learning classifiers, namely k-nearest neighbor classifier (k-NN), support vector machine (SVM), least squares method (LSM), and artificial neural network (ANN), are investigated for distinguishing between fall and no-fall events. For each audio feature, the performance of each classifier in terms of sensitivity, specificity, accuracy, and computational complexity is evaluated. The best performance is achieved using spectrogram features with the ANN classifier, with sensitivity, specificity, and accuracy all above 98%. The classifier also has acceptable computational requirements for training and testing. The system is applicable in home environments where the phone is placed in the vicinity of the user.
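    The three evaluation measures named in this abstract follow directly from confusion-matrix counts. The sketch below shows the standard definitions; the function name and the example counts are hypothetical and are not taken from the paper.

```python
def binary_metrics(tp, fp, tn, fn):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts.

    tp/fn count fall events detected/missed; tn/fp count no-fall events
    correctly rejected/falsely flagged.
    """
    sensitivity = tp / (tp + fn)            # fraction of real falls detected
    specificity = tn / (tn + fp)            # fraction of no-fall events rejected
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return sensitivity, specificity, accuracy


# Hypothetical counts for illustration only (not results from the paper).
sens, spec, acc = binary_metrics(tp=45, fp=5, tn=95, fn=5)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} accuracy={acc:.3f}")
```

    Reporting all three together matters because accuracy alone can look high when no-fall events greatly outnumber falls.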

  3. Audio feature extraction using probability distribution function

    Science.gov (United States)

    Suhaib, A.; Wan, Khairunizam; Aziz, Azri A.; Hazry, D.; Razlan, Zuradzman M.; Shahriman A., B.

    2015-05-01

    Voice recognition has been one of the popular applications in the robotics field. It has also recently been used for biometric and multimedia information retrieval systems. This technology is attained from successive research on audio feature extraction analysis. Probability Distribution Function (PDF) is a statistical method which is usually used as one of the processes in complex feature extraction methods such as GMM and PCA. In this paper, a new method for audio feature extraction is proposed, which uses only the PDF as the feature extraction method itself for speech analysis purposes. Certain pre-processing techniques are performed prior to the proposed feature extraction method. Subsequently, the PDF result values for each frame of sampled voice signals obtained from a certain number of individuals are plotted. From the experimental results obtained, it can be seen visually from the plotted data that each individual's voice has comparable PDF values and shapes.

  4. On the Use of Memory Models in Audio Features

    DEFF Research Database (Denmark)

    Jensen, Karl Kristoffer

    2011-01-01

    Audio feature estimation is potentially improved by including higher-level models. One such model is the Short Term Memory (STM) model. A new paradigm of audio feature estimation is obtained by adding the influence of notes in the STM. These notes are identified when the perceptual spectral flux...

  5. Classifying laughter and speech using audio-visual feature prediction

    NARCIS (Netherlands)

    Petridis, Stavros; Asghar, Ali; Pantic, Maja

    2010-01-01

    In this study, a system that discriminates laughter from speech by modelling the relationship between audio and visual features is presented. The underlying assumption is that this relationship is different between speech and laughter. Neural networks are trained which learn the audio-to-visual and

  6. Music Genre Classification Using MIDI and Audio Features

    Science.gov (United States)

    Cataltepe, Zehra; Yaslan, Yusuf; Sonmez, Abdullah

    2007-12-01

    We report our findings on using MIDI files and audio features from MIDI, separately and combined together, for MIDI music genre classification. We use McKay and Fujinaga's 3-root and 9-leaf genre data set. In order to compute distances between MIDI pieces, we use normalized compression distance (NCD). NCD uses the compressed length of a string as an approximation to its Kolmogorov complexity and has previously been used for music genre and composer clustering. We convert the MIDI pieces to audio and then use the audio features to train different classifiers. MIDI and audio from MIDI classifiers alone achieve much smaller accuracies than those reported by McKay and Fujinaga who used not NCD but a number of domain-based MIDI features for their classification. Combining MIDI and audio from MIDI classifiers improves accuracy and yields accuracies closer to, but still lower than, McKay and Fujinaga's. The best root genre accuracies achieved using MIDI, audio, and combination of them are 0.75, 0.86, and 0.93, respectively, compared to 0.98 of McKay and Fujinaga. Successful classifier combination requires diversity of the base classifiers. We achieve diversity through using a certain number of seconds of the MIDI file, different sample rates and sizes for the audio file, and different classification algorithms.
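    The normalized compression distance mentioned in this abstract can be sketched in a few lines. The abstract does not say which compressor the authors used. The version below assumes zlib and uses the common definition NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), where C is the compressed length. The byte strings are toy stand-ins for serialized MIDI data, not real pieces.

```python
import zlib


def compressed_len(data: bytes) -> int:
    """Compressed length in bytes, used as a stand-in for Kolmogorov complexity."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance between two byte strings."""
    cx, cy = compressed_len(x), compressed_len(y)
    cxy = compressed_len(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


# Toy stand-ins for serialized pieces (hypothetical data).
piece_a = b"C4 E4 G4 C5 " * 100
piece_b = b"C4 E4 G4 C5 " * 90 + b"D4 F4 A4 D5 " * 10   # shares most material with piece_a
piece_c = bytes((i * 37 + 11) % 251 for i in range(1200))  # unrelated byte pattern

print(ncd(piece_a, piece_b), ncd(piece_a, piece_c))
```

    Pieces that share material compress well together, so their distance is smaller than that between unrelated pieces. Real compressors only approximate the ideal, so values can fall slightly outside the range 0 to 1.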

  7. Music Genre Classification Using MIDI and Audio Features

    Directory of Open Access Journals (Sweden)

    Abdullah Sonmez

    2007-01-01

    We report our findings on using MIDI files and audio features from MIDI, separately and combined together, for MIDI music genre classification. We use McKay and Fujinaga's 3-root and 9-leaf genre data set. In order to compute distances between MIDI pieces, we use normalized compression distance (NCD). NCD uses the compressed length of a string as an approximation to its Kolmogorov complexity and has previously been used for music genre and composer clustering. We convert the MIDI pieces to audio and then use the audio features to train different classifiers. MIDI and audio from MIDI classifiers alone achieve much smaller accuracies than those reported by McKay and Fujinaga who used not NCD but a number of domain-based MIDI features for their classification. Combining MIDI and audio from MIDI classifiers improves accuracy and yields accuracies closer to, but still lower than, McKay and Fujinaga's. The best root genre accuracies achieved using MIDI, audio, and combination of them are 0.75, 0.86, and 0.93, respectively, compared to 0.98 of McKay and Fujinaga. Successful classifier combination requires diversity of the base classifiers. We achieve diversity through using a certain number of seconds of the MIDI file, different sample rates and sizes for the audio file, and different classification algorithms.

  8. Turkish Music Genre Classification using Audio and Lyrics Features

    Directory of Open Access Journals (Sweden)

    Önder ÇOBAN

    2017-05-01

    Music Information Retrieval (MIR) has become a popular research area in recent years. In this context, researchers have developed music information systems to find solutions for such major problems as automatic playlist creation, hit song detection, and music genre or mood classification. Meta-data information, lyrics, or melodic content of music are used as feature resources in previous works. However, lyrics are not often used in MIR systems, and the number of works in this field is not sufficient, especially for Turkish. In this paper, firstly, we have extended our previously created Turkish MIR (TMIR) dataset, which comprises Turkish lyrics, by including the audio file of each song. Secondly, we have investigated the effect of using audio and textual features together or separately on automatic Music Genre Classification (MGC). We have extracted textual features from lyrics using different feature extraction models such as word2vec and traditional Bag of Words. We have conducted our experiments on the Support Vector Machine (SVM) algorithm and analysed the impact of feature selection and different feature groups on MGC. We have considered lyrics-based MGC as a text classification task and also investigated the effect of the term weighting method. Experimental results show that textual features can be as effective as audio features for Turkish MGC, especially when a supervised term weighting method is employed. We have achieved the highest success rate of 99.12% by using both audio and textual features together.

  9. Analytical Features: A Knowledge-Based Approach to Audio Feature Generation

    Directory of Open Access Journals (Sweden)

    Pachet François

    2009-01-01

    We present a feature generation system designed to create audio features for supervised classification tasks. The main contribution to feature generation studies is the notion of analytical features (AFs), a construct designed to support the representation of knowledge about audio signal processing. We describe the most important aspects of AFs, in particular their dimensional type system, on which are based pattern-based random generators, heuristics, and rewriting rules. We show how AFs generalize or improve previous approaches used in feature generation. We report on several projects using AFs for difficult audio classification tasks, demonstrating their advantage over standard audio features. More generally, we propose analytical features as a paradigm to bring raw signals into the world of symbolic computation.

  10. Feature Selection for Audio Surveillance in Urban Environment

    Directory of Open Access Journals (Sweden)

    KIKTOVA Eva

    2014-05-01

    This paper presents the work leading to the acoustic event detection system, which is designed to recognize two types of acoustic events (shot and breaking glass) in an urban environment. For this purpose, extensive front-end processing was performed for the effective parametric representation of an input sound. MFCC features and features computed during their extraction (MELSPEC and FBANK), then MPEG-7 audio descriptors and other temporal and spectral characteristics were extracted. High dimensional feature sets were created and in the next phase reduced by the mutual information based selection algorithms. A Hidden Markov Model based classifier was applied and evaluated by the Viterbi decoding algorithm. Thus very effective feature sets were identified and also the less important features were found.
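    The abstract does not name the specific mutual information based selection algorithms it uses. As a minimal sketch of the underlying idea, the code below scores each discretized feature by its mutual information with the class label and ranks the features by that score. The feature names and toy values are hypothetical.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Mutual information in bits between two equal-length discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * log2((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in pxy.items()
    )


def rank_features(columns, labels):
    """Return feature names sorted from most to least informative about labels."""
    scores = {name: mutual_information(vals, labels) for name, vals in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical, already-binarized features for six sound events.
labels = ["shot", "shot", "glass", "glass", "shot", "glass"]
columns = {
    "band_energy_bin": [1, 1, 0, 0, 1, 0],  # separates the two classes perfectly
    "zero_cross_bin": [0, 1, 0, 1, 0, 1],   # only weakly related to the class
}
print(rank_features(columns, labels))
```

    A simple ranking like this ignores redundancy between features. Practical selection schemes usually also penalize features that repeat information already selected.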

  11. Audio-visual synchrony and feature-selective attention co-amplify early visual processing.

    Science.gov (United States)

    Keitel, Christian; Müller, Matthias M

    2016-05-01

    Our brain relies on neural mechanisms of selective attention and converging sensory processing to efficiently cope with rich and unceasing multisensory inputs. One prominent assumption holds that audio-visual synchrony can act as a strong attractor for spatial attention. Here, we tested for a similar effect of audio-visual synchrony on feature-selective attention. We presented two superimposed Gabor patches that differed in colour and orientation. On each trial, participants were cued to selectively attend to one of the two patches. Over time, spatial frequencies of both patches varied sinusoidally at distinct rates (3.14 and 3.63 Hz), giving rise to pulse-like percepts. A simultaneously presented pure tone carried a frequency modulation at the pulse rate of one of the two visual stimuli to introduce audio-visual synchrony. Pulsed stimulation elicited distinct time-locked oscillatory electrophysiological brain responses. These steady-state responses were quantified in the spectral domain to examine individual stimulus processing under conditions of synchronous versus asynchronous tone presentation and when respective stimuli were attended versus unattended. We found that both attending to the colour of a stimulus and its synchrony with the tone enhanced its processing. Moreover, both gain effects combined linearly for attended in-sync stimuli. Our results suggest that audio-visual synchrony can attract attention to specific stimulus features when stimuli overlap in space.

  12. Extraction, Mapping, and Evaluation of Expressive Acoustic Features for Adaptive Digital Audio Effects

    DEFF Research Database (Denmark)

    Holfelt, Jonas; Csapo, Gergely; Andersson, Nikolaj Schwab

    2017-01-01

    This paper describes the design and implementation of a real-time adaptive digital audio effect with an emphasis on using expressive audio features that control effect parameters. Research in adaptive digital audio effects is covered along with studies about expressivity and important...

  13. Feature Fusion Based Audio-Visual Speaker Identification Using Hidden Markov Model under Different Lighting Variations

    Directory of Open Access Journals (Sweden)

    Md. Rabiul Islam

    2014-01-01

    The aim of the paper is to propose a feature fusion based Audio-Visual Speaker Identification (AVSI) system with varied conditions of illumination environments. Among the different fusion strategies, feature level fusion has been used for the proposed AVSI system where Hidden Markov Model (HMM) is used for learning and classification. Since the feature set contains richer information about the raw biometric data than any other levels, integration at feature level is expected to provide better authentication results. In this paper, both Mel Frequency Cepstral Coefficients (MFCCs) and Linear Prediction Cepstral Coefficients (LPCCs) are combined to get the audio feature vectors and Active Shape Model (ASM) based appearance and shape facial features are concatenated to take the visual feature vectors. These combined audio and visual features are used for the feature-fusion. To reduce the dimension of the audio and visual feature vectors, Principal Component Analysis (PCA) method is used. The VALID audio-visual database is used to measure the performance of the proposed system where four different illumination levels of lighting conditions are considered. Experimental results focus on the significance of the proposed audio-visual speaker identification system with various combinations of audio and visual features.

  14. Audio segmentation using Flattened Local Trimmed Range for ecological acoustic space analysis

    Directory of Open Access Journals (Sweden)

    Giovany Vega

    2016-06-01

    The acoustic space in a given environment is filled with footprints arising from three processes: biophony, geophony and anthrophony. Bioacoustic research using passive acoustic sensors can result in thousands of recordings. An important component of processing these recordings is to automate signal detection. In this paper, we describe a new spectrogram-based approach for extracting individual audio events. Spectrogram-based audio event detection (AED) relies on separating the spectrogram into background (i.e., noise) and foreground (i.e., signal) classes using a threshold such as a global threshold, a per-band threshold, or one given by a classifier. These methods are either too sensitive to noise, designed for an individual species, or require prior training data. Our goal is to develop an algorithm that is not sensitive to noise, does not need any prior training data and works with any type of audio event. To do this, we propose: (1) a spectrogram filtering method, the Flattened Local Trimmed Range (FLTR) method, which models the spectrogram as a mixture of stationary and non-stationary energy processes and mitigates the effect of the stationary processes, and (2) an unsupervised algorithm that uses the filter to detect audio events. We measured the performance of the algorithm using a set of six thoroughly validated audio recordings and obtained a sensitivity of 94% and a positive predictive value of 89%. These sensitivity and positive predictive values are very high, given that the validated recordings are diverse and obtained from field conditions. The algorithm was then used to extract audio events in three datasets. Features of these audio events were plotted and showed the unique aspects of the three acoustic communities.

  15. News video story segmentation method using fusion of audio-visual features

    Science.gov (United States)

    Wen, Jun; Wu, Ling-da; Zeng, Pu; Luan, Xi-dao; Xie, Yu-xiang

    2007-11-01

    News story segmentation is an important aspect of news video analysis. This paper presents a method for news video story segmentation. Different from prior works, which are based on visual feature transforms, the proposed technique uses audio features as a baseline and fuses visual features with them to refine the results. First, it selects silence clips as audio feature candidate points, and selects shot boundaries and anchor shots as two kinds of visual feature candidate points. Then it takes the audio feature candidates as cues and develops a fusion method that uses the different types of visual candidates to refine the audio candidates and obtain story boundaries. Experimental results show that this method has high efficiency and adaptability to different kinds of news video.

  16. Audio-Visual Speech Recognition Using MPEG-4 Compliant Visual Features

    Directory of Open Access Journals (Sweden)

    Petar S. Aleksic

    2002-11-01

    We describe an audio-visual automatic continuous speech recognition system, which significantly improves speech recognition performance over a wide range of acoustic noise levels, as well as under clean audio conditions. The system utilizes facial animation parameters (FAPs) supported by the MPEG-4 standard for the visual representation of speech. We also describe a robust and automatic algorithm we have developed to extract FAPs from visual data, which does not require hand labeling or extensive training procedures. Principal component analysis (PCA) was performed on the FAPs in order to decrease the dimensionality of the visual feature vectors, and the derived projection weights were used as visual features in the audio-visual automatic speech recognition (ASR) experiments. Both single-stream and multistream hidden Markov models (HMMs) were used to model the ASR system, integrate audio and visual information, and perform relatively large vocabulary (approximately 1000 words) speech recognition experiments. The experiments performed use clean audio data and audio data corrupted by stationary white Gaussian noise at various SNRs. The proposed system reduces the word error rate (WER) by 20% to 23% relative to audio-only speech recognition WERs, at various SNRs (0–30 dB) with additive white Gaussian noise, and by 19% relative to audio-only speech recognition WER under clean audio conditions.

  17. Extraction Of Audio Features For Emotion Recognition System Based On Music

    Directory of Open Access Journals (Sweden)

    Kee Moe Han

    2015-08-01

    Music is the combination of melody, linguistic information, and the vocalist's emotion. Since music is a work of art, analyzing emotion in music by computer is a difficult task. Many approaches have been developed to detect the emotions included in music, but the results are not satisfactory because emotion is very complex. In this paper, the evaluations of audio features from the music files are presented. The extracted features are used to classify the different emotion classes of the vocalists. Musical feature extraction is done by using the Music Information Retrieval (MIR) toolbox in this paper. A database of 100 music clips is used to classify the emotions perceived in music clips. Music may contain many emotions according to the vocalist's mood, such as happy, sad, nervous, bored, peace, etc. In this paper, the audio features related to the emotions of the vocalists are extracted for use in an emotion recognition system based on music.

  18. Music preferences based on audio features, and its relation to personality

    OpenAIRE

    Dunn, Greg

    2009-01-01

    Recent studies have summarized reported music preferences by genre into four broadly defined categories, which relate to various personality characteristics. Other research has indicated that genre classification is ambiguous and inconsistent. This ambiguity suggests that research relating personality to music preferences based on genre could benefit from a more objective definition of music. This problem is addressed by investigating how music preferences linked to objective audio features r...

  19. Robust and Reversible Audio Watermarking by Modifying Statistical Features in Time Domain

    Directory of Open Access Journals (Sweden)

    Shijun Xiang

    2017-01-01

    Robust and reversible watermarking is a potential technique in many sensitive applications, such as lossless audio or medical image systems. This paper presents a novel robust reversible audio watermarking method by modifying the statistical features in time domain in the way that the histogram of these statistical values is shifted for data hiding. Firstly, the original audio is divided into nonoverlapped equal-sized frames. In each frame, the use of three samples as a group generates a prediction error and a statistical feature value is calculated as the sum of all the prediction errors in the frame. The watermark bits are embedded into the frames by shifting the histogram of the statistical features. The watermark is reversible and robust to common signal processing operations. Experimental results have shown that the proposed method not only is reversible but also achieves satisfactory robustness to MP3 compression of 64 kbps and additive Gaussian noise of 35 dB.

  20. Audio Query by Example Using Similarity Measures between Probability Density Functions of Features

    Directory of Open Access Journals (Sweden)

    Marko Helén

    2010-01-01

    This paper proposes a query by example system for generic audio. We estimate the similarity of the example signal and the samples in the queried database by calculating the distance between the probability density functions (pdfs) of their frame-wise acoustic features. Since the features are continuous valued, we propose to model them using Gaussian mixture models (GMMs) or hidden Markov models (HMMs). The models parametrize each sample efficiently and retain sufficient information for similarity measurement. To measure the distance between the models, we apply a novel Euclidean distance, approximations of Kullback-Leibler divergence, and a cross-likelihood ratio test. The performance of the measures was tested in simulations where audio samples are automatically retrieved from a general audio database, based on the estimated similarity to a user-provided example. The simulations show that the distance between probability density functions is an accurate measure for similarity. Measures based on GMMs or HMMs are shown to produce better results than existing methods based on simpler statistics or histograms of the features. A good performance with low computational cost is obtained with the proposed Euclidean distance.
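    The abstract does not give the formula for its Euclidean distance, so the following is only an illustration of why such a distance can be cheap to compute. It is the standard closed-form squared L2 distance between two one-dimensional Gaussian mixtures, which relies on the identity that the integral of a product of two Gaussians is itself a Gaussian evaluated at the difference of the means. It may differ from the paper's measure. The mixture parameters are hypothetical.

```python
from math import exp, pi, sqrt


def gauss(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance at x."""
    return exp(-((x - mean) ** 2) / (2 * var)) / sqrt(2 * pi * var)


def gmm_inner(p, q):
    """Integral of p(x) * q(x) dx for 1-D GMMs given as lists of (weight, mean, var).

    Uses: integral of N(x; m1, v1) * N(x; m2, v2) dx = N(m1; m2, v1 + v2).
    """
    return sum(
        wp * wq * gauss(mp, mq, vp + vq)
        for wp, mp, vp in p
        for wq, mq, vq in q
    )


def gmm_l2_squared(p, q):
    """Squared L2 (Euclidean) distance between two 1-D GMM densities."""
    return gmm_inner(p, p) - 2 * gmm_inner(p, q) + gmm_inner(q, q)


# Hypothetical mixtures standing in for models of two audio samples.
query = [(0.6, 0.0, 1.0), (0.4, 3.0, 0.5)]
near = [(0.6, 0.1, 1.0), (0.4, 3.1, 0.5)]
far = [(0.5, 8.0, 1.0), (0.5, 12.0, 2.0)]
print(gmm_l2_squared(query, near), gmm_l2_squared(query, far))
```

    Unlike the Kullback-Leibler divergence, which has no closed form between mixtures and must be approximated, this quantity needs only a sum over component pairs.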

  1. An Analysis of Audio Features to Develop a Human Activity Recognition Model Using Genetic Algorithms, Random Forests, and Neural Networks

    Directory of Open Access Journals (Sweden)

    Carlos E. Galván-Tejada

    2016-01-01

    This work presents a human activity recognition (HAR) model based on audio features. The use of sound as an information source for HAR models represents a challenge because sound wave analyses generate very large amounts of data. However, feature selection techniques may reduce the amount of data required to represent an audio signal sample. Some of the audio features that were analyzed include Mel-frequency cepstral coefficients (MFCC). Although MFCC are commonly used in voice and instrument recognition, their utility within HAR models is yet to be confirmed, and this work validates their usefulness. Additionally, statistical features were extracted from the audio samples to generate the proposed HAR model. The amount of information needed to build a HAR model directly affects the accuracy of the model. This problem was also tackled in the present work; our results indicate that we are capable of recognizing a human activity with an accuracy of 85% using the HAR model proposed. This means that minimum computational costs are needed, thus allowing portable devices to identify human activities using audio as an information source.

  2. Audio-Visual Classification of Sports Types

    DEFF Research Database (Denmark)

    Gade, Rikke; Abou-Zleikha, Mohamed; Christensen, Mads Græsbøll

    2015-01-01

    In this work we propose a method for classification of sports types from combined audio and visual features extracted from thermal video. From audio, Mel Frequency Cepstral Coefficients (MFCC) are extracted, and PCA is applied to reduce the feature space to 10 dimensions. From the visual modali...

  3. Acoustic Event Detection in Multichannel Audio Using Gated Recurrent Neural Networks with High‐Resolution Spectral Features

    Directory of Open Access Journals (Sweden)

    Hyoung‐Gook Kim

    2017-12-01

    Recently, deep recurrent neural networks have achieved great success in various machine learning tasks, and have also been applied for sound event detection. The detection of temporally overlapping sound events in realistic environments is much more challenging than in monophonic detection problems. In this paper, we present an approach to improve the accuracy of polyphonic sound event detection in multichannel audio based on gated recurrent neural networks in combination with auditory spectral features. In the proposed method, human hearing perception‐based spatial and spectral‐domain noise‐reduced harmonic features are extracted from multichannel audio and used as high‐resolution spectral inputs to train gated recurrent neural networks. This provides a fast and stable convergence rate compared to long short‐term memory recurrent neural networks. Our evaluation reveals that the proposed method outperforms the conventional approaches.

  4. Unique features of space reactors

    International Nuclear Information System (INIS)

    Buden, D.

    1990-01-01

    This paper reports on space reactors that are designed to meet a unique set of requirements; they must be sufficiently compact to be launched in a rocket to their operational location, operate for many years without maintenance and servicing, operate in extreme environments, and reject heat by radiation to space. To meet these restrictions, operating temperatures are much greater than in terrestrial power plants, and the reactors tend to have a fast neutron spectrum. Currently, a new generation of space reactor power plants is being developed. The major effort is in the SP-100 program, where the power plant is being designed for seven years of full power, and no maintenance operation at a reactor outlet operating temperature of 1350 K.

  5. Intelligent audio analysis

    CERN Document Server

    Schuller, Björn W

    2013-01-01

    This book provides the reader with the knowledge necessary for comprehension of the field of Intelligent Audio Analysis. It firstly introduces standard methods and discusses the typical Intelligent Audio Analysis chain going from audio data to audio features to audio recognition. Further, an introduction to audio source separation, and enhancement and robustness are given. After the introductory parts, the book shows several applications for the three types of audio: speech, music, and general sound. Each task is briefly introduced, followed by a description of the specific data and methods applied, experiments and results, and a conclusion for this specific task. The book provides benchmark results and standardized test-beds for a broader range of audio analysis tasks. The main focus thereby lies on the parallel advancement of realism in audio analysis, as too often today’s results are overly optimistic owing to idealized testing conditions, and it serves to stimulate synergies arising from transfer of ...

  6. Adaptive DCTNet for Audio Signal Classification

    OpenAIRE

    Xian, Yin; Pu, Yunchen; Gan, Zhe; Lu, Liang; Thompson, Andrew

    2016-01-01

    In this paper, we investigate DCTNet for audio signal classification. Its output feature is related to Cohen's class of time-frequency distributions. We introduce the use of adaptive DCTNet (A-DCTNet) for audio signals feature extraction. The A-DCTNet applies the idea of constant-Q transform, with its center frequencies of filterbanks geometrically spaced. The A-DCTNet is adaptive to different acoustic scales, and it can better capture low frequency acoustic information that is sensitive to h...

  7. Estimation of violin bowing features from Audio recordings with Convolutional Networks

    DEFF Research Database (Denmark)

    Perez-Carillo, Alfonso; Purwins, Hendrik

    The acquisition of musical gestures and particularly of instrument controls from a musical performance is a field of increasing interest with applications in many research areas. In the last years, the development of novel sensing technologies has allowed the fine measurement of such controls...... and low-cost of the acquisition and its nonintrusive nature. The main challenge is designing robust detection algorithms to be as accurate as the direct approaches. In this paper, we present an indirect acquisition method to estimate violin bowing controls from audio signal analysis based on training...

  8. 47 CFR 25.214 - Technical requirements for space stations in the satellite digital audio radio service and...

    Science.gov (United States)

    2010-10-01

    ... 47 Telecommunication 2 2010-10-01 2010-10-01 false Technical requirements for space stations in the satellite digital audio radio service and associated terrestrial repeaters. 25.214 Section 25.214 Telecommunication FEDERAL COMMUNICATIONS COMMISSION (CONTINUED) COMMON CARRIER SERVICES SATELLITE COMMUNICATIONS...

  9. Audio-visual Classification and Fusion of Spontaneous Affect Data in Likelihood Space

    NARCIS (Netherlands)

    Nicolaou, Mihalis A.; Gunes, Hatice; Pantic, Maja

    2010-01-01

    This paper focuses on audio-visual (using facial expression, shoulder and audio cues) classification of spontaneous affect, utilising generative models for classification (i) in terms of Maximum Likelihood Classification with the assumption that the generative model structure in the classifier is

  10. Deep Visual Attributes vs. Hand-Crafted Audio Features on Multidomain Speech Emotion Recognition

    Directory of Open Access Journals (Sweden)

    Michalis Papakostas

    2017-06-01

    Emotion recognition from speech may play a crucial role in many applications related to human–computer interaction or understanding the affective state of users in certain tasks, where other modalities such as video or physiological parameters are unavailable. In general, a human’s emotions may be recognized using several modalities such as analyzing facial expressions, speech, physiological parameters (e.g., electroencephalograms, electrocardiograms), etc. However, measuring of these modalities may be difficult, obtrusive or require expensive hardware. In that context, speech may be the best alternative modality in many practical applications. In this work we present an approach that uses a Convolutional Neural Network (CNN) functioning as a visual feature extractor and trained using raw speech information. In contrast to traditional machine learning approaches, CNNs are responsible for identifying the important features of the input, thus making the need for hand-crafted feature engineering optional in many tasks. In this paper no extra features are required other than the spectrogram representations, and hand-crafted features were only extracted for validation purposes of our method. Moreover, it does not require any linguistic model and is not specific to any particular language. We compare the proposed approach using cross-language datasets and demonstrate that it is able to provide superior results vs. traditional ones that use hand-crafted features.

  11. Categorizing Video Game Audio

    DEFF Research Database (Denmark)

    Westerberg, Andreas Rytter; Schoenau-Fog, Henrik

    2015-01-01

    This paper dives into the subject of video game audio and how it can be categorized in order to deliver a message to a player in the most precise way. A new categorization, with a new take on the diegetic spaces, can be used as a tool of inspiration for sound- and game-designers to rethink how they can use audio in video games. The conclusion of this study is that the current models' view of the diegetic spaces, used to categorize video game audio, is not fit to categorize all sounds. This can however possibly be changed through a rethinking of how the player interprets audio.

  12. Audio Papers

    DEFF Research Database (Denmark)

    Groth, Sanne Krogh; Samson, Kristine

    2016-01-01

    With this special issue of Seismograf we are happy to present a new format of articles: Audio Papers. Audio papers resemble the regular essay or the academic text in that they deal with a certain topic of interest, but presented in the form of an audio production. The audio paper is an extension...

  13. Modified DCTNet for audio signals classification

    Science.gov (United States)

    Xian, Yin; Pu, Yunchen; Gan, Zhe; Lu, Liang; Thompson, Andrew

    2016-10-01

    In this paper, we investigate DCTNet for audio signal classification. Its output feature is related to Cohen's class of time-frequency distributions. We introduce the use of adaptive DCTNet (A-DCTNet) for audio signals feature extraction. The A-DCTNet applies the idea of constant-Q transform, with its center frequencies of filterbanks geometrically spaced. The A-DCTNet is adaptive to different acoustic scales, and it can better capture low frequency acoustic information that is sensitive to human audio perception than features such as Mel-frequency spectral coefficients (MFSC). We use features extracted by the A-DCTNet as input for classifiers. Experimental results show that the A-DCTNet and Recurrent Neural Networks (RNN) achieve state-of-the-art performance in bird song classification rate, and improve artist identification accuracy in music data. They demonstrate A-DCTNet's applicability to signal processing problems.
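The geometric spacing of filterbank center frequencies mentioned above can be illustrated with a short sketch. This is not the A-DCTNet implementation; it only shows the constant-Q idea, and the function name, the 55 Hz starting frequency and the 12 bins per octave are assumptions chosen for illustration.

```python
def geometric_center_frequencies(f_min, bins_per_octave, n_bins):
    """Return n_bins center frequencies (Hz), geometrically spaced from f_min.

    Each octave is split into bins_per_octave steps, so neighbouring
    frequencies always differ by the same ratio 2 ** (1 / bins_per_octave).
    """
    return [f_min * 2.0 ** (k / bins_per_octave) for k in range(n_bins)]


# Hypothetical settings: start at 55 Hz, 12 bins per octave, two octaves.
freqs = geometric_center_frequencies(f_min=55.0, bins_per_octave=12, n_bins=25)
```

Because the ratio between neighbours is constant, low frequencies are sampled more densely on a linear Hz scale than high frequencies, which is the property the abstract credits for better capture of low-frequency information.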

  14. Audio Restoration

    Science.gov (United States)

    Esquef, Paulo A. A.

    The first reproducible recording of human voice was made in 1877 on a tinfoil cylinder phonograph devised by Thomas A. Edison. Since then, much effort has been expended to find better ways to record and reproduce sounds. By the mid-1920s, the first electrical recordings appeared and gradually took over purely acoustic recordings. The development of electronic computers, in conjunction with the ability to record data onto magnetic or optical media, culminated in the standardization of compact disc format in 1980. Nowadays, digital technology is applied to several audio applications, not only to improve the quality of modern and old recording/reproduction techniques, but also to trade off sound quality for less storage space and less taxing transmission capacity requirements.

  15. Audio Twister

    DEFF Research Database (Denmark)

    Cermak, Daniel; Moreno Garcia, Rodrigo; Monastiridis, Stefanos

    2015-01-01

    Daniel Cermak-Sassenrath, Rodrigo Moreno Garcia, Stefanos Monastiridis. Audio Twister. Installation. P-Hack Copenhagen 2015, Copenhagen, DK, Apr 24, 2015.

  16. Introduction to audio analysis: a MATLAB approach

    CERN Document Server

    Giannakopoulos, Theodoros

    2014-01-01

    Introduction to Audio Analysis serves as a standalone introduction to audio analysis, providing theoretical background to many state-of-the-art techniques. It covers the essential theory necessary to develop audio engineering applications, but also uses programming techniques, notably MATLAB®, to take a more applied approach to the topic. Basic theory and reproducible experiments are combined to demonstrate theoretical concepts from a practical point of view and provide a solid foundation in the field of audio analysis. Audio feature extraction, audio classification, audio segmentation, au

  17. Searching Fragment Spaces with feature trees.

    Science.gov (United States)

    Lessel, Uta; Wellenzohn, Bernd; Lilienthal, Markus; Claussen, Holger

    2009-02-01

    Virtual combinatorial chemistry easily produces billions of compounds, for which conventional virtual screening cannot be performed even with the fastest methods available. An efficient solution for such a scenario is the generation of Fragment Spaces, which encode huge numbers of virtual compounds by their fragments/reagents and rules of how to combine them. Similarity-based searches can be performed in such spaces without ever fully enumerating all virtual products. Here we describe the generation of a huge Fragment Space encoding about 5 × 10^11 compounds based on established in-house synthesis protocols for combinatorial libraries, i.e., we encode practically evaluated combinatorial chemistry protocols in a machine readable form, rendering them accessible to in silico search methods. We show how such searches in this Fragment Space can be integrated as a first step in an overall workflow. It reduces the extremely large number of virtual products by several orders of magnitude so that the resulting list of molecules becomes more manageable for further, more elaborate and time-consuming analysis steps. Results of a case study are presented and discussed, which lead to some general conclusions for an efficient expansion of the chemical space to be screened in pharmaceutical companies.

  18. Oversampling the Minority Class in the Feature Space.

    Science.gov (United States)

    Perez-Ortiz, Maria; Gutierrez, Pedro Antonio; Tino, Peter; Hervas-Martinez, Cesar

    2016-09-01

    The imbalanced nature of some real-world data is one of the current challenges for machine learning researchers. One common approach oversamples the minority class through convex combination of its patterns. We explore the general idea of synthetic oversampling in the feature space induced by a kernel function (as opposed to input space). If the kernel function matches the underlying problem, the classes will be linearly separable and synthetically generated patterns will lie on the minority class region. Since the feature space is not directly accessible, we use the empirical feature space (EFS) (a Euclidean space isomorphic to the feature space) for oversampling purposes. The proposed method is framed in the context of support vector machines, where the imbalanced data sets can pose a serious hindrance. The idea is investigated in three scenarios: 1) oversampling in the full and reduced-rank EFSs; 2) a kernel learning technique maximizing the data class separation to study the influence of the feature space structure (implicitly defined by the kernel function); and 3) a unified framework for preferential oversampling that spans some of the previous approaches in the literature. We support our investigation with extensive experiments over 50 imbalanced data sets.
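The basic operation described above, oversampling a minority class through convex combinations of its patterns, can be sketched as follows. Note the simplification: this sketch works directly in input space, whereas the paper performs the combination in the empirical feature space induced by a kernel. The function name and the toy data are assumptions for illustration only.

```python
import random


def oversample_minority(minority, n_new, rng=None):
    """Create n_new synthetic points as convex combinations of random pairs.

    Each synthetic point is lam * a + (1 - lam) * b for two distinct
    minority patterns a and b and a random weight lam in [0, 1).
    """
    rng = rng or random.Random(0)  # fixed seed keeps the sketch reproducible
    synthetic = []
    for _ in range(n_new):
        a, b = rng.sample(minority, 2)
        lam = rng.random()
        synthetic.append([lam * x + (1.0 - lam) * y for x, y in zip(a, b)])
    return synthetic


# Toy minority class: three 2D patterns forming a triangle.
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
new_points = oversample_minority(minority, n_new=5)
```

Every generated point lies on a segment between two existing minority patterns, so it stays inside their convex hull. Whether that hull matches the true minority region depends on the space in which the combination is taken, which is the motivation for moving to a kernel-induced space.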

  19. Quantum magnification of classical sub-Planck phase space features

    International Nuclear Information System (INIS)

    Hensinger, W.K.; Heckenberg, N.; Rubinsztein-Dunlop, H.; Delande, D.

    2002-01-01

    To understand the relationship between quantum mechanics and classical physics a crucial question to be answered is how distinct classical dynamical phase space features translate into the quantum picture. This problem becomes even more interesting if these phase space features occupy a much smaller volume than ℎ in a phase space spanned by two non-commuting variables such as position and momentum. The question whether phase space structures in quantum mechanics associated with sub-Planck scales have physical signatures has recently evoked a lot of discussion. Here we will show that sub-Planck classical dynamical phase space structures, for example regions of regular motion, can give rise to states whose phase space representation is of size ℎ or larger. This is illustrated using period-1 regions of regular motion (modes of oscillatory motion of a particle in a modulated well) whose volume is distinctly smaller than Planck's constant. They are magnified in the quantum picture and appear as states whose phase space representation is of size ℎ or larger. Cold atoms provide an ideal test bed to probe such fundamental aspects of quantum and classical dynamics. In the experiment a Bose-Einstein condensate is loaded into a far detuned optical lattice. The lattice depth is modulated, resulting in the emergence of regions of regular motion surrounded by chaotic motion in the phase space spanned by position and momentum of the atoms along the standing wave. Sub-Planck scaled phase space features in the classical phase space are magnified and appear as distinct broad peaks in the atomic momentum distribution. The corresponding quantum analysis shows states of size ħ which can be associated with much smaller classical dynamical phase space features. This effect may be considered as the dynamical equivalent of the Goldstone and Jaffe theorem, which predicts the existence of at least one bound state at a bend in a two or three dimensional spatial potential

  20. Software for objective comparison of vocal acoustic features over weeks of audio recording: KLFromRecordingDays

    Directory of Open Access Journals (Sweden)

    Ken Soderstrom

    2017-01-01

    KLFromRecordingDays allows measurement of Kullback–Leibler (KL) distances between 2D probability distributions of vocal acoustic features. Greater KL distance measures reflect increased phonological divergence across the vocalizations compared. The software has been used to compare *.wav file recordings made by Sound Analysis Recorder 2011 of songbird vocalizations pre- and post-drug and surgical manipulations. Recordings from individual animals in *.wav format are first organized into subdirectories by recording day and then segmented into individual syllables uttered and acoustic features of these syllables using Sound Analysis Pro 2011 (SAP). KLFromRecordingDays uses syllable acoustic feature data output by SAP to a MySQL table to generate and compare “template” (typically pre-treatment) and “target” (typically post-treatment) probability distributions. These distributions are a series of virtual 2D plots of the duration of each syllable (as x-axis) to each of 13 other acoustic features measured by SAP for that syllable (as y-axes). Differences between “template” and “target” probability distributions for each acoustic feature are determined by calculating KL distance, a measure of divergence of the target 2D distribution pattern from that of the template. KL distances and the mean KL distance across all acoustic features are calculated for each recording day and output to an Excel spreadsheet. Resulting data for individual subjects may then be pooled across treatment groups and graphically summarized and used for statistical comparisons. Because SAP-generated MySQL files are accessed directly, data limits associated with spreadsheet output are avoided, and the totality of vocal output over weeks may be objectively analyzed all at once. The software has been useful for measuring drug effects on songbird vocalizations and assessing recovery from damage to regions of vocal motor cortex. It may be useful in studies employing other

  1. Software for objective comparison of vocal acoustic features over weeks of audio recording: KLFromRecordingDays

    Science.gov (United States)

    Soderstrom, Ken; Alalawi, Ali

    KLFromRecordingDays allows measurement of Kullback-Leibler (KL) distances between 2D probability distributions of vocal acoustic features. Greater KL distance measures reflect increased phonological divergence across the vocalizations compared. The software has been used to compare *.wav file recordings made by Sound Analysis Recorder 2011 of songbird vocalizations pre- and post-drug and surgical manipulations. Recordings from individual animals in *.wav format are first organized into subdirectories by recording day and then segmented into individual syllables uttered and acoustic features of these syllables using Sound Analysis Pro 2011 (SAP). KLFromRecordingDays uses syllable acoustic feature data output by SAP to a MySQL table to generate and compare "template" (typically pre-treatment) and "target" (typically post-treatment) probability distributions. These distributions are a series of virtual 2D plots of the duration of each syllable (as x-axis) to each of 13 other acoustic features measured by SAP for that syllable (as y-axes). Differences between "template" and "target" probability distributions for each acoustic feature are determined by calculating KL distance, a measure of divergence of the target 2D distribution pattern from that of the template. KL distances and the mean KL distance across all acoustic features are calculated for each recording day and output to an Excel spreadsheet. Resulting data for individual subjects may then be pooled across treatment groups and graphically summarized and used for statistical comparisons. Because SAP-generated MySQL files are accessed directly, data limits associated with spreadsheet output are avoided, and the totality of vocal output over weeks may be objectively analyzed all at once. The software has been useful for measuring drug effects on songbird vocalizations and assessing recovery from damage to regions of vocal motor cortex. It may be useful in studies employing other species, and as part of speech
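The comparison described above, a KL distance between a "template" and a "target" 2D distribution of syllable duration against one acoustic feature, can be sketched as follows. This is not the KLFromRecordingDays code. The bin edges, the toy syllable data, the smoothing constant `eps` and the direction of the divergence (target relative to template) are all assumptions made for illustration.

```python
import math


def histogram_2d(points, x_edges, y_edges):
    """Bin (x, y) points on a grid and normalize the counts to probabilities."""
    nx, ny = len(x_edges) - 1, len(y_edges) - 1
    counts = [[0.0] * ny for _ in range(nx)]
    for x, y in points:
        i = next((k for k in range(nx) if x_edges[k] <= x < x_edges[k + 1]), None)
        j = next((k for k in range(ny) if y_edges[k] <= y < y_edges[k + 1]), None)
        if i is not None and j is not None:  # points outside the grid are dropped
            counts[i][j] += 1.0
    total = sum(map(sum, counts))
    return [[c / total for c in row] for row in counts]


def kl_distance(p, q, eps=1e-9):
    """KL divergence D(p || q) in bits; eps smoothing avoids log(0)."""
    flat_p = [v + eps for row in p for v in row]
    flat_q = [v + eps for row in q for v in row]
    sum_p, sum_q = sum(flat_p), sum(flat_q)
    return sum((a / sum_p) * math.log2((a / sum_p) / (b / sum_q))
               for a, b in zip(flat_p, flat_q))


# Toy data: (syllable duration in ms, value of one acoustic feature).
template_points = [(10, 1.0), (12, 1.2), (30, 3.0), (32, 3.1)]
target_points = [(10, 1.0), (30, 1.1), (31, 3.0), (33, 3.2)]
x_edges, y_edges = [0, 20, 40], [0, 2, 4]

template = histogram_2d(template_points, x_edges, y_edges)
target = histogram_2d(target_points, x_edges, y_edges)
distance = kl_distance(target, template)
```

In the workflow the abstract describes, one such distance would be computed per acoustic feature and per recording day, and the per-feature values averaged to give the mean KL distance.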

  2. Balancing Audio

    DEFF Research Database (Denmark)

    Walther-Hansen, Mads

    2016-01-01

    This paper explores the concept of balance in music production and examines the role of conceptual metaphors in reasoning about audio editing. Balance may be the most central concept in record production, however, the way we cognitively understand and respond meaningfully to a mix requiring balance is not thoroughly understood. In this paper I treat balance as a metaphor that we use to reason about several different actions in music production, such as adjusting levels, editing the frequency spectrum or the spatiality of the recording. This study is based on an exploration of a linguistic corpus of sound...

  3. Space Station services and design features for users

    Science.gov (United States)

    Kurzhals, Peter R.; Mckinney, Royce L.

    1987-01-01

    The operational design features and services planned for the NASA Space Station will furnish, in addition to novel opportunities and facilities, lower costs through interface standardization and automation and faster access by means of computer-aided integration and control processes. By furnishing a basis for large-scale space exploitation, the Space Station will possess industrial production and operational services capabilities that may be used by the private sector for commercial ventures; it could also ultimately support lunar and planetary exploration spacecraft assembly and launch facilities.

  4. The features of space-planning and outfitting decisions

    Energy Technology Data Exchange (ETDEWEB)

    Voronov, N.A.; Bezrukov, A.K.

    1982-01-01

    The features of space-planning and outfitting solutions for a primary housing which was assembled with the No 1 auxiliary housing are examined. The primary factors which have given rise to an unusual design decision on the depth of the structure of the main housing (12 meters) are noted.

  5. Space moving target detection using time domain feature

    Science.gov (United States)

    Wang, Min; Chen, Jin-yong; Gao, Feng; Zhao, Jin-yu

    2018-01-01

    Traditional space target detection methods mainly use the spatial characteristics of the star map to detect targets, and so cannot make full use of the time domain information. This paper presents a new space moving target detection method based on time domain features. We first construct the time spectral data of the star map, then analyze the time domain features of the main objects (target, stars and the background) in star maps, and finally detect the moving targets using the single pulse feature of the time domain signal. Experimental results on real star maps show that the proposed method can effectively detect the trajectory of moving targets in the star map sequence, and the detection probability reaches 99% when the false alarm rate is about 8×10⁻⁵, which outperforms the compared algorithms.

  6. pyAudioAnalysis: An Open-Source Python Library for Audio Signal Analysis.

    Science.gov (United States)

    Giannakopoulos, Theodoros

    2015-01-01

    Audio information plays a rather important role in the increasing digital content that is available today, resulting in a need for methodologies that automatically analyze such content: audio event recognition for home automations and surveillance systems, speech recognition, music information retrieval, multimodal analysis (e.g. audio-visual analysis of online videos for content-based recommendation), etc. This paper presents pyAudioAnalysis, an open-source Python library that provides a wide range of audio analysis procedures including: feature extraction, classification of audio signals, supervised and unsupervised segmentation and content visualization. pyAudioAnalysis is licensed under the Apache License and is available at GitHub (https://github.com/tyiannak/pyAudioAnalysis/). Here we present the theoretical background behind the wide range of the implemented methodologies, along with evaluation metrics for some of the methods. pyAudioAnalysis has been already used in several audio analysis research applications: smart-home functionalities through audio event detection, speech emotion recognition, depression classification based on audio-visual features, music segmentation, multimodal content-based movie recommendation and health applications (e.g. monitoring eating habits). The feedback provided from all these particular audio applications has led to practical enhancement of the library.

  7. The feature-weighted receptive field: an interpretable encoding model for complex feature spaces.

    Science.gov (United States)

    St-Yves, Ghislain; Naselaris, Thomas

    2017-06-20

    We introduce the feature-weighted receptive field (fwRF), an encoding model designed to balance expressiveness, interpretability and scalability. The fwRF is organized around the notion of a feature map, a transformation of visual stimuli into visual features that preserves the topology of visual space (but not necessarily the native resolution of the stimulus). The key assumption of the fwRF model is that activity in each voxel encodes variation in a spatially localized region across multiple feature maps. This region is fixed for all feature maps; however, the contribution of each feature map to voxel activity is weighted. Thus, the model has two separable sets of parameters: "where" parameters that characterize the location and extent of pooling over visual features, and "what" parameters that characterize tuning to visual features. The "where" parameters are analogous to classical receptive fields, while "what" parameters are analogous to classical tuning functions. By treating these as separable parameters, the fwRF model complexity is independent of the resolution of the underlying feature maps. This makes it possible to estimate models with thousands of high-resolution feature maps from relatively small amounts of data. Once a fwRF model has been estimated from data, spatial pooling and feature tuning can be read off directly with no (or very little) additional post-processing or in-silico experimentation. We describe an optimization algorithm for estimating fwRF models from data acquired during standard visual neuroimaging experiments. We then demonstrate the model's application to two distinct sets of features: Gabor wavelets and features supplied by a deep convolutional neural network. We show that when Gabor feature maps are used, the fwRF model recovers receptive fields and spatial frequency tuning functions consistent with known organizational principles of the visual cortex. We also show that a fwRF model can be used to regress entire deep
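The separation between "where" and "what" parameters described above can be made concrete with a small sketch of the forward prediction for one voxel. It is an illustration under stated assumptions, not the authors' implementation: the pooling region is assumed to be an isotropic 2D Gaussian, all feature maps share one resolution, and the function names and toy values are hypothetical. The estimation procedure is not shown.

```python
import math


def gaussian_pooling_field(size, cx, cy, sigma):
    """'Where' parameters: a normalized 2D Gaussian over a size x size grid."""
    g = [[math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2.0 * sigma ** 2))
          for x in range(size)] for y in range(size)]
    total = sum(map(sum, g))
    return [[v / total for v in row] for row in g]


def fwrf_predict(feature_maps, pooling_field, weights):
    """Predicted voxel activity: weighted sum of spatially pooled feature maps.

    The same pooling field is applied to every feature map; the 'what'
    parameters are the per-map weights.
    """
    response = 0.0
    for fmap, w in zip(feature_maps, weights):
        pooled = sum(g * f
                     for g_row, f_row in zip(pooling_field, fmap)
                     for g, f in zip(g_row, f_row))
        response += w * pooled
    return response


# Toy example: two constant 8x8 feature maps and one voxel.
size = 8
feature_maps = [[[1.0] * size for _ in range(size)],
                [[2.0] * size for _ in range(size)]]
field = gaussian_pooling_field(size, cx=3.5, cy=3.5, sigma=1.5)
activity = fwrf_predict(feature_maps, field, weights=[0.5, -1.0])
```

The number of free parameters here is three for the pooling field plus one weight per feature map, regardless of the grid size, which mirrors the abstract's point that model complexity does not grow with feature map resolution.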

  8. Features of the Gravity Probe B Space Vehicle

    Science.gov (United States)

    Reeve, William; Green, Gaylord

    2007-04-01

    Space vehicle performance enabled successful relativity data collection throughout the Gravity Probe B mission. Precision pointing and drag-free translation control was maintained using proportional helium micro-thrusters. Electrical power was provided by rigid, double sided solar arrays. The 1.8 kelvin science instrument temperature was maintained using the largest cryogenic liquid helium dewar ever flown in space. The flight software successfully performed autonomous operations and safemode protection. Features of the Gravity Probe B Space Vehicle mechanisms include: 1) sixteen helium micro-thrusters, the first proportional thrusters flown in space, and large-orifice thruster isolation valves, 2) seven precision and high-authority mass trim mechanisms, 3) four non-pyrotechnic, highly reliable solar array deployment and release mechanism sets. Early incremental prototyping was used extensively to reduce spacecraft development risk. All spacecraft systems were redundant and provided multiple failure tolerance in critical systems. Lockheed Martin performed the spacecraft design, systems engineering, hardware and software integration, environmental testing and launch base operations, as well as on-orbit operations support for the Gravity Probe B space science experiment.

  9. A Method to Detect AAC Audio Forgery

    Directory of Open Access Journals (Sweden)

    Qingzhong Liu

    2015-08-01

    Advanced Audio Coding (AAC), a standardized lossy compression scheme for digital audio, which was designed to be the successor of the MP3 format, generally achieves better sound quality than MP3 at similar bit rates. While AAC is also the default or standard audio format for many devices and AAC audio files may be presented as important digital evidence, the authentication of the audio files is highly needed but relatively missing. In this paper, we propose a scheme to expose tampered AAC audio streams that are encoded at the same encoding bit-rate. Specifically, we design a shift-recompression based method to retrieve the differential features between the re-encoded audio stream at each shifting and the original audio stream; a learning classifier is then employed to recognize different patterns of differential features of the doctored forgery files and original (untouched) audio files. Experimental results show that our approach is very promising and effective to detect the forgery of the same encoding bit-rate on AAC audio streams. Our study also shows that shift recompression-based differential analysis is very effective for detection of the MP3 forgery at the same bit rate.

  10. Feature extraction algorithm for space targets based on fractal theory

    Science.gov (United States)

    Tian, Balin; Yuan, Jianping; Yue, Xiaokui; Ning, Xin

    2007-11-01

    In order to extend the life of satellites and reduce launch and operating costs, satellite servicing, including conducting repairs, upgrading and refueling spacecraft on orbit, is becoming much more frequent. Future space operations can be executed more economically and reliably using machine vision systems, which can meet the real-time and tracking-reliability requirements for image tracking in space surveillance systems. Machine vision has been applied to research on the relative pose of spacecraft, and feature extraction is the basis of relative pose determination. In this paper, a fractal-geometry-based edge extraction algorithm is presented that can be used in determining and tracking the relative pose of an observed satellite during proximity operations in a machine vision system. The method obtains a gray-level image of the fractal dimension distribution, using the Differential Box-Counting (DBC) approach of fractal theory to suppress noise. After this, we detect the consecutive edge using Mathematical Morphology. The validity of the proposed method is examined by processing and analyzing images of space targets. The edge extraction method not only extracts the outline of the target, but also keeps the inner details. Meanwhile, edge extraction is performed only in the moving area, which greatly reduces computation. Simulations compare edge detection using the presented method with other detection methods. The results indicate that the presented algorithm is a valid method for solving relative pose problems for spacecraft.
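The Differential Box-Counting step named above can be sketched as follows. This is a generic version of the DBC estimate for a single square gray-level patch, not the authors' pipeline: it assumes the patch side is divisible by every grid size, uses 256 gray levels, and returns one global dimension, whereas the abstract describes building a whole image of local fractal dimensions before the morphological edge detection (not shown). Function names are hypothetical.

```python
import math


def box_count(image, s, gray_levels=256):
    """Differential box count N_r for grid cells of s x s pixels."""
    m = len(image)
    h = s * gray_levels / m  # box height in gray-level units
    total = 0
    for r0 in range(0, m - s + 1, s):
        for c0 in range(0, m - s + 1, s):
            block = [image[r][c]
                     for r in range(r0, r0 + s)
                     for c in range(c0, c0 + s)]
            # Number of boxes needed to cover the cell's min..max intensity.
            total += int(max(block) // h) - int(min(block) // h) + 1
    return total


def fractal_dimension(image, sizes=(2, 4, 8)):
    """Least-squares slope of log N_r against log(1/r), with r = s / M."""
    m = len(image)
    xs = [math.log(m / s) for s in sizes]
    ys = [math.log(box_count(image, s)) for s in sizes]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    return (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
            / sum((x - mx) ** 2 for x in xs))


# A perfectly flat 16x16 patch behaves like a smooth 2D surface.
flat_patch = [[100] * 16 for _ in range(16)]
dimension = fractal_dimension(flat_patch)
```

Smooth regions give values near 2 and rough regions give values closer to 3, so a map of local dimensions separates textured structure from flat background.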

  11. Semantic Context Detection Using Audio Event Fusion

    Directory of Open Access Journals (Sweden)

    Cheng Wen-Huang

    2006-01-01

    Semantic-level content analysis is a crucial issue in achieving efficient content retrieval and management. We propose a hierarchical approach that models audio events over a time series in order to accomplish semantic context detection. Two levels of modeling, audio event and semantic context modeling, are devised to bridge the gap between physical audio features and semantic concepts. In this work, hidden Markov models (HMMs) are used to model four representative audio events, that is, gunshot, explosion, engine, and car braking, in action movies. At the semantic context level, generative (ergodic hidden Markov model) and discriminative (support vector machine (SVM)) approaches are investigated to fuse the characteristics and correlations among audio events, which provide cues for detecting gunplay and car-chasing scenes. The experimental results demonstrate the effectiveness of the proposed approaches and provide a preliminary framework for information mining by using audio characteristics.

  12. Digital audio watermarking fundamentals, techniques and challenges

    CERN Document Server

    Xiang, Yong; Yan, Bin

    2017-01-01

    This book offers comprehensive coverage on the most important aspects of audio watermarking, from classic techniques to the latest advances, from commonly investigated topics to emerging research subdomains, and from the research and development achievements to date, to current limitations, challenges, and future directions. It also addresses key topics such as reversible audio watermarking, audio watermarking with encryption, and imperceptibility control methods. The book sets itself apart from the existing literature in three main ways. Firstly, it not only reviews classical categories of audio watermarking techniques, but also provides detailed descriptions, analysis and experimental results of the latest work in each category. Secondly, it highlights the emerging research topic of reversible audio watermarking, including recent research trends, unique features, and the potentials of this subdomain. Lastly, the joint consideration of audio watermarking and encryption is also reviewed. With the help of this...

  13. Efficient Divide-And-Conquer Classification Based on Feature-Space Decomposition

    OpenAIRE

    Guo, Qi; Chen, Bo-Wei; Jiang, Feng; Ji, Xiangyang; Kung, Sun-Yuan

    2015-01-01

    This study presents a divide-and-conquer (DC) approach based on feature space decomposition for classification. When large-scale datasets are present, typical approaches usually employed truncated kernel methods on the feature space or DC approaches on the sample space. However, this did not guarantee separability between classes, owing to overfitting. To overcome such problems, this work proposes a novel DC approach on feature spaces consisting of three steps. Firstly, we divide the feature ...

  14. Audio Conferencing Enhancements

    OpenAIRE

    VESTERINEN, LEENA

    2006-01-01

    Audio conferencing allows multiple people in distant locations to interact in a single voice call. Whilst it can be a very useful service, it also has several key disadvantages. This thesis study investigated the options for improving the user experience of mobile teleconferencing applications. In particular, the use of 3D, spatial audio and visual-interactive functionality was investigated as the means of improving the intelligibility and audio perception during the audio...

  15. Back to basics audio

    CERN Document Server

    Nathan, Julian

    1998-01-01

    Back to Basics Audio is a thorough, yet approachable handbook on audio electronics theory and equipment. The first part of the book discusses electrical and audio principles. Those principles form a basis for understanding the operation of equipment and systems, covered in the second section. Finally, the author addresses planning and installation of a home audio system. Julian Nathan joined the audio service and manufacturing industry in 1954 and moved into motion picture engineering and production in 1960. He installed and operated recording theaters in Sydney, Austra

  16. Detecting double compression of audio signal

    Science.gov (United States)

    Yang, Rui; Shi, Yun Q.; Huang, Jiwu

    2010-01-01

    MP3 is the most popular audio format in daily life; for example, music downloaded from the Internet and files saved in digital recorders are often in MP3 format. However, low bitrate MP3s are often transcoded to a high bitrate, since high bitrate files are of high commercial value. Also, audio recordings in digital recorders can be doctored easily with pervasive audio editing software. This paper presents two methods for the detection of double MP3 compression. The methods are essential for finding fake-quality MP3s and for audio forensics. The proposed methods use support vector machine classifiers with feature vectors formed by the distributions of the first digits of the quantized MDCT (modified discrete cosine transform) coefficients. Extensive experiments demonstrate the effectiveness of the proposed methods. To the best of our knowledge, this piece of work is the first one to detect double compression of audio signal.
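The feature vector described above, the distribution of first digits of quantized MDCT coefficients, can be sketched in a few lines. The sketch assumes the quantized coefficients have already been obtained from an MP3 decoder, which is not shown; the function name and the sample values are hypothetical, and the support vector machine stage is omitted.

```python
def first_digit_distribution(coefficients):
    """Relative frequency of leading digits 1..9 among nonzero coefficients."""
    counts = [0] * 9
    for c in coefficients:
        c = abs(int(c))
        if c == 0:
            continue  # zero has no leading digit
        while c >= 10:
            c //= 10
        counts[c - 1] += 1
    total = sum(counts)
    return [n / total for n in counts] if total else [0.0] * 9


# Hypothetical quantized MDCT coefficients from one frame.
features = first_digit_distribution([12, -3, 0, 150, 7, 19, -240, 1])
```

The resulting nine-element vector is the kind of input a classifier could be trained on to separate singly from doubly compressed files.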

  17. Frequency-Dependent Amplitude Panning for the Stereophonic Image Enhancement of Audio Recorded Using Two Closely Spaced Microphones

    Directory of Open Access Journals (Sweden)

    Chan Jun Chun

    2016-02-01

    In this paper, we propose a new frequency-dependent amplitude panning method for stereophonic image enhancement applied to a sound source recorded using two closely spaced omni-directional microphones. The ability to detect the direction of such a sound source is limited due to weak spatial information, such as the inter-channel time difference (ICTD) and inter-channel level difference (ICLD). Moreover, when sound sources are recorded in a convolutive or a real room environment, the detection of sources is affected by reverberation effects. Thus, the proposed method first tries to estimate the source direction depending on the frequency using azimuth-frequency analysis. Then, a frequency-dependent amplitude panning technique is proposed to enhance the stereophonic image by modifying the stereophonic law of sines. To demonstrate the effectiveness of the proposed method, we compare its performance with that of a conventional method based on the beamforming technique in terms of directivity pattern, perceived direction, and quality degradation under three different recording conditions (anechoic, convolutive, and real reverberant). The comparison shows that the proposed method gives us better stereophonic images in a stereo loudspeaker reproduction than the conventional method without any annoying effects.
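For orientation, the unmodified stereophonic law of sines that the method starts from can be sketched as below. It relates the perceived source angle φ to the loudspeaker gains by sin φ / sin φ0 = (gL − gR) / (gL + gR), where φ0 is the loudspeaker half-angle. The paper's modification of this law is not reproduced here. The sign convention (positive angles toward the left loudspeaker), the 30° half-angle and the constant-power normalization are assumptions for illustration.

```python
import math


def sine_law_gains(source_deg, speaker_deg=30.0):
    """Left/right gains placing a source at source_deg by the law of sines.

    Gains are normalized so that g_left**2 + g_right**2 == 1.
    """
    ratio = (math.sin(math.radians(source_deg))
             / math.sin(math.radians(speaker_deg)))
    g_left, g_right = 1.0 + ratio, 1.0 - ratio
    norm = math.hypot(g_left, g_right)
    return g_left / norm, g_right / norm


center = sine_law_gains(0.0)       # source straight ahead
hard_left = sine_law_gains(30.0)   # source at the left loudspeaker
```

In a frequency-dependent scheme such as the one the abstract describes, a direction would be estimated per frequency band and gains of this kind applied band by band.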

  18. Advances in audio source separation and multisource audio content retrieval

    Science.gov (United States)

    Vincent, Emmanuel

    2012-06-01

    Audio source separation aims to extract the signals of individual sound sources from a given recording. In this paper, we review three recent advances which improve the robustness of source separation in real-world challenging scenarios and enable its use for multisource content retrieval tasks, such as automatic speech recognition (ASR) or acoustic event detection (AED) in noisy environments. We present a Flexible Audio Source Separation Toolkit (FASST) and discuss its advantages compared to earlier approaches such as independent component analysis (ICA) and sparse component analysis (SCA). We explain how cues as diverse as harmonicity, spectral envelope, temporal fine structure or spatial location can be jointly exploited by this toolkit. We subsequently present the uncertainty decoding (UD) framework for the integration of audio source separation and audio content retrieval. We show how the uncertainty about the separated source signals can be accurately estimated and propagated to the features. Finally, we explain how this uncertainty can be efficiently exploited by a classifier, both at the training and the decoding stage. We illustrate the resulting performance improvements in terms of speech separation quality and speaker recognition accuracy.

  19. AudioMUD: a multiuser virtual environment for blind people.

    Science.gov (United States)

    Sánchez, Jaime; Hassler, Tiago

    2007-03-01

    A number of virtual environments have been developed in recent years. Among them are some applications for blind people based on different types of audio, from simple sounds to 3-D audio. In this study, we pursued a different approach. We designed AudioMUD by using spoken text to describe the environment, navigation, and interaction. We have also introduced some collaborative features into the interaction between blind users. The core of a multiuser MUD game is a networked textual virtual environment. We developed AudioMUD by adding some collaborative features to the basic idea of a MUD and placed a simulated virtual environment inside the human body. This paper presents the design and usability evaluation of AudioMUD. Blind learners were motivated when they interacted with AudioMUD and helped to improve the interaction through audio and interface design elements.

  20. Musical Audio Synthesis Using Autoencoding Neural Nets

    OpenAIRE

    Sarroff, Andy; Casey, Michael A.

    2014-01-01

    With an optimal network topology and tuning of hyperparameters, artificial neural networks (ANNs) may be trained to learn a mapping from low level audio features to one or more higher-level representations. Such artificial neural networks are commonly used in classification and regression settings to perform arbitrary tasks. In this work we suggest repurposing autoencoding neural networks as musical audio synthesizers. We offer an interactive musical audio synt...

  1. Feature-space transformation improves supervised segmentation across scanners

    DEFF Research Database (Denmark)

    van Opbroek, Annegreet; Achterberg, Hakim C.; de Bruijne, Marleen

    2015-01-01

    Image-segmentation techniques based on supervised classification generally perform well on the condition that training and test samples have the same feature distribution. However, if training and test images are acquired with different scanners or scanning parameters, their feature distributions...

  2. Web Audio/Video Streaming Tool

    Science.gov (United States)

    Guruvadoo, Eranna K.

    2003-01-01

    In order to promote a NASA-wide educational outreach program to educate and inform the public of space exploration, NASA, at Kennedy Space Center, is seeking efficient ways to add more content to the web by streaming audio/video files. This project proposes a high level overview of a framework for the creation, management, and scheduling of audio/video assets over the web. To support short-term goals, the prototype of a web-based tool is designed and demonstrated to automate the process of streaming audio/video files. The tool provides web-enabled user interfaces to manage video assets, create publishable schedules of video assets for streaming, and schedule the streaming events. These operations are performed on user-defined and system-derived metadata of audio/video assets stored in a relational database while the assets reside on a separate repository. The prototype tool is designed using ColdFusion 5.0.

  3. Roundtable Audio Discussion

    Directory of Open Access Journals (Sweden)

    Chris Bigum

    2007-01-01

    RoundTable on Technology, Teaching and Tools. This is a roundtable audio interview conducted by James Farmer, founder of Edublogs, with Anne Bartlett-Bragg (University of Technology Sydney) and Chris Bigum (Deakin University). Skype was used to make and record the audio conference and the resulting sound file was edited by Andrew McLauchlan.

  4. Space discretization in SN methods: Features, improvements and convergence patterns

    International Nuclear Information System (INIS)

    Coppa, G.G.M.; Lapenta, G.; Ravetto, P.

    1990-01-01

    A comparative analysis of the space discretization schemes currently used in SN methods is performed and special attention is devoted to direct integration techniques. Some improvements are proposed in one- and two-dimensional applications, which are based on suitable choices for the spatial variation of the collision source. A study of the convergence pattern is carried out for eigenvalue calculations within the space asymptotic approximation by means of both analytical and numerical investigations. (orig.)

  5. AUTOMATIC SEGMENTATION OF BROADCAST AUDIO SIGNALS USING AUTO ASSOCIATIVE NEURAL NETWORKS

    Directory of Open Access Journals (Sweden)

    P. Dhanalakshmi

    2010-12-01

    In this paper, we describe automatic segmentation methods for audio broadcast data. Today, digital audio applications are part of our everyday lives. Since there are more and more digital audio databases in place these days, the importance of effective management of audio databases has become prominent. Broadcast audio data is recorded from television and comprises various categories of audio signals. Efficient algorithms for segmenting the audio broadcast data into predefined categories are proposed. Audio features, namely linear prediction coefficients (LPC), linear prediction cepstral coefficients, and Mel frequency cepstral coefficients (MFCC), are extracted to characterize the audio data. Auto Associative Neural Networks are used to segment the audio data into predefined categories using the extracted features. Experimental results indicate that the proposed algorithms can produce satisfactory results.
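
    As an illustration of one of the features named above, linear prediction coefficients for a single audio frame can be estimated with the autocorrelation method and the Levinson-Durbin recursion. The abstract does not state which estimation method the authors use, so the sketch below is one common choice rather than their procedure; the function names are hypothetical, and framing and windowing are left out.

```python
def autocorrelation(frame, max_lag):
    """Biased autocorrelation r[0..max_lag] of one frame (no normalization)."""
    return [
        sum(frame[n] * frame[n - lag] for n in range(lag, len(frame)))
        for lag in range(max_lag + 1)
    ]


def lpc(frame, order):
    """Levinson-Durbin recursion on the frame's autocorrelation.

    Returns (a, err), where a = [1, a1, ..., a_order] are the coefficients
    of the prediction-error filter A(z) = 1 + a1*z^-1 + ..., so that
    frame[n] is predicted by -(a1*frame[n-1] + ... ), and err is the
    residual prediction-error energy.
    """
    r = autocorrelation(frame, order)
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        if err == 0.0:  # silent or perfectly predicted frame
            break
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err  # reflection coefficient for this order
        updated = a[:]
        for j in range(1, i):
            updated[j] = a[j] + k * a[i - j]
        updated[i] = k
        a = updated
        err *= 1.0 - k * k
    return a, err
```

    For a decaying exponential frame such as `[0.9 ** n for n in range(200)]`, a first-order call returns a coefficient close to -0.9, matching the one-pole structure of that signal.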

  6. EVALUATION OF USER SATISFACTION WITH THE AUDIO BOOKS APPLICATION

    Directory of Open Access Journals (Sweden)

    Raditya Maulana Anuraga

    2017-02-01

    Listeno is the first audio book application in Indonesia, letting users get books in audio form and listen to them as they would to music. Listeno faces several problems: the requested offline mode has not yet been released, the security of its MP3 files needs attention, and the target of 100,000 active users has not yet been reached. This research aims to evaluate user satisfaction with the Audio Books application using Nielsen's approach. The analysis uses Importance Performance Analysis (IPA) combined with the User Satisfaction Index (IKP), based on the following indicators: usefulness, utility, usability, learnability, efficiency, memorability, error, and satisfaction. The results show that users of the Audio Books application are quite satisfied, with a calculated IKP of 69.58%.

  7. Structure Learning in Audio

    DEFF Research Database (Denmark)

    Nielsen, Andreas Brinch

    By having information about the setting a user is in, a computer is able to make decisions proactively to facilitate tasks for the user. Two approaches are taken in this thesis to achieve more information about an audio environment. One approach is that of classifying audio, and a new approach...... investigated. A fast and computationally simple approach that compares recordings and classifies whether they are from the same audio environment has been developed, and shows very high accuracy and the ability to synchronize recordings in the case of recording devices which are not connected. A more general model...

  8. CT features of invasion of sublingual space by malignant oropharyngeal tumors

    International Nuclear Information System (INIS)

    Wei Yi; Xiao Jiahe; Zhou Xiangping; Deng Kaihong

    2003-01-01

    Objective: To investigate the CT features of the invasion of sublingual space by malignant oropharyngeal tumors in order to provide more accurate information for clinical treatment. Methods: Fifty-eight cases of pathologically proven malignant oropharyngeal tumors were collected and retrospectively analyzed. Results: Among all the cases, invasion of sublingual space by malignant oropharyngeal tumors could be seen in 14 cases, of which, 7 cases got access to sublingual space through tongue base, 3 cases through parapharyngeal space, 2 cases through pterygomandibular raphe, 2 cases through uncertain routes. Invasion of sublingual space manifested on CT scan as obliteration of fat plane in sublingual space and involvement of the sublingual vessels in the space. Conclusion: Malignant oropharyngeal tumors can invade the adjacent sublingual space via tongue base, pterygomandibular raphe, and parapharyngeal space. The invasion of sublingual space by malignant oropharyngeal tumors manifests in CT as effacement of sublingual fat plane and envelopment of hyoid artery

  9. Improving audio chord transcription by exploiting harmonic and metric knowledge

    NARCIS (Netherlands)

    de Haas, W.B.; Rodrigues Magalhães, J.P.; Wiering, F.

    2012-01-01

    We present a new system for chord transcription from polyphonic musical audio that uses domain-specific knowledge about tonal harmony and metrical position to improve chord transcription performance. Low-level pulse and spectral features are extracted from an audio source using the Vamp plugin

  10. Voice activity detection using audio-visual information

    DEFF Research Database (Denmark)

    Petsatodis, Theodore; Pnevmatikakis, Aristodemos; Boukis, Christos

    2009-01-01

    An audio-visual voice activity detector that uses sensors positioned distantly from the speaker is presented. Its constituting unimodal detectors are based on the modeling of the temporal variation of audio and visual features using Hidden Markov Models; their outcomes are fused using a post...

  11. Built spaces and features associated with user satisfaction in maternity waiting homes in Malawi.

    Science.gov (United States)

    McIntosh, Nathalie; Gruits, Patricia; Oppel, Eva; Shao, Amie

    2018-07-01

    To assess satisfaction with maternity waiting home built spaces and features in women who are at risk for underutilizing maternity waiting homes (i.e. residential facilities that temporarily house near-term pregnant mothers close to healthcare facilities that provide obstetrical care). Specifically we wanted to answer the questions: (1) Are built spaces and features associated with maternity waiting home user satisfaction? (2) Can built spaces and features designed to improve hygiene, comfort, privacy and function improve maternity waiting home user satisfaction? And (3) Which built spaces and features are most important for maternity waiting home user satisfaction? A cross-sectional study comparing satisfaction with standard and non-standard maternity waiting home designs. Between December 2016 and February 2017 we surveyed expectant mothers at two maternity waiting homes that differed in their design of built spaces and features. We used bivariate analyses to assess if built spaces and features were associated with satisfaction. We compared ratings of built spaces and features between the two maternity waiting homes using chi-squares and t-tests to assess if design features to improve hygiene, comfort, privacy and function were associated with higher satisfaction. We used exploratory robust regression analysis to examine the relationship between built spaces and features and maternity waiting home satisfaction. Two maternity waiting homes in Malawi, one that incorporated non-standardized design features to improve hygiene, comfort, privacy, and function (Kasungu maternity waiting home) and the other that had a standard maternity waiting home design (Dowa maternity waiting home). 322 expectant mothers at risk for underutilizing maternity waiting homes (i.e. first-time mothers and those with no pregnancy risk factors) who had stayed at the Kasungu or Dowa maternity waiting homes. 
There were significant differences in ratings of built spaces and features between the

  12. Perceptual Audio Hashing Functions

    Directory of Open Access Journals (Sweden)

    Emin Anarım

    2005-07-01

    Full Text Available Perceptual hash functions provide a tool for fast and reliable identification of content. We present new audio hash functions based on summarization of the time-frequency spectral characteristics of an audio document. The proposed hash functions are based on the periodicity series of the fundamental frequency and on singular-value description of the cepstral frequencies. They are found, on one hand, to perform very satisfactorily in identification and verification tests, and on the other hand, to be very resilient to a large variety of attacks. Moreover, we address the issue of security of hashes and propose a keying technique, and thereby a key-dependent hash function.

  13. DAFX Digital Audio Effects

    CERN Document Server

    2011-01-01

    The rapid development in various fields of Digital Audio Effects, or DAFX, has led to new algorithms and this second edition of the popular book, DAFX: Digital Audio Effects has been updated throughout to reflect progress in the field. It maintains a unique approach to DAFX with a lecture-style introduction into the basics of effect processing. Each effect description begins with the presentation of the physical and acoustical phenomena, an explanation of the signal processing techniques to achieve the effect, followed by a discussion of musical applications and the control of effect parameter

  14. Nonspeech audio in user interfaces for TV

    NARCIS (Netherlands)

    Sluis, van de Richard; Eggen, J.H.; Rypkema, J.A.

    1997-01-01

    This study explores the end-user benefits of using nonspeech audio in television user interfaces. A prototype of an Electronic Programme Guide (EPG) served as a carrier for the research. One of the features of this EPG is the possibility to search for TV programmes in a category-based way. The EPG

  15. A modular CUDA-based framework for scale-space feature detection in video streams

    International Nuclear Information System (INIS)

    Kinsner, M; Capson, D; Spence, A

    2010-01-01

    Multi-scale image processing techniques enable extraction of features where the size of a feature is either unknown or changing, but the requirement to process image data at multiple scale levels imposes a substantial computational load. This paper describes the architecture and emerging results from the implementation of a GPGPU-accelerated scale-space feature detection framework for video processing. A discrete scale-space representation is generated for image frames within a video stream, and multi-scale feature detection metrics are applied to detect ridges and Gaussian blobs at video frame rates. A modular structure is adopted, in which common feature extraction tasks such as non-maximum suppression and local extrema search may be reused across a variety of feature detectors. Extraction of ridge and blob features is achieved at faster than 15 frames per second on video sequences from a machine vision system, utilizing an NVIDIA GTX 480 graphics card. By design, the framework is easily extended to additional feature classes through the inclusion of feature metrics to be applied to the scale-space representation, and using common post-processing modules to reduce the required CPU workload. The framework is scalable across multiple and more capable GPUs, and enables previously intractable image processing at video frame rates using commodity computational hardware.
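
    The non-maximum suppression step mentioned above as a reusable module can be illustrated with a small CPU sketch. This is not the authors' CUDA implementation: it is a plain Python version of the per-pixel test on a single scale level, with a hypothetical function name and an assumed 3x3 neighbourhood and threshold parameter.

```python
def local_maxima(response, threshold=0.0):
    """Return (row, col) positions that survive non-maximum suppression.

    A position is kept when its value exceeds `threshold` and is strictly
    greater than all 8 neighbours. Border pixels are skipped because their
    neighbourhood is incomplete. `response` is a list of equal-length rows
    holding a feature metric (for example a blob or ridge response).
    """
    rows, cols = len(response), len(response[0])
    peaks = []
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            value = response[r][c]
            if value <= threshold:
                continue
            neighbours = [
                response[r + dr][c + dc]
                for dr in (-1, 0, 1)
                for dc in (-1, 0, 1)
                if (dr, dc) != (0, 0)
            ]
            if all(value > n for n in neighbours):
                peaks.append((r, c))
    return peaks
```

    In a scale-space detector the same test would typically also compare against the levels above and below; on a GPU each pixel's test is independent, which is what makes the step easy to parallelize.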

  16. Agency Video, Audio and Imagery Library

    Science.gov (United States)

    Grubbs, Rodney

    2015-01-01

    The purpose of this presentation was to inform the ISS International Partners of the new NASA Agency Video, Audio and Imagery Library (AVAIL) website. AVAIL is a new resource for the public to search for and download NASA-related imagery, and is not intended to replace the current process by which the International Partners receive their Space Station imagery products.

  17. An Incremental Classification Algorithm for Mining Data with Feature Space Heterogeneity

    Directory of Open Access Journals (Sweden)

    Yu Wang

    2014-01-01

    Feature space heterogeneity often exists in many real world data sets so that some features are of different importance for classification over different subsets. Moreover, the pattern of feature space heterogeneity might dynamically change over time as more and more data are accumulated. In this paper, we develop an incremental classification algorithm, Supervised Clustering for Classification with Feature Space Heterogeneity (SCCFSH), to address this problem. In our approach, supervised clustering is implemented to obtain a number of clusters such that samples in each cluster are from the same class. After the removal of outliers, relevance of features in each cluster is calculated based on their variations in this cluster. The feature relevance is incorporated into distance calculation for classification. The main advantage of SCCFSH lies in the fact that it is capable of solving a classification problem with feature space heterogeneity in an incremental way, which is favorable for online classification tasks with continuously changing data. Experimental results on a series of data sets and application to a database marketing problem show the efficiency and effectiveness of the proposed approach.
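
    The idea of per-cluster feature relevance feeding into a distance calculation can be sketched as below. The abstract gives no formula, so the inverse-variance weighting, the normalization, and the function names are all assumptions chosen for illustration; SCCFSH itself may compute relevance differently.

```python
import math


def feature_relevance(cluster, eps=1e-9):
    """Per-feature weights for one cluster, normalized to sum to 1.

    Assumption (not from the paper): a feature that varies little inside
    the cluster is treated as more relevant, using 1 / (variance + eps).
    `cluster` is a list of samples, each a list of feature values.
    """
    n = len(cluster)
    dims = len(cluster[0])
    weights = []
    for d in range(dims):
        column = [sample[d] for sample in cluster]
        mean = sum(column) / n
        variance = sum((v - mean) ** 2 for v in column) / n
        weights.append(1.0 / (variance + eps))
    total = sum(weights)
    return [w / total for w in weights]


def weighted_distance(x, center, weights):
    """Euclidean distance with each squared difference scaled by its weight."""
    return math.sqrt(sum(w * (a - b) ** 2 for w, a, b in zip(weights, x, center)))
```

    A new sample could then be assigned to the cluster, and hence the class, whose centre is nearest under that cluster's own weights.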

  18. Space-dependent step features: Transient breakdown of slow-roll, homogeneity, and isotropy during inflation

    International Nuclear Information System (INIS)

    Lerner, Rose N.; McDonald, John

    2009-01-01

    A step feature in the inflaton potential can model a transient breakdown of slow-roll inflation. Here we generalize the step feature to include space-dependence, allowing it also to model a breakdown of homogeneity and isotropy. The space-dependent inflaton potential generates a classical curvature perturbation mode characterized by the wave number of the step inhomogeneity. For inhomogeneities small compared with the horizon at the step, space-dependence has a small effect on the curvature perturbation. Therefore, the smoothly oscillating quantum power spectrum predicted by the homogeneous step is robust with respect to subhorizon space-dependence. For inhomogeneities equal to or greater than the horizon at the step, the space-dependent classical mode can dominate, producing a curvature perturbation in which modes of wave number determined by the step inhomogeneity are superimposed on the oscillating power spectrum. Generation of a space-dependent step feature may therefore provide a mechanism to introduce primordial anisotropy into the curvature perturbation. Space-dependence also modifies the quantum fluctuations, in particular, via resonancelike features coming from mode coupling to amplified superhorizon modes. However, these effects are small relative to the classical modes.

  19. Portable Audio Design

    DEFF Research Database (Denmark)

    Groth, Sanne Krogh

    2014-01-01

    attention to the specific genre; a grasping of the complex relationship between site and time, the actual and the virtual; and getting acquainted with the specific site’s soundscape by approaching it both intuitively and systematically. These steps will finally lead to an audio production that not only...

  20. Audio Feedback -- Better Feedback?

    Science.gov (United States)

    Voelkel, Susanne; Mello, Luciane V.

    2014-01-01

    National Student Survey (NSS) results show that many students are dissatisfied with the amount and quality of feedback they get for their work. This study reports on two case studies in which we tried to address these issues by introducing audio feedback to one undergraduate (UG) and one postgraduate (PG) class, respectively. In case study one…

  1. Editing Audio with Audacity

    Directory of Open Access Journals (Sweden)

    Brandon Walsh

    2016-08-01

    For those interested in audio, basic sound editing skills go a long way. Being able to handle and manipulate the materials can help you take control of your object of study: you can zoom in and extract particular moments to analyze, process the audio, and upload the materials to a server to complement a blog post on the topic. On a more practical level, these skills could also allow you to record and package recordings of yourself or others for distribution. That guest lecture taking place in your department? Record it and edit it yourself! Doing so is a lightweight way to distribute resources among various institutions, and it also helps make the materials more accessible for readers and listeners with a wide variety of learning needs. In this lesson you will learn how to use Audacity to load, record, edit, mix, and export audio files. Sound editing platforms are often expensive and offer extensive capabilities that can be overwhelming to the first-time user, but Audacity is a free and open source alternative that offers powerful capabilities for sound editing with a low barrier for entry. For this lesson we will work with two audio files: a recording of Bach’s Goldberg Variations available from MusOpen and another recording of your own voice that will be made in the course of the lesson. This tutorial uses Audacity 2.1.2, released January 2016.

  2. Overall feature of EAST operation space by using simple Core-SOL-Divertor model

    International Nuclear Information System (INIS)

    Hiwatari, R.; Hatayama, A.; Zhu, S.; Takizuka, T.; Tomita, Y.

    2005-01-01

    We have developed a simple Core-SOL-Divertor (C-S-D) model to investigate qualitatively the overall features of the operational space for the integrated core and edge plasma. To construct the simple C-S-D model, a simple core plasma model of ITER physics guidelines and a two-point SOL-divertor model are used. The simple C-S-D model is applied to the study of the EAST operational space with lower hybrid current drive experiments under various kinds of trade-off for the basic plasma parameters. Effective methods for extending the operation space are also presented. As shown by this study for the EAST operation space, it is evident that the C-S-D model is a useful tool to understand qualitatively the overall features of the plasma operation space. (author)

  3. Visual scan-path analysis with feature space transient fixation moments

    Science.gov (United States)

    Dempere-Marco, Laura; Hu, Xiao-Peng; Yang, Guang-Zhong

    2003-05-01

    The study of eye movements provides useful insight into the cognitive processes underlying visual search tasks. The analysis of the dynamics of eye movements has often been approached from a purely spatial perspective. In many cases, however, it may not be possible to define meaningful or consistent dynamics without considering the features underlying the scan paths. In this paper, the definition of the feature space has been attempted through the concept of visual similarity and non-linear low dimensional embedding, which defines a mapping from the image space into a low dimensional feature manifold that preserves the intrinsic similarity of image patterns. This has enabled the definition of perceptually meaningful features without the use of domain specific knowledge. Based on this, this paper introduces a new concept called Feature Space Transient Fixation Moments (TFM). The approach presented tackles the problem of feature space representation of visual search through the use of TFM. We demonstrate the practical values of this concept for characterizing the dynamics of eye movements in goal directed visual search tasks. We also illustrate how this model can be used to elucidate the fundamental steps involved in skilled search tasks through the evolution of transient fixation moments.

  4. Large anterior temporal Virchow-Robin spaces: unique MR imaging features

    Energy Technology Data Exchange (ETDEWEB)

    Lim, Anthony T. [Monash University, Neuroradiology Service, Monash Imaging, Monash Health, Melbourne, Victoria (Australia); Chandra, Ronil V. [Monash University, Neuroradiology Service, Monash Imaging, Monash Health, Melbourne, Victoria (Australia); Monash University, Department of Surgery, Faculty of Medicine, Nursing and Health Sciences, Melbourne (Australia); Trost, Nicholas M. [St Vincent's Hospital, Neuroradiology Service, Melbourne (Australia); McKelvie, Penelope A. [St Vincent's Hospital, Anatomical Pathology, Melbourne (Australia); Stuckey, Stephen L. [Monash University, Neuroradiology Service, Monash Imaging, Monash Health, Melbourne, Victoria (Australia); Monash University, Southern Clinical School, Faculty of Medicine, Nursing and Health Sciences, Melbourne (Australia)

    2015-05-01

    Large Virchow-Robin (VR) spaces may mimic cystic tumor. The anterior temporal subcortical white matter is a recently described preferential location, with only 18 reported cases. Our aim was to identify unique MR features that could increase prospective diagnostic confidence. Thirty-nine cases were identified between November 2003 and February 2014. Demographic, clinical data and the initial radiological report were retrospectively reviewed. Two neuroradiologists reviewed all MR imaging; a neuropathologist reviewed histological data. Median age was 58 years (range 24-86 years); the majority (69 %) was female. There were no clinical symptoms that could be directly referable to the lesion. Two thirds were considered to be VR spaces on the initial radiological report. Mean maximal size was 9 mm (range 5-17 mm); majority (79 %) had perilesional T2 or fluid-attenuated inversion recovery (FLAIR) hyperintensity. The following were identified as potential unique MR features: focal cortical distortion by an adjacent branch of the middle cerebral artery (92 %), smaller adjacent VR spaces (26 %), and a contiguous cerebrospinal fluid (CSF) intensity tract (21 %). Surgery was performed in three asymptomatic patients; histopathology confirmed VR spaces. Unique MR features were retrospectively identified in all three patients. Large anterior temporal lobe VR spaces commonly demonstrate perilesional T2 or FLAIR signal and can be misdiagnosed as cystic tumor. Potential unique MR features that could increase prospective diagnostic confidence include focal cortical distortion by an adjacent branch of the middle cerebral artery, smaller adjacent VR spaces, and a contiguous CSF intensity tract. (orig.)

  5. The audio expert everything you need to know about audio

    CERN Document Server

    Winer, Ethan

    2012-01-01

    The Audio Expert is a comprehensive reference that covers all aspects of audio, with many practical, as well as theoretical, explanations. Providing in-depth descriptions of how audio really works, using common sense plain-English explanations and mechanical analogies with minimal math, the book is written for people who want to understand audio at the deepest, most technical level, without needing an engineering degree. It's presented in an easy-to-read, conversational tone, and includes more than 400 figures and photos augmenting the text. The Audio Expert takes th

  6. Wavelet-based audio embedding and audio/video compression

    Science.gov (United States)

    Mendenhall, Michael J.; Claypoole, Roger L., Jr.

    2001-12-01

    Watermarking, traditionally used for copyright protection, is used in a new and exciting way. An efficient wavelet-based watermarking technique embeds audio information into a video signal. Several effective compression techniques are applied to compress the resulting audio/video signal in an embedded fashion. This wavelet-based compression algorithm incorporates bit-plane coding, index coding, and Huffman coding. To demonstrate the potential of this audio embedding and audio/video compression algorithm, we embed an audio signal into a video signal and then compress. Results show that overall compression rates of 15:1 can be achieved. The video signal is reconstructed with a median PSNR of nearly 33 dB. Finally, the audio signal is extracted from the compressed audio/video signal without error.
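
    The PSNR figure quoted above is computed from the mean squared error between the original and reconstructed signal as PSNR = 10 log10(peak^2 / MSE). The sketch below shows that standard definition together with a median over frames; the flat-list frame representation, the 8-bit peak value of 255, and the function names are assumptions for illustration and are not taken from the paper.

```python
import math
import statistics


def psnr(reference, reconstructed, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length sample lists.

    Assumption: samples are 8-bit intensities, so the default peak is 255.
    Returns math.inf when the two inputs are identical.
    """
    if not reference or len(reference) != len(reconstructed):
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((a - b) ** 2 for a, b in zip(reference, reconstructed)) / len(reference)
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(peak * peak / mse)


def median_psnr(reference_frames, reconstructed_frames, peak=255.0):
    """Median of the per-frame PSNR values over a video sequence."""
    values = [
        psnr(ref, rec, peak)
        for ref, rec in zip(reference_frames, reconstructed_frames)
    ]
    return statistics.median(values)
```

    Reporting the median rather than the mean keeps a few unusually good or bad frames from dominating the summary.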

  7. Guiding exploration in conformational feature space with Lipschitz underestimation for ab-initio protein structure prediction.

    Science.gov (United States)

    Hao, Xiaohu; Zhang, Guijun; Zhou, Xiaogen

    2018-04-01

    Computing conformations, which are essential to associate structural and functional information with gene sequences, is challenging due to the high dimensionality and rugged energy surface of the protein conformational space. Consequently, the dimension of the protein conformational space should be reduced to a proper level, and an effective exploring algorithm should be proposed. In this paper, a plug-in method for guiding exploration in conformational feature space with Lipschitz underestimation (LUE) for ab-initio protein structure prediction is proposed. The conformational space is first converted into ultrafast shape recognition (USR) feature space. Based on the USR feature space, the conformational space can be further converted into an underestimation space according to Lipschitz estimation theory for guiding exploration. As a consequence of the use of the underestimation model, the tight lower bound estimate information can be used for exploration guidance, the invalid sampling areas can be eliminated in advance, and the number of energy function evaluations can be reduced. The proposed method provides a novel technique to solve the exploring problem of protein conformational space. LUE is applied to the differential evolution (DE) algorithm and to the Metropolis Monte Carlo (MMC) algorithm available in Rosetta; when LUE is applied to DE and MMC, candidate conformations are screened by the underestimation method prior to energy calculation and selection. Further, LUE is compared with DE and MMC by testing on 15 small-to-medium structurally diverse proteins. Test results show that near-native protein structures with higher accuracy can be obtained more rapidly and efficiently with the use of LUE. Copyright © 2018 Elsevier Ltd. All rights reserved.
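
    The general idea behind a Lipschitz underestimate can be sketched as follows: if the energy changes by at most M per unit distance, then every already-evaluated point x_i with energy f_i gives the lower bound f_i - M * ||x - x_i||, and the maximum over all evaluated points is the tightest such bound. The code below shows only this generic construction and a simple screening rule. The Lipschitz constant, the use of plain feature vectors in place of USR descriptors, the screening criterion, and the function names are all assumptions; the paper's actual model may differ.

```python
import math


def lipschitz_lower_bound(x, samples, lipschitz_constant):
    """Tightest lower bound on the energy at x implied by evaluated samples.

    `samples` is a list of (point, energy) pairs; points are coordinate
    sequences of equal length. Assumes the energy is Lipschitz continuous
    with the given constant in this feature space.
    """
    return max(
        energy - lipschitz_constant * math.dist(x, point)
        for point, energy in samples
    )


def worth_evaluating(candidate, samples, lipschitz_constant, best_energy):
    """Screening rule (an assumption, for illustration).

    Skip the expensive energy evaluation when even the lower bound at the
    candidate cannot improve on the best energy found so far.
    """
    bound = lipschitz_lower_bound(candidate, samples, lipschitz_constant)
    return bound < best_energy
```

    Candidates rejected by `worth_evaluating` never reach the energy function, which is how such a screen can reduce the number of energy evaluations.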

  8. Intelligent Fault Diagnosis of HVCB with Feature Space Optimization-Based Random Forest.

    Science.gov (United States)

    Ma, Suliang; Chen, Mingxuan; Wu, Jianwen; Wang, Yuhao; Jia, Bowen; Jiang, Yuan

    2018-04-16

    Mechanical faults of high-voltage circuit breakers (HVCBs) always happen over long-term operation, so extracting the fault features and identifying the fault type have become a key issue for ensuring the security and reliability of power supply. Based on wavelet packet decomposition technology and random forest algorithm, an effective identification system was developed in this paper. First, compared with the incomplete description of Shannon entropy, the wavelet packet time-frequency energy rate (WTFER) was adopted as the input vector for the classifier model in the feature selection procedure. Then, a random forest classifier was used to diagnose the HVCB fault, assess the importance of the feature variable and optimize the feature space. Finally, the approach was verified based on actual HVCB vibration signals by considering six typical fault classes. The comparative experiment results show that the classification accuracy of the proposed method with the original feature space reached 93.33% and reached up to 95.56% with the optimized input feature vector of the classifier. This indicates that the feature optimization procedure is successful, and the proposed diagnosis algorithm has higher efficiency and robustness than traditional methods.

  9. A systematic exploration of the micro-blog feature space for teens stress detection.

    Science.gov (United States)

    Zhao, Liang; Li, Qi; Xue, Yuanyuan; Jia, Jia; Feng, Ling

    2016-01-01

    In the modern stressful society, growing teenagers experience severe stress from different aspects, from school to friends and from self-cognition to inter-personal relationships, which negatively influences their smooth and healthy development. Being timely and accurately aware of teenagers' psychological stress and providing effective measures to help immature teenagers cope with stress are highly valuable to both teenagers and human society. Previous work demonstrates the feasibility of sensing teenagers' stress from their tweeting content and context on the open social media platform, micro-blog. However, a tweet is still too short for teens to express their stressful status in a comprehensive way. Considering the topic continuity from the tweeting content to the follow-up comments and responses between the teenager and his/her friends, we combine the content of comments and responses under the tweet to supplement the tweet content. Also, friends' caring comments such as "what happened?", "Don't worry!", "Cheer up!", etc. provide hints to the teenager's stressful status. Hence, in this paper, we propose to systematically explore the micro-blog feature space, comprising four kinds of features [tweeting content features (FW), posting features (FP), interaction features (FI), and comment-response features (FC) between teenagers and friends], for teenagers' stress category and stress level detection. We extract and analyze these feature values and their impacts on teens' stress detection. We evaluate the framework through a real user study of 36 high school students aged 17. Different classifiers are employed to detect potential stress categories and corresponding stress levels. Experimental results show that all the features in the feature space positively affect stress detection, and linguistic negative emotion, proportion of negative sentences, friends' caring comments and teen's reply rate play more significant roles than the remaining features.
Micro-blog platform provides

  10. High-Order Sparse Linear Predictors for Audio Processing

    DEFF Research Database (Denmark)

    Giacobello, Daniele; van Waterschoot, Toon; Christensen, Mads Græsbøll

    2010-01-01

    Linear prediction has generally failed to make a breakthrough in audio processing, as it has done in speech processing. This is mostly due to its poor modeling performance, since an audio signal is usually an ensemble of different sources. Nevertheless, linear prediction comes with a whole set...... of interesting features that make the idea of using it in audio processing not far fetched, e.g., the strong ability of modeling the spectral peaks that play a dominant role in perception. In this paper, we provide some preliminary conjectures and experiments on the use of high-order sparse linear predictors...... in audio processing. These predictors, successfully implemented in modeling the short-term and long-term redundancies present in speech signals, will be used to model tonal audio signals, both monophonic and polyphonic. We will show how the sparse predictors are able to model efficiently the different...

  11. Audio Mining with emphasis on Music Genre Classification

    DEFF Research Database (Denmark)

    Meng, Anders

    2004-01-01

    Audio is an important part of our daily life, basically it increases our impression of the world around us whether this is communication, music, danger detection etc. Currently the field of Audio Mining, which here includes areas of music genre, music recognition / retrieval, playlist generation...... the world the problem of detecting environments from the input audio is researched as to increase the life quality of hearing-impaired. Basically there is a lot of work within the field of audio mining. The presentation will mainly focus on music genre classification where we have a fixed amount of genres...... to choose from. Basically every audio mining system is more or less consisting of the same stages as for the music genre setting. My research so far has mainly focussed on finding relevant features for music genre classification living at different timescales using early and late information fusion. It has...

  12. Feature-Space Clustering for fMRI Meta-Analysis

    DEFF Research Database (Denmark)

    Goutte, Cyril; Hansen, Lars Kai; Liptrot, Mathew G.

    2001-01-01

    MRI sequences containing several hundreds of images, it is sometimes necessary to invoke feature extraction to reduce the dimensionality of the data space. A second interesting application is in the meta-analysis of fMRI experiment, where features are obtained from a possibly large number of single......-voxel analyses. In particular this allows the checking of the differences and agreements between different methods of analysis. Both approaches are illustrated on a fMRI data set involving visual stimulation, and we show that the feature space clustering approach yields nontrivial results and, in particular......, shows interesting differences between individual voxel analysis performed with traditional methods. © 2001 Wiley-Liss, Inc....

  13. Features of public open spaces and physical activity among children: findings from the CLAN study.

    Science.gov (United States)

    Timperio, Anna; Giles-Corti, Billie; Crawford, David; Andrianopoulos, Nick; Ball, Kylie; Salmon, Jo; Hume, Clare

    2008-11-01

    To examine associations between features of public open spaces and children's physical activity. 163 children aged 8-9 years and 334 adolescents aged 13-15 years from Melbourne, Australia participated in 2004. A Geographic Information System was used to identify all public open spaces (POS) within 800 m of participants' homes and their closest POS. The features of all POS identified were audited in 2004/5. Accelerometers measured moderate-to-vigorous physical activity (MVPA) after school and on weekends. Linear regression analyses examined associations between features of the closest POS and participants' MVPA. Most participants had a POS within 800 m of their home. The presence of playgrounds was positively associated with younger boys' weekend MVPA (B=24.9 min/day; p [...] POS were associated with participants' MVPA, although mixed associations were evident. Further research is required to clarify these complex relationships.

  14. An alternative to scale-space representation for extracting local features in image recognition

    DEFF Research Database (Denmark)

    Andersen, Hans Jørgen; Nguyen, Phuong Giang

    2012-01-01

    In image recognition, the common approach for extracting local features using a scale-space representation usually has three main steps: first interest points are extracted at different scales, next from a patch around each interest point the rotation is calculated with corresponding orientation...... and compensation, and finally a descriptor is computed for the derived patch (i.e. feature of the patch). To avoid the memory and computational intensive process of constructing the scale-space, we use a method where no scale-space is required. This is done by dividing the given image into a number of triangles...... with sizes dependent on the content of the image, at the location of each triangle. In this paper, we will demonstrate that by rotation of the interest regions at the triangles it is possible in grey scale images to achieve a recognition precision comparable with that of MOPS. The test of the proposed method...

  15. Small signal audio design

    CERN Document Server

    Self, Douglas

    2014-01-01

    Learn to use inexpensive and readily available parts to obtain state-of-the-art performance in all the vital parameters of noise, distortion, crosstalk and so on. With ample coverage of preamplifiers and mixers and a new chapter on headphone amplifiers, this practical handbook provides an extensive repertoire of circuits that can be put together to make almost any type of audio system. A resource packed full of valuable information, with virtually every page revealing nuggets of specialized knowledge not found elsewhere. Essential points of theory that bear on practical performance are lucidly

  16. On equivalent parameter learning in simplified feature space based on Bayesian asymptotic analysis.

    Science.gov (United States)

    Yamazaki, Keisuke

    2012-07-01

    Parametric models for sequential data, such as hidden Markov models, stochastic context-free grammars, and linear dynamical systems, are widely used in time-series analysis and structural data analysis. Computation of the likelihood function is one of primary considerations in many learning methods. Iterative calculation of the likelihood such as the model selection is still time-consuming though there are effective algorithms based on dynamic programming. The present paper studies parameter learning in a simplified feature space to reduce the computational cost. Simplifying data is a common technique seen in feature selection and dimension reduction though an oversimplified space causes adverse learning results. Therefore, we mathematically investigate a condition of the feature map to have an asymptotically equivalent convergence point of estimated parameters, referred to as the vicarious map. As a demonstration to find vicarious maps, we consider the feature space, which limits the length of data, and derive a necessary length for parameter learning in hidden Markov models. Copyright © 2012 Elsevier Ltd. All rights reserved.

  17. Spiral CT features and anatomic basis of posterior pararenal space involvement in acute pancreatitis

    International Nuclear Information System (INIS)

    Min Pengqiu; Yan Zhihan; Yang Hengxuan; Liu Zaiyi; Song Bin; Wu Bing; Zhang Jin; Liu Rongbo

    2005-01-01

    Objective: To evaluate spiral CT features and anatomic basis of the posterior pararenal space (PPS) involvement in acute pancreatitis (AP). Methods: CT images of 87 cases with AP were retrospectively studied with focus on spiral CT features, incidence of the PPS involvement, and its correlations with the posterior renal fascia or lateroconal fascia. Results: Our study showed that the incidence of the PPS involvement was 47% (41/87), with Grade A 53% (46/87), Grade B 24% (21/87), and Grade C 23% (20/87), and Grade 0 53% (46/87), Grade I 22% (19/87), and Grade II 25% (22/87), respectively. The pancreatitis fluid collection in the PPS was continuous with that in the anterior pararenal space or with the fluid between the two laminae of the posterior renal fascia. In 3 follow-up cases, pseudocysts in the PPS were continuous with those in the anterior pararenal space below the cone of renal fascia. Conclusion: Spiral CT features of the PPS involvement vary from mild inflammatory changes to fluid collection or phlegmonous mass. Fluid within the anterior pararenal space in AP flows into the PPS by three routes. (authors)

  18. Optimal Feature Space Selection in Detecting Epileptic Seizure based on Recurrent Quantification Analysis and Genetic Algorithm

    Directory of Open Access Journals (Sweden)

    Saleh LAshkari

    2016-06-01

    Full Text Available Selecting optimal features based on the nature of the phenomenon and high discriminant ability is very important in data classification problems. Since Recurrent Quantification Analysis (RQA) does not require any assumption about stationarity, signal size, or noise, it may be useful for epileptic seizure detection. In this study, RQA was used to discriminate ictal EEG from normal EEG, where optimal features were selected by a combination of a genetic algorithm and a Bayesian classifier. Recurrence plots of one hundred samples in each of the two categories were obtained with five distance norms: Euclidean, Maximum, Minimum, Normalized and Fixed Norm. In order to choose the optimal threshold for each norm, ten thresholds ε were generated, and then the best feature space was selected by the genetic algorithm in combination with the Bayesian classifier. The results showed that the proposed method is capable of discriminating ictal EEG from normal EEG; for the Minimum norm and 0.1 < ε < 1, accuracy was 100%. In addition, the sensitivity of the proposed framework to ε and the distance norm parameters was low. The optimal feature presented in this study is Trans, which was selected in most feature spaces with high accuracy.
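
    As a rough illustration of the recurrence plot that RQA features are derived from, the sketch below marks two samples as recurrent when their distance is below the threshold ε. It is a simplification and not the study's pipeline: it assumes a scalar series with the absolute difference as the distance (no embedding, none of the five norms named above), and the function names are hypothetical.

```python
def recurrence_matrix(series, eps):
    """R[i][j] = 1 when samples i and j are closer than eps, else 0."""
    n = len(series)
    return [[1 if abs(series[i] - series[j]) < eps else 0 for j in range(n)]
            for i in range(n)]

def recurrence_rate(matrix):
    """Fraction of recurrent points, one of the simplest RQA measures."""
    n = len(matrix)
    return sum(map(sum, matrix)) / (n * n)

# A strictly alternating toy signal recurs with every second sample.
R = recurrence_matrix([0.0, 1.0, 0.0, 1.0], eps=0.5)
print(recurrence_rate(R))
```

    Sweeping `eps` over a set of candidate values, as the abstract describes for its ten thresholds, would amount to calling `recurrence_matrix` once per threshold.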

  19. The formation method of the feature space for the identification of fatigued bills

    Science.gov (United States)

    Kang, Dongshik; Oshiro, Ayumu; Ozawa, Kenji; Mitsui, Ikugo

    2014-10-01

    Fatigued bills cause trouble such as paper jams in bill handling machines. In the discrimination of fatigued bills using an acoustic signal, the variation of the observed bill sound is considered to be one cause of misclassification. Therefore, a technique has been demanded to make the classification of fatigued bills more efficient. In this paper, we propose an algorithm that extracts a feature quantity of the bill sound from the acoustic signal using the frequency difference, and we carry out a discrimination experiment on fatigued bills with a Support Vector Machine (SVM). The frequency-difference feature quantity can represent how the frequency components of an acoustic signal vary with the degree of fatigue of the bill. The generalization performance of SVM does not depend on the dimensionality of the feature space, even in a high-dimensional feature space such as that of bill acoustic signals. Furthermore, SVM can induce an optimal classifier that considers combinations of features by virtue of polynomial kernel functions.

  20. Car audio using DSP for active sound control. DSP ni yoru active seigyo wo mochiita audio

    Energy Technology Data Exchange (ETDEWEB)

    Yamada, K.; Asano, S.; Furukawa, N. (Mitsubishi Motor Corp., Tokyo (Japan))

    1993-06-01

    In the automobile cabin, there are some unique problems that spoil the quality of sound reproduction from audio equipment, such as the narrow space and the background noise. Audio signal processing using a DSP (digital signal processor) enables a solution to these problems. A car audio system with high amenity has been successfully made by active sound control using DSP. The DSP consists of an adder, coefficient multiplier, delay unit, and connections. For the actual processing by DSP, functions such as sound field correction, response to and processing of noise during driving, surround reproduction, and graphic equalizer processing are used. High effectiveness of the method was confirmed through an actual driving evaluation test. The present paper describes the actual method of sound control technology using DSP. In particular, the dynamic processing of noise during driving is discussed in detail. 1 ref., 12 figs., 1 tab.
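
    The three building blocks named in this abstract (adder, coefficient multiplier, delay unit) are exactly what a finite impulse response (FIR) filter is made of. The sketch below is a generic textbook FIR filter, offered only as an illustration; the abstract does not say which filter structures the car audio system uses, and `fir_filter` and its coefficients are hypothetical.

```python
def fir_filter(signal, coeffs):
    """Direct-form FIR filter: y[n] = sum_k coeffs[k] * x[n - k]."""
    delay_line = [0.0] * len(coeffs)          # delay units, initially silent
    out = []
    for sample in signal:
        # Shift the newest sample into the delay line, dropping the oldest.
        delay_line = [sample] + delay_line[:-1]
        # Coefficient multipliers feed a single adder.
        out.append(sum(c * d for c, d in zip(coeffs, delay_line)))
    return out

# Feeding a unit impulse returns the coefficients, i.e. the impulse response.
print(fir_filter([1.0, 0.0, 0.0, 0.0], [0.5, 0.25]))
```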

  1. Medical X-ray Image Hierarchical Classification Using a Merging and Splitting Scheme in Feature Space.

    Science.gov (United States)

    Fesharaki, Nooshin Jafari; Pourghassem, Hossein

    2013-07-01

    Due to the daily mass production and the widespread variation of medical X-ray images, it is necessary to classify these for searching and retrieving purposes, especially for content-based medical image retrieval systems. In this paper, a medical X-ray image hierarchical classification structure based on a novel merging and splitting scheme and using shape and texture features is proposed. In the first level of the proposed structure, to improve the classification performance, similar classes with regard to shape contents are grouped based on merging measures and shape features into the general overlapped classes. In the next levels of this structure, the overlapped classes are split into smaller classes based on the classification performance of the combination of shape and texture features or texture features only. Ultimately, in the last levels, this procedure is continued until all the classes are formed separately. Moreover, to optimize the feature vector in the proposed structure, we use an orthogonal forward selection algorithm according to the Mahalanobis class separability measure as a feature selection and reduction algorithm. In other words, according to the complexity and inter-class distance of each class, a sub-space of the feature space is selected in each level and then a supervised merging and splitting scheme is applied to form the hierarchical classification. The proposed structure is evaluated on a database consisting of 2158 medical X-ray images of 18 classes (IMAGECLEF 2005 database), and an accuracy rate of 93.6% in the last level of the hierarchical structure for an 18-class classification problem is obtained.

  2. Assessing spacing impact on coherent features in a wind turbine array boundary layer

    Directory of Open Access Journals (Sweden)

    N. Ali

    2018-02-01

    intermediate scales are responsible for features seen in the original profile. The variation in streamwise and spanwise spacing leads to changes in the background structure of the turbulence, where the color map based on barycentric map and Reynolds stress anisotropy tensor provides an alternate perspective on the nature of the perturbations within the wind turbine array. The impact of the streamwise and spanwise spacings on power produced is quantified, where the maximum production corresponds with the case of greatest turbine spacing.

  3. Detection of Coronal Mass Ejections Using Multiple Features and Space-Time Continuity

    Science.gov (United States)

    Zhang, Ling; Yin, Jian-qin; Lin, Jia-ben; Feng, Zhi-quan; Zhou, Jin

    2017-07-01

    Coronal Mass Ejections (CMEs) release tremendous amounts of energy in the solar system, which has an impact on satellites, power facilities and wireless transmission. To effectively detect a CME in Large Angle Spectrometric Coronagraph (LASCO) C2 images, we propose a novel algorithm to locate the suspected CME regions, using the Extreme Learning Machine (ELM) method and taking into account the features of the grayscale and the texture. Furthermore, space-time continuity is used in the detection algorithm to exclude the false CME regions. The algorithm includes three steps: i) define the feature vector which contains textural and grayscale features of a running difference image; ii) design the detection algorithm based on the ELM method according to the feature vector; iii) improve the detection accuracy rate by using the decision rule of the space-time continuum. Experimental results show the efficiency and the superiority of the proposed algorithm in the detection of CMEs compared with other traditional methods. In addition, our algorithm is insensitive to most noise.

  4. Spherical phantom for research of radiation situation in outer space. Design-structural special features

    International Nuclear Information System (INIS)

    Kartsev, I.S.; Eremenko, V.G.; Petrov, V.I.; Polenov, B.V.; Yudin, V.N.; Akatov, Yu.A.; Petrov, V.M.; Shurshakov, V.A.

    2005-01-01

    The design-structural features of the updated spherical phantom applied within the framework of the space experiment Matreshka-R at the Russian segment of the International Space Station during the ISS-8 and ISS-9 expeditions are described. The replacement of 48 polyethylene containers with TLD and STD assemblies by 16 cases installed from the external side of the phantom, and of 4 tissue-equivalent caps of the central disk by 4 cases with detector assemblies, is carried out. The updated tissue-equivalent phantom contains an active dosemeter based on 5 MOS detectors. The phantom cover is made from the non-flammable material NT-7. The basic characteristics of the flight specimen of the phantom are presented. The results of its on-Earth testing and real space flights are analyzed [ru]

  5. Efficient Audio Power Amplification - Challenges

    DEFF Research Database (Denmark)

    Andersen, Michael Andreas E.

    2005-01-01

    For more than a decade efficient audio power amplification has evolved and today switch-mode audio power amplification in various forms are the state-of-the-art. The technical steps that lead to this evolution are described and in addition many of the challenges still to be faced and where...

  6. Efficient audio power amplification - challenges

    Energy Technology Data Exchange (ETDEWEB)

    Andersen, Michael A.E.

    2005-07-01

    For more than a decade efficient audio power amplification has evolved and today switch-mode audio power amplification in various forms are the state-of-the-art. The technical steps that lead to this evolution are described and in addition many of the challenges still to be faced and where extensive research and development are needed is covered. (au)

  7. A New Ensemble Method with Feature Space Partitioning for High-Dimensional Data Classification

    Directory of Open Access Journals (Sweden)

    Yongjun Piao

    2015-01-01

    Full Text Available Ensemble data mining methods, also known as classifier combination, are often used to improve the performance of classification. Various classifier combination methods such as bagging, boosting, and random forest have been devised and have received considerable attention in the past. However, data dimensionality increases rapidly day by day. Such a trend poses various challenges as these methods are not suitable to directly apply to high-dimensional datasets. In this paper, we propose an ensemble method for classification of high-dimensional data, with each classifier constructed from a different set of features determined by partitioning of redundant features. In our method, the redundancy of features is considered to divide the original feature space. Then, each generated feature subset is trained by a support vector machine, and the results of each classifier are combined by majority voting. The efficiency and effectiveness of our method are demonstrated through comparisons with other ensemble techniques, and the results show that our method outperforms other methods.
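
    A minimal sketch of the train-per-subset, vote-at-the-end structure described above. It departs from the paper in two labeled ways: the feature subsets are passed in by hand rather than derived from a redundancy analysis, and a nearest-centroid rule stands in for the support vector machine so that the example stays dependency-free. All function names are hypothetical.

```python
from collections import Counter

def fit_centroids(X, y, features):
    """Per-class mean of the selected feature columns."""
    sums, counts = {}, Counter(y)
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(features))
        for k, f in enumerate(features):
            acc[k] += row[f]
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}

def predict_one(centroids, features, row):
    """Label of the closest centroid in the selected feature subspace."""
    def sqdist(c):
        return sum((row[f] - c[k]) ** 2 for k, f in enumerate(features))
    return min(centroids, key=lambda label: sqdist(centroids[label]))

def ensemble_predict(X, y, subsets, row):
    """Train one base classifier per feature subset, then majority-vote."""
    votes = [predict_one(fit_centroids(X, y, s), s, row) for s in subsets]
    return Counter(votes).most_common(1)[0][0]

X = [[0.0, 0.0, 0.0], [0.2, 0.1, 0.0], [1.0, 1.0, 1.0], [0.9, 1.0, 0.8]]
y = ["a", "a", "b", "b"]
print(ensemble_predict(X, y, subsets=[[0], [1], [2]], row=[0.1, 0.0, 0.9]))
```

    In this toy call two of the three single-feature classifiers vote for class "a", so the ensemble outputs "a" even though the third feature alone points to "b".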

  8. Do features of public open spaces vary according to neighbourhood socio-economic status?

    Science.gov (United States)

    Crawford, David; Timperio, Anna; Giles-Corti, Billie; Ball, Kylie; Hume, Clare; Roberts, Rebecca; Andrianopoulos, Nick; Salmon, Jo

    2008-12-01

    This study examined the relations between neighbourhood socio-economic status and features of public open spaces (POS) hypothesised to influence children's physical activity. Data were from the first follow-up of the Children Living in Active Neighbourhoods (CLAN) Study, which involved 540 families of 5-6 and 10-12-year-old children in Melbourne, Australia. The Socio-Economic Index for Areas Index (SEIFA) of Relative Socio-economic Advantage/Disadvantage was used to assign a socioeconomic index score to each child's neighbourhood, based on postcode. Participant addresses were geocoded using a Geographic Information System. The Open Space 2002 spatial data set was used to identify all POS within an 800 m radius of each participant's home. The features of each of these POS (1497) were audited. Variability of POS features was examined across quintiles of neighbourhood SEIFA. Compared with POS in lower socioeconomic neighbourhoods, POS in the highest socioeconomic neighbourhoods had more amenities (e.g. picnic tables and drink fountains) and were more likely to have trees that provided shade, a water feature (e.g. pond, creek), walking and cycling paths, lighting, signage regarding dog access and signage restricting other activities. There were no differences across neighbourhoods in the number of playgrounds or the number of recreation facilities (e.g. number of sports catered for on courts and ovals, the presence of other facilities such as athletics tracks, skateboarding facility and swimming pool). This study suggests that POS in high socioeconomic neighbourhoods possess more features that are likely to promote physical activity amongst children.

  9. A biologically inspired scale-space for illumination invariant feature detection

    International Nuclear Information System (INIS)

    Vonikakis, Vasillios; Chrysostomou, Dimitrios; Kouskouridas, Rigas; Gasteratos, Antonios

    2013-01-01

    This paper presents a new illumination invariant operator, combining the nonlinear characteristics of biological center-surround cells with the classic difference of Gaussians operator. It specifically targets the underexposed image regions, exhibiting increased sensitivity to low contrast, while not affecting performance in the correctly exposed ones. The proposed operator can be used to create a scale-space, which in turn can be a part of a SIFT-based detector module. The main advantage of this illumination invariant scale-space is that, using just one global threshold, keypoints can be detected in both dark and bright image regions. In order to evaluate the degree of illumination invariance that the proposed, as well as other, existing, operators exhibit, a new benchmark dataset is introduced. It features a greater variety of imaging conditions, compared to existing databases, containing real scenes under various degrees and combinations of uniform and non-uniform illumination. Experimental results show that the proposed detector extracts a greater number of features, with a high level of repeatability, compared to other approaches, for both uniform and non-uniform illumination. This, along with its simple implementation, renders the proposed feature detector particularly appropriate for outdoor vision systems, working in environments under uncontrolled illumination conditions. (paper)

  10. Research on Optimal Observation Scale for Damaged Buildings after Earthquake Based on Optimal Feature Space

    Science.gov (United States)

    Chen, J.; Chen, W.; Dou, A.; Li, W.; Sun, Y.

    2018-04-01

    A new information extraction method of damaged buildings rooted in optimal feature space is put forward on the basis of the traditional object-oriented method. In this new method, ESP (estimate of scale parameter) tool is used to optimize the segmentation of image. Then the distance matrix and minimum separation distance of all kinds of surface features are calculated through sample selection to find the optimal feature space, which is finally applied to extract the image of damaged buildings after earthquake. The overall extraction accuracy reaches 83.1 %, the kappa coefficient 0.813. The new information extraction method greatly improves the extraction accuracy and efficiency, compared with the traditional object-oriented method, and owns a good promotional value in the information extraction of damaged buildings. In addition, the new method can be used for the information extraction of different-resolution images of damaged buildings after earthquake, then to seek the optimal observation scale of damaged buildings through accuracy evaluation. It is supposed that the optimal observation scale of damaged buildings is between 1 m and 1.2 m, which provides a reference for future information extraction of damaged buildings.

  11. EEMD Independent Extraction for Mixing Features of Rotating Machinery Reconstructed in Phase Space

    Directory of Open Access Journals (Sweden)

    Zaichao Ma

    2015-04-01

    Full Text Available Empirical Mode Decomposition (EMD), due to its adaptive decomposition property for non-linear and non-stationary signals, has been widely used in vibration analyses for rotating machinery. However, EMD suffers from mode mixing, which makes it difficult to extract features independently. Although an improved EMD, well known as the ensemble EMD (EEMD), has been proposed, mode mixing is alleviated only to a certain degree. Moreover, EEMD needs to determine the amplitude of the added noise. In this paper, we propose Phase Space Ensemble Empirical Mode Decomposition (PSEEMD), which integrates Phase Space Reconstruction (PSR) and Manifold Learning (ML) to modify EEMD. We also provide the principle and detailed procedure of PSEEMD, and analyses of a simulation signal and an actual vibration signal derived from a rubbing rotor are performed. The results show that PSEEMD is more efficient and convenient than EEMD in extracting the mixing features from the investigated signal and in optimizing the amplitude of the necessary added noise. Additionally, PSEEMD can extract weak features that are interfered with by a certain amount of noise.
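
    Phase Space Reconstruction is commonly done with time-delay embedding, and the sketch below shows that step only. It is a generic illustration, not the PSEEMD procedure: the abstract does not state the embedding dimension or delay, so `dim` and `tau` here are hypothetical parameters, and the manifold learning and EEMD stages are omitted.

```python
def delay_embed(series, dim, tau):
    """Time-delay embedding: point i is (x[i], x[i+tau], ..., x[i+(dim-1)*tau])."""
    n_points = len(series) - (dim - 1) * tau
    return [tuple(series[i + k * tau] for k in range(dim))
            for i in range(n_points)]

# Five samples embedded in 2 dimensions with a delay of 2 samples.
print(delay_embed([1, 2, 3, 4, 5], dim=2, tau=2))
```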

  12. Feature-space-based FMRI analysis using the optimal linear transformation.

    Science.gov (United States)

    Sun, Fengrong; Morris, Drew; Lee, Wayne; Taylor, Margot J; Mills, Travis; Babyn, Paul S

    2010-09-01

    The optimal linear transformation (OLT), an image analysis technique of feature space, was first presented in the field of MRI. This paper proposes a method of extending OLT from MRI to functional MRI (fMRI) to improve the activation-detection performance over conventional approaches of fMRI analysis. In this method, first, ideal hemodynamic response time series for different stimuli were generated by convolving the theoretical hemodynamic response model with the stimulus timing. Second, constructing hypothetical signature vectors for different activity patterns of interest by virtue of the ideal hemodynamic responses, OLT was used to extract features of fMRI data. The resultant feature space had particular geometric clustering properties. It was then classified into different groups, each pertaining to an activity pattern of interest; the applied signature vector for each group was obtained by averaging. Third, using the applied signature vectors, OLT was applied again to generate fMRI composite images with high SNRs for the desired activity patterns. Simulations and a blocked fMRI experiment were employed for the method to be verified and compared with the general linear model (GLM)-based analysis. The simulation studies and the experimental results indicated the superiority of the proposed method over the GLM-based analysis in detecting brain activities.
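
    The first step described above, generating an ideal response time series by convolving a hemodynamic response model with the stimulus timing, can be sketched as follows. The kernel values are made up for illustration and are not the theoretical model used in the paper; `convolve` is a hypothetical helper that truncates the output to the length of the stimulus vector.

```python
def convolve(stimulus, kernel):
    """Causal discrete convolution, truncated to len(stimulus) samples."""
    out = [0.0] * len(stimulus)
    for t, s in enumerate(stimulus):
        if s:
            for k, h in enumerate(kernel):
                if t + k < len(out):
                    out[t + k] += s * h
    return out

# Stimulus timing: 1 while the stimulus is on, 0 otherwise (one sample per scan).
stimulus = [0, 1, 1, 0, 0, 0, 0, 0]
toy_hrf = [0.0, 0.4, 1.0, 0.6, 0.2]   # illustrative shape only, not a fitted model
print(convolve(stimulus, toy_hrf))
```

    One such ideal time series per stimulus type would then serve as a signature vector for the feature extraction step.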

  13. Audio-visual biofeedback for respiratory-gated radiotherapy: Impact of audio instruction and audio-visual biofeedback on respiratory-gated radiotherapy

    International Nuclear Information System (INIS)

    George, Rohini; Chung, Theodore D.; Vedam, Sastry S.; Ramakrishnan, Viswanathan; Mohan, Radhe; Weiss, Elisabeth; Keall, Paul J.

    2006-01-01

    Purpose: Respiratory gating is a commercially available technology for reducing the deleterious effects of motion during imaging and treatment. The efficacy of gating is dependent on the reproducibility within and between respiratory cycles during imaging and treatment. The aim of this study was to determine whether audio-visual biofeedback can improve respiratory reproducibility by decreasing residual motion and therefore increasing the accuracy of gated radiotherapy. Methods and Materials: A total of 331 respiratory traces were collected from 24 lung cancer patients. The protocol consisted of five breathing training sessions spaced about a week apart. Within each session the patients initially breathed without any instruction (free breathing), with audio instructions and with audio-visual biofeedback. Residual motion was quantified by the standard deviation of the respiratory signal within the gating window. Results: Audio-visual biofeedback significantly reduced residual motion compared with free breathing and audio instruction. Displacement-based gating has lower residual motion than phase-based gating. Little reduction in residual motion was found for duty cycles less than 30%; for duty cycles above 50% there was a sharp increase in residual motion. Conclusions: The efficiency and reproducibility of gating can be improved by: incorporating audio-visual biofeedback, using a 30-50% duty cycle, gating during exhalation, and using displacement-based gating
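
    The residual-motion metric defined above (standard deviation of the respiratory signal inside the gating window) is easy to sketch for displacement-based gating. The example assumes that end-exhale corresponds to the lowest displacement values and that the gating window simply keeps the fraction of samples given by the duty cycle; both are simplifying assumptions, and the function names are hypothetical.

```python
import statistics

def displacement_gate(trace, duty_cycle):
    """Keep the fraction `duty_cycle` of samples nearest end-exhale.

    Assumption: end-exhale is the lowest displacement in the trace.
    """
    n_keep = max(2, round(len(trace) * duty_cycle))
    return sorted(trace)[:n_keep]

def residual_motion(trace, duty_cycle):
    """Standard deviation of the signal within the gating window."""
    return statistics.pstdev(displacement_gate(trace, duty_cycle))

trace = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]   # toy displacement samples
print(residual_motion(trace, 0.3), residual_motion(trace, 0.5))
```

    On this toy trace a wider gating window admits more displacement spread, so the residual motion grows with the duty cycle; real traces need not be this regular.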

  14. Advances in audio watermarking based on singular value decomposition

    CERN Document Server

    Dhar, Pranab Kumar

    2015-01-01

    This book introduces audio watermarking methods for copyright protection, which has drawn extensive attention for securing digital data from unauthorized copying. The book is divided into two parts. First, an audio watermarking method in discrete wavelet transform (DWT) and discrete cosine transform (DCT) domains using singular value decomposition (SVD) and quantization is introduced. This method is robust against various attacks and provides good imperceptible watermarked sounds. Then, an audio watermarking method in fast Fourier transform (FFT) domain using SVD and Cartesian-polar transformation (CPT) is presented. This method has high imperceptibility and high data payload and it provides good robustness against various attacks. These techniques allow media owners to protect copyright and to show authenticity and ownership of their material in a variety of applications. · Features new methods of audio watermarking for copyright protection and ownership protection · Outl...

  15. Online Distributed Learning Over Networks in RKH Spaces Using Random Fourier Features

    Science.gov (United States)

    Bouboulis, Pantelis; Chouvardas, Symeon; Theodoridis, Sergios

    2018-04-01

    We present a novel diffusion scheme for online kernel-based learning over networks. So far, a major drawback of any online learning algorithm operating in a reproducing kernel Hilbert space (RKHS) is the need to update a growing number of parameters as time iterations evolve. Besides complexity, this leads to an increased need for communication resources in a distributed setting. In contrast, the proposed method approximates the solution as a fixed-size vector (of larger dimension than the input space) using Random Fourier Features. This paves the way to use standard linear combine-then-adapt techniques. To the best of our knowledge, this is the first time that a complete protocol for distributed online learning in RKHS is presented. Conditions for asymptotic convergence and boundedness of the networkwise regret are also provided. The simulated tests illustrate the performance of the proposed scheme.
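
    The idea of replacing a kernel expansion with a fixed-size vector can be sketched with the standard Random Fourier Features construction for a Gaussian kernel. This is a minimal illustration, not the authors' diffusion protocol; the function names, the kernel width `sigma`, and the feature count are assumptions chosen for the example.

```python
import math
import random


def make_rff(input_dim, num_features, sigma, seed=0):
    """Build a random feature map z(x) such that z(x).z(y) approximates
    the Gaussian kernel exp(-||x - y||^2 / (2 * sigma^2))."""
    rng = random.Random(seed)
    # Frequencies drawn from the kernel's spectral density N(0, 1/sigma^2).
    weights = [[rng.gauss(0.0, 1.0 / sigma) for _ in range(input_dim)]
               for _ in range(num_features)]
    # Random phases drawn uniformly from [0, 2*pi].
    offsets = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(num_features)]
    scale = math.sqrt(2.0 / num_features)

    def feature_map(x):
        return [scale * math.cos(sum(w_k * x_k for w_k, x_k in zip(w, x)) + b)
                for w, b in zip(weights, offsets)]

    return feature_map


def gaussian_kernel(x, y, sigma):
    """Exact Gaussian kernel, used here only to check the approximation."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


z = make_rff(input_dim=2, num_features=2000, sigma=1.0, seed=1)
x, y = [0.3, -0.1], [0.5, 0.4]
approx = sum(a * b for a, b in zip(z(x), z(y)))
print(approx, gaussian_kernel(x, y, 1.0))
```

    Because `z(x)` has a fixed length regardless of how many samples have been seen, a linear model on top of it keeps a constant number of parameters, which is what makes linear combine-then-adapt updates applicable.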

  16. Securing Digital Audio using Complex Quadratic Map

    Science.gov (United States)

    Suryadi, MT; Satria Gunawan, Tjandra; Satria, Yudi

    2018-03-01

    In this digital era, exchanging data is common and easy, which leaves the data vulnerable to attack and manipulation by unauthorized parties. One data type that is vulnerable to attack is digital audio, so a securing method that is both fast and resistant to attack is needed. One method that meets these criteria is securing the data using a chaos function. The chaos function used in this research is the complex quadratic map (CQM). For some parameter values, the key stream generated by the CQM function passes all 15 NIST tests, which means that the key stream generated using this CQM is proven to be random. In addition, samples of encrypted digital sound, when tested using a goodness-of-fit test, are proven to be uniform, so securing digital audio using this method is not vulnerable to frequency analysis attack. The key space is very large, about 8.1×10³¹ possible keys, and the key sensitivity is very small, about 10⁻¹⁰, so this method is also not vulnerable to brute-force attack. Finally, the processing speed for both encryption and decryption is on average about 450 times faster than the duration of the digital audio.

  17. A centralized audio presentation manager

    Energy Technology Data Exchange (ETDEWEB)

    Papp, A.L. III; Blattner, M.M.

    1994-05-16

    The centralized audio presentation manager addresses the problems which occur when multiple programs running simultaneously attempt to use the audio output of a computer system. Time dependence of sound means that certain auditory messages must be scheduled simultaneously, which can lead to perceptual problems due to psychoacoustic phenomena. Furthermore, the combination of speech and nonspeech audio is examined; each presents its own problems of perceptibility in an acoustic environment composed of multiple auditory streams. The centralized audio presentation manager receives abstract parameterized message requests from the currently running programs, and attempts to create and present a sonic representation in the most perceptible manner through the use of a theoretically and empirically designed rule set.

  18. Instrumental Landing Using Audio Indication

    Science.gov (United States)

    Burlak, E. A.; Nabatchikov, A. M.; Korsun, O. N.

    2018-02-01

    The paper proposes an audio indication method for presenting to a pilot information about the relative position of an aircraft in precision piloting tasks. The implementation of the method is presented, and the use of audio signal parameters such as loudness, frequency and modulation is discussed. To confirm the operability of the audio indication channel, experiments were carried out using a modern aircraft simulation facility. Simulated instrument landings were performed using the proposed audio method to indicate the aircraft's deviations from the glide path. The results proved compatible with simulated instrument landings using the traditional glideslope pointers. This encourages further development of the method to solve other precision piloting tasks.

  19. ENERGY STAR Certified Audio Video

    Data.gov (United States)

    U.S. Environmental Protection Agency — Certified models meet all ENERGY STAR requirements as listed in the Version 3.0 ENERGY STAR Program Requirements for Audio Video Equipment that are effective as of...

  20. WLAN Technologies for Audio Delivery

    Directory of Open Access Journals (Sweden)

    Nicolas-Alexander Tatlas

    2007-01-01

    Audio delivery and reproduction for home or professional applications may greatly benefit from the adoption of digital wireless local area network (WLAN) technologies. The most challenging aspect of such integration relates to the synchronized and robust real-time streaming of multiple audio channels to multipoint receivers, for example, wireless active speakers. Here, it is shown that current WLAN solutions are susceptible to transmission errors. A detailed study of the IEEE802.11e protocol (currently under ratification) is also presented and all relevant distortions are assessed via an analytical and experimental methodology. A novel synchronization scheme is also introduced, allowing optimized playback for multiple receivers. The perceptual audio performance is assessed for both stereo and 5-channel applications based on either PCM or compressed audio signals.

  1. Realtime Audio with Garbage Collection

    OpenAIRE

    Matheussen, Kjetil Svalastog

    2010-01-01

    Two non-moving concurrent garbage collectors tailored for realtime audio processing are described. Both collectors work on copies of the heap to avoid cache misses and audio-disruptive synchronizations. Both collectors are targeted at multiprocessor personal computers. The first garbage collector works in uncooperative environments, and can replace Hans Boehm's conservative garbage collector for C and C++. The collector does not access the virtual memory system. Neither doe...

  2. Audio localization for mobile robots

    OpenAIRE

    de Guillebon, Thibaut; Grau Saldes, Antoni; Bolea Monte, Yolanda

    2009-01-01

    The department of the University for which I worked is developing a project based on interaction with robots in the environment. My work was to define an audio system for the robot. The audio system that I had to build consists of a mobile head that is able to follow sound in its environment. This subject was treated as a research problem, with the liberty to find and develop different solutions and make them evolve in the chosen way.

  3. Tourism research and audio methods

    DEFF Research Database (Denmark)

    Jensen, Martin Trandberg

    2016-01-01

    • Audio methods enrich sensuous tourism ethnographies. • The note suggests five research avenues for future auditory scholarship. • Sensuous tourism research has neglected the role of sounds in embodied tourism experiences.

  4. Modeling Audio Fingerprints : Structure, Distortion, Capacity

    NARCIS (Netherlands)

    Doets, P.J.O.

    2010-01-01

    An audio fingerprint is a compact low-level representation of a multimedia signal. An audio fingerprint can be used to identify audio files or fragments in a reliable way. The use of audio fingerprints for identification consists of two phases. In the enrollment phase known content is fingerprinted,

  5. Do features of public open spaces vary between urban and rural areas?

    Science.gov (United States)

    Veitch, Jenny; Salmon, Jo; Ball, Kylie; Crawford, David; Timperio, Anna

    2013-02-01

    Parks are an important setting for physical activity and specific park features have been shown to be associated with park visitation and physical activity. Most park-based research has been conducted in urban settings with few studies examining rural parks. This study examined differences in features of parks in urban compared with rural areas. In 2009/10 a tool was developed to audit 433 urban and 195 rural parks located in disadvantaged areas of Victoria, Australia. Features assessed included: access; lighting/safety; aesthetics; amenities; paths; outdoor courts/ovals; informal play spaces; and playgrounds (number, diversity, age appropriateness and safety of play equipment). Rural parks scored higher for aesthetics compared with urban parks (5.08 vs 4.44). Urban parks scored higher for access (4.64 vs 3.89), lighting/safety (2.01 vs 1.76), and diversity of play equipment (7.37 vs 6.24), and were more likely to have paths suitable for walking/cycling (58.8% vs 40.9%) and play equipment for older children (68.2% vs 17.1%). Although the findings cannot be generalized to all urban and rural parks, the results may be used to inform advocacy for park development in rural areas to create parks that are more supportive of physical activity for children and adults. Copyright © 2012 Elsevier Inc. All rights reserved.

  6. A scale space approach for unsupervised feature selection in mass spectra classification for ovarian cancer detection.

    Science.gov (United States)

    Ceccarelli, Michele; d'Acierno, Antonio; Facchiano, Angelo

    2009-10-15

    Mass spectrometry spectra, widely used in proteomics studies as a screening tool for protein profiling and to detect discriminatory signals, are high dimensional data. A large number of local maxima (a.k.a. peaks) have to be analyzed as part of computational pipelines aimed at the realization of efficient predictive and screening protocols. With data of this dimensionality and sample size, the risk of over-fitting and selection bias is pervasive. Therefore the development of bioinformatics methods based on unsupervised feature extraction can lead to general tools which can be applied to several fields of predictive proteomics. We propose a method for feature selection and extraction grounded on the theory of multi-scale spaces for high resolution spectra derived from analysis of serum. Then we use support vector machines for classification. In particular we use a database containing 216 sample spectra divided into 115 cancer and 91 control samples. The overall accuracy averaged over a large cross validation study is 98.18. The area under the ROC curve of the best selected model is 0.9962. We improved on previously known results for the same problem and data, with the advantage that the proposed method has an unsupervised feature selection phase. All the developed code, as MATLAB scripts, can be downloaded from http://medeaserver.isa.cnr.it/dacierno/spectracode.htm.

  7. Diffraction of SH-waves by topographic features in a layered transversely isotropic half-space

    Science.gov (United States)

    Ba, Zhenning; Liang, Jianwen; Zhang, Yanju

    2017-01-01

    The scattering of plane SH-waves by topographic features in a layered transversely isotropic (TI) half-space is investigated by using an indirect boundary element method (IBEM). Firstly, the anti-plane dynamic stiffness matrix of the layered TI half-space is established and the free fields are solved by using the direct stiffness method. Then, Green's functions are derived for uniformly distributed loads acting on an inclined line in a layered TI half-space and the scattered fields are constructed with the deduced Green's functions. Finally, the free fields are added to the scattered ones to obtain the global dynamic responses. The method is verified by comparing results with the published isotropic ones. Both the steady-state and transient dynamic responses are evaluated and discussed. Numerical results in the frequency domain show that surface motions for the TI media can be significantly different from those for the isotropic case, which are strongly dependent on the anisotropy property, incident angle and incident frequency. Results in the time domain show that the material anisotropy has important effects on the maximum duration and maximum amplitudes of the time histories.

  8. Audio ADC, Phase I

    Data.gov (United States)

    National Aeronautics and Space Administration — With the availability of small geometry SOI processes, STI has shown that it is possible to design and fabricate improved high performance, analog circuits with...

  9. On Modeling Affect in Audio with Non-Linear Symbolic Dynamics

    Directory of Open Access Journals (Sweden)

    Pauline Mouawad

    2017-09-01

    The discovery of semantic information from complex signals is a task concerned with connecting humans’ perceptions and/or intentions with the signals’ content. In the case of audio signals, complex perceptions are appraised in a listener’s mind and trigger affective responses that may be relevant for well-being and survival. In this paper we are interested in the broader question of relations between uncertainty in data, as measured using various information criteria, and emotions, and we propose a novel method that combines nonlinear dynamics analysis with a method of adaptive time series symbolization that finds the meaningful audio structure in terms of symbolized recurrence properties. In a first phase we obtain symbolic recurrence quantification measures from symbolic recurrence plots, without the need to reconstruct the phase space with embedding. Then we estimate symbolic dynamical invariants from symbolized time series, after embedding. The invariants are: correlation dimension, correlation entropy and Lyapunov exponent. Through their application to the logistic map, we show that our measures are in agreement with known methods from the literature. We further show that one symbolic recurrence measure, namely the symbolic Shannon entropy, correlates positively with the positive Lyapunov exponents. Finally we evaluate the performance of our measures in emotion recognition through the implementation of classification tasks for different types of audio signals, and show that in some cases, they perform better than state-of-the-art methods that rely on low-level acoustic features.
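
    The logistic-map check mentioned in this abstract can be illustrated with a small sketch that computes a Lyapunov exponent and a Shannon entropy of symbol words for two parameter values. This is not the authors' method: the fixed binary partition at 0.5, the word length, and all function names are assumptions made for the example, whereas the paper uses adaptive symbolization and recurrence-based measures.

```python
import math
from collections import Counter


def logistic_orbit(r, x0=0.4, n=20000, burn_in=1000):
    """Iterate x -> r*x*(1-x), discarding an initial transient."""
    x = x0
    for _ in range(burn_in):
        x = r * x * (1.0 - x)
    orbit = []
    for _ in range(n):
        orbit.append(x)
        x = r * x * (1.0 - x)
    return orbit


def lyapunov_exponent(r, orbit):
    """Average of log|f'(x)| along the orbit, with f'(x) = r*(1 - 2x)."""
    # The tiny floor only guards against log(0) if x lands exactly on 0.5.
    return sum(math.log(max(abs(r * (1.0 - 2.0 * x)), 1e-300))
               for x in orbit) / len(orbit)


def symbol_entropy(orbit, word_length=3, threshold=0.5):
    """Shannon entropy (bits) of overlapping words of binary symbols."""
    symbols = [1 if x > threshold else 0 for x in orbit]
    words = Counter(tuple(symbols[i:i + word_length])
                    for i in range(len(symbols) - word_length + 1))
    total = sum(words.values())
    return -sum((c / total) * math.log2(c / total) for c in words.values())


for r in (3.2, 4.0):  # periodic regime vs. fully chaotic regime
    orbit = logistic_orbit(r)
    print(r, lyapunov_exponent(r, orbit), symbol_entropy(orbit))
```

    In this toy setting the periodic regime gives a negative exponent and zero word entropy, while the chaotic regime gives a positive exponent and high entropy, which is the kind of positive association the abstract reports.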

  10. Complete fold annotation of the human proteome using a novel structural feature space.

    Science.gov (United States)

    Middleton, Sarah A; Illuminati, Joseph; Kim, Junhyong

    2017-04-13

    Recognition of protein structural fold is the starting point for many structure prediction tools and protein function inference. Fold prediction is computationally demanding and recognizing novel folds is difficult such that the majority of proteins have not been annotated for fold classification. Here we describe a new machine learning approach using a novel feature space that can be used for accurate recognition of all 1,221 currently known folds and inference of unknown novel folds. We show that our method achieves better than 94% accuracy even when many folds have only one training example. We demonstrate the utility of this method by predicting the folds of 34,330 human protein domains and showing that these predictions can yield useful insights into potential biological function, such as prediction of RNA-binding ability. Our method can be applied to de novo fold prediction of entire proteomes and identify candidate novel fold families.

  11. Location audio simplified capturing your audio and your audience

    CERN Document Server

    Miles, Dean

    2014-01-01

    From the basics of using camera, handheld, lavalier, and shotgun microphones to camera calibration and mixer set-ups, Location Audio Simplified unlocks the secrets to clean and clear broadcast quality audio no matter what challenges you face. Author Dean Miles applies his twenty-plus years of experience as a professional location operator to teach the skills, techniques, tips, and secrets needed to produce high-quality production sound on location. Humorous and thoroughly practical, the book covers a wide array of topics, such as: * location selection * field mixing * boo

  12. A Joint Audio-Visual Approach to Audio Localization

    DEFF Research Database (Denmark)

    Jensen, Jesper Rindom; Christensen, Mads Græsbøll

    2015-01-01

    Localization of audio sources is an important research problem, e.g., to facilitate noise reduction. In recent years, the problem has been tackled using distributed microphone arrays (DMA). A common approach is to apply direction-of-arrival (DOA) estimation on each array (denoted as nodes), a...... time-of-flight cameras. Moreover, we propose an optimal method for weighting such DOA and range information for audio localization. Our experiments on both synthetic and real data show that there is a clear, potential advantage of using the joint audiovisual localization framework....

  13. Semantic Analysis of Multimedial Information Using Both Audio and Visual Clues

    Directory of Open Access Journals (Sweden)

    Andrej Lukac

    2008-01-01

    Nowadays, there is a lot of information in databases (text, audio/video form, etc.). It is important to be able to describe this data for better orientation within it. It is necessary to apply audio/video properties, which are used for metadata management, segmenting the document into semantically meaningful units, classifying each unit into a predefined scene type, indexing, and summarizing the document for efficient retrieval and browsing. The data can be used by a system that automatically searches for a specific person in a sequence, and also for special video sequences. Audio/video properties are represented by descriptors and description schemes. There are many features that can be used to characterize multimedia signals. Audio and video sequences can be analyzed jointly or considered completely separately. Our aim is oriented toward the possibilities of combining multimedia features. The focus is directed at discussion programs, because there are more options for combining audio features with video sequences.

  14. Audio-Visual Speech Recognition Using Lip Information Extracted from Side-Face Images

    Directory of Open Access Journals (Sweden)

    Koji Iwano

    2007-03-01

    This paper proposes an audio-visual speech recognition method using lip information extracted from side-face images as an attempt to increase noise robustness in mobile environments. Our proposed method assumes that lip images can be captured using a small camera installed in a handset. Two different kinds of lip features, lip-contour geometric features and lip-motion velocity features, are used individually or jointly, in combination with audio features. Phoneme HMMs modeling the audio and visual features are built based on the multistream HMM technique. Experiments conducted using Japanese connected digit speech contaminated with white noise in various SNR conditions show the effectiveness of the proposed method. Recognition accuracy is improved by using the visual information in all SNR conditions. These visual features were confirmed to be effective even when the audio HMM was adapted to noise by the MLLR method.

  15. Audio-visual temporal recalibration can be constrained by content cues regardless of spatial overlap

    Directory of Open Access Journals (Sweden)

    Warrick Roseboom

    2013-04-01

    It has now been well established that the point of subjective synchrony for audio and visual events can be shifted following exposure to asynchronous audio-visual presentations, an effect often referred to as temporal recalibration. Recently it was further demonstrated that it is possible to concurrently maintain two such recalibrated, and opposing, estimates of audio-visual temporal synchrony. However, it remains unclear precisely what defines a given audio-visual pair such that it is possible to maintain a temporal relationship distinct from other pairs. It has been suggested that spatial separation of the different audio-visual pairs is necessary to achieve multiple distinct audio-visual synchrony estimates. Here we investigated whether this is necessarily true. Specifically, we examined whether it is possible to obtain two distinct temporal recalibrations for stimuli that differed only in featural content. Using both complex (audio-visual speech; Experiment 1) and simple stimuli (high and low pitch audio matched with either vertically or horizontally oriented Gabors; Experiment 2), we found concurrent, and opposite, recalibrations despite there being no spatial difference in presentation location at any point throughout the experiment. This result supports the notion that the content of an audio-visual pair can be used to constrain distinct audio-visual synchrony estimates regardless of spatial overlap.

  16. A Cp-theory problem book special features of function spaces

    CERN Document Server

    Tkachuk, Vladimir V

    2014-01-01

    The books in Vladimir Tkachuk’s A Cp-Theory Problem Book series will be the ‘go to’ texts for basic reference to Cp-theory. This second volume, Special Features of Function Spaces, gives a reasonably complete coverage of Cp-theory, systematically introducing each of the major topics and providing  500 carefully selected problems and exercises with complete solutions. Bonus results and open problems are also given. The text is designed to bring a dedicated reader from basic topological principles to the frontiers of modern research covering a wide variety of topics in Cp-theory and general topology at the professional level. The first volume, Topological and Function Spaces © 2011, provided an introduction from scratch to Cp-theory and general topology, preparing the reader for a professional understanding of Cp-theory in the last section of its main text. This second volume continues from the first, and can be used as a textbook for courses in both Cp-theory and general topology as well as a referenc...

  17. Linear sign in cystic brain lesions ≥5 mm: A suggestive feature of perivascular space.

    Science.gov (United States)

    Sung, Jinkyeong; Jang, Jinhee; Choi, Hyun Seok; Jung, So-Lyung; Ahn, Kook-Jin; Kim, Bum-Soo

    2017-11-01

    To determine the prevalence of a linear sign within enlarged perivascular space (EPVS) and chronic lacunar infarction (CLI) ≥ 5 mm on T2-weighted imaging (T2WI) and time-of-flight (TOF) magnetic resonance angiography (MRA), and to evaluate the diagnostic value of the linear signs for EPVS over CLI. This study included 101 patients with cystic lesions ≥ 5 mm on brain MRI including TOF MRA. After classification of cystic lesions into EPVS or CLI, two readers assessed linear signs on T2WI and TOF MRA. We compared the prevalence and the diagnostic performance of linear signs. Among 46 EPVS and 51 CLI, 84 lesions (86.6%) were in basal ganglia. The prevalence of T2 and TOF linear signs was significantly higher in the EPVS than in the CLI (P linear signs showed high sensitivity (> 80%). TOF linear sign showed significantly higher specificity (100%) and accuracy (92.8% and 90.7%) than T2 linear sign (P linear signs were more frequently observed in EPVS than CLI. They showed high sensitivity in differentiation of them, especially for basal ganglia. TOF sign showed higher specificity and accuracy than T2 sign. • Linear sign is a suggestive feature of EPVS. • Time-of-flight magnetic resonance angiography can reveal the lenticulostriate artery within perivascular spaces. • Linear sign helps differentiation of EPVS and CLI, especially in basal ganglia.

  18. WebGL and web audio software lightweight components for multimedia education

    Science.gov (United States)

    Chang, Xin; Yuksel, Kivanc; Skarbek, Władysław

    2017-08-01

    The paper presents the results of our recent work on the development of a contemporary computing platform, DC2, for multimedia education using WebGL and Web Audio, the W3C standards. Using the literate programming paradigm, the WEBSA educational tools were developed. They offer the user (student) access to an expandable collection of WebGL shaders and Web Audio scripts. The unique feature of DC2 is the option of literate programming, offered to both the author and the reader in order to improve interactivity with lightweight WebGL and Web Audio components. For instance, users can define source audio nodes including synthetic sources, destination audio nodes, and nodes for audio processing such as sound wave shaping, spectral band filtering, and convolution-based modification. In the case of WebGL, besides classic graphics effects based on mesh and fractal definitions, novel image processing analysis by shaders is offered, such as nonlinear filtering, histogram of gradients, and Bayesian classifiers.

  19. Audio power amplifier design handbook

    CERN Document Server

    Self, Douglas

    2013-01-01

    This book is essential for audio power amplifier designers and engineers for one simple reason...it enables you as a professional to develop reliable, high-performance circuits. The Author Douglas Self covers the major issues of distortion and linearity, power supplies, overload, DC-protection and reactive loading. He also tackles unusual forms of compensation and distortion produced by capacitors and fuses. This completely updated fifth edition includes four NEW chapters including one on The XD Principle, invented by the author, and used by Cambridge Audio. Cro

  20. Using Simplified Thermal Inertia to Determine the Theoretical Dry Line in Feature Space for Evapotranspiration Retrieval

    Directory of Open Access Journals (Sweden)

    Sujuan Mi

    2015-08-01

    With the development of quantitative remote sensing, regional evapotranspiration (ET) modeling based on the feature space has made substantial progress. Among feature-space-based evapotranspiration models, accurate determination of the dry/wet lines remains a challenging task. This paper reports the development of a new model, named DDTI (Determination of Dry line by Thermal Inertia), which determines the theoretical dry line based on the relationship between the thermal inertia and the soil moisture. The Simplified Thermal Inertia value estimated in the North China Plain is consistent with the value measured in the laboratory. Three evaluation methods were used to assess the performance of the new model DDTI: a comparison of the locations of the theoretical dry line determined by two models (the DDTI model and the heat energy balance model), a comparison of ET results, and a comparison of the evaporative fraction between the estimates from the two models and the in situ measurements. The location of the theoretical dry line determined by DDTI is more reasonable than that determined by the heat energy balance model. ET estimated from DDTI has an RMSE (Root Mean Square Error) of 56.77 W/m² and a bias of 27.17 W/m², while the heat energy balance model estimated ET with an RMSE of 83.36 W/m² and a bias of −38.42 W/m². When comparing the coefficient of determination for the two models with the observations from Yucheng, DDTI estimated ET with an R² of 0.9065, while the heat energy balance model has an R² of 0.7729. When compared with the in situ measurements of evaporative fraction (EF) at Yucheng Experimental Station, the ET model based on DDTI reproduces the pixel scale EF with an RMSE of 0.149, much lower than that based on the heat energy balance model, which has an RMSE of 0.220. Also, the EF bias between the DDTI model and the in situ measurements is 0.064, lower than the EF bias of the heat energy balance model
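
    The two error measures quoted in this abstract, RMSE and bias, can be computed as in the short sketch below. The function names and the flux values are made up for illustration and are not the study's data; bias is taken here as the mean of estimate minus observation, which is an assumption about the sign convention.

```python
import math


def bias(estimates, observations):
    """Mean signed error: positive when the model overestimates on average."""
    return sum(e - o for e, o in zip(estimates, observations)) / len(estimates)


def rmse(estimates, observations):
    """Root mean square error between model estimates and observations."""
    squared = [(e - o) ** 2 for e, o in zip(estimates, observations)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical ET values in W/m^2, chosen only to make the arithmetic easy.
estimated_et = [110.0, 95.0, 130.0, 80.0]
observed_et = [100.0, 100.0, 120.0, 90.0]
print(bias(estimated_et, observed_et))   # 1.25
print(rmse(estimated_et, observed_et))   # about 9.01
```

    A model can have a small bias and a large RMSE at the same time when positive and negative errors cancel, which is why the abstract reports both.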

  1. Engaging Students with Audio Feedback

    Science.gov (United States)

    Cann, Alan

    2014-01-01

    Students express widespread dissatisfaction with academic feedback. Teaching staff perceive a frequent lack of student engagement with written feedback, much of which goes uncollected or unread. Published evidence shows that audio feedback is highly acceptable to students but is underused. This paper explores methods to produce and deliver audio…

  2. Haptic and Audio Interaction Design

    DEFF Research Database (Denmark)

    This book constitutes the refereed proceedings of the 5th International Workshop on Haptic and Audio Interaction Design, HAID 2010 held in Copenhagen, Denmark, in September 2010. The 21 revised full papers presented were carefully reviewed and selected for inclusion in the book. The papers are or...

  3. Radioactive Decay: Audio Data Collection

    Science.gov (United States)

    Struthers, Allan

    2009-01-01

    Many phenomena generate interesting audible time series. This data can be collected and processed using audio software. The free software package "Audacity" is used to demonstrate the process by recording, processing, and extracting click times from an inexpensive radiation detector. The high quality of the data is demonstrated with a simple…

  4. Digital Augmented Reality Audio Headset

    Directory of Open Access Journals (Sweden)

    Jussi Rämö

    2012-01-01

    Augmented reality audio (ARA) combines virtual sound sources with the real sonic environment of the user. An ARA system can be realized with a headset containing binaural microphones. Ideally, the ARA headset should be acoustically transparent, that is, it should not cause audible modification to the surrounding sound. A practical implementation of an ARA mixer requires a low-latency headphone reproduction system with additional equalization to compensate for the attenuation and the modified ear canal resonances caused by the headphones. This paper proposes digital IIR filters to realize the required equalization and evaluates a real-time prototype ARA system. Measurements show that the throughput latency of the digital prototype ARA system can be less than 1.4 ms, which is sufficiently small in practice. When the direct and processed sounds are combined in the ear, a comb filtering effect is brought about and appears as notches in the frequency response. The comb filter effect in speech and music signals was studied in a listening test and it was found to be inaudible when the attenuation is 20 dB. Insert ARA headphones have a sufficient attenuation at frequencies above about 1 kHz. The proposed digital ARA system enables several immersive audio applications, such as a virtual audio tourist guide and audio teleconferencing.
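
    The comb filtering effect described in this abstract can be sketched with a simple two-path model: a unit-gain processed sound delayed by the system latency, summed with a leaked direct sound attenuated by the headset. The model, the function names, and the use of 1.4 ms as the delay are assumptions for illustration; the paper's own analysis may differ.

```python
import math


def comb_ripple_db(attenuation_db):
    """Peak-to-notch depth (dB) of |1 + g*exp(-j*2*pi*f*tau)|,
    where g is the linear gain of the attenuated path."""
    if attenuation_db <= 0:
        raise ValueError("attenuation must be positive in this model")
    g = 10.0 ** (-attenuation_db / 20.0)
    return 20.0 * math.log10((1.0 + g) / (1.0 - g))


def notch_frequencies(delay_s, max_hz):
    """Notches fall where the two paths are in antiphase: f = (2k+1)/(2*tau)."""
    freqs = []
    k = 0
    while (2 * k + 1) / (2.0 * delay_s) <= max_hz:
        freqs.append((2 * k + 1) / (2.0 * delay_s))
        k += 1
    return freqs


print(comb_ripple_db(20.0))             # about 1.74 dB
print(notch_frequencies(0.0014, 1500))  # about 357 Hz and 1071 Hz
```

    Under this model, 20 dB of attenuation leaves less than 2 dB between peaks and notches, which is consistent with the listening-test finding that the effect becomes inaudible at that level.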

  5. Mining potential biomarkers associated with space flight in Caenorhabditis elegans experienced Shenzhou-8 mission with multiple feature selection techniques

    International Nuclear Information System (INIS)

    Zhao, Lei; Gao, Ying; Mi, Dong; Sun, Yeqing

    2016-01-01

    Highlights: • A combined algorithm is proposed to mine biomarkers of spaceflight in C. elegans. • This algorithm makes the feature selection more reliable and robust. • Apply this algorithm to predict 17 positive biomarkers to space environment stress. • The strategy can be used as a general method to select important features. - Abstract: To identify the potential biomarkers associated with space flight, a combined algorithm, which integrates the feature selection techniques, was used to deal with the microarray datasets of Caenorhabditis elegans obtained in the Shenzhou-8 mission. Compared with the ground control treatment, a total of 86 differentially expressed (DE) genes in responses to space synthetic environment or space radiation environment were identified by two filter methods. And then the top 30 ranking genes were selected by the random forest algorithm. Gene Ontology annotation and functional enrichment analyses showed that these genes were mainly associated with metabolism process. Furthermore, clustering analysis showed that 17 genes among these are positive, including 9 for space synthetic environment and 8 for space radiation environment only. These genes could be used as the biomarkers to reflect the space environment stresses. In addition, we also found that microgravity is the main stress factor to change the expression patterns of biomarkers for the short-duration spaceflight.

  6. Mining potential biomarkers associated with space flight in Caenorhabditis elegans experienced Shenzhou-8 mission with multiple feature selection techniques

    Energy Technology Data Exchange (ETDEWEB)

    Zhao, Lei [Institute of Environmental Systems Biology, College of Environmental Science and Engineering, Dalian Maritime University, Dalian 116026 (China); Gao, Ying [Center of Medical Physics and Technology, Hefei Institutes of Physical Science, Chinese Academy of Sciences, Shushanhu Road 350, Hefei 230031 (China); Mi, Dong, E-mail: mid@dlmu.edu.cn [Department of Physics, Dalian Maritime University, Dalian 116026 (China); Sun, Yeqing, E-mail: yqsun@dlmu.edu.cn [Institute of Environmental Systems Biology, College of Environmental Science and Engineering, Dalian Maritime University, Dalian 116026 (China)

    2016-09-15

    Highlights: • A combined algorithm is proposed to mine biomarkers of spaceflight in C. elegans. • This algorithm makes the feature selection more reliable and robust. • Apply this algorithm to predict 17 positive biomarkers to space environment stress. • The strategy can be used as a general method to select important features. - Abstract: To identify the potential biomarkers associated with space flight, a combined algorithm, which integrates the feature selection techniques, was used to deal with the microarray datasets of Caenorhabditis elegans obtained in the Shenzhou-8 mission. Compared with the ground control treatment, a total of 86 differentially expressed (DE) genes in responses to space synthetic environment or space radiation environment were identified by two filter methods. And then the top 30 ranking genes were selected by the random forest algorithm. Gene Ontology annotation and functional enrichment analyses showed that these genes were mainly associated with metabolism process. Furthermore, clustering analysis showed that 17 genes among these are positive, including 9 for space synthetic environment and 8 for space radiation environment only. These genes could be used as the biomarkers to reflect the space environment stresses. In addition, we also found that microgravity is the main stress factor to change the expression patterns of biomarkers for the short-duration spaceflight.

  7. Four-quadrant flyback converter for direct audio power amplification

    Energy Technology Data Exchange (ETDEWEB)

    Ljusev, P.; Andersen, Michael A.E.

    2005-07-01

    This paper presents a bidirectional, four-quadrant flyback converter for use in direct audio power amplification. When compared to the standard Class-D switching-mode audio power amplifier with separate power supply, the proposed four-quadrant flyback converter provides a simple and compact solution with high efficiency, higher level of integration, lower component count, less board space and eventually lower cost. Both peak and average current-mode control for use with 4Q flyback power converters are described and compared. Integrated magnetics is presented which simplifies the construction of the auxiliary power supplies for control biasing and isolated gate drives. The feasibility of the approach is proven on an audio power amplifier prototype for subwoofer applications. (au)

  8. HBC-Evo: predicting human breast cancer by exploiting amino acid sequence-based feature spaces and evolutionary ensemble system.

    Science.gov (United States)

    Majid, Abdul; Ali, Safdar

    2015-01-01

    We developed a genetic programming (GP)-based evolutionary ensemble system for the early diagnosis, prognosis and prediction of human breast cancer. This system has effectively exploited the diversity in feature and decision spaces. First, individual learners are trained in different feature spaces using physicochemical properties of protein amino acids. Their predictions are then stacked to develop the best solution during the GP evolution process. Finally, results for the HBC-Evo system are obtained with an optimal threshold, which is computed using particle swarm optimization. Our novel approach has demonstrated promising results compared to state-of-the-art approaches.

  9. Bit rates in audio source coding

    NARCIS (Netherlands)

    Veldhuis, Raymond N.J.

    1992-01-01

    The goal is to introduce and solve the audio coding optimization problem. Psychoacoustic results such as masking and excitation pattern models are combined with results from rate distortion theory to formulate the audio coding optimization problem. The solution of the audio optimization problem is a

  10. Audio Frequency Analysis in Mobile Phones

    Science.gov (United States)

    Aguilar, Horacio Munguía

    2016-01-01

    A new experiment using mobile phones is proposed in which its audio frequency response is analyzed using the audio port for inputting external signal and getting a measurable output. This experiment shows how the limited audio bandwidth used in mobile telephony is the main cause of the poor speech quality in this service. A brief discussion is…

  11. Machinery running state identification based on discriminant semi-supervised local tangent space alignment for feature fusion and extraction

    International Nuclear Information System (INIS)

    Su, Zuqiang; Xiao, Hong; Zhang, Yi; Tang, Baoping; Jiang, Yonghua

    2017-01-01

    Extraction of sensitive features is a challenging but key task in data-driven machinery running state identification. Aimed at solving this problem, a method for machinery running state identification that applies discriminant semi-supervised local tangent space alignment (DSS-LTSA) for feature fusion and extraction is proposed. Firstly, in order to extract more distinct features, the vibration signals are decomposed by wavelet packet decomposition (WPD), and a mixed-domain feature set consisting of statistical features, autoregressive (AR) model coefficients, instantaneous amplitude Shannon entropy and WPD energy spectrum is extracted to comprehensively characterize the properties of machinery running state(s). Then, the mixed-domain feature set is input into DSS-LTSA for feature fusion and extraction to eliminate redundant information and interference noise. The proposed DSS-LTSA can extract intrinsic structure information of both labeled and unlabeled state samples, and as a result the over-fitting problem of supervised manifold learning and the blindness problem of unsupervised manifold learning are overcome. Simultaneously, class discrimination information is integrated within the dimension reduction process in a semi-supervised manner to improve the sensitivity of the extracted fusion features. Lastly, the extracted fusion features are input into a pattern recognition algorithm to achieve the running state identification. The effectiveness of the proposed method is verified by a running state identification case in a gearbox, and the results confirm the improved accuracy of the running state identification. (paper)

  12. Vascular lesions of the lumbar epidural space: magnetic resonance imaging features of epidural cavernous hemangioma and epidural hematoma

    Directory of Open Access Journals (Sweden)

    Basile Júnior Roberto

    1999-01-01

    The authors report the magnetic resonance imaging diagnostic features in two cases, one of lumbar epidural hematoma and one of cavernous hemangioma of the lumbar epidural space. Enhanced T1-weighted MRI scans showed a hyperintense signal rim surrounding the vascular lesion. Non-enhanced T2-weighted scans showed hyperintense signal.

  13. Recognizing the Face of Johnny, Suzy, and Me: Insensitivity to the Spacing Among Features at 4 Years of Age

    Science.gov (United States)

    Mondloch, Catherine J.; Leis, Anishka; Maurer, Daphne

    2006-01-01

    Four-year-olds were tested for their ability to use differences in the spacing among features to recognize familiar faces. They were given a storybook depicting multiple views of 2 children. They returned to the laboratory 2 weeks later and used a "magic wand" to play a computer game that tested their ability to recognize the familiarized faces…

  14. Audio-Visual Speaker Diarization Based on Spatiotemporal Bayesian Fusion.

    Science.gov (United States)

    Gebru, Israel D; Ba, Sileye; Li, Xiaofei; Horaud, Radu

    2018-05-01

    Speaker diarization consists of assigning speech signals to people engaged in a dialogue. An audio-visual spatiotemporal diarization model is proposed. The model is well suited for challenging scenarios that consist of several participants engaged in multi-party interaction while they move around and turn their heads towards the other participants rather than facing the cameras and the microphones. Multiple-person visual tracking is combined with multiple speech-source localization in order to tackle the speech-to-person association problem. The latter is solved within a novel audio-visual fusion method on the following grounds: binaural spectral features are first extracted from a microphone pair, then a supervised audio-visual alignment technique maps these features onto an image, and finally a semi-supervised clustering method assigns binaural spectral features to visible persons. The main advantage of this method over previous work is that it processes in a principled way speech signals uttered simultaneously by multiple persons. The diarization itself is cast into a latent-variable temporal graphical model that infers speaker identities and speech turns, based on the output of an audio-visual association process, executed at each time slice, and on the dynamics of the diarization variable itself. The proposed formulation yields an efficient exact inference procedure. A novel dataset, that contains audio-visual training data as well as a number of scenarios involving several participants engaged in formal and informal dialogue, is introduced. The proposed method is thoroughly tested and benchmarked with respect to several state-of-the art diarization algorithms.

  15. Multiple-output support vector machine regression with feature selection for arousal/valence space emotion assessment.

    Science.gov (United States)

    Torres-Valencia, Cristian A; Álvarez, Mauricio A; Orozco-Gutiérrez, Alvaro A

    2014-01-01

    Human emotion recognition (HER) allows the assessment of an affective state of a subject. Until recently, such emotional states were described in terms of discrete emotions, like happiness or contempt. In order to cover a high range of emotions, researchers in the field have introduced different dimensional spaces for emotion description that allow the characterization of affective states in terms of several variables or dimensions that measure distinct aspects of the emotion. One of the most common of such dimensional spaces is the bidimensional Arousal/Valence space. To the best of our knowledge, all HER systems so far have modelled independently, the dimensions in these dimensional spaces. In this paper, we study the effect of modelling the output dimensions simultaneously and show experimentally the advantages in modeling them in this way. We consider a multimodal approach by including features from the Electroencephalogram and a few physiological signals. For modelling the multiple outputs, we employ a multiple output regressor based on support vector machines. We also include an stage of feature selection that is developed within an embedded approach known as Recursive Feature Elimination (RFE), proposed initially for SVM. The results show that several features can be eliminated using the multiple output support vector regressor with RFE without affecting the performance of the regressor. From the analysis of the features selected in smaller subsets via RFE, it can be observed that the signals that are more informative into the arousal and valence space discrimination are the EEG, Electrooculogram/Electromiogram (EOG/EMG) and the Galvanic Skin Response (GSR).

  16. Multiple channel space lattice focusing and features of its use in applied RF linac

    International Nuclear Information System (INIS)

    Kushin, V.; Plotnikov, S.; Zarubin, A.; Bondarev, B.; Durkin, A.

    2000-01-01

    The use of multiple-channel accelerator systems with some hundred channels is now well known; it increases the total beam intensity in proportion to the number of channels, while the divergence of the total beam is roughly equal to the divergence of a single channel. The accelerator structure for a multiple-beam linac must provide both transverse and longitudinal stability for every small beam, taking into account the Coulomb interactions of all the micro beams. The most convenient systems for accelerator structures with 100 or more beams are those that use RF focusing, such as RFQ, APF and DTL with rectangular profiles. The common disadvantage of all those systems is that the focusing forces of the RF field decrease as the particle velocity increases. Our analysis shows that this disadvantage may be overcome in structures with rectangular profiles. For this purpose some additional thin (3-5 mm) focusing electrodes called space lattices (SL) must be arranged within the accelerator gaps. The distance between these electrodes is chosen roughly equal to the thickness of the additional electrodes. The number of electrodes must be increased with the length of the accelerator gaps and may be n=1,2...6 and even more. The arrangement of n thin electrodes in the accelerator gaps produces a qualitative change in the accelerator structure parameters. Firstly, they amplify the sign-alternating component of the RF focusing field n times without appreciable influence on the phasing action of the accelerating field. Secondly, the additional electrodes, which divide the gap into n small accelerator gaps, shield the beams from each other within the region of beam acceleration in the RF fields between drift tubes. The analysis shows that if n=4-6, it is possible to reach transverse stability of all particles independently of their input phases in the RF field. On the other hand, the analysis shows that adiabatic change of synchronous phase at the input stage of acceleration helps us

  17. Research of features and structure of electoral space of Ukraine in 2014 with the use of synthetic approach

    Directory of Open Access Journals (Sweden)

    M. M. Shelemba

    2015-02-01

    The article aims to substantiate the expediency of using the author's synthetic model to study the features and structure of the electoral space of Ukraine in 2014. The methodological principles of the synthetic model are set out using qualitative and quantitative methods for researching electoral space, among them factor and cross-correlation analysis. A synthetic model (approach) built on the best scientific approaches takes into account the features and development trends of the electoral space of Ukraine. The features and structure of the electoral space of Ukraine in 2014 are analyzed using the proposed model. Applying the author's synthetic model allows the combined use of factor and correlation analysis to explain support for political parties during election campaigns, depending on the factors and the most important correlates. It was found that electoral choice depends on the actions of those factors in the highest degree the expectations of the region. This article has shown that the use of Ukraine at this stage of the investigated during election campaigns as the most significant social correlates of «Human Development Index» is reasonable and one that makes it possible to obtain reliable results. It is proved that a high level of correlation holds at a high level of support the party and, consequently, high sense of social correlates all variants of expert research.

  18. Smartphone audio port data collection cookbook

    Directory of Open Access Journals (Sweden)

    Kyle Forinash

    2018-06-01

    The audio port of a smartphone is designed to send and receive audio but can be harnessed for portable, economical, and accurate data collection from a variety of sources. While smartphones have internal sensors to measure a number of physical phenomena such as acceleration, magnetism and illumination levels, measurements of other phenomena such as voltage, external temperature, or accurate timing of moving objects are excluded. The audio port can not only be employed to sense external phenomena; it also has the additional advantage of timing precision. Because audio is recorded or played at a controlled rate separated from other smartphone activities, timings based on audio can be highly accurate. The following outlines unpublished details of the audio port technical elements for data collection, a general data collection recipe and an example timing application for Android devices.
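
    The timing argument in this abstract can be illustrated with a short sketch: because samples arrive at a fixed rate, a sample index converts directly to a time stamp with a resolution of one sample period. The 44.1 kHz sample rate, the threshold, the synthetic two-pulse signal and the function name are assumptions made for illustration; none of them come from the article.

```python
def rising_edge_times(samples, sample_rate_hz, threshold):
    """Return the times (in seconds) at which the signal rises to or above threshold."""
    times = []
    above = False
    for index, value in enumerate(samples):
        if value >= threshold and not above:
            # Sample index divided by sample rate gives the event time.
            times.append(index / sample_rate_hz)
        above = value >= threshold
    return times


SAMPLE_RATE_HZ = 44100          # assumed recording rate
signal = [0.0] * SAMPLE_RATE_HZ  # one second of silence

# Two synthetic pulses, e.g. from a hypothetical sensor wired to the microphone input.
for start in (4410, 26460):
    for i in range(start, start + 50):
        signal[i] = 0.8

edges = rising_edge_times(signal, SAMPLE_RATE_HZ, threshold=0.5)
interval_s = edges[1] - edges[0]
resolution_s = 1.0 / SAMPLE_RATE_HZ   # about 22.7 microseconds at 44.1 kHz
```

    A real recording would need noise handling (for example hysteresis or a minimum gap between events), which this sketch leaves out.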

  19. Presence and the utility of audio spatialization

    DEFF Research Database (Denmark)

    Bormann, Karsten

    2005-01-01

    The primary concern of this paper is whether the utility of audio spatialization, as opposed to the fidelity of audio spatialization, impacts presence. An experiment is reported that investigates the presence-performance relationship by decoupling spatial audio fidelity (realism) from task...... performance by varying the spatial fidelity of the audio independently of its relevance to performance on the search task that subjects were to perform. This was achieved by having conditions in which subjects searched for a music-playing radio (an active sound source) and having conditions in which...... supplied only nonattenuated audio was detrimental to performance. Even so, this group of subjects consistently had the largest increase in presence scores over the baseline experiment. Further, the Witmer and Singer (1998) presence questionnaire was more sensitive to whether the audio source was active...

  20. Modified BTC Algorithm for Audio Signal Coding

    Directory of Open Access Journals (Sweden)

    TOMIC, S.

    2016-11-01

    This paper describes a modification of a well-known image coding algorithm, named Block Truncation Coding (BTC), and its application in audio signal coding. The BTC algorithm was originally designed for black and white image coding. Since black and white images and audio signals have different statistical characteristics, the application of this image coding algorithm to audio signals presents a novelty and a challenge. Several implementation modifications are described in this paper, while the original idea of the algorithm is preserved. The main modifications are performed in the area of signal quantization, by designing more adequate quantizers for audio signal processing. The result is a novel audio coding algorithm, whose performance is presented and analyzed in this research. The performance analysis indicates that this novel algorithm can be successfully applied in audio signal coding.
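
    For orientation, the following is a minimal sketch of the classic two-level, moment-preserving BTC applied to a one-dimensional block of samples. It is not the modified quantizer proposed in the paper, whose details the abstract does not give; the block values and function names are illustrative assumptions.

```python
import math


def btc_encode(block):
    """Classic BTC: one bit per sample plus two levels that preserve mean and variance."""
    m = len(block)
    mean = sum(block) / m
    std = math.sqrt(sum((x - mean) ** 2 for x in block) / m)
    bits = [1 if x >= mean else 0 for x in block]
    q = sum(bits)  # number of samples at or above the mean
    if q == 0 or q == m:
        # Constant block (possibly with rounding noise): a single level suffices.
        return bits, mean, mean
    low = mean - std * math.sqrt(q / (m - q))
    high = mean + std * math.sqrt((m - q) / q)
    return bits, low, high


def btc_decode(bits, low, high):
    """Rebuild the block from the bit plane and the two reconstruction levels."""
    return [high if b else low for b in bits]


# Hypothetical block of eight audio samples.
block = [0.10, -0.25, 0.40, 0.05, -0.30, 0.20, 0.35, -0.15]
bits, low, high = btc_encode(block)
decoded = btc_decode(bits, low, high)

orig_mean = sum(block) / len(block)
dec_mean = sum(decoded) / len(decoded)
orig_var = sum((x - orig_mean) ** 2 for x in block) / len(block)
dec_var = sum((x - dec_mean) ** 2 for x in decoded) / len(decoded)
```

    The decoded block has the same mean and variance as the input, which is the property the classic algorithm is built around; each block is stored as one bit per sample plus the two levels.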

  1. Functionality of system components: Conservation of protein function in protein feature space

    DEFF Research Database (Denmark)

    Jensen, Lars Juhl; Ussery, David; Brunak, Søren

    2003-01-01

    well on organisms other than the one on which it was trained. We evaluate the performance of such a method, ProtFun, which relies on protein features as its sole input, and show that the method gives similar performance for most eukaryotes and performs much better than anticipated on archaea......Many protein features useful for prediction of protein function can be predicted from sequence, including posttranslational modifications, subcellular localization, and physical/chemical properties. We show here that such protein features are more conserved among orthologs than paralogs, indicating...... they are crucial for protein function and thus subject to selective pressure. This means that a function prediction method based on sequence-derived features may be able to discriminate between proteins with different function even when they have highly similar structure. Also, such a method is likely to perform...

  2. Active Learning for Automatic Audio Processing of Unwritten Languages (ALAPUL)

    Science.gov (United States)

    2016-07-01

    AFRL-RH-WP-TR-2016-0074 ACTIVE LEARNING FOR AUTOMATIC AUDIO PROCESSING OF UNWRITTEN LANGUAGES (ALAPUL) Dimitra Vergyri Andreas Kathol Wen Wang...FA8650-15-C-9101 5b. GRANT NUMBER 5c. PROGRAM ELEMENT NUMBER 6. AUTHOR(S) *Dimitra Vergyri; Andreas Kathol; Wen Wang; Chris Bartels; Julian VanHout...feature transform through deep auto-encoders for better phone recognition performance. We target iterative learning to improve the system through

  3. Making the Switch to Digital Audio

    Directory of Open Access Journals (Sweden)

    Shannon Gwin Mitchell

    2004-12-01

    In this article, the authors describe the process of converting from analog to digital audio data. They address the step-by-step decisions that they made in selecting hardware and software for recording and converting digital audio, issues of system integration, and cost considerations. The authors present a brief description of how digital audio is being used in their current research project and how it has enhanced the “quality” of their qualitative research.

  4. New audio applications of beryllium metal

    International Nuclear Information System (INIS)

    Sato, M.

    1977-01-01

    The major applications of beryllium metal in the field of audio appliances are the vibrating cones for two types of speakers, the 'TWEETER' for high-range sound and the 'SQUAWKER' for mid-range sound, and the beryllium cantilever tube assembled in stereo cartridges. These new applications are based on the characteristic property of beryllium of having a high ratio of modulus of elasticity to specific gravity. The production of these audio parts is described, and the audio response is shown. (author)

  5. Noise-Canceling Helmet Audio System

    Science.gov (United States)

    Seibert, Marc A.; Culotta, Anthony J.

    2007-01-01

    A prototype helmet audio system has been developed to improve voice communication for the wearer in a noisy environment. The system was originally intended to be used in a space suit, wherein noise generated by airflow of the spacesuit life-support system can make it difficult for remote listeners to understand the astronaut's speech and can interfere with the astronaut's attempt to issue vocal commands to a voice-controlled robot. The system could be adapted to terrestrial use in helmets of protective suits that are typically worn in noisy settings: examples include biohazard, fire, rescue, and diving suits. The system (see figure) includes an array of microphones and small loudspeakers mounted at fixed positions in a helmet, amplifiers and signal-routing circuitry, and a commercial digital signal processor (DSP). Notwithstanding the fixed positions of the microphones and loudspeakers, the system can accommodate itself to any normal motion of the wearer's head within the helmet. The system operates in conjunction with a radio transceiver. An audio signal arriving via the transceiver intended to be heard by the wearer is adjusted in volume and otherwise conditioned and sent to the loudspeakers. The wearer's speech is collected by the microphones, the outputs of which are logically combined (phased) so as to form a microphone-array directional sensitivity pattern that discriminates in favor of sounds coming from the vicinity of the wearer's mouth and against sounds coming from elsewhere. In the DSP, digitized samples of the microphone outputs are processed to filter out airflow noise and to eliminate feedback from the loudspeakers to the microphones. The resulting conditioned version of the wearer's speech signal is sent to the transceiver.

  6. Local Control of Audio Environment: A Review of Methods and Applications

    Directory of Open Access Journals (Sweden)

    Jussi Kuutti

    2014-02-01

    The concept of a local audio environment is to have sound playback locally restricted such that, ideally, adjacent regions of an indoor or outdoor space could exhibit their own individual audio content without interfering with each other. This would enable people to listen to their content of choice without disturbing others next to them, yet without any headphones to block conversation. In practice, perfect sound containment in free air cannot be attained, but a local audio environment can still be satisfactorily approximated using directional speakers. Directional speakers may be based on regular audible frequencies or they may employ modulated ultrasound. Planar, parabolic, and array form factors are commonly used. The directivity of a speaker improves as its surface area and sound frequency increase, making these the main design factors for directional audio systems. Even directional speakers radiate some sound outside the main beam, and sound can also reflect from objects. Therefore, directional speaker systems perform best when there is enough ambient noise to mask the leaking sound. Possible areas of application for local audio include information and advertisement audio feed in commercial facilities, guiding and narration in museums and exhibitions, office space personalization, control room messaging, rehabilitation environments, and entertainment audio systems.

  7. BAT: An open-source, web-based audio events annotation tool

    OpenAIRE

    Blai Meléndez-Catalan, Emilio Molina, Emilia Gómez

    2017-01-01

    In this paper we present BAT (BMAT Annotation Tool), an open-source, web-based tool for the manual annotation of events in audio recordings developed at BMAT (Barcelona Music and Audio Technologies). The main feature of the tool is that it provides an easy way to annotate the salience of simultaneous sound sources. Additionally, it allows to define multiple ontologies to adapt to multiple tasks and offers the possibility to cross-annotate audio data. Moreover, it is easy to install and deploy...

  8. Probing features in inflaton potential and reionization history with future CMB space observations

    Science.gov (United States)

    Hazra, Dhiraj Kumar; Paoletti, Daniela; Ballardini, Mario; Finelli, Fabio; Shafieloo, Arman; Smoot, George F.; Starobinsky, Alexei A.

    2018-02-01

    We consider the prospects of probing features in the primordial power spectrum with future Cosmic Microwave Background (CMB) polarization measurements. In the scope of the inflationary scenario, such features in the spectrum can be produced by local non-smooth pieces in an inflaton potential (smooth and quasi-flat in general) which in turn may originate from fast phase transitions during inflation in other quantum fields interacting with the inflaton. They can fit some outliers in the CMB temperature power spectrum which are unaddressed within the standard inflationary ΛCDM model. We consider Wiggly Whipped Inflation (WWI) as a theoretical framework leading to improvements in the fit to the Planck 2015 temperature and polarization data in comparison with the standard inflationary models, although not at a statistically significant level. We show that some types of features in the potential within the WWI models, leading to oscillations in the primordial power spectrum that extend to intermediate and small scales, can be constrained with high confidence (at 3σ or higher confidence level) by an instrument such as the Cosmic ORigins Explorer (CORE). In order to investigate the possible confusion between inflationary features and footprints from the reionization era, we consider an extended reionization history with a monotonic increase of free electrons with decreasing redshift. We discuss the present constraints on this model of extended reionization and future predictions with CORE. We also project to what extent this extended reionization can create confusion in identifying inflationary features in the data.

  9. Feature Space Dimensionality Reduction for Real-Time Vision-Based Food Inspection

    Directory of Open Access Journals (Sweden)

    Mai Moussa CHETIMA

    2009-03-01

    Machine vision solutions are becoming a standard for quality inspection in several manufacturing industries. In the processed-food industry, where the appearance attributes of the product are essential to customer satisfaction, visual inspection can be reliably achieved with machine vision. But such systems often involve the extraction of a larger number of features than those actually needed to ensure proper quality control, making the process less efficient and difficult to tune. This work experiments with several feature selection techniques in order to reduce the number of attributes analyzed by a real-time vision-based food inspection system. Identifying and removing as much irrelevant and redundant information as possible reduces the dimensionality of the data and allows classification algorithms to operate faster. In some cases, classification accuracy can even be improved. Filter-based and wrapper-based feature selectors are experimentally evaluated on different bakery products to identify the best performing approaches.
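
    As a rough illustration of what a filter-based selector does, the sketch below scores each feature on its own, without training any classifier, and ranks the features by that score. The scoring rule (absolute Pearson correlation with a binary label), the toy data and the function names are assumptions for illustration; the abstract does not say which filters the authors evaluated.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences; 0.0 if either is constant."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def rank_features(rows, labels):
    """Return feature indices ordered from most to least correlated with the labels."""
    n_features = len(rows[0])
    scored = []
    for j in range(n_features):
        column = [row[j] for row in rows]
        scored.append((abs(pearson(column, labels)), j))
    # Highest score first; ties broken by feature index.
    return [j for _, j in sorted(scored, key=lambda item: (-item[0], item[1]))]


# Hypothetical data: 4 inspected items, 3 features each, label 1 = acceptable.
rows = [
    [0.9, 5.0, 0.2],
    [0.8, 5.0, 0.9],
    [0.2, 5.0, 0.3],
    [0.1, 5.0, 0.6],
]
labels = [1, 1, 0, 0]
ranking = rank_features(rows, labels)   # feature 1 is constant, so it ranks last
top_k = ranking[:2]                     # keep the two highest-scoring features
```

    A wrapper-based selector would instead retrain and evaluate a classifier on candidate feature subsets, which is typically slower but takes feature interactions into account.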

  10. SOCIAL DISTANCES AS A FEATURE OF THE CONTEMPORARY RUSSIAN SOCIAL SPACE

    Directory of Open Access Journals (Sweden)

    Л А Беляева

    2018-12-01

    Social space is a theoretical construct that allows many key problems of social development to be considered, including the society’s consolidation. The author defines social space as a set of social statuses and distances. Their objective characteristics are interrelated with subjective indicators identified through the opinions of individuals. The balance of statuses and distances in society and the acceptability of this structure for the majority of the population ensure the stability of society and effective social control. If this balance is disturbed, social tensions arise and threaten the stability and consolidation of society. Thus, the ideas of the theories of social space possess a considerable heuristic potential for revealing urgent problems of social development such as solidarity, social stratification and mobility, social networks and their interaction, connections of local communities within and with the world, interaction of structured social relations and individual and collective practices, genesis of social space as a result of social production represented by both things and relationships, etc. Following the theory of P. Bourdieu, the author considers social space as a structure of social statuses based on the set of different types of capital: economic, cultural, social, and symbolic. The author uses statistical data and results of the monitoring survey conducted on the all-Russian sample. The article proposes some tested empirical indicators that proved the increase of social distances in Russia due to the redistribution of economic capital and, as a consequence, of cultural and social capitals. Thus, the social space of Russia cannot be considered stable. To ensure its greater stability we need a set of measures to reduce social distances: re-industrialization to create high-tech jobs, development of digital economy, and improvement of the mass secondary and higher education system - these measures can create a basis for the

  11. FEATURES OF PSYCHOLOGICAL SPACE SOVEREIGNTY MAINTAINED BY PEOPLE WITH DIFFERENT ATTITUDE TO SOLITUDE

    Directory of Open Access Journals (Sweden)

    Nadezhda Alekseevna Garipova

    2017-06-01

    Practical implications. The results can be useful for developing psychocorrection sessions and trainings. The data can be helpful for specialists of Family Psychological Support centers and for instructors of the “Ecological Psychology” and “Family Relations Psychology” disciplines. The study is likely to be highly educational, since many respondents participating in the survey admitted that they had never considered personal boundary violations to be a reason for marital conflicts. They also lacked information about psychological space, how to regulate personal space boundaries, and how to respond adequately to other family members' behavior.

  12. Correspondence between audio and visual deep models for musical instrument detection in video recordings

    OpenAIRE

    Slizovskaia, Olga; Gómez, Emilia; Haro, Gloria

    2017-01-01

    This work aims at investigating cross-modal connections between audio and video sources in the task of musical instrument recognition. We also address in this work the understanding of the representations learned by convolutional neural networks (CNNs) and we study feature correspondence between audio and visual components of a multimodal CNN architecture. For each instrument category, we select the most activated neurons and investigate existing cross-correlations between neurons from the ...

  13. Study on the construction of multi-dimensional Remote Sensing feature space for hydrological drought

    International Nuclear Information System (INIS)

    Xiang, Daxiang; Tan, Debao; Wen, Xiongfei; Shen, Shaohong; Li, Zhe; Cui, Yuanlai

    2014-01-01

    Hydrological drought refers to an abnormal water shortage caused by precipitation and surface water shortages or a groundwater imbalance. Hydrological drought is reflected in a drop of surface water, a decrease of vegetation productivity, an increase of the temperature difference between day and night, and so on. Remote sensing permits the observation of surface water, vegetation, temperature and other information from a macro perspective. This paper analyzes the correlation and differentiation of remote sensing and surface-measured indicators, after the selection and extraction of a series of representative remote sensing characteristic parameters according to the spectral characterization of surface features in remote sensing imagery, such as vegetation index, surface temperature and surface water from HJ-1A/B CCD/IRS data. Finally, multi-dimensional remote sensing features for hydrological drought are built on an intelligent collaborative model. Further, for the Dong-ting lake area, two drought events are analyzed for verification of the multi-dimensional features using remote sensing data with different phases and field observation data. The experimental results proved that multi-dimensional features are a good method for hydrological drought

  14. Fault-tolerant feature-based estimation of space debris rotational motion during active removal missions

    Science.gov (United States)

    Biondi, Gabriele; Mauro, Stefano; Pastorelli, Stefano; Sorli, Massimo

    2018-05-01

    One of the key functionalities required by an Active Debris Removal mission is the assessment of the target kinematics and inertial properties. Passive sensors, such as stereo cameras, are often included in the onboard instrumentation of a chaser spacecraft for capturing sequential photographs and for tracking features of the target surface. Plenty of methods based on Kalman filtering are available for the estimation of the target's state from feature positions; however, to guarantee the filter convergence, they typically require continuity of measurements and the capability of tracking a fixed set of pre-defined features of the object. These requirements clash with the actual tracking conditions: failures in feature detection often occur and the assumption of having some a-priori knowledge about the shape of the target could be restrictive in certain cases. The aim of the presented work is to propose a fault-tolerant alternative method for estimating the angular velocity and the relative magnitudes of the principal moments of inertia of the target. Raw data regarding the positions of the tracked features are processed to evaluate corrupted values of a 3-dimensional parameter which entirely describes the finite screw motion of the debris and which primarily is invariant to the particular set of considered features of the object. Missing values of the parameter are completely restored exploiting the typical periodicity of the rotational motion of an uncontrolled satellite: compressed sensing techniques, typically adopted for recovering images or for prognostic applications, are herein used in a completely original fashion for retrieving a kinematic signal that appears sparse in the frequency domain. Owing to its invariance with respect to the features, no assumptions are needed about the target's shape and continuity of the tracking. The obtained signal is useful for the indirect evaluation of an attitude signal that feeds an unscented Kalman filter for the estimation of
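    The restoration step described above can be illustrated with a small sketch. It assumes a scalar signal that is exactly sparse on a DCT frequency grid and uses orthogonal matching pursuit as a stand-in solver; the abstract does not name the compressed sensing algorithm, and the signal length, the 25% of missing samples, and the three chosen frequencies are all hypothetical.

```python
import numpy as np

def dct_basis(n):
    """Orthonormal DCT-II synthesis matrix: each column is one cosine atom."""
    t = np.arange(n)[:, None] + 0.5
    k = np.arange(n)[None, :]
    basis = np.cos(np.pi * t * k / n) * np.sqrt(2.0 / n)
    basis[:, 0] /= np.sqrt(2.0)
    return basis

def omp(a, y, sparsity):
    """Orthogonal matching pursuit: greedy sparse solution of y = a @ s."""
    norms = np.linalg.norm(a, axis=0)
    residual = y.copy()
    support = []
    coef = np.zeros(0)
    for _ in range(sparsity):
        # Pick the atom most correlated with what is still unexplained.
        scores = np.abs(a.T @ residual) / norms
        if support:
            scores[support] = 0.0
        support.append(int(np.argmax(scores)))
        # Refit all selected atoms jointly by least squares.
        coef, *_ = np.linalg.lstsq(a[:, support], y, rcond=None)
        residual = y - a[:, support] @ coef
    s = np.zeros(a.shape[1])
    s[support] = coef
    return s

# Hypothetical periodic signal: three active frequencies out of 256.
rng = np.random.default_rng(0)
n = 256
basis = dct_basis(n)
s_true = np.zeros(n)
s_true[[5, 17, 40]] = [1.0, 0.8, 0.6]
x_true = basis @ s_true

# Simulate tracking failures: only 192 of the 256 samples are observed.
kept = np.sort(rng.choice(n, size=192, replace=False))
s_hat = omp(basis[kept, :], x_true[kept], sparsity=3)

# Rebuild the full signal, including the samples that were never measured.
x_hat = basis @ s_hat
max_err = np.max(np.abs(x_hat - x_true))
```

    The sketch is noise-free and treats the sparsity level as known; a real pipeline would have to handle measurement noise and choose a stopping rule.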

  15. Distortion Estimation in Compressed Music Using Only Audio Fingerprints

    NARCIS (Netherlands)

    Doets, P.J.O.; Lagendijk, R.L.

    2008-01-01

    An audio fingerprint is a compact yet very robust representation of the perceptually relevant parts of an audio signal. It can be used for content-based audio identification, even when the audio is severely distorted. Audio compression changes the fingerprint slightly. We show that these small

  16. Non-retinotopic feature processing in the absence of retinotopic spatial layout and the construction of perceptual space from motion.

    Science.gov (United States)

    Ağaoğlu, Mehmet N; Herzog, Michael H; Oğmen, Haluk

    2012-10-15

    The spatial representation of a visual scene in the early visual system is well known. The optics of the eye map the three-dimensional environment onto two-dimensional images on the retina. These retinotopic representations are preserved in the early visual system. Retinotopic representations and processing are among the most prevalent concepts in visual neuroscience. However, it has long been known that a retinotopic representation of the stimulus is neither sufficient nor necessary for perception. Saccadic Stimulus Presentation Paradigm and the Ternus-Pikler displays have been used to investigate non-retinotopic processes with and without eye movements, respectively. However, neither of these paradigms eliminates the retinotopic representation of the spatial layout of the stimulus. Here, we investigated how stimulus features are processed in the absence of a retinotopic layout and in the presence of retinotopic conflict. We used anorthoscopic viewing (slit viewing) and pitted a retinotopic feature-processing hypothesis against a non-retinotopic feature-processing hypothesis. Our results support the predictions of the non-retinotopic feature-processing hypothesis and demonstrate the ability of the visual system to operate non-retinotopically at a fine feature processing level in the absence of a retinotopic spatial layout. Our results suggest that perceptual space is actively constructed from the perceptual dimension of motion. The implications of these findings for normal ecological viewing conditions are discussed.

  17. Audio Recording of Children with Dyslalia

    Directory of Open Access Journals (Sweden)

    Stefan Gheorghe Pentiuc

    2008-01-01

    In this paper we present our research on the automatic parsing of audio recordings. These recordings are obtained from children with dyslalia and are necessary for an accurate identification of speech problems. We develop a software application that helps parse audio recordings in real time.

  18. Audio Recording of Children with Dyslalia

    OpenAIRE

    Stefan Gheorghe Pentiuc; Maria D. Schipor; Ovidiu A. Schipor

    2008-01-01

    In this paper we present our research on the automatic parsing of audio recordings. These recordings are obtained from children with dyslalia and are necessary for an accurate identification of speech problems. We develop a software application that helps parse audio recordings in real time.

  19. A listening test system for automotive audio

    DEFF Research Database (Denmark)

    Christensen, Flemming; Geoff, Martin; Minnaar, Pauli

    2005-01-01

    This paper describes a system for simulating automotive audio through headphones for the purposes of conducting listening experiments in the laboratory. The system is based on binaural technology and consists of a component for reproducing the sound of the audio system itself and a component...

  20. Fusion for Audio-Visual Laughter Detection

    NARCIS (Netherlands)

    Reuderink, B.

    2007-01-01

    Laughter is a highly variable signal, and can express a spectrum of emotions. This makes the automatic detection of laughter a challenging but interesting task. We perform automatic laughter detection using audio-visual data from the AMI Meeting Corpus. Audio-visual laughter detection is performed

  1. Digital signal processor for silicon audio playback devices; Silicon audio saisei kikiyo digital signal processor

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    2000-03-01

    The TC9446F series of digital audio signal processors (DSPs) has been developed for silicon audio playback devices that use a memory medium such as flash memory, for DVD players, and for AV devices such as TV sets. It supports AAC (advanced audio coding) (2ch) and MP3 (MPEG1 Layer3), the audio compression techniques used for transmitting music over the Internet. It also supports compressed formats such as Dolby Digital, DTS (digital theater system) and MPEG2 audio, which are adopted for, e.g., DVDs. It can carry built-in audio signal processing programs such as Dolby ProLogic, equalizer, sound field control, and 3D sound. The TC9446XB has been newly added to the lineup. It adopts an FBGA (fine pitch ball grid array) package for portable audio devices. (translated by NEDO)

  2. Design of batch audio/video conversion platform based on JavaEE

    Science.gov (United States)

    Cui, Yansong; Jiang, Lianpin

    2018-03-01

    With the rapid development of the digital publishing industry, audio/video publishing is characterized by diverse coding standards for audio and video files, massive data volumes, and other significant features. Faced with massive and diverse data, digital publishing organizations find it very difficult to convert files quickly and efficiently to a unified coding format. In view of this demand and the present situation, this paper proposes a distributed online audio and video format conversion platform with a B/S structure, based on the Spring+SpringMVC+MyBatis development architecture and combined with the open-source FFmpeg format conversion tool. Based on the Java language, the paper analyzes the key technologies and strategies used in the design of the platform architecture, and designs and develops an efficient audio and video format conversion system composed of a “front display system”, a “core scheduling server” and a “conversion server”. The test results show that, compared with an ordinary audio and video conversion scheme, the batch audio and video format conversion platform can effectively improve the conversion efficiency of audio and video files and reduce the complexity of the work. Practice has shown that the key technology discussed in this paper can be applied in the field of large-batch file processing and has practical application value.
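    As a rough illustration of the scheduling idea, the sketch below builds one FFmpeg command per input file and spreads the jobs over several conversion workers. It is written in Python rather than the Java stack the paper uses, the function names and the round-robin policy are assumptions made for illustration, and only standard FFmpeg options (`-y`, `-i`, `-c:v`, `-c:a`) appear in the generated commands.

```python
from pathlib import Path

def build_ffmpeg_command(src, out_dir, container="mp4",
                         vcodec="libx264", acodec="aac"):
    """Return the argument list that converts one file to the unified format."""
    src = Path(src)
    dst = Path(out_dir) / (src.stem + "." + container)
    # -y overwrites existing output; -c:v / -c:a select the target codecs.
    return ["ffmpeg", "-y", "-i", str(src),
            "-c:v", vcodec, "-c:a", acodec, str(dst)]

def plan_batch(sources, out_dir, workers):
    """Assign conversion jobs to workers in round-robin order (hypothetical policy)."""
    queues = [[] for _ in range(workers)]
    for i, src in enumerate(sources):
        queues[i % workers].append(build_ffmpeg_command(src, out_dir))
    return queues
```

    Each queue could then be handed to a separate conversion server, for example by passing every command to `subprocess.run`; the sketch stops at planning so that it runs without FFmpeg installed.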

  3. “Wrapping” X3DOM around Web Audio API

    Directory of Open Access Journals (Sweden)

    Andreas Stamoulias

    2015-12-01

    Spatial sound plays a conceptual role in Web3D environments, owing to the highly realistic scenes it can provide. Lately, efforts have concentrated on extending X3D/X3DOM with spatial sound attributes. This paper presents a novel method for introducing spatial sound components into the X3DOM framework, based on the X3D specification and the Web Audio API. The proposed method introduces enhanced sound nodes for X3DOM, which are derived from the implementation of the X3D standard components and enriched with additional features of the Web Audio API. Moreover, several example scenarios were developed to evaluate our approach. The implemented examples established the feasibility of the newly registered nodes in X3DOM for spatial sound characteristics in Web3D virtual worlds.

  4. Computationally Efficient Clustering of Audio-Visual Meeting Data

    Science.gov (United States)

    Hung, Hayley; Friedland, Gerald; Yeo, Chuohao

    This chapter presents novel computationally efficient algorithms to extract semantically meaningful acoustic and visual events related to each of the participants in a group discussion using the example of business meeting recordings. The recording setup involves relatively few audio-visual sensors, comprising a limited number of cameras and microphones. We first demonstrate computationally efficient algorithms that can identify who spoke and when, a problem in speech processing known as speaker diarization. We also extract visual activity features efficiently from MPEG4 video by taking advantage of the processing that was already done for video compression. Then, we present a method of associating the audio-visual data together so that the content of each participant can be managed individually. The methods presented in this article can be used as a principal component that enables many higher-level semantic analysis tasks needed in search, retrieval, and navigation.

  5. Features of Virchow-Robin spaces in newly diagnosed multiple sclerosis patients

    Energy Technology Data Exchange (ETDEWEB)

    Etemadifar, Masoud [Department of Clinical and Biological Sciences, Division of Neurology, San Luigi Gonzaga School of Medicine, Orbassano (Torino), Turin (Italy); Department of Neurology, Isfahan University of Medical Sciences, Isfahan (Iran, Islamic Republic of); Isfahan Research Committee of Multiple Sclerosis (IRCOMS), Isfahan University of Medical Sciences, Isfahan (Iran, Islamic Republic of); Hekmatnia, Ali; Tayari, Nazila [Department of Radiology, Isfahan University of Medical Sciences, Isfahan (Iran, Islamic Republic of); Kazemi, Mojtaba [Department of Neurology, Isfahan University of Medical Sciences, Isfahan (Iran, Islamic Republic of); Ghazavi, Amirhossein [Department of Radiology, Isfahan University of Medical Sciences, Isfahan (Iran, Islamic Republic of); Akbari, Mojtaba [Department of Epidemiology and Statistics, Isfahan University of Medical Sciences, Isfahan (Iran, Islamic Republic of); Maghzi, Amir-Hadi, E-mail: maghzi@edc.mui.ac.ir [Isfahan Research Committee of Multiple Sclerosis (IRCOMS), Isfahan University of Medical Sciences, Isfahan (Iran, Islamic Republic of); Neuroimmunology Unit, Centre for Neuroscience and Trauma, Blizard Institute of Cell and Molecular Science, Barts and the London School of Medicine and Dentistry, London (United Kingdom); Isfahan Neurosciences Research Center, Isfahan University of Medical Sciences, Isfahan (Iran, Islamic Republic of)

    2011-11-15

    Background: Virchow-Robin spaces (VRSs) are perivascular pia-lined extensions of the subarachnoid space around the arteries and veins as they enter the brain parenchyma. These spaces are responsible for inflammatory processes within the brain. Objectives: This study was designed to shed more light on the location, size and shape of VRSs on 3 mm slice thickness, 1.5 Tesla MRI scans of newly diagnosed MS patients in Isfahan, Iran and compare the results with healthy age- and sex-matched controls. Methods: We evaluated MRI scans of 73 MS patients obtained within 3 months of MS onset and compared them with MRI scans from 73 age- and sex-matched healthy volunteers. Three-mm section proton density, T2W and FLAIR MR images were obtained for all subjects. The location, size and shape of VRSs were compared between the two groups. Results: The total number of VRSs was significantly higher in the MS group (p < 0.001). VRSs were significantly more often located in the high convexity areas in the MS group (p < 0.001), while there were no significant differences in other regions. Round-shaped VRSs were detected significantly more often on MRI scans of MS patients, and curvilinear shapes were observed significantly more frequently in healthy volunteers; however, there were no significant differences for oval-shaped VRSs between the two groups. VRSs larger than 2 mm were observed significantly more often in the MS group than in controls. We also observed some differences in the characteristics of VRSs between the genders in the MS group. Conclusion: The results of this study shed more light on the usefulness of VRSs as an MRI marker for the disease. In addition, according to our results, VRSs might also have implications for determining the prognosis of the disease. However, larger studies with more advanced MRI techniques are required to confirm our results.

  6. Features of Virchow-Robin spaces in newly diagnosed multiple sclerosis patients

    International Nuclear Information System (INIS)

    Etemadifar, Masoud; Hekmatnia, Ali; Tayari, Nazila; Kazemi, Mojtaba; Ghazavi, Amirhossein; Akbari, Mojtaba; Maghzi, Amir-Hadi

    2011-01-01

    Background: Virchow-Robin spaces (VRSs) are perivascular pia-lined extensions of the subarachnoid space around the arteries and veins as they enter the brain parenchyma. These spaces are responsible for inflammatory processes within the brain. Objectives: This study was designed to shed more light on the location, size and shape of VRSs on 3 mm slice thickness, 1.5 Tesla MRI scans of newly diagnosed MS patients in Isfahan, Iran and compare the results with healthy age- and sex-matched controls. Methods: We evaluated MRI scans of 73 MS patients obtained within 3 months of MS onset and compared them with MRI scans from 73 age- and sex-matched healthy volunteers. Three-mm section proton density, T2W and FLAIR MR images were obtained for all subjects. The location, size and shape of VRSs were compared between the two groups. Results: The total number of VRSs was significantly higher in the MS group (p < 0.001). VRSs were significantly more often located in the high convexity areas in the MS group (p < 0.001), while there were no significant differences in other regions. Round-shaped VRSs were detected significantly more often on MRI scans of MS patients, and curvilinear shapes were observed significantly more frequently in healthy volunteers; however, there were no significant differences for oval-shaped VRSs between the two groups. VRSs larger than 2 mm were observed significantly more often in the MS group than in controls. We also observed some differences in the characteristics of VRSs between the genders in the MS group. Conclusion: The results of this study shed more light on the usefulness of VRSs as an MRI marker for the disease. In addition, according to our results, VRSs might also have implications for determining the prognosis of the disease. However, larger studies with more advanced MRI techniques are required to confirm our results.

  7. Flexible feature-space-construction architecture and its VLSI implementation for multi-scale object detection

    Science.gov (United States)

    Luo, Aiwen; An, Fengwei; Zhang, Xiangyu; Chen, Lei; Huang, Zunkai; Jürgen Mattausch, Hans

    2018-04-01

    Feature extraction techniques are a cornerstone of object detection in computer-vision-based applications. The detection performance of vision-based detection systems is often degraded by, e.g., changes in the illumination intensity of the light source, foreground-background contrast variations or automatic gain control from the camera. In order to avoid such degradation effects, we present a block-based L1-norm-circuit architecture which is configurable for different image-cell sizes, cell-based feature descriptors and image resolutions according to customization parameters from the circuit input. The incorporated flexibility in both the image resolution and the cell size for multi-scale image pyramids leads to lower computational complexity and power consumption. Additionally, an object-detection prototype for performance evaluation in 65 nm CMOS implements the proposed L1-norm circuit together with a histogram of oriented gradients (HOG) descriptor and a support vector machine (SVM) classifier. The proposed parallel architecture with high hardware efficiency enables real-time processing, high detection robustness, small chip-core area as well as low power consumption for multi-scale object detection.
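    A software analogue of block-based L1 normalization may help show why it counters gain and illumination changes. The sketch below is not the paper's circuit: the array layout, the 2x2 block size, the stride of one cell, and the `eps` guard are assumptions chosen for illustration.

```python
import numpy as np

def l1_normalize_blocks(cell_hist, block=2, eps=1e-6):
    """L1-normalize overlapping blocks of per-cell orientation histograms.

    cell_hist has shape (rows, cols, bins); the result has shape
    (rows - block + 1, cols - block + 1, block * block * bins).
    """
    rows, cols, bins = cell_hist.shape
    out = np.empty((rows - block + 1, cols - block + 1, block * block * bins))
    for r in range(rows - block + 1):
        for c in range(cols - block + 1):
            # Concatenate the histograms of all cells in this block.
            v = cell_hist[r:r + block, c:c + block, :].ravel()
            # Dividing by the L1 norm cancels a common gain factor.
            out[r, c] = v / (np.abs(v).sum() + eps)
    return out
```

    Multiplying every histogram by the same gain leaves the normalized descriptor almost unchanged, which is the robustness property the abstract attributes to the normalization stage.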

  8. Reducing the n-gram feature space of class C GPCRs to subtype-discriminating patterns

    Directory of Open Access Journals (Sweden)

    König Caroline

    2014-12-01

    G protein-coupled receptors (GPCRs) are a large and heterogeneous superfamily of receptors that are key cell players for their role as extracellular signal transmitters. Class C GPCRs, in particular, are of great interest in pharmacology. The lack of knowledge about their full 3-D structure prompts the use of their primary amino acid sequences for the construction of robust classifiers, capable of discriminating their different subtypes. In this paper, we investigate the use of feature selection techniques to build Support Vector Machine (SVM)-based classification models from selected receptor subsequences described as n-grams. We show that this approach to classification is useful for finding class C GPCR subtype-specific motifs.
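    The n-gram representation mentioned above can be sketched as follows. The helper names and the toy sequences are hypothetical, and the feature selection and SVM stages of the paper are not reproduced here; the sketch only shows how amino acid sequences become count vectors over a shared n-gram vocabulary.

```python
from collections import Counter
from itertools import chain

def ngram_counts(sequence, n):
    """Count every overlapping length-n subsequence of an amino acid string."""
    return Counter(sequence[i:i + n] for i in range(len(sequence) - n + 1))

def build_feature_matrix(sequences, n):
    """Return the sorted n-gram vocabulary and one count vector per sequence."""
    counts = [ngram_counts(s, n) for s in sequences]
    vocab = sorted(set(chain.from_iterable(counts)))
    matrix = [[c[g] for g in vocab] for c in counts]
    return vocab, matrix
```

    Rows of the resulting matrix could be passed to any feature selection method and classifier; because the vocabulary grows quickly with n, reducing it to discriminating patterns becomes the central problem.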

  9. High-Fidelity Piezoelectric Audio Device

    Science.gov (United States)

    Woodward, Stanley E.; Fox, Robert L.; Bryant, Robert G.

    2003-01-01

    ModalMax is a very innovative means of harnessing the vibration of a piezoelectric actuator to produce an energy-efficient, low-profile device with high-bandwidth, high-fidelity audio response. The piezoelectric audio device outperforms many commercially available speakers made using speaker cones. The piezoelectric device weighs substantially less (4 g) than the speaker cones which use magnets (10 g). ModalMax devices have extreme fabrication simplicity. The entire audio device is fabricated by lamination. The simplicity of the design lends itself to lower cost. The piezoelectric audio device can be used without its acoustic chambers, resulting in a very low thickness of 0.023 in. (0.58 mm). The piezoelectric audio device can be completely encapsulated, which makes it very attractive for use in wet environments. Encapsulation does not significantly alter the audio response. Its small size (see Figure 1) is applicable to many consumer electronic products, such as pagers, portable radios, headphones, laptop computers, computer monitors, toys, and electronic games. The audio device can also be used in automobile or aircraft sound systems.

  10. Implementing Audio-CASI on Windows’ Platforms

    Science.gov (United States)

    Cooley, Philip C.; Turner, Charles F.

    2011-01-01

    Audio computer-assisted self interviewing (Audio-CASI) technologies have recently been shown to provide important and sometimes dramatic improvements in the quality of survey measurements. This is particularly true for measurements requiring respondents to divulge highly sensitive information such as their sexual, drug use, or other sensitive behaviors. However, DOS-based Audio-CASI systems that were designed and adopted in the early 1990s have important limitations. Most salient is the poor control they provide for manipulating the video presentation of survey questions. This article reports our experiences adapting Audio-CASI to Microsoft Windows 3.1 and Windows 95 platforms. Overall, our Windows-based system provided the desired control over video presentation and afforded other advantages including compatibility with a much wider array of audio devices than our DOS-based Audio-CASI technologies. These advantages came at the cost of increased system requirements, including the need for both more RAM and larger hard disks. While these costs will be an issue for organizations converting large inventories of PCs to Windows Audio-CASI today, this will not be a serious constraint for organizations and individuals with small inventories of machines to upgrade or those purchasing new machines today. PMID:22081743

  11. Improvements of ModalMax High-Fidelity Piezoelectric Audio Device

    Science.gov (United States)

    Woodard, Stanley E.

    2005-01-01

    ModalMax audio speakers have been enhanced by innovative means of tailoring the vibration response of thin piezoelectric plates to produce a high-fidelity audio response. The ModalMax audio speakers are 1 mm in thickness. The device completely supplants the need to have a separate driver and speaker cone. ModalMax speakers can serve the same applications as cone speakers, but unlike cone speakers, ModalMax speakers can function in harsh environments such as high humidity or extreme wetness. New design features allow the speakers to be completely submersed in salt water, making them well suited for maritime applications. The sound produced by the ModalMax audio speakers has a spatial resolution that is readily discernible to headset users.

  12. The Visualization and Analysis of POI Features under Network Space Supported by Kernel Density Estimation

    Directory of Open Access Journals (Sweden)

    YU Wenhao

    2015-01-01

    The distribution pattern and the distribution density of urban facility POIs are of great significance in the fields of infrastructure planning and urban spatial analysis. Kernel density estimation, which is commonly used to express these spatial characteristics, is superior to other density estimation methods (such as quadrat analysis and the Voronoi-based method) because it considers the regional impact based on the first law of geography. However, traditional kernel density estimation is mainly based on Euclidean space, ignoring the fact that the service function and interrelation of urban facilities are carried out over network path distance rather than conventional Euclidean distance. Hence, this research proposes a computational model of network kernel density estimation, and an extended form of the model for the case of added constraints. This work also discusses the impact of the distance attenuation threshold and the height extreme on the representation of kernel density. A large-scale experiment on actual data, analyzing the distribution patterns of different POIs (random type, sparse type, regional-intensive type, linear-intensive type), discusses the spatial distribution characteristics, influencing factors, and service functions of POI infrastructure in the city.
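    The difference between Euclidean and network distance can be made concrete with a minimal sketch. It assumes POIs sit exactly on graph nodes, uses Dijkstra's algorithm cut off at the bandwidth, and applies a quartic kernel with its one-dimensional constant; the abstract does not give the authors' kernel, normalization, or constraint handling, so all of these are illustrative choices.

```python
import heapq

def network_distances(graph, source, cutoff):
    """Shortest path lengths from source to every node within cutoff (Dijkstra)."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in graph[u]:
            nd = d + w
            if nd <= cutoff and nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

def quartic(u):
    """Quartic (biweight) kernel, zero at and beyond the bandwidth."""
    return 0.9375 * (1.0 - u * u) ** 2 if abs(u) < 1.0 else 0.0

def network_kde(graph, pois, bandwidth):
    """Sum kernel contributions of all POIs at each node, using path distance."""
    density = {node: 0.0 for node in graph}
    for poi in pois:
        for node, d in network_distances(graph, poi, bandwidth).items():
            density[node] += quartic(d / bandwidth) / bandwidth
    return density
```

    On a U-shaped street A-B-C-D with 100 m segments, a POI at A and a 250 m bandwidth contribute nothing at D, which is 300 m away along the network even if the two ends are close in a straight line.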

  13. Linear sign in cystic brain lesions ≥5 mm. A suggestive feature of perivascular space

    Energy Technology Data Exchange (ETDEWEB)

    Sung, Jinkyeong [The Catholic University of Korea, Department of Radiology, Seoul St. Mary' s Hospital, College of Medicine, Seoul (Korea, Republic of); The Catholic University of Korea, Department of Radiology, St. Vincent' s Hospital, College of Medicine, Seoul (Korea, Republic of); Jang, Jinhee; Choi, Hyun Seok; Jung, So-Lyung; Ahn, Kook-Jin; Kim, Bum-soo [The Catholic University of Korea, Department of Radiology, Seoul St. Mary' s Hospital, College of Medicine, Seoul (Korea, Republic of)

    2017-11-15

    To determine the prevalence of a linear sign within enlarged perivascular space (EPVS) and chronic lacunar infarction (CLI) ≥ 5 mm on T2-weighted imaging (T2WI) and time-of-flight (TOF) magnetic resonance angiography (MRA), and to evaluate the diagnostic value of the linear signs for EPVS over CLI. This study included 101 patients with cystic lesions ≥ 5 mm on brain MRI including TOF MRA. After classification of cystic lesions into EPVS or CLI, two readers assessed linear signs on T2WI and TOF MRA. We compared the prevalence and the diagnostic performance of linear signs. Among 46 EPVS and 51 CLI, 84 lesions (86.6%) were in basal ganglia. The prevalence of T2 and TOF linear signs was significantly higher in the EPVS than in the CLI (P <.001). For the diagnosis of EPVS, T2 and TOF linear signs showed high sensitivity (> 80%). TOF linear sign showed significantly higher specificity (100%) and accuracy (92.8% and 90.7%) than T2 linear sign (P <.001). T2 and TOF linear signs were more frequently observed in EPVS than CLI. They showed high sensitivity in differentiation of them, especially for basal ganglia. TOF sign showed higher specificity and accuracy than T2 sign. (orig.)

  14. Features of space-charge-limited emission in foil-less diodes

    Energy Technology Data Exchange (ETDEWEB)

    Wu, Ping; Yuan, Keliang; Liu, Guozhi [Department of Engineering Physics, Tsinghua University, Beijing 100084 (China); Science and Technology on High Power Microwave Laboratory, Northwest Institute of Nuclear Technology, Xi' an 710024 (China); Sun, Jun [Science and Technology on High Power Microwave Laboratory, Northwest Institute of Nuclear Technology, Xi' an 710024 (China)

    2014-12-15

    Space-charge-limited (SCL) current can always be obtained from the blade surface of annular cathodes in foil-less diodes, which are widely used in O-type relativistic high power microwave generators. However, there is little theoretical analysis of it owing to the mathematical complexity, and almost all formulas for the SCL current in foil-less diodes are based on numerical simulation results. This paper makes an initial attempt to calculate the SCL current from annular cathodes theoretically, under the ultra-relativistic assumption and the condition of an infinitely large guiding magnetic field. The numerical calculation based on the theoretical research is consistent to some extent with the particle-in-cell (PIC) simulation result under a diode voltage of 850 kV. Although the theoretical research gives a much larger current than the PIC simulation (41.3 kA for the former and 9.7 kA for the latter), a difference induced by the ultra-relativistic assumption in the theoretical research, both show the basic characteristic of emission from annular cathodes in foil-less diodes, i.e., emission enhancement at the cathode blade edges, especially at the outer edge. This characteristic is confirmed to some extent by our experimental photography of the cathode plasma under the same diode voltage and a guiding magnetic field of 4 T.

  15. Main circulator design features for HTR 100, HTR 500 and space heating plants

    International Nuclear Information System (INIS)

    Engel, J.; Glass, D.

    1988-01-01

    All design alternatives for modern high-temperature reactors share a common circulator concept: it is based on a vertical shaft design with a flying impeller. The circulators are equipped with active magnetic bearings and are driven by induction motors connected to variable-speed static converters. Owing to their multiple functions during normal reactor operation and under accident conditions, extremely high requirements are placed on safety-relevant circulators, since specified delivery heads and flow rates have to be ensured with the reactor pressurized as well as under depressurized conditions. The use of active magnetic bearings makes it possible to achieve maintenance-free operation and functional safety to an extent that had not been achieved before. Magnetic bearings are therefore provided for the total range, including primary gas circulators with a drive power of several MW as well as circulators for helium loops of reactor auxiliary systems. The essential feature for using active magnetic bearings is the retainer bearing technology, which prevents contact between the rotor and static circulator parts upon unintended de-energization of the magnets. Results of current experiments are reported. Another aspect to be considered for reliable long-term operation over several decades is the effect of rotor dynamics. The various natural frequencies resulting from torsion and bending modes, in view of a drive by a frequency-controlled induction motor, have to be considered, as well as the specific characteristics of the active magnetic bearings. Special attention has to be directed to the internal cooling loop so as to ensure that reactor temperature excursions in the event of deviation from normal operation can be overcome without damage. For circulator components exposed to temperature fields, the design characteristics are determined by combining experimental and analytical methods. The coordination of all component parts is currently being optimized on a prototype circulator whose detailed

  16. Video equipment of tele dosimetry and audio

    International Nuclear Information System (INIS)

    Ojeda R, M.A.; Padilla C, I.

    2007-01-01

    Working in a high-radiation area requires detailed knowledge of the work surroundings, effective communication and vision, and close dosimetric control. Where the task involves variable spaces and restricted access, noise that hinders communication, demanding operating conditions, radiation fields and decision making, tools are needed that give full control of the environment so that timely and effective decisions can be made at the place where the task is carried out. On this basic concept, a project was developed at the Laguna Verde plant to provide an interactive control mechanism for complex spaces: to see, to hear, to speak, to measure. The concept led to a system equipped with closed-circuit television (CCTV), wireless communication systems, wireless tele dosimetry systems, VHS and DVD recording equipment, and uninterruptible power units. The system requires one electric power socket and the installation of two cables per CCTV camera. It can be moved by one person and is put into operation in 5 minutes using a checklist. The concept was developed in the project named VETA-1 (Video Equipment of Tele dosimetry and Audio). The objective of this work is to present the development of the VETA-1 tool, which concluded with its first prototype in May of the present year. The VETA-1 project arose from the need to optimize dose; it is an ALARA tool with countless applications, as was proven in the 12th refueling outage of Unit 1. The VETA-1 project integrates a recording system whose primary purpose is to analyze, at the place where the task is carried out, the details needed for an effective and timely decision; the resulting information is also useful for personnel training and for planning future work. The VETA-1 system is an ALARA tool for quick-response control. (Author)

  17. Parametric time-frequency domain spatial audio

    CERN Document Server

    Delikaris-Manias, Symeon; Politis, Archontis

    2018-01-01

    This book provides readers with the principles and best practices in spatial audio signal processing. It describes how sound fields and their perceptual attributes are captured and analyzed within the time-frequency domain, how essential representation parameters are coded, and how such signals are efficiently reproduced for practical applications. The book is split into four parts starting with an overview of the fundamentals. It then goes on to explain the reproduction of spatial sound before offering an examination of signal-dependent spatial filtering. The book finishes with coverage of both current and future applications and the direction that spatial audio research is heading in. Parametric Time-frequency Domain Spatial Audio focuses on applications in entertainment audio, including music, home cinema, and gaming--covering the capturing and reproduction of spatial sound as well as its generation, transduction, representation, transmission, and perception. This book will teach readers the tools needed...

  18. Design of an audio advertisement dataset

    Science.gov (United States)

    Fu, Yutao; Liu, Jihong; Zhang, Qi; Geng, Yuting

    2015-12-01

    Since more and more advertisements swarm into radios, it is necessary to establish an audio advertising dataset which could be used to analyze and classify the advertisement. A method of how to establish a complete audio advertising dataset is presented in this paper. The dataset is divided into four different kinds of advertisements. Each advertisement's sample is given in *.wav file format, and annotated with a txt file which contains its file name, sampling frequency, channel number, broadcasting time and its class. The classifying rationality of the advertisements in this dataset is proved by clustering the different advertisements based on Principal Component Analysis (PCA). The experimental results show that this audio advertisement dataset offers a reliable set of samples for correlative audio advertisement experimental studies.

  19. Augmenting Environmental Interaction in Audio Feedback Systems

    Directory of Open Access Journals (Sweden)

    Seunghun Kim

    2016-04-01

    Audio feedback is defined as a positive feedback of acoustic signals where an audio input and output form a loop, and may be utilized artistically. This article presents new context-based controls over audio feedback, leading to the generation of desired sonic behaviors by enriching the influence of existing acoustic information such as room response and ambient noise. This ecological approach to audio feedback emphasizes mutual sonic interaction between signal processing and the acoustic environment. Mappings from analyses of the received signal to signal-processing parameters are designed to emphasize this specificity as an aesthetic goal. Our feedback system presents four types of mappings: approximate analyses of room reverberation to tempo-scale characteristics, ambient noise to amplitude and two different approximations of resonances to timbre. These mappings are validated computationally and evaluated experimentally in different acoustic conditions.

  20. CERN automatic audio-conference service

    CERN Multimedia

    Sierra Moral, R

    2009-01-01

    Scientists from all over the world need to collaborate with CERN on a daily basis. They must be able to communicate effectively on their joint projects at any time; as a result telephone conferences have become indispensable and widely used. Managed by 6 operators, CERN already has more than 20000 hours and 5700 audio-conferences per year. However, the traditional telephone based audio-conference system needed to be modernized in three ways. Firstly, to provide the participants with more autonomy in the organization of their conferences; secondly, to eliminate the constraints of manual intervention by operators; and thirdly, to integrate the audio-conferences into a collaborative working framework. The large number, and hence cost, of the conferences prohibited externalization and so the CERN telecommunications team drew up a specification to implement a new system. It was decided to use a new commercial collaborative audio-conference solution based on the SIP protocol. The system was tested as the first Euro...

  1. Virtual Microphones for Multichannel Audio Resynthesis

    Directory of Open Access Journals (Sweden)

    Athanasios Mouchtaris

    2003-09-01

    Multichannel audio offers significant advantages for music reproduction, including the ability to provide better localization and envelopment, as well as reduced imaging distortion. On the other hand, multichannel audio is a demanding media type in terms of transmission requirements. Often, bandwidth limitations prohibit transmission of multiple audio channels. In such cases, an alternative is to transmit only one or two reference channels and recreate the rest of the channels at the receiving end. Here, we propose a system capable of synthesizing the required signals from a smaller set of signals recorded in a particular venue. These synthesized “virtual” microphone signals can be used to produce multichannel recordings that accurately capture the acoustics of that venue. Applications of the proposed system include transmission of multichannel audio over the current Internet infrastructure and, as an extension of the methods proposed here, remastering existing monophonic and stereophonic recordings for multichannel rendering.

  2. CERN automatic audio-conference service

    CERN Document Server

    Sierra Moral, R

    2010-01-01

    Scientists from all over the world need to collaborate with CERN on a daily basis. They must be able to communicate effectively on their joint projects at any time; as a result telephone conferences have become indispensable and widely used. Managed by 6 operators, CERN already has more than 20000 hours and 5700 audio-conferences per year. However, the traditional telephone based audio-conference system needed to be modernized in three ways. Firstly, to provide the participants with more autonomy in the organization of their conferences; secondly, to eliminate the constraints of manual intervention by operators; and thirdly, to integrate the audio-conferences into a collaborative working framework. The large number, and hence cost, of the conferences prohibited externalization and so the CERN telecommunications team drew up a specification to implement a new system. It was decided to use a new commercial collaborative audio-conference solution based on the SIP protocol. The system was tested as the first Euro...

  3. Spatial audio reproduction with primary ambient extraction

    CERN Document Server

    He, JianJun

    2017-01-01

    This book first introduces the background of spatial audio reproduction, with different types of audio content and for different types of playback systems. A literature study on the classical and emerging Primary Ambient Extraction (PAE) techniques is presented. The emerging techniques aim to improve the extraction performance and also enhance the robustness of PAE approaches in dealing with more complex signals encountered in practice. The in-depth theoretical study helps readers to understand the rationales behind these approaches. Extensive objective and subjective experiments validate the feasibility of applying PAE in spatial audio reproduction systems. These experimental results, together with some representative audio examples and MATLAB codes of the key algorithms, illustrate clearly the differences among various approaches and also help readers gain insights on selecting different approaches for different applications.

  4. Audio production principles practical studio applications

    CERN Document Server

    Elmosnino, Stephane

    2018-01-01

    A new and fully practical guide to all of the key topics in audio production, this book covers the entire workflow from pre-production, to recording all kinds of instruments, to mixing theories and tools, and finally to mastering.

  5. The Effects of Audio-Visual Recorded and Audio Recorded Listening Tasks on the Accuracy of Iranian EFL Learners' Oral Production

    Science.gov (United States)

    Drood, Pooya; Asl, Hanieh Davatgari

    2016-01-01

    The ways in which tasks in classrooms have been developed and carried out have received great attention in the field of language teaching and learning, in the sense that they draw learners' attention to competing features such as accuracy, fluency, and complexity. English audiovisual and audio recorded materials have been widely used by teachers and…

  6. Huffman coding in advanced audio coding standard

    Science.gov (United States)

    Brzuchalski, Grzegorz

    2012-05-01

    This article presents several hardware architectures of Advanced Audio Coding (AAC) Huffman noiseless encoder, its optimisations and working implementation. Much attention has been paid to optimise the demand of hardware resources especially memory size. The aim of design was to get as short binary stream as possible in this standard. The Huffman encoder with whole audio-video system has been implemented in FPGA devices.
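
    The noiseless coding idea behind this record can be illustrated with a minimal Python sketch. It builds a generic Huffman code from symbol counts; it is not the hardware architecture described in the abstract, and it does not use the fixed codebooks of the AAC standard. The function names `huffman_code` and `encode` are hypothetical, chosen only for this illustration.

```python
import heapq
from collections import Counter


def huffman_code(symbols):
    """Build a prefix code (symbol -> bit string) from an iterable of symbols."""
    freq = Counter(symbols)
    if not freq:
        return {}
    if len(freq) == 1:
        # A single distinct symbol still needs one bit per occurrence.
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, unique tie-breaker, {symbol: code built so far}).
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        w1, _, c1 = heapq.heappop(heap)  # lightest subtree
        w2, _, c2 = heapq.heappop(heap)  # second lightest subtree
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]


def encode(symbols, code):
    """Concatenate the codewords of all symbols into one bit string."""
    return "".join(code[s] for s in symbols)
```

    Frequent symbols receive shorter codewords, which is what shortens the output bit stream; a real AAC encoder would instead select among the standard's predefined codebooks.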

  7. Audio Technology and Mobile Human Computer Interaction

    DEFF Research Database (Denmark)

    Chamberlain, Alan; Bødker, Mads; Hazzard, Adrian

    2017-01-01

    Audio-based mobile technology is opening up a range of new interactive possibilities. This paper brings some of those possibilities to light by offering a range of perspectives based in this area. It is not only the technical systems that are developing, but novel approaches to the design...... and understanding of audio-based mobile systems are evolving to offer new perspectives on interaction and design and support such systems to be applied in areas, such as the humanities....

  8. Virtual environment display for a 3D audio room simulation

    Science.gov (United States)

    Chapin, William L.; Foster, Scott

    1992-06-01

    Recent developments in virtual 3D audio and synthetic aural environments have produced a complex acoustical room simulation. The acoustical simulation models a room with walls, ceiling, and floor of selected sound reflecting/absorbing characteristics and unlimited independent localizable sound sources. This non-visual acoustic simulation, implemented with 4 audio Convolvotrons™ by Crystal River Engineering and coupled to the listener with a Polhemus Isotrak™, tracking the listener's head position and orientation, and stereo headphones returning binaural sound, is quite compelling to most listeners with eyes closed. This immersive effect should be reinforced when properly integrated into a full, multi-sensory virtual environment presentation. This paper discusses the design of an interactive, visual virtual environment, complementing the acoustic model and specified to: 1) allow the listener to freely move about the space, a room of manipulable size, shape, and audio character, while interactively relocating the sound sources; 2) reinforce the listener's feeling of telepresence into the acoustical environment with visual and proprioceptive sensations; 3) enhance the audio with the graphic and interactive components, rather than overwhelm or reduce it; and 4) serve as a research testbed and technology transfer demonstration. The hardware/software design of two demonstration systems, one installed and one portable, are discussed through the development of four iterative configurations. The installed system implements a head-coupled, wide-angle, stereo-optic tracker/viewer and multi-computer simulation control. The portable demonstration system implements a head-mounted wide-angle, stereo-optic display, separate head and pointer electro-magnetic position trackers, a heterogeneous parallel graphics processing system, and object oriented C++ program code.

  9. Audio-Visual Tibetan Speech Recognition Based on a Deep Dynamic Bayesian Network for Natural Human Robot Interaction

    Directory of Open Access Journals (Sweden)

    Yue Zhao

    2012-12-01

    Audio-visual speech recognition is a natural and robust approach to improving human-robot interaction in noisy environments. Although multi-stream Dynamic Bayesian Network and coupled HMM are widely used for audio-visual speech recognition, they fail to learn the shared features between modalities and ignore the dependency of features among the frames within each discrete state. In this paper, we propose a Deep Dynamic Bayesian Network (DDBN) to perform unsupervised extraction of spatial-temporal multimodal features from Tibetan audio-visual speech data and build an accurate audio-visual speech recognition model under a no frame-independency assumption. Experimental results on Tibetan speech data from some real-world environments showed that the proposed DDBN outperforms the state-of-the-art methods in word recognition accuracy.

  10. Audiovisual laughter detection based on temporal features

    NARCIS (Netherlands)

    Petridis, Stavros; Nijholt, Antinus; Nijholt, A.; Pantic, M.; Pantic, Maja; Poel, Mannes; Poel, M.; Hondorp, G.H.W.

    2008-01-01

    Previous research on automatic laughter detection has mainly been focused on audio-based detection. In this study we present an audiovisual approach to distinguishing laughter from speech based on temporal features and we show that the integration of audio and visual information leads to improved

  11. Mobile video-to-audio transducer and motion detection for sensory substitution

    Directory of Open Access Journals (Sweden)

    Maxime Ambard

    2015-10-01

    Visuo-auditory sensory substitution systems are augmented reality devices that translate a video stream into an audio stream in order to help the blind in daily tasks requiring visuo-spatial information. In this work, we present both a new mobile device and a transcoding method specifically designed to sonify moving objects. Frame differencing is used to extract spatial features from the video stream and two-dimensional spatial information is converted into audio cues using pitch, interaural time difference and interaural level difference. Using numerical methods, we attempt to reconstruct visuo-spatial information based on audio signals generated from various video stimuli. We show that despite a contrasted visual background and a highly lossy encoding method, the information in the audio signal is sufficient to allow object localization, object trajectory evaluation, object approach detection, and spatial separation of multiple objects. We also show that this type of audio signal can be interpreted by human users by asking ten subjects to discriminate trajectories based on generated audio signals.
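
    The two steps named in this abstract, frame differencing and conversion of 2D position into pitch and interaural cues, can be sketched in Python as follows. This is only an illustration under stated assumptions: the abstract does not say which image axis drives which cue, so the vertical-to-pitch and horizontal-to-interaural mapping, the threshold, the frequency range, and the maximum ITD/ILD values are all hypothetical, as are the function names.

```python
def frame_difference(prev, curr, threshold=30):
    """Return (row, col) of pixels whose intensity changed by more than threshold."""
    return [
        (r, c)
        for r, (row_prev, row_curr) in enumerate(zip(prev, curr))
        for c, (p, q) in enumerate(zip(row_prev, row_curr))
        if abs(q - p) > threshold
    ]


def to_audio_cues(active, height, width,
                  f_low=200.0, f_high=2000.0,
                  max_itd_s=0.0007, max_ild_db=10.0):
    """Map active pixels to pitch (vertical axis) and ITD/ILD (horizontal axis).

    All parameter values are assumed for illustration only.
    """
    cues = []
    for r, c in active:
        # v is 1.0 at the top row and 0.0 at the bottom row.
        v = 1.0 - r / (height - 1) if height > 1 else 0.5
        # Pitch is interpolated on a logarithmic frequency scale.
        pitch_hz = f_low * (f_high / f_low) ** v
        # pan is -1.0 at the left edge and +1.0 at the right edge.
        pan = 2.0 * c / (width - 1) - 1.0 if width > 1 else 0.0
        cues.append({
            "pitch_hz": pitch_hz,
            "itd_s": pan * max_itd_s,
            "ild_db": pan * max_ild_db,
        })
    return cues
```

    A static background produces no active pixels and therefore no sound, which is why differencing suits the sonification of moving objects.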

  12. Key features of human episodic recollection in the cross-episode retrieval of rat hippocampus representations of space.

    Directory of Open Access Journals (Sweden)

    Eduard Kelemen

    2013-07-01

    Neurophysiological studies focus on memory retrieval as a reproduction of what was experienced and have established that neural discharge is replayed to express memory. However, cognitive psychology has established that recollection is not a verbatim replay of stored information. Recollection is constructive, the product of memory retrieval cues, the information stored in memory, and the subject's state of mind. We discovered key features of constructive recollection embedded in the rat CA1 ensemble discharge during an active avoidance task. Rats learned two task variants, one with the arena stable, the other with it rotating; each variant defined a distinct behavioral episode. During the rotating episode, the ensemble discharge of CA1 principal neurons was dynamically organized to concurrently represent space in two distinct codes. The code for spatial reference frame switched rapidly between representing the rat's current location in either the stationary spatial frame of the room or the rotating frame of the arena. The code for task variant switched less frequently between a representation of the current rotating episode and the stable episode from the rat's past. The characteristics and interplay of these two hippocampal codes revealed three key properties of constructive recollection. (1) Although the ensemble representations of the stable and rotating episodes were distinct, ensemble discharge during rotation occasionally resembled the stable condition, demonstrating cross-episode retrieval of the representation of the remote, stable episode. (2) This cross-episode retrieval at the level of the code for task variant was more likely when the rotating arena was about to match its orientation in the stable episode. (3) The likelihood of cross-episode retrieval was influenced by preretrieval information that was signaled at the level of the code for spatial reference frame. Thus key features of episodic recollection manifest in rat hippocampal

  13. Using Audio-Derived Affective Offset to Enhance TV Recommendation

    DEFF Research Database (Denmark)

    Shepstone, Sven Ewan; Tan, Zheng-Hua; Jensen, Søren Holdt

    2014-01-01

    . First a user's mood profile is determined using 12-class audio-based emotion classifications. An initial TV content item is then displayed to the user based on the extracted mood profile. The user has the option to either accept the recommendation, or to critique the item once or several times......, by navigating the emotion space to request an alternative match. The final match is then compared to the initial match, in terms of the difference in the items' affective parameterization. This offset is then utilized in future recommendation sessions. The system was evaluated by eliciting three different...

  14. Safe, Affordable, Convenient: Environmental Features of Malls and Other Public Spaces Used by Older Adults for Walking.

    Science.gov (United States)

    King, Diane K; Allen, Peg; Jones, Dina L; Marquez, David X; Brown, David R; Rosenberg, Dori; Janicek, Sarah; Allen, Laila; Belza, Basia

    2016-03-01

    Midlife and older adults use shopping malls for walking, but little research has examined mall characteristics that contribute to their walkability. We used modified versions of the Centers for Disease Control and Prevention (CDC)-Healthy Aging Research Network (HAN) Environmental Audit and the System for Observing Play and Recreation in Communities (SOPARC) tool to systematically observe 443 walkers in 10 shopping malls. We also observed 87 walkers in 6 community-based nonmall/nongym venues where older adults routinely walked for physical activity. All venues had public transit stops and accessible parking. All malls and 67% of nonmalls had wayfinding aids, and most venues (81%) had an established circuitous walking route and clean, well-maintained public restrooms (94%). All venues had level floor surfaces, and one-half had benches along the walking route. Venues varied in hours of access, programming, tripping hazards, traffic control near entrances, and lighting. Despite diversity in location, size, and purpose, the mall and nonmall venues audited shared numerous environmental features known to promote walking in older adults and few barriers to walking. Future research should consider programmatic features and outreach strategies to expand the use of malls and other suitable public spaces for walking.

  15. Safe, Affordable, Convenient: Environmental Features of Malls and Other Public Spaces Used by Older Adults for Walking

    Science.gov (United States)

    King, Diane K.; Allen, Peg; Jones, Dina L.; Marquez, David X.; Brown, David R.; Rosenberg, Dori; Janicek, Sarah; Allen, Laila; Belza, Basia

    2016-01-01

    Background Midlife and older adults use shopping malls for walking, but little research has examined mall characteristics that contribute to their walkability. Methods We used modified versions of the Centers for Disease Control and Prevention (CDC)-Healthy Aging Research Network (HAN) Environmental Audit and the System for Observing Play and Recreation in Communities (SOPARC) tool to systematically observe 443 walkers in 10 shopping malls. We also observed 87 walkers in 6 community-based nonmall/nongym venues where older adults routinely walked for physical activity. Results All venues had public transit stops and accessible parking. All malls and 67% of nonmalls had wayfinding aids, and most venues (81%) had an established circuitous walking route and clean, well-maintained public restrooms (94%). All venues had level floor surfaces, and one-half had benches along the walking route. Venues varied in hours of access, programming, tripping hazards, traffic control near entrances, and lighting. Conclusions Despite diversity in location, size, and purpose, the mall and nonmall venues audited shared numerous environmental features known to promote walking in older adults and few barriers to walking. Future research should consider programmatic features and outreach strategies to expand the use of malls and other suitable public spaces for walking. PMID:26181907

  16. Lip-reading aids word recognition most in moderate noise: a Bayesian explanation using high-dimensional feature space.

    Science.gov (United States)

    Ma, Wei Ji; Zhou, Xiang; Ross, Lars A; Foxe, John J; Parra, Lucas C

    2009-01-01

    Watching a speaker's facial movements can dramatically enhance our ability to comprehend words, especially in noisy environments. From a general doctrine of combining information from different sensory modalities (the principle of inverse effectiveness), one would expect that the visual signals would be most effective at the highest levels of auditory noise. In contrast, we find, in accord with a recent paper, that visual information improves performance more at intermediate levels of auditory noise than at the highest levels, and we show that a novel visual stimulus containing only temporal information does the same. We present a Bayesian model of optimal cue integration that can explain these conflicts. In this model, words are regarded as points in a multidimensional space and word recognition is a probabilistic inference process. When the dimensionality of the feature space is low, the Bayesian model predicts inverse effectiveness; when the dimensionality is high, the enhancement is maximal at intermediate auditory noise levels. When the auditory and visual stimuli differ slightly in high noise, the model makes a counterintuitive prediction: as sound quality increases, the proportion of reported words corresponding to the visual stimulus should first increase and then decrease. We confirm this prediction in a behavioral experiment. We conclude that auditory-visual speech perception obeys the same notion of optimality previously observed only for simple multisensory stimuli.

  17. Lip-reading aids word recognition most in moderate noise: a Bayesian explanation using high-dimensional feature space.

    Directory of Open Access Journals (Sweden)

    Wei Ji Ma

    Watching a speaker's facial movements can dramatically enhance our ability to comprehend words, especially in noisy environments. From a general doctrine of combining information from different sensory modalities (the principle of inverse effectiveness), one would expect that the visual signals would be most effective at the highest levels of auditory noise. In contrast, we find, in accord with a recent paper, that visual information improves performance more at intermediate levels of auditory noise than at the highest levels, and we show that a novel visual stimulus containing only temporal information does the same. We present a Bayesian model of optimal cue integration that can explain these conflicts. In this model, words are regarded as points in a multidimensional space and word recognition is a probabilistic inference process. When the dimensionality of the feature space is low, the Bayesian model predicts inverse effectiveness; when the dimensionality is high, the enhancement is maximal at intermediate auditory noise levels. When the auditory and visual stimuli differ slightly in high noise, the model makes a counterintuitive prediction: as sound quality increases, the proportion of reported words corresponding to the visual stimulus should first increase and then decrease. We confirm this prediction in a behavioral experiment. We conclude that auditory-visual speech perception obeys the same notion of optimality previously observed only for simple multisensory stimuli.

  18. Could Audio-Described Films Benefit from Audio Introductions? An Audience Response Study

    Science.gov (United States)

    Romero-Fresco, Pablo; Fryer, Louise

    2013-01-01

    Introduction: Time constraints limit the quantity and type of information conveyed in audio description (AD) for films, in particular the cinematic aspects. Inspired by introductory notes for theatre AD, this study developed audio introductions (AIs) for "Slumdog Millionaire" and "Man on Wire." Each AI comprised 10 minutes of…

  19. Audio stream classification for multimedia database search

    Science.gov (United States)

    Artese, M.; Bianco, S.; Gagliardi, I.; Gasparini, F.

    2013-03-01

    Search and retrieval of huge archives of multimedia data is a challenging task. A classification step is often used to reduce the number of entries on which to perform the subsequent search. In particular, when new entries of the database are continuously added, a fast classification based on simple threshold evaluation is desirable. In this work we present a CART-based (Classification And Regression Tree [1]) classification framework for audio streams belonging to multimedia databases. The database considered is the Archive of Ethnography and Social History (AESS) [2], which is mainly composed of popular songs and other audio records describing the popular traditions handed down from generation to generation, such as traditional fairs and customs. The peculiarities of this database are that it is continuously updated; the audio recordings are acquired in unconstrained environments; and it is difficult for a non-expert human user to create the ground truth labels. In our experiments, half of all the available audio files have been randomly extracted and used as the training set. The remaining ones have been used as the test set. The classifier has been trained to distinguish among three different classes: speech, music, and song. All the audio files in the dataset have been previously manually labeled by domain experts into the three classes defined above.
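
    The "simple threshold evaluation" at the heart of a classification tree can be sketched in Python as a single split chosen by Gini impurity. This is a generic one-node illustration, not the framework or the features used in the paper; the feature values, labels, and function names below are hypothetical.

```python
def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))


def best_threshold(values, labels):
    """Return (threshold, weighted impurity) of the best split value <= threshold."""
    pairs = sorted(zip(values, labels))
    best_score, best_t = float("inf"), None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold fits between two equal values
        t = (pairs[i - 1][0] + pairs[i][0]) / 2  # midpoint candidate
        left = [lab for v, lab in pairs if v <= t]
        right = [lab for v, lab in pairs if v > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        if score < best_score:
            best_score, best_t = score, t
    return best_t, best_score
```

    A full tree repeats this search recursively on each side of the split and over every feature; at prediction time each node costs one comparison, which is why such classifiers are fast on continuously growing archives.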

  20. Audio Description as a Pedagogical Tool

    Directory of Open Access Journals (Sweden)

    Georgina Kleege

    2015-05-01

    Audio description is the process of translating visual information into words for people who are blind or have low vision. Typically such description has focused on films, museum exhibitions, images and video on the internet, and live theater. Because it allows people with visual impairments to experience a variety of cultural and educational texts that would otherwise be inaccessible, audio description is a mandated aspect of disability inclusion, although it remains markedly underdeveloped and underutilized in our classrooms and in society in general. Along with increasing awareness of disability, audio description pushes students to practice close reading of visual material, deepen their analysis, and engage in critical discussions around the methodology, standards and values, language, and role of interpretation in a variety of academic disciplines. We outline a few pedagogical interventions that can be customized to different contexts to develop students' writing and critical thinking skills through guided description of visual material.

  1. Near-field Localization of Audio

    DEFF Research Database (Denmark)

    Jensen, Jesper Rindom; Christensen, Mads Græsbøll

    2014-01-01

    Localization of audio sources using microphone arrays has been an important research problem for more than two decades. Many traditional methods for solving the problem are based on a two-stage procedure: first, information about the audio source, such as time differences-of-arrival (TDOAs......) and gain ratios-of-arrival (GROAs) between microphones is estimated, and, second, this knowledge is used to localize the audio source. These methods often have a low computational complexity, but this comes at the cost of a limited estimation accuracy. Therefore, we propose a new localization approach......, where the desired signal is modeled using TDOAs and GROAs, which are determined by the source location. This facilitates the derivation of one-stage, maximum likelihood methods under a white Gaussian noise assumption that is applicable in both near- and far-field scenarios. Simulations show...
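
    The dependence of TDOAs and GROAs on source location, mentioned in this abstract, can be sketched in Python for one microphone pair. The sketch assumes free-field propagation with 1/distance amplitude decay and a speed of sound of 343 m/s; it is not the maximum likelihood estimator proposed in the paper, and the function name is hypothetical.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at room temperature


def tdoa_groa(source, mic_ref, mic):
    """Return (TDOA in seconds, gain ratio) of `mic` relative to `mic_ref`.

    Positions are coordinate tuples in metres. The gain ratio assumes
    spherical spreading, i.e. amplitude proportional to 1/distance.
    """
    d_ref = math.dist(source, mic_ref)
    d = math.dist(source, mic)
    tdoa = (d - d_ref) / SPEED_OF_SOUND  # positive if `mic` hears the source later
    groa = d_ref / d                     # below 1 if `mic` is farther away
    return tdoa, groa
```

    In the far field the gain ratio tends to 1 and only the time difference carries information, whereas close to the array both quantities vary with source position.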

  2. Frequency Hopping Method for Audio Watermarking

    Directory of Open Access Journals (Sweden)

    A. Anastasijević

    2012-11-01

    This paper evaluates the degradation of audio content for a perceptible removable watermark. Two different approaches to embedding the watermark in the spectral domain were investigated. The frequencies for watermark embedding are chosen according to a pseudorandom sequence making the methods robust. Consequentially, the lower quality audio can be used for promotional purposes. For a fee, the watermark can be removed with a secret watermarking key. Objective and subjective testing was conducted in order to measure degradation level for the watermarked music samples and to examine residual distortion for different parameters of the watermarking algorithm and different music genres.
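
    The key-driven choice of embedding frequencies can be sketched in Python as below. This is a toy illustration, not either of the two embedding approaches evaluated in the paper: the gain-scaling distortion, the frame layout (lists of spectral magnitudes), and all function names are assumptions, and Python's `random` module is not a cryptographically secure generator.

```python
import random


def hop_sequence(key, n_frames, n_bins, bins_per_frame=2):
    """Derive, from the secret key, which spectral bins are marked in each frame."""
    rng = random.Random(key)  # same key -> same pseudorandom hop pattern
    return [sorted(rng.sample(range(n_bins), bins_per_frame))
            for _ in range(n_frames)]


def embed(frames, key, gain=2.0):
    """Scale the key-selected bins of every frame (audible, reversible change)."""
    hops = hop_sequence(key, len(frames), len(frames[0]))
    return [[x * gain if i in bins else x for i, x in enumerate(frame)]
            for frame, bins in zip(frames, hops)]


def remove_watermark(frames, key, gain=2.0):
    """Undo the scaling; only works with the key used for embedding."""
    hops = hop_sequence(key, len(frames), len(frames[0]))
    return [[x / gain if i in bins else x for i, x in enumerate(frame)]
            for frame, bins in zip(frames, hops)]
```

    Without the key, the marked bins change from frame to frame in a pattern that is hard to predict, which is the sense in which frequency hopping makes removal difficult.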

  3. Audio Networking in the Music Industry

    Directory of Open Access Journals (Sweden)

    Glebs Kuzmics

    2018-01-01

    Full Text Available This paper surveys the rôle of computer networking technologies in the music industry. It presents a comparison of the relevant technologies and their defining advantages and disadvantages, an analysis of the market for network-enabled audio products, and a discussion of different devices. The idea of replacing a proprietary solution with open-source and freeware software programs has been chosen as the fundamental concept of this research. The technologies covered include: native IEEE AVnu Alliance Audio Video Bridging (AVB), CobraNet®, Audinate Dante™ and Harman BLU Link.

  4. Nonlinear dynamic macromodeling techniques for audio systems

    Science.gov (United States)

    Ogrodzki, Jan; Bieńkowski, Piotr

    2015-09-01

    This paper develops a modelling method and a model identification technique for nonlinear dynamic audio systems. Identification is performed by means of a behavioral approach based on a polynomial approximation. This approach makes use of the Discrete Fourier Transform and the Harmonic Balance Method. A model of an audio system is first created and identified, and then it is simulated in real time using an algorithm of low computational complexity. The algorithm consists of real-time emulation of the system response rather than simulation of the system itself. The proposed software is written in Python using object-oriented programming techniques. The code is optimized for a multithreaded environment.

  5. Personalized Audio Systems - a Bayesian Approach

    DEFF Research Database (Denmark)

    Nielsen, Jens Brehm; Jensen, Bjørn Sand; Hansen, Toke Jansen

    2013-01-01

    Modern audio systems are typically equipped with several user-adjustable parameters unfamiliar to most users listening to the system. To obtain the best possible setting, the user is forced into multi-parameter optimization with respect to the user's own objective and preference. To address this, the present paper presents a general interactive framework for personalization of such audio systems. The framework builds on Bayesian Gaussian process regression in which a model of the user's objective function is updated sequentially. The parameter setting to be evaluated in a given trial is selected...

  6. Interactive Football-Training Based on Rebounders with Hit Position Sensing and Audio-Visual Feedback

    DEFF Research Database (Denmark)

    Jensen, Mads Møller; Grønbæk, Kaj; Thomassen, Nikolaj

    2014-01-01

    . However, most of these tools are created with a single goal, either to measure or train, and are often used and tested in very controlled settings. In this paper, we present an interactive football-training platform, called Football Lab, featuring sensor-mounted rebounders as well as audio-visual...

  7. Demonstration of pattern transfer into sub-100 nm polysilicon line/space features patterned with extreme ultraviolet lithography

    International Nuclear Information System (INIS)

    Cardinale, G. F.; Henderson, C. C.; Goldsmith, J. E. M.; Mangat, P. J. S.; Cobb, J.; Hector, S. D.

    1999-01-01

    In two separate experiments, we have successfully demonstrated the transfer of dense- and loose-pitch line/space (L/S) photoresist features, patterned with extreme ultraviolet (EUV) lithography, into an underlying hard mask material. In both experiments, a deep-UV photoresist (∼90 nm thick) was spin cast in bilayer format onto a hard mask (50-90 nm thick) and was subsequently exposed to EUV radiation using a 10x reduction EUV exposure system. The EUV reticle was fabricated at Motorola (Tempe, AZ) using a subtractive process with Ta-based absorbers on Mo/Si multilayer mask blanks. In the first set of experiments, following the EUV exposures, the L/S patterns were transferred first into a SiO2 hard mask (60 nm thick) using a reactive ion etch (RIE), and then into polysilicon (350 nm thick) using a triode-coupled plasma RIE etcher at the University of California, Berkeley, microfabrication facilities. The latter etch process, which produced steep (>85°) sidewalls, employed a HBr/Cl chemistry with a large (>10:1) etch selectivity of polysilicon to silicon dioxide. In the second set of experiments, hard mask films of SiON (50 nm thick) and SiO2 (87 nm thick) were used. A RIE was performed at Motorola using a halogen gas chemistry that resulted in a hard mask-to-photoresist etch selectivity >3:1 and sidewall profile angles ≥85°. Line edge roughness (LER) and linewidth critical dimension (CD) measurements were performed using Sandia's GORA(c) CD digital image analysis software. Low LER values (6-9 nm, 3σ, one side) and good CD linearity (better than 10%) were demonstrated for the final pattern-transferred dense polysilicon L/S features from 80 to 175 nm. In addition, pattern transfer (into polysilicon) of loose-pitch (1:2) L/S features with CDs≥60 nm was demonstrated. © 1999 American Vacuum Society

  8. Motif distributions in phase-space networks for characterizing experimental two-phase flow patterns with chaotic features.

    Science.gov (United States)

    Gao, Zhong-Ke; Jin, Ning-De; Wang, Wen-Xu; Lai, Ying-Cheng

    2010-07-01

    The dynamics of two-phase flows have been a challenging problem in nonlinear dynamics and fluid mechanics. We propose a method to characterize and distinguish patterns from inclined water-oil flow experiments based on the concept of network motifs that have found great usage in network science and systems biology. In particular, we construct from measured time series phase-space complex networks and then calculate the distribution of a set of distinct network motifs. To gain insight, we first test the approach using time series from classical chaotic systems and find a universal feature: motif distributions from different chaotic systems are generally highly heterogeneous. Our main finding is that the distributions from experimental two-phase flows tend to be heterogeneous as well, suggesting the underlying chaotic nature of the flow patterns. Calculation of the maximal Lyapunov exponent provides further support for this. Motif distributions can thus be a feasible tool to understand the dynamics of realistic two-phase flow patterns.

  9. Object-Based Change Detection in Urban Areas: The Effects of Segmentation Strategy, Scale, and Feature Space on Unsupervised Methods

    Directory of Open Access Journals (Sweden)

    Lei Ma

    2016-09-01

    Full Text Available Object-based change detection (OBCD) has recently been receiving increasing attention as a result of rapid improvements in the resolution of remote sensing data. However, some OBCD issues relating to the segmentation of high-resolution images remain to be explored. For example, segmentation units derived using different segmentation strategies, segmentation scales, feature space, and change detection methods have rarely been assessed. In this study, we have tested four common unsupervised change detection methods using different segmentation strategies and a series of segmentation scale parameters on two WorldView-2 images of urban areas. We have also evaluated the effect of adding extra textural and Normalized Difference Vegetation Index (NDVI) information instead of using only spectral information. Our results indicated that change detection methods performed better at a medium scale than at a fine scale close to the pixel size. Multivariate Alteration Detection (MAD) always outperformed the other methods tested, at the same confidence level. The overall accuracy appeared to benefit from using a two-date segmentation strategy rather than single-date segmentation. Adding textural and NDVI information appeared to reduce detection accuracy, but the magnitude of this reduction was not consistent across the different unsupervised methods and segmentation strategies. We conclude that a two-date segmentation strategy is useful for change detection in high-resolution imagery, but that the optimization of thresholds is critical for unsupervised change detection methods. Advanced methods need to be explored that can take advantage of additional textural or other parameters.
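
    The NDVI feature mentioned in this record is conventionally defined per pixel as (NIR - Red) / (NIR + Red). The sketch below illustrates only that definition and a simple two-date difference; the function names, the reflectance values and the idea of differencing NDVI between dates are illustrative assumptions, not the procedure used in the study.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index for one pixel.

    nir, red: reflectance values of the near-infrared and red bands.
    Returns 0.0 when both bands are zero to avoid division by zero
    (a convention assumed here, not taken from the study).
    """
    denom = nir + red
    if denom == 0:
        return 0.0
    return (nir - red) / denom


def ndvi_change(date1, date2):
    """Absolute NDVI difference per pixel between two acquisition dates.

    date1, date2: lists of (nir, red) tuples for the same pixels.
    This differencing step is a hypothetical illustration of how NDVI
    could enter a change-detection feature space.
    """
    return [abs(ndvi(n2, r2) - ndvi(n1, r1))
            for (n1, r1), (n2, r2) in zip(date1, date2)]


# Made-up reflectances: a vegetated pixel that becomes bare, and a stable pixel.
before = [(0.50, 0.10), (0.30, 0.30)]
after = [(0.20, 0.20), (0.30, 0.30)]
changes = ndvi_change(before, after)
```

    NDVI lies between -1 and 1, so a pixel whose value drops from about 0.67 to 0 between dates stands out from a stable pixel whose difference is 0.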

  10. Audio wiring guide how to wire the most popular audio and video connectors

    CERN Document Server

    Hechtman, John

    2012-01-01

    Whether you're a pro or an amateur, a musician or into multimedia, you can't afford to guess about audio wiring. The Audio Wiring Guide is a comprehensive, easy-to-use guide that explains exactly what you need to know. No matter the size of your wiring project or installation, this handy tool provides you with the essential information you need and the techniques to use it. Using The Audio Wiring Guide is like having an expert at your side. By following the clear, step-by-step directions, you can do professional-level work at a fraction of the cost.

  11. Music information retrieval in compressed audio files: a survey

    Science.gov (United States)

    Zampoglou, Markos; Malamos, Athanasios G.

    2014-07-01

    In this paper, we present an organized survey of the existing literature on music information retrieval systems in which descriptor features are extracted directly from the compressed audio files, without prior decompression to pulse-code modulation format. Avoiding the decompression step and utilizing the readily available compressed-domain information can significantly lighten the computational cost of a music information retrieval system, allowing application to large-scale music databases. We identify a number of systems relying on compressed-domain information and form a systematic classification of the features they extract, the retrieval tasks they tackle and the degree in which they achieve an actual increase in the overall speed-as well as any resulting loss in accuracy. Finally, we discuss recent developments in the field, and the potential research directions they open toward ultra-fast, scalable systems.

  12. A High-Voltage Class D Audio Amplifier for Dielectric Elastomer Transducers

    DEFF Research Database (Denmark)

    Nielsen, Dennis; Knott, Arnold; Andersen, Michael A. E.

    2014-01-01

    Dielectric Elastomer (DE) transducers have emerged as a very interesting alternative to the traditional electrodynamic transducer. Lightweight, small size and high maneuverability are some of the key features of the DE transducer. An amplifier for the DE transducer suitable for audio applications is proposed and analyzed. The amplifier addresses the issue of a high impedance load, ensuring a linear response over the midrange region of the audio bandwidth (100 Hz – 3.5 kHz). THD+N below 0.1% are reported for the ± 300 V prototype amplifier producing a maximum of 125 Var at a peak efficiency of 95 %.

  13. Audio Haptic Videogaming for Developing Wayfinding Skills in Learners Who are Blind.

    Science.gov (United States)

    Sánchez, Jaime; de Borba Campos, Marcia; Espinoza, Matías; Merabet, Lotfi B

    2014-01-01

    Interactive digital technologies are currently being developed as a novel tool for education and skill development. Audiopolis is an audio and haptic based videogame designed for developing orientation and mobility (O&M) skills in people who are blind. We have evaluated the cognitive impact of videogame play on O&M skills by assessing performance on a series of behavioral tasks carried out in both indoor and outdoor virtual spaces. Our results demonstrate that the use of Audiopolis had a positive impact on the development and use of O&M skills in school-aged learners who are blind. The impact of audio and haptic information on learning is also discussed.

  14. CERN automatic audio-conference service

    International Nuclear Information System (INIS)

    Sierra Moral, Rodrigo

    2010-01-01

    Scientists from all over the world need to collaborate with CERN on a daily basis. They must be able to communicate effectively on their joint projects at any time; as a result telephone conferences have become indispensable and widely used. Managed by 6 operators, CERN already has more than 20000 hours and 5700 audio-conferences per year. However, the traditional telephone based audio-conference system needed to be modernized in three ways. Firstly, to provide the participants with more autonomy in the organization of their conferences; secondly, to eliminate the constraints of manual intervention by operators; and thirdly, to integrate the audio-conferences into a collaborative working framework. The large number, and hence cost, of the conferences prohibited externalization and so the CERN telecommunications team drew up a specification to implement a new system. It was decided to use a new commercial collaborative audio-conference solution based on the SIP protocol. The system was tested as the first European pilot and several improvements (such as billing, security, redundancy...) were implemented based on CERN's recommendations. The new automatic conference system has been operational since the second half of 2006. It is very popular for the users and has doubled the number of conferences in the past two years.

  15. CERN automatic audio-conference service

    Energy Technology Data Exchange (ETDEWEB)

    Sierra Moral, Rodrigo, E-mail: Rodrigo.Sierra@cern.c [CERN, IT Department 1211 Geneva-23 (Switzerland)

    2010-04-01

    Scientists from all over the world need to collaborate with CERN on a daily basis. They must be able to communicate effectively on their joint projects at any time; as a result telephone conferences have become indispensable and widely used. Managed by 6 operators, CERN already has more than 20000 hours and 5700 audio-conferences per year. However, the traditional telephone based audio-conference system needed to be modernized in three ways. Firstly, to provide the participants with more autonomy in the organization of their conferences; secondly, to eliminate the constraints of manual intervention by operators; and thirdly, to integrate the audio-conferences into a collaborative working framework. The large number, and hence cost, of the conferences prohibited externalization and so the CERN telecommunications team drew up a specification to implement a new system. It was decided to use a new commercial collaborative audio-conference solution based on the SIP protocol. The system was tested as the first European pilot and several improvements (such as billing, security, redundancy...) were implemented based on CERN's recommendations. The new automatic conference system has been operational since the second half of 2006. It is very popular for the users and has doubled the number of conferences in the past two years.

  16. CERN automatic audio-conference service

    Science.gov (United States)

    Sierra Moral, Rodrigo

    2010-04-01

    Scientists from all over the world need to collaborate with CERN on a daily basis. They must be able to communicate effectively on their joint projects at any time; as a result telephone conferences have become indispensable and widely used. Managed by 6 operators, CERN already has more than 20000 hours and 5700 audio-conferences per year. However, the traditional telephone based audio-conference system needed to be modernized in three ways. Firstly, to provide the participants with more autonomy in the organization of their conferences; secondly, to eliminate the constraints of manual intervention by operators; and thirdly, to integrate the audio-conferences into a collaborative working framework. The large number, and hence cost, of the conferences prohibited externalization and so the CERN telecommunications team drew up a specification to implement a new system. It was decided to use a new commercial collaborative audio-conference solution based on the SIP protocol. The system was tested as the first European pilot and several improvements (such as billing, security, redundancy...) were implemented based on CERN's recommendations. The new automatic conference system has been operational since the second half of 2006. It is very popular for the users and has doubled the number of conferences in the past two years.

  17. Audio Journal in an ELT Context

    Directory of Open Access Journals (Sweden)

    Neşe Aysin Siyli

    2012-09-01

    Full Text Available It is widely acknowledged that one of the most serious problems students of English as a foreign language face is their deprivation of practicing the language outside the classroom. Generally, the classroom is the sole environment where they can practice English, which by its nature does not provide rich setting to help students develop their competence by putting the language into practice. Motivated by this need, this descriptive study investigated the impact of audio dialog journals on students’ speaking skills. It also aimed to gain insights into students’ and teacher’s opinions on keeping audio dialog journals outside the class. The data of the study developed from student and teacher audio dialog journals, student written feedbacks, interviews held with the students, and teacher observations. The descriptive analysis of the data revealed that audio dialog journals served a number of functions ranging from cognitive to linguistic, from pedagogical to psychological, and social. The findings and pedagogical implications of the study are discussed in detail.

  18. Spatial audio quality perception (part 2)

    DEFF Research Database (Denmark)

    Conetta, R.; Brookes, T.; Rumsey, F.

    2015-01-01

    location, envelopment, coverage angle, ensemble width, and spaciousness. They can also impact timbre, and changes to timbre can then influence spatial perception. Previously obtained data was used to build a regression model of perceived spatial audio quality in terms of spatial and timbral metrics...

  19. Study of audio speakers containing ferrofluid

    Energy Technology Data Exchange (ETDEWEB)

    Rosensweig, R E [34 Gloucester Road, Summit, NJ 07901 (United States); Hirota, Y; Tsuda, S [Ferrotec, 1-4-14 Kyobashi, chuo-Ku, Tokyo 104-0031 (Japan); Raj, K [Ferrotec, 33 Constitution Drive, Bedford, NH 03110 (United States)

    2008-05-21

    This work validates a method for increasing the radial restoring force on the voice coil in audio speakers containing ferrofluid. In addition, a study is made of factors influencing splash loss of the ferrofluid due to shock. Ferrohydrodynamic analysis is employed throughout to model behavior, and predictions are compared to experimental data.

  20. An ESL Audio-Script Writing Workshop

    Science.gov (United States)

    Miller, Carla

    2012-01-01

    The roles of dialogue, collaborative writing, and authentic communication have been explored as effective strategies in second language writing classrooms. In this article, the stages of an innovative, multi-skill writing method, which embeds students' personal voices into the writing process, are explored. A 10-step ESL Audio Script Writing Model…

  1. Audible Aliasing Distortion in Digital Audio Synthesis

    Directory of Open Access Journals (Sweden)

    J. Schimmel

    2012-04-01

    Full Text Available This paper deals with aliasing distortion in digital audio signal synthesis of classic periodic waveforms with infinite Fourier series, for electronic musical instruments. When these waveforms are generated in the digital domain, aliasing appears because of their unlimited bandwidth. Several techniques for the synthesis of these signals have been designed to avoid or reduce the aliasing distortion. However, these techniques have high computing demands. One can say that today's computers have enough computing power to use these methods. However, we have to realize that today's computer-aided music production requires tens of multi-timbre voices generated simultaneously by software synthesizers, and most of the computing power must be reserved for the hard-disc recording subsystem and real-time audio processing of many audio channels with a lot of audio effects. Trivially generated classic analog synthesizer waveforms are therefore still effective for sound synthesis. We cannot avoid the aliasing distortion, but spectral components produced by the aliasing can be masked with harmonic components and thus made inaudible if a sufficient oversampling ratio is used. This paper deals with the assessment of audible aliasing distortion with the help of a psychoacoustic model of simultaneous masking and compares the computing demands of trivial generation using oversampling with those of other methods.
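
    To illustrate why trivially generated waveforms alias, the sketch below folds the harmonics of an ideal waveform back into the band below the Nyquist frequency, using the standard relation that a component at frequency f sampled at rate fs appears at |f - fs·round(f/fs)|. The function names and the example frequencies are illustrative assumptions; the psychoacoustic masking model from the paper is not reproduced here.

```python
def folded_frequency(f, fs):
    """Frequency at which a component at f (Hz) appears after sampling at fs (Hz)."""
    return abs(f - fs * round(f / fs))


def aliased_harmonics(f0, fs, max_harmonic):
    """List (harmonic number, folded frequency) for harmonics above Nyquist.

    f0: fundamental frequency of the trivially generated waveform.
    fs: sampling rate.
    max_harmonic: highest harmonic considered (the ideal series is infinite,
    so this cut-off is an arbitrary choice for the illustration).
    """
    nyquist = fs / 2
    return [(k, folded_frequency(k * f0, fs))
            for k in range(1, max_harmonic + 1)
            if k * f0 > nyquist]


# Harmonics of a 5 kHz waveform at 44.1 kHz and at 4x oversampling.
base_rate = aliased_harmonics(5000, 44100, 6)
oversampled = aliased_harmonics(5000, 4 * 44100, 6)
```

    At 44.1 kHz the fifth and sixth harmonics (25 kHz and 30 kHz) fold to 19.1 kHz and 14.1 kHz, inside the audible band. At four times that rate the same harmonics lie below the new Nyquist frequency, so none of the first six harmonics alias.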

  2. All About Audio Equalization: Solutions and Frontiers

    Directory of Open Access Journals (Sweden)

    Vesa Välimäki

    2016-05-01

    Full Text Available Audio equalization is a vast and active research area. The extent of research means that one often cannot identify the preferred technique for a particular problem. This review paper bridges those gaps, systemically providing a deep understanding of the problems and approaches in audio equalization, their relative merits and applications. Digital signal processing techniques for modifying the spectral balance in audio signals and applications of these techniques are reviewed, ranging from classic equalizers to emerging designs based on new advances in signal processing and machine learning. Emphasis is placed on putting the range of approaches within a common mathematical and conceptual framework. The application areas discussed herein are diverse, and include well-defined, solvable problems of filter design subject to constraints, as well as newly emerging challenges that touch on problems in semantics, perception and human computer interaction. Case studies are given in order to illustrate key concepts and how they are applied in practice. We also recommend preferred signal processing approaches for important audio equalization problems. Finally, we discuss current challenges and the uncharted frontiers in this field. The source code for methods discussed in this paper is made available at https://code.soundsoftware.ac.uk/projects/allaboutaudioeq.

  3. Feature-space assessment of electrical impedance tomography coregistered with computed tomography in detecting multiple contrast targets

    International Nuclear Information System (INIS)

    Krishnan, Kalpagam; Liu, Jeff; Kohli, Kirpal

    2014-01-01

    Purpose: Fusion of electrical impedance tomography (EIT) with computed tomography (CT) can be useful as a clinical tool for providing additional physiological information about tissues, but requires suitable fusion algorithms and validation procedures. This work explores the feasibility of fusing EIT and CT images using an algorithm for coregistration. The imaging performance is validated through feature space assessment on phantom contrast targets. Methods: EIT data were acquired by scanning a phantom using a circuit, configured for injecting current through 16 electrodes, placed around the phantom. A conductivity image of the phantom was obtained from the data using electrical impedance and diffuse optical tomography reconstruction software (EIDORS). A CT image of the phantom was also acquired. The EIT and CT images were fused using a region of interest (ROI) coregistration fusion algorithm. Phantom imaging experiments were carried out on objects of different contrasts, sizes, and positions. The conductive medium of the phantoms was made of a tissue-mimicking bolus material that is routinely used in clinical radiation therapy settings. To validate the imaging performance in detecting different contrasts, the ROI of the phantom was filled with distilled water and normal saline. Spatially separated cylindrical objects of different sizes were used for validating the imaging performance in multiple target detection. Analyses of the CT, EIT and the EIT/CT phantom images were carried out based on the variations of contrast, correlation, energy, and homogeneity, using a gray level co-occurrence matrix (GLCM). A reference image of the phantom was simulated using EIDORS, and the performances of the CT and EIT imaging systems were evaluated and compared against the performance of the EIT/CT system using various feature metrics, detectability, and structural similarity index measures. Results: In detecting distilled and normal saline water in bolus medium, EIT as a stand

  4. A Perceptually Reweighted Mixed-Norm Method for Sparse Approximation of Audio Signals

    DEFF Research Database (Denmark)

    Christensen, Mads Græsbøll; Sturm, Bob L.

    2011-01-01

    In this paper, we consider the problem of finding sparse representations of audio signals for coding purposes. In doing so, it is of utmost importance that when only a subset of the present components of an audio signal are extracted, it is the perceptually most important ones. To this end, we propose a new iterative algorithm based on two principles: 1) a reweighted l1-norm based measure of sparsity; and 2) a reweighted l2-norm based measure of perceptual distortion. Using these measures, the considered problem is posed as a constrained convex optimization problem that can be solved optimally using standard software. A prominent feature of the new method is that it solves a problem that is closely related to the objective of coding, namely rate-distortion optimization. In computer simulations, we demonstrate the properties of the algorithm and its application to real audio signals.

  5. Hierarchical structure for audio-video based semantic classification of sports video sequences

    Science.gov (United States)

    Kolekar, M. H.; Sengupta, S.

    2005-07-01

    A hierarchical structure for sports event classification based on audio and video content analysis is proposed in this paper. Compared to the event classifications in other games, those of cricket are very challenging and yet unexplored. We have successfully solved the cricket video classification problem using a six-level hierarchical structure. The first level performs event detection based on audio energy and Zero Crossing Rate (ZCR) of the short-time audio signal. In the subsequent levels, we classify the events based on video features using a Hidden Markov Model implemented through Dynamic Programming (HMM-DP) using color or motion as a likelihood function. For some of the game-specific decisions, a rule-based classification is also performed. Our proposed hierarchical structure can easily be applied to other sports. Our results are very promising and we have moved a step forward towards addressing semantic classification problems in general.
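
    The first-level audio features named in this record, short-time energy and zero crossing rate, can be sketched as follows. The frame length, hop size, thresholds and the decision rule (high energy together with low ZCR) are hypothetical placeholders chosen for illustration; the abstract does not state the values or the rule used by the authors.

```python
import math


def frame_signal(samples, frame_len, hop):
    """Split samples into overlapping frames; a trailing partial frame is dropped."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]


def short_time_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def detect_events(samples, frame_len=400, hop=200,
                  energy_thresh=0.1, zcr_thresh=0.3):
    """Return indices of frames flagged as candidate events.

    A frame is flagged when its energy is high and its ZCR is low.
    Both thresholds are assumed values, not taken from the paper.
    """
    flagged = []
    for idx, frame in enumerate(frame_signal(samples, frame_len, hop)):
        if (short_time_energy(frame) > energy_thresh
                and zero_crossing_rate(frame) < zcr_thresh):
            flagged.append(idx)
    return flagged


# Synthetic test signal: 800 samples of silence, then 800 samples of a 200 Hz tone at 8 kHz.
silence = [0.0] * 800
tone = [math.sin(2 * math.pi * 200 * n / 8000) for n in range(800)]
events = detect_events(silence + tone)
```

    On this synthetic signal only the frames that overlap the tone are flagged, which is the behaviour an energy-based first stage is meant to provide before any video features are examined.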

  6. Extracting meaning from audio signals - a machine learning approach

    DEFF Research Database (Denmark)

    Larsen, Jan

    2007-01-01

    * Machine learning framework for sound search * Genre classification * Music and audio separation * Wind noise suppression

  7. Audio segmentation of broadcast news in the Albayzin-2010 evaluation: overview, results, and discussion

    Directory of Open Access Journals (Sweden)

    Butko Taras

    2011-01-01

    Full Text Available Recently, audio segmentation has attracted research interest because of its usefulness in several applications like audio indexing and retrieval, subtitling, monitoring of acoustic scenes, etc. Moreover, a previous audio segmentation stage may be useful to improve the robustness of speech technologies like automatic speech recognition and speaker diarization. In this article, we present the evaluation of broadcast news audio segmentation systems carried out in the context of the Albayzín-2010 evaluation campaign. That evaluation consisted of segmenting audio from the 3/24 Catalan TV channel into five acoustic classes: music, speech, speech over music, speech over noise, and other. The evaluation results showed the difficulty of this segmentation task. In this article, after presenting the database and metric, as well as the feature extraction methods and segmentation techniques used by the submitted systems, the experimental results are analyzed and compared, with the aim of gaining insight into the proposed solutions and identifying promising directions.

  8. Consequence of audio visual collection in school libraries

    OpenAIRE

    Kuri, Ramesh

    2016-01-01

    Audio-visual collections in libraries play an important role in teaching and learning. The importance of audio-visual (AV) technology in education should not be underestimated. If the audio-visual collection in a library is carefully planned and designed, it can provide a rich learning environment. In this article, the author discusses the consequences of audio-visual collections in libraries, especially for students using school libraries.

  9. Anthropomorphic Coding of Speech and Audio: A Model Inversion Approach

    Directory of Open Access Journals (Sweden)

    W. Bastiaan Kleijn

    2005-06-01

    Full Text Available Auditory modeling is a well-established methodology that provides insight into human perception and that facilitates the extraction of signal features that are most relevant to the listener. The aim of this paper is to provide a tutorial on perceptual speech and audio coding using an invertible auditory model. In this approach, the audio signal is converted into an auditory representation using an invertible auditory model. The auditory representation is quantized and coded. Upon decoding, it is then transformed back into the acoustic domain. This transformation converts a complex distortion criterion into a simple one, thus facilitating quantization with low complexity. We briefly review past work on auditory models and describe in more detail the components of our invertible model and its inversion procedure, that is, the method to reconstruct the signal from the output of the auditory model. We summarize attempts to use the auditory representation for low-bit-rate coding. Our approach also allows the exploitation of the inherent redundancy of the human auditory system for the purpose of multiple description (joint source-channel) coding.

  10. 47 CFR 10.520 - Common audio attention signal.

    Science.gov (United States)

    2010-10-01

    ... 47 Telecommunication 1 2010-10-01 2010-10-01 false Common audio attention signal. 10.520 Section... Equipment Requirements § 10.520 Common audio attention signal. A Participating CMS Provider and equipment manufacturers may only market devices for public use under part 10 that include an audio attention signal that...

  11. Debugging of Class-D Audio Power Amplifiers

    DEFF Research Database (Denmark)

    Crone, Lasse; Pedersen, Jeppe Arnsdorf; Mønster, Jakob Døllner

    2012-01-01

    Determining and optimizing the performance of a Class-D audio power amplifier can be very difficult without knowledge of the use of audio performance measuring equipment and of how the various noise and distortion sources influence the audio performance. This paper gives an introduction on how to measure...

  12. Fusion of audio and visual cues for laughter detection

    NARCIS (Netherlands)

    Petridis, Stavros; Pantic, Maja

    Past research on automatic laughter detection has focused mainly on audio-based detection. Here we present an audio-visual approach to distinguishing laughter from speech and we show that integrating the information from audio and video channels leads to improved performance over single-modal

  13. A Novel Method Using Abstract Convex Underestimation in Ab-Initio Protein Structure Prediction for Guiding Search in Conformational Feature Space.

    Science.gov (United States)

    Hao, Xiao-Hu; Zhang, Gui-Jun; Zhou, Xiao-Gen; Yu, Xu-Feng

    2016-01-01

    To address the searching problem of protein conformational space in ab-initio protein structure prediction, a novel method using abstract convex underestimation (ACUE) based on the framework of evolutionary algorithm was proposed. Computing such conformations, essential to associate structural and functional information with gene sequences, is challenging due to the high-dimensionality and rugged energy surface of the protein conformational space. As a consequence, the dimension of protein conformational space should be reduced to a proper level. In this paper, the high-dimensionality original conformational space was converted into feature space whose dimension is considerably reduced by feature extraction technique. And, the underestimate space could be constructed according to abstract convex theory. Thus, the entropy effect caused by searching in the high-dimensionality conformational space could be avoided through such conversion. The tight lower bound estimate information was obtained to guide the searching direction, and the invalid searching area in which the global optimal solution is not located could be eliminated in advance. Moreover, instead of expensively calculating the energy of conformations in the original conformational space, the estimate value is employed to judge if the conformation is worth exploring to reduce the evaluation time, thereby making computational cost lower and the searching process more efficient. Additionally, fragment assembly and the Monte Carlo method are combined to generate a series of metastable conformations by sampling in the conformational space. The proposed method provides a novel technique to solve the searching problem of protein conformational space. Twenty small-to-medium structurally diverse proteins were tested, and the proposed ACUE method was compared with It Fix, HEA, Rosetta and the developed method LEDE without underestimate information. Test results show that the ACUE method can more rapidly and more

  14. Comparative evaluation of audio and audio - tactile methods to improve oral hygiene status of visually impaired school children

    OpenAIRE

    R Krishnakumar; Swarna Swathi Silla; Sugumaran K Durai; Mohan Govindarajan; Syed Shaheed Ahamed; Logeshwari Mathivanan

    2016-01-01

    Background: Visually impaired children are unable to maintain good oral hygiene, as their tactile abilities are often underdeveloped owing to their visual disturbances. Conventional brushing techniques are often poorly comprehended by these children and hence, it was decided to evaluate the effectiveness of audio and audio-tactile methods in improving the oral hygiene of these children. Objective: To evaluate and compare the effectiveness of audio and audio-tactile methods in improving oral h...

  15. Semantic Indexing of Multimedia Content Using Visual, Audio, and Text Cues

    Directory of Open Access Journals (Sweden)

    W. H. Adams

    2003-02-01

    Full Text Available We present a learning-based approach to the semantic indexing of multimedia content using cues derived from audio, visual, and text features. We approach the problem by developing a set of statistical models for a predefined lexicon. Novel concepts are then mapped in terms of the concepts in the lexicon. To achieve robust detection of concepts, we exploit features from multiple modalities, namely, audio, video, and text. Concept representations are modeled using Gaussian mixture models (GMM), hidden Markov models (HMM), and support vector machines (SVM). Models such as Bayesian networks and SVMs are used in a late-fusion approach to model concepts that are not explicitly modeled in terms of features. Our experiments indicate promise in the proposed classification and fusion methodologies: our proposed fusion scheme achieves more than 10% relative improvement over the best unimodal concept detector.

  16. New musical organology : the audio-games

    OpenAIRE

    Zénouda , Hervé

    2012-01-01

    International audience; This article aims to shed light on a new and emerging creative field: "Audio Games," a crossroad between video games and computer music. Today, a plethora of tiny applications, which propose entertaining audiovisual experiences with a preponderant sound dimension, are available for game consoles, computers, and mobile phones. These experiences represent a new universe where the gameplay of video games is applied to musical composition, hence creating new links betwee...

  17. Audio Networking in the Music Industry

    OpenAIRE

    Glebs Kuzmics; Maaruf Ali

    2018-01-01

    This paper surveys the rôle of computer networking technologies in the music industry. A comparison of their relevant technologies, their defining advantages and disadvantages; analyses and discussion of the situation in the market of network enabled audio products followed by a discussion of different devices are presented. The idea of replacing a proprietary solution with open-source and freeware software programs has been chosen as the fundamental concept of this research. The technologies...

  18. Digitisation of the CERN Audio Archives

    CERN Multimedia

    Maximilien Brice

    2006-01-01

    From the creation of CERN in 1954 until the mid-1980s, the audiovisual service recorded hundreds of hours of moments of life at CERN on audio tapes. These moments range from inaugurations of new facilities to VIP speeches and general interest cultural seminars. The preservation process started in June 2005. In these pictures, we see Waltraud Hug working on an open-reel tape.

  19. Features of motivation of the crewmembers in an enclosed space at atmospheric pressure changes during breathing inert gases.

    Science.gov (United States)

    Komarevcev, Sergey

    Since the 1960s, our psychologists have been experimenting with small groups in isolation. This work was prompted by the beginning of spaceflight and the need to study human behavior in conditions that differ from the natural habitat of man. Those who study human behavior in isolation know that it differs markedly from behavior in natural situations. This is associated with the development of new, more adaptive behaviors (1). What are the differences? First of all, isolation is achieved by placing the group in a closed space. Experiments show that basic personality traits of the crew members, such as motivation, change. Statement of the problem and methods. In our experiments we were interested in how the features of human motivation (strength, stability and direction of motivation) change in a closed group under modified atmospheric pressure while breathing inert gases. We were also interested in the external and internal motivation of the individual in these circumstances. To conduct the experiments, we used the experimental barocomplex GVK-250, which housed a group of six men. The task was to spend fifteen days in isolation in the barocomplex breathing an oxygen-xenon mixture, fifteen days in isolation in the same complex breathing an oxygen-helium mixture, and fifteen days in isolation in the same complex breathing normal air. Throughout, the subjects were isolated under conditions of changing atmospheric pressure, close to those that divers normally deal with. We assumed that breathing inert mixtures can change the strength and stability, and with them the direction, of motivation. To check our results, we planned to use a battery of psychological techniques: the Schwartz technique, which measures personal values and behavior in society, the DORS procedure (measurement of fatigue, monotony, satiety and stress) and riffs that give the test once a week. Our assumption is

  20. Automatic Detection and Classification of Audio Events for Road Surveillance Applications

    Directory of Open Access Journals (Sweden)

    Noor Almaadeed

    2018-06-01

    Full Text Available This work investigates the problem of detecting hazardous events on roads by designing an audio surveillance system that automatically detects perilous situations such as car crashes and tire skidding. In recent years, several visual surveillance systems have been proposed for road monitoring to detect accidents, with the aim of improving safety procedures in emergency cases. However, visual information alone cannot detect certain events such as car crashes and tire skidding, especially under adverse and visually cluttered weather conditions such as snowfall, rain, and fog. Consequently, the incorporation of microphones and audio event detectors based on audio processing can significantly enhance the detection accuracy of such surveillance systems. This paper proposes to combine time-domain, frequency-domain, and joint time-frequency features extracted from a class of quadratic time-frequency distributions (QTFDs) to detect events on roads through audio analysis and processing. Experiments were carried out using a publicly available dataset. The experimental results confirm the effectiveness of the proposed approach for detecting hazardous events on roads, as demonstrated by a 7% improvement in accuracy rate when compared against methods that use individual temporal and spectral features.
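
    As a rough illustration of combining time-domain and frequency-domain descriptors into one feature vector, the sketch below computes three simple stand-in features for a single audio frame. It does not implement the quadratic time-frequency distributions used in the record above; the function names, the choice of features, and the 8 kHz test tone are all assumptions made for illustration.

```python
import cmath
import math


def zero_crossing_rate(frame):
    """Time-domain feature: fraction of adjacent sample pairs that change sign."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def rms_energy(frame):
    """Time-domain feature: root-mean-square amplitude of the frame."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))


def spectral_centroid(frame, sample_rate):
    """Frequency-domain feature: magnitude-weighted mean frequency (naive DFT)."""
    n = len(frame)
    num = den = 0.0
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(frame))
        mag = abs(coeff)
        num += (k * sample_rate / n) * mag
        den += mag
    return num / den if den > 0 else 0.0


def feature_vector(frame, sample_rate):
    """Concatenate the temporal and spectral features (hypothetical layout)."""
    return [zero_crossing_rate(frame),
            rms_energy(frame),
            spectral_centroid(frame, sample_rate)]


# Usage: a 1 kHz tone sampled at 8 kHz (phase offset avoids exact zero samples).
tone = [math.sin(math.pi * n / 4 + 0.3) for n in range(256)]
features = feature_vector(tone, 8000)
```

    A classifier would then be trained on such vectors; the naive DFT is kept only to make the sketch dependency-free and would normally be replaced by an FFT.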

  1. ANALYSIS OF MULTIMODAL FUSION TECHNIQUES FOR AUDIO-VISUAL SPEECH RECOGNITION

    Directory of Open Access Journals (Sweden)

    D.V. Ivanko

    2016-05-01

    Full Text Available The paper presents an analytical review covering the latest achievements in the field of audio-visual (AV) fusion (integration) of multimodal information. We discuss the main challenges and report on approaches to address them. One of the most important tasks of AV integration is to understand how the modalities interact and influence each other. The paper addresses this problem in the context of AV speech processing and speech recognition. In the first part of the review we set out the basic principles of AV speech recognition and give a classification of audio and visual features of speech. Special attention is paid to the systematization of the existing techniques and the AV data fusion methods. In the second part we provide a consolidated list of tasks and applications that use AV fusion, based on our analysis of the research area. We also indicate the methods, techniques, and audio and video features used. We propose a classification of AV integration, and discuss the advantages and disadvantages of different approaches. We draw conclusions and offer our assessment of the future in the field of AV fusion. In further research we plan to implement a system of audio-visual Russian continuous speech recognition using advanced methods of multimodal fusion.

  2. Detection Of Alterations In Audio Files Using Spectrograph Analysis

    Directory of Open Access Journals (Sweden)

    Anandha Krishnan G

    2015-08-01

    Full Text Available This study was carried out to detect changes in audio files using a spectrograph. An audio file format is a file format for storing digital audio data on a computer system. A sound spectrograph is a laboratory instrument that displays a graphical representation of the strengths of the various component frequencies of a sound as time passes. The objectives of the study were to find the changes in the spectrographs of audio files after altering them, to compare the altered files with the spectrographs of the original files, and to check for similarities and differences between MP3 and WAV. Five different alterations were carried out on each audio file to analyze the differences between the original and the altered file. To alter an MP3 or WAV audio file by cut/copy, the file was opened in Audacity. A different audio clip was then pasted into the audio file. This new file was analyzed to view the differences. By adjusting the necessary parameters the noise was reduced. The differences between the new file and the original file were analyzed. By adjusting the parameters from the dialog box the necessary changes were made. The edited audio file was opened in the software named spek, which after analysis produces a graph of that particular file that is saved for further analysis. The original audio graph was combined with the edited audio file graph to see the alterations.

  3. A Smart Audio on Demand Application on Android Systems

    Directory of Open Access Journals (Sweden)

    Ing-Jr Ding

    2015-05-01

    Full Text Available This paper describes a study of the realization of intelligent Audio on Demand (AOD) processing in the embedded system environment. This study describes the development of innovative Android software that will enhance user experience of the increasingly popular number of smart mobile devices now available on the market. The application we developed can accumulate records of the songs that are played and automatically analyze the favorite song types of a user. The application can also select sound control playback functions to make operation more convenient. A large number of different types of music genre were collected to create a sound database and build an intelligent AOD processing mechanism. Formant analysis was used to extract voice features and the K-means clustering method and acoustic modeling technology of the Gaussian mixture model (GMM) were used to study and develop the application mechanism. The processes we developed run smoothly in the embedded Android platform.

  4. Stream/Bounce Event Perception Reveals a Temporal Limit of Motion Correspondence Based on Surface Feature over Space and Time

    Directory of Open Access Journals (Sweden)

    Yousuke Kawachi

    2011-06-01

    Full Text Available We examined how stream/bounce event perception is affected by motion correspondence based on the surface features of moving objects passing behind an occlusion. In the stream/bounce display two identical objects moving across each other in a two-dimensional display can be perceived as either streaming through or bouncing off each other at coincidence. Here, surface features such as colour (Experiments 1 and 2) or luminance (Experiment 3) were switched between the two objects at coincidence. The moment of coincidence was invisible to observers due to an occluder. Additionally, the presentation of the moving objects was manipulated in duration after the feature switch at coincidence. The results revealed that a postcoincidence duration of approximately 200 ms was required for the visual system to stabilize judgments of stream/bounce events by determining motion correspondence between the objects across the occlusion on the basis of the surface feature. The critical duration was similar across motion speeds of objects and types of surface features. Moreover, controls (Experiments 4a–4c) showed that cognitive bias based on feature (colour/luminance) congruency across the occlusion could not fully account for the effects of surface features on the stream/bounce judgments. We discuss the roles of motion correspondence, visual feature processing, and attentive tracking in the stream/bounce judgments.

  5. Audio-vestibular signs and symptoms in Chiari malformation type I. Case series and literature review.

    Science.gov (United States)

    Guerra Jiménez, Gloria; Mazón Gutiérrez, Ángel; Marco de Lucas, Enrique; Valle San Román, Natalia; Martín Laez, Rubén; Morales Angulo, Carmelo

    2015-01-01

    Chiari malformation is an alteration of the base of the skull with herniation through the foramen magnum of the brain stem and cerebellum. Although the most common presentation is occipital headache, the association of audio-vestibular symptoms is not rare. The aim of our study was to describe audio-vestibular signs and symptoms in Chiari malformation type I (CM-I). We performed a retrospective observational study of patients referred to our unit during the last 5 years. We also carried out a literature review of audio-vestibular signs and symptoms in this disease. There were 9 patients (2 males and 7 females), with an average age of 42.8 years. Five patients presented a Ménière-like syndrome; 2 cases, a recurrent vertigo with peripheral features; one patient showed a sudden hearing loss; and one case suffered a sensorineural hearing loss with early childhood onset. The most common audio-vestibular symptom indicated in the literature in patients with CM-I is unsteadiness (49%), followed by dizziness (18%), nystagmus (15%) and hearing loss (15%). Nystagmus is frequently horizontal (74%) or down-beating (18%). Other audio-vestibular signs and symptoms are tinnitus (11%), aural fullness (10%) and hyperacusis (1%). Occipital headache that increases with Valsalva manoeuvres and hand paresthesias are very suggestive symptoms. The appearance of audio-vestibular manifestations in CM-I makes it common to refer these patients to neurotologists. Unsteadiness, vertiginous syndromes and sensorineural hearing loss are frequent. Nystagmus, especially horizontal and down-beating, is not rare. It is important for neurotologists to familiarise themselves with CM-I symptoms to be able to consider it in differential diagnosis. Copyright © 2014 Elsevier España, S.L.U. y Sociedad Española de Otorrinolaringología y Patología Cérvico-Facial. All rights reserved.

  6. Audio-visual onset differences are used to determine syllable identity for ambiguous audio-visual stimulus pairs.

    Science.gov (United States)

    Ten Oever, Sanne; Sack, Alexander T; Wheat, Katherine L; Bien, Nina; van Atteveldt, Nienke

    2013-01-01

    Content and temporal cues have been shown to interact during audio-visual (AV) speech identification. Typically, the most reliable unimodal cue is used more strongly to identify specific speech features; however, visual cues are only used if the AV stimuli are presented within a certain temporal window of integration (TWI). This suggests that temporal cues denote whether unimodal stimuli belong together, that is, whether they should be integrated. It is not known whether temporal cues also provide information about the identity of a syllable. Since spoken syllables have naturally varying AV onset asynchronies, we hypothesize that for suboptimal AV cues presented within the TWI, information about the natural AV onset differences can aid in speech identification. To test this, we presented low-intensity auditory syllables concurrently with visual speech signals, and varied the stimulus onset asynchronies (SOA) of the AV pair, while participants were instructed to identify the auditory syllables. We revealed that specific speech features (e.g., voicing) were identified by relying primarily on one modality (e.g., auditory). Additionally, we showed a wide window in which visual information influenced auditory perception, that seemed even wider for congruent stimulus pairs. Finally, we found a specific response pattern across the SOA range for syllables that were not reliably identified by the unimodal cues, which we explained as the result of the use of natural onset differences between AV speech signals. This indicates that temporal cues not only provide information about the temporal integration of AV stimuli, but additionally convey information about the identity of AV pairs. These results provide a detailed behavioral basis for further neuro-imaging and stimulation studies to unravel the neurofunctional mechanisms of the audio-visual-temporal interplay within speech perception.

  7. Elicitation of attributes for the evaluation of audio-on-audio interference

    DEFF Research Database (Denmark)

    Francombe, Jon; Mason, R.; Dewhirst, M.

    2014-01-01

    An experiment to determine the perceptual attributes of the experience of listening to a target audio program in the presence of an audio interferer was performed. The first stage was a free elicitation task in which a total of 572 phrases were produced. In the second stage, a consensus vocabulary procedure was used to reduce these phrases into a comprehensive set of attributes. Groups of experienced and inexperienced listeners determined nine and eight attributes, respectively. These attribute sets were combined by the listeners to produce a final set of 12 attributes: masking, calming, distraction...

  8. AudioPairBank: Towards A Large-Scale Tag-Pair-Based Audio Content Analysis

    OpenAIRE

    Sager, Sebastian; Elizalde, Benjamin; Borth, Damian; Schulze, Christian; Raj, Bhiksha; Lane, Ian

    2016-01-01

    Recently, sound recognition has been used to identify sounds, such as car and river. However, sounds have nuances that may be better described by adjective-noun pairs such as slow car, and verb-noun pairs such as flying insects, which are under explored. Therefore, in this work we investigate the relation between audio content and both adjective-noun pairs and verb-noun pairs. Due to the lack of datasets with these kinds of annotations, we collected and processed the AudioPairBank corpus cons...

  9. [Intermodal timing cues for audio-visual speech recognition].

    Science.gov (United States)

    Hashimoto, Masahiro; Kumashiro, Masaharu

    2004-06-01

    The purpose of this study was to investigate the limitations of lip-reading advantages for Japanese young adults by desynchronizing visual and auditory information in speech. In the experiment, audio-visual speech stimuli were presented under the six test conditions: audio-alone, and audio-visually with either 0, 60, 120, 240 or 480 ms of audio delay. The stimuli were the video recordings of a face of a female Japanese speaking long and short Japanese sentences. The intelligibility of the audio-visual stimuli was measured as a function of audio delays in sixteen untrained young subjects. Speech intelligibility under the audio-delay condition of less than 120 ms was significantly better than that under the audio-alone condition. On the other hand, the delay of 120 ms corresponded to the mean mora duration measured for the audio stimuli. The results implied that audio delays of up to 120 ms would not disrupt lip-reading advantage, because visual and auditory information in speech seemed to be integrated on a syllabic time scale. Potential applications of this research include noisy workplace in which a worker must extract relevant speech from all the other competing noises.

  10. Predicting the Overall Spatial Quality of Automotive Audio Systems

    Science.gov (United States)

    Koya, Daisuke

    The spatial quality of automotive audio systems is often compromised due to their unideal listening environments. Automotive audio systems need to be developed quickly due to industry demands. A suitable perceptual model could evaluate the spatial quality of automotive audio systems with similar reliability to formal listening tests but take less time. Such a model is developed in this research project by adapting an existing model of spatial quality for automotive audio use. The requirements for the adaptation were investigated in a literature review. A perceptual model called QESTRAL was reviewed, which predicts the overall spatial quality of domestic multichannel audio systems. It was determined that automotive audio systems are likely to be impaired in terms of the spatial attributes that were not considered in developing the QESTRAL model, but metrics are available that might predict these attributes. To establish whether the QESTRAL model in its current form can accurately predict the overall spatial quality of automotive audio systems, MUSHRA listening tests using headphone auralisation with head tracking were conducted to collect results to be compared against predictions by the model. Based on guideline criteria, the model in its current form could not accurately predict the overall spatial quality of automotive audio systems. To improve prediction performance, the QESTRAL model was recalibrated and modified using existing metrics of the model, those that were proposed from the literature review, and newly developed metrics. The most important metrics for predicting the overall spatial quality of automotive audio systems included those that were interaural cross-correlation (IACC) based, relate to localisation of the frontal audio scene, and account for the perceived scene width in front of the listener. Modifying the model for automotive audio systems did not invalidate its use for domestic audio systems. The resulting model predicts the overall spatial

  11. Audio frequency in vivo optical coherence elastography

    Science.gov (United States)

    Adie, Steven G.; Kennedy, Brendan F.; Armstrong, Julian J.; Alexandrov, Sergey A.; Sampson, David D.

    2009-05-01

    We present a new approach to optical coherence elastography (OCE), which probes the local elastic properties of tissue by using optical coherence tomography to measure the effect of an applied stimulus in the audio frequency range. We describe the approach, based on analysis of the Bessel frequency spectrum of the interferometric signal detected from scatterers undergoing periodic motion in response to an applied stimulus. We present quantitative results of sub-micron excitation at 820 Hz in a layered phantom and the first such measurements in human skin in vivo.
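
    For orientation, the appearance of a Bessel spectrum can be motivated by a standard identity, under the assumption (not stated in the record) that the scatterer motion modulates the interferometric phase sinusoidally, $\phi(t) = \phi_0 + \beta \sin(\omega t)$. The symbols $\phi_0$, $\beta$ and $\omega$ are introduced here for illustration only. The Jacobi-Anger expansion then gives:

```latex
\begin{align}
  e^{i\beta\sin(\omega t)} &= \sum_{n=-\infty}^{\infty} J_n(\beta)\, e^{i n \omega t}, \\
  \cos\bigl(\phi_0 + \beta\sin(\omega t)\bigr)
    &= \sum_{n=-\infty}^{\infty} J_n(\beta)\, \cos\bigl(\phi_0 + n\omega t\bigr).
\end{align}
```

    Under this assumption the detected signal contains harmonics of the excitation frequency whose amplitudes are set by the Bessel functions $J_n(\beta)$, so comparing harmonic amplitudes constrains the modulation depth $\beta$ and hence the vibration amplitude.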

  12. Audio frequency in vivo optical coherence elastography

    International Nuclear Information System (INIS)

    Adie, Steven G; Kennedy, Brendan F; Armstrong, Julian J; Alexandrov, Sergey A; Sampson, David D

    2009-01-01

    We present a new approach to optical coherence elastography (OCE), which probes the local elastic properties of tissue by using optical coherence tomography to measure the effect of an applied stimulus in the audio frequency range. We describe the approach, based on analysis of the Bessel frequency spectrum of the interferometric signal detected from scatterers undergoing periodic motion in response to an applied stimulus. We present quantitative results of sub-micron excitation at 820 Hz in a layered phantom and the first such measurements in human skin in vivo.

  13. Predistortion of a Bidirectional Cuk Audio Amplifier

    DEFF Research Database (Denmark)

    Birch, Thomas Hagen; Nielsen, Dennis; Knott, Arnold

    2014-01-01

    Some non-linear amplifier topologies are capable of providing a larger voltage gain than one from a DC source, which could make them suitable for various applications. However, the non-linearities introduce a significant amount of harmonic distortion (THD). Some of this distortion could be reduced using predistortion. This paper suggests linearizing a nonlinear bidirectional Cuk audio amplifier using an analog predistortion approach. A prototype power stage was built and results show that a voltage gain of up to 9 dB and reduction in THD from 6% down to 3% was obtainable using this approach....

  14. Mixing audio concepts, practices and tools

    CERN Document Server

    Izhaki, Roey

    2013-01-01

    Your mix can make or break a record, and mixing is an essential catalyst for a record deal. Professional engineers with exceptional mixing skills can earn vast amounts of money and find that they are in demand by the biggest acts. To develop such skills, you need to master both the art and science of mixing. The new edition of this bestselling book offers all you need to know and put into practice in order to improve your mixes. Covering the entire process --from fundamental concepts to advanced techniques -- and offering a multitude of audio samples, tips and tricks, this boo

  15. Calibration of an audio frequency noise generator

    DEFF Research Database (Denmark)

    Diamond, Joseph M.

    1966-01-01

    ... it is used for measurement purposes. The spectral density of a noise source may be found by measuring its rms output over a known noise bandwidth. Such a bandwidth may be provided by a passive filter using accurately known elements. For example, the parallel resonant circuit with purely parallel damping has a noise bandwidth Bn = π/2 × (3 dB bandwidth). To apply this method to low audio frequencies, the noise bandwidth of the low Q parallel resonant circuit has been found, including the effects of both series and parallel damping. The method has been used to calibrate a General Radio 1390-B noise generator...
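
    The relation Bn = π/2 × (3 dB bandwidth) quoted in this record can be checked numerically for the purely parallel-damped case. The sketch below integrates the normalized power response of a parallel RLC circuit and compares the result with the closed form; the component values and function names are illustrative assumptions, and the series-damping case treated in the paper is not modelled.

```python
import math


def power_response(f, R, L, C):
    """|Z(f)/R|^2 for a parallel RLC circuit with purely parallel damping."""
    w = 2.0 * math.pi * f
    susceptance = w * C - 1.0 / (w * L)
    return 1.0 / (1.0 + (R * susceptance) ** 2)


def noise_bandwidth(R, L, C, f_lo=1e-3, f_hi=1e8, n=200_000):
    """Integrate the power response over frequency (trapezoid rule in ln f)."""
    u_lo, u_hi = math.log(f_lo), math.log(f_hi)
    du = (u_hi - u_lo) / n
    total = 0.0
    for i in range(n + 1):
        f = math.exp(u_lo + i * du)
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * f * power_response(f, R, L, C)  # df = f du
    return total * du


# Hypothetical low-Q example: R = 1 kOhm, L = 0.1 H, C = 1 uF (Q is about 3.2).
R, L, C = 1e3, 0.1, 1e-6
b_3db = 1.0 / (2.0 * math.pi * R * C)   # half-power bandwidth in Hz
b_noise = noise_bandwidth(R, L, C)      # numerical noise bandwidth in Hz
ratio = b_noise / b_3db                 # should be close to pi / 2
```

    For these values both the numerical integral and π/2 × (3 dB bandwidth) come out close to 1/(4RC) = 250 Hz.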

  16. Non Audio-Video gesture recognition system

    DEFF Research Database (Denmark)

    Craciunescu, Razvan; Mihovska, Albena Dimitrova; Kyriazakos, Sofoklis

    2016-01-01

    Gesture recognition is a topic in computer science and language technology with the goal of interpreting human gestures via mathematical algorithms. Gestures can originate from any bodily motion or state but commonly originate from the face or hand. Current research focuses include emotion recognition from the face and hand gesture recognition. Gesture recognition enables humans to communicate with the machine and interact naturally without any mechanical devices. This paper investigates the possibility to use non-audio/video sensors in order to design a low-cost gesture recognition device...

  17. Combining Video, Audio and Lexical Indicators of Affect in Spontaneous Conversation via Particle Filtering.

    Science.gov (United States)

    Savran, Arman; Cao, Houwei; Shah, Miraj; Nenkova, Ani; Verma, Ragini

    2012-01-01

    We present experiments on fusing facial video, audio and lexical indicators for affect estimation during dyadic conversations. We use temporal statistics of texture descriptors extracted from facial video, a combination of various acoustic features, and lexical features to create regression based affect estimators for each modality. The single modality regressors are then combined using particle filtering, by treating these independent regression outputs as measurements of the affect states in a Bayesian filtering framework, where previous observations provide prediction about the current state by means of learned affect dynamics. Tested on the Audio-visual Emotion Recognition Challenge dataset, our single modality estimators achieve substantially higher scores than the official baseline method for every dimension of affect. Our filtering-based multi-modality fusion achieves correlation performance of 0.344 (baseline: 0.136) and 0.280 (baseline: 0.096) for the fully continuous and word level sub challenges, respectively.
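
    The fusion idea described above, treating each modality's regression output as a noisy measurement of a hidden affect state, can be sketched with a bootstrap particle filter. This is a minimal illustration under stated assumptions: a scalar state, random-walk dynamics in place of the learned affect dynamics, Gaussian measurement noise, and hypothetical noise levels and function names. It is not the authors' implementation.

```python
import math
import random


def gaussian_likelihood(z, x, sigma):
    """Likelihood of measurement z given state x under Gaussian noise."""
    return math.exp(-0.5 * ((z - x) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def particle_filter_fusion(measurements, sigmas, n_particles=500,
                           process_std=0.05, seed=0):
    """Fuse per-modality outputs; measurements[t] holds one value per modality."""
    rng = random.Random(seed)
    particles = [rng.uniform(-1.0, 1.0) for _ in range(n_particles)]
    estimates = []
    for z_t in measurements:
        # Predict: random-walk dynamics (an assumption made for this sketch).
        particles = [p + rng.gauss(0.0, process_std) for p in particles]
        # Update: modalities are treated as independent measurements.
        weights = []
        for p in particles:
            w = 1.0
            for z, s in zip(z_t, sigmas):
                w *= gaussian_likelihood(z, p, s)
            weights.append(w)
        total = sum(weights)
        if total == 0.0:
            weights = [1.0 / n_particles] * n_particles
        else:
            weights = [w / total for w in weights]
        estimates.append(sum(w * p for w, p in zip(weights, particles)))
        # Resample in proportion to the weights.
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return estimates


# Usage: three modalities (e.g. video, audio, lexical) with made-up outputs.
fused = particle_filter_fusion([(0.35, 0.25, 0.30)] * 30, sigmas=(0.2, 0.2, 0.1))
```

    With constant inputs the fused estimate settles near the precision-weighted mean of the three outputs, so a modality with a smaller assumed noise level pulls the estimate more strongly.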

  18. Fault Diagnosis using Audio and Vibration Signals in a Circulating Pump

    International Nuclear Information System (INIS)

    Henríquez, P; Alonso, J B; Ferrer, M A; Travieso, C M; Gómez, G

    2012-01-01

    This paper presents the use of audio and vibration signals in fault diagnosis of a circulating pump. The novelty of this paper is the use of audio signals acquired by microphones. The objective of this paper is to determine whether audio signals are capable of distinguishing between normal and different abnormal conditions in a circulating pump. In order to compare results, vibration signals are also acquired and analysed. Wavelet package is used to obtain the energies in different frequency bands from the audio and vibration signals. Neural networks are used to evaluate the discrimination ability of the extracted features between normal and fault conditions. The results show that information from sound signals can distinguish between normal and different faulty conditions with a success rate of 83.33%, 98% and 91.33% for each microphone respectively. These success rates are similar to, and even higher than, those obtained from accelerometers (68%, 90.67% and 71.33% for each accelerometer respectively). Success rates also show that the position of microphones and accelerometers affects the final results.

  19. Unimodal Learning Enhances Crossmodal Learning in Robotic Audio-Visual Tracking

    DEFF Research Database (Denmark)

    Shaikh, Danish; Bodenhagen, Leon; Manoonpong, Poramate

    2017-01-01

    Crossmodal sensory integration is a fundamental feature of the brain that aids in forming a coherent and unified representation of observed events in the world. Spatiotemporally correlated sensory stimuli brought about by rich sensorimotor experiences drive the development of crossmodal integrat...... a non-holonomic robotic agent towards a moving audio-visual target. Simulation results demonstrate that unimodal learning enhances crossmodal learning and improves both the overall accuracy and precision of multisensory orientation response....

  20. Unimodal Learning Enhances Crossmodal Learning in Robotic Audio-Visual Tracking

    DEFF Research Database (Denmark)

    Shaikh, Danish; Bodenhagen, Leon; Manoonpong, Poramate

    2018-01-01

    Crossmodal sensory integration is a fundamental feature of the brain that aids in forming a coherent and unified representation of observed events in the world. Spatiotemporally correlated sensory stimuli brought about by rich sensorimotor experiences drive the development of crossmodal integrat...... a non-holonomic robotic agent towards a moving audio-visual target. Simulation results demonstrate that unimodal learning enhances crossmodal learning and improves both the overall accuracy and precision of multisensory orientation response....

  1. Use of Effective Audio in E-learning Courseware

    OpenAIRE

    Ray, Kisor

    2015-01-01

    E-learning uses electronic media and information and communication technologies to provide education to the masses. It delivers hypertext, text, audio, images, animation and videos using standalone desktop computers, local-area-network-based intranets and internet-based content. When producing e-learning content or courseware, a major decision is whether to use audio for the benefit of the end users. Generally, three types of audio can be used in e-learning: narration, mus...

  2. Investigating the impact of audio instruction and audio-visual biofeedback for lung cancer radiation therapy

    Science.gov (United States)

    George, Rohini

    Lung cancer accounts for 13% of all cancers in the United States and is the leading cause of cancer deaths among both men and women. The five-year survival for lung cancer patients is approximately 15% (ACS facts & figures). Respiratory motion decreases the accuracy of thoracic radiotherapy during imaging and delivery. To account for respiration, margins are generally added during radiation treatment planning, which may cause a substantial dose delivery to normal tissues and increase the normal tissue toxicity. To alleviate the above-mentioned effects of respiratory motion, several motion management techniques are available which can reduce the doses to normal tissues, thereby reducing treatment toxicity and allowing dose escalation to the tumor. This may increase the survival probability of patients who have lung cancer and are receiving radiation therapy. However, the accuracy of these motion management techniques is inhibited by respiration irregularity. The rationale of this thesis was to study the improvement in regularity of respiratory motion by breathing coaching for lung cancer patients using audio instructions and audio-visual biofeedback. A total of 331 patient respiratory motion traces, each four minutes in length, were collected from 24 lung cancer patients enrolled in an IRB-approved breathing-training protocol. It was determined that audio-visual biofeedback significantly improved the regularity of respiratory motion compared to free breathing and audio instruction, thus improving the accuracy of respiratory gated radiotherapy. It was also observed that duty cycles below 30% showed insignificant reduction in residual motion while above 50% there was a sharp increase in residual motion. The reproducibility of exhale based gating was higher than that of inhale based gating. By modeling the respiratory cycles, it was found that cosine and cosine^4 models had the best correlation with individual respiratory cycles. The overall respiratory motion probability distribution

  3. Tactile Earth and Space Science Materials for Students with Visual Impairments: Contours, Craters, Asteroids, and Features of Mars

    Science.gov (United States)

    Rule, Audrey C.

    2011-01-01

    New tactile curriculum materials for teaching Earth and planetary science lessons on rotation/revolution, silhouettes of objects from different views, contour maps, impact craters, asteroids, and topographic features of Mars to 11 elementary and middle school students with sight impairments at a week-long residential summer camp are presented…

  4. Cortical Integration of Audio-Visual Information

    Science.gov (United States)

    Vander Wyk, Brent C.; Ramsay, Gordon J.; Hudac, Caitlin M.; Jones, Warren; Lin, David; Klin, Ami; Lee, Su Mei; Pelphrey, Kevin A.

    2013-01-01

    We investigated the neural basis of audio-visual processing in speech and non-speech stimuli. Physically identical auditory stimuli (speech and sinusoidal tones) and visual stimuli (animated circles and ellipses) were used in this fMRI experiment. Relative to unimodal stimuli, each of the multimodal conjunctions showed increased activation in largely non-overlapping areas. The conjunction of Ellipse and Speech, which most resembles naturalistic audiovisual speech, showed higher activation in the right inferior frontal gyrus, fusiform gyri, left posterior superior temporal sulcus, and lateral occipital cortex. The conjunction of Circle and Tone, an arbitrary audio-visual pairing with no speech association, activated middle temporal gyri and lateral occipital cortex. The conjunction of Circle and Speech showed activation in lateral occipital cortex, and the conjunction of Ellipse and Tone did not show increased activation relative to unimodal stimuli. Further analysis revealed that middle temporal regions, although identified as multimodal only in the Circle-Tone condition, were more strongly active to Ellipse-Speech or Circle-Speech, but regions that were identified as multimodal for Ellipse-Speech were always strongest for Ellipse-Speech. Our results suggest that combinations of auditory and visual stimuli may together be processed by different cortical networks, depending on the extent to which speech or non-speech percepts are evoked. PMID:20709442

  5. Semantic Labeling of Nonspeech Audio Clips

    Directory of Open Access Journals (Sweden)

    Xiaojuan Ma

    2010-01-01

    Full Text Available Human communication about entities and events is primarily linguistic in nature. While visual representations of information are shown to be highly effective as well, relatively little is known about the communicative power of auditory nonlinguistic representations. We created a collection of short nonlinguistic auditory clips encoding familiar human activities, objects, animals, natural phenomena, machinery, and social scenes. We presented these sounds to a broad spectrum of anonymous human workers using Amazon Mechanical Turk and collected verbal sound labels. We analyzed the human labels in terms of their lexical and semantic properties to ascertain that the audio clips do evoke the information suggested by their pre-defined captions. We then measured the agreement with the semantically compatible labels for each sound clip. Finally, we examined which kinds of entities and events, when captured by nonlinguistic acoustic clips, appear to be well-suited to elicit information for communication, and which ones are less discriminable. Our work is set against the broader goal of creating resources that facilitate communication for people with some types of language loss. Furthermore, our data should prove useful for future research in machine analysis/synthesis of audio, such as computational auditory scene analysis, and annotating/querying large collections of sound effects.

  6. Newnes audio and Hi-Fi engineer's pocket book

    CERN Document Server

    Capel, Vivian

    2013-01-01

    Newnes Audio and Hi-Fi Engineer's Pocket Book, Second Edition provides concise discussion of several audio topics. The book is comprised of 10 chapters that cover different audio equipment. The coverage of the text includes microphones, gramophones, compact discs, and tape recorders. The book also covers high-quality radio, amplifiers, and loudspeakers. The book then reviews the concepts of sound and acoustics, and presents some facts and formulas relevant to audio. The text will be useful to sound engineers and other professionals whose work involves sound systems.

  7. Audio scene segmentation for video with generic content

    Science.gov (United States)

    Niu, Feng; Goela, Naveen; Divakaran, Ajay; Abdel-Mottaleb, Mohamed

    2008-01-01

    In this paper, we present a content-adaptive audio texture based method to segment video into audio scenes. The audio scene is modeled as a semantically consistent chunk of audio data. Our algorithm is based on "semantic audio texture analysis." At first, we train GMM models for basic audio classes such as speech, music, etc. Then we define the semantic audio texture based on those classes. We study and present two types of scene changes, those corresponding to an overall audio texture change and those corresponding to a special "transition marker" used by the content creator, such as a short stretch of music in a sitcom or silence in dramatic content. Unlike prior work using genre specific heuristics, such as some methods presented for detecting commercials, we adaptively find out if such special transition markers are being used and if so, which of the base classes are being used as markers without any prior knowledge about the content. Our experimental results show that our proposed audio scene segmentation works well across a wide variety of broadcast content genres.

  8. Materials Science Research Hardware for Application on the International Space Station: an Overview of Typical Hardware Requirements and Features

    Science.gov (United States)

    Schaefer, D. A.; Cobb, S.; Fiske, M. R.; Srinivas, R.

    2000-01-01

    NASA's Marshall Space Flight Center (MSFC) is the lead center for Materials Science Microgravity Research. The Materials Science Research Facility (MSRF) is a key development effort underway at MSFC. The MSRF will be the primary facility for microgravity materials science research on board the International Space Station (ISS) and will implement the NASA Materials Science Microgravity Research Program. It will operate in the U.S. Laboratory Module and support U.S. Microgravity Materials Science Investigations. This facility is being designed to maintain the momentum of the U.S. role in microgravity materials science and support NASA's Human Exploration and Development of Space (HEDS) Enterprise goals and objectives for Materials Science. The MSRF as currently envisioned will consist of three Materials Science Research Racks (MSRR), which will be deployed to the International Space Station (ISS) in phases. Each rack is being designed to accommodate various Experiment Modules, which comprise processing facilities for peer selected Materials Science experiments. Phased deployment will enable early opportunities for the U.S. and International Partners, and support the timely incorporation of technology updates to the Experiment Modules and sensor devices.

  9. THE MOBILE SPACE AND MOBILE TARGETING ENVIRONMENT FOR INTERNET USERS: FEATURES OF MODEL SUBMISSION AND USING IN EDUCATION

    OpenAIRE

    V. Bykov

    2013-01-01

    The article presents the results of an analysis of the use of mobile devices in education. It substantiates a definition of user mobility in the Internet space, taking into account the variability of mobile devices and communications. The use of mobile devices in the educational process is based on the paradigm of open and equal access to quality education. The technology of using different types of devices and their functions is considered. The conditions of user mobility in the internet ...

  10. A Content-Adaptive Analysis and Representation Framework for Audio Event Discovery from "Unscripted" Multimedia

    Science.gov (United States)

    Radhakrishnan, Regunathan; Divakaran, Ajay; Xiong, Ziyou; Otsuka, Isao

    2006-12-01

    We propose a content-adaptive analysis and representation framework to discover events using audio features from "unscripted" multimedia such as sports and surveillance for summarization. The proposed analysis framework performs an inlier/outlier-based temporal segmentation of the content. It is motivated by the observation that "interesting" events in unscripted multimedia occur sparsely in a background of usual or "uninteresting" events. We treat the sequence of low/mid-level features extracted from the audio as a time series and identify subsequences that are outliers. The outlier detection is based on eigenvector analysis of the affinity matrix constructed from statistical models estimated from the subsequences of the time series. We define the confidence measure on each of the detected outliers as the probability that it is an outlier. Then, we establish a relationship between the parameters of the proposed framework and the confidence measure. Furthermore, we use the confidence measure to rank the detected outliers in terms of their departures from the background process. Our experimental results with sequences of low- and mid-level audio features extracted from sports video show that "highlight" events can be extracted effectively as outliers from a background process using the proposed framework. We proceed to show the effectiveness of the proposed framework in bringing out suspicious events from surveillance videos without any a priori knowledge. We show that such temporal segmentation into background and outliers, along with the ranking based on the departure from the background, can be used to generate content summaries of any desired length. Finally, we also show that the proposed framework can be used to systematically select "key audio classes" that are indicative of events of interest in the chosen domain.

  11. 47 CFR 25.144 - Licensing provisions for the 2.3 GHz satellite digital audio radio service.

    Science.gov (United States)

    2010-10-01

    ... 47 Telecommunication 2 2010-10-01 2010-10-01 false Licensing provisions for the 2.3 GHz satellite digital audio radio service. 25.144 Section 25.144 Telecommunication FEDERAL COMMUNICATIONS COMMISSION (CONTINUED) COMMON CARRIER SERVICES SATELLITE COMMUNICATIONS Applications and Licenses Space Stations § 25...

  12. Visualising the environmental appearance of audio products

    Energy Technology Data Exchange (ETDEWEB)

    Stilma, M. [Univ. of Twente, Enschede (Netherlands)]; Stevels, A. [Delft Univ. of Technology, Delft (Netherlands); Philips Consumer Electronics, Eindhoven (Netherlands)]; Christiaans, H.; Kandachar, P. [Delft Univ. of Technology, Delft (Netherlands)]

    2004-07-01

    Can environmental friendliness be communicated by the design style and appearance of products (such as form, colour, style or material)? Consumers are interested in buying environmental products and design styles might be used as communicative tools. However, current 'green' products show something else. Environmental aspects are chiefly promoted by marketing programs based on technical items like the use of materials, hazardous substances, energy consumption, etc. In a qualitative, exploratory study, the environmental design styles according to consumers' opinions were analysed, with larger audio products as a case study. Visibly distinctive differences can be identified between the products rated most and least environmental. A 'Green flagship', which claims to be environmentally orientated, wasn't recognised as such by consumers. Women and men also perceive environmental friendliness differently. From this research it can be concluded that more attention is needed to visualise the good technical environmental performance of products. (orig.)

  13. Time-Scale Invariant Audio Data Embedding

    Directory of Open Access Journals (Sweden)

    Mansour Mohamed F

    2003-01-01

    Full Text Available We propose a novel algorithm for high-quality data embedding in audio. The algorithm is based on changing the relative length of the middle segment between two successive maximum and minimum peaks to embed data. Spline interpolation is used to change the lengths. To ensure smooth monotonic behavior between peaks, a hybrid orthogonal and nonorthogonal wavelet decomposition is used prior to data embedding. The possible data embedding rates are between 20 and 30 bps. However, for practical purposes, we use repetition codes, and the effective embedding data rate is around 5 bps. The algorithm is invariant after time-scale modification, time shift, and time cropping. It gives high-quality output and is robust to mp3 compression.
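
    The trade-off between raw and effective embedding rate mentioned in this abstract can be illustrated with a generic repetition code. This is a minimal sketch, not the authors' implementation: the repetition factor r = 5 and the raw rate of 25 bps are assumptions chosen only so that the numbers fall inside the ranges quoted above, and the function names are hypothetical.

```python
def rep_encode(bits, r):
    """Repeat every payload bit r times (r is an assumed repetition factor)."""
    return [b for b in bits for _ in range(r)]


def rep_decode(coded, r):
    """Recover each payload bit by majority vote over its group of r copies."""
    return [1 if 2 * sum(coded[i:i + r]) > r else 0
            for i in range(0, len(coded), r)]


def effective_rate(raw_bps, r):
    """Payload rate left after spending r embedded bits per payload bit."""
    return raw_bps / r


payload = [1, 0, 1, 1]
coded = rep_encode(payload, 5)

# Simulate two isolated bit errors introduced by signal processing.
coded[0] ^= 1
coded[7] ^= 1

decoded = rep_decode(coded, 5)
rate = effective_rate(25, 5)  # assumed raw rate of 25 bps
```

    With an odd r, majority voting tolerates up to (r - 1) / 2 flipped copies per payload bit, at the cost of dividing the embedding rate by r.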

  14. Audio visual information materials for risk communication

    International Nuclear Information System (INIS)

    Gunji, Ikuko; Tabata, Rimiko; Ohuchi, Naomi

    2005-07-01

    Japan Nuclear Cycle Development Institute (JNC), Tokai Works set up the Risk Communication Study Team in January 2001 to promote mutual understanding between the local residents and JNC. The Team has studied risk communication from various viewpoints and developed new methods of public relations which are useful for the local residents' risk perception toward nuclear issues. We aim to develop more effective risk communication which promotes a better mutual understanding with the local residents, by providing the risk information of the nuclear fuel facilities such as a Reprocessing Plant and other research and development facilities. We explain the development process of audio visual information materials which describe our actual activities and devices for the risk management in nuclear fuel facilities, and our discussion through the effectiveness measurement. (author)

  15. Tune in the Net with RealAudio.

    Science.gov (United States)

    Buchanan, Larry

    1997-01-01

    Describes how to connect to the RealAudio Web site to download a player that provides sound from Web pages to the computer through streaming technology. Explains hardware and software requirements and provides addresses for other RealAudio Web sites, including weather information and current news. (LRW)

  16. Four-quadrant flyback converter for direct audio power amplification

    DEFF Research Database (Denmark)

    Ljusev, Petar; Andersen, Michael Andreas E.

    2005-01-01

    This paper presents a bidirectional, four-quadrant flyback converter for use in direct audio power amplification. When compared to the standard Class-D switching audio power amplifier with a separate power supply, the proposed four-quadrant flyback converter provides a simple solution with better...

  17. Four-quadrant flyback converter for direct audio power amplification

    OpenAIRE

    Ljusev, Petar; Andersen, Michael Andreas E.

    2005-01-01

    This paper presents a bidirectional, four-quadrant flyback converter for use in direct audio power amplification. When compared to the standard Class-D switching audio power amplifier with a separate power supply, the proposed four-quadrant flyback converter provides a simple solution with better efficiency, higher level of integration and lower component count.

  18. Unsupervised topic modelling on South African parliament audio data

    CSIR Research Space (South Africa)

    Kleynhans, N

    2014-11-01

    Full Text Available Using a speech recognition system to convert spoken audio to text can enable the structuring of large collections of spoken audio data. A convenient means to summarise or cluster spoken data is to identify the topic under discussion. There are many...

  19. The Effect of Audio and Animation in Multimedia Instruction

    Science.gov (United States)

    Koroghlanian, Carol; Klein, James D.

    2004-01-01

    This study investigated the effects of audio, animation, and spatial ability in a multimedia computer program for high school biology. Participants completed a multimedia program that presented content by way of text or audio with lean text. In addition, several instructional sequences were presented either with static illustrations or animations.…

  20. The Use of Audio and Animation in Computer Based Instruction.

    Science.gov (United States)

    Koroghlanian, Carol; Klein, James D.

    This study investigated the effects of audio, animation, and spatial ability in a computer-based instructional program for biology. The program presented instructional material via text or audio with lean text and included eight instructional sequences presented either via static illustrations or animations. High school students enrolled in a…

  1. Multi Carrier Modulation Audio Power Amplifier with Programmable Logic

    DEFF Research Database (Denmark)

    Christiansen, Theis; Andersen, Toke Meyer; Knott, Arnold

    2009-01-01

    While switch-mode audio power amplifiers allow compact implementations and high output power levels due to their high power efficiency, they are very well known for creating electromagnetic interference (EMI) with other electronic equipment. To lower the EMI of switch-mode (class D) audio power a...

  2. Let Their Voices Be Heard! Building a Multicultural Audio Collection.

    Science.gov (United States)

    Tucker, Judith Cook

    1992-01-01

    Discusses building a multicultural audio collection for a library. Gives some guidelines about selecting materials that really represent different cultures. Audio materials that are considered fall roughly into the categories of children's stories, didactic materials, oral histories, poetry and folktales, and music. The goal is an authentic…

  3. Efficiency in audio processing : filter banks and transcoding

    NARCIS (Netherlands)

    Lee, Jun Wei

    2007-01-01

    Audio transcoding is the conversion of digital audio from one compressed form A to another compressed form B, where A and B have different compression properties, such as a different bit-rate, sampling frequency or compression method. This is typically achieved by decoding A to an intermediate

  4. Parametric Audio Based Decoder and Music Synthesizer for Mobile Applications

    NARCIS (Netherlands)

    Oomen, A.W.J.; Szczerba, M.Z.; Therssen, D.

    2011-01-01

    This paper reviews parametric audio coders and discusses novel technologies introduced in a low-complexity, low-power consumption audio decoder and music synthesizer platform developed by the authors. The decoder uses a parametric coding scheme based on the MPEG-4 Parametric Audio standard. In order to

  5. Decision-level fusion for audio-visual laughter detection

    NARCIS (Netherlands)

    Reuderink, B.; Poel, M.; Truong, K.; Poppe, R.; Pantic, M.

    2008-01-01

    Laughter is a highly variable signal, which can be caused by a spectrum of emotions. This makes the automatic detection of laughter a challenging, but interesting task. We perform automatic laughter detection using audio-visual data from the AMI Meeting Corpus. Audio-visual laughter detection is

  6. Decision-Level Fusion for Audio-Visual Laughter Detection

    NARCIS (Netherlands)

    Reuderink, B.; Poel, Mannes; Truong, Khiet Phuong; Poppe, Ronald Walter; Pantic, Maja; Popescu-Belis, Andrei; Stiefelhagen, Rainer

    Laughter is a highly variable signal, which can be caused by a spectrum of emotions. This makes the automatic detection of laughter a challenging, but interesting task. We perform automatic laughter detection using audio-visual data from the AMI Meeting Corpus. Audio-visual laughter detection is

  7. Haptic and Audio-visual Stimuli: Enhancing Experiences and Interaction

    NARCIS (Netherlands)

    Nijholt, Antinus; Dijk, Esko O.; Lemmens, Paul M.C.; Luitjens, S.B.

    2010-01-01

    The intention of the symposium on Haptic and Audio-visual stimuli at the EuroHaptics 2010 conference is to deepen the understanding of the effect of combined Haptic and Audio-visual stimuli. The knowledge gained will be used to enhance experiences and interactions in daily life. To this end, a

  8. Automated Speech and Audio Analysis for Semantic Access to Multimedia

    NARCIS (Netherlands)

    Jong, F.M.G. de; Ordelman, R.; Huijbregts, M.

    2006-01-01

    The deployment and integration of audio processing tools can enhance the semantic annotation of multimedia content, and as a consequence, improve the effectiveness of conceptual access tools. This paper overviews the various ways in which automatic speech and audio analysis can contribute to

  9. Automated speech and audio analysis for semantic access to multimedia

    NARCIS (Netherlands)

    de Jong, Franciska M.G.; Ordelman, Roeland J.F.; Huijbregts, M.A.H.; Avrithis, Y.; Kompatsiaris, Y.; Staab, S.; O' Connor, N.E.

    2006-01-01

    The deployment and integration of audio processing tools can enhance the semantic annotation of multimedia content, and as a consequence, improve the effectiveness of conceptual access tools. This paper overviews the various ways in which automatic speech and audio analysis can contribute to

  10. Multilevel inverter based class D audio amplifier for capacitive transducers

    DEFF Research Database (Denmark)

    Nielsen, Dennis; Knott, Arnold; Andersen, Michael A. E.

    2014-01-01

    The reduced semiconductor voltage stress makes the multilevel inverters especially interesting, when driving capacitive transducers for audio applications. A ± 300 V flying capacitor class D audio amplifier driving a 100 nF load in the midrange region of 0.1-3.5 kHz with Total Harmonic Distortion...

  11. Audio Teleconferencing: Low Cost Technology for External Studies Networking.

    Science.gov (United States)

    Robertson, Bill

    1987-01-01

    This discussion of the benefits of audio teleconferencing for distance education programs and for business and government applications focuses on the recent experience of Canadian educational users. Four successful operating models and their costs are reviewed, and it is concluded that audio teleconferencing is cost efficient and educationally…

  12. Content Discovery from Composite Audio : An unsupervised approach

    NARCIS (Netherlands)

    Lu, L.

    2009-01-01

    In this thesis, we developed and assessed a novel robust and unsupervised framework for semantic inference from composite audio signals. We focused on the problem of detecting audio scenes and grouping them into meaningful clusters. Our approach addressed all major steps in a general process of

  13. Removable Watermarking as a Control against Cyber Crime in Digital Audio (Removable Watermarking Sebagai Pengendalian Terhadap Cyber Crime Pada Audio Digital)

    Directory of Open Access Journals (Sweden)

    Reyhani Lian Putri

    2017-08-01

    Full Text Available The rapid development of information technology requires its users to be more careful as cyber crime continues to increase. Many parties have developed techniques for protecting digital data, one of which is watermarking. Watermarking technology serves to identify, protect, or mark the digital data (audio, image, or video) that they own. However, such techniques can still be hacked by irresponsible parties. In this research, the watermarking process is applied to digital audio by embedding a watermark that is clearly audible to the human ear (perceptible) in the host audio. The aim is to protect the audio data, so that any other party wanting to obtain the audio must have the “key” to remove the watermark. The removable watermarking process is carried out on watermark data whose embedding method is known, so that the watermark can be removed and the audio quality improves. With this method, the audio watermarking performance at the highest distortion level had an average SNR of 7.834 dB and an average ODG of -3.77. Audio quality improved after the watermark was removed, with the average SNR rising to 24.986 dB, the average ODG to -1.064, and a MOS of 4.40.

  14. Selected Audio-Visual Materials for Consumer Education. [New Version.

    Science.gov (United States)

    Johnston, William L.

    Ninety-two films, filmstrips, multi-media kits, slides, and audio cassettes, produced between 1964 and 1974, are listed in this selective annotated bibliography on consumer education. The major portion of the bibliography is devoted to films and filmstrips. The main topics of the audio-visual materials include purchasing, advertising, money…

  15. Biomedical image representation approach using visualness and spatial information in a concept feature space for interactive region-of-interest-based retrieval.

    Science.gov (United States)

    Rahman, Md Mahmudur; Antani, Sameer K; Demner-Fushman, Dina; Thoma, George R

    2015-10-01

    This article presents an approach to biomedical image retrieval by mapping image regions to local concepts where images are represented in a weighted entropy-based concept feature space. The term "concept" refers to perceptually distinguishable visual patches that are identified locally in image regions and can be mapped to a glossary of imaging terms. Further, the visual significance (e.g., visualness) of concepts is measured as the Shannon entropy of pixel values in image patches and is used to refine the feature vector. Moreover, the system can assist the user in interactively selecting a region-of-interest (ROI) and searching for similar image ROIs. Further, a spatial verification step is used as a postprocessing step to improve retrieval results based on location information. The hypothesis that such approaches would improve biomedical image retrieval is validated through experiments on two different data sets, which are collected from open access biomedical literature.
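
    The "visualness" measure described in this abstract, the Shannon entropy of pixel values in an image patch, can be sketched as follows. This is an illustrative sketch only: the patch representation (a list of rows of integer grey levels) and the function name are assumptions, and the abstract does not specify how the entropy is then used to weight the concept feature vector, so that step is not shown.

```python
import math
from collections import Counter


def patch_entropy(patch):
    """Shannon entropy (in bits) of the pixel-value distribution of a patch.

    `patch` is assumed to be a non-empty list of rows of integer grey levels.
    """
    pixels = [value for row in patch for value in row]
    total = len(pixels)
    counts = Counter(pixels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


flat_patch = [[128, 128], [128, 128]]   # a single grey level: no variation
two_level_patch = [[0, 255], [255, 0]]  # two equally frequent grey levels

flat_entropy = patch_entropy(flat_patch)
two_level_entropy = patch_entropy(two_level_patch)
```

    A uniform patch yields zero entropy, while a patch whose pixels are split evenly between two values yields one bit, so higher values indicate patches with more varied content.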

  16. Performance Characterization of Loctite (Registered Trademark) 242 and 271 Liquid Locking Compounds (LLCs) as a Secondary Locking Feature for International Space Station (ISS) Fasteners

    Science.gov (United States)

    Dube, Michael J.; Gamwell, Wayne R.

    2011-01-01

    Several International Space Station (ISS) hardware components use Loctite (and other polymer-based liquid locking compounds (LLCs)) as a means of meeting the secondary (redundant) locking feature requirement for fasteners. The primary locking method is the fastener preload; the Loctite compound, once cured, is intended to resist preload reduction. The reliability of these compounds has been questioned due to a number of failures during ground testing. The ISS Program Manager requested the NASA Engineering and Safety Center (NESC) to characterize and quantify sensitivities of Loctite being used as a secondary locking feature. The findings and recommendations provided in this investigation apply to the anaerobic LLCs Loctite 242 and 271. No other anaerobic LLCs were evaluated for this investigation. This document contains the findings and recommendations of the NESC investigation.

  17. AUDIO CRYPTANALYSIS- AN APPLICATION OF SYMMETRIC KEY CRYPTOGRAPHY AND AUDIO STEGANOGRAPHY

    Directory of Open Access Journals (Sweden)

    Smita Paira

    2016-09-01

    Full Text Available In recent trends in networking and technology, “Cryptography” and “Steganography” have emerged as essential elements of network security. Although Cryptography plays a major role in transforming the secret message into an encrypted version, it has certain drawbacks. Steganography is the art that addresses one of the basic limitations of Cryptography. In this paper, a new algorithm is proposed based on both Symmetric Key Cryptography and Audio Steganography. The combination of a randomly generated Symmetric Key with the LSB technique of Audio Steganography sends a secret message, in unrecognizable form, through an insecure medium. The Stego File generated is almost lossless, giving 100 percent recovery of the original message. This paper also presents a detailed experimental analysis of the algorithm, a brief comparison with other existing algorithms, and future scope. The experimental verification and security issues are promising.
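
    The general idea of combining a random symmetric key with least-significant-bit (LSB) embedding can be sketched as below. This is not the algorithm proposed in the paper, whose cipher and embedding details are not given in the abstract. A simple XOR cipher stands in for the symmetric-key step, the cover audio is assumed to be a list of 16-bit integer samples, and all function names are hypothetical.

```python
import random
import secrets


def xor_bytes(data, key):
    """Stand-in symmetric cipher: XOR with a repeating key (self-inverse)."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


def embed_lsb(samples, payload):
    """Write the payload bits, most significant first, into sample LSBs."""
    bits = [(byte >> shift) & 1 for byte in payload for shift in range(7, -1, -1)]
    if len(bits) > len(samples):
        raise ValueError("cover audio is too short for this payload")
    stego = list(samples)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & ~1) | bit  # changes each sample by at most 1
    return stego


def extract_lsb(samples, n_bytes):
    """Read n_bytes back from the LSBs of the first 8 * n_bytes samples."""
    out = bytearray()
    for i in range(n_bytes):
        byte = 0
        for sample in samples[8 * i:8 * i + 8]:
            byte = (byte << 1) | (sample & 1)
        out.append(byte)
    return bytes(out)


message = b"secret"
key = secrets.token_bytes(len(message))  # randomly generated symmetric key

rng = random.Random(0)
cover = [rng.randrange(-32768, 32768) for _ in range(200)]  # fake 16-bit audio

stego = embed_lsb(cover, xor_bytes(message, key))
recovered = xor_bytes(extract_lsb(stego, len(message)), key)
max_change = max(abs(a - b) for a, b in zip(cover, stego))
```

    Because only the lowest bit of each sample is touched, the stego signal differs from the cover by at most one quantization step per sample, and a receiver holding the same key recovers the message exactly.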

  18. Robust audio-visual speech recognition under noisy audio-video conditions.

    Science.gov (United States)

    Stewart, Darryl; Seymour, Rowan; Pass, Adrian; Ming, Ji

    2014-02-01

    This paper presents the maximum weighted stream posterior (MWSP) model as a robust and efficient stream integration method for audio-visual speech recognition in environments, where the audio or video streams may be subjected to unknown and time-varying corruption. A significant advantage of MWSP is that it does not require any specific measurements of the signal in either stream to calculate appropriate stream weights during recognition, and as such it is modality-independent. This also means that MWSP complements and can be used alongside many of the other approaches that have been proposed in the literature for this problem. For evaluation we used the large XM2VTS database for speaker-independent audio-visual speech recognition. The extensive tests include both clean and corrupted utterances with corruption added in either/both the video and audio streams using a variety of types (e.g., MPEG-4 video compression) and levels of noise. The experiments show that this approach gives excellent performance in comparison to another well-known dynamic stream weighting approach and also compared to any fixed-weighted integration approach in both clean conditions or when noise is added to either stream. Furthermore, our experiments show that the MWSP approach dynamically selects suitable integration weights on a frame-by-frame basis according to the level of noise in the streams and also according to the naturally fluctuating relative reliability of the modalities even in clean conditions. The MWSP approach is shown to maintain robust recognition performance in all tested conditions, while requiring no prior knowledge about the type or level of noise.

  19. Perceived Audio Quality Analysis in Digital Audio Broadcasting Plus System Based on PEAQ

    Directory of Open Access Journals (Sweden)

    K. Ulovec

    2018-04-01

    Broadcasters need to decide on the bitrates of the services in the multiplex transmitted via the Digital Audio Broadcasting Plus system. The bitrate should be set as low as possible to maximize the number of services, but with high quality, not lower than in conventional analog systems. In this paper, the objective method Perceptual Evaluation of Audio Quality is used to analyze the perceived audio quality for the appropriate codecs: MP2, and AAC offering three profiles. The main aim is to determine dependencies on the type of signal (music and speech), the number of channels (stereo and mono), and the bitrate. Results indicate that only the MP2 codec and the AAC Low Complexity profile reach imperceptible quality loss. The MP2 codec needs a higher bitrate than the AAC Low Complexity profile for the same quality. For both versions of the AAC High-Efficiency profile, the limit bitrates are determined above which less complex profiles outperform the more complex ones, so that bitrates above these limits are not worth using. It is shown that stereo music generally has worse quality than stereo speech, whereas for mono the dependencies vary with the codec/profile. Furthermore, the numbers of services satisfying various quality criteria are presented.

  20. Concepts, strategies and potentials using hypo-g and other features of the space environment for commercialization using higher plants

    Science.gov (United States)

    Krikorian, A. D.

    1985-01-01

    Opportunities for releasing, capturing, constructing and/or fixing the differential expressions or response potentials of the higher plant genome in the hypo-g environment for commercialization are explored. General strategies include improved plant-growing, crop and forestry production systems which conserve soil, water, labor and energy resources, and nutritional partitioning and mobilization of nutrients and synthates. Tissue and cell culture techniques of commercial potential include the growing and manipulation of cultured plant cells in vitro in a bioreactor to produce biologicals and secondary plants of economic value. The facilitation of plant breeding, the cloning of specific pathogen-free materials, the elimination of growing point or apex viruses, and the increase of plant yield are other 0-g applications. The space environment may be advantageous in somatic embryogenesis, the culture of alkaloids, and the development of completely new crop plant germ plasm.

  1. Hierarchical vs non-hierarchical audio indexation and classification for video genres

    Science.gov (United States)

    Dammak, Nouha; BenAyed, Yassine

    2018-04-01

    In this paper, Support Vector Machines (SVMs) are used for segmenting and indexing video genres based only on audio features extracted at block level, which has the advantage of capturing local temporal information. The main contribution of our study is to show the wide effect on the classification accuracies of using a hierarchical categorization structure based on the Mel Frequency Cepstral Coefficients (MFCC) audio descriptor. The classification covers three common video genres: sports videos, music clips and news scenes. The sub-classification may divide each genre into several multi-speaker and multi-dialect sub-genres. The validation of this approach was carried out on over 360 minutes of video, yielding a classification accuracy of over 99%.

  2. Neuromorphic Audio-Visual Sensor Fusion on a Sound-Localising Robot

    Directory of Open Access Journals (Sweden)

    Vincent Yue-Sek Chan

    2012-02-01

    This paper presents the first robotic system featuring audio-visual sensor fusion with neuromorphic sensors. We combine a pair of silicon cochleae and a silicon retina on a robotic platform to allow the robot to learn sound localisation through self-motion and visual feedback, using an adaptive ITD-based sound localisation algorithm. After training, the robot can localise sound sources (white or pink noise) in a reverberant environment with an RMS error of 4 to 5 degrees in azimuth. In the second part of the paper, we investigate the source binding problem. An experiment is conducted to test the effectiveness of matching an audio event with a corresponding visual event based on their onset time. The results show that this technique can be quite effective, despite its simplicity.
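    For readers unfamiliar with interaural time difference (ITD) cues, the textbook far-field relation between ITD and azimuth can be sketched as follows. This is not the adaptive, learned algorithm of the paper; the function name, the 343 m/s speed of sound and the sensor spacing are illustrative assumptions.

```python
import math


def azimuth_from_itd(itd_s: float, sensor_spacing_m: float,
                     speed_of_sound_m_s: float = 343.0) -> float:
    """Far-field azimuth in degrees from an interaural time difference.

    Model (assumed, not from the paper): sin(theta) = c * ITD / d, where d is
    the distance between the two sensors. 0 degrees means straight ahead.
    """
    ratio = speed_of_sound_m_s * itd_s / sensor_spacing_m
    # Clamp against measurement noise or rounding pushing the ratio past +/-1.
    ratio = max(-1.0, min(1.0, ratio))
    return math.degrees(math.asin(ratio))


if __name__ == "__main__":
    # A source 30 degrees off-centre with sensors 0.2 m apart (hypothetical rig).
    itd = 0.2 * math.sin(math.radians(30.0)) / 343.0
    print(azimuth_from_itd(itd, 0.2))
```

    A fixed formula like this assumes known geometry and free-field propagation; learning the ITD-to-azimuth mapping from self-motion, as the abstract describes, avoids relying on those assumptions.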

  3. Horatio Audio-Describes Shakespeare's "Hamlet": Blind and Low-Vision Theatre-Goers Evaluate an Unconventional Audio Description Strategy

    Science.gov (United States)

    Udo, J. P.; Acevedo, B.; Fels, D. I.

    2010-01-01

    Audio description (AD) has been introduced as one solution for providing people who are blind or have low vision with access to live theatre, film and television content. However, there is little research to inform the process, user preferences and presentation style. We present a study of a single live audio-described performance of Hart House…

  4. Subjective and Objective Assessment of Perceived Audio Quality of Current Digital Audio Broadcasting Systems and Web-Casting Applications

    NARCIS (Netherlands)

    Pocta, P.; Beerends, J.G.

    2015-01-01

    This paper investigates the impact of different audio codecs typically deployed in current digital audio broadcasting (DAB) systems and web-casting applications, which represent a main source of quality impairment in these systems and applications, on the quality perceived by the end user. Both

  5. ESA personal communications and digital audio broadcasting systems based on non-geostationary satellites

    Science.gov (United States)

    Logalbo, P.; Benedicto, J.; Viola, R.

    1993-01-01

    Personal Communications and Digital Audio Broadcasting are two new services that the European Space Agency (ESA) is investigating for future European and Global Mobile Satellite systems. ESA is active in promoting these services in their various mission options including non-geostationary and geostationary satellite systems. A Medium Altitude Global Satellite System (MAGSS) for global personal communications at L and S-band, and a Multiregional Highly inclined Elliptical Orbit (M-HEO) system for multiregional digital audio broadcasting at L-band are described. Both systems are being investigated by ESA in the context of future programs, such as Archimedes, which are intended to demonstrate the new services and to develop the technology for future non-geostationary mobile communication and broadcasting satellites.

  6. Study of growth and development features of ten ground cover plants in Kish Island green space in warm season

    Directory of Open Access Journals (Sweden)

    S. Shooshtarian

    2016-05-01

    Owing to its special ecological conditions, Kish Island has a restricted range of native ornamental plant species. Expansion of urban green space on this island is of great importance due to its outstanding position as a tourist destination in the south of Iran. The purpose of this study was to investigate the growth and development of groundcover plants planted in four different regions of Kish Island and to recommend the most suitable and adaptable species for each region. The ten groundcover species were Festuca ovina L., Glaucium flavum Crantz., Frankenia thymifolia Desf., Sedum spurium Bieb., Sedum acre L., Potentilla verna L., Carpobrotus acinaciformis (L.) L. Bolus., Achillea millefolium L., Alternanthera dentata Moench. and Lampranthus spectabilis Haw. Growth and development were evaluated by measuring morphological characteristics such as height, covering area, leaf number and area, dry and fresh total weights, and visual scoring. The physiological traits evaluated were proline and chlorophyll contents. The study used a factorial layout based on a randomized complete block design with six replicates. Results showed that in terms of indices such as covering area, visual quality, height, total weight, and chlorophyll content, Pavioon and Sadaf plants had the best and the worst performances, respectively, in comparison to other regions’ plants. Based on the evaluated characteristics, C. acinaciformis, L. spectabilis and F. thymifolia had the most expansion and growth in all four regions and are recommended for planting in Kish Island and similar climates.

  7. THE MOBILE SPACE AND MOBILE TARGETING ENVIRONMENT FOR INTERNET USERS: FEATURES OF MODEL SUBMISSION AND USING IN EDUCATION

    Directory of Open Access Journals (Sweden)

    V. Bykov

    2013-08-01

    The article presents the results of an analysis of the use of mobile devices in education. A definition of user mobility in the Internet space is substantiated, taking into account the variability of mobile devices and communications. The use of mobile devices in the educational process is based on the paradigm of open and equal access to quality education. The technology of using different types of devices and their functions is considered, along with the conditions of user mobility in the Internet environment, the factors influencing it, and the creation and storage of mobile communication resources. A basic mathematical model of user behavior in a virtual network is provided. A model is given of a user's migration from device to device and of the user's geographic movement, and the resulting model is then used for the design of distance learning systems. Preliminary forecasts are made on the development of education in the transition from distance technologies to open ones. New types of personal devices are expected to appear that combine the power of a desktop PC and the autonomy of smartphones with constant broadband wireless access to the Internet. The use of cloud technology to store and process educational information resources helps centralize and synchronize data and provide access to them from different devices.

  8. Optimized Audio Classification and Segmentation Algorithm by Using Ensemble Methods

    Directory of Open Access Journals (Sweden)

    Saadia Zahid

    2015-01-01

    Audio segmentation is a basis for multimedia content analysis, which is among the most important and widely used applications nowadays. An optimized audio classification and segmentation algorithm is presented in this paper that segments a superimposed audio stream on the basis of its content into four main audio types: pure-speech, music, environment sound, and silence. The proposed algorithm preserves important audio content and reduces the misclassification rate without using a large amount of training data; it handles noise and is suitable for real-time applications. Noise in an audio stream is segmented out as environment sound. A hybrid classification approach is used: bagged support vector machines (SVMs) with artificial neural networks (ANNs). The audio stream is first classified into speech and nonspeech segments by bagged support vector machines; the nonspeech segment is further classified into music and environment sound by artificial neural networks; lastly, the speech segment is classified into silence and pure-speech segments by a rule-based classifier. Minimal data is used for training the classifiers; ensemble methods are used to minimize the misclassification rate, and approximately 98% accurate segments are obtained. A fast and efficient algorithm is designed that can be used with real-time multimedia applications.
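    The three-stage cascade in this abstract (speech vs. nonspeech, then music vs. environment sound, then silence vs. pure-speech) can be sketched as plain control flow. The classifiers themselves are passed in as callables: in the paper they are bagged SVMs, ANNs and a rule-based classifier, whereas the function and parameter names here are hypothetical and the toy predicates in the usage lines are placeholders.

```python
from typing import Callable, Sequence

Frame = Sequence[float]
Predicate = Callable[[Frame], bool]


def classify_segment(frame: Frame, is_speech: Predicate,
                     is_music: Predicate, is_silence: Predicate) -> str:
    """Route one audio segment through the hierarchical decision cascade."""
    if is_speech(frame):
        # Stage 3: the speech branch is split by a rule-based test.
        return "silence" if is_silence(frame) else "pure-speech"
    # Stage 2: the nonspeech branch is split into music vs. environment sound.
    return "music" if is_music(frame) else "environment sound"


if __name__ == "__main__":
    # Toy stand-ins for trained models, for illustration only.
    def energy(frame: Frame) -> float:
        return sum(x * x for x in frame)

    label = classify_segment(
        [0.0, 0.0, 0.0],
        is_speech=lambda f: len(f) == 3,
        is_music=lambda f: energy(f) > 1.0,
        is_silence=lambda f: energy(f) < 1e-6,
    )
    print(label)
```

    Keeping the cascade separate from the models makes it possible to swap any stage (for example, replacing the rule-based silence test) without touching the routing logic.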

  9. Portable audio electronics for impedance-based measurements in microfluidics

    International Nuclear Information System (INIS)

    Wood, Paul; Sinton, David

    2010-01-01

    We demonstrate the use of audio electronics-based signals to perform on-chip electrochemical measurements. Cell phones and portable music players are examples of consumer electronics that are easily operated and are ubiquitous worldwide. Audio output (play) and input (record) signals are voltage based and contain frequency and amplitude information. A cell phone, laptop soundcard and two compact audio players are compared with respect to frequency response; the laptop soundcard provides the most uniform frequency response, while the cell phone performance is found to be insufficient. The audio signals in the common portable music players and laptop soundcard operate in the range of 20 Hz to 20 kHz and are found to be applicable, as voltage input and output signals, to impedance-based electrochemical measurements in microfluidic systems. Validated impedance-based measurements of concentration (0.1–50 mM), flow rate (2–120 µL min⁻¹) and particle detection (32 µm diameter) are demonstrated. The prevailing, lossless, wave audio file format is found to be suitable for data transmission to and from external sources, such as a centralized lab, and the cost of all hardware (in addition to audio devices) is ∼10 USD. The utility demonstrated here, in combination with the ubiquitous nature of portable audio electronics, presents new opportunities for impedance-based measurements in portable microfluidic systems. (technical note)

  10. Musical examination to bridge audio data and sheet music

    Science.gov (United States)

    Pan, Xunyu; Cross, Timothy J.; Xiao, Liangliang; Hei, Xiali

    2015-03-01

    The digitalization of audio is commonly implemented for the purpose of convenient storage and transmission of music and songs in today's digital age. Analyzing digital audio for an insightful look at a specific musical characteristic, however, can be quite challenging for various types of applications. Many existing musical analysis techniques can examine a particular piece of audio data. For example, the frequency of digital sound can be easily read and identified at a specific section in an audio file. Based on this information, we could determine the musical note being played at that instant, but what if you want to see a list of all the notes played in a song? While most existing methods help to provide information about a single piece of the audio data at a time, few of them can analyze the available audio file on a larger scale. The research conducted in this work considers how to further utilize the examination of audio data by storing more information from the original audio file. In practice, we develop a novel musical analysis system Musicians Aid to process musical representation and examination of audio data. Musicians Aid solves the previous problem by storing and analyzing the audio information as it reads it rather than tossing it aside. The system can provide professional musicians with an insightful look at the music they created and advance their understanding of their work. Amateur musicians could also benefit from using it solely for the purpose of obtaining feedback about a song they were attempting to play. By comparing our system's interpretation of traditional sheet music with their own playing, a musician could ensure what they played was correct. More specifically, the system could show them exactly where they went wrong and how to adjust their mistakes. In addition, the application could be extended over the Internet to allow users to play music with one another and then review the audio data they produced. This would be particularly
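    The step from a detected frequency to "the musical note being played at that instant" can be made concrete with the standard equal-temperament mapping. This is a minimal sketch, not the Musicians Aid implementation; it assumes A4 = 440 Hz tuning, sharps-only note spelling, and a hypothetical function name.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def frequency_to_note(freq_hz: float, a4_hz: float = 440.0) -> str:
    """Name of the nearest equal-tempered note, e.g. 'A4' for 440 Hz."""
    if freq_hz <= 0:
        raise ValueError("frequency must be positive")
    # MIDI note number: 69 is A4, and each semitone is a factor of 2**(1/12).
    midi = round(69 + 12 * math.log2(freq_hz / a4_hz))
    octave = midi // 12 - 1
    return f"{NOTE_NAMES[midi % 12]}{octave}"


if __name__ == "__main__":
    # Listing every note in a piece amounts to applying this to each detected pitch.
    detected = [261.63, 329.63, 392.00]  # hypothetical pitch-tracker output in Hz
    print([frequency_to_note(f) for f in detected])
```

    Rounding to the nearest semitone hides small tuning deviations; a feedback tool for musicians would likely also report the offset from the ideal pitch, which this sketch omits.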

  11. Current-Driven Switch-Mode Audio Power Amplifiers

    DEFF Research Database (Denmark)

    Knott, Arnold; Buhl, Niels Christian; Andersen, Michael A. E.

    2012-01-01

    The conversion of electrical energy into sound waves by electromechanical transducers is proportional to the current through the coil of the transducer. However, virtually all audio power amplifiers provide a controlled voltage through the interface to the transducer. This paper presents a switch-mode audio power amplifier that not only provides controlled current but is also supplied by current. This results in an output filter size reduction by a factor of 6. The implemented prototype shows decent audio performance with THD + N below 0.1 %.

  12. DOA Estimation of Audio Sources in Reverberant Environments

    DEFF Research Database (Denmark)

    Jensen, Jesper Rindom; Nielsen, Jesper Kjær; Heusdens, Richard

    2016-01-01

    Reverberation is well known to have a detrimental impact on many localization methods for audio sources. We address this problem by imposing a model for the early reflections as well as a model for the audio source itself. Using these models, we propose two iterative localization methods that estimate the direction-of-arrival (DOA) of both the direct path of the audio source and the early reflections. In these methods, the contribution of the early reflections is essentially subtracted from the signal observations before localization of the direct path component, which may reduce the estimation...

  13. Dynamically-Loaded Hardware Libraries (HLL) Technology for Audio Applications

    DEFF Research Database (Denmark)

    Esposito, A.; Lomuscio, A.; Nunzio, L. Di

    2016-01-01

    In this work, we apply hardware acceleration to embedded systems running audio applications. We present a new framework, Dynamically-Loaded Hardware Libraries or HLL, to dynamically load hardware libraries on reconfigurable platforms (FPGAs). Provided a library of application-specific processors, we load on-the-fly the specific processor in the FPGA, and we transfer the execution from the CPU to the FPGA-based accelerator. The proposed architecture provides excellent flexibility with respect to the different audio applications implemented, high quality audio, and an energy efficient solution.

  14. Can audio recording improve patients' recall of outpatient consultations?

    DEFF Research Database (Denmark)

    Wolderslund, Maiken; Kofoed, Poul-Erik; Axboe, Mette

    Introduction: In order to give patients the possibility to listen to their consultation again, we have designed a system which gives the patients access to digital audio recordings of their consultations. An Interactive Voice Response platform enables the audio recording and gives the patients access ... and those who have not (control). The audio recordings and the interviews are coded according to six themes: Test results, Treatment, Risks, Future tests, Advice and Plan. Afterwards the extent of patients' recall is assessed by comparing the accuracy of the patient’s statements (interview...

  15. A review of lossless audio compression standards and algorithms

    Science.gov (United States)

    Muin, Fathiah Abdul; Gunawan, Teddy Surya; Kartiwi, Mira; Elsheikh, Elsheikh M. A.

    2017-09-01

    Over the years, lossless audio compression has gained popularity as researchers and businesses have become more aware of the need for better quality and higher storage demand. This paper analyses various lossless audio coding algorithms and standards that are used and available in the market, focusing on Linear Predictive Coding (LPC) specifically due to its popularity and robustness in audio compression; nevertheless, other prediction methods are compared to verify this. Advanced representations of LPC such as LSP decomposition techniques are also discussed within this paper.
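    The role of prediction in lossless audio coding can be shown with a small round-trip sketch. It uses a fixed second-order integer predictor rather than adaptive LPC with estimated coefficients, which is an assumption made to keep the example short; function names are hypothetical, and the entropy-coding stage that would actually shrink the residuals is omitted.

```python
import math


def encode_residuals(samples: list[int]) -> list[int]:
    """Replace each sample by its error against the prediction 2*x[n-1] - x[n-2]."""
    residuals = []
    prev1 = prev2 = 0  # history before the first sample is taken as zero
    for x in samples:
        residuals.append(x - (2 * prev1 - prev2))
        prev2, prev1 = prev1, x
    return residuals


def decode_residuals(residuals: list[int]) -> list[int]:
    """Invert encode_residuals exactly; integer arithmetic keeps it lossless."""
    samples = []
    prev1 = prev2 = 0
    for e in residuals:
        x = e + (2 * prev1 - prev2)
        samples.append(x)
        prev2, prev1 = prev1, x
    return samples


if __name__ == "__main__":
    # Toy signal: a slowly varying sine quantised to integers.
    signal = [round(1000 * math.sin(2 * math.pi * n / 50)) for n in range(200)]
    residuals = encode_residuals(signal)
    print(decode_residuals(residuals) == signal)
    print(sum(abs(e) for e in residuals), sum(abs(x) for x in signal))
```

    For smooth signals the residuals are much smaller in magnitude than the samples, so a subsequent entropy coder needs fewer bits; the decoder reproduces the input exactly because it runs the same predictor on already-decoded samples.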

  16. Robustness evaluation of transactional audio watermarking systems

    Science.gov (United States)

    Neubauer, Christian; Steinebach, Martin; Siebenhaar, Frank; Pickel, Joerg

    2003-06-01

    Distribution via Internet is of increasing importance. Easy access, transmission and consumption of digitally represented music is very attractive to the consumer but led also directly to an increasing problem of illegal copying. To cope with this problem watermarking is a promising concept since it provides a useful mechanism to track illicit copies by persistently attaching property rights information to the material. Especially for online music distribution the use of so-called transaction watermarking, also denoted with the term bitstream watermarking, is beneficial since it offers the opportunity to embed watermarks directly into perceptually encoded material without the need of full decompression/compression. Besides the concept of bitstream watermarking, former publications presented the complexity, the audio quality and the detection performance. These results are now extended by an assessment of the robustness of such schemes. The detection performance before and after applying selected attacks is presented for MPEG-1/2 Layer 3 (MP3) and MPEG-2/4 AAC bitstream watermarking, contrasted to the performance of PCM spread spectrum watermarking.

  17. Analysis of musical expression in audio signals

    Science.gov (United States)

    Dixon, Simon

    2003-01-01

    In western art music, composers communicate their work to performers via a standard notation which specifies the musical pitches and relative timings of notes. This notation may also include some higher level information such as variations in the dynamics, tempo and timing. Famous performers are characterised by their expressive interpretation, the ability to convey structural and emotive information within the given framework. The majority of work on audio content analysis focusses on retrieving score-level information; this paper reports on the extraction of parameters describing the performance, a task which requires a much higher degree of accuracy. Two systems are presented: BeatRoot, an off-line beat tracking system which finds the times of musical beats and tracks changes in tempo throughout a performance, and the Performance Worm, a system which provides a real-time visualisation of the two most important expressive dimensions, tempo and dynamics. Both of these systems are being used to process data for a large-scale study of musical expression in classical and romantic piano performance, which uses artificial intelligence (machine learning) techniques to discover fundamental patterns or principles governing expressive performance.

  18. Video genre classification using multimodal features

    Science.gov (United States)

    Jin, Sung Ho; Bae, Tae Meon; Choo, Jin Ho; Ro, Yong Man

    2003-12-01

    We propose a video genre classification method using multimodal features. The proposed method is applied for the preprocessing of automatic video summarization or the retrieval and classification of broadcasting video contents. Through a statistical analysis of low-level and middle-level audio-visual features in video, the proposed method can achieve good performance in classifying several broadcasting genres such as cartoon, drama, music video, news, and sports. In this paper, we adopt MPEG-7 audio-visual descriptors as multimodal features of video contents and evaluate the performance of the classification by feeding the features into a decision tree-based classifier which is trained by CART. The experimental results show that the proposed method can recognize several broadcasting video genres with a high accuracy and the classification performance with multimodal features is superior to the one with unimodal features in the genre classification.

  19. Audio Arduino - an ALSA (Advanced Linux Sound Architecture) audio driver for FTDI-based Arduinos

    DEFF Research Database (Denmark)

    Dimitrov, Smilen; Serafin, Stefania

    2011-01-01

    ...be considered to be a system that encompasses design decisions on both hardware and software levels, which also demand a certain understanding of the architecture of the target PC operating system. This project outlines how an Arduino Duemilanove board (containing a USB interface chip, manufactured by Future Technology Devices International Ltd [FTDI]) can be demonstrated to behave as a full-duplex, mono, 8-bit 44.1 kHz soundcard, through an implementation of: a PC audio driver for ALSA (Advanced Linux Sound Architecture); a matching program for the Arduino's ATmega microcontroller - and nothing more...

  20. Mixed-Signal Architectures for High-Efficiency and Low-Distortion Digital Audio Processing and Power Amplification

    Directory of Open Access Journals (Sweden)

    Pierangelo Terreni

    2010-01-01

    The paper addresses the algorithmic and architectural design of digital input power audio amplifiers. A modelling platform, based on a meet-in-the-middle approach between top-down and bottom-up design strategies, allows a fast but still accurate exploration of the mixed-signal design space. Different amplifier architectures are configured and compared to find optimal trade-offs among different cost-functions: low distortion, high efficiency, low circuit complexity and low sensitivity to parameter changes. A novel amplifier architecture is derived; its prototype implements digital processing IP macrocells (oversampler, interpolating filter, PWM cross-point deriver, noise shaper, multilevel PWM modulator, dead time compensator) on a single low-complexity FPGA, while off-chip components are used only for the power output stage (LC filter and power MOS bridge); no heatsink is required. The resulting digital input amplifier features a power efficiency higher than 90% and a total harmonic distortion down to 0.13% at power levels of tens of Watts. Discussions towards the full-silicon integration of the mixed-signal amplifier in embedded devices, using BCD technology and targeting power levels of a few Watts, are also reported.

  1. Towards Structural Analysis of Audio Recordings in the Presence of Musical Variations

    Directory of Open Access Journals (Sweden)

    Müller Meinard

    2007-01-01

    One major goal of structural analysis of an audio recording is to automatically extract the repetitive structure or, more generally, the musical form of the underlying piece of music. Recent approaches to this problem work well for music where the repetitions largely agree with respect to instrumentation and tempo, as is typically the case for popular music. For other classes of music such as Western classical music, however, musically similar audio segments may exhibit significant variations in parameters such as dynamics, timbre, execution of note groups, modulation, articulation, and tempo progression. In this paper, we propose a robust and efficient algorithm for audio structure analysis, which makes it possible to identify musically similar segments even in the presence of large variations in these parameters. To account for such variations, our main idea is to incorporate invariance at various levels simultaneously: we design a new type of statistical features to absorb microvariations, introduce an enhanced local distance measure to account for local variations, and describe a new strategy for structure extraction that can cope with the global variations. Our experimental results with classical and popular music show that our algorithm performs successfully even in the presence of significant musical variations.

  2. A Perceptual Model for Sinusoidal Audio Coding Based on Spectral Integration

    Directory of Open Access Journals (Sweden)

    Jensen Søren Holdt

    2005-01-01

    Psychoacoustical models have been used extensively within audio coding applications over the past decades. Recently, parametric coding techniques have been applied to general audio and this has created the need for a psychoacoustical model that is specifically suited for sinusoidal modelling of audio signals. In this paper, we present a new perceptual model that predicts masked thresholds for sinusoidal distortions. The model relies on signal detection theory and incorporates more recent insights about spectral and temporal integration in auditory masking. As a consequence, the model is able to predict the distortion detectability. In fact, the distortion detectability defines a (perceptually relevant) norm on the underlying signal space which is beneficial for optimisation algorithms such as rate-distortion optimisation or linear predictive coding. We evaluate the merits of the model by combining it with a sinusoidal extraction method and compare the results with those obtained with the ISO MPEG-1 Layer I-II recommended model. Listening tests show a clear preference for the new model. More specifically, the model presented here leads to a reduction of more than 20% in terms of number of sinusoids needed to represent signals at a given quality level.

  3. Class D audio amplifiers for high voltage capacitive transducers

    DEFF Research Database (Denmark)

    Nielsen, Dennis

    Audio reproduction systems contain two key components, the amplifier and the loudspeaker. In the last 20 – 30 years the technology of audio amplifiers has undergone a fundamental paradigm shift. Class D audio amplifiers have replaced the linear amplifiers, which suffer from the well-known issues of high volume, weight, and cost. Highly efficient class D amplifiers are now widely available, offering power densities that their linear counterparts cannot match. Unlike the technology of audio amplifiers, the loudspeaker is still based on the traditional electrodynamic transducer invented by C.W. Rice ... with the low level of acoustical output power and complex amplifier requirements, have limited the commercial success of the technology. Horn or compression drivers are typically favoured when high acoustic output power is required; this is, however, at the expense of significant distortion combined...

  4. Aurally Aided Visual Search Performance Comparing Virtual Audio Systems

    DEFF Research Database (Denmark)

    Larsen, Camilla Horne; Lauritsen, David Skødt; Larsen, Jacob Junker

    2014-01-01

    Due to increased computational power, reproducing binaural hearing in real-time applications, through usage of head-related transfer functions (HRTFs), is now possible. This paper addresses the differences in aurally-aided visual search performance between an HRTF enhanced audio system (3D) and an ... with white dots. The results indicate that 3D audio yields faster search latencies than panning audio, especially with larger amounts of distractors. The applications of this research could fit virtual environments such as video games or virtual simulations.

  5. Aurally Aided Visual Search Performance Comparing Virtual Audio Systems

    DEFF Research Database (Denmark)

    Larsen, Camilla Horne; Lauritsen, David Skødt; Larsen, Jacob Junker

    2014-01-01

    Due to increased computational power, reproducing binaural hearing in real-time applications, through the use of head-related transfer functions (HRTFs), is now possible. This paper addresses the differences in aurally-aided visual search performance between an HRTF enhanced audio system (3D) and an... with white dots. The results indicate that 3D audio yields faster search latencies than panning audio, especially with larger amounts of distractors. The applications of this research could fit virtual environments such as video games or virtual simulations.

  6. Perancangan Sistem Audio Mobil Berbasiskan Sistem Pakar dan Web

    Directory of Open Access Journals (Sweden)

    Djunaidi Santoso

    2011-12-01

    Full Text Available Designing car audio that fits the user's needs is a fun activity. However, the design often takes more time and money than expected, since it must be discussed with experts several times. To give easy access to information on designing a car audio system and to help prevent errors, a car audio design system based on an expert system and the web was built for those who do not have sufficient time or money to consult experts directly. The system consists of tutorial modules built with the HyperText Preprocessor (PHP) and MySQL as the database. The system is evaluated using the black box testing method, which focuses on the functional needs of the application. Tests are performed by providing inputs and checking that the outputs correspond to the function of each module. The test results confirm the correspondence between input and output, which means that the program meets the initial goals of the design.

  7. Proper Use of Audio-Visual Aids: Essential for Educators.

    Science.gov (United States)

    Dejardin, Conrad

    1989-01-01

    Criticizes educators as the worst users of audio-visual aids and among the worst public speakers. Offers guidelines for the proper use of an overhead projector and the development of transparencies. (DMM)

  8. Ferrite bead effect on Class-D amplifier audio quality

    OpenAIRE

    Haddad , Kevin El; Mrad , Roberto; Morel , Florent; Pillonnet , Gael; Vollaire , Christian; Nagari , Angelo

    2014-01-01

    International audience; This paper studies the effect of ferrite beads on the audio quality of Class-D audio amplifiers. A Class-D amplifier is a switching circuit that creates high-frequency harmonics. Generally, a filter is used at the amplifier output for the sake of electromagnetic compatibility (EMC). Often, in integrated solutions, this filter contains ferrite beads, which are magnetic components and present nonlinear behavior. Time domain measurements and their equivalence in frequency do...

  9. Precision Scaling of Neural Networks for Efficient Audio Processing

    OpenAIRE

    Ko, Jong Hwan; Fromm, Josh; Philipose, Matthai; Tashev, Ivan; Zarar, Shuayb

    2017-01-01

    While deep neural networks have shown powerful performance in many audio applications, their large computation and memory demand has been a challenge for real-time processing. In this paper, we study the impact of scaling the precision of neural networks on the performance of two common audio processing tasks, namely, voice-activity detection and single-channel speech enhancement. We determine the optimal pair of weight/neuron bit precision by exploring its impact on both the performance and ...

  10. Design guidelines for audio presentation of graphs and tables

    OpenAIRE

    Brown, L.M.; Brewster, S.A.; Ramloll, S.A.; Burton, R.; Riedel, B.

    2003-01-01

    Audio can be used to make visualisations accessible to blind and visually impaired people. The MultiVis Project has carried out research into suitable methods for presenting graphs and tables to blind people through the use of both speech and non-speech audio. This paper presents guidelines extracted from this research. These guidelines will enable designers to implement visualisation systems for blind and visually impaired users, and will provide a framework for researchers wishing to invest...

  11. El Digital Audio Tape Recorder. Contra autores y creadores

    Directory of Open Access Journals (Sweden)

    Jun Ono

    2015-01-01

    Full Text Available The so-called "DAT" (short for "digital audio tape recorder") has long received coverage in the mass media of Japan and other countries as a new and controversial electronic audio product of the Japanese electronics industry. What has become of the object of this controversy?

  12. IELTS speaking instruction through audio/voice conferencing

    Directory of Open Access Journals (Sweden)

    Hamed Ghaemi

    2012-02-01

    Full Text Available The current study aims at investigating the impact of audio/voice conferencing, as a new approach to teaching speaking, on the speaking performance and/or speaking band score of IELTS candidates. Experimental group subjects participated in an audio conferencing class, while those of the control group attended a traditional IELTS Speaking class. At the end of the study, all subjects participated in an IELTS Examination held on November fourth in Tehran, Iran. To compare the group means for the study, an independent t-test analysis was employed. The difference between the experimental and control groups was considered to be statistically significant (P<0.01). That is, the candidates in the experimental group outperformed the ones in the control group in IELTS Speaking test scores.
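
    As a rough illustration of the independent t-test mentioned in this abstract, the sketch below computes a pooled-variance (Student's) t statistic in Python. The band scores are invented purely for illustration and are not data from the study, and the function name `independent_t` is hypothetical.

```python
import math
import statistics

def independent_t(group_a, group_b):
    """Return (t, degrees_of_freedom) for a pooled-variance two-sample t-test."""
    n_a, n_b = len(group_a), len(group_b)
    mean_a, mean_b = statistics.mean(group_a), statistics.mean(group_b)
    # statistics.variance is the sample variance (divides by n - 1).
    var_a, var_b = statistics.variance(group_a), statistics.variance(group_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    std_err = math.sqrt(pooled_var * (1.0 / n_a + 1.0 / n_b))
    return (mean_a - mean_b) / std_err, df

# Hypothetical band scores, made up for this example (NOT from the study).
experimental = [6.5, 7.0, 6.0, 7.5, 6.5]
control = [5.5, 6.0, 5.0, 6.5, 6.0]
t_stat, df = independent_t(experimental, control)
print(f"t = {t_stat:.3f}, df = {df}")
```

    The resulting statistic would then be compared against the critical value of the t distribution with `df` degrees of freedom at the chosen significance level (the abstract reports P<0.01).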

  13. Digital signal processing methods and algorithms for audio conferencing systems

    OpenAIRE

    Lindström, Fredric

    2007-01-01

    Today, we are interconnected almost all over the planet. Large multinational companies operate worldwide, and an increasing number of small and medium-sized companies also do business overseas. As people travel to meet and do business, the already exposed earth is subject to even more strain. Audio conferencing is an attractive alternative to travel, which is becoming more and more appreciated. Audio conferences can of course not replace all types of meetings, but can help companies to cut ...

  14. Automated processing of massive audio/video content using FFmpeg

    Directory of Open Access Journals (Sweden)

    Kia Siang Hock

    2014-01-01

    Full Text Available Audio and video content forms an integral, important and expanding part of the digital collections in libraries and archives world-wide. While these memory institutions are familiar with and well-versed in the management of more conventional materials such as books, periodicals, ephemera and images, the handling of audio (e.g., oral history recordings) and video content (e.g., audio-visual recordings, broadcast content) requires additional toolkits. In particular, a robust and comprehensive tool that provides a programmable interface is indispensable when dealing with tens of thousands of hours of audio and video content. FFmpeg is comprehensive and well-established open source software that is capable of the full range of audio/video processing tasks (such as encode, decode, transcode, mux, demux, stream and filter). It is also capable of handling a wide range of audio and video formats, a unique challenge in memory institutions. It comes with a command line interface, as well as a set of developer libraries that can be incorporated into applications.
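
    The kind of scripted batch processing this abstract describes could be sketched as below, assuming the `ffmpeg` command-line tool is installed and on the PATH. The directory layout, the choice of FLAC as the target format, and the function names `build_ffmpeg_command` and `transcode_all` are assumptions made for illustration; they are not taken from the article.

```python
import subprocess
from pathlib import Path

def build_ffmpeg_command(src: Path, dst_dir: Path) -> list[str]:
    """Build an ffmpeg command that transcodes the audio of `src` to FLAC."""
    dst = dst_dir / (src.stem + ".flac")
    return [
        "ffmpeg",
        "-n",            # never overwrite an existing output file
        "-i", str(src),  # input file
        "-vn",           # drop any video stream
        "-c:a", "flac",  # encode audio with the FLAC codec
        str(dst),
    ]

def transcode_all(src_dir: Path, dst_dir: Path, pattern: str = "*.wav") -> list[Path]:
    """Transcode every matching file; return the inputs for which ffmpeg failed."""
    dst_dir.mkdir(parents=True, exist_ok=True)
    failures = []
    for src in sorted(src_dir.glob(pattern)):
        result = subprocess.run(build_ffmpeg_command(src, dst_dir),
                                capture_output=True, text=True)
        if result.returncode != 0:
            failures.append(src)
    return failures
```

    A call such as `transcode_all(Path("recordings"), Path("converted"))` (hypothetical paths) would process one directory. Collecting failures instead of stopping at the first error is a design choice that tends to suit large, unattended batches.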

  15. Online feature selection with streaming features.

    Science.gov (United States)

    Wu, Xindong; Yu, Kui; Ding, Wei; Wang, Hao; Zhu, Xingquan

    2013-05-01

    We propose a new online feature selection framework for applications with streaming features where the knowledge of the full feature space is unknown in advance. We define streaming features as features that flow in one by one over time whereas the number of training examples remains fixed. This is in contrast with traditional online learning methods that only deal with sequentially added observations, with little attention being paid to streaming features. The critical challenges for Online Streaming Feature Selection (OSFS) include 1) the continuous growth of feature volumes over time, 2) a large feature space, possibly of unknown or infinite size, and 3) the unavailability of the entire feature set before learning starts. In the paper, we present a novel Online Streaming Feature Selection method to select strongly relevant and nonredundant features on the fly. An efficient Fast-OSFS algorithm is proposed to improve feature selection performance. The proposed algorithms are evaluated extensively on high-dimensional datasets and also with a real-world case study on impact crater detection. Experimental results demonstrate that the algorithms achieve better compactness and higher prediction accuracy than existing streaming feature selection algorithms.
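
    To make the streaming setting concrete, here is a deliberately simplified Python sketch in which features arrive one at a time and each is kept only if it is relevant to the target and not redundant with features already kept. It uses plain Pearson-correlation thresholds as a stand-in; it is not the OSFS or Fast-OSFS algorithm from the paper, and the thresholds and function names are assumptions.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)

def select_streaming(feature_stream, target, relevance=0.3, redundancy=0.9):
    """Consume (name, column) pairs one by one and return the names that were kept."""
    selected = {}
    for name, column in feature_stream:
        # Relevance check: discard features weakly correlated with the target.
        if abs(pearson(column, target)) < relevance:
            continue
        # Redundancy check: discard features nearly collinear with a kept one.
        if any(abs(pearson(column, kept)) >= redundancy for kept in selected.values()):
            continue
        selected[name] = column
    return list(selected)

# Toy data, invented for illustration only.
target = [1, 2, 3, 4, 5, 6]
stream = [
    ("f1", [1, 2, 3, 4, 5, 7]),
    ("f2", [2, 4, 6, 8, 10, 14]),  # exact multiple of f1
    ("f3", [3, 1, 2, 3, 1, 2]),    # weakly related to the target
    ("f4", [1, 3, 2, 5, 3, 4]),
]
kept = select_streaming(iter(stream), target)
print(kept)
```

    Note that the number of training examples (the length of each column) stays fixed while the set of candidate features grows, which mirrors the setting the abstract describes.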

  16. Análisis de viabilidad y desarrollo de una app de audio

    OpenAIRE

    Vega Díaz, Jorge

    2016-01-01

    Mobile communication terminals incorporate systems for recording and playing back audio signals. The project aims to analyse different real-time signal processing functions, incorporating filtering, spectral analysis and editing. This project consists of the design and implementation of an audio editing tool for Android devices, with features similar to those found on operating systems like Linux, Windows or Mac. This app should have the basic editing operations, such as aud...

  17. Determination of over current protection thresholds for class D audio amplifiers

    DEFF Research Database (Denmark)

    Nyboe, Flemming; Risbo, L; Andreani, Pietro

    2005-01-01

    Monolithic class-D audio amplifiers typically feature built-in over current protection circuitry that shuts down the amplifier in case of a short circuit on the output speaker terminals. To minimize cost, the threshold at which the device shuts down must be set just above the maximum current that can flow in the loudspeaker during normal operation. The current required is determined by the complex loudspeaker impedance and properties of the music signals played. This work presents a statistical analysis of peak output currents when playing music on typical loudspeakers for home entertainment.

  18. Acoustic Heritage and Audio Creativity: the Creative Application of Sound in the Representation, Understanding and Experience of Past Environments

    Directory of Open Access Journals (Sweden)

    Damian Murphy

    2017-06-01

    Full Text Available Acoustic Heritage is one aspect of archaeoacoustics, and refers more specifically to the quantifiable acoustic properties of buildings, sites and landscapes from our architectural and archaeological past, forming an important aspect of our intangible cultural heritage. Auralisation, the audio equivalent of 3D visualisation, enables these acoustic properties, captured via the process of measurement and survey, or computer-based modelling, to form the basis of an audio reconstruction and presentation of the studied space. This article examines the application of auralisation and audio creativity as a means to explore our acoustic heritage, thereby diversifying and enhancing the toolset available to the digital heritage or humanities researcher. The Open Acoustic Impulse Response (OpenAIR library is an online repository for acoustic impulse response and auralisation data, with a significant part having been gathered from a broad range of heritage sites. The methodology used to gather this acoustic data is discussed, together with the processes used in generating and calibrating a comparable computer model, and how the data generated might be analysed and presented. The creative use of this acoustic data is also considered, in the context of music production, mixed media artwork and audio for gaming. More relevant to digital heritage is how these data can be used to create new experiences of past environments, as information, interpretation, guide or artwork and ultimately help to articulate new research questions and explorations of our acoustic heritage.

  19. Comparing observer models and feature selection methods for a task-based statistical assessment of digital breast tomosynthesis in reconstruction space

    Science.gov (United States)

    Park, Subok; Zhang, George Z.; Zeng, Rongping; Myers, Kyle J.

    2014-03-01

    A task-based assessment of image quality for digital breast tomosynthesis (DBT) can be done in either the projected or reconstructed data space. As the choice of observer models and feature selection methods can vary depending on the type of task and data statistics, we previously investigated the performance of two channelized-Hotelling observer models in conjunction with 2D Laguerre-Gauss (LG) and two implementations of partial least squares (PLS) channels along with that of the Hotelling observer in binary detection tasks involving DBT projections. The difference in these observers lies in how the spatial correlation in DBT angular projections is incorporated in the observer's strategy to perform the given task. In the current work, we extend our method to the reconstructed data space of DBT. We investigate how various model observers, including the aforementioned, compare for performing the binary detection of a spherical signal embedded in structured breast phantoms with the use of DBT slices reconstructed via filtered back projection. We explore how well the model observers incorporate the spatial correlation between different numbers of reconstructed DBT slices while varying the number of projections. For this, relatively small and large scan angles (24° and 96°) are used for comparison. Our results indicate that 1) given a particular scan angle, the number of projections needed to achieve the best performance for each observer is similar across all observer/channel combinations, i.e., Np = 25 for scan angle 96° and Np = 13 for scan angle 24°, and 2) given these sufficient numbers of projections, the number of slices for each observer to achieve the best performance differs depending on the channel/observer types, which is more pronounced in the narrow scan angle case.

  20. The Fungible Audio-Visual Mapping and its Experience

    Directory of Open Access Journals (Sweden)

    Adriana Sa

    2014-12-01

    Full Text Available This article draws a perceptual approach to audio-visual mapping. Clearly perceivable cause and effect relationships can be problematic if one desires the audience to experience the music. Indeed, perception would bias those sonic qualities that fit previous concepts of causation, subordinating other sonic qualities, which may form the relations between the sounds themselves. The question is: how can an audio-visual mapping produce a sense of causation and simultaneously confound the actual cause-effect relationships? We call this a fungible audio-visual mapping. Our aim here is to glean its constitution and aspect. We will report a study, which draws upon methods from experimental psychology to inform audio-visual instrument design and composition. The participants are shown several audio-visual mapping prototypes, after which we pose quantitative and qualitative questions regarding their sense of causation, and their sense of understanding the cause-effect relationships. The study shows that a fungible mapping requires both synchronized and seemingly non-related components, that is, sufficient complexity to be confusing. As the specific cause-effect concepts remain inconclusive, the sense of causation embraces the whole.

  1. Imagination and Modern Audio Visual Form

    Directory of Open Access Journals (Sweden)

    Ana Đurković

    2017-09-01

    Full Text Available Through three episodes, Archetype of modern fairy tales, the mysterious world of fantasy and reality, tells us a serious story about archetypes, symbols, and the knowledge of good and evil. Rts editor: Natasa Neskovic. Written and directed by: Suncica Jergovic. Editing: Ana Djurkovic. How can one illuminate the concept of fantasy and the affective factors in our imagination through something that is a priori so imaginary, by its genetic provenance, as a movie scene or a digital picture and sound? One cannot always avoid the association with a valid phrase of Arnhajm's truth: mass age, massage: the medium is the message. In the elementary and terse definition of "the shot" in Plaževsky's film language there is a term for "le cadre"; however, these are selected bits of reality, an immanent frame that contains the individual act of images divided from the continent's view of reality, handling the specific code of semantic value, when it is imaginative, of course, by aesthetic categories and evaluations. In this type of positive simulacrum, there can be no better segment for current thinking about the limits of imagination and truth in contemporary media, and in the contemporary global environment, than the original audio-visual forms through whose prism we search through a fairy tale, at the same time myth and imagination, as well as explore its overall impact on the personality. Everything can be a fairy tale, even false, amoral platitudes politicized by political lobbies in contemporary existing power systems, but there is no fairy-tale authenticity in it, no creative act, nor the humanity and the artistic and historical entity of man that is always present in the ethical effort of a true artist. So, we are investigating the conditions of creative images and the modalities of audiovisual media in film language, and it is the archetype of the fairy tale which, with its psychodynamics, still exists and which is removed when the modern man is tired of lies and simulations during his global

  2. One Message, Many Voices: Mobile Audio Counselling in Health Education.

    Science.gov (United States)

    Pimmer, Christoph; Mbvundula, Francis

    2018-01-01

    Health workers' use of counselling information on their mobile phones for health education is a central but little understood phenomenon in numerous mobile health (mHealth) projects in Sub-Saharan Africa. Drawing on empirical data from an interpretive case study in the setting of the Millennium Villages Project in rural Malawi, this research investigates the ways in which community health workers (CHWs) perceive that audio-counselling messages support their health education practice. Three main themes emerged from the analysis: phone-aided audio counselling (1) legitimises the CHWs' use of mobile phones during household visits; (2) helps CHWs to deliver a comprehensive counselling message; (3) supports CHWs in persuading communities to change their health practices. The findings show the complexity and interplay of the multi-faceted, sociocultural, political, and socioemotional meanings associated with audio-counselling use. Practical implications and the demand for further research are discussed.

  3. Efficiency Optimization in Class-D Audio Amplifiers

    DEFF Research Database (Denmark)

    Yamauchi, Akira; Knott, Arnold; Jørgensen, Ivan Harald Holger

    2015-01-01

    This paper presents a new power efficiency optimization routine for designing Class-D audio amplifiers. The proposed optimization procedure finds design parameters for the power stage and the output filter, and the optimum switching frequency, such that the weighted power losses are minimized under the given constraints. The optimization routine is applied to minimize the power losses in a 130 W class-D audio amplifier based on consumer behavior investigations, where the amplifier operates at idle and low power levels most of the time. Experimental results demonstrate that the optimization method can lead to an efficiency improvement of around 30 % at 1.3 W output power without significant effects on either the audio performance or the efficiency at high power levels.

  4. Sistema de adquisición y procesamiento de audio

    OpenAIRE

    Pérez Segurado, Rubén

    2015-01-01

    The aim of this project is the design and implementation of a platform for an audio processing system. The system will receive an analog audio signal from an audio source, allow digital processing of that signal, and generate a processed signal that will be sent to external loudspeakers. The processing system will be built using: - A Lattice FPGA device, model MachX02-7000-HE, which will contain all the...

  5. Music Identification System Using MPEG-7 Audio Signature Descriptors

    Science.gov (United States)

    You, Shingchern D.; Chen, Wei-Hwa; Chen, Woei-Kae

    2013-01-01

    This paper describes a multiresolution system based on MPEG-7 audio signature descriptors for music identification. Such an identification system may be used to detect illegally copied music circulated over the Internet. In the proposed system, low-resolution descriptors are used to search likely candidates, and then full-resolution descriptors are used to identify the unknown (query) audio. With this arrangement, the proposed system achieves both high speed and high accuracy. To deal with the problem that a piece of query audio may not be inside the system's database, we suggest two different methods to find the decision threshold. Simulation results show that the proposed method II can achieve an accuracy of 99.4% for query inputs both inside and outside the database. Overall, it is highly possible to use the proposed system for copyright control. PMID:23533359
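
    The coarse-to-fine idea in this abstract (shortlist with low-resolution descriptors, confirm with full-resolution descriptors, reject with a decision threshold) can be sketched as follows. This is a minimal illustration under stated assumptions: descriptors are plain lists of numbers, the low-resolution version is a block average, and the distance, shortlist size and threshold are arbitrary. It does not implement the MPEG-7 AudioSignature descriptor or the paper's threshold methods.

```python
def downsample(vec, factor):
    """Low-resolution descriptor: average consecutive blocks of `factor` values."""
    blocks = [vec[i:i + factor] for i in range(0, len(vec), factor)]
    return [sum(block) / len(block) for block in blocks]

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length descriptors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def identify(query, database, factor=4, shortlist=2, threshold=0.5):
    """Return the best-matching name, or None if the query looks out-of-database."""
    coarse_query = downsample(query, factor)
    # Stage 1: rank every entry using the cheap low-resolution descriptors.
    ranked = sorted(
        database,
        key=lambda name: sq_dist(coarse_query, downsample(database[name], factor)),
    )
    # Stage 2: compare full-resolution descriptors for the shortlist only.
    best_name, best_dist = None, float("inf")
    for name in ranked[:shortlist]:
        dist = sq_dist(query, database[name])
        if dist < best_dist:
            best_name, best_dist = name, dist
    # Decision threshold: reject queries that match nothing closely enough.
    return best_name if best_dist <= threshold else None

# Toy descriptors, invented for illustration only.
database = {"a": [0.0] * 8, "b": [1.0] * 8, "c": [0.0, 1.0] * 4}
print(identify([0.9] * 8, database))
print(identify([5.0] * 8, database))
```

    In practice the low-resolution descriptors would be precomputed and stored rather than derived at query time; they are recomputed here only to keep the sketch short.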

  6. Music Identification System Using MPEG-7 Audio Signature Descriptors

    Directory of Open Access Journals (Sweden)

    Shingchern D. You

    2013-01-01

    Full Text Available This paper describes a multiresolution system based on MPEG-7 audio signature descriptors for music identification. Such an identification system may be used to detect illegally copied music circulated over the Internet. In the proposed system, low-resolution descriptors are used to search likely candidates, and then full-resolution descriptors are used to identify the unknown (query) audio. With this arrangement, the proposed system achieves both high speed and high accuracy. To deal with the problem that a piece of query audio may not be inside the system's database, we suggest two different methods to find the decision threshold. Simulation results show that the proposed method II can achieve an accuracy of 99.4% for query inputs both inside and outside the database. Overall, it is highly possible to use the proposed system for copyright control.

  7. Technical Evaluation Report 31: Internet Audio Products (3/3)

    Directory of Open Access Journals (Sweden)

    Jim Rudolph

    2004-08-01

    Full Text Available Two contrasting additions to the online audio market are reviewed: iVocalize, a browser-based audio-conferencing software, and Skype, a PC-to-PC Internet telephone tool. These products are selected for review on the basis of their success in gaining rapid popular attention and usage during 2003-04. The iVocalize review emphasizes the product’s role in the development of a series of successful online audio communities – notably several serving visually impaired users. The Skype review stresses the ease with which the product may be used for simultaneous PC-to-PC communication among up to five users. Editor’s Note: This paper serves as an introduction to reports about online community building, and reviews of online products for disabled persons, in the next ten reports in this series. JPB, Series Ed.

  8. Class-D audio amplifiers with negative feedback

    OpenAIRE

    Cox, Stephen M.; Candy, B. H.

    2006-01-01

    There are many different designs for audio amplifiers. Class-D, or switching, amplifiers generate their output signal in the form of a high-frequency square wave of variable duty cycle (ratio of on time to off time). The square-wave nature of the output allows a particularly efficient output stage, with minimal losses. The output is ultimately filtered to remove components of the spectrum above the audio range. Mathematical models are derived here for a variety of related class-D amplifier de...
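
    The principle described here, a two-level square wave whose duty cycle carries the audio signal and is recovered by low-pass filtering, can be checked numerically with a small Python sketch. It assumes an ideal comparator, a triangular carrier, a constant input level, and an arbitrary 250 kHz carrier frequency; these choices are illustrative and are not the mathematical models derived in the paper.

```python
def triangle(t, carrier_hz):
    """Triangular carrier in [-1, 1]; starts at +1, reaches -1 at mid-period."""
    phase = (t * carrier_hz) % 1.0
    return 4.0 * abs(phase - 0.5) - 1.0

def pwm_output(signal_value, carrier_value):
    """Ideal comparator: output is +1 while the signal exceeds the carrier, else -1."""
    return 1.0 if signal_value > carrier_value else -1.0

def average_over_period(level, carrier_hz=250_000.0, steps=10_000):
    """Mean of the square-wave output over one carrier period for a constant input.

    Averaging over a period is a crude stand-in for the output low-pass filter.
    """
    total = 0.0
    for k in range(steps):
        t = k / (steps * carrier_hz)
        total += pwm_output(level, triangle(t, carrier_hz))
    return total / steps

for level in (-0.5, 0.0, 0.3):
    print(level, round(average_over_period(level), 3))
```

    For a constant input between -1 and 1, the period average of the square wave tracks the input level, which is why filtering out the components above the audio range recovers the signal.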

  9. A second-order class-D audio amplifier

    OpenAIRE

    Cox, Stephen M.; Tan, M.T.; Yu, J.

    2011-01-01

    Class-D audio amplifiers are particularly efficient, and this efficiency has led to their ubiquity in a wide range of modern electronic appliances. Their output takes the form of a high-frequency square wave whose duty cycle (ratio of on-time to off-time) is modulated at low frequency according to the audio signal. A mathematical model is developed here for a second-order class-D amplifier design (i.e., containing one second-order integrator) with negative feedback. We derive exact expression...

  10. Design of a WAV audio player based on K20

    Directory of Open Access Journals (Sweden)

    Xu Yu

    2016-01-01

    Full Text Available The designed player uses the Freescale Company's MK20DX128VLH7 as the core control chip, and its hardware platform is equipped with a VS1003 audio decoder, an OLED display interface, a USB interface and an SD card slot. The player uses the open source embedded real-time operating system μC/OS-II, Freescale USB Stack V4.1.1 and FATFS, and a graphical user interface based on CGUI is developed to improve the user experience. In general, the designed WAV audio player has strong applicability and good practical value.

  11. Cambridge English First 2 audio CDs : authentic examination papers

    CERN Document Server

    2016-01-01

    Four authentic Cambridge English Language Assessment examination papers for the Cambridge English: First (FCE) exam. These examination papers for the Cambridge English: First (FCE) exam provide the most authentic exam preparation available, allowing candidates to familiarise themselves with the content and format of the exam and to practise useful exam techniques. The Audio CDs contain the recorded material to allow thorough preparation for the Listening paper and are designed to be used with the Student's Book. A Student's Book with or without answers and a Student's Book with answers and downloadable Audio are available separately. These tests are also available as Cambridge English: First Tests 5-8 on Testbank.org.uk

  12. Audio engineering 101 a beginner's guide to music production

    CERN Document Server

    Dittmar, Tim

    2013-01-01

    Audio Engineering 101 is a real world guide for starting out in the recording industry. If you have the dream, the ideas, the music and the creativity but don't know where to start, then this book is for you! Filled with practical advice on how to navigate the recording world, from an author with first-hand, real-life experience, Audio Engineering 101 will help you succeed in the exciting, but tough and confusing, music industry. Covering all you need to know about the recording process, from the characteristics of sound to a guide to microphones to analog versus digital

  13. Animation, audio, and spatial ability: Optimizing multimedia for scientific explanations

    Science.gov (United States)

    Koroghlanian, Carol May

    This study investigated the effects of audio, animation and spatial ability in a computer based instructional program for biology. The program presented instructional material via text or audio with lean text and included eight instructional sequences presented either via static illustrations or animations. High school students enrolled in a biology course were blocked by spatial ability and randomly assigned to one of four treatments (Text-Static Illustration, Audio-Static Illustration, Text-Animation, Audio-Animation). The study examined the effects of instructional mode (Text vs. Audio), illustration mode (Static Illustration vs. Animation) and spatial ability (Low vs. High) on practice and posttest achievement, attitude and time. Results for practice achievement indicated that high spatial ability participants achieved more than low spatial ability participants. Similar results for posttest achievement and spatial ability were not found. Participants in the Static Illustration treatments achieved the same as participants in the Animation treatments on both the practice and posttest. Likewise, participants in the Text treatments achieved the same as participants in the Audio treatments on both the practice and posttest. In terms of attitude, participants responded favorably to the computer based instructional program. They found the program interesting, felt the static illustrations or animations made the explanations easier to understand and concentrated on learning the material. Furthermore, participants in the Animation treatments felt the information was easier to understand than participants in the Static Illustration treatments. However, no difference for any attitude item was found for participants in the Text as compared to those in the Audio treatments. Significant differences were found by Spatial Ability for three attitude items concerning concentration and interest. In all three items, the low spatial ability participants responded more positively

  14. Minimizing Crosstalk in Self Oscillating Switch Mode Audio Power Amplifiers

    DEFF Research Database (Denmark)

    Knott, Arnold; Ploug, Rasmus Overgaard

    2012-01-01

    The varying switching frequencies of self oscillating switch mode audio amplifiers have been known to cause interchannel intermodulation disturbances in multi channel configurations. This crosstalk phenomenon has a negative impact on the audio performance. The goal of this paper is to present a method to minimize this phenomenon by improving the integrity of the various power distribution systems of the amplifier. The method is then applied to an amplifier built for this investigation. The results show that the crosstalk is suppressed by 30 dB, but is not entirely eliminated...

  15. Can audio recording of outpatient consultations improve patient outcome?

    DEFF Research Database (Denmark)

    Wolderslund, Maiken; Kofoed, Poul-Erik; Axboe, Mette

    ... different departments: Orthopedics, Urology, Internal Medicine and Pediatrics. A total of 5,460 patients will be included from the outpatient clinics. All patients randomized to an intervention group are offered audio recording of their consultation. An Interactive Voice Response platform enables an audio... The intervention will be evaluated using a questionnaire measuring different aspects of patients' recall and understanding of the information given, patients' need for additional information subsequent to the consultation and their overall satisfaction with the consultation. Results: The study will be conducted from...

  16. AKTIVITAS SEKUNDER AUDIO UNTUK MENJAGA KEWASPADAAN PENGEMUDI MOBIL INDONESIA

    Directory of Open Access Journals (Sweden)

    Iftikar Zahedi Sutalaksana

    2013-03-01

    Full Text Available The rate of traffic accidents involving cars in Indonesia is increasingly alarming. The large role of the human factor as the main cause of accidents deserves attention. Decreased alertness while driving due to drowsiness or fatigue is one of the conditions that contributes to accidents. This paper describes the application of an audio response test as a secondary activity while driving a car. The response test is a set of applications on the car dashboard that requires a response from the driver each time a sound stimulus is presented. The audio response test is proposed as a monitor of the driver's level of alertness while driving. Driver alertness is the condition of being awake, watchful, and able to process all stimuli well while driving. The study produced a form of audio response test that is integrated with the driving system inside the car. The sound is played at a constant intensity of 80-85 dB. The sound stops when the driver responds to the sound stimulus. The response test is designed to monitor the driver's level of alertness while driving. Its application is expected to help reduce the rate of traffic accidents in Indonesia. Keywords: driving, secondary activity, audio, alertness, response test

  17. A conceptual framework for audio-visual museum media

    DEFF Research Database (Denmark)

    Kirkedahl Lysholm Nielsen, Mikkel

    2017-01-01

    In today's history museums, the past is communicated through many other means than original artefacts. This interdisciplinary and theoretical article suggests a new approach to studying the use of audio-visual media, such as film, video and related media types, in a museum context. The centre...... and museum studies, existing case studies, and real life observations, the suggested framework instead stresses particular characteristics of contextual use of audio-visual media in history museums, such as authenticity, virtuality, interactivity, social context and spatial attributes of the communication...

  18. The Single- and Multichannel Audio Recordings Database (SMARD)

    DEFF Research Database (Denmark)

    Nielsen, Jesper Kjær; Jensen, Jesper Rindom; Jensen, Søren Holdt

    2014-01-01

    A new single- and multichannel audio recordings database (SMARD) is presented in this paper. The database contains recordings from a box-shaped listening room for various loudspeaker and array types. The recordings were made for 48 different configurations of three different loudspeakers and four...... different microphone arrays. In each configuration, 20 different audio segments were played and recorded, ranging from simple artificial sounds to polyphonic music. SMARD can be used for testing algorithms developed for numerous applications, and we give examples of source localisation results....

  19. Deep Complementary Bottleneck Features for Visual Speech Recognition

    NARCIS (Netherlands)

    Petridis, Stavros; Pantic, Maja

    Deep bottleneck features (DBNFs) have been used successfully in the past for acoustic speech recognition from audio. However, research on extracting DBNFs for visual speech recognition is very limited. In this work, we present an approach to extract deep bottleneck visual features based on deep

  20. Analysis of current-bidirectional buck-boost based switch-mode audio amplifier

    DEFF Research Database (Denmark)

    Bolten Maizonave, Gert; Andersen, Michael A. E.; Kjærgaard, Claus

    2011-01-01

    The following study was carried out in order to assess quantitatively the performance of the buck-boost converter when used as a switch-mode audio amplifier. It comprises, to begin with, the delimitation of design criteria based on the state-of-the-art solution, which is based...... in a differential mode buck-based amplifier with a boost converter as power supply. The averaged switch modelling of the differential mode current bidirectional topology is also used, in order to analyze the steady state and frequency-wise behaviour of this converter and parameterize it to meet the design criteria....... Next, several piecewise-linear simulation results are shown with detail enough to emphasize the features of the converter. A simple prototype is implemented to verify the main predicted features. Presently no previous publication could be found containing a thorough analysis of this topology...

  1. Do MRI features at baseline predict radiographic joint space narrowing in the medial compartment of the osteoarthritic knee 2 years later?

    Energy Technology Data Exchange (ETDEWEB)

    Madan-Sharma, Ruby; Kornaat, Peter R.; Bloem, Johannes L.; Watt, Iain [Leiden University Medical Center, Department of Radiology, Leiden (Netherlands); Kloppenburg, Margreet; Botha-Scheepers, Stella A. [Leiden University Medical Center, Department of Rheumatology, Leiden (Netherlands); Graverand, Marie-Pierre Hellio le [Pfizer Groton, Groton, CT (United States)

    2008-09-15

    The purpose of the study was to relate magnetic resonance imaging (MRI) features at baseline with radiographically determined joint space narrowing (JSN) in the medial compartment of the knee after 2 years in a group of patients with symptomatic osteoarthritis at multiple joint sites. MRI of the knee and standardized radiographs were obtained at baseline and after 2 years in 186 patients (81% female; aged 43-76 years; mean 60 years). MRI was analyzed for bone marrow lesions, cysts, osteophytes, hyaline cartilage defects, joint effusion, and meniscal pathology in the medial compartment. Radiographs were scored semiquantitatively for JSN in the medial tibiofemoral joint using the Osteoarthritis Research Society International (OARSI) atlas. Radiological progression was defined as ≥1 grade increase. Associations between baseline magnetic resonance (MR) parameters and subsequent radiographic JSN changes were assessed using logistic regression. Relative risk (RR) was then calculated. Radiographic progression of JSN was observed in 17 (9.1%) of 186 patients. Eleven patients had a Kellgren and Lawrence (KL) score of ≥2. A significant association was observed between all patients and meniscal tears (RR 3.57; confidence interval (CI) 1.08-10.0) and meniscal subluxation (RR 2.73; CI 1.20-5.41), between KL<2 and meniscal subluxation (RR 11.3; CI 2.49-29.49) and KL ≥2 and meniscus tears (RR 8.91; CI 1.13-22.84) and radiographic JSN 2 years later. Follow-up MR in 15 of 17 patients with progressive JSN showed only new meniscal abnormalities and no progression of cartilage loss. Meniscal pathology (tears and/or meniscal subluxation) was the only MRI parameter to be associated with subsequent radiographic progression of JSN in the medial tibiofemoral compartment on a radiograph 2 years later, as assessed by the OARSI score. (orig.)

  2. Evaluation of an Audio Cassette Tape Lecture Course

    Science.gov (United States)

    Blank, Jerome W.

    1975-01-01

    An audio-cassette continuing education course (Selected Topics in Pharmacology) from Extension Services in Pharmacy at the University of Wisconsin was offered to a selected test market of pharmacists and evaluated using a pre-, post-test design. Results showed significant increase in cognitive knowledge and strong approval of students. (JT)

  3. Subband coding of digital audio signals without loss of quality

    NARCIS (Netherlands)

    Veldhuis, Raymond N.J.; Breeuwer, Marcel; van de Waal, Robbert

    1989-01-01

    A subband coding system for high quality digital audio signals is described. To achieve low bit rates at a high quality level, it exploits the simultaneous masking effect of the human ear. It is shown how this effect can be used in an adaptive bit-allocation scheme. The proposed approach has been

  4. Audio-visual materials usage preference among agricultural ...

    African Journals Online (AJOL)

    It was found that respondents preferred radio, television, poster, advert, photographs, specimen, bulletin, magazine, cinema, videotape, chalkboard, and bulletin board as audio-visual materials for extension work. These are the materials that can easily be manipulated and utilized for extension work. Nigerian Journal of ...

  5. Streaming Audio and Video: New Challenges and Opportunities for Museums.

    Science.gov (United States)

    Spadaccini, Jim

    Streaming audio and video present new challenges and opportunities for museums. Streaming media is easier to author and deliver to Internet audiences than ever before; digital video editing is commonplace now that the tools--computers, digital video cameras, and hard drives--are so affordable; the cost of serving video files across the Internet…

  6. A Power Efficient Audio Amplifier Combining Switching and Linear Techniques

    NARCIS (Netherlands)

    van der Zee, Ronan A.R.; van Tuijl, Adrianus Johannes Maria

    1998-01-01

    Integrated Class D audio amplifiers are very power efficient, but require an external filter which prevents further integration. Also due to this filter, large feedback factors are hard to realise, so that the load influences the distortion- and transfer characteristics. The amplifier presented in

  7. Improved Techniques for Automatic Chord Recognition from Music Audio Signals

    Science.gov (United States)

    Cho, Taemin

    2014-01-01

    This thesis is concerned with the development of techniques that facilitate the effective implementation of capable automatic chord transcription from music audio signals. Since chord transcriptions can capture many important aspects of music, they are useful for a wide variety of music applications and also useful for people who learn and perform…

  8. Haptic and Visual feedback in 3D Audio Mixing Interfaces

    DEFF Research Database (Denmark)

    Gelineck, Steven; Overholt, Daniel

    2015-01-01

    This paper describes the implementation and informal evaluation of a user interface that explores haptic feedback for 3D audio mixing. The implementation compares different approaches using either the LEAP Motion for mid-air hand gesture control, or the Novint Falcon for active haptic feedback...

  9. Audio-Visual Aid in Teaching "Fatty Liver"

    Science.gov (United States)

    Dash, Sambit; Kamath, Ullas; Rao, Guruprasad; Prakash, Jay; Mishra, Snigdha

    2016-01-01

    Use of audio-visual tools to aid in medical education is ever on the rise. Our study intends to find the efficacy of a video prepared on "fatty liver," a topic that is often a challenge for pre-clinical teachers, in enhancing cognitive processing and ultimately learning. We prepared a video presentation of 11:36 min, incorporating various…

  10. Studies on a Spatialized Audio Interface for Sonar

    Science.gov (United States)

    2011-10-03

    addition of spatialized audio to visual displays for sonar is much akin to the development of talking movies in the early days of cinema and can be...than using the brute-force approach. PCA is one among several techniques that share similarities with the computational architecture of a

  11. The Role of Audio Media in the Lives of Children.

    Science.gov (United States)

    Christenson, Peter G.; Lindlof, Thomas R.

    Mass communication researchers have largely ignored the role of audio media and popular music in the lives of children, yet the available evidence shows that children do listen. Extant studies yield a consistent developmental portrait of children's listening frequency, but there is a notable lack of programmatic research over the past decade, one…

  12. The relationship between basic audio quality and overall listening experience.

    Science.gov (United States)

    Schoeffler, Michael; Herre, Jürgen

    2016-09-01

    Basic audio quality (BAQ) is a well-known perceptual attribute, which is rated in various listening test methods to measure the performance of audio systems. Unfortunately, when it comes to purchasing audio systems, BAQ might not have a significant influence on the customers' buying decisions since other factors, like brand loyalty, might be more important. In contrast to BAQ, overall listening experience (OLE) is an affective attribute which incorporates all aspects that are important to an individual assessor, including his or her preference for music genre and audio quality. In this work, the relationship between BAQ and OLE is investigated in more detail. To this end, an experiment was carried out, in which participants rated the BAQ and the OLE of music excerpts with different timbral and spatial degradations. In a between-group-design procedure, participants were assigned into two groups, in each of which a different set of stimuli was rated. The results indicate that rating of both attributes, BAQ and OLE, leads to similar rankings, even if a different set of stimuli is rated. In contrast to the BAQ ratings, which were more influenced by timbral than spatial degradations, the OLE ratings were almost equally influenced by timbral and spatial degradations.

  13. Market potential for interactive audio-visual media

    NARCIS (Netherlands)

    Leurdijk, A.; Limonard, S.

    2005-01-01

    NM2 (New Media for a New Millennium) develops tools for interactive, personalised and non-linear audio-visual content that will be tested in seven pilot productions. This paper looks at the market potential for these productions from a technological, a business and a users' perspective. It shows

  14. Towards a universal representation for audio information retrieval and analysis

    DEFF Research Database (Denmark)

    Jensen, Bjørn Sand; Troelsgaard, Rasmus; Larsen, Jan

    2013-01-01

    A fundamental and general representation of audio and music which integrates multi-modal data sources is important for both application and basic research purposes. In this paper we address this challenge by proposing a multi-modal version of the Latent Dirichlet Allocation model which provides a...

  15. Computationally efficient clustering of audio-visual meeting data

    NARCIS (Netherlands)

    Hung, H.; Friedland, G.; Yeo, C.; Shao, L.; Shan, C.; Luo, J.; Etoh, M.

    2010-01-01

    This chapter presents novel computationally efficient algorithms to extract semantically meaningful acoustic and visual events related to each of the participants in a group discussion using the example of business meeting recordings. The recording setup involves relatively few audio-visual sensors,

  16. Multi Carrier Modulator for Switch-Mode Audio Power Amplifiers

    DEFF Research Database (Denmark)

    Knott, Arnold; Pfaffinger, Gerhard; Andersen, Michael Andreas E.

    2008-01-01

    While switch-mode audio power amplifiers allow compact implementations and high output power levels due to their high power efficiency, they are very well known for creating electromagnetic interference (EMI) with other electronic equipment, in particular radio receivers. Lowering the EMI of swit...

  17. Audio Quality Assurance : An Application of Cross Correlation

    DEFF Research Database (Denmark)

    Jurik, Bolette Ammitzbøll; Nielsen, Jesper Asbjørn Sindahl

    2012-01-01

    We describe algorithms for automated quality assurance on content of audio files in context of preservation actions and access. The algorithms use cross correlation to compare the sound waves. They are used to do overlap analysis in an access scenario, where preserved radio broadcasts are used in...
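    The cross-correlation idea mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `best_lag`, the brute-force search, and the `max_lag` parameter are assumptions made for illustration. The sketch slides one sample sequence against the other and returns the shift with the largest correlation score, which is the basic step behind finding where two recordings overlap.

```python
def best_lag(a, b, max_lag):
    """Return the lag (in samples) at which b best aligns with a.

    A lag of L means that b[j] is compared with a[j + L], so a positive
    lag says the content of b appears L samples later in a.
    Hypothetical helper for illustration; brute force, O(len(a) * max_lag).
    """
    best, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, x in enumerate(a):
            j = i - lag
            if 0 <= j < len(b):
                score += x * b[j]  # unnormalised cross-correlation term
        if score > best_score:
            best, best_score = lag, score
    return best


# Two toy "recordings" of the same pulse, offset by three samples.
a = [0, 0, 0, 1, 2, 3, 2, 1, 0, 0]
b = [1, 2, 3, 2, 1, 0, 0, 0, 0, 0]
print(best_lag(a, b, max_lag=5))
```

    For real audio files an FFT-based correlation would normally replace the double loop, since the brute-force search becomes slow for long signals.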

  18. Real-time Loudspeaker Distance Estimation with Stereo Audio

    DEFF Research Database (Denmark)

    Nielsen, Jesper Kjær; Gaubitch, Nikolay; Heusdens, Richard

    2015-01-01

    Knowledge on how a number of loudspeakers are positioned relative to a listening position can be used to enhance the listening experience. Usually, these loudspeaker positions are estimated using calibration signals, either audible or psycho-acoustically hidden inside the desired audio signal...

  19. Audio-Visual Perception System for a Humanoid Robotic Head

    Directory of Open Access Journals (Sweden)

    Raquel Viciana-Abad

    2014-05-01

    Full Text Available One of the main issues within the field of social robotics is to endow robots with the ability to direct attention to people with whom they are interacting. Different approaches follow bio-inspired mechanisms, merging audio and visual cues to localize a person using multiple sensors. However, most of these fusion mechanisms have been used in fixed systems, such as those used in video-conference rooms, and thus, they may incur difficulties when constrained to the sensors with which a robot can be equipped. Besides, within the scope of interactive autonomous robots, there is a lack in terms of evaluating the benefits of audio-visual attention mechanisms, compared to only audio or visual approaches, in real scenarios. Most of the tests conducted have been within controlled environments, at short distances and/or with off-line performance measurements. With the goal of demonstrating the benefit of fusing sensory information with a Bayes inference for interactive robotics, this paper presents a system for localizing a person by processing visual and audio data. Moreover, the performance of this system is evaluated and compared via considering the technical limitations of unimodal systems. The experiments show the promise of the proposed approach for the proactive detection and tracking of speakers in a human-robot interactive framework.

  20. Topological mappings of video and audio data.

    Science.gov (United States)

    Fyfe, Colin; Barbakh, Wesam; Ooi, Wei Chuan; Ko, Hanseok

    2008-12-01

    We review a new form of self-organizing map which is based on a nonlinear projection of latent points into data space, identical to that performed in the Generative Topographic Mapping (GTM).(1) But whereas the GTM is an extension of a mixture of experts, this model is an extension of a product of experts.(2) We show visualisation and clustering results on a data set composed of video data of lips uttering 5 Korean vowels. Finally we note that we may dispense with the probabilistic underpinnings of the product of experts and derive the same algorithm as a minimisation of mean squared error between the prototypes and the data. This leads us to suggest a new algorithm which incorporates local and global information in the clustering. Both of the new algorithms achieve better results than the standard Self-Organizing Map.

  1. A Model of Distraction in an Audio-on-Audio Interference Situation with Music Program Material

    DEFF Research Database (Denmark)

    Francombe, J.; Mason, R.; Dewhirst, M.

    2015-01-01

    listener can be viewed as having a personal sound zone system. In order to evaluate and optimize such situations in a perceptually relevant manner, the authors created a predictive model using the features that contribute to the distraction from unwanted sounds. Feature extraction was motivated...

  2. Overview of the 2015 Workshop on Speech, Language and Audio in Multimedia

    NARCIS (Netherlands)

    Gravier, Guillaume; Jones, Gareth J.F.; Larson, Martha; Ordelman, Roeland J.F.

    2015-01-01

    The Workshop on Speech, Language and Audio in Multimedia (SLAM) positions itself at the crossroad of multiple scientific fields - music and audio processing, speech processing, natural language processing and multimedia - to discuss and stimulate research results, projects, datasets and

  3. Transcript of Audio Narrative Portion of: Scandinavian Heritage. A Set of Five Audio-Visual Film Strip/Cassette Presentations.

    Science.gov (United States)

    Anderson, Gerald D.; Olson, David B.

    The document presents the transcript of the audio narrative portion of approximately 100 interviews with first and second generation Scandinavian immigrants to the United States. The document is intended for use by secondary school classroom teachers as they develop and implement educational programs related to the Scandinavian heritage in…

  4. Deutsch Durch Audio-Visuelle Methode: An Audio-Lingual-Oral Approach to the Teaching of German.

    Science.gov (United States)

    Dickinson Public Schools, ND. Instructional Media Center.

    This teaching guide, designed to accompany Chilton's "Deutsch Durch Audio-Visuelle Methode" for German 1 and 2 in a three-year secondary school program, focuses major attention on the operational plan of the program and a student orientation unit. A section on teaching a unit discusses four phases: (1) presentation, (2) explanation, (3)…

  5. Automatic Organisation and Quality Analysis of User-Generated Content with Audio Fingerprinting

    OpenAIRE

    Cavaco, Sofia; Magalhaes, Joao; Mordido, Gonçalo

    2018-01-01

    The increase of the quantity of user-generated content experienced in social media has boosted the importance of analysing and organising the content by its quality. Here, we propose a method that uses audio fingerprinting to organise and infer the quality of user-generated audio content. The proposed method detects the overlapping segments between different audio clips to organise and cluster the data according to events, and to infer the audio quality of the samples. A test setup with conce...

  6. Audio-Visual Temporal Recalibration Can be Constrained by Content Cues Regardless of Spatial Overlap

    OpenAIRE

    Roseboom, Warrick; Kawabe, Takahiro; Nishida, Shin'ya

    2013-01-01

    It has now been well established that the point of subjective synchrony for audio and visual events can be shifted following exposure to asynchronous audio-visual presentations, an effect often referred to as temporal recalibration. Recently it was further demonstrated that it is possible to concurrently maintain two such recalibrated, and opposing, estimates of audio-visual temporal synchrony. However, it remains unclear precisely what defines a given audio-visual pair such that it is possib...

  7. Documentary management of the sport audio-visual information in the generalist televisions

    OpenAIRE

    Jorge Caldera Serrano; Felipe Alonso

    2007-01-01

    The management of sports audio-visual documentation in the information systems of national, regional and local television channels is analyzed. To this end, the documentary chain through which sports audio-visual information passes is traced in order to analyze each of its parameters, yielding a series of recommendations and norms for preparing the sports audio-visual record. Evidently the audio-visual sport documentation difference i...

  8. Spectacular Attractions: Museums, Audio-Visuals and the Ghosts of Memory

    Directory of Open Access Journals (Sweden)

    Mandelli Elisa

    2015-12-01

    Full Text Available In the last decades, moving images have become a common feature not only in art museums, but also in a wide range of institutions devoted to the conservation and transmission of memory. This paper focuses on the role of audio-visuals in the exhibition design of history and memory museums, arguing that they are privileged means to achieve the spectacular effects and the visitors’ emotional and “experiential” engagement that constitute the main objective of contemporary museums. I will discuss this topic through the concept of “cinematic attraction,” claiming that when embedded in displays, films and moving images often produce spectacular mises en scène with immersive effects, creating wonder and astonishment, and involving visitors on an emotional, visceral and physical level. Moreover, I will consider the diffusion of audio-visual witnesses of real or imaginary historical characters, presented in Phantasmagoria-like displays that simulate ghostly and uncanny apparitions, creating an ambiguous and often problematic coexistence of truth and illusion, subjectivity and objectivity, facts and imagination.

  9. A Novel Chewing Detection System Based on PPG, Audio, and Accelerometry.

    Science.gov (United States)

    Papapanagiotou, Vasileios; Diou, Christos; Zhou, Lingchuan; van den Boer, Janet; Mars, Monica; Delopoulos, Anastasios

    2017-05-01

    In the context of dietary management, accurate monitoring of eating habits is receiving increased attention. Wearable sensors, combined with the connectivity and processing of modern smartphones, can be used to robustly extract objective and real-time measurements of human behavior. In particular, for the task of chewing detection, several approaches based on an in-ear microphone can be found in the literature, while other types of sensors have also been reported, such as strain sensors. In this paper, performed in the context of the SPLENDID project, we propose to combine an in-ear microphone with a photoplethysmography (PPG) sensor placed in the ear concha, in a new high accuracy and low sampling rate prototype chewing detection system. We propose a pipeline that initially processes each sensor signal separately, and then fuses both to perform the final detection. Features are extracted from each modality, and support vector machine (SVM) classifiers are used separately to perform snacking detection. Finally, we combine the SVM scores from both signals in a late-fusion scheme, which leads to increased eating detection accuracy. We evaluate the proposed eating monitoring system on a challenging, semifree living dataset of 14 subjects, which includes more than 60 h of audio and PPG signal recordings. Results show that fusing the audio and PPG signals significantly improves the effectiveness of eating event detection, achieving accuracy up to 0.938 and class-weighted accuracy up to 0.892.
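    The late-fusion step described in this abstract can be sketched as follows. This is only an illustration under stated assumptions, not the method of the paper: the function name `late_fusion`, the weighted-average rule, the weight `w_audio` and the decision `threshold` are all hypothetical, and the abstract does not say how the two SVM scores are actually combined.

```python
def late_fusion(audio_score, ppg_score, w_audio=0.5, threshold=0.0):
    """Fuse two per-window classifier scores into one decision.

    audio_score, ppg_score: signed scores from two separately trained
    classifiers (e.g. SVM decision values), one per modality.
    w_audio and threshold are assumed values chosen for illustration.
    Returns the fused score and whether it exceeds the threshold.
    """
    fused = w_audio * audio_score + (1.0 - w_audio) * ppg_score
    return fused, fused > threshold


# The audio classifier is confident, the PPG classifier mildly disagrees.
print(late_fusion(1.0, -0.5))
```

    The point of fusing at the score level is that each modality keeps its own feature extraction and classifier, and only the final scores need to be brought onto a common scale.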

  10. Parametric Packet-Layer Model for Evaluation Audio Quality in Multimedia Streaming Services

    Science.gov (United States)

    Egi, Noritsugu; Hayashi, Takanori; Takahashi, Akira

    We propose a parametric packet-layer model for monitoring audio quality in multimedia streaming services such as Internet protocol television (IPTV). This model estimates audio quality of experience (QoE) on the basis of quality degradation due to coding and packet loss of an audio sequence. The input parameters of this model are audio bit rate, sampling rate, frame length, packet-loss frequency, and average burst length. Audio bit rate, packet-loss frequency, and average burst length are calculated from header information in received IP packets. For sampling rate, frame length, and audio codec type, the values or the names used in monitored services are input into this model directly. We performed a subjective listening test to examine the relationships between these input parameters and perceived audio quality. The codec used in this test was Advanced Audio Coding-Low Complexity (AAC-LC), which is one of the international standards for audio coding. On the basis of the test results, we developed an audio quality evaluation model. The verification results indicate that audio quality estimated by the proposed model has a high correlation with perceived audio quality.

  11. Audio-Tutorial Instruction: A Strategy For Teaching Introductory College Geology.

    Science.gov (United States)

    Fenner, Peter; Andrews, Ted F.

    The rationale of audio-tutorial instruction is discussed, and the history and development of the audio-tutorial botany program at Purdue University is described. Audio-tutorial programs in geology at eleven colleges and one school are described, illustrating several ways in which programs have been developed and integrated into courses. Programs…

  12. Interactive 3D audio: Enhancing awareness of details in immersive soundscapes?

    DEFF Research Database (Denmark)

    Schmidt, Mikkel Nørgaard; Schwartz, Stephen; Larsen, Jan

    2012-01-01

    Spatial audio and the possibility of interacting with the audio environment is thought to increase listeners' attention to details in a soundscape. This work examines if interactive 3D audio enhances listeners' ability to recall details in a soundscape. Nine different soundscapes were constructed...

  13. Effects of Hearing Protection Device Attenuation on Unmanned Aerial Vehicle (UAV) Audio Signatures

    Science.gov (United States)

    2016-03-01

    Effects of Hearing Protection Device Attenuation on Unmanned Aerial Vehicle (UAV) Audio Signatures, by Melissa Bezandry, Adrienne Raglin, and John Noble. Approved for public release; distribution...Research Laboratory...

  14. Responding Effectively to Composition Students: Comparing Student Perceptions of Written and Audio Feedback

    Science.gov (United States)

    Bilbro, J.; Iluzada, C.; Clark, D. E.

    2013-01-01

    The authors compared student perceptions of audio and written feedback in order to assess what types of students may benefit from receiving audio feedback on their essays rather than written feedback. Many instructors previously have reported the advantages they see in audio feedback, but little quantitative research has been done on how the…

  15. Space space space

    CERN Document Server

    Trembach, Vera

    2014-01-01

    Space is an introduction to the mysteries of the Universe. Included are Task Cards for independent learning, Journal Word Cards for creative writing, and Hands-On Activities for reinforcing skills in Math and Language Arts. Space is a perfect introduction to further research of the Solar System.

  16. The long-term prospects of citizens managing urban green space: From place making to place-keeping? : Special feature:TURFGRASS

    NARCIS (Netherlands)

    Mattijssen, T.J.M.; van der Jagt, A.P.N.; Buijs, A.E.; Elands, B.H.M.; Erlwein, S.; Lafortezza, R.

    2017-01-01

    This paper discusses the long-term management or ‘place-keeping’ of urban green space by citizens and highlights enabling and constraining factors that play a crucial role in this continuity. While authorities have historically been in charge of managing public green spaces, there is an

  17. Underdetermined Blind Audio Source Separation Using Modal Decomposition

    Directory of Open Access Journals (Sweden)

    Abdeldjalil Aïssa-El-Bey

    2007-03-01

    Full Text Available This paper introduces new algorithms for the blind separation of audio sources using modal decomposition. Indeed, audio signals and, in particular, musical signals can be well approximated by a sum of damped sinusoidal (modal) components. Based on this representation, we propose a two-step approach consisting of a signal analysis (extraction of the modal components) followed by a signal synthesis (grouping of the components belonging to the same source) using vector clustering. For the signal analysis, two existing algorithms are considered and compared: namely the EMD (empirical mode decomposition) algorithm and a parametric estimation algorithm using the ESPRIT technique. A major advantage of the proposed method resides in its validity for both instantaneous and convolutive mixtures and its ability to separate more sources than sensors. Simulation results are given to compare and assess the performance of the proposed algorithms.
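    The modal representation this abstract relies on can be written out as a short sketch. The notation is assumed for illustration and is not taken from the paper: K is the number of modal components, and a_k, d_k, f_k and \phi_k are the amplitude, damping factor, frequency and phase of component k.

```latex
x(t) \approx \sum_{k=1}^{K} a_k \, e^{-d_k t} \cos\left(2\pi f_k t + \phi_k\right), \qquad d_k \ge 0
```

    Under this reading, the analysis step estimates the individual components, and the synthesis step groups them by source and sums each group to rebuild that source's signal.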

  18. Underdetermined Blind Audio Source Separation Using Modal Decomposition

    Directory of Open Access Journals (Sweden)

    Aïssa-El-Bey Abdeldjalil

    2007-01-01

    Full Text Available This paper introduces new algorithms for the blind separation of audio sources using modal decomposition. Indeed, audio signals and, in particular, musical signals can be well approximated by a sum of damped sinusoidal (modal) components. Based on this representation, we propose a two-step approach consisting of a signal analysis (extraction of the modal components) followed by a signal synthesis (grouping of the components belonging to the same source) using vector clustering. For the signal analysis, two existing algorithms are considered and compared: namely the EMD (empirical mode decomposition) algorithm and a parametric estimation algorithm using the ESPRIT technique. A major advantage of the proposed method resides in its validity for both instantaneous and convolutive mixtures and its ability to separate more sources than sensors. Simulation results are given to compare and assess the performance of the proposed algorithms.

  19. Audio teleconferencing: creative use of a forgotten innovation.

    Science.gov (United States)

    Mather, Carey; Marlow, Annette

    2012-06-01

    As part of a regional School of Nursing and Midwifery's commitment to addressing recruitment and retention issues, approximately 90% of second year undergraduate student nurses undertake clinical placements at multipurpose centres, regional or district hospitals, aged care facilities, or community centres based in rural and remote regions within the State. The remaining 10% undertake professional experience placement in urban areas only. This placement of a large cohort of students, in low numbers in a variety of clinical settings, initiated the need to provide consistent support to both students and staff at these facilities. Subsequently, an audio teleconferencing model of clinical facilitation was developed to guide student teaching and learning and to provide support to registered nurse preceptors in clinical practice. This paper draws on Weimer's 'Personal Accounts of Change' approach to describe, discuss and evaluate the modifications that have occurred since the inception of this audio teleconferencing model (Weimer, 2006).

  20. Audio Visual Media Components in Educational Game for Elementary Students

    Directory of Open Access Journals (Sweden)

    Meilani Hartono

    2016-12-01

    The purpose of this research was to review and implement interactive audio-visual media used in an educational game to improve elementary students’ interest in learning mathematics. The game was developed for the desktop platform. The art of the game was 2D cartoon art with animation and audio, to make it more interesting to students. Four mini games were developed based on research on mathematics study. The development method used was the Multimedia Development Life Cycle (MDLC), which consists of requirement, design, development, testing, and implementation phases. The data collection methods used were questionnaire, literature study, and interview. The conclusion is that elementary students are interested in educational games that are fun and active (moving objects), with fast-tempo music and carefree colors like blue. This educational game is hoped to be an alternative teaching tool combined with conventional teaching methods.

  1. An introduction to audio content analysis applications in signal processing and music informatics

    CERN Document Server

    Lerch, Alexander

    2012-01-01

    "With the proliferation of digital audio distribution over digital media, audio content analysis is fast becoming a requirement for designers of intelligent signal-adaptive audio processing systems. Written by a well-known expert in the field, this book provides quick access to different analysis algorithms and allows comparison between different approaches to the same task, making it useful for newcomers to audio signal processing and industry experts alike. A review of relevant fundamentals in audio signal processing, psychoacoustics, and music theory, as well as downloadable MATLAB files are also included"--

  2. Amplitude Modulated Sinusoidal Signal Decomposition for Audio Coding

    DEFF Research Database (Denmark)

    Christensen, M. G.; Jacobson, A.; Andersen, S. V.

    2006-01-01

    In this paper, we present a decomposition for sinusoidal coding of audio, based on an amplitude modulation of sinusoids via a linear combination of arbitrary basis vectors. The proposed method, which incorporates a perceptual distortion measure, is based on a relaxation of a nonlinear least-squares minimization. Rate-distortion curves and listening tests show that, compared to a constant-amplitude sinusoidal coder, the proposed decomposition offers perceptually significant improvements in critical transient signals.

  3. Pitch range variations improve cognitive processing of audio messages

    OpenAIRE

    Rodero Antón, Emma; Potter, Rob F.; Prieto Vives, Pilar, 1965-

    2017-01-01

    This study explores the effect of different speaker intonation strategies in audio messages on attention, autonomic arousal, and memory. An experiment was conducted in which participants listened to 16 radio commercials produced to vary in pitch range across sentences. Dependent variables were self-reported effectiveness and adequacy, psychophysiological arousal and attention, immediate word recall and recognition of information. Results showed that messages conveyed with pitch variations ach...

  4. Parameter and state estimation using audio and video signals

    OpenAIRE

    Evestedt, Magnus

    2005-01-01

    The complexity of industrial systems and the mathematical models to describe them increases. In many cases point sensors are no longer sufficient to provide controllers and monitoring instruments with the information necessary for operation. The need for other types of information, such as audio and video, has grown. Suitable applications range in a broad spectrum from microelectromechanical systems and bio-medical engineering to papermaking and steel production. This thesis is divided into f...

  5. Modular Sensor Environment : Audio Visual Industry Monitoring Applications

    OpenAIRE

    Guillot, Calvin

    2017-01-01

    This work was made for Electro Waves Oy. The company specializes in Audio-visual services and interactive systems. The purpose of this work is to design and implement a modular sensor environment for the company, which will be used for developing automated systems. This thesis begins with an introduction to sensor systems and their different topologies. It is followed by an introduction to the technologies used in this project. The system is divided in three parts. The client, tha...

  6. Rating Algorithm for Pronunciation of English Based on Audio Feature Pattern Matching

    Directory of Open Access Journals (Sweden)

    Li Kun

    2015-01-01

    With the increasing internationalization of China, language communication has become an important channel for adapting to the political and economic environment. How to improve English learners’ learning efficiency under limited conditions has become a problem demanding a prompt solution. This paper applies two pronunciation patterns according to the actual needs of English pronunciation rating: the to-be-evaluated pronunciation pattern and the standard pronunciation pattern. The patterns are translated into English pronunciation rating results through the Euclidean distance. In addition, this paper introduces the design philosophy of the whole algorithm in combination with the CHMM matching pattern. Each link of the CHMM pattern is analyzed selectively, and a contrast experiment between the CHMM matching pattern and two other patterns is conducted. From the experimental results, it can be concluded that the CHMM pattern is the best option.
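
    The distance-to-rating step in this abstract can be sketched as follows. This is a minimal illustration, not the paper's algorithm: the feature vectors, the function names, and the `100 / (1 + d / scale)` mapping from distance to score are all assumptions made here, and the CHMM matching stage is not modelled.

```python
import math


def euclidean_distance(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rating(test_pattern, standard_pattern, scale=1.0):
    """Map a distance to a 0-100 score (hypothetical mapping, assumed here).

    Identical patterns score 100; the score falls as the distance grows.
    """
    d = euclidean_distance(test_pattern, standard_pattern)
    return 100.0 / (1.0 + d / scale)


# Illustrative feature vectors, not data from the paper.
standard = [0.0, 0.0]
learner = [3.0, 4.0]
distance = euclidean_distance(learner, standard)
score_same = rating(standard, standard)
score_learner = rating(learner, standard)
```

    A real system would compare sequences of frame-level features rather than a single vector, so some alignment step would be needed before the distance is taken.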

  7. Comparison of Linear Prediction Models for Audio Signals

    Directory of Open Access Journals (Sweden)

    2009-03-01

    While linear prediction (LP) has become immensely popular in speech modeling, it does not seem to provide a good approach for modeling audio signals. This is somewhat surprising, since a tonal signal consisting of a number of sinusoids can be perfectly predicted based on an (all-pole) LP model with a model order that is twice the number of sinusoids. We provide an explanation why this result cannot simply be extrapolated to LP of audio signals. If noise is taken into account in the tonal signal model, a low-order all-pole model appears to be only appropriate when the tonal components are uniformly distributed in the Nyquist interval. Based on this observation, different alternatives to the conventional LP model can be suggested. Either the model should be changed to a pole-zero, a high-order all-pole, or a pitch prediction model, or the conventional LP model should be preceded by an appropriate frequency transform, such as a frequency warping or downsampling. By comparing these alternative LP models to the conventional LP model in terms of frequency estimation accuracy, residual spectral flatness, and perceptual frequency resolution, we obtain several new and promising approaches to LP-based audio modeling.
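
    The claim that a noise-free sum of sinusoids is perfectly predicted by an all-pole model of order twice the number of sinusoids can be checked numerically. The sketch below is an illustration under assumptions made here (a least-squares fit with NumPy, three arbitrarily chosen frequencies, no noise); it is not the authors' procedure.

```python
import numpy as np


def fit_lp(x, order):
    """Least-squares all-pole predictor: x[n] ~ sum_k a[k] * x[n-1-k]."""
    # Each row holds the `order` most recent past samples, newest first.
    past = np.array([x[n - order:n][::-1] for n in range(order, len(x))])
    target = x[order:]
    coeffs, *_ = np.linalg.lstsq(past, target, rcond=None)
    residual = target - past @ coeffs
    return coeffs, residual


n = np.arange(200)
freqs = [0.05, 0.12, 0.31]  # illustrative frequencies in cycles per sample
x = sum(np.cos(2 * np.pi * f * n + 0.3 * i) for i, f in enumerate(freqs))

# Order 2K (here 6) predicts the noise-free signal up to rounding error.
_, residual_full = fit_lp(x, order=2 * len(freqs))
max_err_full = float(np.max(np.abs(residual_full)))

# Order 2K - 2 (here 4) cannot cancel all three sinusoids.
_, residual_low = fit_lp(x, order=2 * len(freqs) - 2)
max_err_low = float(np.max(np.abs(residual_low)))
```

    Adding noise to `x` is a simple way to explore the abstract's point that the low-order result does not carry over to realistic audio signals.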

  8. Audio-ultrasonic waves by argon gas discharge

    International Nuclear Information System (INIS)

    Ragheb, M.S.

    2010-01-01

    In the present work, wave emission formed by audio-ultrasonic plasma is investigated. The presence of magnetic and electric fields is demonstrated experimentally. Comparison between experimental field measurements and several plasma wave methods reveals the plasma audio-ultrasonic radiation mode. This plasma is a symmetrically driven capacitive discharge consisting of three interactive regions: the electrodes, the sheaths, and the positive column. The discharge voltage is up to 900 volts, and the discharge current flowing through the plasma attains a value of 360 mA. The frequency of the discharge voltage covers the audio and the ultrasonic range up to 100 kHz. The effective plasma working distance has increased to attain the total length of the tube of 40 cm. A non-disturbing method using an external coil is used to measure the electric discharge field in a plane perpendicular to the axis of the plasma tube. This method proves the existence of a current flowing in a direction perpendicular to the axis of the plasma tube. A system of minute coil sensors proved the existence of two fields in two perpendicular directions. Comparison between the different observed fields reveals the existence of propagating electromagnetic waves due to the alternating current flowing through the skin of the plasma tube. The field intensity distribution along the tube traces the discharge current behavior between the two plasma electrodes, which can be used to predict the range of the plasma discharge current.

  9. Automatic summarization of soccer highlights using audio-visual descriptors.

    Science.gov (United States)

    Raventós, A; Quijada, R; Torres, Luis; Tarrés, Francesc

    2015-01-01

    Automatic summarization of sports video content has been an object of great interest for many years. Although semantic description techniques have been proposed, many of the approaches still rely on low-level video descriptors that render quite limited results due to the complexity of the problem and to the low capability of the descriptors to represent semantic content. In this paper, a new approach for automatic highlights summarization of soccer videos using audio-visual descriptors is presented. The approach is based on the segmentation of the video sequence into shots that are further analyzed to determine their relevance and interest. Of special interest in the approach is the use of the audio information, which provides additional robustness to the overall performance of the summarization system. For every video shot, a set of low- and mid-level audio-visual descriptors is computed and later suitably combined in order to obtain different relevance measures based on empirical knowledge rules. The final summary is generated by selecting those shots with the highest interest according to the specifications of the user and the results of the relevance measures. A variety of results are presented with real soccer video sequences that prove the validity of the approach.

  10. Literary Genres in Social Life: A Narrative, Audio-visual and Poetic Approach

    Directory of Open Access Journals (Sweden)

    Luis Felipe González Gutiérrez

    2008-05-01

    The proposal, "Literary Genres in Social Life: a Narrative, Audio-visual and Poetic Approach", aims to present to the academic psychology community and compatible social science disciplines the main contributions of literary genre theory through a social constructionist understanding of narrations and daily stories, and by means of an interactive construction of narrative collage. This work, sustained by an investigation financed by the University Santo Tomás in Bogota, Colombia, "Understanding of structuralist literary theories in the development of the narrative 'I' within the social constructionist approach", tries to propose alternative spaces for the presentation of its investigative results through the expression of metaphors, visual narrative sequences and interactive artistic forms, which invite the spectator to share in and to understand important concepts in the consolidation of social forms of construction of the quotidian. URN: urn:nbn:de:0114-fqs0802373

  11. Applying Spatial Audio to Human Interfaces: 25 Years of NASA Experience

    Science.gov (United States)

    Begault, Durand R.; Wenzel, Elizabeth M.; Godfrey, Martine; Miller, Joel D.; Anderson, Mark R.

    2010-01-01

    From the perspective of human factors engineering, the inclusion of spatial audio within a human-machine interface is advantageous from several perspectives. Demonstrated benefits include the ability to monitor multiple streams of speech and non-speech warning tones using a cocktail party advantage, and for aurally-guided visual search. Other potential benefits include the spatial coordination and interaction of multimodal events, and evaluation of new communication technologies and alerting systems using virtual simulation. Many of these technologies were developed at NASA Ames Research Center, beginning in 1985. This paper reviews examples and describes the advantages of spatial sound in NASA-related technologies, including space operations, aeronautics, and search and rescue. The work has involved hardware and software development as well as basic and applied research.

  12. The French Space Operation Act: Scope and Main Features. Introduction to the Technical Regulation Considerations about the Implementation in the Launcher Field

    Science.gov (United States)

    Cahuzac, Francois

    2010-09-01

    This publication provides a presentation of the new French Space Operation Act (hereafter FSOA). The main objective of the FSOA is to institute a clarified legal regime for launch operations. The technical regulation associated with the act is set forth, in particular for the safety of persons and property, and the protection of public health and the environment. First, we give an overview of the institutional and legal framework implemented in accordance with the act. The general purpose of the FSOA is to set up a coherent national regime of authorization and control of space operations under French jurisdiction or for which the French Government bears international liability, either under UN Treaties principles (namely the 1967 Outer Space Treaty, the 1972 Liability Convention and the 1976 Registration Convention) or in accordance with its European commitments with the ESA organization and its Member States. For a given space operation, the operator must show that the systems and procedures that he intends to implement are compliant with the technical regulation. The regime of authorization leads to a request of authorization for each launch operation. Thus, licences concerning operator management organization or a given space system can be obtained. These licences help to simplify the authorization file required for a given space operation. The technical regulation is presented in another article, and will be issued in 2010 by the French Minister in charge of space activities. A brief description of the organization associated with the implementation of the authorization regime in the launcher field is presented.

  13. Formal usability evaluation of audio track widget graphical representation for two-dimensional stage audio mixing interface

    OpenAIRE

    Dewey, Christopher; Wakefield, Jonathan P.

    2017-01-01

    The two-dimensional stage paradigm (2DSP) has been suggested as an alternative audio mixing interface (AMI). This study seeks to refine the 2DSP by formally evaluating graphical track visualisation styles. Track visualisations considered were text only, circles containing text, individually coloured circles containing text, circles colour coded by instrument type with text, icons with text superimposed, circles with RMS related dynamic opacity and a traditional AMI. The usability evaluation f...

  14. Understanding Legacy Features with Featureous

    DEFF Research Database (Denmark)

    Olszak, Andrzej; Jørgensen, Bo Nørregaard

    2011-01-01

    Java programs called Featureous that addresses this issue. Featureous allows a programmer to easily establish feature-code traceability links and to analyze their characteristics using a number of visualizations. Featureous is an extension to the NetBeans IDE, and can itself be extended by third...

  15. Feature Article

    Indian Academy of Sciences (India)

    Articles in Resonance – Journal of Science Education. Volume 1 Issue 1 January 1996 pp 80-85 Feature Article. What's New in Computers Windows 95 · Vijnan Shastri. Volume 1 Issue 1 January 1996 pp 86-89 Feature ...

  16. APPLICATION OF CONTROLLED SOURCE AUDIO MAGNETOTELLURIC (CSAMT) AT GEOTHERMAL

    Directory of Open Access Journals (Sweden)

    Susilawati S.

    2017-04-01

    CSAMT, or Controlled Source Audio-Magnetotelluric, is a geophysical method for determining the resistivity of rock beneath the earth's surface. The CSAMT method uses an artificial current injected into the ground, with the frequency of the artificial source ranging from 0.1 Hz to 10 kHz. The CSAMT data are corrected for the source effect and then inverted. The inversion results show a layer with resistivity values between 2.5 Ω.m and 15 Ω.m, which is interpreted as clay.

  17. A listening test system for automotive audio - listeners

    DEFF Research Database (Denmark)

    Choisel, Sylvain; Hegarty, Patrick; Christensen, Flemming

    2007-01-01

    A series of experiments was conducted in order to validate an experimental procedure to perform listening tests on car audio systems in a simulation of the car environment in a laboratory, using binaural synthesis with head-tracking. Seven experts and 40 non-expert listeners rated a range of stimuli for 15 sound-quality attributes developed by the experts. This paper presents a comparison between the attribute ratings from the two groups of participants. Overall preference of the non-experts was also measured using direct ratings as well as indirect scaling based on paired comparisons...

  18. Digital audio recordings improve the outcomes of patient consultations

    DEFF Research Database (Denmark)

    Wolderslund, Maiken; Kofoed, Poul-Erik; Holst, René

    2017-01-01

    OBJECTIVES: To investigate the effects on patients' outcome of the consultations when provided with: a Digital Audio Recording (DAR) of the consultation and a Question Prompt List (QPL). METHODS: This is a three-armed randomised controlled cluster trial. One group of patients received standard care, while the other two groups received either the QPL in combination with a recording of their consultation or only the recording. Patients from four outpatient clinics participated: Paediatric, Orthopaedic, Internal Medicine, and Urology. The effects were evaluated by patient-administered questionnaires...

  19. Audio-haptic interaction in simulated walking experiences

    DEFF Research Database (Denmark)

    Serafin, Stefania

    2011-01-01

    In this paper an overview of the work conducted on audio-haptic physically based simulation and evaluation of walking is provided. This work has been performed in the context of the Natural Interactive Walking (NIW) project, whose goal is to investigate possibilities for the integrated and interchangeable use of the haptic and auditory modality in floor interfaces, and for the synergy of perception and action in capturing and guiding human walking. We describe the technology developed in the context of this project, together with some experiments performed to evaluate the role of auditory and haptic feedback in walking tasks.

  20. An assessment of individualized technical ear training for audio production.

    Science.gov (United States)

    Kim, Sungyoung

    2015-07-01

    An individualized technical ear training method is compared to a non-individualized method. The efficacy of the individualized method is assessed using a standardized test conducted before and after the training period. Participants who received individualized training improved more than the control group on the test. Results indicate the importance of individualized training for acquisition of spectrum-identification and spectrum-matching skills. Individualized training, therefore, should be implemented by default into technical ear training programs used in the audio production industry and in education.

  1. Sinusoidal Analysis-Synthesis of Audio Using Perceptual Criteria

    Science.gov (United States)

    Painter, Ted; Spanias, Andreas

    2003-12-01

    This paper presents a new method for the selection of sinusoidal components for use in compact representations of narrowband audio. The method consists of ranking and selecting the most perceptually relevant sinusoids. The idea behind the method is to maximize the matching between the auditory excitation pattern associated with the original signal and the corresponding auditory excitation pattern associated with the modeled signal that is being represented by a small set of sinusoidal parameters. The proposed component-selection methodology is shown to outperform the maximum signal-to-mask ratio selection strategy in terms of subjective quality.

  2. Digital video and audio broadcasting technology a practical engineering guide

    CERN Document Server

    Fischer, Walter

    2010-01-01

    'Digital Video and Audio Broadcasting Technology - A Practical Engineering Guide' deals with all the most important digital television, sound radio and multimedia standards such as MPEG, DVB, DVD, DAB, ATSC, T-DMB, DMB-T, DRM and ISDB-T. The book provides an in-depth look at these subjects in terms of practical experience. In addition it contains chapters on the basics of technologies such as analog television, digital modulation, COFDM or mathematical transformations between time and frequency domains. The attention in the respective field under discussion is focussed on aspects of measuring t

  3. Synthesis of audio spectra using a diffraction model.

    Science.gov (United States)

    Vijayakumar, V; Eswaran, C

    2006-12-01

    It is shown that the intensity variations of an audio signal in the frequency domain can be obtained by using a mathematical function containing a series of weighted complex Bessel functions. With proper choice of values for two parameters, this function can transform an input spectrum of discrete frequencies of unit intensity into the known spectra of different musical instruments. Specific examples of musical instruments are considered for evaluating the performance of this method. It is found that this function yields musical spectra with a good degree of accuracy.

  4. The complete guide to high-end audio

    CERN Document Server

    Harley, Robert

    2015-01-01

    An updated edition of what many consider the "bible of high-end audio"   In this newly revised and updated fifth edition, Robert Harley, editor in chief of the Absolute Sound magazine, tells you everything you need to know about buying and enjoying high-quality hi-fi. With this book, discover how to get the best sound for your money, how to identify the weak links in your system and upgrade where it will do the most good, how to set up and tweak your system for maximum performance, and how to become a more perceptive and appreciative listener. Just a few of the secrets you will learn cover hi

  5. [Voix d'Or, an audio tool to revive memories].

    Science.gov (United States)

    Braunschweig, Lina

    2010-01-01

    Voix d'Or is an audio tool designed to awaken the affective memory of elderly people and particularly those suffering from Alzheimer's disease. Every month it offers new radio programmes to initiate or facilitate leisure and entertainment activities, memory workshops or provide the basis of quiet moments. The tool has a double objective: to procure well-being, boost the individual's self-esteem and recognise his/her history and to facilitate exchange and communication between the residents and the staff of a care home.

  6. [Class A audio amplifier for headphones] (Amplificador de audio en clase A para auriculares)

    OpenAIRE

    Martín Ruiz, Manuel

    2012-01-01

    This project presents the development, simulation, and implementation of a high-performance audio amplifier, using discrete transistors and operational amplifiers on a PCB previously designed with a software program. The amplifier is intended for use as a power amplifier for high-impedance headphones. The circuit uses a direct feedback technique on the headphones, which are connected with four wires. The amplifier incorporates...

  7. Tools for signal compression applications to speech and audio coding

    CERN Document Server

    Moreau, Nicolas

    2013-01-01

    This book presents tools and algorithms required to compress/uncompress signals such as speech and music. These algorithms are largely used in mobile phones, DVD players, HDTV sets, etc. In a first rather theoretical part, this book presents the standard tools used in compression systems: scalar and vector quantization, predictive quantization, transform quantization, entropy coding. In particular we show the consistency between these different tools. The second part explains how these tools are used in the latest speech and audio coders. The third part gives Matlab programs simulating t

  8. MP3 audio-editing software for the department of radiology

    International Nuclear Information System (INIS)

    Hong Qingfen; Sun Canhui; Li Ziping; Meng Quanfei; Jiang Li

    2006-01-01

    Objective: To evaluate the MP3 audio-editing software in the daily work of the department of radiology. Methods: The audio content of the daily consultation seminar, held in the department of radiology every morning, was recorded and converted into MP3 audio format by a computer-integrated recording device. The audio data were edited, archived, and eventually saved on computer storage media, from which they were experimentally replayed and applied in research and teaching. Results: MP3 audio-editing was a simple process and convenient for saving and searching the data. The record could be easily replayed. Conclusion: MP3 audio-editing perfectly records and saves the contents of the consultation seminar, and has replaced conventional handwritten notes. It is a valuable tool in both research and teaching in the department. (authors)

  9. Direct-conversion switching-mode audio power amplifier with active capacitive voltage clamp

    DEFF Research Database (Denmark)

    Ljusev, Petar; Andersen, Michael Andreas E.

    2005-01-01

    This paper discusses the advantages and problems when implementing direct energy conversion switching-mode audio power amplifiers. It is shown that the total integration of the power supply and Class D audio power amplifier into one compact direct converter can simplify the design, increase efficiency, reduce the product volume and lower its cost. As an example, the principle of operation and the measurements made on a direct-conversion switching-mode audio power amplifier with active capacitive voltage clamp are presented.

  10. On the relevance of spectral features for instrument classification

    DEFF Research Database (Denmark)

    Nielsen, Andreas Brinch; Sigurdsson, Sigurdur; Hansen, Lars Kai

    2007-01-01

    Automatic knowledge extraction from music signals is a key component for most music organization and music information retrieval systems. In this paper, we consider the problem of instrument modelling and instrument classification from the rough audio data. Existing systems for automatic instrument classification operate normally on a relatively large number of features, from which those related to the spectrum of the audio signal are particularly relevant. In this paper, we confront two different models about the spectral characterization of musical instruments. The first assumes a constant envelope...

  11. Automatic processing of CERN video, audio and photo archives

    Energy Technology Data Exchange (ETDEWEB)

    Kwiatek, M [CERN, Geneva (Switzerland)], E-mail: Michal.Kwiatek@cern.ch

    2008-07-15

    The digitalization of CERN audio-visual archives, a major task currently in progress, will generate over 40 TB of video, audio and photo files. Storing these files is one issue, but a far more important challenge is to provide long-time coherence of the archive and to make these files available on-line with minimum manpower investment. An infrastructure, based on standard CERN services, has been implemented, whereby master files, stored in the CERN Distributed File System (DFS), are discovered and scheduled for encoding into lightweight web formats based on predefined profiles. Changes in master files, conversion profiles or in the metadata database (read from CDS, the CERN Document Server) are automatically detected and the media re-encoded whenever necessary. The encoding processes are run on virtual servers provided on-demand by the CERN Server Self Service Centre, so that new servers can be easily configured to adapt to higher load. Finally, the generated files are made available from the CERN standard web servers with streaming implemented using Windows Media Services.
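
    The change-detection idea in this abstract (re-encode whenever a master file, a conversion profile, or the metadata changes) can be sketched as follows. This is a hypothetical illustration using only the Python standard library: the function names, the fingerprinting scheme, and the in-memory dictionaries are assumptions made here and do not describe the CERN infrastructure.

```python
import hashlib
import json


def fingerprint(master_bytes, profile, metadata):
    """Hash everything that should trigger a re-encode when it changes."""
    h = hashlib.sha256()
    h.update(master_bytes)
    h.update(json.dumps(profile, sort_keys=True).encode("utf-8"))
    h.update(json.dumps(metadata, sort_keys=True).encode("utf-8"))
    return h.hexdigest()


def plan_jobs(masters, profiles, metadata, state):
    """Return (master, profile) pairs whose fingerprint differs from `state`.

    `state` maps "master|profile" to the fingerprint of the last encode and
    is updated in place, so a second call with unchanged inputs yields [].
    """
    jobs = []
    for name, data in sorted(masters.items()):
        for profile_name, profile in sorted(profiles.items()):
            key = f"{name}|{profile_name}"
            current = fingerprint(data, profile, metadata.get(name, {}))
            if state.get(key) != current:
                jobs.append((name, profile_name))
                state[key] = current
    return jobs


# Illustrative inputs; all names and values are made up for this sketch.
masters = {"talk.avi": b"raw video bytes"}
profiles = {"web-low": {"codec": "wmv", "kbps": 300}}
metadata = {"talk.avi": {"title": "Colloquium"}}
state = {}

first_run = plan_jobs(masters, profiles, metadata, state)
second_run = plan_jobs(masters, profiles, metadata, state)
profiles["web-low"]["kbps"] = 500  # a profile change should trigger re-encoding
third_run = plan_jobs(masters, profiles, metadata, state)
```

    In a real deployment the state would be persisted and the jobs handed to encoding workers; hashing a whole master file is also costly at this archive's scale, so a modification time or stored checksum might be substituted.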

  12. Extraction of Information of Audio-Visual Contents

    Directory of Open Access Journals (Sweden)

    Carlos Aguilar

    2011-10-01

    In this article we show how it is possible to use Channel Theory (Barwise and Seligman, 1997) for modeling the process of information extraction realized by audiences of audio-visual contents. To do this, we rely on the concepts proposed by Channel Theory and, especially, its treatment of representational systems. We then show how the information that an agent is capable of extracting from the content depends on the number of channels he is able to establish between the content and the set of classifications he is able to discriminate. The agent can endeavor the extraction of information through these channels from the totality of content; however, we discuss the advantages of extracting from its constituents in order to obtain a greater number of informational items that represent it. After showing how the extraction process is endeavored for each channel, we propose a method of representation of all the informative values an agent can obtain from a content using a matrix constituted by the channels the agent is able to establish on the content (source classifications), and the ones he can understand as individual (destination classifications). We finally show how this representation allows reflecting the evolution of the informative items through the evolution of audio-visual content.

  13. Audio-tactile integration and the influence of musical training.

    Directory of Open Access Journals (Sweden)

    Anja Kuchenbuch

    Perception of our environment is a multisensory experience; information from different sensory systems like the auditory, visual and tactile is constantly integrated. Complex tasks that require high temporal and spatial precision of multisensory integration put strong demands on the underlying networks but it is largely unknown how task experience shapes multisensory processing. Long-term musical training is an excellent model for brain plasticity because it shapes the human brain at functional and structural levels, affecting a network of brain areas. In the present study we used magnetoencephalography (MEG) to investigate how audio-tactile perception is integrated in the human brain and if musicians show enhancement of the corresponding activation compared to non-musicians. Using a paradigm that allowed the investigation of combined and separate auditory and tactile processing, we found a multisensory incongruency response, generated in frontal, cingulate and cerebellar regions, an auditory mismatch response generated mainly in the auditory cortex and a tactile mismatch response generated in frontal and cerebellar regions. The influence of musical training was seen in the audio-tactile as well as in the auditory condition, indicating enhanced higher-order processing in musicians, while the sources of the tactile MMN were not influenced by long-term musical training. Consistent with the predictive coding model, more basic, bottom-up sensory processing was relatively stable and less affected by expertise, whereas areas for top-down models of multisensory expectancies were modulated by training.

  14. Theory and Application of Audio-Based Assessment of Cough

    Directory of Open Access Journals (Sweden)

    Yan Shi

    2018-01-01

    Cough is a common symptom of many respiratory diseases. The medical literature emphasizes that a system for the automatic, objective, and reliable detection of cough events is important and very promising for assessing pathology severity in chronic cough disease. To track the state of development of audio-based cough monitoring systems, we briefly describe the history of objective cough detection and then explain the principle by which cough sounds are generated. The probable endpoints of clinical cough studies, including cough frequency, intensity of coughing, and acoustic properties of the cough sound, are analyzed in this paper. Finally, we introduce some successful cough monitoring equipment and its recognition algorithms in detail. We conclude, first, that the acoustic variability of cough sounds within and between individuals makes it difficult to assess the intensity of coughing; second, that great progress is now being made in audio-based cough detection; and third, that accurate, portable, objective monitoring systems will be available and widely used in home care and clinical trials in the near future.

  15. Automatic processing of CERN video, audio and photo archives

    International Nuclear Information System (INIS)

    Kwiatek, M

    2008-01-01

    The digitization of CERN audio-visual archives, a major task currently in progress, will generate over 40 TB of video, audio and photo files. Storing these files is one issue, but a far more important challenge is to provide long-term coherence of the archive and to make these files available on-line with minimum manpower investment. An infrastructure, based on standard CERN services, has been implemented, whereby master files, stored in the CERN Distributed File System (DFS), are discovered and scheduled for encoding into lightweight web formats based on predefined profiles. Changes in master files, conversion profiles or in the metadata database (read from CDS, the CERN Document Server) are automatically detected and the media re-encoded whenever necessary. The encoding processes are run on virtual servers provided on-demand by the CERN Server Self Service Centre, so that new servers can be easily configured to adapt to higher load. Finally, the generated files are made available from the CERN standard web servers with streaming implemented using Windows Media Services.

  16. A compact electroencephalogram recording device with integrated audio stimulation system

    Science.gov (United States)

    Paukkunen, Antti K. O.; Kurttio, Anttu A.; Leminen, Miika M.; Sepponen, Raimo E.

    2010-06-01

    A compact (96×128×32 mm³, 374 g), battery-powered, eight-channel electroencephalogram recording device with an integrated audio stimulation system and a wireless interface is presented. The recording device is capable of producing high-quality data, while the operating time is also reasonable for evoked potential studies. The effective measurement resolution is about 4 nV at 200 Hz sample rate, typical noise level is below 0.7 μVrms at 0.16-70 Hz, and the estimated operating time is 1.5 h. An embedded audio decoder circuit reads and plays wave sound files stored on a memory card. The activities are controlled by an 8 bit main control unit which allows accurate timing of the stimuli. The measured interstimulus interval jitter is less than 1 ms. Wireless communication is made through Bluetooth and the recorded data are transmitted to an external personal computer (PC) interface in real time. The PC interface is implemented with LABVIEW® and, in addition to data acquisition, it also allows online signal processing, data storage, and control of measurement activities such as contact impedance measurement. The practical application of the device is demonstrated in a mismatch negativity experiment with three test subjects.

  17. Audio-tactile integration and the influence of musical training.

    Science.gov (United States)

    Kuchenbuch, Anja; Paraskevopoulos, Evangelos; Herholz, Sibylle C; Pantev, Christo

    2014-01-01

    Perception of our environment is a multisensory experience; information from different sensory systems like the auditory, visual and tactile is constantly integrated. Complex tasks that require high temporal and spatial precision of multisensory integration put strong demands on the underlying networks but it is largely unknown how task experience shapes multisensory processing. Long-term musical training is an excellent model for brain plasticity because it shapes the human brain at functional and structural levels, affecting a network of brain areas. In the present study we used magnetoencephalography (MEG) to investigate how audio-tactile perception is integrated in the human brain and if musicians show enhancement of the corresponding activation compared to non-musicians. Using a paradigm that allowed the investigation of combined and separate auditory and tactile processing, we found a multisensory incongruency response, generated in frontal, cingulate and cerebellar regions, an auditory mismatch response generated mainly in the auditory cortex and a tactile mismatch response generated in frontal and cerebellar regions. The influence of musical training was seen in the audio-tactile as well as in the auditory condition, indicating enhanced higher-order processing in musicians, while the sources of the tactile MMN were not influenced by long-term musical training. Consistent with the predictive coding model, more basic, bottom-up sensory processing was relatively stable and less affected by expertise, whereas areas for top-down models of multisensory expectancies were modulated by training.

  18. Speech and audio processing for coding, enhancement and recognition

    CERN Document Server

    Togneri, Roberto; Narasimha, Madihally

    2015-01-01

    This book describes the basic principles underlying the generation, coding, transmission and enhancement of speech and audio signals, including advanced statistical and machine learning techniques for speech and speaker recognition with an overview of the key innovations in these areas. Key research undertaken in speech coding, speech enhancement, speech recognition, emotion recognition and speaker diarization is also presented, along with recent advances and new paradigms in these areas. Offers readers a single-source reference on the significant applications of speech and audio processing to speech coding, speech enhancement and speech/speaker recognition. Enables readers involved in algorithm development and implementation issues for speech coding to understand the historical development and future challenges in speech coding research. Discusses speech coding methods yielding bit-streams that are multi-rate and scalable for Voice-over-IP (VoIP) Networks; ...

  19. Audio-Visual Fusion for Sound Source Localization and Improved Attention

    International Nuclear Information System (INIS)

    Lee, Byoung Gi; Choi, Jong Suk; Yoon, Sang Suk; Choi, Mun Taek; Kim, Mun Sang; Kim, Dai Jin

    2011-01-01

    Service robots are equipped with various sensors such as vision camera, sonar sensor, laser scanner, and microphones. Although these sensors have their own functions, some of them can be made to work together and perform more complicated functions. Audio-visual fusion is a typical and powerful combination of audio and video sensors, because audio information is complementary to visual information and vice versa. Human beings also mainly depend on visual and auditory information in their daily life. In this paper, we conduct two studies using audio-vision fusion: one is on enhancing the performance of sound localization, and the other is on improving robot attention through sound localization and face detection.

  20. On the relative importance of audio and video in the presence of packet losses

    DEFF Research Database (Denmark)

    Korhonen, Jari; Reiter, Ulrich; Myakotnykh, Eugene

    2010-01-01

    In streaming applications, unequal protection of audio and video tracks may be necessary to maintain the optimal perceived overall quality. For this purpose, the application should be aware of the relative importance of audio and video in an audiovisual sequence. In this paper, we propose...... a subjective test arrangement for finding the optimal tradeoff between subjective audio and video qualities in situations when it is not possible to have perfect quality for both modalities concurrently. Our results show that content has a significant impact on the preferred compromise between audio...... and video quality, but also that the currently used classification criteria for content are not sufficient to predict the users’ preference...

  1. Audio-Visual Fusion for Sound Source Localization and Improved Attention

    Energy Technology Data Exchange (ETDEWEB)

    Lee, Byoung Gi; Choi, Jong Suk; Yoon, Sang Suk; Choi, Mun Taek; Kim, Mun Sang [Korea Institute of Science and Technology, Daejeon (Korea, Republic of); Kim, Dai Jin [Pohang University of Science and Technology, Pohang (Korea, Republic of)

    2011-07-15

    Service robots are equipped with various sensors such as vision camera, sonar sensor, laser scanner, and microphones. Although these sensors have their own functions, some of them can be made to work together and perform more complicated functions. Audio-visual fusion is a typical and powerful combination of audio and video sensors, because audio information is complementary to visual information and vice versa. Human beings also mainly depend on visual and auditory information in their daily life. In this paper, we conduct two studies using audio-vision fusion: one is on enhancing the performance of sound localization, and the other is on improving robot attention through sound localization and face detection.

  2. Paper-Based Textbooks with Audio Support for Print-Disabled Students.

    Science.gov (United States)

    Fujiyoshi, Akio; Ohsawa, Akiko; Takaira, Takuya; Tani, Yoshiaki; Fujiyoshi, Mamoru; Ota, Yuko

    2015-01-01

    Utilizing invisible 2-dimensional codes and digital audio players with a 2-dimensional code scanner, we developed paper-based textbooks with audio support for students with print disabilities, called "multimodal textbooks." Multimodal textbooks can be read with the combination of the two modes: "reading printed text" and "listening to the speech of the text from a digital audio player with a 2-dimensional code scanner." Since multimodal textbooks look the same as regular textbooks and the price of a digital audio player is reasonable (about 30 euro), we think multimodal textbooks are suitable for students with print disabilities in ordinary classrooms.

  3. Incorporating Data Link Features into a Multi-Function Display to Support Self-Separation and Spacing Tasks for General Aviation Pilots

    Science.gov (United States)

    Adams, Catherine A.; Murdoch, Jennifer L.; Consiglio, Maria C.; Williams, Daniel M.

    2005-01-01

    One objective of the Small Aircraft Transportation System (SATS) Higher Volume Operations (HVO) project is to increase the capacity and utilization of small non-towered, non-radar equipped airports by transferring traffic management activities to an automated Airport Management Module (AMM) and separation responsibilities to general aviation (GA) pilots. Implementation of this concept required the development of a research Multi-Function Display (MFD) to support the interactive communications between pilots and the AMM. The interface also had to accommodate traffic awareness, self-separation, and spacing tasks through dynamic messaging and symbology for flight path conformance and conflict detection and alerting (CDA). The display served as the mechanism to support the examination of the viability of executing instrument operations designed for SATS designated airports. Results of simulation and flight experiments conducted at the National Aeronautics and Space Administration's (NASA) Langley Research Center indicate that the concept, as facilitated by the research MFD, did not increase pilots' subjective workload levels or reduce their situation awareness (SA). Post-test usability assessments revealed that pilots preferred using the enhanced MFD to execute flight procedures, reporting improved SA over conventional instrument flight rules (IFR) procedures.

  4. Passive Guaranteed Simulation of Analog Audio Circuits: A Port-Hamiltonian Approach

    Directory of Open Access Journals (Sweden)

    Antoine Falaize

    2016-09-01

    We present a method that generates passive-guaranteed stable simulations of analog audio circuits from electronic schematics for real-time applications. On one hand, this method is based on a continuous-time power-balanced state-space representation structured into its energy-storing parts, dissipative parts, and external sources. On the other hand, a numerical scheme is especially designed to preserve this structure and the power balance. These state-space structures define the class of port-Hamiltonian systems. The derivation of this structured system associated with the electronic circuit is achieved by an automated analysis of the interconnection network combined with a dictionary of models for each elementary component. The numerical scheme is based on the combination of finite differences applied on the state (with respect to the time variable) and on the total energy (with respect to the state). This combination provides a discrete-time version of the power balance. This set of algorithms is valid for both the linear and nonlinear case. Finally, three applications of increasing complexity are given: a diode clipper, a common-emitter bipolar-junction transistor amplifier, and a wah pedal. The results are compared to offline simulations obtained from a popular circuit simulator.
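
    As a rough illustration of the discrete power balance described in this abstract, the following Python sketch applies a discrete-gradient (midpoint) step to a series RLC circuit written in port-Hamiltonian form. The circuit, the component values, and the names `energy`, `step`, and `power_balance_residual` are assumptions made for this example and do not come from the paper. For the quadratic energy used here the discrete gradient reduces to the gradient at the midpoint, so the example covers only the linear case.

```python
import math

# State: q = capacitor charge, p = inductor flux (hypothetical series RLC).
# Stored energy H(q, p) = q^2 / (2C) + p^2 / (2L).

def energy(q, p, L, C):
    """Total stored energy of the capacitor and the inductor."""
    return q * q / (2.0 * C) + p * p / (2.0 * L)

def step(q, p, u, dt, R, L, C):
    """One discrete-gradient step of
         dq/dt = dH/dp
         dp/dt = -dH/dq - R * dH/dp + u
    with the gradient of H evaluated at the midpoint of the step.
    The resulting 2x2 linear system is solved in closed form."""
    denom = 1.0 + dt * R / (2.0 * L) + dt * dt / (4.0 * L * C)
    dp = (dt * (u - q / C - R * p / L) - dt * dt * p / (2.0 * L * C)) / denom
    dq = dt * (p / L + dp / (2.0 * L))
    return q + dq, p + dp

def power_balance_residual(n_steps=200, dt=0.05, R=0.5, L=1.0, C=1.0):
    """Largest mismatch between the change in stored energy and
    dt * (supplied power - dissipated power) over a driven run."""
    q, p, worst = 0.0, 0.0, 0.0
    for k in range(n_steps):
        u = math.sin(0.3 * k)                 # arbitrary input voltage
        q1, p1 = step(q, p, u, dt, R, L, C)
        i_mid = (p + p1) / (2.0 * L)          # midpoint current
        stored = energy(q1, p1, L, C) - energy(q, p, L, C)
        balance = dt * (u * i_mid - R * i_mid * i_mid)
        worst = max(worst, abs(stored - balance))
        q, p = q1, p1
    return worst
```

    Calling `power_balance_residual()` returns a value at the level of floating-point rounding, which is the discrete counterpart of the continuous-time power balance. With `u = 0` the stored energy can only decrease, which is the passivity property the abstract refers to.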

  5. Auto-Associative Recurrent Neural Networks and Long Term Dependencies in Novelty Detection for Audio Surveillance Applications

    Science.gov (United States)

    Rossi, A.; Montefoschi, F.; Rizzo, A.; Diligenti, M.; Festucci, C.

    2017-10-01

    Machine Learning applied to Automatic Audio Surveillance has been attracting increasing attention in recent years. In spite of several investigations based on a large number of different approaches, little attention has been paid to the environmental temporal evolution of the input signal. In this work, we propose an exploration in this direction, comparing the temporal correlations extracted at the feature level with those learned by a representational structure. To this aim we analysed the prediction performance of a Recurrent Neural Network architecture while varying the length of the processed input sequence and the size of the time window used in the feature extraction. Results corroborated the hypothesis that sequential models work better when dealing with data characterized by temporal order. However, so far the optimization of the temporal dimension remains an open issue.

  6. Audio- and TV-products. Power consumption reduction in audio- and TV-products. Final report; Audio- og TV-produkter. Effektminimering i audio- og TV-produkter: Afsluttende rapport

    Energy Technology Data Exchange (ETDEWEB)

    Kierkegaard, P.

    1998-10-01

    The audio products part of the project resulted in energy savings of 90-97%, with efficiencies of 91-96% at full power and stand-by losses of 0.4-3 W. The breakthrough came in particular from new methods for pulse modulation (Controlled Oscillation Modulator, COM, and Phase Shifted Carrier Pulse Width Modulation, PSCPWM) and for error correction in the power conversion (Multivariable Enhanced Cascade Control, MECC, and Pulse Edge Delay Error Correction, PEDEC). Two patents have been applied for, and new digital amplifiers will be introduced in all the relevant products. The TV products part of the project has shown that a loss reduction of ca. 20% may be obtained in deflection circuits. (EHS)

  7. Audio Logo Recognition, Reduced Articulation and Coding Orientation

    DEFF Research Database (Denmark)

    Bonde, Anders; Hansen, Allan Grutt

    2013-01-01

    In this paper we explore an interdisciplinary theoretical framework for the analysis of corporate audio logos and their effectiveness regarding recognisability and identification. This is done by combining three different academic disciplines: 1) social semiotics, 2) branding theory and 3) music...... on musicological descriptors. We consider as a starting point Kress and Van Leeuwen’s (1996, 2006) conceptualisation of ‘modality’, which is central to their ‘visual grammar’ theory and subsequently extended to auditory expressions such as spoken language, music and sound effects (Van Leeuwen, 1999). While...... connected to notions of brand recognisability and brand identification, thus resulting in the concept of ‘Reduced Articulation Form’ (RAF). The concept has been tested empirically through a survey of 137 upper secondary school students. On the basis of a conditioning experiment, manipulating five existing...

  8. Audio collection in the SASA Institute of Musicology

    Directory of Open Access Journals (Sweden)

    Lajić-Mihajlović Danka

    2010-01-01

    The paper concerns the audio collection of the Institute of Musicology SASA, an extremely important part of the institution’s holdings. The collection comprises valuable sound materials, notably significant collections of fieldwork recordings of traditional folk and church music, as well as recordings of works by 19th- and 20th-century Serbian composers. Information on the sound carriers, the methodologies and circumstances in which the recordings were made, and their preservation and further treatment with modern technologies forms part of the history of ethnomusicology and musicology in Serbia. In terms of the number of sound recordings, the time span they cover, the geographical areas represented and its genre diversity, this collection is one of the most important scientific sound collections in Serbia.

  9. A Novel Audio Cryptosystem Using Chaotic Maps and DNA Encoding

    Directory of Open Access Journals (Sweden)

    S. J. Sheela

    2017-01-01

    Chaotic maps have good potential in security applications due to their inherent characteristics relevant to cryptography. This paper introduces a new audio cryptosystem based on chaotic maps, hybrid chaotic shift transform (HCST), and deoxyribonucleic acid (DNA) encoding rules. The scheme uses chaotic maps such as the two-dimensional modified Henon map (2D-MHM) and the standard map. The 2D-MHM, which has sophisticated chaotic behavior for an extensive range of control parameters, is used to perform HCST. DNA encoding technology is used as an auxiliary tool which enhances the security of the cryptosystem. The performance of the algorithm is evaluated for various speech signals using different encryption/decryption quality metrics. The simulation and comparison results show that the algorithm can achieve good encryption results and is able to resist several cryptographic attacks. The various types of analysis revealed that the algorithm is suitable for narrow band radio communication and real-time speech encryption applications.
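
    To make the two ingredients named in this abstract more concrete, here is a toy Python sketch that combines a chaotic keystream with a DNA coding rule. The abstract does not define the 2D-MHM or the HCST, so the sketch substitutes the classic Hénon map, a plain XOR mixing step, and one commonly used 2-bit rule (00→A, 01→C, 10→G, 11→T). All function names and parameter values are hypothetical, and the result illustrates the idea only; it is not a secure cipher and not the authors' algorithm.

```python
BASES = "ACGT"  # assumed coding rule: 00->A, 01->C, 10->G, 11->T

def henon_keystream(n, x, y, a=1.4, b=0.3, burn_in=100):
    """Derive n key bytes from the classic Henon map
    x' = 1 - a*x^2 + y, y' = b*x (initial state acts as the key)."""
    out = []
    for k in range(burn_in + n):
        x, y = 1.0 - a * x * x + y, b * x
        if k >= burn_in:                      # discard the transient
            out.append(int(abs(x) * 1e6) % 256)
    return bytes(out)

def dna_encode(data):
    """Map each byte to four bases, most significant bit pair first."""
    return "".join(BASES[(byte >> shift) & 3]
                   for byte in data for shift in (6, 4, 2, 0))

def dna_decode(strand):
    """Inverse of dna_encode: four bases back to one byte."""
    out = bytearray()
    for i in range(0, len(strand), 4):
        byte = 0
        for base in strand[i:i + 4]:
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)

def encrypt(samples, x0=0.1, y0=0.1):
    """XOR 8-bit audio samples with the keystream, then DNA-encode."""
    key = henon_keystream(len(samples), x0, y0)
    return dna_encode(bytes(s ^ k for s, k in zip(samples, key)))

def decrypt(strand, x0=0.1, y0=0.1):
    """DNA-decode, then XOR with the same keystream."""
    cipher = dna_decode(strand)
    key = henon_keystream(len(cipher), x0, y0)
    return bytes(c ^ k for c, k in zip(cipher, key))
```

    Decryption recovers the samples only when the same initial state `(x0, y0)` is used, which is how the sensitivity of a chaotic map to its initial conditions serves as key material in schemes of this kind.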

  10. Real Time Recognition Of Speakers From Internet Audio Stream

    Directory of Open Access Journals (Sweden)

    Weychan Radoslaw

    2015-09-01

    In this paper we present an automatic speaker recognition technique that uses lossy (encoded) speech signal streams from Internet radio. We show the influence of the audio encoder (e.g., bitrate) on the speaker model quality. The model of each speaker was calculated using the Gaussian mixture model (GMM) approach. Both the speaker recognition and the further analysis were carried out on short utterances to facilitate real-time processing. The neighborhoods of the speaker models were analyzed with the ISOMAP algorithm. The experiments were based on four 1-hour public debates with 7–8 speakers (including the moderator), acquired from Polish radio Internet services. The presented software was developed in the MATLAB environment.

  11. Deep learning, audio adversaries, and music content analysis

    DEFF Research Database (Denmark)

    Kereliuk, Corey Mose; Sturm, Bob L.; Larsen, Jan

    2015-01-01

    We present the concept of adversarial audio in the context of deep neural networks (DNNs) for music content analysis. An adversary is an algorithm that makes minor perturbations to an input that cause major repercussions to the system response. In particular, we design an adversary for a DNN...... that takes as input short-time spectral magnitudes of recorded music and outputs a high-level music descriptor. We demonstrate how this adversary can make the DNN behave in any way with only extremely minor changes to the music recording signal. We show that the adversary cannot be neutralised by a simple...... filtering of the input. Finally, we discuss adversaries in the broader context of the evaluation of music content analysis systems....

  12. Self-oscillating modulators for direct energy conversion audio power amplifiers

    DEFF Research Database (Denmark)

    Ljusev, Petar; Andersen, Michael Andreas E.

    2005-01-01

    Direct energy conversion audio power amplifier represents total integration of switching-mode power supply and Class D audio power amplifier into one compact stage, achieving high efficiency, high level of integration, low component count and eventually low cost. This paper presents how self-oscillating...

  13. Approaches to building single-stage AC/AC conversion switch-mode audio power amplifiers

    DEFF Research Database (Denmark)

    Ljusev, Petar; Andersen, Michael Andreas E.

    2004-01-01

    This paper discusses the possible topologies and promising approaches towards direct single-phase AC-AC conversion of the mains voltage for audio applications. When compared to standard Class-D switching audio power amplifiers with a separate power supply, it is expected that direct conversion...

  14. Effects of Audio-Visual Information on the Intelligibility of Alaryngeal Speech

    Science.gov (United States)

    Evitts, Paul M.; Portugal, Lindsay; Van Dine, Ami; Holler, Aline

    2010-01-01

    Background: There is minimal research on the contribution of visual information on speech intelligibility for individuals with a laryngectomy (IWL). Aims: The purpose of this project was to determine the effects of mode of presentation (audio-only, audio-visual) on alaryngeal speech intelligibility. Method: Twenty-three naive listeners were…

  15. Audio Control Handbook For Radio and Television Broadcasting. Third Revised Edition.

    Science.gov (United States)

    Oringel, Robert S.

    Audio control is the operation of all the types of sound equipment found in the studios and control rooms of a radio or television station. Written in a nontechnical style for beginners, the book explains thoroughly the operation of all types of audio equipment. Diagrams and photographs of commercial consoles, microphones, turntables, and tape…

  16. A Perceptual Model for Sinusoidal Audio Coding Based on Spectral Integration

    NARCIS (Netherlands)

    Van de Par, S.; Kohlrausch, A.; Heusdens, R.; Jensen, J.; Holdt Jensen, S.

    2005-01-01

    Psychoacoustical models have been used extensively within audio coding applications over the past decades. Recently, parametric coding techniques have been applied to general audio and this has created the need for a psychoacoustical model that is specifically suited for sinusoidal modelling of

  17. A perceptual model for sinusoidal audio coding based on spectral integration

    NARCIS (Netherlands)

    Van de Par, S.; Kohlrausch, A.; Heusdens, R.; Jensen, J.; Jensen, S.H.

    2005-01-01

    Psychoacoustical models have been used extensively within audio coding applications over the past decades. Recently, parametric coding techniques have been applied to general audio and this has created the need for a psychoacoustical model that is specifically suited for sinusoidal modelling of

  18. Changes of the Prefrontal EEG (Electroencephalogram) Activities According to the Repetition of Audio-Visual Learning.

    Science.gov (United States)

    Kim, Yong-Jin; Chang, Nam-Kee

    2001-01-01

    Investigates the changes of neuronal response according to a four time repetition of audio-visual learning. Obtains EEG data from the prefrontal (Fp1, Fp2) lobe from 20 subjects at the 8th grade level. Concludes that the habituation of neuronal response shows up in repetitive audio-visual learning and brain hemisphericity can be changed by…

  19. A Psychoacoustic-Based Multiple Audio Object Coding Approach via Intra-Object Sparsity

    Directory of Open Access Journals (Sweden)

    Maoshen Jia

    2017-12-01

    Rendering spatial sound scenes via audio objects has become popular in recent years, since it can provide more flexibility for different auditory scenarios, such as 3D movies, spatial audio communication and virtual classrooms. To facilitate high-quality bitrate-efficient distribution for spatial audio objects, an encoding scheme based on intra-object sparsity (approximate k-sparsity of the audio object itself) is proposed in this paper. A statistical analysis is presented to validate the notion that the audio object has a stronger sparseness in the Modified Discrete Cosine Transform (MDCT) domain than in the Short Time Fourier Transform (STFT) domain. By exploiting intra-object sparsity in the MDCT domain, multiple simultaneously occurring audio objects are compressed into a mono downmix signal with side information. To ensure a balanced perception quality of audio objects, a psychoacoustic-based time-frequency instants sorting algorithm and an energy equalized Number of Preserved Time-Frequency Bins (NPTF) allocation strategy are proposed, which are employed in the underlying compression framework. The downmix signal can be further encoded via the Scalar Quantized Vector Huffman Coding (SQVH) technique at a desirable bitrate, and the side information is transmitted in a lossless manner. Both objective and subjective evaluations show that the proposed encoding scheme outperforms the Sparsity Analysis (SPA) approach and Spatial Audio Object Coding (SAOC) in cases where eight objects were jointly encoded.
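
    One way to read "approximate k-sparsity" is that a small number of transform coefficients carries most of a frame's energy. The Python sketch below, offered under that assumption, computes an unwindowed MDCT of a single frame directly from its textbook definition and reports the fraction of energy held by the K largest coefficients. The function names `mdct` and `top_k_energy_fraction` are hypothetical, and the sketch does not reproduce the paper's sorting algorithm or NPTF allocation.

```python
import math

def mdct(frame):
    """Direct (O(N^2)) MDCT of one frame of length 2N, giving N
    coefficients: X[k] = sum_n x[n] cos(pi/N (n + 0.5 + N/2)(k + 0.5)).
    No analysis window is applied in this simplified version."""
    n_half = len(frame) // 2
    n0 = 0.5 + n_half / 2.0
    return [
        sum(x * math.cos(math.pi / n_half * (n + n0) * (k + 0.5))
            for n, x in enumerate(frame))
        for k in range(n_half)
    ]

def top_k_energy_fraction(coeffs, k):
    """Share of the total energy carried by the k largest coefficients."""
    energies = sorted((c * c for c in coeffs), reverse=True)
    total = sum(energies)
    return sum(energies[:k]) / total if total > 0.0 else 0.0
```

    A frame is close to k-sparse when `top_k_energy_fraction(mdct(frame), k)` is near 1 for a small `k`. A codec built on this idea could keep only those bins and record their positions as side information.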

  20. 106-17 Telemetry Standards Digitized Audio Telemetry Standard Chapter 5

    Science.gov (United States)

    2017-07-01

    Digitized Audio Telemetry Standard. 5.1 General. This chapter defines continuously variable slope delta (CVSD) modulation as the standard for digitizing ... audio signal. The CVSD modulator is, in essence, a 1-bit analog-to-digital converter. The output of this 1-bit encoder is a serial bit stream, where
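
    The 1-bit encoder described in this excerpt can be sketched in a few lines of Python. The version below is a simplified illustration: the step size grows when the last three bits agree and shrinks otherwise, and the class name `CvsdIntegrator` together with all numeric constants are assumptions chosen for the example, not values taken from the standard. A compliant implementation would use the filters and parameters that the standard specifies.

```python
import math

class CvsdIntegrator:
    """State shared by encoder and decoder so both track the same estimate."""

    def __init__(self, min_step=0.005, max_step=0.1,
                 grow=1.5, shrink=0.95, run=3):
        self.min_step, self.max_step = min_step, max_step
        self.grow, self.shrink, self.run = grow, shrink, run
        self.step = min_step      # current slope (step size)
        self.estimate = 0.0       # reconstructed signal value
        self.recent = []          # last `run` bits

    def update(self, bit):
        """Adapt the step from the recent bit pattern, then integrate."""
        self.recent = (self.recent + [bit])[-self.run:]
        if len(self.recent) == self.run and len(set(self.recent)) == 1:
            # A run of equal bits signals slope overload: enlarge the step.
            self.step = min(self.step * self.grow, self.max_step)
        else:
            self.step = max(self.step * self.shrink, self.min_step)
        self.estimate += self.step if bit else -self.step
        return self.estimate

def cvsd_encode(samples):
    """Emit 1 when the input is above the running estimate, else 0."""
    state, bits = CvsdIntegrator(), []
    for s in samples:
        bit = 1 if s > state.estimate else 0
        bits.append(bit)
        state.update(bit)
    return bits

def cvsd_decode(bits):
    """Rebuild the signal by replaying the same integrator on the bits."""
    state = CvsdIntegrator()
    return [state.update(b) for b in bits]

def rms_error(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))

# Example: a 200 Hz tone sampled (and bit-clocked) at 16 kHz.
tone = [0.5 * math.sin(2 * math.pi * 200 * n / 16000) for n in range(1600)]
bits = cvsd_encode(tone)
decoded = cvsd_decode(bits)
```

    Because the decoder replays exactly the same integrator as the encoder, only the serial bit stream has to be transmitted. `rms_error(tone, decoded)` gives a quick measure of how closely the reconstruction follows the input for a given choice of constants.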

  1. Toward Personal and Emotional Connectivity in Mobile Higher Education through Asynchronous Formative Audio Feedback

    Science.gov (United States)

    Rasi, Päivi; Vuojärvi, Hanna

    2018-01-01

    This study aims to develop asynchronous formative audio feedback practices for mobile learning in higher education settings. The development was conducted in keeping with the principles of design-based research. The research activities focused on an inter-university online course, within which the use of instructor audio feedback was tested,…

  2. Estimation of the energy ratio between primary and ambience components in stereo audio data

    NARCIS (Netherlands)

    Harma, A.S.

    2011-01-01

    A stereo audio signal is often modeled as a mixture of instantaneously mixed primary components and uncorrelated ambience components. This paper focuses on the estimation of the primary-to-ambience energy ratio, PAR. This measure is useful for signal decomposition in stereo and multichannel audio

  3. 16 CFR 307.8 - Requirements for disclosure in audiovisual and audio advertising.

    Science.gov (United States)

    2010-01-01

    ... 16 Commercial Practices 1 2010-01-01 2010-01-01 false Requirements for disclosure in audiovisual and audio advertising. 307.8 Section 307.8 Commercial Practices FEDERAL TRADE COMMISSION REGULATIONS... ACT OF 1986 Advertising Disclosures § 307.8 Requirements for disclosure in audiovisual and audio...

  4. Quick Response (QR) Codes for Audio Support in Foreign Language Learning

    Science.gov (United States)

    Vigil, Kathleen Murray

    2017-01-01

    This study explored the potential benefits and barriers of using quick response (QR) codes as a means by which to provide audio materials to middle-school students learning Spanish as a foreign language. Eleven teachers of Spanish to middle-school students created transmedia materials containing QR codes linking to audio resources. Students…

  5. An Exploratory Evaluation of User Interfaces for 3D Audio Mixing

    DEFF Research Database (Denmark)

    Gelineck, Steven; Korsgaard, Dannie Michael

    2015-01-01

    The paper presents an exploratory evaluation comparing different versions of a mid-air gesture based interface for mixing 3D audio exploring: (1) how such an interface generally compares to a more traditional physical interface, (2) methods for grabbing/releasing audio channels in mid-air and (3...

  6. An Analog I/O Interface Board for Audio Arduino Open Sound Card System

    DEFF Research Database (Denmark)

    Dimitrov, Smilen; Serafin, Stefania

    2011-01-01

    AudioArduino [1] is a system consisting of an ALSA (Advanced Linux Sound Architecture) audio driver and corresponding microcontroller code that can demonstrate full-duplex, mono, 8-bit, 44.1 kHz soundcard behavior on an FTDI based Arduino. While the basic operation as a soundcard can...

  7. A Preliminary Investigation into the Search Behaviour of Users in a Collection of Digitized Broadcast Audio

    DEFF Research Database (Denmark)

    Lund, Haakon; Skov, Mette; Larsen, Birger

    2014-01-01

    An increasing number of large digitized audio-visual collections within digital humanities have recently been made available for users. Often access to digitized audio-visual collections is hampered by little and inconsistent metadata. This paper presents the preliminary findings from a study of ...

  8. Vertigo with sudden hearing loss: audio-vestibular characteristics.

    Science.gov (United States)

    Pogson, Jacob M; Taylor, Rachael L; Young, Allison S; McGarvie, Leigh A; Flanagan, Sean; Halmagyi, G Michael; Welgampola, Miriam S

    2016-10-01

    Acute vertigo with sudden sensorineural hearing loss (SSNHL) is a rare clinical emergency. Here, we report the audio-vestibular test profiles of 27 subjects who presented with these symptoms. The vestibular test battery consisted of a three-dimensional video head impulse test (vHIT) of semicircular canal function and recording ocular and cervical vestibular-evoked myogenic potentials (oVEMP, cVEMP) to test otolith dysfunction. Unlike vestibular neuritis, where the horizontal and anterior canals with utricular function are more frequently impaired, 74 % of subjects with vertigo and SSNHL demonstrated impairment of the posterior canal gain (0.45 ± 0.20). Only 41 % showed impairment of the horizontal canal gains (0.78 ± 0.27) and 30 % of the anterior canal gains (0.79 ± 0.26), while 38 % of oVEMPs [asymmetry ratio (AR) = 41.0 ± 41.3 %] and 33 % of cVEMPs (AR = 47.3 ± 41.2 %) were significantly asymmetrical. Twenty-three subjects were diagnosed with labyrinthitis/labyrinthine infarction in the absence of evidence for an underlying pathology. Four subjects had a definitive diagnosis [Ramsay Hunt Syndrome, vestibular schwannoma, anterior inferior cerebellar artery (AICA) infarction, and traction injury]. Ischemia involving the common-cochlear or vestibulo-cochlear branches of the labyrinthine artery could be the simplest explanation for vertigo with SSNHL. Audio-vestibular tests did not provide easy separation between ischaemic and non-ischaemic causes of vertigo with SSNHL.

  9. Acceleration observed in an audio air gas discharge

    International Nuclear Information System (INIS)

    Ragheb, M.S.

    2010-01-01

    An audio air gas discharge enclosed in Pyrex glass 34 mm in diameter and 25 cm long led to the observation of an unusual phenomenon. Relatively large injected light spots of intense brightness, distributed regularly on the contour and in the center of one of the discharge electrodes, are observed. Both electrodes become very hot; the hotter of the two reaches 660 °C in 3-4 minutes. A series of photographs and recorded video films define and clarify the sequence of events that describe the observed phenomenon. The plasma is created by applying audio power at 10 kHz, from a power supply of up to 500 W, through the electrodes of the air gas discharge. The discharge voltage is up to 900 V, and the discharge current flowing through the plasma reaches 360 mA. The discharge system must reach its optimal working conditions in order to produce the phenomenon. The resulting plasma is classified as lying at the maximum-condition border of a γ-type discharge. Under these conditions, the corresponding maximum electron temperature and density are 16 eV and 10^15 cm^-3, respectively. The observation system succeeded in revealing and clarifying the sequence of events. In addition, the effects on the electrode surfaces are investigated and analyzed by means of the scanning electron microscope and energy-dispersive X-ray systems. The optical observations, together with the micrographs and surface microanalysis, demonstrate that groups of energetic agglomerates collide with the electrode surface. Detailed interpretation of the phenomenon suggests a molecular acceleration in which the molecules gain their energy from the plasma formed under optimal discharge working conditions. Consequently, given the size of the ion agglomerates, this procedure could be considered a mesoscopic acceleration technique.

  10. A high-resolution atlas of the infrared spectrum of the Sun and the Earth atmosphere from space. Volume 3: Key to identification of solar features

    Science.gov (United States)

    Geller, Murray

    1992-01-01

    During the period April 29 through May 2, 1985, the Atmospheric Trace Molecule Spectroscopy (ATMOS) experiment was operated as part of the Spacelab-3 (SL-3) payload on the shuttle Challenger. The instrument, a Fourier transform spectrometer, recorded over 2000 infrared solar spectra from an altitude of 360 km. Although the majority of the spectra were taken through the limb of the Earth's atmosphere in order to better understand its composition, several hundred of the 'high-sun' spectra were completely free from telluric absorption. These high-sun spectra recorded from space are, at the present time, the only high-resolution infrared spectra ever taken of the Sun free from absorptions due to constituents in the Earth's atmosphere. Volumes 1 and 2 of this series provide a compilation of these spectra arranged in a format suitable for quick-look reference purposes and are the first record of the continuous high-resolution infrared spectrum of the Sun and the Earth's atmosphere from space. In the Table of Identifications, which constitutes the main body of this volume, each block of eight wavenumbers is given a separate heading and corresponds to a page of two panels in Volume 1 of this series. In addition, three separate blocks of data available from ATMOS from 622-630 cm^-1, 630-638 cm^-1 and 638-646 cm^-1, excluded from Volume 1 because of the low signal-to-noise ratio, have been included due to the certain identification of several OH and NH transitions. In the first column of the table, the corrected frequency is given. The second column identifies the molecular species. The third and fourth columns represent the assigned transition. The fifth column gives the depth of the molecular line in millimeters. Also included in this column is a notation to indicate whether the line is a blend or lies on the shoulder(s) of another line(s). The final column repeats a question mark if the line is unidentified.

  11. An Interactive Concert Program Based on Infrared Watermark and Audio Synthesis

    Science.gov (United States)

    Wang, Hsi-Chun; Lee, Wen-Pin Hope; Liang, Feng-Ju

    The objective of this research is to propose a video/audio system which allows the user to listen to the typical music notes in the concert program under infrared detection. The system synthesizes audio with different pitches and tempi in accordance with the encoded data in a 2-D barcode embedded in the infrared watermark. The digital halftoning technique has been used to fabricate the infrared watermark composed of halftone dots by both amplitude modulation (AM) and frequency modulation (FM). The results show that this interactive system successfully recognizes the barcode and synthesizes audio under infrared detection of a concert program which is also valid for human observation of the contents. This interactive video/audio system has greatly expanded the capability of the printout paper to audio display and also has many potential value-added applications.

  12. Sounding ruins: reflections on the production of an ‘audio drift’

    Science.gov (United States)

    Gallagher, Michael

    2014-01-01

    This article is about the use of audio media in researching places, which I term ‘audio geography’. The article narrates some episodes from the production of an ‘audio drift’, an experimental environmental sound work designed to be listened to on a portable MP3 player whilst walking in a ruinous landscape. Reflecting on how this work functions, I argue that, as well as representing places, audio geography can shape listeners’ attention and bodily movements, thereby reworking places, albeit temporarily. I suggest that audio geography is particularly apt for amplifying the haunted and uncanny qualities of places. I discuss some of the issues raised for research ethics, epistemology and spectral geographies. PMID:29708107

  13. Feature Extraction

    CERN Document Server

    CERN. Geneva

    2015-01-01

    Feature selection and reduction are key to robust multivariate analyses. In this talk I will discuss the pros and cons of various variable selection methods, focusing on those that are most relevant in the context of HEP.

  14. Solar Features

    Data.gov (United States)

    National Oceanic and Atmospheric Administration, Department of Commerce — Collection includes a variety of solar feature datasets contributed by a number of national and private solar observatories located worldwide.

  15. Site Features

    Data.gov (United States)

    U.S. Environmental Protection Agency — This dataset consists of various site features from multiple Superfund sites in U.S. EPA Region 8. These data were acquired from multiple sources at different times...

  16. Self-guided Depression Treatment on Long-duration Space Flights: A Continuation Study

    Data.gov (United States)

    National Aeronautics and Space Administration — During 2008-2009, we completed development and alpha-testing (debugging) of a depression treatment computer program. The program uses video, audio, graphics, and...

  17. Effect of Audio Coaching on Correlation of Abdominal Displacement With Lung Tumor Motion

    International Nuclear Information System (INIS)

    Nakamura, Mitsuhiro; Narita, Yuichiro; Matsuo, Yukinori; Narabayashi, Masaru; Nakata, Manabu; Sawada, Akira; Mizowaki, Takashi; Nagata, Yasushi; Hiraoka, Masahiro

    2009-01-01

    Purpose: To assess the effect of audio coaching on the time-dependent behavior of the correlation between abdominal motion and lung tumor motion and the corresponding lung tumor position mismatches. Methods and Materials: Six patients who had a lung tumor with a motion range >8 mm were enrolled in the present study. Breathing-synchronized fluoroscopy was performed initially without audio coaching, followed by fluoroscopy with recorded audio coaching for multiple days. Two different measurements, anteroposterior abdominal displacement using the real-time positioning management system and superoinferior (SI) lung tumor motion by X-ray fluoroscopy, were performed simultaneously. Their sequential images were recorded using one display system. The lung tumor position was automatically detected with a template matching technique. The relationship between the abdominal and lung tumor motion was analyzed with and without audio coaching. Results: The mean SI tumor displacement was 10.4 mm without audio coaching and increased to 23.0 mm with audio coaching (p < .01). The correlation coefficients ranged from 0.89 to 0.97 with free breathing. Applying audio coaching, the correlation coefficients improved significantly (range, 0.93-0.99; p < .01), and the SI lung tumor position mismatches became larger in 75% of all sessions. Conclusion: Audio coaching served to increase the degree of correlation and make it more reproducible. In addition, the phase shifts between tumor motion and abdominal displacement were improved; however, all patients breathed more deeply, and the SI lung tumor position mismatches became slightly larger with audio coaching than without audio coaching.

  18. Television and the Internet: The Role Digital Technologies Play in Adolescents’ Audio-Visual Media Consumption. Young Television Audiences in Catalonia (Spain)

    Directory of Open Access Journals (Sweden)

    Meritxell Roca

    2014-03-01

    The aim of this reported study was to investigate adolescents’ TV consumption habits and perceptions. Although there appears to be no general consensus on how the Internet affects TV consumption by teenagers, and data vary depending on the country, according to our study, Spanish adolescents perceive television as a habit “of the past” and find the computer a device more suited to their recreational and audio-visual consumption needs. The data obtained from eight focus groups of teenagers aged between 12 and 18 and an online survey sent to their parents show that watching TV is an activity usually linked to the home’s communal spaces. On the contrary, online audio-visual consumption (understood as a wider term not limited to just TV shows) is perceived by adolescents as a more convenient activity as it adapts to their own schedules and needs.

  19. Using listener-based perceptual features as intermediate representations in music information retrieval.

    Science.gov (United States)

    Friberg, Anders; Schoonderwaldt, Erwin; Hedblad, Anton; Fabiani, Marco; Elowsson, Anders

    2014-10-01

    The notion of perceptual features is introduced for describing general music properties based on human perception. This is an attempt at rethinking the concept of features, aiming to approach the underlying human perception mechanisms. Instead of using concepts from music theory such as tones, pitches, and chords, a set of nine features describing overall properties of the music was selected. They were chosen from qualitative measures used in psychology studies and motivated from an ecological approach. The perceptual features were rated in two listening experiments using two different data sets. They were modeled both from symbolic and audio data using different sets of computational features. Ratings of emotional expression were predicted using the perceptual features. The results indicate that (1) at least some of the perceptual features are reliable estimates; (2) emotion ratings could be predicted by a small combination of perceptual features with an explained variance from 75% to 93% for the emotional dimensions activity and valence; (3) the perceptual features could only to a limited extent be modeled using existing audio features. Results clearly indicated that a small number of dedicated features were superior to a "brute force" model using a large number of general audio features.

  20. Computerized Audio-Visual Instructional Sequences (CAVIS): A Versatile System for Listening Comprehension in Foreign Language Teaching.

    Science.gov (United States)

    Aleman-Centeno, Josefina R.

    1983-01-01

    Discusses the development and evaluation of CAVIS, which consists of an Apple microcomputer used with audiovisual dialogs. Includes research on the effects of three conditions: (1) computer with audio and visual, (2) computer with audio alone and (3) audio alone in short-term and long-term recall. (EKN)