WorldWideScience

Sample records for based audio steganography

  1. Audio Steganography with Embedded Text

    Science.gov (United States)

    Teck Jian, Chua; Chai Wen, Chuah; Rahman, Nurul Hidayah Binti Ab.; Hamid, Isredza Rahmi Binti A.

    2017-08-01

    Audio steganography hides a secret message inside an audio signal. It is a technique used to secure the transmission of secret information or to hide its existence. It may also provide confidentiality if the message is encrypted. To date, most steganography software, such as Mp3Stego and DeepSound, uses a block cipher such as the Advanced Encryption Standard or the Data Encryption Standard to encrypt the secret message. This is good security practice. However, if the secret message is long, the encrypted message may become too long to embed and may distort the cover audio. Hence, there is a need to encrypt the message with a stream cipher before embedding it into the audio. A stream cipher encrypts bit by bit, whereas a block cipher encrypts fixed-length blocks of bits, which results in a longer output than a stream cipher produces. Accordingly, an audio steganography scheme that embeds text encrypted with the Rivest Cipher 4 (RC4) stream cipher is designed, developed and tested in this project.
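
    The length argument above can be illustrated with a minimal sketch. It implements textbook RC4 and compares the ciphertext length with the length a block cipher would produce after padding to 16-byte blocks. The function names and the PKCS#7-style padding rule are assumptions made for illustration; the abstract does not describe the project's actual implementation.

```python
def rc4(key: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt data with RC4 (the same operation does both)."""
    # Key-scheduling algorithm (KSA): permute the state using the key.
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) % 256
        s[i], s[j] = s[j], s[i]
    # Pseudo-random generation algorithm (PRGA): XOR data with the keystream.
    out = bytearray()
    i = j = 0
    for byte in data:
        i = (i + 1) % 256
        j = (j + s[i]) % 256
        s[i], s[j] = s[j], s[i]
        out.append(byte ^ s[(s[i] + s[j]) % 256])
    return bytes(out)


def padded_block_length(n: int, block_size: int = 16) -> int:
    """Length of an n-byte message after PKCS#7-style padding (assumed rule)."""
    return (n // block_size + 1) * block_size


message = b"meet at dawn"
ciphertext = rc4(b"Key", message)

# The stream cipher output is exactly as long as the input, while the padded
# block cipher output is rounded up to a whole number of blocks.
print(len(message), len(ciphertext), padded_block_length(len(message)))
```

    For a 12-byte message the RC4 ciphertext stays at 12 bytes, whereas a padded 16-byte block cipher would need 16 bytes of embedding capacity.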

  2. AUDIO CRYPTANALYSIS- AN APPLICATION OF SYMMETRIC KEY CRYPTOGRAPHY AND AUDIO STEGANOGRAPHY

    Directory of Open Access Journals (Sweden)

    Smita Paira

    2016-09-01

    Full Text Available With recent trends in networking and technology, cryptography and steganography have emerged as essential elements of network security. Although cryptography plays a major role in transforming a secret message into an encrypted version, it has certain drawbacks. Steganography addresses one of the basic limitations of cryptography. In this paper, a new algorithm is proposed based on both symmetric key cryptography and audio steganography. Combining a randomly generated symmetric key with the LSB technique of audio steganography makes a secret message unrecognizable when it is sent through an insecure medium. The stego file generated is almost lossless and gives 100 percent recovery of the original message. This paper also presents a detailed experimental analysis of the algorithm, a brief comparison with other existing algorithms, and future scope. The experimental verification and security results are promising.
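
    A minimal sketch of the general idea, assuming the symmetric step is a simple XOR with a random key of the same length as the message and that the cover is a list of integer PCM samples. The paper's actual cipher and embedding order are not given in the abstract, so every name below is hypothetical.

```python
import secrets


def to_bits(data: bytes) -> list[int]:
    """Expand bytes into a list of bits, most significant bit first."""
    return [(byte >> shift) & 1 for byte in data for shift in range(7, -1, -1)]


def from_bits(bits: list[int]) -> bytes:
    """Pack a list of bits (MSB first) back into bytes."""
    out = bytearray()
    for k in range(0, len(bits), 8):
        value = 0
        for bit in bits[k:k + 8]:
            value = (value << 1) | bit
        out.append(value)
    return bytes(out)


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """Symmetric XOR step (assumed here; not taken from the paper)."""
    return bytes(d ^ k for d, k in zip(data, key))


def embed(samples: list[int], payload: bytes) -> list[int]:
    """Write each payload bit into the least significant bit of one sample."""
    bits = to_bits(payload)
    if len(bits) > len(samples):
        raise ValueError("cover audio too short for payload")
    stego = list(samples)
    for idx, bit in enumerate(bits):
        stego[idx] = (stego[idx] & ~1) | bit
    return stego


def extract(samples: list[int], n_bytes: int) -> bytes:
    """Read n_bytes of payload back from the sample LSBs."""
    return from_bits([s & 1 for s in samples[:n_bytes * 8]])


cover = [((i * 37) % 2001) - 1000 for i in range(200)]  # synthetic samples
message = b"hello"
key = secrets.token_bytes(len(message))

stego = embed(cover, xor_bytes(message, key))
recovered = xor_bytes(extract(stego, len(message)), key)
print(recovered)
```

    Each sample changes by at most one quantization step, which is why LSB embedding is described as almost lossless for the cover while the message itself is recovered exactly.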

  3. Exploring the Implementation of Steganography Protocols on Quantum Audio Signals

    Science.gov (United States)

    Chen, Kehan; Yan, Fei; Iliyasu, Abdullah M.; Zhao, Jianping

    2018-02-01

    Two quantum audio steganography (QAS) protocols are proposed, each of which manipulates or modifies the least significant qubit (LSQb) of the host quantum audio signal that is encoded as FRQA (flexible representation of quantum audio) audio content. The first protocol (i.e. the conventional LSQb QAS protocol or simply the cLSQ stego protocol) is built on the exchanges between qubits encoding the quantum audio message and the LSQb of the amplitude information in the host quantum audio samples. In the second protocol, the embedding procedure implants information from a quantum audio message deep into the constraint-imposed most significant qubit (MSQb) of the host quantum audio samples; we refer to it as the pseudo MSQb QAS protocol or simply the pMSQ stego protocol. The cLSQ stego protocol is designed to guarantee high imperceptibility between the host quantum audio and its stego version, whereas the pMSQ stego protocol ensures that the resulting stego quantum audio signal is more immune to illicit tampering and copyright violations (a.k.a. robustness). Built on the circuit model of quantum computation, the circuit networks to execute the embedding and extraction algorithms of both QAS protocols are determined and simulation-based experiments are conducted to demonstrate their implementation. Outcomes attest that both protocols offer promising trade-offs in terms of imperceptibility and robustness.

  4. PIXEL PATTERN BASED STEGANOGRAPHY ON IMAGES

    Directory of Open Access Journals (Sweden)

    R. Rejani

    2015-02-01

    Full Text Available One drawback of most existing steganography methods is that they alter the bits used for storing color information; examples include LSB- and MSB-based steganography. There are also various existing methods, such as the Dynamic RGB Intensity Based Steganography Scheme and Secure RGB Image Steganography from Pixel Indicator to Triple Algorithm, that can be used to find out the steganography method used and break it. Another drawback of existing methods is that they add noise to the image, which makes it look dull or grainy and may lead a viewer to suspect that a hidden message exists within it. To overcome these shortcomings, we propose a pixel pattern based steganography that hides the message within an image by using the existing RGB values wherever possible at the pixel level, or with minimal changes. Along with the image, a key is used to decrypt the message stored at the pixel level. For further protection, both the stored message and the key file are encrypted, with the same or different keys for decryption. Hence we call it RGB pixel pattern based steganography.

  5. Effective Electrocardiogram Steganography Based on Coefficient Alignment.

    Science.gov (United States)

    Yang, Ching-Yu; Wang, Wen-Fong

    2016-03-01

    This study presents two types of data hiding methods based on coefficient alignment for electrocardiogram (ECG) signals, namely, lossy and reversible ECG steganography. The lossy method is divided into high-quality and high-capacity ECG steganography, both of which are capable of hiding confidential patient data in ECG signals. The reversible data hiding method can not only hide secret messages but also completely restore the original ECG signal after bit extraction. Simulations confirmed that the perceived quality generated by the lossy ECG steganography methods was good, while hiding capacity was acceptable. In addition, these methods have a certain degree of robustness, which is rare in conventional ECG steganography schemes. Moreover, the proposed reversible ECG steganography method can not only successfully extract hidden messages but also completely recover the original ECG data.

  6. Quantum steganography using prior entanglement

    Energy Technology Data Exchange (ETDEWEB)

    Mihara, Takashi, E-mail: mihara@toyo.jp

    2015-06-05

    Steganography is the hiding of secret information within innocent-looking information (e.g., text, audio, image, video, etc.). A quantum version of steganography is a method based on quantum physics. In this paper, we propose quantum steganography by combining quantum error-correcting codes with prior entanglement. In many steganographic techniques, embedding secret messages in error-correcting codes may cause damage to them if the embedded part is corrupted. However, our proposed steganography can separately create secret messages and the content of cover messages. The intrinsic form of the cover message does not have to be modified for embedding secret messages. - Highlights: • Our steganography combines quantum error-correcting codes with prior entanglement. • Our steganography can separately create secret messages and the content of cover messages. • Errors in cover messages do not affect the recovery of secret messages. • We embed a secret message in the Steane code as an example of our steganography.

  7. Quantum steganography using prior entanglement

    International Nuclear Information System (INIS)

    Mihara, Takashi

    2015-01-01

    Steganography is the hiding of secret information within innocent-looking information (e.g., text, audio, image, video, etc.). A quantum version of steganography is a method based on quantum physics. In this paper, we propose quantum steganography by combining quantum error-correcting codes with prior entanglement. In many steganographic techniques, embedding secret messages in error-correcting codes may cause damage to them if the embedded part is corrupted. However, our proposed steganography can separately create secret messages and the content of cover messages. The intrinsic form of the cover message does not have to be modified for embedding secret messages. - Highlights: • Our steganography combines quantum error-correcting codes with prior entanglement. • Our steganography can separately create secret messages and the content of cover messages. • Errors in cover messages do not affect the recovery of secret messages. • We embed a secret message in the Steane code as an example of our steganography.

  8. Blind Linguistic Steganalysis against Translation Based Steganography

    Science.gov (United States)

    Chen, Zhili; Huang, Liusheng; Meng, Peng; Yang, Wei; Miao, Haibo

    Translation based steganography (TBS) is a relatively new and secure kind of linguistic steganography. It takes advantage of the "noise" created by automatic translation of natural language text to encode the secret information. To date, there is little research on steganalysis against this kind of linguistic steganography. In this paper, a blind steganalytic method, named natural frequency zoned word distribution analysis (NFZ-WDA), is presented. This method improves on a previously proposed linguistic steganalysis method based on word distribution, which targets the detection of linguistic steganography such as nicetext and texto. The new method aims to detect the application of TBS and uses no information about TBS itself. Its only resource is a word frequency dictionary obtained from a large corpus, a so-called natural frequency dictionary, so it is totally blind. To verify the effectiveness of NFZ-WDA, two experiments are carried out, with two-class and multi-class SVM classifiers respectively. The experimental results show that the steganalytic method is promising.

  9. LSB Based Quantum Image Steganography Algorithm

    Science.gov (United States)

    Jiang, Nan; Zhao, Na; Wang, Luo

    2016-01-01

    Quantum steganography is the technique which hides a secret message into quantum covers such as quantum images. In this paper, two blind LSB steganography algorithms in the form of quantum circuits are proposed based on the novel enhanced quantum representation (NEQR) for quantum images. One algorithm is plain LSB which uses the message bits to substitute for the pixels' LSB directly. The other is block LSB which embeds a message bit into a number of pixels that belong to one image block. The extracting circuits can regain the secret message only according to the stego cover. Analysis and simulation-based experimental results demonstrate that the invisibility is good, and the balance between the capacity and the robustness can be adjusted according to the needs of applications.
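
    The difference between the two schemes can be sketched classically. This is only an analogue on ordinary integer pixels, not the paper's quantum circuits, and the block rule used here (write the bit into every pixel of the block, read it back by majority vote) is an assumption, since the abstract does not specify how a bit is spread over a block.

```python
def set_lsb(value: int, bit: int) -> int:
    """Replace the least significant bit of value with bit."""
    return (value & ~1) | bit


def plain_embed(pixels: list[int], bits: list[int]) -> list[int]:
    """Plain LSB: one message bit per pixel."""
    if len(bits) > len(pixels):
        raise ValueError("not enough pixels")
    return [set_lsb(p, b) for p, b in zip(pixels, bits)] + pixels[len(bits):]


def plain_extract(pixels: list[int], n_bits: int) -> list[int]:
    return [p & 1 for p in pixels[:n_bits]]


def block_embed(pixels: list[int], bits: list[int], block_size: int) -> list[int]:
    """Block LSB (assumed rule): one message bit per block of pixels."""
    if len(bits) * block_size > len(pixels):
        raise ValueError("not enough pixels")
    stego = list(pixels)
    for k, bit in enumerate(bits):
        for idx in range(k * block_size, (k + 1) * block_size):
            stego[idx] = set_lsb(stego[idx], bit)
    return stego


def block_extract(pixels: list[int], n_bits: int, block_size: int) -> list[int]:
    """Recover each bit by majority vote over the LSBs of its block."""
    bits = []
    for k in range(n_bits):
        block = pixels[k * block_size:(k + 1) * block_size]
        ones = sum(p & 1 for p in block)
        bits.append(1 if 2 * ones > len(block) else 0)
    return bits


cover = list(range(100, 120))          # 20 synthetic pixel values
secret = [1, 0, 1, 1]

plain_stego = plain_embed(cover, secret)
block_stego = block_embed(cover, secret, block_size=5)

# Flip one LSB in the first block to simulate tampering.
block_stego[0] ^= 1
print(plain_extract(plain_stego, 4), block_extract(block_stego, 4, 5))
```

    Plain LSB carries one bit per pixel, while the block variant carries one bit per block but survives a single flipped pixel, which is one way to read the capacity versus robustness balance the abstract mentions.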

  10. Progressive Exponential Clustering-Based Steganography

    Directory of Open Access Journals (Sweden)

    Li Yue

    2010-01-01

    Full Text Available Cluster indexing-based steganography is an important branch of data-hiding techniques. Such schemes normally achieve good balance between high embedding capacity and low embedding distortion. However, most cluster indexing-based steganographic schemes utilise less efficient clustering algorithms for embedding data, which causes redundancy and leaves room for increasing the embedding capacity further. In this paper, a new clustering algorithm, called progressive exponential clustering (PEC), is applied to increase the embedding capacity by avoiding redundancy. Meanwhile, a cluster expansion algorithm is also developed in order to further increase the capacity without sacrificing imperceptibility.

  11. Secure Image Steganography Algorithm Based on DCT with OTP Encryption

    Directory of Open Access Journals (Sweden)

    De Rosal Ignatius Moses Setiadi

    2017-04-01

    Full Text Available The rapid development of the Internet makes exchanging messages ever easier and faster. The main problem in message exchange is security, especially if the message is private and secret. Such messages are usually secured with steganography or cryptography. Steganography is a way to hide messages in other digital content such as images, video or audio so that the result does not look suspicious from the outside, while cryptography is a technique to encrypt messages so that they cannot be read directly. This paper proposes a combination of steganography using the discrete cosine transform (DCT) and cryptography using the one-time pad (OTP), or Vernam cipher, implemented on a digital image. The quality of the stego image is measured with the peak signal-to-noise ratio (PSNR), and the quality of the extracted, decrypted message is measured with the normalized cross-correlation (NCC). The proposed steganography and encryption methods obtain satisfactory results, with high PSNR and NCC values and resistance to JPEG compression and median filtering. Keywords: Image Steganography, Discrete Cosine Transform (DCT), One Time Pad, Vernam, Cipher, Image Cryptography

  12. A novel quantum LSB-based steganography method using the Gray code for colored quantum images

    Science.gov (United States)

    Heidari, Shahrokh; Farzadnia, Ehsan

    2017-10-01

    As one of the prevalent data-hiding techniques, steganography is defined as the act of imperceptibly concealing secret information in cover multimedia, including text, image, video and audio, so that the sender and the receiver can communicate and nobody except the receiver can figure out the secret data. In this work, a quantum LSB-based steganography method utilizing the Gray code for quantum RGB images is investigated. This method uses the Gray code to accommodate two secret qubits in the 3 LSBs of each pixel simultaneously, according to reference tables. Experimental results, analyzed in the MATLAB environment, show that the present scheme performs well and is more secure and applicable than the previous one found in the literature.
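
    For readers unfamiliar with the Gray code, the sketch below lists the standard reflected binary Gray code for 3-bit values, the width of the LSB field mentioned above. It does not reproduce the paper's reference tables, which are not given in the abstract; the function names are illustrative.

```python
def binary_to_gray(n: int) -> int:
    """Standard reflected binary Gray code of n."""
    return n ^ (n >> 1)


def gray_to_binary(g: int) -> int:
    """Invert binary_to_gray by folding the shifted values back in."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n


for value in range(8):
    gray = binary_to_gray(value)
    print(f"{value:03b} -> {gray:03b}")
```

    Consecutive Gray codewords differ in exactly one bit, which is the property that makes the code attractive when small changes to a pixel's low-order bits are desired.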

  13. Steganography and encrypting based on immunochemical systems.

    Science.gov (United States)

    Kim, Kyung-Woo; Bocharova, Vera; Halámek, Jan; Oh, Min-Kyu; Katz, Evgeny

    2011-05-01

    Steganography and encrypting were demonstrated with immuno-specific systems. IgG-proteins were used as invisible ink developed with complementary antibodies labeled with enzymes producing color spots. The information security was achieved by mixing the target protein-antigens used for the text encoding with masking proteins of similar composition but having different bioaffinity. Two different texts were simultaneously encoded by using two different encoding proteins in a mixture. Various encrypting techniques were exemplified with the immuno-systems used for the steganography. Future use of the developed approach for information protection and watermark-technology was proposed. Scaling down the encoded text to a micro-size is feasible with the use of nanotechnology. Copyright © 2010 Wiley Periodicals, Inc.

  14. Improved chaos-based video steganography using DNA alphabets

    Directory of Open Access Journals (Sweden)

    Nirmalya Kar

    2018-03-01

    Full Text Available DNA based steganography plays a vital role in the field of privacy and secure communication. Here, we propose a DNA properties-based mechanism to send data hidden inside a video file. Initially, the video file is converted into image frames. Random frames are then selected and data is hidden in these at random locations by using the Least Significant Bit substitution method. We analyze the proposed architecture in terms of peak signal-to-noise ratio as well as mean squared error measured between the original and steganographic files averaged over all video frames. The results show minimal degradation of the steganographic video file. Keywords: Chaotic map, DNA, Linear congruential generator, Video steganography, Least significant bit
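
    The embedding and evaluation steps can be sketched for a single frame. This sketch assumes that a linear congruential generator (one of the listed keywords) selects the embedding locations and that a frame is a flat list of 8-bit values; the DNA and chaotic-map parts of the scheme are not modelled, and the generator constants and all names are illustrative choices rather than the paper's.

```python
import math


def lcg(seed: int, a: int = 1664525, c: int = 1013904223, m: int = 2**32):
    """Linear congruential generator; the constants are an illustrative choice."""
    state = seed % m
    while True:
        state = (a * state + c) % m
        yield state


def pick_positions(seed: int, count: int, size: int) -> list[int]:
    """Choose count distinct positions in range(size), driven by the LCG."""
    if count > size:
        raise ValueError("more bits than available positions")
    stream = lcg(seed)
    seen, order = set(), []
    while len(order) < count:
        pos = (next(stream) >> 16) % size   # use the higher-order bits
        if pos not in seen:
            seen.add(pos)
            order.append(pos)
    return order


def embed_bits(frame: list[int], bits: list[int], seed: int) -> list[int]:
    """LSB substitution at the pseudo-randomly chosen positions."""
    stego = list(frame)
    for pos, bit in zip(pick_positions(seed, len(bits), len(frame)), bits):
        stego[pos] = (stego[pos] & ~1) | bit
    return stego


def extract_bits(frame: list[int], n_bits: int, seed: int) -> list[int]:
    return [frame[pos] & 1 for pos in pick_positions(seed, n_bits, len(frame))]


def mse(a: list[int], b: list[int]) -> float:
    """Mean squared error between two equally sized frames."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def psnr(a: list[int], b: list[int], peak: int = 255) -> float:
    """Peak signal-to-noise ratio in dB; infinite for identical frames."""
    err = mse(a, b)
    return math.inf if err == 0 else 10 * math.log10(peak ** 2 / err)


frame = [(i * 53) % 256 for i in range(64)]   # synthetic 8x8 frame, flattened
bits = [1, 0, 1, 1, 0, 1, 0, 0]
stego = embed_bits(frame, bits, seed=2018)

print(extract_bits(stego, len(bits), seed=2018), mse(frame, stego), psnr(frame, stego))
```

    Because the receiver regenerates the same positions from the shared seed, no position list has to be transmitted. For a whole video, the per-frame MSE and PSNR would be averaged over all frames, as the abstract describes.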

  15. Video steganography based on bit-plane decomposition of wavelet-transformed video

    Science.gov (United States)

    Noda, Hideki; Furuta, Tomofumi; Niimi, Michiharu; Kawaguchi, Eiji

    2004-06-01

    This paper presents a steganography method using lossy compressed video which provides a natural way to send a large amount of secret data. The proposed method is based on wavelet compression for video data and bit-plane complexity segmentation (BPCS) steganography. BPCS steganography makes use of bit-plane decomposition and the characteristics of the human vision system, where noise-like regions in bit-planes of a dummy image are replaced with secret data without deteriorating image quality. In wavelet-based video compression methods such as 3-D set partitioning in hierarchical trees (SPIHT) algorithm and Motion-JPEG2000, wavelet coefficients in discrete wavelet transformed video are quantized into a bit-plane structure and therefore BPCS steganography can be applied in the wavelet domain. 3-D SPIHT-BPCS steganography and Motion-JPEG2000-BPCS steganography are presented and tested, which are the integration of 3-D SPIHT video coding and BPCS steganography, and that of Motion-JPEG2000 and BPCS, respectively. Experimental results show that 3-D SPIHT-BPCS is superior to Motion-JPEG2000-BPCS with regard to embedding performance. In 3-D SPIHT-BPCS steganography, embedding rates of around 28% of the compressed video size are achieved for twelve bit representation of wavelet coefficients with no noticeable degradation in video quality.

  16. High capacity image steganography method based on framelet and compressive sensing

    Science.gov (United States)

    Xiao, Moyan; He, Zhibiao

    2015-12-01

    To improve the capacity and imperceptibility of image steganography, a novel high-capacity, high-imperceptibility image steganography method based on a combination of framelet and compressive sensing (CS) is put forward. First, singular value decomposition (SVD) is applied to the measurement values obtained by applying the compressive sensing technique to the secret data. Then the cover image is divided into non-overlapping blocks, and the singular values are embedded in turn into the low-frequency coarse subbands of the framelet transform of these blocks. Finally, inverse framelet transforms are applied and the blocks are combined to obtain the stego image. The experimental results show that the proposed steganography method performs well in hiding capacity, security and imperceptibility.

  17. Tag Based Audio Search Engine

    OpenAIRE

    Parameswaran Vellachu; Sunitha Abburu

    2012-01-01

    The volume of the music database is increasing day by day. Getting the required song as per the choice of the listener is a big challenge. Hence, it is really hard to manage this huge quantity, in terms of searching, filtering, through the music database. It is surprising to see that the audio and music industry still rely on very simplistic metadata to describe music files. However, while searching audio resource, an efficient "Tag Based Audio Search Engine" is necessary. The current researc...

  18. A New Information Hiding Method Based on Improved BPCS Steganography

    OpenAIRE

    Sun, Shuliang

    2015-01-01

    Bit-plane complexity segmentation (BPCS) steganography is advantageous in its capacity and imperceptibility. The important step of BPCS steganography is locating the noisy regions in a cover image exactly. The regular method, black-and-white border complexity, is simple and easy, but it is not always useful, especially for periodical patterns. Run-length irregularity and border noisiness are introduced in this paper to solve this problem. Canonical Gray coding (CGC) is also used to ...
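
    The black-and-white border complexity measure can be sketched as the fraction of adjacent pixel pairs in a binary block that differ. This is the commonly described form of the measure; the paper's own run-length irregularity and border noisiness measures are not given in the abstract and are not shown. The block contents below are synthetic examples.

```python
def border_complexity(block: list[list[int]]) -> float:
    """Fraction of horizontally or vertically adjacent pixel pairs that differ."""
    rows, cols = len(block), len(block[0])
    changes = 0
    for r in range(rows):
        for c in range(cols):
            if c + 1 < cols and block[r][c] != block[r][c + 1]:
                changes += 1
            if r + 1 < rows and block[r][c] != block[r + 1][c]:
                changes += 1
    max_changes = rows * (cols - 1) + cols * (rows - 1)
    return changes / max_changes


flat = [[0] * 8 for _ in range(8)]
checkerboard = [[(r + c) % 2 for c in range(8)] for r in range(8)]
stripes = [[c % 2 for c in range(8)] for _ in range(8)]

print(border_complexity(flat), border_complexity(checkerboard), border_complexity(stripes))
```

    The vertical stripes score 0.5, which a simple threshold test would classify as complex even though the pattern is perfectly regular. This illustrates why border complexity alone can misjudge periodical patterns.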

  19. A Survey on different techniques of steganography

    Directory of Open Access Journals (Sweden)

    Kaur Harpreet

    2016-01-01

    Full Text Available Steganography has become important owing to the exponential growth of secret communication among computer users over the internet. Steganography is the art of invisible communication, keeping secret information inside other information. Steganalysis is the technology that attempts to defeat steganography by detecting and extracting the hidden information. Steganography is the process of embedding data in image, text/document, audio and video files. The paper also highlights the security improvements achieved by applying various techniques of video steganography.

  20. A Novel Strategy for Quantum Image Steganography Based on Moiré Pattern

    Science.gov (United States)

    Jiang, Nan; Wang, Luo

    2015-03-01

    Image steganography is widely used for secret transmission. Although its strategies on classical computers have been extensively researched, there are few studies of such strategies on quantum computers. Therefore, in this paper, a novel, secure and keyless steganography approach for images on quantum computers is proposed based on the Moiré pattern. Algorithms based on the Moiré pattern are proposed for binary image embedding and extraction. Based on the novel enhanced quantum representation of digital images (NEQR), recursive and progressively layered quantum circuits for the embedding and extraction operations are designed. Finally, experiments are carried out to verify the validity and robustness of the proposed methods, confirming that the approach is effective for quantum image steganography.

  1. A Novel AMR-WB Speech Steganography Based on Diameter-Neighbor Codebook Partition

    Directory of Open Access Journals (Sweden)

    Junhui He

    2018-01-01

    Full Text Available Steganography is a means of covert communication without revealing the occurrence and the real purpose of communication. The adaptive multirate wideband (AMR-WB) codec is a widely adopted format in mobile handsets and is also the recommended speech codec for VoLTE. In this paper, a novel AMR-WB speech steganography is proposed based on a diameter-neighbor codebook partition algorithm. Different embedding capacities may be achieved by adjusting the iterative parameters during codebook division. The experimental results show that the presented AMR-WB steganography may provide higher and more flexible embedding capacity than state-of-the-art methods without inducing perceptible distortion. With 48 iterations of cluster merging, twice the embedding capacity of the complementary-neighbor-vertices-based embedding method may be obtained with a decrease of only around 2% in speech quality and much the same undetectability. Moreover, both the quality of the stego speech and the security against statistical steganalysis are better than those of the recent speech steganography based on neighbor-index-division codebook partition.

  2. Digital watermarking and steganography fundamentals and techniques

    CERN Document Server

    Shih, Frank Y

    2007-01-01

    Introduction; Digital Watermarking; Digital Steganography; Differences between Watermarking and Steganography; A Brief History; Appendix: Selected List of Books on Watermarking and Steganography; Classification in Digital Watermarking; Classification Based on Characteristics; Classification Based on Applications; Mathematical Preliminaries; Least-Significant-Bit Substitution; Discrete Fourier Transform (DFT); Discrete Cosine Transform; Discrete Wavelet Transform; Random Sequence Generation; The Chaotic M

  3. Optical steganography based on amplified spontaneous emission noise.

    Science.gov (United States)

    Wu, Ben; Wang, Zhenxing; Tian, Yue; Fok, Mable P; Shastri, Bhavin J; Kanoff, Daniel R; Prucnal, Paul R

    2013-01-28

    We propose and experimentally demonstrate an optical steganography method in which a data signal is transmitted using amplified spontaneous emission (ASE) noise as a carrier. The ASE serving as a carrier for the private signal has an identical frequency spectrum to the existing noise generated by the Erbium doped fiber amplifiers (EDFAs) in the transmission system. The system also carries a conventional data channel that is not private. The so-called "stealth" or private channel is well-hidden within the noise of the system. Phase modulation is used for both the stealth channel and the public channel. Using homodyne detection, the short coherence length of the ASE ensures that the stealth signal can only be recovered if the receiver closely matches the delay-length difference, which is deliberately changed in a dynamic fashion that is only known to the transmitter and its intended receiver.

  4. A new digital image steganography algorithm based on visible wavelength

    OpenAIRE

    COŞKUN, İbrahim; AKAR, Feyzi; ÇETİN, Özdemir

    2013-01-01

    Steganography is the science that ensures secret communication through multimedia carriers such as image, audio, and video files. The ultimate aim of steganography is to hide the secret data in the carrier file so that they are not detected. To that end, steganography applications should have such features as undetectability; robustness; resistance to various image processing methods and compression; and capacity of the hidden data. At the same time, those features distinguish steganogr...

  5. Multi-Party Quantum Steganography

    Science.gov (United States)

    Mihara, Takashi

    2017-02-01

    Steganography has been proposed as a data hiding technique. As a derivation, quantum steganography based on quantum physics has also been proposed. In this paper, we extend the results presented in (Mihara, Phys. Lett. 379, 952, 2015) and propose a multi-party quantum steganography technique that combines quantum error-correcting codes with entanglement. The proposed protocol shares an entangled state among n + 1 parties and sends n secret messages, corresponding to the n parties, to the other party. With no knowledge of the other secret messages, the n parties can construct a stego message by cooperating with each other. Finally, we propose a protocol for sending qubits using the same technique.

  6. Quantum Image Steganography and Steganalysis Based On LSQu-Blocks Image Information Concealing Algorithm

    Science.gov (United States)

    A. AL-Salhi, Yahya E.; Lu, Songfeng

    2016-08-01

    Quantum steganography can solve some problems that are considered inefficient in image information concealing. Research on quantum image information concealing has been widely pursued in recent years. Quantum image information concealing can be categorized into quantum image digital blocking, quantum image stereography, anonymity and other branches. Least significant bit (LSB) information concealing plays a vital role in the classical world because many image information concealing algorithms are designed based on it. Firstly, based on the novel enhanced quantum representation (NEQR), a least significant Qu-block (LSQB) information concealing algorithm for quantum image steganography, built on the clustering of uniform image blocks, is presented. Secondly, a clustering algorithm is proposed to optimize the concealment of important data. Finally, we use the Con-Steg algorithm to conceal the clustered image blocks. Information concealing located in the Fourier domain of an image can achieve the security of image information, so we further discuss the Fourier domain LSQu-block information concealing algorithm for quantum images based on quantum Fourier transforms. In our algorithms, the corresponding unitary transformations are designed to conceal the secret information in the least significant Qu-block representing the color of the quantum cover image. Finally, the procedures for extracting the secret information are illustrated. The quantum image LSQu-block information concealing algorithm can be applied in many fields according to different needs.

  7. Bit Plane Coding based Steganography Technique for JPEG2000 Images and Videos

    Directory of Open Access Journals (Sweden)

    Geeta Kasana

    2016-02-01

    Full Text Available In this paper, a Bit Plane Coding (BPC) based steganography technique for JPEG2000 images and Motion JPEG2000 video is proposed. Embedding in this technique is performed in the lowest significant bit planes of the wavelet coefficients of a cover image. In the JPEG2000 standard, the number of bit planes of wavelet coefficients used in encoding depends on the compression rate, and these bit planes are used in the Tier-2 process of JPEG2000. In the proposed technique, the Tier-1 and Tier-2 processes of JPEG2000 and Motion JPEG2000 are executed twice on the encoder side to collect information about the lowest bit planes of all code blocks of a cover image, which is utilized in embedding and transmitted to the decoder. After the secret data are embedded, the Optimal Pixel Adjustment Process (OPAP) is applied to the stego images to enhance their visual quality. Experimental results show that the proposed technique provides larger embedding capacity and better visual quality of stego images than existing steganography techniques for JPEG2000 compressed images and videos. The extracted secret image is similar to the original secret image.
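
    The OPAP step can be sketched as it is commonly described for k-bit LSB substitution on 8-bit values: after substitution, the value is shifted by 2^k when that brings it closer to the original without disturbing the embedded bits. How the paper applies OPAP within the JPEG2000 pipeline is not detailed in the abstract, so this is a generic illustration with hypothetical function names.

```python
def lsb_substitute(pixel: int, value: int, k: int) -> int:
    """Replace the k least significant bits of pixel with value."""
    return (pixel & ~((1 << k) - 1)) | value


def opap(original: int, stego: int, k: int, max_value: int = 255) -> int:
    """Adjust stego by +/- 2**k if that reduces the error and stays in range."""
    delta = stego - original
    step = 1 << k
    half = 1 << (k - 1)
    if delta > half and stego >= step:
        return stego - step
    if delta < -half and stego <= max_value - step:
        return stego + step
    return stego


original = 0b10000111                      # 135
stego = lsb_substitute(original, 0b000, k=3)
adjusted = opap(original, stego, k=3)

print(original, stego, adjusted)
```

    In this example plain substitution moves the value from 135 to 128, an error of 7, while the adjusted value 136 carries the same three embedded bits with an error of 1.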

  8. A Novel Quantum Video Steganography Protocol with Large Payload Based on MCQI Quantum Video

    Science.gov (United States)

    Qu, Zhiguo; Chen, Siyi; Ji, Sai

    2017-11-01

    As one of the important multimedia forms in quantum networks, quantum video attracts more and more attention from experts and scholars around the world. A secure quantum video steganography protocol with large payload, based on the video strip encoding method called MCQI (Multi-Channel Quantum Images), is proposed in this paper. The new protocol randomly embeds the secret information, in the form of quantum video, into the quantum carrier video on the basis of unique features of video frames. It embeds quantum video as secret information for covert communication. As a result, its capacity is greatly expanded compared with previous quantum steganography achievements. Meanwhile, the new protocol also achieves good security and imperceptibility by virtue of the randomization of embedding positions and efficient use of redundant frames. Furthermore, the receiver can extract the secret information from the stego video without retaining the original carrier video, and subsequently restore the original quantum video. The simulation and experiment results show that the algorithm not only has good imperceptibility and high security, but also a large payload.

  9. MoveSteg: A Method of Network Steganography Detection

    OpenAIRE

    Szczypiorski, Krzysztof; Tyl, Tomasz

    2016-01-01

    This article presents a new method, MoveSteg, for detecting the source point of time-based network steganography. A steganographic carrier could be, for example, a multimedia stream made up of packets. These packets are delayed intentionally to send hidden information using time-based steganography methods. The presented analysis describes a method for finding the source of a steganographic stream in a network that is under our management.
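
    The kind of channel being detected can be sketched as a generic timing scheme in which each hidden bit selects a short or a long inter-packet delay. The delay values, jitter bound and threshold below are arbitrary assumptions for illustration; the sketch shows only the covert channel itself, not the MoveSteg detection method, whose details are not given in the abstract.

```python
import random

SHORT_DELAY = 0.020   # seconds, encodes bit 0 (assumed value)
LONG_DELAY = 0.060    # seconds, encodes bit 1 (assumed value)
THRESHOLD = 0.040     # decision boundary between the two delays
MAX_JITTER = 0.005    # simulated network jitter bound


def encode_delays(bits: list[int], rng: random.Random) -> list[float]:
    """Map each bit to an inter-packet delay and add bounded jitter."""
    delays = []
    for bit in bits:
        base = LONG_DELAY if bit else SHORT_DELAY
        delays.append(base + rng.uniform(-MAX_JITTER, MAX_JITTER))
    return delays


def decode_delays(delays: list[float]) -> list[int]:
    """Recover bits by comparing each observed delay with the threshold."""
    return [1 if d > THRESHOLD else 0 for d in delays]


rng = random.Random(7)
secret_bits = [1, 0, 1, 1, 0, 0, 1, 0]
delays = encode_delays(secret_bits, rng)
decoded = decode_delays(delays)

print(decoded)
```

    Because the jitter bound is smaller than the gap between each delay and the threshold, decoding is exact in this simulation. A detector that watches the same stream at several points in the network could look for where such a bimodal delay pattern first appears.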

  10. Watermarking-Based Digital Audio Data Authentication

    Directory of Open Access Journals (Sweden)

    Jana Dittmann

    2003-09-01

    Full Text Available Digital watermarking has become an accepted technology for enabling multimedia protection schemes. While most efforts concentrate on user authentication, interest in data authentication to ensure data integrity has recently been increasing. Existing concepts address mainly image data. Depending on the necessary security level and the sensitivity to detect changes in the media, we differentiate between fragile, semifragile, and content-fragile watermarking approaches for media authentication. Furthermore, invertible watermarking schemes exist in which each bit change can be recognized by the watermark, and the watermark can be extracted and the original data reproduced for high-security applications. The latter approaches can be extended with cryptographic approaches like digital signatures. As we see from the literature, only a few audio approaches exist, and the audio domain requires additional strategies for time flow protection and resynchronization. To allow different security levels, we have to identify relevant audio features that can be used to determine content manipulations. Furthermore, in the field of invertible schemes, there are many publications for image and video data but no approaches for digital audio that ensure data authentication for high-security applications. In this paper, we introduce and evaluate two watermarking algorithms for digital audio data, addressing content integrity protection. In our first approach, we discuss possible features for a content-fragile watermarking scheme that allows several postproduction modifications. The second approach is designed for high-security applications to detect each bit change and reconstruct the original audio by introducing an invertible audio watermarking concept. Based on the invertible audio scheme, we combine digital signature schemes and digital watermarking to provide publicly verifiable data authentication and a reproduction of the original, protected with a secret key.

  11. Forensic analysis of video steganography tools

    Directory of Open Access Journals (Sweden)

    Thomas Sloan

    2015-05-01

    Full Text Available Steganography is the art and science of concealing information in such a way that only the sender and intended recipient of a message should be aware of its presence. Digital steganography has been used in the past on a variety of media including executable files, audio, text, games and, notably, images. Additionally, there is increasing research interest towards the use of video as a medium for steganography, due to its pervasive nature and diverse embedding capabilities. In this work, we examine the embedding algorithms and other security characteristics of several video steganography tools. We show how all feature basic and severe security weaknesses. This is potentially a very serious threat to the security, privacy and anonymity of their users. It is important to highlight that most steganography users have perfectly legal and ethical reasons to employ it. Some common scenarios would include citizens in oppressive regimes whose freedom of speech is compromised, people trying to avoid massive surveillance or censorship, political activists, whistle blowers, journalists, etc. As a result of our findings, we strongly recommend ceasing any use of these tools, and removing any contents that may have been hidden, and any carriers stored, exchanged and/or uploaded online. For many of these tools, carrier files will be trivial to detect, potentially compromising any hidden data and the parties involved in the communication. We finish this work by presenting our steganalytic results, which highlight a very poor current state of the art in practical video steganography tools. There is unfortunately a complete lack of secure and publicly available tools, and even commercial tools offer very poor security. We therefore encourage the steganography community to work towards the development of more secure and accessible video steganography tools, and make them available for the general public. The results presented in this work can also be seen as a useful ...

  12. A high capacity text steganography scheme based on LZW compression and color coding

    Directory of Open Access Journals (Sweden)

    Aruna Malik

    2017-02-01

    Full Text Available In this paper, capacity and security issues of text steganography have been considered by employing LZW compression technique and color coding based approach. The proposed technique uses the forward mail platform to hide the secret data. This algorithm first compresses secret data and then hides the compressed secret data into the email addresses and also in the cover message of the email. The secret data bits are embedded in the message (or cover text) by making it colored using a color coding table. Experimental results show that the proposed method not only produces a high embedding capacity but also reduces computational complexity. Moreover, the security of the proposed method is significantly improved by employing stego keys. The superiority of the proposed method has been experimentally verified by comparing with recently developed existing techniques.
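The compression step mentioned in this record can be illustrated with a minimal sketch of textbook LZW. This is not the authors' implementation: the function names `lzw_compress` and `lzw_decompress` are hypothetical, the byte-level initial dictionary is an assumption, and the email-address and color-coding embedding stages are not shown.

```python
def lzw_compress(text: str) -> list[int]:
    """Compress text into a list of integer codes with textbook LZW."""
    # Assumption: the dictionary starts with one entry per byte value.
    table = {bytes([i]): i for i in range(256)}
    current = b""
    codes = []
    for value in text.encode("utf-8"):
        candidate = current + bytes([value])
        if candidate in table:
            # Keep growing the phrase while it is still known.
            current = candidate
        else:
            # Emit the longest known phrase and learn the new one.
            codes.append(table[current])
            table[candidate] = len(table)
            current = bytes([value])
    if current:
        codes.append(table[current])
    return codes


def lzw_decompress(codes: list[int]) -> str:
    """Invert lzw_compress by rebuilding the same dictionary on the fly."""
    if not codes:
        return ""
    table = {i: bytes([i]) for i in range(256)}
    previous = table[codes[0]]
    out = [previous]
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        elif code == len(table):
            # Special case: the code refers to the phrase being defined.
            entry = previous + previous[:1]
        else:
            raise ValueError("invalid LZW code")
        out.append(entry)
        table[len(table)] = previous + entry[:1]
        previous = entry
    return b"".join(out).decode("utf-8")
```

Repetitive input yields fewer codes than input bytes, which is what raises the effective hiding capacity once the codes are embedded in a cover text.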

  13. WDM optical steganography based on amplified spontaneous emission noise.

    Science.gov (United States)

    Wu, Ben; Tait, Alexander N; Chang, Matthew P; Prucnal, Paul R

    2014-10-15

    We propose and experimentally demonstrate a wavelength-division multiplexed (WDM) optical stealth transmission system carried by amplified spontaneous emission (ASE) noise. The stealth signal is hidden in both time and frequency domains by using ASE noise as the signal carrier. Each WDM channel uses part of the ASE spectrum, which provides more flexibility to apply stealth transmission in a public network and adds another layer of security to the stealth channel. Multi-channel transmission also increases the overall channel capacity, which is the major limitation of the single stealth channel transmission based on ASE noise. The relations between spectral bandwidth and coherence length of ASE carrier have been theoretically analyzed and experimentally investigated.

  14. Hierarchical system for content-based audio classification and retrieval

    Science.gov (United States)

    Zhang, Tong; Kuo, C.-C. Jay

    1998-10-01

    A hierarchical system for audio classification and retrieval based on audio content analysis is presented in this paper. The system consists of three stages. The audio recordings are first classified and segmented into speech, music, several types of environmental sounds, and silence, based on morphological and statistical analysis of temporal curves of the energy function, the average zero-crossing rate, and the fundamental frequency of audio signals. The first stage is called the coarse-level audio classification and segmentation. Then, environmental sounds are classified into finer classes such as applause, rain, birds' sound, etc., which is called the fine-level audio classification. The second stage is based on time-frequency analysis of audio signals and the use of the hidden Markov model (HMM) for classification. In the third stage, the query-by-example audio retrieval is implemented where similar sounds can be found according to the input sample audio. The way of modeling audio features with the hidden Markov model, the procedures of audio classification and retrieval, and the experimental results are described. It is shown that, with the proposed new system, audio recordings can be automatically segmented and classified into basic types in real time with an accuracy higher than 90%. Examples of audio fine classification and audio retrieval with the proposed HMM-based method are also provided.

  15. Steganography: Past, Present, Future

    Energy Technology Data Exchange (ETDEWEB)

    Judge, J C

    2001-12-01

    Steganography (a rough Greek translation of the term Steganography is secret writing) has been used in various forms for 2500 years. It has found use variously in military, diplomatic, personal and intellectual property applications. Briefly stated, steganography is the term applied to any number of processes that will hide a message within an object, where the hidden message will not be apparent to an observer. This paper will explore steganography from its earliest instances through potential future application.

  16. Transform domain steganography with blind source separation

    Science.gov (United States)

    Jouny, Ismail

    2015-05-01

    This paper applies blind source separation or independent component analysis for images that may contain mixtures of text, audio, or other images for steganography purposes. The paper focuses on separating mixtures in the transform domain such as the Fourier domain or the wavelet domain. The study addresses the effectiveness of steganography when using linear mixtures of multimedia components and the ability of standard blind source separation techniques to discern hidden multimedia messages. Mixing in the space, frequency, and wavelet (scale) domains is compared. Effectiveness is measured using mean square error rate between original and recovered images.

  17. Steganography in arrhythmic electrocardiogram signal.

    Science.gov (United States)

    Edward Jero, S; Ramu, Palaniappan; Ramakrishnan, S

    2015-08-01

    Security and privacy of patient data is a vital requirement during exchange/storage of medical information over a communication network. A steganography method hides patient data in a cover signal to prevent unauthenticated access during data transfer. This study evaluates the performance of ECG steganography to ensure secured transmission of patient data where an abnormal ECG signal is used as cover signal. The novelty of this work is to hide patient data in a two-dimensional matrix of an abnormal ECG signal using a Discrete Wavelet Transform and Singular Value Decomposition based steganography method. A 2D ECG is constructed according to the Tompkins QRS detection algorithm. The missed R peaks are computed using the RR interval during 2D conversion. The abnormal ECG signals are obtained from the MIT-BIH arrhythmia database. Metrics such as Peak Signal to Noise Ratio, Percentage Residual Difference, Kullback-Leibler distance and Bit Error Rate are used to evaluate the performance of the proposed approach.
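Three of the metrics named in this record can be sketched in a few lines using their common textbook definitions. This is an illustration only: the function names are hypothetical, the record does not state which exact formulas the authors used, and taking the PSNR peak as the largest absolute amplitude of the original signal is an assumption.

```python
import math


def _check(a, b):
    # Both sequences must be non-empty and the same length.
    if len(a) != len(b) or not a:
        raise ValueError("inputs must be non-empty and of equal length")


def prd(original, modified):
    """Percentage residual difference: 100 * sqrt(sum((x-y)^2) / sum(x^2))."""
    _check(original, modified)
    num = sum((x - y) ** 2 for x, y in zip(original, modified))
    den = sum(x ** 2 for x in original)
    return 100.0 * math.sqrt(num / den)


def psnr(original, modified, peak=None):
    """Peak signal-to-noise ratio in dB; infinite for identical signals."""
    _check(original, modified)
    mse = sum((x - y) ** 2 for x, y in zip(original, modified)) / len(original)
    if mse == 0:
        return math.inf
    if peak is None:
        # Assumption: peak is the largest absolute amplitude of the original.
        peak = max(abs(x) for x in original)
    return 10.0 * math.log10(peak ** 2 / mse)


def bit_error_rate(sent_bits, received_bits):
    """Fraction of hidden bits that were recovered incorrectly."""
    _check(sent_bits, received_bits)
    errors = sum(a != b for a, b in zip(sent_bits, received_bits))
    return errors / len(sent_bits)
```

A low PRD and high PSNR indicate that the stego signal stays close to the cover signal, while a bit error rate of 0 means the hidden data was recovered exactly.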

  18. Wavelet-Based ECG Steganography for Protecting Patient Confidential Information in Point-of-Care Systems.

    Science.gov (United States)

    Ibaida, Ayman; Khalil, Ibrahim

    2013-12-01

    With the growing number of aging population and a significant portion of that suffering from cardiac diseases, it is conceivable that remote ECG patient monitoring systems are expected to be widely used as point-of-care (PoC) applications in hospitals around the world. Therefore, a huge amount of ECG signal collected by body sensor networks from remote patients at homes will be transmitted along with other physiological readings such as blood pressure, temperature, glucose level, etc., and diagnosed by those remote patient monitoring systems. It is utterly important that patient confidentiality is protected while data are being transmitted over the public network as well as when they are stored in hospital servers used by remote monitoring systems. In this paper, a wavelet-based steganography technique has been introduced which combines encryption and scrambling techniques to protect patient confidential data. The proposed method allows the ECG signal to hide its corresponding patient confidential data and other physiological information, thus guaranteeing the integration between ECG and the rest. To evaluate the effectiveness of the proposed technique on the ECG signal, two distortion measurement metrics have been used: the percentage residual difference (PRD) and the wavelet weighted PRD. It is found that the proposed technique provides high-security protection for patients' data with low (less than 1%) distortion, and ECG data remain diagnosable after watermarking (i.e., hiding patient confidential data) as well as after watermarks (i.e., hidden data) are removed from the watermarked data.

  19. Steganalysis of content-adaptive JPEG steganography based on Gauss partial derivative filter bank

    Science.gov (United States)

    Zhang, Yi; Liu, Fenlin; Yang, Chunfang; Luo, Xiangyang; Song, Xiaofeng; Lu, Jicang

    2017-01-01

    A steganalysis feature extraction method based on Gauss partial derivative filter bank is proposed in this paper to improve the detection performance for content-adaptive JPEG steganography. Considering that the embedding changes of content-adaptive steganographic schemes are performed in the texture and edge regions, the proposed method generates filtered images comprising rich texture and edge information using Gauss partial derivative filter bank, and histograms of absolute values of filtered subimages are extracted as steganalysis features. Gauss partial derivative filter bank can represent texture and edge information in multiple orientations with less computation load than conventional methods and prevent redundancy in different filtered images. These two properties are beneficial in the extraction of low-complexity sensitive features. The results of experiments conducted on three selected modern JPEG steganographic schemes (uniform embedding distortion, JPEG universal wavelet relative distortion, and side-informed UNIWARD) indicate that the proposed feature set is superior to the prior art feature sets (discrete cosine transform residual, phase aware rich model, and Gabor filter residual).

  20. A novel fuzzy logic-based image steganography method to ensure medical data security.

    Science.gov (United States)

    Karakış, R; Güler, I; Çapraz, I; Bilir, E

    2015-12-01

    This study aims to secure medical data by combining them into one file format using steganographic methods. The electroencephalogram (EEG) is selected as hidden data, and magnetic resonance (MR) images are also used as the cover image. In addition to the EEG, the message is composed of the doctor's comments and patient information in the file header of images. Two new image steganography methods that are based on fuzzy-logic and similarity are proposed to select the non-sequential least significant bits (LSB) of image pixels. The similarity values of the gray levels in the pixels are used to hide the message. The message is secured to prevent attacks by using lossless compression and symmetric encryption algorithms. The performance of stego image quality is measured by mean square of error (MSE), peak signal-to-noise ratio (PSNR), structural similarity measure (SSIM), universal quality index (UQI), and correlation coefficient (R). According to the obtained result, the proposed method ensures the confidentiality of the patient information, and increases data repository and transmission capacity of both MR images and EEG signals.

  1. A high capacity 3D steganography algorithm.

    Science.gov (United States)

    Chao, Min-Wen; Lin, Chao-hung; Yu, Cheng-Wei; Lee, Tong-Yee

    2009-01-01

    In this paper, we present a very high-capacity and low-distortion 3D steganography scheme. Our steganography approach is based on a novel multilayered embedding scheme to hide secret messages in the vertices of 3D polygon models. Experimental results show that the cover model distortion is very small as the number of hiding layers ranges from 7 to 13 layers. To the best of our knowledge, this novel approach can provide much higher hiding capacity than other state-of-the-art approaches, while obeying the low distortion and security basic requirements for steganography on 3D models.

  2. Software tools for object-based audio production using the Audio Definition Model

    OpenAIRE

    Geier, Matthias; Carpentier, Thibaut; Noisternig, Markus; Warusfel, Olivier

    2017-01-01

    We present a publicly available set of tools for the integration of the Audio Definition Model (ADM) in production workflows. ADM is an open metadata model for the description of channel-, scene-, and object-based media within a Broadcast Wave Format (BWF) container. The software tools were developed within the European research project ORPHEUS (https://orpheus-audio.eu/) that aims at developing new end-to-end object-based media chains for broadcast. These tools allow ...

  3. Content-based classification and retrieval of audio

    Science.gov (United States)

    Zhang, Tong; Kuo, C.-C. Jay

    1998-10-01

    An on-line audio classification and segmentation system is presented in this research, where audio recordings are classified and segmented into speech, music, several types of environmental sounds and silence based on audio content analysis. This is the first step of our continuing work towards a general content-based audio classification and retrieval system. The extracted audio features include temporal curves of the energy function, the average zero-crossing rate, the fundamental frequency of audio signals, as well as statistical and morphological features of these curves. The classification result is achieved through a threshold-based heuristic procedure. The audio database that we have built, details of feature extraction, classification and segmentation procedures, and experimental results are described. It is shown that, with the proposed new system, audio recordings can be automatically segmented and classified into basic types in real time with an accuracy of over 90 percent. Outlines of further classification of audio into finer types and a query-by-example audio retrieval system on top of the coarse classification are also introduced.
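Two of the features named in this record, the short-time energy and the zero-crossing rate, can be sketched as follows. This is a toy illustration under stated assumptions: the function names, the frame sizes, the thresholds `energy_floor` and `zcr_split`, and the three output labels are all hypothetical and are not the authors' heuristic, which also uses the fundamental frequency and curve statistics.

```python
def split_frames(signal, size, hop):
    """Cut a sample sequence into overlapping frames of equal length."""
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, hop)]


def short_time_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(s * s for s in frame) / len(frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


def label_frame(frame, energy_floor=1e-4, zcr_split=0.3):
    """Toy threshold rule; both thresholds are illustrative assumptions."""
    if short_time_energy(frame) < energy_floor:
        return "silence"
    if zero_crossing_rate(frame) > zcr_split:
        return "high-zcr"
    return "low-zcr"
```

Computing both values per frame produces the temporal curves that a threshold-based procedure can then inspect.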

  4. Researcher’s Perspective of Substitution Method on Text Steganography

    Science.gov (United States)

    Zamir Mansor, Fawwaz; Mustapha, Aida; Azah Samsudin, Noor

    2017-08-01

    The linguistic steganography studies are still in the stage of development and empowerment practices. This paper presents several substitution methods for text steganography from the researcher’s perspective; all the scholarly papers are analysed and compared. The objective of this paper is to give basic information on the substitution method of text domain steganography that has been applied by previous researchers. The typical ways of this method will also be identified in this paper to reveal the most effective method in text domain steganography. Finally, the advantages and drawbacks of these techniques in general are also presented in this paper.

  5. Extreme learning machine based optimal embedding location finder for image steganography.

    Science.gov (United States)

    Atee, Hayfaa Abdulzahra; Ahmad, Robiah; Noor, Norliza Mohd; Rahma, Abdul Monem S; Aljeroudi, Yazan

    2017-01-01

    In image steganography, determining the optimum location for embedding the secret message precisely with minimum distortion of the host medium remains a challenging issue. Yet, an effective approach for the selection of the best embedding location with least deformation is far from being achieved. To attain this goal, we propose a novel approach for image steganography with high-performance, where extreme learning machine (ELM) algorithm is modified to create a supervised mathematical model. This ELM is first trained on a part of an image or any host medium before being tested in the regression mode. This allowed us to choose the optimal location for embedding the message with best values of the predicted evaluation metrics. Contrast, homogeneity, and other texture features are used for training on a new metric. Furthermore, the developed ELM is exploited for counter over-fitting while training. The performance of the proposed steganography approach is evaluated by computing the correlation, structural similarity (SSIM) index, fusion matrices, and mean square error (MSE). The modified ELM is found to outperform the existing approaches in terms of imperceptibility. Excellent features of the experimental results demonstrate that the proposed steganographic approach is greatly proficient for preserving the visual information of an image. An improvement in the imperceptibility as much as 28% is achieved compared to the existing state of the art methods.

  6. Extreme learning machine based optimal embedding location finder for image steganography.

    Directory of Open Access Journals (Sweden)

    Hayfaa Abdulzahra Atee

    Full Text Available In image steganography, determining the optimum location for embedding the secret message precisely with minimum distortion of the host medium remains a challenging issue. Yet, an effective approach for the selection of the best embedding location with least deformation is far from being achieved. To attain this goal, we propose a novel approach for image steganography with high-performance, where extreme learning machine (ELM) algorithm is modified to create a supervised mathematical model. This ELM is first trained on a part of an image or any host medium before being tested in the regression mode. This allowed us to choose the optimal location for embedding the message with best values of the predicted evaluation metrics. Contrast, homogeneity, and other texture features are used for training on a new metric. Furthermore, the developed ELM is exploited for counter over-fitting while training. The performance of the proposed steganography approach is evaluated by computing the correlation, structural similarity (SSIM) index, fusion matrices, and mean square error (MSE). The modified ELM is found to outperform the existing approaches in terms of imperceptibility. Excellent features of the experimental results demonstrate that the proposed steganographic approach is greatly proficient for preserving the visual information of an image. An improvement in the imperceptibility as much as 28% is achieved compared to the existing state of the art methods.

  7. STEGANOGRAPHY USAGE TO CONTROL MULTIMEDIA STREAM

    Directory of Open Access Journals (Sweden)

    Grzegorz Koziel

    2014-03-01

    Full Text Available In the paper, a proposal of new application for steganography is presented. It is possible to use steganographic techniques to control multimedia stream playback. Special control markers can be included in the sound signal and the player can detect markers and modify the playback parameters according to the hidden instructions. This solution allows for remembering user preferences within the audio track as well as allowing for preparation of various versions of the same content at the production level.

  8. Investigator's guide to steganography

    CERN Document Server

    Kipper, Gregory

    2003-01-01

    The Investigator's Guide to Steganography provides a comprehensive look at this unique form of hidden communication from its earliest beginnings to its most modern uses. The book begins by exploring the past, providing valuable insight into how this method of communication began and evolved from ancient times to the present day. It continues with an in-depth look at the workings of digital steganography and watermarking methods, available tools on the Internet, and a review of companies who are providing cutting edge steganography and watermarking services. The third section builds on the first two by outlining and discussing real world uses of steganography from business and entertainment to national security and terrorism. The book concludes by reviewing steganography detection methods and what can be expected in the future.

  9. Parametric Audio Based Decoder and Music Synthesizer for Mobile Applications

    NARCIS (Netherlands)

    Oomen, A.W.J.; Szczerba, M.Z.; Therssen, D.

    2011-01-01

    This paper reviews parametric audio coders and discusses novel technologies introduced in a low-complexity, low-power consumption audio decoder and music synthesizer platform developed by the authors. The decoder uses a parametric coding scheme based on the MPEG-4 Parametric Audio standard. In order to

  10. Steganography-based access control to medical data hidden in electrocardiogram.

    Science.gov (United States)

    Mai, Vu; Khalil, Ibrahim; Ibaida, Ayman

    2013-01-01

    Steganographic techniques allow secret data to be embedded inside another host data such as an image or a text file without significant changes to the quality of the host data. In this research, we demonstrate how steganography can be used as the main mechanism to build an access control model that gives data owners complete control to their sensitive cardiac health information hidden in their own Electrocardiograms. Our access control model is able to protect the privacy of users, the confidentiality of medical data, reduce storage space and make it more efficient to upload and download large amount of data.

  11. Portable audio electronics for impedance-based measurements in microfluidics

    International Nuclear Information System (INIS)

    Wood, Paul; Sinton, David

    2010-01-01

    We demonstrate the use of audio electronics-based signals to perform on-chip electrochemical measurements. Cell phones and portable music players are examples of consumer electronics that are easily operated and are ubiquitous worldwide. Audio output (play) and input (record) signals are voltage based and contain frequency and amplitude information. A cell phone, laptop soundcard and two compact audio players are compared with respect to frequency response; the laptop soundcard provides the most uniform frequency response, while the cell phone performance is found to be insufficient. The audio signals in the common portable music players and laptop soundcard operate in the range of 20 Hz to 20 kHz and are found to be applicable, as voltage input and output signals, to impedance-based electrochemical measurements in microfluidic systems. Validated impedance-based measurements of concentration (0.1–50 mM), flow rate (2–120 µL min⁻¹) and particle detection (32 µm diameter) are demonstrated. The prevailing, lossless, wave audio file format is found to be suitable for data transmission to and from external sources, such as a centralized lab, and the cost of all hardware (in addition to audio devices) is ∼10 USD. The utility demonstrated here, in combination with the ubiquitous nature of portable audio electronics, presents new opportunities for impedance-based measurements in portable microfluidic systems. (technical note)

  12. COMPARATIVE STUDY OF EDGE BASED LSB MATCHING STEGANOGRAPHY FOR COLOR IMAGES

    Directory of Open Access Journals (Sweden)

    A.J. Umbarkar

    2016-02-01

    Full Text Available Steganography is a very pivotal technique mainly used for covert transfer of information over a covert communication channel. This paper proposes a significant comparative study of the spatial LSB domain technique that focuses on sharper edges of the color as well as gray scale images for the purpose of data hiding and hides secret message first in sharper edge regions and then in smooth regions of the image. Message embedding depends on content of the image and message size. The experimental results illustrate that, for low embedding rate the method hides the message in sharp edges of cover image to get better stego image visualization quality. For high embedding rate, smooth regions and edges of the cover image are used for the purpose of data hiding. In this steganography method, color image and textured kind of image preserves better visual quality of stego image. The novelty of the comparative study is that, it helps to analyze the efficiency and performance of the method as it gives better results because it directly works on color images instead of converting to gray scale image.
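The LSB matching operation that this record builds on can be sketched as below. This is a simplified illustration, not the method compared in the paper: the edge-region selection step is omitted, pixels are treated as a flat list of 8-bit values, and the names `lsb_match_embed`, `lsb_extract`, and the `seed` parameter are hypothetical.

```python
import random


def lsb_match_embed(pixels, bits, seed=0):
    """Embed bits by LSB matching: change a pixel by +/-1 only on mismatch."""
    if len(bits) > len(pixels):
        raise ValueError("message is longer than the cover")
    rng = random.Random(seed)  # assumption: a seeded PRNG picks the sign
    stego = list(pixels)
    for i, bit in enumerate(bits):
        if stego[i] & 1 == bit:
            continue  # LSB already carries the bit, leave the pixel alone
        if stego[i] == 0:
            stego[i] = 1  # cannot go below the 8-bit range
        elif stego[i] == 255:
            stego[i] = 254  # cannot go above the 8-bit range
        else:
            stego[i] += rng.choice((-1, 1))
    return stego


def lsb_extract(stego, count):
    """Read back the first `count` hidden bits from the pixel LSBs."""
    return [p & 1 for p in stego[:count]]
```

Unlike plain LSB replacement, the random +/-1 step avoids always pushing even values up and odd values down, and each pixel still changes by at most one gray level.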

  13. Website-based PNG image steganography using the modified Vigenere Cipher, least significant bit, and dictionary based compression methods

    Science.gov (United States)

    Rojali; Salman, Afan Galih; George

    2017-08-01

    Along with the development of information technology in meeting the needs, various adverse actions that are difficult to avoid are emerging. One such action is data theft. Therefore, this study discusses cryptography and steganography that aim to overcome these problems. This study uses the Modified Vigenere Cipher, Least Significant Bit and Dictionary Based Compression methods. To determine the performance of the study, the Peak Signal to Noise Ratio (PSNR) method is used to measure objectively and the Mean Opinion Score (MOS) method is used to measure subjectively; the performance of this study is also compared to other methods such as Spread Spectrum and Pixel Value Differencing. After comparing, it can be concluded that this study can provide better performance when compared to other methods (Spread Spectrum and Pixel Value Differencing) and has a range of MSE values (0.0191622-0.05275) and PSNR (60.909 to 65.306) with a hidden file size of 18 kb and has a MOS value range (4.214 to 4.722) or image quality that is approaching very good.
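For orientation, the classical Vigenere cipher can be sketched as follows. The record does not describe the authors' modification, so this shows only the unmodified textbook cipher over the letters A-Z; the function name `vigenere` and the choice to pass non-letters through unchanged are assumptions.

```python
def vigenere(text, key, decrypt=False):
    """Classical Vigenere cipher; non-letters pass through unchanged."""
    if not key or not (key.isascii() and key.isalpha()):
        raise ValueError("key must consist of ASCII letters")
    shifts = [ord(c) - ord("A") for c in key.upper()]
    out = []
    used = 0  # number of key letters consumed so far
    for ch in text:
        if ch.isascii() and ch.isalpha():
            base = ord("A") if ch.isupper() else ord("a")
            shift = shifts[used % len(shifts)]
            if decrypt:
                shift = -shift
            out.append(chr((ord(ch) - base + shift) % 26 + base))
            used += 1
        else:
            out.append(ch)
    return "".join(out)
```

In a pipeline like the one described, the ciphertext would then be compressed and written into the least significant bits of the PNG pixels.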

  14. Scope of Support Vector Machine in Steganography

    Science.gov (United States)

    Tanwar, Rohit; Malhotra, Sona

    2017-08-01

    Steganography is a technique used for secure transmission of data. Using audio as a cover file opens a path to many extra features. In order to overcome the limitations of the conventional LSB technique, various variants were proposed by different authors. In order to achieve robustness, the use of various optimization techniques has been traditional. In this paper the focus is put on the use of Genetic Algorithm and Particle Swarm Intelligence in steganography. Listing the detailed scope, merits and demerits of the two optimization techniques is the main constituent of this paper. In addition to analyzing the two techniques, the motivation and applicability of machine learning algorithms in the problem statement are also discussed. This paper will guide the path in using Support Vector Machine for optimizing the data hiding.

  15. Quality Enhancement of Compressed Audio Based on Statistical Conversion

    Directory of Open Access Journals (Sweden)

    Mouchtaris Athanasios

    2008-01-01

    Full Text Available Most audio compression formats are based on the idea of low bit rate transparent encoding. As these types of audio signals are starting to migrate from portable players with inexpensive headphones to higher quality home audio systems, it is becoming evident that higher bit rates may be required to maintain transparency. We propose a novel method that enhances low bit rate encoded audio segments by applying multiband audio resynthesis methods in a postprocessing stage. Our algorithm employs the highly flexible Generalized Gaussian mixture model which offers a more accurate representation of audio features than the Gaussian mixture model. A novel residual conversion technique is applied which proves to significantly improve the enhancement performance without excessive overhead. In addition, both cepstral and residual errors are dramatically decreased by a feature-alignment scheme that employs a sorting transformation. Some improvements regarding the quantization step are also described that enable us to further reduce the algorithm overhead. Signal enhancement examples are presented and the results show that the overhead size incurred by the algorithm is a fraction of the uncompressed signal size. Our results show that the resulting audio quality is comparable to that of a standard perceptual codec operating at approximately the same bit rate.

  16. Quality Enhancement of Compressed Audio Based on Statistical Conversion

    Directory of Open Access Journals (Sweden)

    Chris Kyriakakis

    2008-07-01

    Full Text Available Most audio compression formats are based on the idea of low bit rate transparent encoding. As these types of audio signals are starting to migrate from portable players with inexpensive headphones to higher quality home audio systems, it is becoming evident that higher bit rates may be required to maintain transparency. We propose a novel method that enhances low bit rate encoded audio segments by applying multiband audio resynthesis methods in a postprocessing stage. Our algorithm employs the highly flexible Generalized Gaussian mixture model which offers a more accurate representation of audio features than the Gaussian mixture model. A novel residual conversion technique is applied which proves to significantly improve the enhancement performance without excessive overhead. In addition, both cepstral and residual errors are dramatically decreased by a feature-alignment scheme that employs a sorting transformation. Some improvements regarding the quantization step are also described that enable us to further reduce the algorithm overhead. Signal enhancement examples are presented and the results show that the overhead size incurred by the algorithm is a fraction of the uncompressed signal size. Our results show that the resulting audio quality is comparable to that of a standard perceptual codec operating at approximately the same bit rate.

  17. A Blind High-Capacity Wavelet-Based Steganography Technique for Hiding Images into other Images

    Directory of Open Access Journals (Sweden)

    HAMAD, S.

    2014-05-01

    Full Text Available The flourishing field of Steganography is providing effective techniques to hide data into different types of digital media. In this paper, a novel technique is proposed to hide large amounts of image data into true colored images. The proposed method employs wavelet transforms to decompose images in a way similar to the Human Visual System (HVS) for more secure and effective data hiding. The designed model can blindly extract the embedded message without the need to refer to the original cover image. Experimental results showed that the proposed method outperformed all of the existing techniques not only in terms of imperceptibility but also in terms of capacity. In fact, the proposed technique showed an outstanding performance on hiding a secret image whose size equals 100% of the cover image while maintaining excellent visual quality of the resultant stego-images.

  18. Object-based audio reproduction and the audio scene description format

    OpenAIRE

    Geier, Matthias; Ahrens, Jens; Spors, Sascha

    2010-01-01

    This publication is freely accessible with permission of the rights owner due to an Alliance licence and a national licence (funded by the DFG, German Research Foundation) respectively. The introduction of new techniques for audio reproduction such as HRTF-based technology, wave field synthesis and higher-order Ambisonics is accompanied by a paradigm shift ...

  19. Audio CAPTCHA for SIP-Based VoIP

    Science.gov (United States)

    Soupionis, Yannis; Tountas, George; Gritzalis, Dimitris

    Voice over IP (VoIP) introduces new ways of communication, while utilizing existing data networks to provide inexpensive voice communications worldwide as a promising alternative to the traditional PSTN telephony. SPam over Internet Telephony (SPIT) is one potential source of future annoyance in VoIP. A common way to launch a SPIT attack is the use of an automated procedure (bot), which generates calls and produces audio advertisements. In this paper, our goal is to design appropriate CAPTCHA to fight such bots. We focus on and develop audio CAPTCHA, as the audio format is more suitable for VoIP environments and we implement it in a SIP-based VoIP environment. Furthermore, we suggest and evaluate the specific attributes that audio CAPTCHA should incorporate in order to be effective, and test it against an open source bot implementation.

  20. Discrete wavelet transform and singular value decomposition based ECG steganography for secured patient information transmission.

    Science.gov (United States)

    Edward Jero, S; Ramu, Palaniappan; Ramakrishnan, S

    2014-10-01

    ECG Steganography provides secured transmission of secret information such as patient personal information through ECG signals. This paper proposes an approach that uses discrete wavelet transform to decompose signals and singular value decomposition (SVD) to embed the secret information into the decomposed ECG signal. The novelty of the proposed method is to embed the watermark using SVD into the two dimensional (2D) ECG image. The embedding of secret information in a selected sub band of the decomposed ECG is achieved by replacing the singular values of the decomposed cover image by the singular values of the secret data. The performance assessment of the proposed approach helps identify the suitable sub-band to hide secret data and the signal degradation that will affect diagnosability. Performance is measured using metrics like Kullback-Leibler divergence (KL), percentage residual difference (PRD), peak signal to noise ratio (PSNR) and bit error rate (BER). A dynamic location selection approach for embedding the singular values is also discussed. The proposed approach is demonstrated on the MIT-BIH database and the observations validate that HH is the ideal sub-band to hide data. It is also observed that the signal degradation (less than 0.6%) is very low in the proposed approach even with the secret data being as large as the sub band size. So, it does not affect the diagnosability and is reliable to transmit patient information.
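
    Three of the metrics named in this abstract (PRD, PSNR and BER) have common textbook definitions, and the following is a minimal sketch of them. It is not taken from the paper: the function names are hypothetical, and the PSNR peak is assumed here to be the largest absolute sample of the original signal, which the abstract does not specify.

```python
import math


def prd(original, stego):
    """Percentage residual difference between two equal-length signals."""
    num = sum((x - y) ** 2 for x, y in zip(original, stego))
    den = sum(x ** 2 for x in original)
    return 100.0 * math.sqrt(num / den)


def psnr(original, stego):
    """Peak signal-to-noise ratio in dB.

    Assumption: the peak is the largest absolute sample of the original.
    """
    mse = sum((x - y) ** 2 for x, y in zip(original, stego)) / len(original)
    if mse == 0:
        return math.inf  # identical signals
    peak = max(abs(x) for x in original)
    return 10.0 * math.log10(peak ** 2 / mse)


def ber(sent_bits, recovered_bits):
    """Fraction of secret bits that were recovered incorrectly."""
    errors = sum(a != b for a, b in zip(sent_bits, recovered_bits))
    return errors / len(sent_bits)
```

    A low PRD and a high PSNR both indicate that the stego signal stays close to the cover, while BER measures how faithfully the hidden data is recovered.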

  1. Steganography and Hiding Data with Indicators-based LSB Using a Secret Key

    Directory of Open Access Journals (Sweden)

    W. Saqer

    2016-06-01

    Full Text Available Steganography is the field of science concerned with hiding secret data inside other innocent-looking data, called the container, carrier or cover, in a way that no one apart from the intended parties can suspect the existence of the secret data. There are many algorithms and techniques for concealing data, each of which has its own way of hiding and its own advantages and limitations. In our research we introduce a new algorithm for hiding data. The algorithm uses the same technique used by the Least Significant Bit (LSB) algorithm, which is embedding secret data in the least significant bit(s) of the bytes of the carrier. It differs from the LSB algorithm in that it does not embed into the bytes of the cover data sequentially, but embeds into one bit or two bits at once. It depends on indicators to determine where and how many bits to embed at a time. These indicators are two bits of each cover byte after the least two significant bits. The advantage of this algorithm over the LSB algorithm is the randomness used to confuse intruders, as it does not use fixed sequential bytes and does not always embed one bit at a time. This aims to increase the security of the technique. Also, the amount of cover data consumed is less because it sometimes embeds two bits at once.
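
    The indicator idea described in this abstract can be sketched roughly as follows. The abstract does not say how indicator values map to the number of embedded bits, so the rule in `bits_for` is purely an assumption for illustration (00 skips the byte, 11 embeds two bits, anything else embeds one), and the secret-key step is omitted. Because only the two lowest bits are ever modified, the indicator bits stay intact and the extractor can recompute the same decisions.

```python
def bits_for(byte):
    """Hypothetical rule: indicator = bits 2 and 3 of the cover byte."""
    indicator = (byte >> 2) & 0b11
    if indicator == 0b00:
        return 0  # skip this byte
    if indicator == 0b11:
        return 2  # embed two bits at once
    return 1      # embed a single bit


def embed(cover, bits):
    """Hide a list of 0/1 values in the low bits of the cover bytes."""
    stego = bytearray(cover)
    pos = 0
    for i, byte in enumerate(stego):
        if pos >= len(bits):
            break
        k = min(bits_for(byte), len(bits) - pos)
        if k == 0:
            continue
        chunk = 0
        for bit in bits[pos:pos + k]:
            chunk = (chunk << 1) | bit
        mask = (1 << k) - 1
        stego[i] = (byte & ~mask & 0xFF) | chunk  # bits 2..7 are untouched
        pos += k
    if pos < len(bits):
        raise ValueError("cover too small for the message")
    return stego


def extract(stego, n_bits):
    """Recover n_bits by re-reading the indicators of each stego byte."""
    out = []
    for byte in stego:
        if len(out) >= n_bits:
            break
        k = min(bits_for(byte), n_bits - len(out))
        chunk = byte & ((1 << k) - 1)
        for shift in range(k - 1, -1, -1):
            out.append((chunk >> shift) & 1)
    return out
```

    In this sketch the receiver must know the message length in bits; a real scheme would also need to agree on that, for example through a header.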

  2. Random linear codes in steganography

    Directory of Open Access Journals (Sweden)

    Kamil Kaczyński

    2016-12-01

    Full Text Available Syndrome coding using linear codes is a technique that allows improvement in the steganographic algorithms parameters. The use of random linear codes gives a great flexibility in choosing the parameters of the linear code. In parallel, it offers easy generation of parity check matrix. In this paper, the modification of LSB algorithm is presented. A random linear code [8, 2] was used as a base for algorithm modification. The implementation of the proposed algorithm, along with practical evaluation of algorithms’ parameters based on the test images was made. Keywords: steganography, random linear codes, RLC, LSB
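
    Syndrome coding with an [8, 2] code can be illustrated with the sketch below. It is an assumption-laden illustration rather than the paper's algorithm: the parity-check matrix is built in the systematic form H = [I | A] with a random 6 x 2 block A (which guarantees full rank), a 6-bit message is carried by the LSBs of 8 cover samples, and the lowest-weight change is found by brute force. All function names are hypothetical.

```python
import itertools
import random

N, K = 8, 2   # code length and dimension of the [8, 2] code
R = N - K     # number of parity checks = message bits per block


def make_parity_check(seed=0):
    """Random parity-check matrix H = [I_R | A] over GF(2)."""
    rng = random.Random(seed)
    return [
        [1 if i == j else 0 for j in range(R)]
        + [rng.randint(0, 1) for _ in range(K)]
        for i in range(R)
    ]


def syndrome(H, x):
    """Compute H * x over GF(2)."""
    return [sum(h * v for h, v in zip(row, x)) % 2 for row in H]


def embed(H, cover_bits, message):
    """Flip as few cover bits as possible so the syndrome equals message."""
    target = [a ^ b for a, b in zip(syndrome(H, cover_bits), message)]
    for weight in range(N + 1):
        for positions in itertools.combinations(range(N), weight):
            e = [1 if i in positions else 0 for i in range(N)]
            if syndrome(H, e) == target:
                return [c ^ f for c, f in zip(cover_bits, e)]
    raise ValueError("no solution; H is not full rank")


def extract(H, stego_bits):
    """The receiver only needs H: the message is the syndrome."""
    return syndrome(H, stego_bits)
```

    The receiver never needs the original cover, only the shared matrix H, which is what makes the choice of code parameters a tunable part of the scheme.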

  3. Spread spectrum image steganography.

    Science.gov (United States)

    Marvel, L M; Boncelet, C R; Retter, C T

    1999-01-01

    In this paper, we present a new method of digital steganography, entitled spread spectrum image steganography (SSIS). Steganography, which means "covered writing" in Greek, is the science of communicating in a hidden manner. Following a discussion of steganographic communication theory and review of existing techniques, the new method, SSIS, is introduced. This system hides and recovers a message of substantial length within digital imagery while maintaining the original image size and dynamic range. The hidden message can be recovered using appropriate keys without any knowledge of the original image. Image restoration, error-control coding, and techniques similar to spread spectrum are described, and the performance of the system is illustrated. A message embedded by this method can be in the form of text, imagery, or any other digital signal. Applications for such a data-hiding scheme include in-band captioning, covert communication, image tamperproofing, authentication, embedded control, and revision tracking.

  4. Multi-bit wavelength coding phase-shift-keying optical steganography based on amplified spontaneous emission noise

    Science.gov (United States)

    Wang, Cheng; Wang, Hongxiang; Ji, Yuefeng

    2018-01-01

    In this paper, a multi-bit wavelength coding phase-shift-keying (PSK) optical steganography method is proposed based on amplified spontaneous emission noise and wavelength selection switch. In this scheme, the assignment codes and the delay length differences provide a large two-dimensional key space. A 2-bit wavelength coding PSK system is simulated to show the efficiency of our proposed method. The simulated results demonstrate that the stealth signal, after being encoded and modulated, is well-hidden in both time and spectral domains, under the public channel and noise existing in the system. Besides, even if the principle of this scheme and the existence of the stealth channel are known to the eavesdropper, the probability of recovering the stealth data is less than 0.02 if the key is unknown. Thus it can protect the security of the stealth channel more effectively. Furthermore, the stealth channel results in a 0.48 dB power penalty to the public channel at a bit error rate of 1 × 10^-9, and the public channel has no influence on the receiving of the stealth channel.

  5. Design of a WAV audio player based on K20

    Directory of Open Access Journals (Sweden)

    Xu Yu

    2016-01-01

    Full Text Available The designed player uses the Freescale Company’s MK20DX128VLH7 as the core control chip, and its hardware platform is equipped with VS1003 audio decoder, OLED display interface, USB interface and SD card slot. The player uses the open source embedded real-time operating system μC/OS-II, Freescale USB Stack V4.1.1 and FATFS, and a graphical user interface is developed to improve the user experience based on CGUI. In general, the designed WAV audio player has a strong applicability and a good practical value.

  6. Audio Arduino - an ALSA (Advanced Linux Sound Architecture) audio driver for FTDI-based Arduinos

    DEFF Research Database (Denmark)

    Dimitrov, Smilen; Serafin, Stefania

    2011-01-01

    A contemporary PC user typically expects a sound card to be a piece of hardware that can be manipulated by 'audio' software (most typically exemplified by 'media players') and that allows interfacing of the PC to audio reproduction and/or recording equipment. As such, a 'sound card' can be conside...

  7. Audio-Visual Speaker Diarization Based on Spatiotemporal Bayesian Fusion.

    Science.gov (United States)

    Gebru, Israel D; Ba, Sileye; Li, Xiaofei; Horaud, Radu

    2018-05-01

    Speaker diarization consists of assigning speech signals to people engaged in a dialogue. An audio-visual spatiotemporal diarization model is proposed. The model is well suited for challenging scenarios that consist of several participants engaged in multi-party interaction while they move around and turn their heads towards the other participants rather than facing the cameras and the microphones. Multiple-person visual tracking is combined with multiple speech-source localization in order to tackle the speech-to-person association problem. The latter is solved within a novel audio-visual fusion method on the following grounds: binaural spectral features are first extracted from a microphone pair, then a supervised audio-visual alignment technique maps these features onto an image, and finally a semi-supervised clustering method assigns binaural spectral features to visible persons. The main advantage of this method over previous work is that it processes, in a principled way, speech signals uttered simultaneously by multiple persons. The diarization itself is cast into a latent-variable temporal graphical model that infers speaker identities and speech turns, based on the output of an audio-visual association process, executed at each time slice, and on the dynamics of the diarization variable itself. The proposed formulation yields an efficient exact inference procedure. A novel dataset that contains audio-visual training data as well as a number of scenarios involving several participants engaged in formal and informal dialogue is introduced. The proposed method is thoroughly tested and benchmarked with respect to several state-of-the-art diarization algorithms.

  8. Steganography in inactive frames of VoIP streams encoded by source codec

    OpenAIRE

    Huang, Yongfeng; Tang, Shanyu; Yuan, Jian

    2011-01-01

    This paper describes a novel high capacity steganography algorithm for embedding data in the inactive frames of low bit rate audio streams encoded by G.723.1 source codec, which is used extensively in Voice over Internet Protocol (VoIP). This study reveals that, contrary to existing thoughts, the inactive frames of VoIP streams are more suitable for data embedding than the active frames of the streams, that is, steganography in the inactive audio frames attains a larger data embedding capacit...

  9. COMPARISON OF DIGITAL IMAGE STEGANOGRAPHY METHODS

    OpenAIRE

    S. A. Seyyedi; R. Kh. Sadykhov

    2013-01-01

    Steganography is a method of hiding information in other information of different format (container). There are many steganography techniques with various types of container. In the Internet, digital images are the most popular and frequently used containers. We consider main image steganography techniques and their advantages and disadvantages. We also identify the requirements of a good steganography algorithm and compare various such algorithms.

  10. Principles and Perspectives of Steganography in Error Correction Codes

    OpenAIRE

    Pavel Vladimirovich Slipenchuk

    2014-01-01

    The main advantages of steganography in ECC over other types of steganography are listed. The opportunity for and relevance of rigorous mathematical research on this type of steganography are explained. The goals of steganography in ECC are listed.

  11. Is image steganography natural?

    Science.gov (United States)

    Martín, Alvaro; Sapiro, Guillermo; Seroussi, Gadiel

    2005-12-01

    Steganography is the art of secret communication. Its purpose is to hide the presence of information, using, for example, images as covers. We experimentally investigate if stego-images, bearing a secret message, are statistically "natural." For this purpose, we use recent results on the statistics of natural images and investigate the effect of some popular steganography techniques. We found that these fundamental statistics of natural images are, in fact, generally altered by the hidden "nonnatural" information. Frequently, the change is consistently biased in a given direction. However, for the class of natural images considered, the change generally falls within the intrinsic variability of the statistics, and, thus, does not allow for reliable detection, unless knowledge of the data hiding process is taken into account. In the latter case, significant levels of detection are demonstrated.

  12. Audio Watermarking Based on HAS and Neural Networks in DCT Domain

    Directory of Open Access Journals (Sweden)

    Hung-Hsu Tsai

    2003-03-01

    Full Text Available We propose a new intelligent audio watermarking method based on the characteristics of the HAS and the techniques of neural networks in the DCT domain. The method makes the watermark imperceptible by using the audio masking characteristics of the HAS. Moreover, the method exploits a neural network for memorizing the relationships between the original audio signals and the watermarked audio signals. Therefore, the method is capable of extracting watermarks without original audio signals. Finally, experimental results are included to illustrate that the method is robust against common attacks on the copyright protection of digital audio.

  13. Quantum red-green-blue image steganography

    Science.gov (United States)

    Heidari, Shahrokh; Pourarian, Mohammad Rasoul; Gheibi, Reza; Naseri, Mosayeb; Houshmand, Monireh

    One of the most widely considered matters in the field of quantum information processing is quantum data hiding, including quantum steganography and quantum watermarking. This field is an efficient tool for protecting any kind of digital data. In this paper, three quantum color image steganography algorithms are investigated based on Least Significant Bit (LSB). The first algorithm employs only one of the image’s channels to cover secret data. The second procedure is based on the LSB XORing technique, and the last algorithm utilizes two channels to cover the color image for hiding secret quantum data. The performance of the proposed schemes is analyzed by using software simulations in the MATLAB environment. The analysis of PSNR, BER and histogram graphs indicates that the presented schemes exhibit acceptable performance, and theoretical analysis demonstrates that the network complexity of the approaches scales quadratically.

  14. Noiseless Steganography The Key to Covert Communications

    CERN Document Server

    Desoky, Abdelrahman

    2012-01-01

    Among the features that make Noiseless Steganography: The Key to Covert Communications a first of its kind: the first to comprehensively cover Linguistic Steganography; the first to comprehensively cover Graph Steganography; the first to comprehensively cover Game Steganography. Although the goal of steganography is to prevent adversaries from suspecting the existence of covert communications, most books on the subject present outdated steganography approaches that are detectable by human and/or machine examinations. These approaches often fail because they camouflage data as a detectable noise b

  15. Steganography and steganalysis in voice-over IP scenarios: operational aspects and first experiences with a new steganalysis tool set

    Science.gov (United States)

    Dittmann, Jana; Hesse, Danny; Hillert, Reyk

    2005-03-01

    Based on the knowledge and experiences from existing image steganalysis techniques, the overall objective of the paper is to evaluate existing audio steganography with a special focus on attacks in ad-hoc end-to-end media communications on the example of Voice over IP (VoIP) scenarios. One aspect is to understand operational requirements of recent steganographic techniques for VoIP applications. The other aspect is to elaborate possible steganalysis approaches applied to speech data. In particular we have examined existing VoIP applications with respect to their extensibility to steganographic algorithms. We have also paid attention to the part of steganalysis in PCM audio data, which allows us to detect hidden communication during a running VoIP communication that uses the PCM codec. In our implementation we use Jori's Voice over IP library by Jori Liesenborgs (JVOIPLIB), which provides primitives for a voice over IP communication. Finally we show first results of our prototypic implementation, which extends the common VoIP scenario by the new feature of steganography. We also show the results for our PCM steganalyzer framework that is able to detect this kind of hidden communication by using a set of 13 first and second order statistics.

  16. The method of narrow-band audio classification based on universal noise background model

    Science.gov (United States)

    Rui, Rui; Bao, Chang-chun

    2013-03-01

    Audio classification is the basis of content-based audio analysis and retrieval. The conventional classification methods mainly depend on feature extraction from audio clips, which increases the time required for classification. An approach for classifying the narrow-band audio stream based on frame-level audio feature extraction is presented in this paper. The audio signals are divided into speech, instrumental music, song with accompaniment and noise using the Gaussian mixture model (GMM). In order to cope with changes in the actual environment, a universal noise background model (UNBM) for white noise, street noise, factory noise and car interior noise is built. In addition, three feature schemes are considered to optimize feature selection. The experimental results show that the proposed algorithm achieves a high accuracy for audio classification, especially under each noise background we used, and keeps the classification time under one second.

  17. Steganography -- The New Intelligence Threat

    Science.gov (United States)

    2004-01-01

    Information can be embedded within text files, digital music and videos, and digital photographs by simply changing bits and bytes. HOW IT WORKS...International Airport could be embedded in Brittany Spears’ latest music release in MP3 format. The wide range of steganography capabilities has been...on the Internet through the use of steganography.4 Embedded files are believed to be posted in sports chat rooms, pornographic bulletin boards, and

  18. Detection of Steganography-Producing Software Artifacts on Crime-Related Seized Computers

    Directory of Open Access Journals (Sweden)

    Asawaree Kulkarni

    2009-06-01

    Full Text Available Steganography is the art and science of hiding information within information so that an observer does not know that communication is taking place. Bad actors passing information using steganography are of concern to the national security establishment and law enforcement. An attempt was made to determine if steganography was being used by criminals to communicate information. Web crawling technology was used and images were downloaded from Web sites that were considered as likely candidates for containing information hidden using steganographic techniques. A detection tool was used to analyze these images. The research failed to demonstrate that steganography was prevalent on the public Internet. The probable reasons included the growth and availability of a large number of steganography-producing tools and the limited capacity of the detection tools to cope with them. Thus, a redirection was introduced in the methodology and the detection focus was shifted from the analysis of the ‘product’ of the steganography-producing software, viz. the images, to the ‘artifacts’ left by the steganography-producing software while it is being used to generate steganographic images. This approach was based on the concept of ‘Stego-Usage Timeline’. As a proof of concept, a sample set of criminal computers was scanned for the remnants of steganography-producing software. The results demonstrated that the problem of ‘the detection of the usage of steganography’ could be addressed by the approach adopted after the research redirection and that certain steganographic software was popular among the criminals. Thus, the contribution of the research was in demonstrating that the limitations of the tools based on the signature detection of steganographically altered images can be overcome by focusing the detection effort on detecting the artifacts of the steganography-producing tools.

  19. Perceived Audio Quality Analysis in Digital Audio Broadcasting Plus System Based on PEAQ

    Directory of Open Access Journals (Sweden)

    K. Ulovec

    2018-04-01

    Full Text Available Broadcasters need to decide on bitrates of the services in the multiplex transmitted via Digital Audio Broadcasting Plus system. The bitrate should be set as low as possible for maximal number of services, but with high quality, not lower than in conventional analog systems. In this paper, the objective method Perceptual Evaluation of Audio Quality is used to analyze the perceived audio quality for appropriate codecs: MP2 and AAC offering three profiles. The main aim is to determine dependencies on the type of signal (music and speech), the number of channels (stereo and mono), and the bitrate. Results indicate that only MP2 codec and AAC Low Complexity profile reach imperceptible quality loss. The MP2 codec needs higher bitrate than AAC Low Complexity profile for the same quality. For both versions of AAC High-Efficiency profiles, the limit bitrates are determined above which less complex profiles outperform the more complex ones and higher bitrates above these limits are not worth using. It is shown that stereo music has worse quality than stereo speech generally, whereas for mono, the dependencies vary upon the codec/profile. Furthermore, numbers of services satisfying various quality criteria are presented.

  20. COMPARISON OF DIGITAL IMAGE STEGANOGRAPHY METHODS

    Directory of Open Access Journals (Sweden)

    S. A. Seyyedi

    2013-01-01

    Full Text Available Steganography is a method of hiding information in other information of different format (container). There are many steganography techniques with various types of container. In the Internet, digital images are the most popular and frequently used containers. We consider main image steganography techniques and their advantages and disadvantages. We also identify the requirements of a good steganography algorithm and compare various such algorithms.

  1. Code wars: steganography, signals intelligence, and terrorism

    OpenAIRE

    Conway, Maura

    2003-01-01

    This paper describes and discusses the process of secret communication known as steganography. The argument advanced here is that terrorists are unlikely to be employing digital steganography to facilitate secret intra-group communication as has been claimed. This is because terrorist use of digital steganography is both technically and operationally implausible. The position adopted in this paper is that terrorists are likely to employ low-tech steganography such as semagrams and null cipher...

  2. Analysis of Incremental Growth in Image Steganography Techniques for Various Parameters

    OpenAIRE

    Rajkumar Yadav

    2011-01-01

    Data hiding techniques have received very strong support from the research community during the last two decades. Steganography has been a widely used technique for data hiding during the last few years. Steganography is the art and science of hiding data in some cover media like an image file, audio file, video file, text file etc. Out of the various cover media available, the image file is the most widely used. There are many techniques that are widely used for image steganography during the l...

  3. A dictionary learning and source recovery based approach to classify diverse audio sources

    OpenAIRE

    Girish, K V Vijay; Ananthapadmanabha, T V; Ramakrishnan, A G

    2015-01-01

    A dictionary learning based audio source classification algorithm is proposed to classify a sample audio signal as one amongst a finite set of different audio sources. Cosine similarity measure is used to select the atoms during dictionary learning. Based on three objective measures proposed, namely, signal to distortion ratio (SDR), the number of non-zero weights and the sum of weights, a frame-wise source classification accuracy of 98.2% is obtained for twelve different sources. Cent percen...

  4. An RGB colour image steganography scheme using overlapping block-based pixel-value differencing.

    Science.gov (United States)

    Prasad, Shiv; Pal, Arup Kumar

    2017-04-01

    This paper presents a steganographic scheme based on the RGB colour cover image. The secret message bits are embedded into each colour pixel sequentially by the pixel-value differencing (PVD) technique. PVD basically works on two consecutive non-overlapping components; as a result, the straightforward conventional PVD technique is not applicable to embed the secret message bits into a colour pixel, since a colour pixel consists of three colour components, i.e. red, green and blue. Hence, in the proposed scheme, initially the three colour components are represented into two overlapping blocks like the combination of red and green colour components, while another one is the combination of green and blue colour components, respectively. Later, the PVD technique is employed on each block independently to embed the secret data. The two overlapping blocks are readjusted to attain the modified three colour components. The notion of overlapping blocks has improved the embedding capacity of the cover image. The scheme has been tested on a set of colour images and satisfactory results have been achieved in terms of embedding capacity and upholding the acceptable visual quality of the stego-image.

  5. Multilevel inverter based class D audio amplifier for capacitive transducers

    DEFF Research Database (Denmark)

    Nielsen, Dennis; Knott, Arnold; Andersen, Michael A. E.

    2014-01-01

    The reduced semiconductor voltage stress makes the multilevel inverters especially interesting, when driving capacitive transducers for audio applications. A ± 300 V flying capacitor class D audio amplifier driving a 100 nF load in the midrange region of 0.1-3.5 kHz with Total Harmonic Distortion...

  6. Multilevel inverter based class D audio amplifier for capacitive transducers

    DEFF Research Database (Denmark)

    Nielsen, Dennis; Knott, Arnold; Andersen, Michael A. E.

    2014-01-01

    The reduced semiconductor voltage stress makes the multilevel inverters especially interesting, when driving capacitive transducers for audio applications. A ± 300 V flying capacitor class D audio amplifier driving a 100 nF load in the midrange region of 0.1-3.5 kHz with Total Harmonic Distortion plus Noise (THD+N) below 1% is presented....

  7. Advances in audio watermarking based on singular value decomposition

    CERN Document Server

    Dhar, Pranab Kumar

    2015-01-01

    This book introduces audio watermarking methods for copyright protection, which has drawn extensive attention for securing digital data from unauthorized copying. The book is divided into two parts. First, an audio watermarking method in discrete wavelet transform (DWT) and discrete cosine transform (DCT) domains using singular value decomposition (SVD) and quantization is introduced. This method is robust against various attacks and provides good imperceptible watermarked sounds. Then, an audio watermarking method in fast Fourier transform (FFT) domain using SVD and Cartesian-polar transformation (CPT) is presented. This method has high imperceptibility and high data payload and it provides good robustness against various attacks. These techniques allow media owners to protect copyright and to show authenticity and ownership of their material in a variety of applications. Features new methods of audio watermarking for copyright protection and ownership protection. Outl...

  8. Conflicting audio-haptic feedback in physically based simulation of walking sounds

    DEFF Research Database (Denmark)

    Turchet, Luca; Serafin, Stefania; Dimitrov, Smilen

    2010-01-01

    We describe an audio-haptic experiment conducted using a system which simulates in real-time the auditory and haptic sensation of walking on different surfaces. The system is based on physical models, that drive both the haptic and audio synthesizers, and a pair of shoes enhanced with sensors...

  9. A Psychoacoustic-Based Multiple Audio Object Coding Approach via Intra-Object Sparsity

    Directory of Open Access Journals (Sweden)

    Maoshen Jia

    2017-12-01

    Full Text Available Rendering spatial sound scenes via audio objects has become popular in recent years, since it can provide more flexibility for different auditory scenarios, such as 3D movies, spatial audio communication and virtual classrooms. To facilitate high-quality bitrate-efficient distribution for spatial audio objects, an encoding scheme based on intra-object sparsity (approximate k-sparsity of the audio object itself) is proposed in this paper. The statistical analysis is presented to validate the notion that the audio object has a stronger sparseness in the Modified Discrete Cosine Transform (MDCT) domain than in the Short Time Fourier Transform (STFT) domain. By exploiting intra-object sparsity in the MDCT domain, multiple simultaneously occurring audio objects are compressed into a mono downmix signal with side information. To ensure a balanced perception quality of audio objects, a Psychoacoustic-based time-frequency instants sorting algorithm and an energy equalized Number of Preserved Time-Frequency Bins (NPTF) allocation strategy are proposed, which are employed in the underlying compression framework. The downmix signal can be further encoded via Scalar Quantized Vector Huffman Coding (SQVH) technique at a desirable bitrate, and the side information is transmitted in a lossless manner. Both objective and subjective evaluations show that the proposed encoding scheme outperforms the Sparsity Analysis (SPA) approach and Spatial Audio Object Coding (SAOC) in cases where eight objects were jointly encoded.

  10. TECHNICAL NOTE: Portable audio electronics for impedance-based measurements in microfluidics

    Science.gov (United States)

    Wood, Paul; Sinton, David

    2010-08-01

    We demonstrate the use of audio electronics-based signals to perform on-chip electrochemical measurements. Cell phones and portable music players are examples of consumer electronics that are easily operated and are ubiquitous worldwide. Audio output (play) and input (record) signals are voltage based and contain frequency and amplitude information. A cell phone, laptop soundcard and two compact audio players are compared with respect to frequency response; the laptop soundcard provides the most uniform frequency response, while the cell phone performance is found to be insufficient. The audio signals in the common portable music players and laptop soundcard operate in the range of 20 Hz to 20 kHz and are found to be applicable, as voltage input and output signals, to impedance-based electrochemical measurements in microfluidic systems. Validated impedance-based measurements of concentration (0.1-50 mM), flow rate (2-120 µL/min) and particle detection (32 µm diameter) are demonstrated. The prevailing, lossless, wave audio file format is found to be suitable for data transmission to and from external sources, such as a centralized lab, and the cost of all hardware (in addition to audio devices) is ~10 USD. The utility demonstrated here, in combination with the ubiquitous nature of portable audio electronics, presents new opportunities for impedance-based measurements in portable microfluidic systems.

  11. An FPGA Implementation of Secured Steganography Communication System

    Directory of Open Access Journals (Sweden)

    Ahlam Mahmood

    2014-04-01

    Full Text Available Steganography is the idea of hiding a secret message in a multimedia cover that will be transmitted through the Internet. The cover carriers can be image, video, sound or text data. This paper presents an implementation of a color image steganographic system on a Field Programmable Gate Array (FPGA) and the information hiding/extracting techniques in various images. The proposed algorithm merges ideas from random pixel manipulation methods with the Least Significant Bit (LSB) matching method of steganographic embedding and extraction. In the proposed steganography hardware approach, a Linear Feedback Shift Register (LFSR) method is used in the stego architecture to hide the information in the image. The LFSRs are utilized in this approach as address generators. Different LFSR arrangements using different connection units have been implemented at the hardware level for hiding/extracting the secret data. Multilayer embedding is implemented in a parallel manner with a three-stage pipeline on the FPGA. This work showed attractive results, especially high throughput, better stego-image quality, little calculation and less utilization of FPGA area. The imperceptibility of the technique, combined with high payload, robustness of embedded data and accurate data retrieval, renders the proposed steganography system suitable for covert communication and secure data transmission applications.
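
The idea of using an LFSR as an address generator for LSB hiding can be sketched in software as follows. This is a minimal illustration, not the paper's hardware design: the 8-bit register, the tap set (8, 6, 5, 4), the seed, and the function names `lfsr_addresses`, `embed` and `extract` are all assumptions, and plain LSB replacement is used here in place of the LSB matching the abstract mentions.

```python
def lfsr_addresses(seed, count, taps=(8, 6, 5, 4), width=8):
    """Yield `count` pseudo-random addresses from a Fibonacci LFSR.

    The tap set is a commonly tabulated maximal-length choice for an
    8-bit register; it is an assumption, not the paper's configuration.
    """
    mask = (1 << width) - 1
    state = seed & mask
    if state == 0:
        raise ValueError("LFSR seed must be non-zero")
    for _ in range(count):
        yield state
        # XOR the tapped bits to form the feedback bit.
        feedback = 0
        for t in taps:
            feedback ^= (state >> (t - 1)) & 1
        # Shift left and feed the new bit in at the bottom.
        state = ((state << 1) | feedback) & mask


def embed(pixels, message, seed=0x5A):
    """Write the message bits into the LSBs of LFSR-selected pixels."""
    bits = [(byte >> i) & 1 for byte in message for i in range(7, -1, -1)]
    if len(bits) > 255 or len(pixels) < 256:
        raise ValueError("this sketch supports up to 255 bits in >= 256 pixels")
    stego = list(pixels)
    for addr, bit in zip(lfsr_addresses(seed, len(bits)), bits):
        stego[addr] = (stego[addr] & ~1) | bit  # replace the LSB only
    return stego


def extract(stego, n_bytes, seed=0x5A):
    """Read the LSBs back in the same LFSR order and repack them as bytes."""
    bits = [stego[addr] & 1 for addr in lfsr_addresses(seed, n_bytes * 8)]
    out = bytearray()
    for i in range(0, len(bits), 8):
        byte = 0
        for b in bits[i:i + 8]:
            byte = (byte << 1) | b
        out.append(byte)
    return bytes(out)
```

In this sketch the seed plays the role of a shared key: a receiver who knows the seed and tap set regenerates the same address sequence, while each touched pixel changes by at most one intensity level.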

  12. An FPGA Implementation of Secured Steganography Communication System

    Directory of Open Access Journals (Sweden)

    Ahlam Fadhil Mahmood

    2013-04-01

    Full Text Available Steganography is the idea of hiding a secret message in a multimedia cover that will be transmitted through the Internet. The cover carriers can be image, video, sound or text data. This paper presents an implementation of a color image steganographic system on a Field Programmable Gate Array (FPGA) and the information hiding/extracting techniques in various images. The proposed algorithm merges ideas from random pixel manipulation methods with the Least Significant Bit (LSB) matching method of steganographic embedding and extraction. In the proposed steganography hardware approach, a Linear Feedback Shift Register (LFSR) method is used in the stego architecture to hide the information in the image. The LFSRs are utilized in this approach as address generators. Different LFSR arrangements using different connection units have been implemented at the hardware level for hiding/extracting the secret data. Multilayer embedding is implemented in a parallel manner with a three-stage pipeline on the FPGA. This work showed attractive results, especially high throughput, better stego-image quality, little calculation and less utilization of FPGA area. The imperceptibility of the technique, combined with high payload, robustness of embedded data and accurate data retrieval, renders the proposed steganography system suitable for covert communication and secure data transmission applications.

  13. Analysis of current-bidirectional buck-boost based switch-mode audio amplifier

    DEFF Research Database (Denmark)

    Bolten Maizonave, Gert; Andersen, Michael A. E.; Kjærgaard, Claus

    2011-01-01

    The following study was carried out in order to assess quantitatively the performance of the buck-boost converter when used as a switch-mode audio amplifier. It comprises, to begin with, the delimitation of design criteria based on the state-of-the-art solution, which is based in a diffe...... in such configuration when applied for audio....

  14. Design of batch audio/video conversion platform based on JavaEE

    Science.gov (United States)

    Cui, Yansong; Jiang, Lianpin

    2018-03-01

    With the rapid development of the digital publishing industry, audio/video publishing is characterized by diverse coding standards for audio and video files, massive data volumes and other significant features. Faced with massive and diverse data, quickly and efficiently converting it to a unified coding format poses great difficulties for digital publishing organizations. In view of this demand and the present situation, this paper proposes a distributed online audio and video format conversion platform with a B/S structure, based on the Spring+SpringMVC+MyBatis development architecture and combined with the open-source FFMPEG format conversion tool. Based on the Java language, the paper emphatically analyzes the key technologies and strategies used in the design of the platform architecture, and designs and develops an efficient audio and video format conversion system composed of a "front display system", a "core scheduling server" and a "conversion server". The test results show that, compared with the ordinary audio and video conversion scheme, the batch audio and video format conversion platform can effectively improve the conversion efficiency of audio and video files and reduce the complexity of the work. Practice has proved that the key technology discussed in this paper can be applied in the field of large batch file processing, and has certain practical application value.

  15. Impact of audio narrated animation on students' understanding and learning environment based on gender

    Science.gov (United States)

    Nasrudin, Ajeng Ratih; Setiawan, Wawan; Sanjaya, Yayan

    2017-05-01

    This study is titled the impact of audio narrated animation on students' understanding in learning the human respiratory system based on gender. The study was conducted in the eighth grade of junior high school. It aims to investigate the difference in students' understanding and learning environment between boys' and girls' classes in learning the human respiratory system using audio narrated animation. The research method used is a quasi-experiment with a matching pre-test post-test comparison group design. The procedures of the study are: (1) preliminary study and learning habituation using audio narrated animation; (2) implementation of learning using audio narrated animation and taking data; (3) analysis and discussion. The results of the analysis show that there is a significant difference in students' understanding and learning environment between boys' and girls' classes in learning the human respiratory system using audio narrated animation, both in general and specifically in achieving learning indicators. The discussion relates to the impact of audio narrated animation, gender characteristics, and the constructivist learning environment. It can be concluded that there is a significant difference in students' understanding between boys' and girls' classes in learning the human respiratory system using audio narrated animation. Additionally, based on interpretation of students' responses, there is a difference in the increment of agreement level in the learning environment.

  16. New quantization matrices for JPEG steganography

    Science.gov (United States)

    Yildiz, Yesna O.; Panetta, Karen; Agaian, Sos

    2007-04-01

    Modern steganography is the secure communication of information by embedding a secret message within a "cover" digital multimedia without any perceptual distortion to the cover media, so the presence of the hidden message is indiscernible. Recently, the Joint Photographic Experts Group (JPEG) format attracted the attention of researchers as the main steganographic format for the following reasons: it is the most common format for storing images, JPEG images are very abundant on Internet bulletin boards and public Internet sites, and they are almost solely used for storing natural images. Well-known JPEG steganographic algorithms such as F5 and Model-based Steganography provide high message capacity with reasonable security. In this paper, we present a method to increase security using JPEG images as the cover medium. The key element of the method is a new parametric key-dependent quantization matrix. This new quantization table has practically the same performance as the JPEG table as far as compression ratio and image statistics are concerned. The resulting image is indiscernible from an image that was created using the JPEG compression algorithm. This paper presents the key-dependent quantization table algorithm and then analyzes the new table performance.

  17. Genre-adaptive semantic computing and audio-based modelling for music mood annotation.

    OpenAIRE

    Saari, Pasi; Fazekas, Gyorgy; Eerola, Tuomas; Barthet, Mathieu; Lartillot, Olivier; Sandler, Mark

    2016-01-01

    This study investigates whether taking genre into account is beneficial for automatic music mood annotation in terms of core affects valence, arousal, and tension, as well as several other mood scales. Novel techniques employing genre-adaptive semantic computing and audio-based modelling are proposed. A technique called the ACTwg employs genre-adaptive semantic computing of mood-related social tags, whereas ACTwg-SLPwg combines semantic computing and audio-based modelling, both in a genre-ada...

  18. Emotion-based Music Retrieval on a Well-reduced Audio Feature Space

    DEFF Research Database (Denmark)

    Ruxanda, Maria Magdalena; Chua, Bee Yong; Nanopoulos, Alexandros

    2009-01-01

    Music expresses emotion. A number of audio extracted features have influence on the perceived emotional expression of music. These audio features generate a high-dimensional space, on which music similarity retrieval can be performed effectively, with respect to human perception of the music-emotion...... on a number of dimensionality reduction algorithms, including both classic and novel approaches. The paper clearly envisages which dimensionality reduction techniques on the considered audio feature space can preserve, on average, the accuracy of the emotion-based music retrieval....

  19. Analytical Features: A Knowledge-Based Approach to Audio Feature Generation

    Directory of Open Access Journals (Sweden)

    Pachet François

    2009-01-01

    Full Text Available We present a feature generation system designed to create audio features for supervised classification tasks. The main contribution to feature generation studies is the notion of analytical features (AFs), a construct designed to support the representation of knowledge about audio signal processing. We describe the most important aspects of AFs, in particular their dimensional type system, on which are based pattern-based random generators, heuristics, and rewriting rules. We show how AFs generalize or improve previous approaches used in feature generation. We report on several projects using AFs for difficult audio classification tasks, demonstrating their advantage over standard audio features. More generally, we propose analytical features as a paradigm to bring raw signals into the world of symbolic computation.

  20. StirMark Benchmark: audio watermarking attacks based on lossy compression

    Science.gov (United States)

    Steinebach, Martin; Lang, Andreas; Dittmann, Jana

    2002-04-01

    StirMark Benchmark is a well-known evaluation tool for watermarking robustness. Additional attacks are added to it continuously. To enable application based evaluation, in our paper we address attacks against audio watermarks based on lossy audio compression algorithms to be included in the test environment. We discuss the effect of different lossy compression algorithms like MPEG-2 audio Layer 3, Ogg or VQF on a selection of audio test data. Our focus is on changes regarding the basic characteristics of the audio data like spectrum or average power and on removal of embedded watermarks. Furthermore we compare results of different watermarking algorithms and show that lossy compression is still a challenge for most of them. There are two strategies for adding evaluation of robustness against lossy compression to StirMark Benchmark: (a) use of existing free compression algorithms (b) implementation of a generic lossy compression simulation. We discuss how such a model can be implemented based on the results of our tests. This method is less complex, as no real psycho acoustic model has to be applied. Our model can be used for audio watermarking evaluation of numerous application fields. As an example, we describe its importance for e-commerce applications with watermarking security.

  1. Estimation of inhalation flow profile using audio-based methods to assess inhaler medication adherence.

    Science.gov (United States)

    Taylor, Terence E; Lacalle Muls, Helena; Costello, Richard W; Reilly, Richard B

    2018-01-01

    Asthma and chronic obstructive pulmonary disease (COPD) patients are required to inhale forcefully and deeply to receive medication when using a dry powder inhaler (DPI). There is a clinical need to objectively monitor the inhalation flow profile of DPIs in order to remotely monitor patient inhalation technique. Audio-based methods have been previously employed to accurately estimate flow parameters such as the peak inspiratory flow rate of inhalations; however, these methods required multiple calibration inhalation audio recordings. In this study, an audio-based method is presented that accurately estimates inhalation flow profile using only one calibration inhalation audio recording. Twenty healthy participants were asked to perform 15 inhalations through a placebo Ellipta™ DPI at a range of inspiratory flow rates. Inhalation flow signals were recorded using a pneumotachograph spirometer while inhalation audio signals were recorded simultaneously using the Inhaler Compliance Assessment device attached to the inhaler. The acoustic (amplitude) envelope was estimated from each inhalation audio signal. Using only one recording, linear and power law regression models were employed to determine which model best described the relationship between the inhalation acoustic envelope and flow signal. Each model was then employed to estimate the flow signals of the remaining 14 inhalation audio recordings. This process was repeated until each of the 15 recordings had been employed to calibrate single models while testing on the remaining 14 recordings. It was observed that power law models generated the highest average flow estimation accuracy across all participants (90.89±0.9% for power law models and 76.63±2.38% for linear models). The method also generated sufficient accuracy in estimating inhalation parameters such as peak inspiratory flow rate and inspiratory capacity within the presence of noise. Estimating inhaler inhalation flow profiles using audio based methods may be
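
The single-recording calibration described in this record can be illustrated with a small power law fit. This is a sketch under stated assumptions: the abstract does not say how the regression was carried out, so ordinary least squares in log-log space is assumed here, and the names `fit_power_law` and `estimate_flow` are hypothetical. Envelope and flow samples must be strictly positive for the logarithms to exist.

```python
import math


def fit_power_law(envelope, flow):
    """Fit flow ~= a * envelope**b by least squares on log-transformed data."""
    xs = [math.log(e) for e in envelope]
    ys = [math.log(f) for f in flow]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                       # slope in log-log space is the exponent
    a = math.exp(mean_y - b * mean_x)   # intercept gives the scale factor
    return a, b


def estimate_flow(envelope, a, b):
    """Apply a calibrated model to the envelope of another recording."""
    return [a * e ** b for e in envelope]
```

A model calibrated on one recording with `fit_power_law` would then be applied to the envelopes of the remaining recordings with `estimate_flow`, mirroring the leave-one-in procedure the abstract describes.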

  2. Comparing Learning Gains: Audio Versus Text-based Instructor Communication in a Blended Online Learning Environment

    Science.gov (United States)

    Shimizu, Dominique

    Though blended course audio feedback has been associated with several measures of course satisfaction at the postsecondary and graduate levels compared to text feedback, it may take longer to prepare and positive results are largely unverified in K-12 literature. The purpose of this quantitative study was to investigate the time investment and learning impact of audio communications with 228 secondary students in a blended online learning biology unit at a central Florida public high school. A short, individualized audio message regarding the student's progress was given to each student in the audio group; similar text-based messages were given to each student in the text-based group on the same schedule; a control group got no feedback. A pretest and posttest were employed to measure learning gains in the three groups. To compare the learning gains from the two types of feedback with each other and with no feedback, a controlled, randomized, experimental design was implemented. In addition, the creation and posting of audio and text feedback communications were timed in order to assess whether audio feedback took longer to produce than text-only feedback. While audio feedback communications did take longer to create and post, there was no difference between learning gains as measured by posttest scores when students received audio, text-based, or no feedback. Future studies using a similar randomized, controlled experimental design are recommended to verify these results and test whether the trend holds in a broader range of subjects, over different time frames, and using a variety of assessment types to measure student learning.

  3. Overview: Main Fundamentals for Steganography

    OpenAIRE

    AL-Ani, Zaidoon Kh.; Zaidan, A. A.; Zaidan, B. B.; Alanazi, Hamdan. O.

    2010-01-01

    The rapid development of multimedia and the internet allows for wide distribution of digital media data. It becomes much easier to edit, modify and duplicate digital information. Besides that, digital documents are also easy to copy and distribute, and therefore face many threats. This is a big security and privacy issue, and it becomes necessary to find appropriate protection because of the significance, accuracy and sensitivity of the information. Steganography considers one of the techniqu...

  4. Comparison of Video Steganography Methods for Watermark Embedding

    Directory of Open Access Journals (Sweden)

    Griberman David

    2016-05-01

    Full Text Available The paper focuses on the comparison of video steganography methods for the purpose of digital watermarking in the context of copyright protection. Four embedding methods that use Discrete Cosine and Discrete Wavelet Transforms have been researched and compared based on their embedding efficiency and fidelity. A video steganography program has been developed in the Java programming language with all of the researched methods implemented for experiments. The experiments used 3 video containers with different amounts of movement. The impact of the movement has been addressed in the paper as well as the ways of potential improvement of embedding efficiency using adaptive embedding based on the movement amount. Results of the research have been verified using a survey with 17 participants.

  5. Hierarchical structure for audio-video based semantic classification of sports video sequences

    Science.gov (United States)

    Kolekar, M. H.; Sengupta, S.

    2005-07-01

    A hierarchical structure for sports event classification based on audio and video content analysis is proposed in this paper. Compared to the event classifications in other games, those of cricket are very challenging and yet unexplored. We have successfully solved cricket video classification problem using a six level hierarchical structure. The first level performs event detection based on audio energy and Zero Crossing Rate (ZCR) of short-time audio signal. In the subsequent levels, we classify the events based on video features using a Hidden Markov Model implemented through Dynamic Programming (HMM-DP) using color or motion as a likelihood function. For some of the game-specific decisions, a rule-based classification is also performed. Our proposed hierarchical structure can easily be applied to any other sports. Our results are very promising and we have moved a step forward towards addressing semantic classification problems in general.
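
The first-level audio features named in this record, short-time energy and Zero Crossing Rate, can be computed per frame as in the sketch below. The non-overlapping framing, the normalisation choices and the name `frame_features` are assumptions made for illustration; the abstract does not give the authors' frame length or thresholds.

```python
def frame_features(samples, frame_len):
    """Return (energy, zcr) for each non-overlapping frame of `frame_len` samples.

    energy: mean squared amplitude of the frame.
    zcr:    fraction of adjacent sample pairs whose signs differ.
    """
    if frame_len < 2:
        raise ValueError("frame_len must be at least 2")
    features = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energy = sum(s * s for s in frame) / frame_len
        crossings = sum(
            1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
        )
        features.append((energy, crossings / (frame_len - 1)))
    return features
```

An event detector in this spirit might flag frames whose energy exceeds a chosen threshold (for example crowd cheering), with the ZCR helping to separate noisy from tonal segments; the thresholds themselves would have to be tuned on real data.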

  6. Quantum watermarking scheme through Arnold scrambling and LSB steganography

    Science.gov (United States)

    Zhou, Ri-Gui; Hu, Wenwen; Fan, Ping

    2017-09-01

    Based on the NEQR of quantum images, a new quantum gray-scale image watermarking scheme is proposed through Arnold scrambling and least significant bit (LSB) steganography. The sizes of the carrier image and the watermark image are assumed to be 2n × 2n and n × n, respectively. Firstly, a classical n × n sized watermark image with 8-bit gray scale is expanded to a 2n × 2n sized image with 2-bit gray scale. Secondly, through the module of PA-MOD N, the expanded watermark image is scrambled to a meaningless image by the Arnold transform. Then, the expanded scrambled image is embedded into the carrier image by the steganography method of LSB. Finally, the time complexity analysis is given. The simulation experiment results show that our quantum circuit has lower time complexity, and the proposed watermarking scheme is superior to others.
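
A classical (non-quantum) sketch of the first two steps may help make the pipeline concrete: expanding an n × n 8-bit watermark into a 2n × 2n 2-bit image, then scrambling it with the Arnold transform. The 2 × 2 block layout used in `expand` and the common cat-map form (x, y) -> (x + y, x + 2y) mod N are assumptions; the paper's quantum circuits are not reproduced here.

```python
def expand(watermark):
    """Split each 8-bit pixel into four 2-bit values laid out as a 2x2 block.

    The block layout (most significant pair at top-left) is an assumption.
    """
    n = len(watermark)
    out = [[0] * (2 * n) for _ in range(2 * n)]
    for y in range(n):
        for x in range(n):
            p = watermark[y][x]
            out[2 * y][2 * x] = (p >> 6) & 3
            out[2 * y][2 * x + 1] = (p >> 4) & 3
            out[2 * y + 1][2 * x] = (p >> 2) & 3
            out[2 * y + 1][2 * x + 1] = p & 3
    return out


def arnold(img, rounds=1):
    """Scramble a square image: pixel (x, y) moves to (x + y, x + 2y) mod N."""
    n = len(img)
    for _ in range(rounds):
        out = [[0] * n for _ in range(n)]
        for y in range(n):
            for x in range(n):
                out[(x + 2 * y) % n][(x + y) % n] = img[y][x]
        img = out
    return img


def arnold_inverse(img, rounds=1):
    """Undo `arnold` by reading each pixel back from its scrambled position."""
    n = len(img)
    for _ in range(rounds):
        out = [[0] * n for _ in range(n)]
        for y in range(n):
            for x in range(n):
                out[y][x] = img[(x + 2 * y) % n][(x + y) % n]
        img = out
    return img
```

The expansion preserves the bit budget (8 bits per pixel over n² pixels equals 2 bits per pixel over 4n² pixels), which is why the expanded watermark fits into a fixed number of low-order bit planes of a 2n × 2n carrier.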

  7. Quantum Steganography for Multi-party Covert Communication

    Science.gov (United States)

    Liu, Lin; Tang, Guang-Ming; Sun, Yi-Feng; Yan, Shu-Fan

    2016-01-01

    A novel multi-party quantum steganography protocol based on quantum secret sharing is proposed in this paper. Hidden channels are built in HBB and improved HBB quantum secret sharing protocols for secret messages transmitting, via the entanglement swapping of GHZ states and Bell measurement. Compared with the original protocol, there are only a few different GHZ states transmitted in the proposed protocol, giving the hidden channel good imperceptibility. Moreover, the secret messages remain secure even when the hidden channel is under attack from dishonest participators, because the sub-secret messages are distributed randomly to different participators. With good imperceptibility and security, the capacity of the proposed protocol is higher than that of the previous multi-party quantum steganography protocol.

  8. Reduction in time-to-sleep through EEG based brain state detection and audio stimulation.

    Science.gov (United States)

    Zhuo Zhang; Cuntai Guan; Ti Eu Chan; Juanhong Yu; Aung Aung Phyo Wai; Chuanchu Wang; Haihong Zhang

    2015-08-01

    We developed an EEG- and audio-based sleep sensing and enhancing system, called iSleep (interactive Sleep enhancement apparatus). The system adopts a closed-loop approach which optimizes the audio recording selection based on the user's sleep status detected through our online EEG computing algorithm. The iSleep prototype comprises two major parts: 1) a sleeping mask integrated with a single channel EEG electrode and amplifier, a pair of stereo earphones and a microcontroller with wireless circuit for control and data streaming; 2) a mobile app to receive EEG signals for online sleep monitoring and audio playback control. In this study we attempt to validate our hypothesis that appropriate audio stimulation in relation to brain state can induce faster onset of sleep and improve the quality of a nap. We conduct experiments on 28 healthy subjects, each undergoing two nap sessions - one with a quiet background and one with our audio-stimulation. We compare the time-to-sleep in both sessions between two groups of subjects, e.g., fast and slow sleep onset groups. The p-value obtained from the Wilcoxon Signed Rank Test is 1.22e-04 for the slow onset group, which demonstrates that iSleep can significantly reduce the time-to-sleep for people with difficulty in falling asleep.

  9. Steganography: Forensic, Security, and Legal Issues

    Directory of Open Access Journals (Sweden)

    Merrill Warkentin

    2008-06-01

    Full Text Available Steganography has long been regarded as a tool used for illicit and destructive purposes such as crime and warfare. Currently, digital tools are widely available to ordinary computer users also. Steganography software allows both illicit and legitimate users to hide messages so that they will not be detected in transit. This article provides a brief history of steganography, discusses the current status in the computer age, and relates this to forensic, security, and legal issues. The paper concludes with recommendations for digital forensics investigators, IT staff, individual users, and other stakeholders.

  10. EFFICIENT ADAPTIVE STEGANOGRAPHY FOR COLOR IMAGES BASED ON LSBMR ALGORITHM

    Directory of Open Access Journals (Sweden)

    B. Sharmila

    2012-02-01

    Full Text Available Steganography is the art of hiding the fact that communication is taking place, by hiding information in another medium. Many different carrier file formats can be used, but digital images are the most popular because of their frequent use on the Internet. For hiding secret information in images, there exists a large variety of steganographic techniques. The Least Significant Bit (LSB) based approach is the simplest type of steganographic algorithm. In all the existing approaches, the decision of choosing the region within a cover image is made without considering the relationship between image content and the size of the secret message. Thus, the plain regions in the cover will be ruined after data hiding even at a low data rate. Hence, choosing the edge region for data hiding is a solution. Many algorithms deal with edges in images for data hiding. The paper 'Edge adaptive image steganography based on LSBMR algorithm', an LSB steganography method, presented the results of its algorithms on gray-scale images only. This paper presents the results of analyzing the performance of edge adaptive steganography for colored images (JPEG). The algorithms have been slightly modified for colored image implementation and are compared on the basis of evaluation parameters like peak signal-to-noise ratio (PSNR) and mean square error (MSE). This method can select the edge region depending on the length of the secret message and the difference between two consecutive bits in the cover image. When the message is short, only small edge regions are used while the other regions are left as they are. When the data rate increases, more regions can be used adaptively for data hiding by adjusting the parameters. Besides this, the message is encrypted using an efficient cryptographic algorithm, which further increases the security.
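
The two evaluation parameters named in this record follow standard definitions, sketched below for flat lists of 8-bit pixel values. The function names `mse` and `psnr` and the default peak of 255 are assumptions for illustration; for a colour image the same computation would typically be applied per channel or over all channel values together.

```python
import math


def mse(cover, stego):
    """Mean square error between two equally sized pixel sequences."""
    if len(cover) != len(stego) or not cover:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((c - s) ** 2 for c, s in zip(cover, stego)) / len(cover)


def psnr(cover, stego, peak=255):
    """Peak signal-to-noise ratio in dB; infinite when the images are identical."""
    err = mse(cover, stego)
    if err == 0:
        return math.inf
    return 10 * math.log10(peak ** 2 / err)
```

A higher PSNR (equivalently a lower MSE) between the cover and stego image indicates less visible distortion introduced by the embedding.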

  11. Analysis of Current-Bidirectional Buck-Boost Based Automotive Switch-Mode Audio Amplifier

    DEFF Research Database (Denmark)

    Bolten Maizonave, Gert; Andersen, Michael A. E.; Kjærgaard, Claus

    2011-01-01

    The following study was carried out in order to assess quantitatively the performance of the buck-boost converter when used as a switch-mode audio amplifier. It comprises, to begin with, the delimitation of design criteria based on the state-of-the-art solution, which is based in a differential ...

  12. Convolution-based classification of audio and symbolic representations of music

    DEFF Research Database (Denmark)

    Velarde, Gissel; Cancino Chacón, Carlos; Meredith, David

    2018-01-01

    We present a novel convolution-based method for classification of audio and symbolic representations of music, which we apply to classification of music by style. Pieces of music are first sampled to pitch–time representations (piano-rolls or spectrograms) and then convolved with a Gaussian filter......-class composer identification, methods specialised for classifying symbolic representations of music are more effective. We also performed experiments on symbolic representations, synthetic audio and two different recordings of The Well-Tempered Clavier by J. S. Bach to study the method’s capacity to distinguish...

  13. DIGITAL DATA PROTECTION USING STEGANOGRAPHY

    Directory of Open Access Journals (Sweden)

    R. Rejani

    2016-03-01

    Full Text Available In today’s digital world, applications on a computer or a mobile device are consistently used to get every kind of work done, for professional as well as entertainment purposes. However, one of the major issues that a software publisher faces is piracy. Throughout the last couple of decades, almost all major or minor software has been pirated and freely circulated across the internet. The impact of rampant software piracy has been huge and runs into billions of dollars every year. For an independent developer or programmer, the impact of piracy is especially severe. Large companies that make specialized software often employ complex hardware methods, such as dongles, to avoid software piracy. However, this is not possible for an ordinary independent programmer or a small company. As part of the research, this paper proposes a new method of software protection that does not need proprietary hardware or other complex methods. This method uses a combination of inbuilt hardware features as well as steganography and encryption to protect the software against piracy. The properties or methods used include uniqueness of hardware, steganography, strong encryption like AES and geographic location. To avoid hacking, the proposed framework also makes use of self-checks in a random manner. The process is quite simple for any developer to implement and is usable on both traditional PCs and mobile environments.

  14. Multimedia security watermarking, steganography, and forensics

    CERN Document Server

    Shih, Frank Y

    2012-01-01

    Multimedia Security: Watermarking, Steganography, and Forensics outlines essential principles, technical information, and expert insights on multimedia security technology used to prove that content is authentic and has not been altered. Illustrating the need for improved content security as the Internet and digital multimedia applications rapidly evolve, this book presents a wealth of everyday protection application examples in fields including multimedia mining and classification, digital watermarking, steganography, and digital forensics. Giving readers an in-depth overview of different asp

  15. Comments on "Steganography Using Reversible Texture Synthesis".

    Science.gov (United States)

    Zhou, Hang; Chen, Kejiang; Zhang, Weiming; Yu, Nenghai

    2017-04-01

    Message hiding in texture image synthesis is a novel steganography approach by which we resample a smaller texture image and synthesize a new texture image with a similar local appearance and an arbitrary size. However, the mirror operation over the image boundary is flawed and is easy to attack. We propose an attacking method on this steganography, which can not only detect the stego-images but can also extract the hidden messages.

  16. The Development of an Audio Computer-Based Classroom Test of ESL Listening Skills.

    Science.gov (United States)

    Balizet, Sha; Treder, Dave; Parshall, Cynthia G.

    There are very few examples of audio-based computerized tests, but for many disciplines, such as foreign language and music, there appear to be many benefits to this type of testing. The purpose of the present study was to develop and compare computer-delivered and audiocassette/paper-and-pencil versions of a listening test. The test was a measure…

  17. Objective Assessment of Patient Inhaler User Technique Using an Audio-Based Classification Approach.

    Science.gov (United States)

    Taylor, Terence E; Zigel, Yaniv; Egan, Clarice; Hughes, Fintan; Costello, Richard W; Reilly, Richard B

    2018-02-01

    Many patients make critical user technique errors when using pressurised metered dose inhalers (pMDIs) which reduce the clinical efficacy of respiratory medication. Such critical errors include poor actuation coordination (poor timing of medication release during inhalation) and inhaling too fast (peak inspiratory flow rate over 90 L/min). Here, we present a novel audio-based method that objectively assesses patient pMDI user technique. The Inhaler Compliance Assessment device was employed to record inhaler audio signals from 62 respiratory patients as they used a pMDI with an In-Check Flo-Tone device attached to the inhaler mouthpiece. Using a quadratic discriminant analysis approach, the audio-based method generated a total frame-by-frame accuracy of 88.2% in classifying sound events (actuation, inhalation and exhalation). The audio-based method estimated the peak inspiratory flow rate and volume of inhalations with an accuracy of 88.2% and 83.94% respectively. It was detected that 89% of patients made at least one critical user technique error even after tuition from an expert clinical reviewer. This method provides a more clinically accurate assessment of patient inhaler user technique than standard checklist methods.

  18. Information hiding techniques for steganography and digital watermarking

    CERN Document Server

    Katzenbeisser, Stefan

    2000-01-01

    Steganography, a means by which two or more parties may communicate using "invisible" or "subliminal" communication, and watermarking, a means of hiding copyright data in images, are becoming necessary components of commercial multimedia applications that are subject to illegal use. This new book is the first comprehensive survey of steganography and watermarking and their application to modern communications and multimedia. Handbook of Information Hiding: Steganography and Watermarking helps you understand steganography, the history of this previously neglected element of cryptography, the

  19. Audio Papers

    DEFF Research Database (Denmark)

    Groth, Sanne Krogh; Samson, Kristine

    2016-01-01

    With this special issue of Seismograf we are happy to present a new format of articles: Audio Papers. Audio papers resemble the regular essay or the academic text in that they deal with a certain topic of interest, but presented in the form of an audio production. The audio paper is an extension...

  20. Evaluation of MPEG-7-Based Audio Descriptors for Animal Voice Recognition over Wireless Acoustic Sensor Networks

    Directory of Open Access Journals (Sweden)

    Joaquín Luque

    2016-05-01

    Environmental audio monitoring is a huge area of interest for biologists all over the world. This is why several audio monitoring systems have been proposed in the literature, which can be classified into two different approaches: acquisition and compression of all audio patterns in order to send them as raw data to a main server; or specific recognition systems based on audio patterns. The first approach has the drawback that a large amount of information must be stored in a main server. Moreover, this information requires a considerable amount of effort to be analyzed. The second approach has the drawback of a lack of scalability when new patterns need to be detected. To overcome these limitations, this paper proposes an environmental Wireless Acoustic Sensor Network architecture focused on the use of generic descriptors based on the MPEG-7 standard. These descriptors prove suitable for recognizing different patterns, allowing high scalability. The proposed parameters have been tested on recognizing different behaviors of two anuran species that live in Spanish natural parks, the Epidalea calamita and the Alytes obstetricans toads, and show high classification performance.

  1. Application of Genetic Algorithm and Particle Swarm Optimization techniques for improved image steganography systems

    Science.gov (United States)

    Jude Hemanth, Duraisamy; Umamaheswari, Subramaniyan; Popescu, Daniela Elena; Naaji, Antoanela

    2016-01-01

    Image steganography is one of the ever-growing computational approaches that have found application in many fields. Frequency domain techniques are highly preferred for image steganography applications. However, there are significant drawbacks associated with these techniques. In transform-based approaches, the secret data is embedded in a random manner in the transform coefficients of the cover image. These transform coefficients may not be optimal in terms of stego image quality and embedding capacity. In this work, the application of Genetic Algorithm (GA) and Particle Swarm Optimization (PSO) has been explored in the context of determining the optimal coefficients in these transforms. Frequency domain transforms such as the Bandelet Transform (BT) and the Finite Ridgelet Transform (FRIT) are used in combination with GA and PSO to improve the efficiency of the image steganography system.

  2. Temporal Masking for Bit-rate Reduction in Audio Codec Based on Frequency Domain Linear Prediction

    OpenAIRE

    Ganapathy, Sriram; Motlicek, Petr; Hermansky, Hynek; Garudadri, Harinath

    2008-01-01

    Audio coding based on Frequency Domain Linear Prediction (FDLP) uses an auto-regressive model to approximate Hilbert envelopes in frequency sub-bands for relatively long temporal segments. Although the basic technique achieves good quality of the reconstructed signal, there is a need for improving the coding efficiency. In this paper, we present a novel method for the application of temporal masking to reduce the bit-rate in an FDLP-based codec. Temporal masking refers to the hearing phenomenon, ...

  3. Advances in Audio-Based Systems to Monitor Patient Adherence and Inhaler Drug Delivery.

    Science.gov (United States)

    Taylor, Terence E; Zigel, Yaniv; De Looze, Céline; Sulaiman, Imran; Costello, Richard W; Reilly, Richard B

    2018-03-01

    Hundreds of millions of people worldwide have asthma and COPD. Current medications to control these chronic respiratory diseases can be administered using inhaler devices, such as the pressurized metered dose inhaler and the dry powder inhaler. Provided that they are used as prescribed, inhalers can improve patient clinical outcomes and quality of life. Poor patient inhaler adherence (both time of use and user technique) is, however, a major clinical concern and is associated with poor disease control, increased hospital admissions, and increased mortality rates, particularly in low- and middle-income countries. There are currently limited methods available to health-care professionals to objectively and remotely monitor patient inhaler adherence. This review describes recent sensor-based technologies that use audio-based approaches that show promising opportunities for monitoring inhaler adherence in clinical practice. This review discusses how one form of sensor-based technology, audio-based monitoring systems, can provide clinically pertinent information regarding patient inhaler use over the course of treatment. Audio-based monitoring can provide health-care professionals with quantitative measurements of the drug delivery of inhalers, signifying a clear clinical advantage over other methods of assessment. Furthermore, objective audio-based adherence measures can improve the predictability of patient outcomes to treatment compared with current standard methods of adherence assessment used in clinical practice. Objective feedback on patient inhaler adherence can be used to personalize treatment to the patient, which may enhance precision medicine in the treatment of chronic respiratory diseases. Copyright © 2017 American College of Chest Physicians. Published by Elsevier Inc. All rights reserved.

  4. WLAN steganography revisited

    Science.gov (United States)

    Kraetzer, Christian; Dittmann, Jana; Merkel, Ronny

    2008-02-01

    Two different approaches for using a sequence of packets of the IEEE 802.11 (WLAN) protocol as cover for steganographic communication can be found in the literature: in 2003 Krzysztof Szczypiorski introduced a method constructing a hidden channel using deliberately corrupted WLAN packets for communication. In 2006 Kraetzer et al. introduced a WLAN steganography approach that works without generating corrupted network packets. This latter approach, with a hidden storage channel scenario (SCI) and a timing channel based scenario (SCII), is reconsidered here. Fixed parameter settings limiting SCI's capabilities in the implementation (already introduced in 2006) motivated an enhancement. The new implementation of SCI increases the capacity, while at the same time improving the reliability and decreasing the detectability in comparison to the work described in 2006. The timing channel based approach SCII from 2006 is in this paper replaced by a completely new design based on the usage of WLAN Access Point addresses for synchronization and payload transmission. This new design now allows a comprehensive practical evaluation of the implementation and of the scheme, which was not possible with the original SCII. The test results for both enhanced approaches are summarised and compared in terms of detectability, capacity and reliability.

  5. An Integrated Approach Using Chaotic Map & Sample Value Difference Method for Electrocardiogram Steganography and OFDM Based Secured Patient Information Transmission.

    Science.gov (United States)

    Pandey, Anukul; Saini, Barjinder Singh; Singh, Butta; Sood, Neetu

    2017-10-18

    This paper presents a scheme for hiding a patient's confidential data in an electrocardiogram (ECG) signal and its subsequent wireless transmission. The patient's confidential data is embedded in the ECG (called stego-ECG) using a chaotic map and the sample value difference approach. The sample value difference approach effectively hides the patient's confidential data in ECG sample pairs at predefined locations. The chaotic map generates these predefined locations through the use of selective control parameters. Subsequently, the wireless transmission of the stego-ECG is analyzed using an Orthogonal Frequency Division Multiplexing (OFDM) system in a Rayleigh fading scenario for telemedicine applications. Evaluation of the proposed method on all 48 records of the MIT-BIH arrhythmia ECG database demonstrates that the embedding does not alter the diagnostic features of the cover ECG. The imperceptibility of the secret data in the stego-ECG is evident from the statistical and clinical performance measures. Statistical measures comprise Percentage Root-mean-square Difference (PRD), Peak Signal to Noise Ratio (PSNR), and Kullback-Leibler Divergence (KL-Div), etc., while clinical metrics include Wavelet Energy Based Diagnostic Distortion (WEDD) and Wavelet based Weighted PRD (WWPRD). Various channel Signal-to-Noise Ratio scenarios are simulated for wireless communication of the stego-ECG in the OFDM system. Over all 48 records of the MIT-BIH arrhythmia database, the proposed method resulted, on average, in PRD = 0.26, PSNR = 55.49, KL-Div = 3.34 × 10^-6, WEDD = 0.02, and WWPRD = 0.10 with a secret data size of 21 Kb. Further, a comparative analysis of the proposed method and recent existing works was also performed. The results clearly demonstrated the superiority of the proposed method.
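
    The abstract reports PRD and PSNR values but does not state the formulas. As a rough illustration only, the sketch below computes both under commonly used definitions, which are an assumption here: PRD as 100 times the root of the squared-error energy over the cover energy, and PSNR with the peak absolute cover sample as the peak value. The function names `prd` and `psnr` are hypothetical and are not taken from the paper.

```python
import numpy as np


def prd(cover, stego):
    """Percentage root-mean-square difference (assumed common definition)."""
    cover = np.asarray(cover, dtype=float)
    stego = np.asarray(stego, dtype=float)
    return 100.0 * np.sqrt(np.sum((cover - stego) ** 2) / np.sum(cover ** 2))


def psnr(cover, stego):
    """Peak signal-to-noise ratio in dB, using max |cover| as the peak (assumption)."""
    cover = np.asarray(cover, dtype=float)
    stego = np.asarray(stego, dtype=float)
    mse = np.mean((cover - stego) ** 2)
    if mse == 0:
        return float("inf")  # identical signals: no distortion
    peak = np.max(np.abs(cover))
    return 10.0 * np.log10(peak ** 2 / mse)
```

    A lower PRD and a higher PSNR both indicate that the stego signal stays closer to the cover signal; other PSNR conventions (for example a fixed peak tied to the sample bit depth) would give different absolute numbers.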

  6. Music preferences based on audio features, and its relation to personality

    OpenAIRE

    Dunn, Greg

    2009-01-01

    Recent studies have summarized reported music preferences by genre into four broadly defined categories, which relate to various personality characteristics. Other research has indicated that genre classification is ambiguous and inconsistent. This ambiguity suggests that research relating personality to music preferences based on genre could benefit from a more objective definition of music. This problem is addressed by investigating how music preferences linked to objective audio features r...

  7. The implementation of Project-Based Learning in courses Audio Video to Improve Employability Skills

    Science.gov (United States)

    Sulistiyo, Edy; Kustono, Djoko; Purnomo; Sutaji, Eddy

    2018-04-01

    This paper presents project-based learning (PjBL) in the Audio Video course of the Electro Engineering Study Programme at Universitas Negeri Surabaya. It consists of two parts: the design of an audio-video prototype, and project-based learning assessment activities tailored to 21st-century skills in the form of employability skills. The purpose of this learning innovation is to apply in lab work what is learned in the theory classes. PjBL aims to motivate students by centering teaching on problems that match the world of work. The learning steps are: determine the fundamental questions, design, develop a schedule, monitor the learners and their progress, test the results, evaluate the experience, assess the project, and assess the product. The results showed the following levels of mastery: ability to design tasks (78.6%), technical planning (39.3%), creativity (42.9%), innovation (46.4%), problem-solving skills (57.1%), communication skills (75%), oral expression (75%), searching for and understanding information (64.3%), collaborative work skills (71.4%), and classroom conduct (78.6%). In conclusion, instructors need to reflect and make improvements in the aspects with a mastery level below 60% when applying project-based learning in Audio Video courses.

  8. A new 4-D chaotic hyperjerk system, its synchronization, circuit design and applications in RNG, image encryption and chaos-based steganography

    Science.gov (United States)

    Vaidyanathan, S.; Akgul, A.; Kaçar, S.; Çavuşoğlu, U.

    2018-02-01

    Hyperjerk systems have received significant interest in the literature because of their simple structure and complex dynamical properties. This work presents a new chaotic hyperjerk system having two exponential nonlinearities. Dynamical properties of the chaotic hyperjerk system are investigated through equilibrium point analysis, bifurcation diagram, dissipativity and Lyapunov exponents. Moreover, an adaptive backstepping controller is designed for the synchronization of the chaotic hyperjerk system. Also, a real circuit of the chaotic hyperjerk system has been built to show the feasibility of the theoretical hyperjerk model. The chaotic hyperjerk system can also be useful in fields such as Random Number Generators (RNGs), data security, data hiding, etc. In this work, three applications of the chaotic hyperjerk system, viz. RNG, image encryption and sound steganography, have been implemented by using the complex dynamical characteristics of the system.

  9. Design of DPSS based fiber bragg gratings and their application in all-optical encryption, OCDMA, optical steganography, and orthogonal-division multiplexing.

    Science.gov (United States)

    Djordjevic, Ivan B; Saleh, Alaa H; Küppers, Franko

    2014-05-05

    The future information infrastructure will be affected by limited bandwidth of optical networks, high energy consumption, heterogeneity of network segments, and security issues. As a solution to all problems, we advocate the use of both electrical basis functions (orthogonal prolate spheroidal basis functions) and optical basis functions, implemented as FBGs with orthogonal impulse response in addition to spatial modes. We design the Bragg gratings with orthogonal impulse responses by means of discrete layer peeling algorithm. The target impulse responses belong to the class of discrete prolate spheroidal sequences, which are mutually orthogonal regardless of the sequence order, while occupying the fixed bandwidth. We then design the corresponding encoders and decoders suitable for all-optical encryption, optical CDMA, optical steganography, and orthogonal-division multiplexing (ODM). Finally, we propose the spectral multiplexing-ODM-spatial multiplexing scheme enabling beyond 10 Pb/s serial optical transport networks.

  10. Audio-based Age and Gender Identification to Enhance the Recommendation of TV Content

    DEFF Research Database (Denmark)

    Shepstone, Sven Ewan; Tan, Zheng-Hua; Jensen, Søren Holdt

    2013-01-01

    Recommending TV content to groups of viewers is best carried out when relevant information such as the demographics of the group is available. However, it can be difficult and time consuming to extract information for every user in the group. This paper shows how an audio analysis of the age...... and gender of a group of users watching the TV can be used for recommending a sequence of N short TV content items for the group. First, a state of the art audio-based classifier determines the age and gender of each user in an M-user group and creates a group profile. A genetic recommender algorithm...... of state-of-the-art age-and-gender detection systems, the proposed system has a significant ability to predict an item with a matching age and gender category. User studies were conducted where subjects were asked to rate a sequence of advertisements, where half of the advertisements were randomly selected...

  11. Digital holography-based steganography.

    Science.gov (United States)

    Hamam, Habib

    2010-12-15

    A steganographic method offering a high hiding capacity is presented in which the techniques of digital holography are used to distribute information from a small secret image across the larger pixel field of a cover image. An iterative algorithm is used to design a phase-only or complex hologram from a padded version of the secret image, quantizing this data according to the carrier data bits that are available within the intended cover image. By introducing the hologram data only into low-order bits of larger amplitude cover pixels, the change in the cover image remains imperceptible to the casual observer, with a peak signal-to-noise ratio of >40 dB.

  12. Balancing Audio

    DEFF Research Database (Denmark)

    Walther-Hansen, Mads

    2016-01-01

    This paper explores the concept of balance in music production and examines the role of conceptual metaphors in reasoning about audio editing. Balance may be the most central concept in record production, however, the way we cognitively understand and respond meaningfully to a mix requiring balance...... is not thoroughly understood. In this paper I treat balance as a metaphor that we use to reason about several different actions in music production, such as adjusting levels, editing the frequency spectrum or the spatiality of the recording. This study is based on an exploration of a linguistic corpus of sound...

  13. Recommending audio mixing workflows

    OpenAIRE

    Sauer, Christian; Roth-Berghofer, Thomas; Auricchio, Nino; Proctor, Sam

    2013-01-01

    This paper describes our work on Audio Advisor, a workflow recommender for audio mixing. We examine the process of eliciting, formalising and modelling the domain knowledge and expert’s experience. We also describe the effects and problems associated with the knowledge formalisation processes. We decided to employ structured case-based reasoning using myCBR 3 to capture the vagueness encountered in the audio domain. We detail how we used extensive similarity measure modelling to ...

  14. Genre-adaptive Semantic Computing and Audio-based Modelling for Music Mood Annotation

    DEFF Research Database (Denmark)

    Saari, Pasi; Fazekas, György; Eerola, Tuomas

    2016-01-01

    This study investigates whether taking genre into account is beneficial for automatic music mood annotation in terms of core affects valence, arousal, and tension, as well as several other mood scales. Novel techniques employing genre-adaptive semantic computing and audio-based modelling...... related to a set of 600 popular music tracks spanning multiple genres. The results show that ACTwg outperforms a semantic computing technique that does not exploit genre information, and ACTwg-SLPwg outperforms conventional techniques and other genre-adaptive alternatives. In particular, improvements......-based genre representation for genre-adaptive music mood analysis....

  15. Audio Twister

    DEFF Research Database (Denmark)

    Cermak, Daniel; Moreno Garcia, Rodrigo; Monastiridis, Stefanos

    2015-01-01

    Daniel Cermak-Sassenrath, Rodrigo Moreno Garcia, Stefanos Monastiridis. Audio Twister. Installation. P-Hack Copenhagen 2015, Copenhagen, DK, Apr 24, 2015.

  16. Kernel-Based Sensor Fusion With Application to Audio-Visual Voice Activity Detection

    Science.gov (United States)

    Dov, David; Talmon, Ronen; Cohen, Israel

    2016-12-01

    In this paper, we address the problem of multiple view data fusion in the presence of noise and interferences. Recent studies have approached this problem using kernel methods, by relying particularly on a product of kernels constructed separately for each view. From a graph theory point of view, we analyze this fusion approach in a discrete setting. More specifically, based on a statistical model for the connectivity between data points, we propose an algorithm for the selection of the kernel bandwidth, a parameter, which, as we show, has important implications on the robustness of this fusion approach to interferences. Then, we consider the fusion of audio-visual speech signals measured by a single microphone and by a video camera pointed to the face of the speaker. Specifically, we address the task of voice activity detection, i.e., the detection of speech and non-speech segments, in the presence of structured interferences such as keyboard taps and office noise. We propose an algorithm for voice activity detection based on the audio-visual signal. Simulation results show that the proposed algorithm outperforms competing fusion and voice activity detection approaches. In addition, we demonstrate that a proper selection of the kernel bandwidth indeed leads to improved performance.
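
    The product-of-kernels fusion mentioned in this abstract can be illustrated with a small sketch. It assumes Gaussian kernels of the form exp(-d^2 / bandwidth) built separately on each view and multiplied element-wise; the kernel form, the function names and the fixed bandwidths are assumptions for illustration, and the paper's bandwidth selection algorithm is not reproduced.

```python
import numpy as np


def gaussian_kernel(features, bandwidth):
    """Pairwise Gaussian affinities exp(-||xi - xj||^2 / bandwidth) for one view."""
    features = np.asarray(features, dtype=float)
    sq_norms = np.sum(features ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * features @ features.T
    sq_dists = np.maximum(sq_dists, 0.0)  # clamp tiny negatives from round-off
    return np.exp(-sq_dists / bandwidth)


def fused_kernel(audio_features, video_features, audio_bw, video_bw):
    """Element-wise product of the per-view kernels (rows are time frames)."""
    k_audio = gaussian_kernel(audio_features, audio_bw)
    k_video = gaussian_kernel(video_features, video_bw)
    return k_audio * k_video
```

    Under this construction two frames are strongly connected only if they are close in both views, so an interference that makes frames look alike in one view alone (for instance a keyboard tap in the audio) is down-weighted by the other view. A smaller bandwidth makes each view stricter about what counts as close.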

  17. Blind Audio Watermarking in Transform Domain Based on Singular Value Decomposition and Exponential-Log Operations

    Directory of Open Access Journals (Sweden)

    P. K. Dhar

    2017-06-01

    Digital watermarking has drawn extensive attention for copyright protection of multimedia data. This paper introduces a blind audio watermarking scheme in the discrete cosine transform (DCT) domain based on singular value decomposition (SVD), exponential operation (EO), and logarithm operation (LO). In our proposed scheme, the original audio is first segmented into non-overlapping frames and DCT is applied to each frame. Low-frequency DCT coefficients are divided into sub-bands and the power of each sub-band is calculated. EO is performed on the sub-band with the highest power of the DCT coefficients of each frame. SVD is applied to the exponential coefficients of each highest-power sub-band, represented in matrix form. Watermark information is embedded into the largest singular value by using a quantization function. Simulation results indicate that the proposed watermarking scheme is highly robust against different attacks. In addition, it has a high data payload and shows low error probability rates. Moreover, it provides good performance in terms of imperceptibility, robustness, and data payload compared with some recent state-of-the-art watermarking methods.
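
    The final embedding step, quantizing the largest singular value, can be sketched as follows. This is a simplified illustration and not the authors' scheme: it uses a plain odd/even quantization rule chosen here as an assumption, works on an arbitrary coefficient matrix, and omits the framing, DCT, sub-band selection and exponential/logarithm steps. The names `embed_bit`, `extract_bit` and `step` are hypothetical.

```python
import numpy as np


def embed_bit(block, bit, step):
    """Embed one bit by moving the largest singular value to the centre of
    a quantization cell whose index parity equals the bit."""
    u, s, vt = np.linalg.svd(np.asarray(block, dtype=float), full_matrices=False)
    q = int(np.floor(s[0] / step))
    if q % 2 != bit:
        q += 1  # move up to the next cell so the parity matches the bit
    s = s.copy()
    s[0] = (q + 0.5) * step  # cell centre leaves a margin of step / 2
    return u @ np.diag(s) @ vt


def extract_bit(block, step):
    """Blind extraction: only the quantization step is needed, not the original."""
    s = np.linalg.svd(np.asarray(block, dtype=float), compute_uv=False)
    return int(np.floor(s[0] / step)) % 2
```

    The sketch relies on the modified value remaining the largest singular value, which holds when the top two singular values differ by more than half a step. A larger `step` tolerates more distortion before the bit flips, at the cost of a larger change to the host signal.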

  18. Steganography and Cryptography Inspired Enhancement of Introductory Programming Courses

    Science.gov (United States)

    Kortsarts, Yana; Kempner, Yulia

    2015-01-01

    Steganography is the art and science of concealing communication. The goal of steganography is to hide the very existence of information exchange by embedding messages into unsuspicious digital media covers. Cryptography, or secret writing, is the study of the methods of encryption, decryption and their use in communications protocols.…

  19. Audio Fingerprint for Audio File Identification (Audio Fingerprint Untuk Identifikasi File Audio)

    OpenAIRE

    Yuanto, Stefanus Irwan; Tampubolon, Junius Karel; Restyandito, Restyandito

    2007-01-01

    Identifying audio files by binary comparison is not very effective, because audio files are stored in different formats and in different ways. By applying the audio fingerprint concept, an audio signal is identified by comparing a small unique code that represents the signal, so that differences in format and storage method have little effect on the audio identification process.

  20. Survey of the Use of Steganography over the Internet

    Directory of Open Access Journals (Sweden)

    Lavinia Mihaela DINCA

    2011-01-01

    This paper addresses the use of Steganography over the Internet by terrorists. There were rumors in the newspapers that Steganography is being used for covert communication between terrorists, without presenting any scientific proof. Niels Provos and Peter Honeyman conducted an extensive Internet search in which they analyzed over 2 million images and didn’t find a single hidden image. After this study the scientific community was divided: some believed that the study by Niels Provos and Peter Honeyman was conclusive enough, others did not. This paper describes what Steganography is and what it can be used for, presents various Steganography techniques, and also presents the studies made regarding the use of Steganography on the Internet.

  1. DOA and Pitch Estimation of Audio Sources using IAA-based Filtering

    DEFF Research Database (Denmark)

    Jensen, Jesper Rindom; Christensen, Mads Græsbøll

    2014-01-01

    joint DOA and pitch filtering-based estimator can be combined with the iterative adaptive approach to circumvent this limitation in joint DOA and pitch estimation of audio sources. Simulations show a clear improvement compared to when using the sample covariance matrix and the considered approach also...... on knowledge of the inverse sample covariance matrix. Typically, this covariance is estimated using the sample covariance matrix, but for this estimate to be full rank, many temporal samples are needed. In cases with non-stationary signals, this is a serious limitation. We therefore investigate how a recent...... outperforms other state-of-the-art methods. Finally, the applicability of the considered approach is verified on real speech....

  2. A Perceptual Model for Sinusoidal Audio Coding Based on Spectral Integration

    NARCIS (Netherlands)

    Van de Par, S.; Kohlrausch, A.; Heusdens, R.; Jensen, J.; Holdt Jensen, S.

    2005-01-01

    Psychoacoustical models have been used extensively within audio coding applications over the past decades. Recently, parametric coding techniques have been applied to general audio and this has created the need for a psychoacoustical model that is specifically suited for sinusoidal modelling of

  3. A perceptual model for sinusoidal audio coding based on spectral integration

    NARCIS (Netherlands)

    Van de Par, S.; Kohlrausch, A.; Heusdens, R.; Jensen, J.; Jensen, S.H.

    2005-01-01

    Psychoacoustical models have been used extensively within audio coding applications over the past decades. Recently, parametric coding techniques have been applied to general audio and this has created the need for a psychoacoustical model that is specifically suited for sinusoidal modelling of

  4. Monaural separation of dependent audio sources based on a generalized Wiener filter

    DEFF Research Database (Denmark)

    Ma, Guilin; Agerkvist, Finn T.; Luther, J.B.

    2007-01-01

    This paper presents a two-stage approach for single-channel separation of dependent audio sources. The proposed algorithm is developed in the Bayesian framework and designed for general audio signals. In the first stage of the algorithm, the joint distribution of discrete Fourier transform (DFT)...

  5. Audio Bigrams as a Unifying Model of Pitch-based Song Description

    NARCIS (Netherlands)

    Van Balen, Jan; Wiering, Frans; Veltkamp, Remco

    2015-01-01

    In this paper we provide a novel perspective on a family of music description algorithms that perform what could be referred to as 'soft' audio fingerprinting. These algorithms convert fragments of musical audio to one or more fixed-size vectors that can be used in distance computation and indexing,

  6. An Interactive Concert Program Based on Infrared Watermark and Audio Synthesis

    Science.gov (United States)

    Wang, Hsi-Chun; Lee, Wen-Pin Hope; Liang, Feng-Ju

    The objective of this research is to propose a video/audio system which allows the user to listen to the typical music notes in the concert program under infrared detection. The system synthesizes audio with different pitches and tempi in accordance with the encoded data in a 2-D barcode embedded in the infrared watermark. The digital halftoning technique has been used to fabricate the infrared watermark composed of halftone dots by both amplitude modulation (AM) and frequency modulation (FM). The results show that this interactive system successfully recognizes the barcode and synthesizes audio under infrared detection of a concert program whose contents also remain readable to human observers. This interactive video/audio system greatly expands the capability of printed paper to include audio display and has many potential value-added applications.

  7. A secure steganography for privacy protection in healthcare system.

    Science.gov (United States)

    Liu, Jing; Tang, Guangming; Sun, Yifeng

    2013-04-01

    Private data in healthcare systems require confidentiality protection during transmission. Steganography is the art of concealing data in a cover medium in order to convey messages confidentially. In this paper, we propose a steganographic method which can provide private data in medical systems with very secure protection. In our method, a cover image is first mapped into a 1D pixel sequence by a Hilbert filling curve and then divided into non-overlapping embedding units of three consecutive pixels. We use the adaptive pixel pair match (APPM) method to embed digits in the pixel value differences (PVD) of the three pixels, and the base of the embedded digits depends on the differences among the three pixels. By solving an optimization problem, minimal distortion of the pixel ternaries caused by data embedding can be obtained. The experimental results show our method is more suitable for privacy protection in healthcare systems than prior steganographic works.
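
    The first two steps described here, scanning the cover image along a Hilbert curve and cutting the sequence into units of three consecutive pixels, can be sketched as below. The sketch assumes a square image whose side is a power of two and uses the standard iterative index-to-coordinate conversion for the Hilbert curve; the function names are hypothetical, and the APPM/PVD embedding itself is not shown.

```python
def hilbert_d2xy(side, d):
    """Convert distance d along the Hilbert curve to (x, y) on a side x side grid."""
    x = y = 0
    t = d
    s = 1
    while s < side:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:  # rotate the quadrant so sub-curves join end to end
            if rx == 1:
                x, y = s - 1 - x, s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def hilbert_scan(image):
    """Flatten a square image (list of rows) into a 1D sequence in Hilbert order."""
    side = len(image)
    coords = (hilbert_d2xy(side, d) for d in range(side * side))
    return [image[y][x] for x, y in coords]


def embedding_units(sequence):
    """Split the sequence into non-overlapping triples; leftover pixels are dropped."""
    return [tuple(sequence[i:i + 3]) for i in range(0, len(sequence) - 2, 3)]
```

    Because consecutive points on a Hilbert curve are always neighbouring pixels, each triple produced this way covers a small local patch of the image, which is what makes the pixel value differences within a unit meaningful.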

  8. Use of audio-enhanced personal digital assistants for school-based data collection.

    Science.gov (United States)

    Trapl, Erika S; Borawski, Elaine A; Stork, Paul P; Lovegreen, Loren D; Colabianchi, Natalie; Cole, Maurice L; Charvat, Jacqueline M

    2005-10-01

    To review the different data collection options available to school-based researchers and to present the preliminary findings on the use of audio-enhanced personal digital assistants (APDA) for use in school-based data collection. A newly developed APDA system was used to collect baseline data from a sample of 645 seventh grade students enrolled in a school-based intervention study. Evaluative measures included student response, time to completion, and data quality (e.g., missingness, internal consistency of responses). Differences in data administration and data quality were examined among three groups of students: students newer to the United States speaking English as a second language; special education students; and students not newer to the United States receiving regular education. The APDA system was well received by students and was shown to offer improvements in data administration (increased portability, time to completion) and reduced missing data. Although time to completion and proportion of missing data were similar across the three groups of students, psychometric properties of the data varied considerably. The APDA system offers a promising new method for collecting data in the middle school environment. Students with cognitive deficits and language barriers were able to complete the survey in a similar amount of time without additional help; however, differences in data quality suggest that limitations in comprehension of the questions remained even though the questions were read to the respondents. More research on the use of APDA is necessary to fully understand the effect of data collection mode with special populations.

  9. Achievement report for fiscal 1998 on area consortium research and development business. Area consortium for venture business development by building base for small business (abuse double protected next generation card system based on steganography); 1998 nendo venture kigyo ikuseigata chiiki consortium kenkyu kaihatsu (chusho kigyo sozo kibangata). Steganography gijutsu wo riyoshita jisedaigata fusei shiyo taju boshi guard system no kenkyu kaihatsu

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    1999-03-01

    A card system is developed using BPCS (Bit-Plane Complexity Segmentation) steganography, with electronic data embedded in the card. Under the system, the visual recognition of the user and the mechanical verification of the card are carried out simultaneously, with the card rejecting any abuse. In fiscal 1998, a trial system was built, consisting of technologies for encoding, decoding, and packaging data into an IC (integrated circuit) card. A photograph of the user's face is attached to the card, and the card carries an 8KB IC memory device that stores data such as the photograph of the user's face. A password has to be entered before any data can be read out. A customized key is required to display the embedded personal data and, to restore the key data, the personal key known only to the owner and the company key kept by the card managing company need to be collated with each other. Multiple checks are available for the prevention of abuse, including the collation of face photographs, collation with the display obtained by entering the password, and a request for the customized key to confirm that the holder is authorized to read the embedded personal data. (NEDO)

  10. Audio-visual imposture

    Science.gov (United States)

    Karam, Walid; Mokbel, Chafic; Greige, Hanna; Chollet, Gerard

    2006-05-01

    A GMM based audio visual speaker verification system is described and an Active Appearance Model with a linear speaker transformation system is used to evaluate the robustness of the verification. An Active Appearance Model (AAM) is used to automatically locate and track a speaker's face in a video recording. A Gaussian Mixture Model (GMM) based classifier (BECARS) is used for face verification. GMM training and testing is accomplished on DCT based extracted features of the detected faces. On the audio side, speech features are extracted and used for speaker verification with the GMM based classifier. Fusion of both audio and video modalities for audio visual speaker verification is compared with face verification and speaker verification systems. To improve the robustness of the multimodal biometric identity verification system, an audio visual imposture system is envisioned. It consists of an automatic voice transformation technique that an impostor may use to assume the identity of an authorized client. Features of the transformed voice are then combined with the corresponding appearance features and fed into the GMM based system BECARS for training. An attempt is made to increase the acceptance rate of the impostor and to analyze the robustness of the verification system. Experiments are being conducted on the BANCA database, with a prospect of experimenting on the new PDAtabase developed within the scope of the SecurePhone project.

  11. The Effects of Musical Experience and Hearing Loss on Solving an Audio-Based Gaming Task

    Directory of Open Access Journals (Sweden)

    Kjetil Falkenberg Hansen

    2017-12-01

    Full Text Available We conducted an experiment using a purposefully designed audio-based game called the Music Puzzle with Japanese university students with different levels of hearing acuity and experience with music in order to determine the effects of these factors on solving such games. A group of hearing-impaired students (n = 12) was compared with two hearing control groups with the additional characteristic of having high (n = 12) or low (n = 12) engagement in musical activities. The game was played with three sound sets or modes: speech, music, and a mix of the two. The results showed that people with hearing loss had longer processing times for sounds when playing the game. Solving the game task in the speech mode was found to be particularly difficult for the group with hearing loss, and while they found the game difficult in general, they expressed a fondness for the game and a preference for music. Participants with less musical experience showed difficulties in playing the game with musical material. We were able to explain the impacts of hearing acuity and musical experience; furthermore, we can promote this kind of tool as a viable way to train hearing by focused listening to sound, particularly with music.

  12. Steganography and Prospects of Its Application in Protection of Printing Documents

    Directory of Open Access Journals (Sweden)

    M. O. Zhmakin

    2010-09-01

    Full Text Available The article explains the principle of steganography. Three modern approaches to concealing information are presented. The application of classical steganography to printed matter is described.

  13. Perceptual Audio Hashing Functions

    Directory of Open Access Journals (Sweden)

    Emin Anarım

    2005-07-01

    Full Text Available Perceptual hash functions provide a tool for fast and reliable identification of content. We present new audio hash functions based on summarization of the time-frequency spectral characteristics of an audio document. The proposed hash functions are based on the periodicity series of the fundamental frequency and on singular-value description of the cepstral frequencies. They are found, on one hand, to perform very satisfactorily in identification and verification tests, and on the other hand, to be very resilient to a large variety of attacks. Moreover, we address the issue of security of hashes and propose a keying technique, and thereby a key-dependent hash function.

  14. Distortion Estimation in Compressed Music Using Only Audio Fingerprints

    NARCIS (Netherlands)

    Doets, P.J.O.; Lagendijk, R.L.

    2008-01-01

    An audio fingerprint is a compact yet very robust representation of the perceptually relevant parts of an audio signal. It can be used for content-based audio identification, even when the audio is severely distorted. Audio compression changes the fingerprint slightly. We show that these small

  15. Teacher’s Voice on Metacognitive Strategy Based Instruction Using Audio Visual Aids for Listening

    Directory of Open Access Journals (Sweden)

    Salasiah Salasiah

    2018-02-01

    Full Text Available The paper explores the teacher’s voice on the application of a metacognitive strategy with audio-visual aids to improve listening comprehension. The metacognitive strategy model applied in the study was inspired by the Vandergrift and Tafaghodtari (2010) instructional model; its procedure was modified and applied with audio-visual aids for improving listening comprehension. The study was set at SMA Negeri 2 Parepare, South Sulawesi Province, Indonesia. The population of the research was the teacher of English in the tenth grade at SMAN 2. The sample was taken using a random sampling technique. The data were collected through in-depth interviews during the research, recorded, and analyzed using qualitative analysis. This study explored the teacher’s positive and negative responses toward the modified model of metacognitive strategy with audio-visual aids applied in the listening class. The results showed that this strategy helped the teacher considerably in teaching listening comprehension, as the procedure has systematic steps toward students’ listening comprehension. It also made listening easier to teach by making use of audio-visual aids such as videos taken from YouTube.

  16. Detection and characterization of lightning-based sources using continuous wavelet transform: application to audio-magnetotellurics

    Science.gov (United States)

    Larnier, H.; Sailhac, P.; Chambodut, A.

    2018-01-01

    Atmospheric electromagnetic waves created by global lightning activity contain information about electrical processes of the inner and the outer Earth. Large signal-to-noise ratio events are particularly interesting because they convey information about electromagnetic properties along their path. We introduce a new methodology to automatically detect and characterize lightning-based waves using a time-frequency decomposition obtained through the application of continuous wavelet transform. We focus specifically on three types of sources, namely, atmospherics, slow tails and whistlers, that cover the frequency range 10 Hz to 10 kHz. Each wave has distinguishable characteristics in the time-frequency domain due to source shape and dispersion processes. Our methodology allows automatic detection of each type of event in the time-frequency decomposition thanks to their specific signature. Horizontal polarization attributes are also recovered in the time-frequency domain. This procedure is first applied to synthetic extremely low frequency time-series with different signal-to-noise ratios to test for robustness. We then apply it to real data: three stations of audio-magnetotelluric data acquired in Guadeloupe, French overseas territories. Most of the analysed atmospherics and slow tails display linear polarization, whereas the analysed whistlers are elliptically polarized. The diversity of lightning activity is finally analysed in an audio-magnetotelluric data processing framework, as used in subsurface prospecting, through estimation of the impedance response functions. We show that audio-magnetotelluric processing results depend mainly on the frequency content of electromagnetic waves observed in processed time-series, with an emphasis on the difference between morning and afternoon acquisition. Our new methodology based on the time-frequency signature of lightning-induced electromagnetic waves allows automatic detection and characterization of events in audio

  17. Audio asymmetric watermarking technique

    OpenAIRE

    Furon, Teddy; Moreau, Nicolas; Duhamel, Pierre

    2000-01-01

    This paper presents the application of the promising public key watermarking method to the audio domain. Its detection process does not need the original content nor the secret key used in the embedding process. It is the translation, in the watermarking domain, of a public key pair cryptosystem [1]. We start to build the detector with some basic assumptions. This leads to a hypothesis test based on probability likelihood. But real audio signals do not satisfy the assumption of a Gaussia...

  18. Semantic Audio Track Mixer

    OpenAIRE

    Uhle, C.; Herre, J.; Ridderbusch, F.; Popp, H.

    2011-01-01

    An audio mixer for mixing a plurality of audio tracks to a mixture signal comprises a semantic command interpreter (30; 35) for receiving a semantic mixing command and for deriving a plurality of mixing parameters for the plurality of audio tracks from the semantic mixing command; an audio track processor (70; 75) for processing the plurality of audio tracks in accordance with the plurality of mixing parameters; and an audio track combiner (76) for combining the plurality of audio tracks proc...

  19. Multimodal Authentication - Biometric Password And Steganography

    Directory of Open Access Journals (Sweden)

    Alvin Prasad

    2017-06-01

    Full Text Available Security is a major concern for everyone, be it individuals or organizations. As information systems become more distributed, securing them becomes more difficult as well. Researchers and developers build new applications to counter security issues, but as soon as an application is released, new attacks are devised to bypass it. Kerberos is an authentication protocol that helps to verify and validate a user to a server. As it is a widely used protocol, minimizing or preventing password attacks is important. In this research we have analyzed the Kerberos protocol and suggested some ideas that can be considered when updating Kerberos to prevent password attacks. In the proposed solution we suggest using a password together with a biometric technique when registering on the network to enjoy the services, and a combination of cryptography and steganography when communicating back to the user.

  20. Hybrid distortion function for JPEG steganography

    Science.gov (United States)

    Wang, Zichi; Zhang, Xinpeng; Yin, Zhaoxia

    2016-09-01

    A hybrid distortion function for JPEG steganography exploiting block fluctuation and quantization steps is proposed. To resist multidomain steganalysis, both the spatial domain and the discrete cosine transformation (DCT) domain are involved in the proposed distortion function. In the spatial domain, a distortion value is allotted to each 8×8 block according to block fluctuation. In the DCT domain, quantization steps are employed to allot distortion values to the DCT coefficients in a block. The two elements, block distortion and quantization steps, are combined to measure the embedding risk. By employing syndrome trellis coding to embed secret data, the embedding changes are constrained to complex regions, where modifications are hard to detect. When compared to current state-of-the-art steganographic methods for JPEG images, the proposed method presents fewer detectable artifacts.

  1. Cross-modal integration of polyphonic characters in Chinese audio-visual sentences: a MVPA study based on functional connectivity.

    Science.gov (United States)

    Zhang, Zhengyi; Zhang, Gaoyan; Zhang, Yuanyuan; Liu, Hong; Xu, Junhai; Liu, Baolin

    2017-12-01

    This study aimed to investigate the functional connectivity in the brain during the cross-modal integration of polyphonic characters in Chinese audio-visual sentences. The visual sentences were all semantically reasonable and the audible pronunciations of the polyphonic characters in the corresponding sentence contexts varied in four conditions. To measure the functional connectivity, correlation, coherence and phase synchronization index (PSI) were used, and then multivariate pattern analysis was performed to detect the consensus functional connectivity patterns. These analyses were confined to the time windows of three event-related potential components, P200, N400 and late positive shift (LPS), to investigate the dynamic changes of the connectivity patterns at different cognitive stages. We found that when differentiating the polyphonic characters with abnormal pronunciations from those with the appropriate ones in audio-visual sentences, significant classification results were obtained based on the coherence in the time window of the P200 component, the correlation in the time window of the N400 component and the coherence and PSI in the time window of the LPS component. Moreover, the spatial distributions in these time windows were also different, with the recruitment of frontal sites in the time window of the P200 component, the frontal-central-parietal regions in the time window of the N400 component and the central-parietal sites in the time window of the LPS component. These findings demonstrate that the functional interaction mechanisms are different at different stages of audio-visual integration of polyphonic characters.

  2. A Perceptual Model for Sinusoidal Audio Coding Based on Spectral Integration

    Directory of Open Access Journals (Sweden)

    Jensen Søren Holdt

    2005-01-01

    Full Text Available Psychoacoustical models have been used extensively within audio coding applications over the past decades. Recently, parametric coding techniques have been applied to general audio and this has created the need for a psychoacoustical model that is specifically suited for sinusoidal modelling of audio signals. In this paper, we present a new perceptual model that predicts masked thresholds for sinusoidal distortions. The model relies on signal detection theory and incorporates more recent insights about spectral and temporal integration in auditory masking. As a consequence, the model is able to predict the distortion detectability. In fact, the distortion detectability defines a (perceptually relevant) norm on the underlying signal space which is beneficial for optimisation algorithms such as rate-distortion optimisation or linear predictive coding. We evaluate the merits of the model by combining it with a sinusoidal extraction method and compare the results with those obtained with the ISO MPEG-1 Layer I-II recommended model. Listening tests show a clear preference for the new model. More specifically, the model presented here leads to a reduction of more than 20% in terms of number of sinusoids needed to represent signals at a given quality level.

  3. Audio Restoration

    Science.gov (United States)

    Esquef, Paulo A. A.

    The first reproducible recording of human voice was made in 1877 on a tinfoil cylinder phonograph devised by Thomas A. Edison. Since then, much effort has been expended to find better ways to record and reproduce sounds. By the mid-1920s, the first electrical recordings appeared and gradually took over purely acoustic recordings. The development of electronic computers, in conjunction with the ability to record data onto magnetic or optical media, culminated in the standardization of compact disc format in 1980. Nowadays, digital technology is applied to several audio applications, not only to improve the quality of modern and old recording/reproduction techniques, but also to trade off sound quality for less storage space and less taxing transmission capacity requirements.

  4. Blind steganalysis method for JPEG steganography combined with the semisupervised learning and soft margin support vector machine

    Science.gov (United States)

    Dong, Yu; Zhang, Tao; Xi, Ling

    2015-01-01

    Stego images embedded by unknown steganographic algorithms currently may not be detected by using steganalysis detectors based on binary classifier. However, it is difficult to obtain high detection accuracy by using universal steganalysis based on one-class classifier. For solving this problem, a blind detection method for JPEG steganography was proposed from the perspective of information theory. The proposed method combined the semisupervised learning and soft margin support vector machine with steganalysis detector based on one-class classifier to utilize the information in test data for improving detection performance. Reliable blind detection for JPEG steganography was realized only using cover images for training. The experimental results show that the proposed method can contribute to improving the detection accuracy of steganalysis detector based on one-class classifier and has good robustness under different source mismatch conditions.

  5. Audio-Visual Technician | IDRC - International Development ...

    International Development Research Centre (IDRC) Digital Library (Canada)

    Occasionally records on audio and/or video media, conferences, seminars, lectures and other events. Edits and duplicates audio and video tapes ... Participates in the planning and design of new or updated audio-visual systems by providing technical input on system needs. Based on current and emerging requirements as ...

  6. Evaluating Visual Information Provided by Audio Description.

    Science.gov (United States)

    Peli, E.; And Others

    1996-01-01

    The video and standard audio portions of 2 television programs were presented to 25 adults with low vision and 24 adults with normal vision; 29 additional subjects only heard the standard audio portions. Subjects then answered questions based on audio descriptions (AD) provided by Descriptive Video Service. Results indicated that some AD…

  7. An interactive audio-visual installation using ubiquitous hardware and web-based software deployment

    Directory of Open Access Journals (Sweden)

    Tiago Fernandes Tavares

    2015-05-01

    Full Text Available This paper describes an interactive audio-visual musical installation, namely MOTUS, that aims at being deployed using low-cost hardware and software. This was achieved by writing the software as a web application and using only hardware that is built into most modern personal computers. This scenario implies specific technical restrictions, which lead to solutions combining both technical and artistic aspects of the installation. The resulting system is versatile and can be freely used from any computer with Internet access. Spontaneous feedback from the audience has shown that the provided experience is interesting and engaging, despite the use of minimal hardware.

  8. Steganography forensics method for detecting least significant bit replacement attack

    Science.gov (United States)

    Wang, Xiaofeng; Wei, Chengcheng; Han, Xiao

    2015-01-01

    We present an image forensics method to detect least significant bit replacement steganography attacks. The proposed method provides fine-grained forensics features by using a hierarchical structure that combines pixel correlation and bit-plane correlation. This is achieved via bit-plane decomposition and difference matrices between the least significant bit-plane and each one of the others. The generated forensics features capture a susceptibility (changeability) that is drastically altered when the cover image is embedded with data to form a stego image. We developed a statistical model based on the forensics features and used a least squares support vector machine as a classifier to distinguish stego images from cover images. Experimental results show that the proposed method provides the following advantages. (1) The detection rate is noticeably higher than that of some existing methods. (2) It has the expected stability. (3) It is robust to content-preserving manipulations, such as JPEG compression, adding noise, filtering, etc. (4) The proposed method provides satisfactory generalization capability.
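
    The bit-plane decomposition and difference matrices mentioned in this abstract can be illustrated with a minimal Python sketch. It assumes an 8-bit grayscale image given as a list of rows, and it takes the "difference" between two binary planes to be the element-wise absolute difference; the abstract does not define the matrices, so that choice and the function names `bit_planes` and `difference_matrices` are hypothetical, not the authors' implementation.

```python
def bit_planes(image):
    """Split an 8-bit grayscale image (list of rows) into 8 binary planes.

    planes[0] is the least significant bit-plane, planes[7] the most significant.
    """
    return [[[(px >> b) & 1 for px in row] for row in image] for b in range(8)]


def difference_matrices(planes):
    """Element-wise absolute difference between the LSB plane and every other plane.

    Assumption: the abstract does not say how the difference is computed;
    for binary planes this absolute difference is the same as XOR.
    """
    lsb = planes[0]
    return [
        [[abs(a - c) for a, c in zip(lsb_row, row)] for lsb_row, row in zip(lsb, plane)]
        for plane in planes[1:]
    ]


image = [[0b00000011, 0b00000001],
         [0b10000000, 0b11111111]]
planes = bit_planes(image)
diffs = difference_matrices(planes)  # 7 matrices: LSB vs. planes 1..7
```

    Features for a classifier would then be statistics computed on these seven matrices; which statistics the paper uses is not stated in the abstract.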

  9. A Secure and Robust Compressed Domain Video Steganography for Intra- and Inter-Frames Using Embedding-Based Byte Differencing (EBBD) Scheme.

    Directory of Open Access Journals (Sweden)

    Tarik Idbeaa

    Full Text Available This paper presents a novel secure and robust steganographic technique in the compressed video domain, namely embedding-based byte differencing (EBBD). Unlike most of the current video steganographic techniques, which take into account only the intra frames for data embedding, the proposed EBBD technique aims to hide information in both intra and inter frames. The information is embedded into a compressed video by simultaneously manipulating the quantized AC coefficients (AC-QTCs) of luminance components of the frames during the MPEG-2 encoding process. Later, during the decoding process, the embedded information can be detected and extracted completely. Furthermore, the EBBD basically deals with two security concepts: data encryption and data concealing. Hence, during the embedding process, secret data is encrypted using the simplified data encryption standard (S-DES) algorithm to provide better security to the implemented system. The security of the method lies in selecting candidate AC-QTCs within each non-overlapping 8 × 8 sub-block using a pseudo random key. The basic performance of this steganographic technique was verified through experiments on various existing MPEG-2 encoded videos over a wide range of embedded payload rates. Overall, the experimental results verify the excellent performance of the proposed EBBD with a better trade-off in terms of imperceptibility and payload, as compared with previous techniques, while at the same time ensuring minimal bitrate increase and negligible degradation of PSNR values.

  10. A Secure and Robust Compressed Domain Video Steganography for Intra- and Inter-Frames Using Embedding-Based Byte Differencing (EBBD) Scheme.

    Science.gov (United States)

    Idbeaa, Tarik; Abdul Samad, Salina; Husain, Hafizah

    2016-01-01

    This paper presents a novel secure and robust steganographic technique in the compressed video domain, namely embedding-based byte differencing (EBBD). Unlike most of the current video steganographic techniques, which take into account only the intra frames for data embedding, the proposed EBBD technique aims to hide information in both intra and inter frames. The information is embedded into a compressed video by simultaneously manipulating the quantized AC coefficients (AC-QTCs) of luminance components of the frames during the MPEG-2 encoding process. Later, during the decoding process, the embedded information can be detected and extracted completely. Furthermore, the EBBD basically deals with two security concepts: data encryption and data concealing. Hence, during the embedding process, secret data is encrypted using the simplified data encryption standard (S-DES) algorithm to provide better security to the implemented system. The security of the method lies in selecting candidate AC-QTCs within each non-overlapping 8 × 8 sub-block using a pseudo random key. The basic performance of this steganographic technique was verified through experiments on various existing MPEG-2 encoded videos over a wide range of embedded payload rates. Overall, the experimental results verify the excellent performance of the proposed EBBD with a better trade-off in terms of imperceptibility and payload, as compared with previous techniques, while at the same time ensuring minimal bitrate increase and negligible degradation of PSNR values.

  11. WLAN Technologies for Audio Delivery

    Directory of Open Access Journals (Sweden)

    Nicolas-Alexander Tatlas

    2007-01-01

    Full Text Available Audio delivery and reproduction for home or professional applications may greatly benefit from the adoption of digital wireless local area network (WLAN) technologies. The most challenging aspect of such integration relates to the synchronized and robust real-time streaming of multiple audio channels to multipoint receivers, for example, wireless active speakers. Here, it is shown that current WLAN solutions are susceptible to transmission errors. A detailed study of the IEEE802.11e protocol (currently under ratification) is also presented and all relevant distortions are assessed via an analytical and experimental methodology. A novel synchronization scheme is also introduced, allowing optimized playback for multiple receivers. The perceptual audio performance is assessed for both stereo and 5-channel applications based on either PCM or compressed audio signals.

  12. Audio-visual interactions in environment assessment.

    Science.gov (United States)

    Preis, Anna; Kociński, Jędrzej; Hafke-Dys, Honorata; Wrzosek, Małgorzata

    2015-08-01

    The aim of the study was to examine how visual and audio information influences audio-visual environment assessment. Original audio-visual recordings were made at seven different places in the city of Poznań. Participants of the psychophysical experiments were asked to rate, on a numerical standardized scale, the degree of comfort they would feel if they were in such an environment. The assessments of audio-visual comfort were carried out in a laboratory in four different conditions: (a) audio samples only, (b) original audio-visual samples, (c) video samples only, and (d) mixed audio-visual samples. The general results of this experiment showed a significant difference between the investigated conditions, but not for all the investigated samples. When conditions (a) and (b) were compared, there was a significant improvement in comfort assessment when visual information was added, but in only three out of seven cases. On the other hand, the results show that the comfort assessment of audio-visual samples could be changed by manipulating the audio rather than the video part of the audio-visual sample. Finally, it seems that people could differentiate audio-visual representations of a given place in the environment based on the composition of the sound sources rather than on the sound level. Object identification is responsible for both landscape and soundscape grouping. Copyright © 2015. Published by Elsevier B.V.

  13. Intelligent audio analysis

    CERN Document Server

    Schuller, Björn W

    2013-01-01

    This book provides the reader with the knowledge necessary for comprehension of the field of Intelligent Audio Analysis. It first introduces standard methods and discusses the typical Intelligent Audio Analysis chain going from audio data to audio features to audio recognition. Further, an introduction to audio source separation, enhancement and robustness is given. After the introductory parts, the book shows several applications for the three types of audio: speech, music, and general sound. Each task is briefly introduced, followed by a description of the specific data and methods applied, experiments and results, and a conclusion for this specific task. The book provides benchmark results and standardized test-beds for a broader range of audio analysis tasks. The main focus thereby lies on the parallel advancement of realism in audio analysis, as too often today’s results are overly optimistic owing to idealized testing conditions, and it serves to stimulate synergies arising from transfer of ...

  14. Embedded FPGA Design for Optimal Pixel Adjustment Process of Image Steganography

    Directory of Open Access Journals (Sweden)

    Chiung-Wei Huang

    2018-01-01

    Full Text Available We propose a prototype field programmable gate array (FPGA) implementation of the optimal pixel adjustment process (OPAP) algorithm for image steganography. In the proposed scheme, the cover image and the secret message are transmitted from a personal computer (PC) to an FPGA board over an RS232 interface for hardware processing. We first embed a k-bit secret message into each pixel of the cover image by the least-significant-bit (LSB) substitution method, and then execute the associated OPAP calculations to construct a stego pixel. After all pixels of the cover image have been embedded, a stego image is created, transmitted from the FPGA back to the PC, and stored on the PC. Moreover, we have extended the basic pixel-wise structure to a parallel structure that can fully use the hardware devices to speed up the embedding process and embed several bits of the secret message at the same time. Through the parallel mechanism of the hardware-based design, the data hiding process can be completed in a few clock cycles to produce the steganography outcome. Experimental results show the effectiveness and correctness of the proposed scheme.
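
    The two steps this abstract names, k-bit LSB substitution followed by the optimal pixel adjustment process, can be sketched in software as follows. This is a minimal illustration of the commonly described OPAP rule for 8-bit pixels, not the FPGA design from the paper; the function names are hypothetical, and the adjustment thresholds are an assumption based on the usual formulation (shift the stego pixel by 2^k when that reduces the embedding error and keeps the value in range).

```python
def embed_lsb(pixel, bits, k):
    """Replace the k least significant bits of an 8-bit pixel with `bits`."""
    mask = (1 << k) - 1
    return (pixel & ~mask & 0xFF) | (bits & mask)


def opap_adjust(original, stego, k):
    """Shift the stego pixel by +/- 2**k when that moves it closer to the original.

    Adding or subtracting 2**k leaves the k embedded bits untouched,
    so extraction is unaffected.
    """
    step = 1 << k
    delta = stego - original
    if delta > step // 2 and stego >= step:
        return stego - step
    if delta < -(step // 2) and stego < 256 - step:
        return stego + step
    return stego


def embed_pixel(pixel, bits, k):
    """LSB substitution followed by OPAP for a single pixel."""
    return opap_adjust(pixel, embed_lsb(pixel, bits, k), k)


def extract_bits(pixel, k):
    """Read back the k embedded bits."""
    return pixel & ((1 << k) - 1)


# Example: embedding 0b111 into pixel 176 with k = 3.
plain = embed_lsb(176, 0b111, 3)       # 183, error of 7
adjusted = embed_pixel(176, 0b111, 3)  # 175, error of 1, same low 3 bits
```

    Because each pixel is processed independently, the per-pixel function maps naturally onto the parallel structure the abstract describes.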

  15. CLSM: COUPLE LAYERED SECURITY MODEL A HIGH-CAPACITY DATA HIDING SCHEME USING WITH STEGANOGRAPHY

    Directory of Open Access Journals (Sweden)

    Cemal Kocak

    2017-03-01

    Full Text Available Cryptography and steganography are the two significant techniques used for secrecy of communications and safe message transfer. In this study, CLSM – Couple Layered Security Model is suggested, which has a hybrid structure enhancing information security using features of cryptography and steganography. In the CLSM system, the information is first cryptographically encrypted and then steganographically embedded in an image. In the cryptography step, the information is encrypted by means of a Text Keyword of at most 16 digits chosen by the user. Similarly, during the embedding stage, the encrypted information is processed using a 16-digit pin (I-PIN), which is again chosen by the user. The carrier images used in the study are 24 bit/pixel colour images. Images in .jpeg, .tiff, .pnp format are also supported. The performance of the CLSM method has been evaluated according to the objective quality measurement criteria of PSNR-dB (Peak Signal-to-Noise Ratio) and SSIM (Structural Similarity Index). In the study, 12 payloads of different sizes between 1000 and 609,129 bits were embedded into images. PSNR values between 34.14 and 65.8 dB and SSIM values between 0.989 and 0.999 were obtained. CLSM showed better results compared to the Pixel Value Differencing (PVD) method, the Simulated Annealing (SA) Algorithm and mix column transform methods based on irreducible polynomial mathematics.
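
    For readers unfamiliar with the PSNR figures quoted here, the following minimal Python sketch shows the standard definition, PSNR = 10 log10(peak^2 / MSE), for a single 8-bit channel. The function name `psnr` and the list-of-rows image representation are assumptions for illustration; the abstract does not say how the authors computed PSNR across the three colour channels.

```python
import math


def psnr(cover, stego, peak=255):
    """Peak signal-to-noise ratio in dB between two equally sized channels.

    `cover` and `stego` are lists of rows of integer sample values.
    Returns math.inf when the two images are identical.
    """
    squared_error = 0
    count = 0
    for cover_row, stego_row in zip(cover, stego):
        for c, s in zip(cover_row, stego_row):
            squared_error += (c - s) ** 2
            count += 1
    mse = squared_error / count
    if mse == 0:
        return math.inf
    return 10 * math.log10(peak ** 2 / mse)


cover = [[10, 20], [30, 40]]
stego = [[11, 20], [30, 40]]  # one sample changed by 1, so MSE = 0.25
value = psnr(cover, stego)
```

    Higher values mean the stego image is closer to the cover image, which is why the smaller payloads in the study sit at the upper end of the reported range.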

  16. Sensitive Patient Data Hiding using a ROI Reversible Steganography Scheme for DICOM Images.

    Science.gov (United States)

    Mantos, Petros L K; Maglogiannis, Ilias

    2016-06-01

    The exchange of medical images over the Internet has evoked significant interest over the past few years due to the introduction of web and cloud based medical information systems. The protection of sensitive data has always been a key indicator in the performance of such systems. In this context, this work presents an algorithm developed for Digital Imaging and Communications in Medicine (DICOM) medical images, which applies secret-sharing steganography methods for ensuring the integrity of sensitive patient data as well as the important parts of the image. In the proposed algorithm, images are divided into two parts: the region of interest (ROI) and the region of non interest (RONI). Patient data and integrity hashes are positioned inside the ROI while the information (map) needed to recover the ROI before insertion is positioned in the RONI. Security of the extraction process is assured through the use of cryptography. The experimental results prove that the original (cover) images and the stego images provide an excellent visual equality result in terms of PSNR. Furthermore, they prove that the proposed scheme can be efficiently used as a steganography scheme in DICOM images with limited smooth areas.

  17. A Method to Detect AAC Audio Forgery

    Directory of Open Access Journals (Sweden)

    Qingzhong Liu

    2015-08-01

    Full Text Available Advanced Audio Coding (AAC), a standardized lossy compression scheme for digital audio that was designed to be the successor of the MP3 format, generally achieves better sound quality than MP3 at similar bit rates. Because AAC is the default or standard audio format for many devices and AAC audio files may be presented as important digital evidence, authentication of such files is highly needed but relatively lacking. In this paper, we propose a scheme to expose tampered AAC audio streams that are encoded at the same encoding bit rate. Specifically, we design a shift-recompression based method to retrieve the differential features between the re-encoded audio stream at each shift and the original audio stream; a learning classifier is then employed to recognize the different patterns of differential features of doctored forgery files and original (untouched) audio files. Experimental results show that our approach is very promising and effective in detecting same-bit-rate forgery of AAC audio streams. Our study also shows that shift-recompression based differential analysis is very effective for detecting MP3 forgery at the same bit rate.

  18. Back to basics audio

    CERN Document Server

    Nathan, Julian

    1998-01-01

    Back to Basics Audio is a thorough, yet approachable handbook on audio electronics theory and equipment. The first part of the book discusses electrical and audio principles. Those principles form a basis for understanding the operation of equipment and systems, covered in the second section. Finally, the author addresses planning and installation of a home audio system. Julian Nathan joined the audio service and manufacturing industry in 1954 and moved into motion picture engineering and production in 1960. He installed and operated recording theaters in Sydney, Austra

  19. Quantum Steganography via Greenberger-Horne-Zeilinger GHZ4 State

    International Nuclear Information System (INIS)

    El Allati, A.; Hassouni, Y.; Medeni, M.B. Ould

    2012-01-01

    A quantum steganography communication scheme via the Greenberger-Horne-Zeilinger GHZ4 state is constructed to investigate the possibility of remotely transferring hidden information. Multipartite entangled states have become a hot topic owing to their important applications and deep effects on aspects of quantum information. The scheme consists of sharing the correlation of four-particle GHZ4 states between the legitimate users. After ensuring the security of the quantum channel, they begin to hide the secret information in the cover message. Compared with previous quantum steganography schemes, the capacity and imperceptibility of the hidden message are good. The security of the present scheme against many attacks is also discussed. (general)

  20. An efficient steganography method for hiding patient confidential information.

    Science.gov (United States)

    Al-Dmour, Hayat; Al-Ani, Ahmed; Nguyen, Hung

    2014-01-01

    This paper deals with the important issue of security and confidentiality of patient information when exchanging or storing medical images. Steganography has recently been viewed as an alternative or complement to cryptography, as existing cryptographic systems are not perfect due to their vulnerability to certain types of attack. We propose in this paper a new steganography algorithm for hiding patient confidential information. It utilizes Pixel Value Differencing (PVD) to identify contrast regions in the image and a Hamming code that embeds 3 secret message bits into 4 bits of the cover image. In order to preserve the content of the region of interest (ROI), the embedding is only performed using the Region of Non-Interest (RONI).

  1. Cropping and noise resilient steganography algorithm using secret image sharing

    Science.gov (United States)

    Juarez-Sandoval, Oswaldo; Fierro-Radilla, Atoany; Espejel-Trujillo, Angelina; Nakano-Miyatake, Mariko; Perez-Meana, Hector

    2015-03-01

    This paper proposes an image steganography scheme in which a secret image is hidden in a cover image using a secret image sharing (SIS) scheme. By taking advantage of the fault-tolerant property of the (k,n)-threshold SIS, in which the secret data can be recovered without any ambiguity from any k of the n shares (k≤n), the proposed steganography algorithm becomes resilient to cropping and impulsive noise contamination. Among the many SIS schemes proposed so far, Lin and Chan's scheme is selected because of its capability for lossless recovery of a large amount of secret data. The proposed scheme is evaluated from several points of view, such as the imperceptibility of the stegoimage with respect to its original cover image and the robustness of the hidden data to cropping and impulsive noise contamination. The evaluation results show a high quality of the secret image extracted from the stegoimage even when it has suffered more than 20% cropping or high-density noise contamination.

  2. Evaluation of a Smartphone-based audio-biofeedback system for improving balance in older adults--a pilot study.

    Science.gov (United States)

    Fleury, A; Mourcou, Q; Franco, C; Diot, B; Demongeot, J; Vuillerme, N

    2013-01-01

    This study was designed to assess the effectiveness of a Smartphone-based audio-biofeedback (ABF) system for improving balance in older adults. This so-called "iBalance-ABF" system that we recently developed is "all-inclusive" in the sense that its three main components of a balance prosthesis, (i) the sensory input unit, (ii) the processing unit, and (iii) the sensory output unit, are entirely embedded into the Smartphone. The underlying principle of this system is to supply the user with supplementary information about the medial-lateral (ML) trunk tilt relative to a predetermined adjustable "dead zone" through sound generation in earphones. Six healthy older adults voluntarily participated in this pilot study. Eyes closed, they were asked to stand upright and to sway as little as possible in two (parallel and tandem) stance conditions executed without and with the use of the iBalance-ABF system. Results showed that, without any visual information, the use of the Smartphone-based ABF allowed the older healthy adults to significantly decrease their ML trunk sway in the tandem stance posture and to mitigate the destabilizing effect induced by this particular stance. Although an extended study including a larger number of participants is needed to confirm these data, the present results are encouraging. They do suggest that a Smartphone-based ABF system could be used for balance training and rehabilitation therapy in older adults.

  3. Audio Classification from Time-Frequency Texture

    OpenAIRE

    Yu, Guoshen; Slotine, Jean-Jacques

    2008-01-01

    Time-frequency representations of audio signals often resemble texture images. This paper derives a simple audio classification algorithm based on treating sound spectrograms as texture images. The algorithm is inspired by an earlier visual classification scheme particularly efficient at classifying textures. While solely based on time-frequency texture features, the algorithm achieves surprisingly good performance in musical instrument classification experiments.

  4. The Quantum Steganography Protocol via Quantum Noisy Channels

    Science.gov (United States)

    Wei, Zhan-Hong; Chen, Xiu-Bo; Niu, Xin-Xin; Yang, Yi-Xian

    2015-08-01

    As a promising branch of quantum information hiding, quantum steganography aims to transmit secret messages covertly in public quantum channels. However, because of environmental noise and decoherence, quantum states easily decay and change. It is therefore very meaningful to make a quantum information hiding protocol applicable to noisy quantum channels. In this paper, we carry out further research on a quantum steganography protocol for noisy quantum channels. The paper proves that the protocol can be used to transmit secret messages covertly in noisy quantum channels, and explicitly presents the quantum steganography protocol. In the protocol, legal receivers can extract the secret message with a certain probability without the cover data being published, which gives the protocol good secrecy. Moreover, our protocol has independent security and can be used in general quantum communications. The communication in our protocol does not need entangled states, so the protocol can be used without being limited by entanglement resources. More importantly, the protocol applies to noisy quantum channels and can be used widely in future quantum communication.

  5. Steganography on multiple MP3 files using spread spectrum and Shamir's secret sharing

    Science.gov (United States)

    Yoeseph, N. M.; Purnomo, F. A.; Riasti, B. K.; Safiie, M. A.; Hidayat, T. N.

    2016-11-01

    The purpose of steganography is to hide data in another medium. To increase the security of the data, steganography is often combined with cryptography. The weakness of this combined technique is that the data are centralized. Therefore, a steganography technique is developed that combines spread spectrum with secret sharing. In steganography with secret sharing, shares of the data are created and hidden in several media. The media used to conceal the shares were MP3 files. The hiding technique used was spread spectrum, and the secret sharing scheme used was Shamir's Secret Sharing. The results showed that spread-spectrum steganography combined with Shamir's Secret Sharing, using MP3 files as the medium, produces a technique that can hide data in several covers. To extract and reconstruct the data hidden in the stego objects, the number of stego objects available must be greater than or equal to the threshold. Furthermore, the stego objects were imperceptible and robust.
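
    The threshold property mentioned above (any set of shares at least as large as the threshold reconstructs the data) can be illustrated with a minimal sketch of Shamir's Secret Sharing for a single byte. The prime 257, the function names, and the byte-at-a-time treatment are assumptions made for illustration; the paper's actual parameters and the spread-spectrum embedding into MP3 files are not reproduced here.

```python
import random

PRIME = 257  # assumed field size: smallest prime above 255, so any byte fits


def make_shares(secret, k, n, rng=random):
    """Split `secret` (0..256) into n shares; any k of them recover it."""
    # Random polynomial of degree k-1 whose constant term is the secret.
    coeffs = [secret] + [rng.randrange(PRIME) for _ in range(k - 1)]
    shares = []
    for x in range(1, n + 1):
        y = 0
        for c in reversed(coeffs):  # Horner evaluation modulo PRIME
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares


def recover(shares):
    """Lagrange interpolation at x = 0 over the field of integers mod PRIME."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        inv_den = pow(den, PRIME - 2, PRIME)  # modular inverse (Fermat)
        secret = (secret + yi * num * inv_den) % PRIME
    return secret
```

    With k = 3 and n = 5, any three of the five shares return the original byte, whereas fewer than three shares leave the polynomial's constant term undetermined.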

  6. Reviews on Technology and Standard of Spatial Audio Coding

    Directory of Open Access Journals (Sweden)

    Ikhwana Elfitri

    2017-03-01

    Full Text Available Market demand for more impressive entertainment media has motivated the delivery of three-dimensional (3D) audio content to home consumers through Ultra High Definition TV (UHDTV), the next generation of TV broadcasting, in which spatial audio coding plays a fundamental role. This paper reviews the fundamental concepts of spatial audio coding, including technology, standards, and applications. The basic principle of an object-based audio reproduction system is also elaborated and compared with the traditional channel-based system, to provide a good understanding of this popular interactive audio reproduction system, which gives end users the flexibility to render their own preferred audio composition.

  7. Three-Dimensional Audio Client Library

    Science.gov (United States)

    Rizzi, Stephen A.

    2005-01-01

    The Three-Dimensional Audio Client Library (3DAudio library) is a group of software routines written to facilitate development of both stand-alone (audio only) and immersive virtual-reality application programs that utilize three-dimensional audio displays. The library is intended to enable the development of three-dimensional audio client application programs by use of a code base common to multiple audio server computers. The 3DAudio library calls vendor-specific audio client libraries and currently supports the AuSIM Gold-Server and Lake Huron audio servers. 3DAudio library routines contain common functions for (1) initiation and termination of a client/audio server session, (2) configuration-file input, (3) positioning functions, (4) coordinate transformations, (5) audio transport functions, (6) rendering functions, (7) debugging functions, and (8) event-list-sequencing functions. The 3DAudio software is written in the C++ programming language and currently operates under the Linux, IRIX, and Windows operating systems.

  8. Audio-haptic physically-based simulation of walking on different grounds

    DEFF Research Database (Denmark)

    Turchet, Luca; Nordahl, Rolf; Serafin, Stefania

    2010-01-01

    We describe a system which simulates in realtime the auditory and haptic sensations of walking on different surfaces. The system is based on a pair of sandals enhanced with pressure sensors and actuators. The pressure sensors detect the interaction force during walking, and control several...... physically based synthesis algorithms, which drive both the auditory and haptic feedback. The different hardware and software components of the system are described, together with possible uses and possibilities for improvements in future design iterations....

  9. A listening test system for automotive audio

    DEFF Research Database (Denmark)

    Christensen, Flemming; Geoff, Martin; Minnaar, Pauli

    2005-01-01

    This paper describes a system for simulating automotive audio through headphones for the purposes of conducting listening experiments in the laboratory. The system is based on binaural technology and consists of a component for reproducing the sound of the audio system itself and a component...

  10. Spatio-Temporal Audio Enhancement Based on IAA Noise Covariance Matrix Estimates

    DEFF Research Database (Denmark)

    Nørholm, Sidsel Marie; Jensen, Jesper Rindom; Christensen, Mads Græsbøll

    2014-01-01

    A method for estimating the noise covariance matrix in a multichannel setup is proposed. The method is based on the iterative adaptive approach (IAA), which only needs short segments of data to estimate the covariance matrix. Therefore, the method can be used for fast varying signals. The m...

  11. Audio Source Separation in Reverberant Environments Using β-Divergence-Based Nonnegative Factorization

    DEFF Research Database (Denmark)

    Fakhry, Mahmoud; Svaizer, Piergiorgio; Omologo, Maurizio

    2017-01-01

    -maximization algorithm and used to separate the signals by means of multichannel Wiener filtering. We propose to estimate these parameters by applying nonnegative factorization based on prior information on source variances. In the nonnegative factorization, spectral basis matrices can be defined as the prior...

  12. Categorizing Video Game Audio

    DEFF Research Database (Denmark)

    Westerberg, Andreas Rytter; Schoenau-Fog, Henrik

    2015-01-01

    This paper dives into the subject of video game audio and how it can be categorized in order to deliver a message to a player in the most precise way. A new categorization, with a new take on the diegetic spaces, can be used as a tool of inspiration for sound- and game-designers to rethink how...... they can use audio in video games. The conclusion of this study is that the current models' view of the diegetic spaces, used to categorize video game audio, is not fit to categorize all sounds. This can however possibly be changed through a rethinking of how the player interprets audio....

  13. Robust Sounds of Activities of Daily Living Classification in Two-Channel Audio-Based Telemonitoring

    Directory of Open Access Journals (Sweden)

    David Maunder

    2013-01-01

    Full Text Available Despite recent advances in the area of home telemonitoring, the challenge of automatically detecting the sound signatures of activities of daily living of an elderly patient using nonintrusive and reliable methods remains. This paper investigates the classification of eight typical sounds of daily life from arbitrarily positioned two-microphone sensors under realistic noisy conditions. In particular, the role of several source separation and sound activity detection methods is considered. Evaluations on a new four-microphone database collected under four realistic noise conditions reveal that effective sound activity detection can produce significant gains in classification accuracy and that further gains can be made using source separation methods based on independent component analysis. Encouragingly, the results show that recognition accuracies in the range 70%–100% can be consistently obtained using different microphone-pair positions, under all but the most severe noise conditions.

  14. Robust sounds of activities of daily living classification in two-channel audio-based telemonitoring.

    Science.gov (United States)

    Maunder, David; Epps, Julien; Ambikairajah, Eliathamby; Celler, Branko

    2013-01-01

    Despite recent advances in the area of home telemonitoring, the challenge of automatically detecting the sound signatures of activities of daily living of an elderly patient using nonintrusive and reliable methods remains. This paper investigates the classification of eight typical sounds of daily life from arbitrarily positioned two-microphone sensors under realistic noisy conditions. In particular, the role of several source separation and sound activity detection methods is considered. Evaluations on a new four-microphone database collected under four realistic noise conditions reveal that effective sound activity detection can produce significant gains in classification accuracy and that further gains can be made using source separation methods based on independent component analysis. Encouragingly, the results show that recognition accuracies in the range 70%-100% can be consistently obtained using different microphone-pair positions, under all but the most severe noise conditions.

  15. Study on Strengthening Plan of Safety Network CCTV Monitoring by Steganography and User Authentication

    Directory of Open Access Journals (Sweden)

    Jung-oh Park

    2015-01-01

    Full Text Available Recently, as the utilization of CCTV (closed circuit television) has emerged as an issue, studies on CCTV have received much attention. With the development of CCTV, cameras now have IP addresses and are connected to networks, so they are exposed to many of the threats of the existing web environment. In this paper, steganography is utilized to confirm Data Masquerading and Data Modification; in addition, to strengthen security, user information is protected based on PKI (public key infrastructure), SN (serial number), and an R value (random number) assigned at the time of login, and a user authentication protocol that blocks unauthorized access by malicious users in the network CCTV environment is proposed. The approach should be applicable in the future to CCTV systems involving user infringement where user information protection technology has not yet been applied.

  16. A Novel Image Steganography Technique for Secured Online Transaction Using DWT and Visual Cryptography

    Science.gov (United States)

    Anitha Devi, M. D.; ShivaKumar, K. B.

    2017-08-01

    The online payment ecosystem is a main target for cyber fraud. End-to-end encryption is therefore very much needed to maintain the integrity of secret information related to transactions carried out online. With access to sensitive payment-related information, which enables a large number of money transactions every day, the payment infrastructure is a major target for hackers. The proposed system presents an approach to secure online fund transfer that uniquely combines visual cryptography with a Haar-based discrete wavelet transform steganography technique. This combination of data hiding techniques reduces the amount of information that must be shared between the consumer and the online merchant for a successful online transaction, provides enhanced security for the customer's account details, and thereby increases the customer's confidence by preventing "Identity theft" and "Phishing". To evaluate the effectiveness of the proposed algorithm, root mean square error and peak signal-to-noise ratio have been used as evaluation parameters.
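
    As background for the Haar-based transform named above, here is a minimal one-level, one-dimensional Haar decomposition in its unnormalized average/difference form, with the matching inverse. It is only a sketch of the transform itself; the paper's two-dimensional image transform, its normalization, and how data are hidden in the coefficients are not specified in the abstract and are not modeled here.

```python
def haar_forward(samples):
    """One-level Haar transform: pairwise averages and pairwise half-differences."""
    if len(samples) % 2:
        raise ValueError("length must be even")
    approx = [(a + b) / 2 for a, b in zip(samples[0::2], samples[1::2])]
    detail = [(a - b) / 2 for a, b in zip(samples[0::2], samples[1::2])]
    return approx, detail


def haar_inverse(approx, detail):
    """Rebuild the original samples from averages and half-differences."""
    samples = []
    for a, d in zip(approx, detail):
        samples.extend([a + d, a - d])
    return samples
```

    Transform-domain schemes typically modify selected coefficients and then apply the inverse transform to obtain the stego signal.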

  17. Roundtable Audio Discussion

    Directory of Open Access Journals (Sweden)

    Chris Bigum

    2007-01-01

    Full Text Available RoundTable on Technology, Teaching and Tools. This is a roundtable audio interview conducted by James Farmer, founder of Edublogs, with Anne Bartlett-Bragg (University of Technology Sydney) and Chris Bigum (Deakin University). Skype was used to make and record the audio conference and the resulting sound file was edited by Andrew McLauchlan.

  18. Effects of a Theory-Based Audio HIV/AIDS Intervention for Illiterate Rural Females in Amhara, Ethiopia

    Science.gov (United States)

    Bogale, Gebeyehu W.; Boer, Henk; Seydel, Erwin R.

    2011-01-01

    In Ethiopia the level of illiteracy in rural areas is very high. In this study, we investigated the effects of an audio HIV/AIDS prevention intervention targeted at rural illiterate females. In the intervention we used social-oriented presentation formats, such as discussion between similar females and role-play. In a pretest and posttest…

  19. On-line Tool Wear Detection on DCMT070204 Carbide Tool Tip Based on Noise Cutting Audio Signal using Artificial Neural Network

    Science.gov (United States)

    Prasetyo, T.; Amar, S.; Arendra, A.; Zam Zami, M. K.

    2018-01-01

    This study develops an on-line detection system to predict the wear of a DCMT070204 tool tip during the cutting of a workpiece. The machine used in this research is a CNC ProTurn 9000, cutting an ST42 steel cylinder. The audio signal was captured using a microphone placed in the tool post and recorded in Matlab at a sampling rate of 44.1 kHz and a sampling size of 1024. The recorded data comprise 110 samples derived from the audio signal while cutting with a normal chisel and a worn chisel. Signal features were then extracted in the frequency domain using the Fast Fourier Transform, and feature selection was based on correlation analysis. Tool wear classification was performed using an artificial neural network with the 33 selected input features, trained with the back propagation method. Classification performance testing yields an accuracy of 74%.
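
    To make the frequency-domain feature extraction step more concrete, the sketch below computes a magnitude spectrum for one frame and maps bin indices to frequencies. It assumes that the "sampling size of 1024" refers to the frame length, and it uses a naive discrete Fourier transform in place of an FFT (the output is the same, only slower); the names are illustrative and this is not the authors' Matlab code.

```python
import cmath

SAMPLE_RATE = 44100   # Hz, as stated in the abstract
FRAME_SIZE = 1024     # assumed to be the analysis frame length


def magnitude_spectrum(frame):
    """Magnitudes of the first half of the DFT of a real-valued frame."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2):
        acc = 0j
        for t, x in enumerate(frame):
            acc += x * cmath.exp(-2j * cmath.pi * k * t / n)
        spectrum.append(abs(acc))
    return spectrum


def bin_frequency(k, sample_rate=SAMPLE_RATE, frame_size=FRAME_SIZE):
    """Centre frequency in Hz of DFT bin k."""
    return k * sample_rate / frame_size
```

    Under these assumptions each bin is about 43.07 Hz wide (44100 / 1024), so the 512 bins of the half spectrum cover 0 Hz up to just below 22.05 kHz, and a subset of them could serve as candidate input features.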

  20. Robust Steganography Using LSB-XOR and Image Sharing

    OpenAIRE

    Adak, Chandranath

    2013-01-01

    Hiding and securing the secret digital information and data that are transmitted over the internet is of widespread and most challenging interest. This paper presents a new idea of robust steganography using bitwise-XOR operation between stego-key-image-pixel LSB (Least Significant Bit) value and secret message-character ASCII-binary value (or, secret image-pixel value). The stego-key-image is shared in dual-layer using odd-even position of each pixel to make the system robust. Due to image s...
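
    The bitwise-XOR step described above can be sketched as follows: each bit of the ASCII message is XORed with the least significant bit of a pixel from a key image, and applying the same operation again restores the message. The function names and the flat list of key pixels are assumptions for illustration; the dual-layer odd/even sharing of the stego-key image is not modeled.

```python
def message_bits(text):
    """ASCII text to a list of bits, most significant bit first."""
    bits = []
    for byte in text.encode("ascii"):
        for shift in range(7, -1, -1):
            bits.append((byte >> shift) & 1)
    return bits


def xor_with_key_lsbs(bits, key_pixels):
    """XOR each bit with the LSB of the corresponding key-image pixel."""
    if len(key_pixels) < len(bits):
        raise ValueError("key image has too few pixels")
    return [b ^ (p & 1) for b, p in zip(bits, key_pixels)]


def bits_to_text(bits):
    """Inverse of message_bits: groups of 8 bits back to ASCII characters."""
    chars = []
    for i in range(0, len(bits), 8):
        value = 0
        for b in bits[i:i + 8]:
            value = (value << 1) | b
        chars.append(chr(value))
    return "".join(chars)
```

    Because XOR is its own inverse, a receiver holding the same key image can call `xor_with_key_lsbs` on the masked bits to obtain the original message bits.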

  1. Overdrive and Edge as Refiners of "Belting"?: An Empirical Study Qualifying and Categorizing "Belting" Based on Audio Perception, Laryngostroboscopic Imaging, Acoustics, LTAS, and EGG.

    Science.gov (United States)

    McGlashan, Julian; Thuesen, Mathias Aaen; Sadolin, Cathrine

    2017-05-01

    We aimed to study the categorizations "Overdrive" and "Edge" from the pedagogical method Complete Vocal Technique as refiners of the often ill-defined concept of "belting" by means of audio perception, laryngostroboscopic imaging, acoustics, long-term average spectrum (LTAS), and electroglottography (EGG). This is a case-control study. Twenty singers were recorded singing sustained vowels in a "belting" quality refined by audio perception as "Overdrive" and "Edge." Two studies were performed: (1) a laryngostroboscopic examination using a videonasoendoscopic camera system (Olympus) and the Laryngostrobe program (Laryngograph); (2) a simultaneous recording of the EGG and acoustic signals using Speech Studio (Laryngograph). The images were analyzed based on consensus agreement. Statistical analysis of the acoustic, LTAS, and EGG parameters was undertaken using the Student paired t test. The two modes of singing determined by audio perception have visibly different laryngeal gestures: Edge has a more constricted setting than that of Overdrive, where the ventricular folds seem to cover more of the vocal folds, the aryepiglottic folds show a sharper edge in Edge, and the cuneiform cartilages are rolled in anteromedially. LTAS analysis shows a statistical difference, particularly after the ninth harmonic, with a coinciding first formant. The combined group showed statistical differences in shimmer, harmonics-to-noise ratio, normalized noise energy, and mean sound pressure level (P ≤ 0.05). "Belting" sounds can be categorized using audio perception into two modes of singing: "Overdrive" and "Edge." This study demonstrates consistent visibly different laryngeal gestures between these modes and with some correspondingly significant differences in LTAS, EGG, and acoustic measures. Copyright © 2017 The Voice Foundation. Published by Elsevier Inc. All rights reserved.

  2. Effect of audio in-vehicle red light-running warning message on driving behavior based on a driving simulator experiment.

    Science.gov (United States)

    Yan, Xuedong; Liu, Yang; Xu, Yongcun

    2015-01-01

    Drivers' incorrect decisions of crossing signalized intersections at the onset of the yellow change may lead to red light running (RLR), and RLR crashes result in substantial numbers of severe injuries and property damage. In recent years, some Intelligent Transport System (ITS) concepts have focused on reducing RLR by alerting drivers that they are about to violate the signal. The objective of this study is to conduct an experimental investigation on the effectiveness of the red light violation warning system using a voice message. In this study, the prototype concept of the RLR audio warning system was modeled and tested in a high-fidelity driving simulator. According to the concept, when a vehicle is approaching an intersection at the onset of yellow and the time to the intersection is longer than the yellow interval, the in-vehicle warning system can activate the following audio message "The red light is impending. Please decelerate!" The intent of the warning design is to encourage drivers who cannot clear an intersection during the yellow change interval to stop at the intersection. The experimental results showed that the warning message could decrease red light running violations by 84.3 percent. Based on the logistic regression analyses, drivers without a warning were about 86 times more likely to make go decisions at the onset of yellow and about 15 times more likely to run red lights than those with a warning. Additionally, it was found that the audio warning message could significantly reduce RLR severity because the RLR drivers' red-entry times without a warning were longer than those with a warning. This driving simulator study showed a promising effect of the audio in-vehicle warning message on reducing RLR violations and crashes. It is worthwhile to further develop the proposed technology in field applications.

  3. Audio Technology and Mobile Human Computer Interaction

    DEFF Research Database (Denmark)

    Chamberlain, Alan; Bødker, Mads; Hazzard, Adrian

    2017-01-01

    Audio-based mobile technology is opening up a range of new interactive possibilities. This paper brings some of those possibilities to light by offering a range of perspectives based in this area. It is not only the technical systems that are developing, but novel approaches to the design...... and understanding of audio-based mobile systems are evolving to offer new perspectives on interaction and design and support such systems to be applied in areas, such as the humanities....

  4. Implementing Audio-CASI on Windows’ Platforms

    Science.gov (United States)

    Cooley, Philip C.; Turner, Charles F.

    2011-01-01

    Audio computer-assisted self interviewing (Audio-CASI) technologies have recently been shown to provide important and sometimes dramatic improvements in the quality of survey measurements. This is particularly true for measurements requiring respondents to divulge highly sensitive information such as their sexual, drug use, or other sensitive behaviors. However, DOS-based Audio-CASI systems that were designed and adopted in the early 1990s have important limitations. Most salient is the poor control they provide for manipulating the video presentation of survey questions. This article reports our experiences adapting Audio-CASI to Microsoft Windows 3.1 and Windows 95 platforms. Overall, our Windows-based system provided the desired control over video presentation and afforded other advantages including compatibility with a much wider array of audio devices than our DOS-based Audio-CASI technologies. These advantages came at the cost of increased system requirements, including the need for both more RAM and larger hard disks. While these costs will be an issue for organizations converting large inventories of PCs to Windows Audio-CASI today, this will not be a serious constraint for organizations and individuals with small inventories of machines to upgrade or those purchasing new machines today. PMID:22081743

  5. Portable Audio Design

    DEFF Research Database (Denmark)

    Groth, Sanne Krogh

    2014-01-01

    The chapter presents a methodological approach to the early process of producing portable audio design. The chapter highlights audio walks and audio guides, but can also be of inspiration when working with graphical and video production for portable devices. The final products can be presented...... within online and physical institutional contexts. The approach focuses especially on the relationship to specific sites, and how an awareness of the relationship between the site and the production can be part of the design process. Such awareness entails several approaches: the necessity of paying...

  6. Throughput increase of the covert communication channel organized by the stable steganography algorithm using spatial domain of the image

    Directory of Open Access Journals (Sweden)

    O.V. Kostyrka

    2016-09-01

    Full Text Available When a covert communication channel is organized, a number of requirements are imposed on the steganography algorithms used, the main ones being: resistance to attacks against the embedded message, reliability of perception of the resulting steganography message, and significant throughput of the steganography communication channel. Aim: The aim of this research is to modify the steganography method developed earlier by the author so as to increase the throughput of the corresponding covert communication channel while preserving the resistance to attacks against the embedded message and the perception reliability of the resulting steganography message that are inherent to the developed method. Materials and Methods: Modifications are proposed to a steganography method that is robust against attacks on the embedded message and that embeds and decodes the sent (additional) information in the spatial domain of the image; the modifications increase the throughput of the organized communication channel. Using the spatial domain of the image avoids the accumulation of additional computational error during embedding and decoding of the additional information caused by "transitions" from the spatial domain of the image to the transform domain and back, which positively affects the efficiency of decoding. The following are considered as attacks against the embedded message: imposing different kinds of noise on the steganography message, filtering, and lossy compression of the steganography message, where the JPEG and JPEG2000 formats with different quality coefficients are used to save the steganography message. Results: It is shown that algorithmic implementations of the proposed modifications remain robust against perturbing influences, including considerable ones, provide reliability of perception of the resulting steganography message, and increase the throughput of the created steganography communication channel in comparison with the algorithm implementing

  7. Concept for audio encoding and decoding for audio channels and audio objects

    OpenAIRE

    Adami, Alexander; Borss, Christian; Dick, Sascha; Ertel, Christian; Füg, Simone; Herre, Jürgen; Hilpert, Johannes; Hölzer, Andreas; Kratschmer, Michael; Küch, Fabian; Kuntz, Achim; Murtaza, Adrian; Plogsties, Jan; Silzle, Andreas; Stenzel, Hanne

    2015-01-01

    Audio encoder for encoding audio input data (101) to obtain audio output data (501) comprises an input interface (100) for receiving a plurality of audio channels, a plurality of audio objects and metadata related to one or more of the plurality of audio objects; a mixer (200) for mixing the plurality of objects and the plurality of channels to obtain a plurality of pre-mixed channels, each pre-mixed channel comprising audio data of a channel and audio data of at least one object; a core enco...

  8. COMINT Audio Interface

    National Research Council Canada - National Science Library

    Morgans, D

    1999-01-01

    .... Demonstrations conducted under this effort concluded that 3D audio localization techniques on their own have not been developed to the point where they achieve the fidelity necessary for the military work environment...

  9. Progressive Syntax-Rich Coding of Multichannel Audio Sources

    Directory of Open Access Journals (Sweden)

    Dai Yang

    2003-09-01

    Full Text Available Being able to transmit the audio bitstream progressively is a highly desirable property for network transmission. MPEG-4 version 2 audio supports fine grain bit rate scalability in the generic audio coder (GAC). It has a bit-sliced arithmetic coding (BSAC) tool, which provides scalability in the step of 1 Kbps per audio channel. There are also several other scalable audio coding methods, which have been proposed in recent years. However, these scalable audio tools are only available for mono and stereo audio material. Little work has been done on progressive coding of multichannel audio sources. MPEG advanced audio coding (AAC) is one of the most distinguished multichannel digital audio compression systems. Based on AAC, we develop in this work a progressive syntax-rich multichannel audio codec (PSMAC). It not only supports fine grain bit rate scalability for the multichannel audio bitstream but also provides several other desirable functionalities. A formal subjective listening test shows that the proposed algorithm achieves an excellent performance at several different bit rates when compared with MPEG AAC.

  10. Progressive Syntax-Rich Coding of Multichannel Audio Sources

    Science.gov (United States)

    Yang, Dai; Ai, Hongmei; Kyriakakis, Chris; Kuo, C.-C. Jay

    2003-12-01

    Being able to transmit the audio bitstream progressively is a highly desirable property for network transmission. MPEG-4 version 2 audio supports fine grain bit rate scalability in the generic audio coder (GAC). It has a bit-sliced arithmetic coding (BSAC) tool, which provides scalability in the step of 1 Kbps per audio channel. There are also several other scalable audio coding methods, which have been proposed in recent years. However, these scalable audio tools are only available for mono and stereo audio material. Little work has been done on progressive coding of multichannel audio sources. MPEG advanced audio coding (AAC) is one of the most distinguished multichannel digital audio compression systems. Based on AAC, we develop in this work a progressive syntax-rich multichannel audio codec (PSMAC). It not only supports fine grain bit rate scalability for the multichannel audio bitstream but also provides several other desirable functionalities. A formal subjective listening test shows that the proposed algorithm achieves an excellent performance at several different bit rates when compared with MPEG AAC.

  11. Structure Learning in Audio

    DEFF Research Database (Denmark)

    Nielsen, Andreas Brinch

    By having information about the setting a user is in, a computer is able to make decisions proactively to facilitate tasks for the user. Two approaches are taken in this thesis to achieve more information about an audio environment. One approach is that of classifying audio, and a new approach us......-Gaussian source distributions allowing a much wider use of the method. All methods use a variety of classification models and model selection algorithms, which are a common theme of the thesis....

  12. Museum audio description

    OpenAIRE

    Martins, Cláudia Susana Nunes

    2011-01-01

    Audio description for the blind and visually impaired has been around since people have described what is seen. Throughout time, it has evolved and developed within different media, starting with reality and daily life, moving into the cinema and television, then across other performing arts, museums and art galleries, and public places. Thus, academics and entertainment providers have developed a growing interest for audio description, especially in what concerns the best methods and strateg...

  13. Near-field Localization of Audio

    DEFF Research Database (Denmark)

    Jensen, Jesper Rindom; Christensen, Mads Græsbøll

    2014-01-01

    Localization of audio sources using microphone arrays has been an important research problem for more than two decades. Many traditional methods for solving the problem are based on a two-stage procedure: first, information about the audio source, such as time differences-of-arrival (TDOAs......) and gain ratios-of-arrival (GROAs) between microphones is estimated, and, second, this knowledge is used to localize the audio source. These methods often have a low computational complexity, but this comes at the cost of a limited estimation accuracy. Therefore, we propose a new localization approach...
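    The first stage of the traditional two-stage procedure mentioned above, estimating a time difference of arrival between two microphones, can be illustrated with a brute-force cross-correlation. This is only a minimal sketch of the generic technique, not the approach proposed in the record; the function name `estimate_tdoa`, the toy pulse, and the 8 kHz sampling rate are assumptions chosen for illustration.

```python
def estimate_tdoa(x, y, fs, max_lag):
    """Return the delay of y relative to x in seconds (positive: y lags x).

    Brute-force cross-correlation: try every integer lag in
    [-max_lag, max_lag] and keep the one with the largest correlation.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for n, xn in enumerate(x):
            m = n + lag
            if 0 <= m < len(y):
                score += xn * y[m]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag / fs


fs = 8000  # hypothetical sampling rate in Hz
pulse = [0.0, 1.0, 0.5, -0.25, 0.0]
mic1 = [0.0] * 10 + pulse + [0.0] * 20
mic2 = [0.0] * 13 + pulse + [0.0] * 17  # same pulse, 3 samples later
tdoa = estimate_tdoa(mic1, mic2, fs, max_lag=8)
print(tdoa)  # 3 samples at 8 kHz
```

    A second stage would combine such pairwise estimates with the known microphone geometry to localize the source; the integer-lag resolution used here is one reason such methods have limited accuracy.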

  14. A high capacity data recording device based on a digital audio processor and a video cassette recorder.

    Science.gov (United States)

    Bezanilla, F

    1985-01-01

    A modified digital audio processor, a video cassette recorder, and some simple added circuitry are assembled into a recording device of high capacity. The unit converts two analog channels into digital form at 44-kHz sampling rate and stores the information in digital form in a common video cassette. Bandwidth of each channel is from direct current to approximately 20 kHz and the dynamic range is close to 90 dB. The total storage capacity in a 3-h video cassette is 2 Gbytes. The information can be retrieved in analog or digital form. PMID:3978213
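    The quoted capacity can be checked with simple arithmetic. The sketch below assumes 16-bit (2-byte) samples, which the abstract does not state, and takes the sampling rate as exactly 44 kHz.

```python
channels = 2
sample_rate_hz = 44_000      # "44-kHz sampling rate" from the abstract
bytes_per_sample = 2         # assumption: 16-bit samples (not stated in the abstract)
duration_s = 3 * 60 * 60     # 3-h video cassette

total_bytes = channels * sample_rate_hz * bytes_per_sample * duration_s
total_gbytes = total_bytes / 1e9
print(total_bytes, round(total_gbytes, 2))  # 1900800000 1.9
```

    Under these assumptions the result is about 1.9 Gbytes, consistent with the stated total of roughly 2 Gbytes.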

  15. CERN automatic audio-conference service

    CERN Document Server

    Sierra Moral, R

    2010-01-01

    Scientists from all over the world need to collaborate with CERN on a daily basis. They must be able to communicate effectively on their joint projects at any time; as a result telephone conferences have become indispensable and widely used. Managed by 6 operators, CERN already has more than 20000 hours and 5700 audio-conferences per year. However, the traditional telephone based audio-conference system needed to be modernized in three ways. Firstly, to provide the participants with more autonomy in the organization of their conferences; secondly, to eliminate the constraints of manual intervention by operators; and thirdly, to integrate the audio-conferences into a collaborative working framework. The large number, and hence cost, of the conferences prohibited externalization and so the CERN telecommunications team drew up a specification to implement a new system. It was decided to use a new commercial collaborative audio-conference solution based on the SIP protocol. The system was tested as the first Euro...

  16. CERN automatic audio-conference service

    CERN Multimedia

    Sierra Moral, R

    2009-01-01

    Scientists from all over the world need to collaborate with CERN on a daily basis. They must be able to communicate effectively on their joint projects at any time; as a result telephone conferences have become indispensable and widely used. Managed by 6 operators, CERN already has more than 20000 hours and 5700 audio-conferences per year. However, the traditional telephone based audio-conference system needed to be modernized in three ways. Firstly, to provide the participants with more autonomy in the organization of their conferences; secondly, to eliminate the constraints of manual intervention by operators; and thirdly, to integrate the audio-conferences into a collaborative working framework. The large number, and hence cost, of the conferences prohibited externalization and so the CERN telecommunications team drew up a specification to implement a new system. It was decided to use a new commercial collaborative audio-conference solution based on the SIP protocol. The system was tested as the first Euro...

  17. TNO at TRECVID 2008, Combining Audio and Video Fingerprinting for Robust Copy Detection

    NARCIS (Netherlands)

    Doets, P.J.; Eendebak, P.T.; Ranguelova, E.; Kraaij, W.

    2009-01-01

    TNO has evaluated a baseline audio and a video fingerprinting system based on robust hashing for the TRECVID 2008 copy detection task. We participated in the audio, the video and the combined audio-video copy detection task. The audio fingerprinting implementation clearly outperformed the video

  18. Recognition of Activities of Daily Living Based on Environmental Analyses Using Audio Fingerprinting Techniques: A Systematic Review

    Science.gov (United States)

    Santos, Rui; Pombo, Nuno; Flórez-Revuelta, Francisco

    2018-01-01

    An increase in the accuracy of identification of Activities of Daily Living (ADL) is very important for different goals of Enhanced Living Environments and for Ambient Assisted Living (AAL) tasks. This increase may be achieved through identification of the surrounding environment. Although this is usually used to identify the location, ADL recognition can be improved with the identification of the sound in that particular environment. This paper reviews audio fingerprinting techniques that can be used with the acoustic data acquired from mobile devices. A comprehensive literature search was conducted in order to identify relevant English language works aimed at the identification of the environment of ADLs using data acquired with mobile devices, published between 2002 and 2017. In total, 40 studies were analyzed and selected from 115 citations. The results highlight several audio fingerprinting techniques, including modified discrete cosine transform (MDCT), Mel-frequency cepstrum coefficients (MFCC), principal component analysis (PCA), fast Fourier transform (FFT), Gaussian mixture models (GMM), likelihood estimation, logarithmic modulated complex lapped transform (LMCLT), support vector machine (SVM), constant Q transform (CQT), symmetric pairwise boosting (SPB), Philips robust hash (PRH), linear discriminant analysis (LDA) and discrete cosine transform (DCT). PMID:29315232

  19. AudioMUD: a multiuser virtual environment for blind people.

    Science.gov (United States)

    Sánchez, Jaime; Hassler, Tiago

    2007-03-01

    A number of virtual environments have been developed in recent years. Among them are some applications for blind people based on different types of audio, from simple sounds to 3-D audio. In this study, we pursued a different approach. We designed AudioMUD by using spoken text to describe the environment, navigation, and interaction. We have also introduced some collaborative features into the interaction between blind users. The core of a multiuser MUD game is a networked textual virtual environment. We developed AudioMUD by adding some collaborative features to the basic idea of a MUD and placed a simulated virtual environment inside the human body. This paper presents the design and usability evaluation of AudioMUD. Blind learners were motivated when interacting with AudioMUD and helped to improve the interaction through audio and interface design elements.

  20. Web Audio/Video Streaming Tool

    Science.gov (United States)

    Guruvadoo, Eranna K.

    2003-01-01

    In order to promote a NASA-wide educational outreach program to educate and inform the public about space exploration, NASA, at Kennedy Space Center, is seeking efficient ways to add more content to the web by streaming audio/video files. This project proposes a high-level overview of a framework for the creation, management, and scheduling of audio/video assets over the web. To support short-term goals, the prototype of a web-based tool is designed and demonstrated to automate the process of streaming audio/video files. The tool provides web-enabled user interfaces to manage video assets, create publishable schedules of video assets for streaming, and schedule the streaming events. These operations are performed on user-defined and system-derived metadata of audio/video assets stored in a relational database, while the assets reside in a separate repository. The prototype tool is designed using ColdFusion 5.0.

  1. EVALUASI KEPUASAN PENGGUNA TERHADAP APLIKASI AUDIO BOOKS [Evaluation of User Satisfaction with an Audio Books Application]

    Directory of Open Access Journals (Sweden)

    Raditya Maulana Anuraga

    2017-02-01

    Full Text Available Listeno is the first audio book application in Indonesia; it lets users consume books in audio form, much like listening to music. Listeno faces several problems: the requested offline mode has not been released, the security of its mp3 files needs attention, and it has not yet reached its target of 100,000 active users. This research aims to evaluate user satisfaction with Audio Books using Nielsen's approach. The analysis uses Importance Performance Analysis (IPA) combined with the User Satisfaction Index (IKP), based on the following indicators: usefulness, utility, usability, learnability, efficiency, memorability, error, and satisfaction. The results show that users of the Audio Books application are quite satisfied, with a calculated IKP of 69.58%.

  2. Augmenting Environmental Interaction in Audio Feedback Systems

    Directory of Open Access Journals (Sweden)

    Seunghun Kim

    2016-04-01

    Full Text Available Audio feedback is defined as a positive feedback of acoustic signals where an audio input and output form a loop, and may be utilized artistically. This article presents new context-based controls over audio feedback, leading to the generation of desired sonic behaviors by enriching the influence of existing acoustic information such as room response and ambient noise. This ecological approach to audio feedback emphasizes mutual sonic interaction between signal processing and the acoustic environment. Mappings from analyses of the received signal to signal-processing parameters are designed to emphasize this specificity as an aesthetic goal. Our feedback system presents four types of mappings: approximate analyses of room reverberation to tempo-scale characteristics, ambient noise to amplitude and two different approximations of resonances to timbre. These mappings are validated computationally and evaluated experimentally in different acoustic conditions.

  3. Design of an audio advertisement dataset

    Science.gov (United States)

    Fu, Yutao; Liu, Jihong; Zhang, Qi; Geng, Yuting

    2015-12-01

    Since more and more advertisements swarm into radios, it is necessary to establish an audio advertising dataset which could be used to analyze and classify the advertisement. A method of how to establish a complete audio advertising dataset is presented in this paper. The dataset is divided into four different kinds of advertisements. Each advertisement's sample is given in *.wav file format, and annotated with a txt file which contains its file name, sampling frequency, channel number, broadcasting time and its class. The classifying rationality of the advertisements in this dataset is proved by clustering the different advertisements based on Principal Component Analysis (PCA). The experimental results show that this audio advertisement dataset offers a reliable set of samples for correlative audio advertisement experimental studies.

  4. Design of progressive syntax-rich multichannel audio codec

    Science.gov (United States)

    Yang, Dai; Ai, Hongmei; Kyriakakis, Christos; Kuo, C.-C. Jay

    2001-12-01

    Being able to transmit the audio bitstream progressively is a highly desirable property for network transmission. MPEG-4 version-2 audio supports fine grain bit rate scalability in the Generic Audio Coder (GAC). It has a Bit-Sliced Arithmetic Coding (BSAC) tool, which provides scalability in the step of 1kbit/sec per audio channel. However, this fine grain scalability tool is only available for mono and stereo audio material. Not much work has been done on progressively transmitting multichannel audio sources. MPEG Advanced Audio Coding (AAC) is one of the most distinguished multichannel digital audio compression systems. Based on AAC, we develop a progressive syntax-rich multichannel audio codec in this work. It not only supports fine grain bit rate scalability for the multichannel audio bitstream, but also provides several other desirable functionalities. A formal subjective listening test shows that the proposed algorithm achieves a better performance at several different bit rates when compared with MPEG-4 BSAC for the mono audio sources.

  5. Image steganography using layered pixel-value differencing

    Science.gov (United States)

    Kim, Jaeyoung; Park, Hanhoon

    2017-02-01

    This paper proposes a layered approach to improve the embedding capacity of the existing pixel-value differencing (PVD) methods for image steganography. Specifically, one of the PVD methods is applied to embed secret information into a cover image, and the resulting image, called a stego-image, is used to embed additional secret information by the same or another PVD method. This results in a double-layered stego-image. Then, another PVD method can be applied to the double-layered stego-image, resulting in a triple-layered stego-image. Likewise, multi-layered stego-images can be obtained. To successfully recover the secret information hidden in each layer, the embedding process is carefully designed. In the experiment, the proposed layered PVD method proved to be effective.

  6. Improving the privacy of optical steganography with temporal phase masks.

    Science.gov (United States)

    Wang, Z; Fok, M P; Xu, L; Chang, J; Prucnal, P R

    2010-03-15

    Temporal phase modulation of spread stealth signals is proposed and demonstrated to improve optical steganography transmission privacy. After phase modulation, the temporally spread stealth signal has a more complex spectral-phase-temporal relationship, such that the original temporal profile cannot be restored when only dispersion compensation is applied to the temporally spread stealth signals. Therefore, it increases the difficulty for the eavesdropper to detect and intercept the stealth channel that is hidden under a public transmission, even with a correct dispersion compensation device. The experimental results demonstrate the feasibility of this approach and display insignificant degradation in transmission performance, compared to the conventional stealth transmission without temporal phase modulation. The proposed system can also work without a clock transmission for signal synchronization. Our analysis and simulation results show that it is difficult for the adversary to detect the existence of the stealth transmission, or find the correct phase mask to recover the stealth signals.

  7. Increasing Security for Cloud Computing By Steganography in Image Edges

    Directory of Open Access Journals (Sweden)

    Hassan Hadi Saleh

    2017-03-01

    Full Text Available The security of data storage in the “cloud” is a big challenge because the data are kept within resources that may be accessed by particular machines. The management of these data and services may not be highly reliable. Therefore, the security of data is highly challenging. To increase the security of data in cloud data centers, we introduce a method that ensures data security in “cloud computing” by hiding data in color images, which is called steganography. The fundamental objective of this paper is to prevent “data access” by unauthorized or opponent users. This scheme stores data at data centers within the edges of color images and retrieves the data from them when wanted.

  8. Steganography anomaly detection using simple one-class classification

    Science.gov (United States)

    Rodriguez, Benjamin M.; Peterson, Gilbert L.; Agaian, Sos S.

    2007-04-01

    There are several security issues tied to multimedia when implementing the various applications in the cellular phone and wireless industry. One primary concern is the potential ease of implementing a steganography system. Traditionally, the only mechanism to embed information into a media file has been with a desktop computer. However, as the cellular phone and wireless industry matures, it becomes much simpler for the same techniques to be performed using a cell phone. In this paper, two methods are compared that classify cell phone images as either an anomaly or clean, where a clean image is one in which no alterations have been made and an anomalous image is one in which information has been hidden within the image. An image in which information has been hidden is known as a stego image. The main concern in detecting steganographic content with machine learning using cell phone images is in training specific embedding procedures to determine if the method has been used to generate a stego image. This leads to a possible flaw in the system when the learned model of stego is faced with a new stego method which doesn't match the existing model. The proposed solution to this problem is to develop systems that detect steganography as anomalies, making the embedding method irrelevant in detection. Two applicable classification methods for solving the anomaly detection of steganographic content problem are single class support vector machines (SVM) and Parzen-window. Empirical comparison of the two approaches shows that Parzen-window outperforms the single class SVM most likely due to the fact that Parzen-window generalizes less.
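    The idea of detecting stego images as anomalies with a Parzen-window model trained only on clean images can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the 2-D feature vectors, the Gaussian kernel, the bandwidth `h`, and the leave-one-out threshold rule are all hypothetical choices.

```python
import math


def parzen_density(x, train, h):
    """Average Gaussian kernel of bandwidth h centred on each training vector."""
    d = len(x)
    norm = (2.0 * math.pi * h * h) ** (d / 2.0)
    total = 0.0
    for t in train:
        sq = sum((a - b) ** 2 for a, b in zip(x, t))
        total += math.exp(-sq / (2.0 * h * h))
    return total / (len(train) * norm)


def fit_threshold(train, h, reject_fraction=0.1):
    """Choose a density threshold from leave-one-out scores of the clean set,
    so that roughly reject_fraction of clean samples would fall below it."""
    scores = []
    for i, x in enumerate(train):
        others = train[:i] + train[i + 1:]
        scores.append(parzen_density(x, others, h))
    scores.sort()
    return scores[int(reject_fraction * len(scores))]


def is_anomaly(x, train, h, threshold):
    """Flag x as anomalous (possible stego) when its density is below threshold."""
    return parzen_density(x, train, h) < threshold


# Hypothetical feature vectors extracted from clean images only.
clean = [(0.1 * i, 0.1 * j) for i in range(-2, 3) for j in range(-2, 3)]
h = 0.2
threshold = fit_threshold(clean, h)

print(is_anomaly((0.0, 0.0), clean, h, threshold))  # inside the clean cluster
print(is_anomaly((3.0, 3.0), clean, h, threshold))  # far from the clean cluster
```

    Because only clean samples are modelled, no stego examples from any particular embedding method are needed at training time, which is the property the abstract emphasizes.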

  9. Diagonal queue medical image steganography with Rabin cryptosystem.

    Science.gov (United States)

    Jain, Mamta; Lenka, Saroj Kumar

    2016-03-01

    The main purpose of this work is to provide a novel and efficient method for image steganography in the biomedical field, so that highly confidential and sensitive patient data can be secured and, through the use of highly reliable algorithms, valuable brain information can be strongly protected from intruders. Patient information, such as medical records with personal identification information, can be secured both in storage and in transmission. This paper describes a novel methodology for hiding medical records like HIV reports, baby girl fetus, and patient identity information inside the patients' brain disease medical image files (scan or MRI images) using the notion of obscurity with respect to a diagonal queue least significant bit substitution. The queue data structure plays a dynamic role in resource sharing between multiple communicating parties and when secret medical data are transferred asynchronously (that is, not necessarily received at the same rate at which they were sent). The Rabin cryptosystem is used to encrypt the secret medical data, since it is computationally secure against a chosen-plaintext attack and relies on the difficulty of integer factoring. The output of the cryptosystem is organized into blocks and equally distributed sub-blocks. In the steganography process, the brain disease cover images are organized into blocks of diagonal queues. The secret cipher blocks and sub-blocks are assigned dynamically to selected diagonal queues for embedding. The receiver obtains four candidate plaintexts of medical data for each ciphertext, so only an authorized receiver can identify the correct medical data. Performance analysis was conducted using MSE, PSNR, and maximum embedding capacity, as well as histogram analysis between the brain disease stego and cover images.
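    The remark that one Rabin ciphertext corresponds to four plaintext values can be illustrated with a toy example. This is a sketch of the textbook Rabin scheme with tiny primes chosen for readability (p = 7, q = 11, both congruent to 3 modulo 4), not the parameters or code of the record; `pow(x, -1, m)` requires Python 3.8 or later.

```python
def rabin_encrypt(m, n):
    """Textbook Rabin encryption: square the message modulo n."""
    return (m * m) % n


def rabin_decrypt(c, p, q):
    """Return the four square roots of c modulo n = p*q.

    Assumes p and q are primes congruent to 3 modulo 4, so a square root
    modulo each prime is c**((prime + 1) // 4).
    """
    n = p * q
    mp = pow(c, (p + 1) // 4, p)
    mq = pow(c, (q + 1) // 4, q)
    q_inv = pow(q, -1, p)  # inverse of q modulo p
    p_inv = pow(p, -1, q)  # inverse of p modulo q
    roots = set()
    for sp in (mp, p - mp):
        for sq in (mq, q - mq):
            # Chinese remainder theorem: combine a root mod p with a root mod q.
            roots.add((sp * q * q_inv + sq * p * p_inv) % n)
    return sorted(roots)


p, q = 7, 11           # toy primes, far too small for real use
n = p * q
message = 20
cipher = rabin_encrypt(message, n)
candidates = rabin_decrypt(cipher, p, q)
print(cipher, candidates)  # 15 [13, 20, 57, 64]
```

    The receiver needs some extra convention, such as redundancy added to the plaintext, to pick the intended value among the four candidates.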

  10. DAFX Digital Audio Effects

    CERN Document Server

    2011-01-01

    The rapid development in various fields of Digital Audio Effects, or DAFX, has led to new algorithms and this second edition of the popular book, DAFX: Digital Audio Effects has been updated throughout to reflect progress in the field. It maintains a unique approach to DAFX with a lecture-style introduction into the basics of effect processing. Each effect description begins with the presentation of the physical and acoustical phenomena, an explanation of the signal processing techniques to achieve the effect, followed by a discussion of musical applications and the control of effect parameter

  11. 3D Audio System

    Science.gov (United States)

    1992-01-01

    Ames Research Center research into virtual reality led to the development of the Convolvotron, a high speed digital audio processing system that delivers three-dimensional sound over headphones. It consists of a two-card set designed for use with a personal computer. The Convolvotron's primary application is presentation of 3D audio signals over headphones. Four independent sound sources are filtered with large time-varying filters that compensate for motion. The perceived location of the sound remains constant. Possible applications are in air traffic control towers or airplane cockpits, hearing and perception research and virtual reality development.

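  12. EFEKTIVITAS MODEL PROBLEM BASED LEARNING BERBANTUAN MEDIA AUDIO VISUAL DITINJAU DARI HASIL BELAJAR IPA SISWA KELAS 5 SDN 1 GADU SAMBONG - BLORA SEMESTER 2 TAHUN 2014/2015 [Effectiveness of the Problem Based Learning Model Assisted by Audio-Visual Media in Terms of the Science Learning Outcomes of Grade 5 Students at SDN 1 Gadu Sambong - Blora, Semester 2, 2014/2015]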

    Directory of Open Access Journals (Sweden)

    Andhini Virgiana

    2016-05-01

    Full Text Available The aim of this study is to determine the difference in learning outcomes between the problem based learning model assisted by audio-visual media and the think pair share learning model assisted by visual media in science (IPA) lessons for grade 5 students at SDN 1 Gadu Sambong, Blora Regency, in semester 2 of the 2014/2015 school year. The study is a quasi-experiment with a nonequivalent control group design. The subjects were the grade 5 students of SDN 1 Gadu and the grade 5 students of SDN 2 Gagakan. Data were collected through tests and observation. The data analysis techniques used were descriptive statistics, parametric statistics, and a t-test with an independent sample t-test at the 5% significance level (α = 0.05). Based on the results and discussion, it can be concluded that there is a difference in effectiveness between the problem based learning model assisted by audio-visual media and the think pair share learning model assisted by visual media with respect to the science learning outcomes of grade 5 students at SDN 1 Gadu, Sambong District, Blora Regency, in semester 2 of 2014/2015. This is shown by the t-test result of 3.603 > 1.999 and a significance of 0.001 [...] control class mean, namely 87.0588 > 80.2000.

  13. Audio-visual perception of 3D cinematography: an fMRI study using condition-based and computation-based analyses.

    Science.gov (United States)

    Ogawa, Akitoshi; Bordier, Cecile; Macaluso, Emiliano

    2013-01-01

    The use of naturalistic stimuli to probe sensory functions in the human brain is gaining increasing interest. Previous imaging studies examined brain activity associated with the processing of cinematographic material using both standard "condition-based" designs, as well as "computational" methods based on the extraction of time-varying features of the stimuli (e.g. motion). Here, we exploited both approaches to investigate the neural correlates of complex visual and auditory spatial signals in cinematography. In the first experiment, the participants watched a piece of a commercial movie presented in four blocked conditions: 3D vision with surround sounds (3D-Surround), 3D with monaural sound (3D-Mono), 2D-Surround, and 2D-Mono. In the second experiment, they watched two different segments of the movie both presented continuously in 3D-Surround. The blocked presentation served for standard condition-based analyses, while all datasets were submitted to computation-based analyses. The latter assessed where activity co-varied with visual disparity signals and the complexity of auditory multi-sources signals. The blocked analyses associated 3D viewing with the activation of the dorsal and lateral occipital cortex and superior parietal lobule, while the surround sounds activated the superior and middle temporal gyri (S/MTG). The computation-based analyses revealed the effects of absolute disparity in dorsal occipital and posterior parietal cortices and of disparity gradients in the posterior middle temporal gyrus plus the inferior frontal gyrus. The complexity of the surround sounds was associated with activity in specific sub-regions of S/MTG, even after accounting for changes of sound intensity. These results demonstrate that the processing of naturalistic audio-visual signals entails an extensive set of visual and auditory areas, and that computation-based analyses can track the contribution of complex spatial aspects characterizing such life-like stimuli.

  14. Audio-visual perception of 3D cinematography: an fMRI study using condition-based and computation-based analyses.

    Directory of Open Access Journals (Sweden)

    Akitoshi Ogawa

    Full Text Available The use of naturalistic stimuli to probe sensory functions in the human brain is gaining increasing interest. Previous imaging studies examined brain activity associated with the processing of cinematographic material using both standard "condition-based" designs, as well as "computational" methods based on the extraction of time-varying features of the stimuli (e.g. motion). Here, we exploited both approaches to investigate the neural correlates of complex visual and auditory spatial signals in cinematography. In the first experiment, the participants watched a piece of a commercial movie presented in four blocked conditions: 3D vision with surround sounds (3D-Surround), 3D with monaural sound (3D-Mono), 2D-Surround, and 2D-Mono. In the second experiment, they watched two different segments of the movie both presented continuously in 3D-Surround. The blocked presentation served for standard condition-based analyses, while all datasets were submitted to computation-based analyses. The latter assessed where activity co-varied with visual disparity signals and the complexity of auditory multi-sources signals. The blocked analyses associated 3D viewing with the activation of the dorsal and lateral occipital cortex and superior parietal lobule, while the surround sounds activated the superior and middle temporal gyri (S/MTG). The computation-based analyses revealed the effects of absolute disparity in dorsal occipital and posterior parietal cortices and of disparity gradients in the posterior middle temporal gyrus plus the inferior frontal gyrus. The complexity of the surround sounds was associated with activity in specific sub-regions of S/MTG, even after accounting for changes of sound intensity. These results demonstrate that the processing of naturalistic audio-visual signals entails an extensive set of visual and auditory areas, and that computation-based analyses can track the contribution of complex spatial aspects characterizing such life

  15. PENGGUNAAN MEDIA AUDIO DALAM PEMBELAJARAN STENOGRAFI (The Use of Audio Media in Stenography Learning)

    Directory of Open Access Journals (Sweden)

    S Martono

    2011-06-01

    Full Text Available The objective of this study is to determine the effectiveness of using audio media in learning stenography typing. The research population was 30 students divided into two groups, an experimental group and a control group, of 15 students each. Based on their initial scores in the stenography subject, the two groups had the same ability, but they were given different treatments: the experimental group was taught with audio media, whereas the control group was not. Data were collected using documentation and experimental techniques. The instrument was a stenography speed-typing test. The final results showed that using audio media was more effective and improved study results more than in the control group. This result is expected to encourage stenography teachers to apply audio media in learning, and to show students that stenography is not a memorization subject but a skill subject that must be trained by attending lessons. Thus, people can use stenography typing to record any talk. Keywords: Learning, Audio Media, Stenography

  16. PENGGUNAAN MEDIA AUDIO DALAM PEMBELAJARAN STENOGRAFI (The Use of Audio Media in Stenography Learning)

    Directory of Open Access Journals (Sweden)

    S Martono

    2007-06-01

    Full Text Available The objective of this study is to determine the effectiveness of using audio media in learning stenography typing. The research population was 30 students divided into two groups, an experimental group and a control group, of 15 students each. Based on their initial scores in the stenography subject, the two groups had the same ability, but they were given different treatments: the experimental group was taught with audio media, whereas the control group was not. Data were collected using documentation and experimental techniques. The instrument was a stenography speed-typing test. The final results showed that using audio media was more effective and improved study results more than in the control group. This result is expected to encourage stenography teachers to apply audio media in learning, and to show students that stenography is not a memorization subject but a skill subject that must be trained by attending lessons. Thus, people can use stenography typing to record any talk. Keywords: Learning, Audio Media, Stenography

  17. Audio Feedback -- Better Feedback?

    Science.gov (United States)

    Voelkel, Susanne; Mello, Luciane V.

    2014-01-01

    National Student Survey (NSS) results show that many students are dissatisfied with the amount and quality of feedback they get for their work. This study reports on two case studies in which we tried to address these issues by introducing audio feedback to one undergraduate (UG) and one postgraduate (PG) class, respectively. In case study one…

  18. Circuit Bodging : Audio Multiplexer

    NARCIS (Netherlands)

    Roeling, E.; Allen, B.

    2010-01-01

    Audio amplifiers usually come with a single, glaring design flaw: Not enough auxiliary inputs. Not only that, but you’re usually required to press a button to switch between the amplifier’s limited number of inputs. This is unacceptable - we have better things to do than change input channels! In

  19. Embedded Audio Without Beeps

    DEFF Research Database (Denmark)

    Overholt, Daniel; Møbius, Nikolaj Friis

    2014-01-01

    software environments for audio processing) via innovative interfaces that send real-time inputs to such software running on a laptop, mobile device, or small Linux board (e.g., Raspberry Pi or Beagleboard). Basic hardware will be provided, but participants are also encouraged to bring related equipment...

  20. The audio expert everything you need to know about audio

    CERN Document Server

    Winer, Ethan

    2012-01-01

    The Audio Expert is a comprehensive reference that covers all aspects of audio, with many practical, as well as theoretical, explanations. Providing in-depth descriptions of how audio really works, using common sense plain-English explanations and mechanical analogies with minimal math, the book is written for people who want to understand audio at the deepest, most technical level, without needing an engineering degree. It's presented in an easy-to-read, conversational tone, and includes more than 400 figures and photos augmenting the text. The Audio Expert takes th

  1. Efectos digitales de audio con Web Audio API (Digital Audio Effects with the Web Audio API)

    OpenAIRE

    GARCÍA CHAPARRO, SAMUEL

    2015-01-01

    This work is a study of the capability of the Web Audio API for processing audio effects in real time. Of all the possible audio effects, the wah-wah, the flanger and the chorus were chosen, effects widely used with the electric guitar. JavaScript functions are created that model the behavior of the chosen audio effects and run on an HTML5 web platform. García Chaparro, S. (2015). Efectos digitales de audio con W...

  2. ENERGY STAR Certified Audio Video

    Science.gov (United States)

    Certified models meet all ENERGY STAR requirements as listed in the Version 3.0 ENERGY STAR Program Requirements for Audio Video Equipment that are effective as of May 1, 2013. A detailed listing of key efficiency criteria is available at http://www.energystar.gov/index.cfm?c=audio_dvd.pr_crit_audio_dvd

  3. Analysis of the soundscape in an intensive care unit based on the annotation of an audio recording.

    Science.gov (United States)

    Park, Munhum; Kohlrausch, Armin; de Bruijn, Werner; de Jager, Peter; Simons, Koen

    2014-04-01

    The acoustic environments in hospitals, particularly in intensive care units (ICUs), are characterized by frequent high-level sound events which may negatively affect patient outcome. Many studies performed acoustic surveys, but the measurement protocol was not always reported in detail, and the scope of analysis was limited by the selected mode of sound level meters. Fewer studies systematically investigated the noise sources in ICUs by employing an observer in the patient room, which may potentially bias the measurement. In the current study, the soundscape of an ICU was evaluated where acoustic parameters were extracted from a ∼67-h audio recording, and a selected 24-h recording was annotated off-line for a source-specific analysis. The results showed that the patient-involved noise accounted for 31% of the acoustic energy and 11% of the predicted loudness peaks (PLPs). Excluding the patient-involved noise, the remaining acoustic energy was attributed to staff members (57%), alarms (30%), and the operational noise of life-supporting devices (13%). Furthermore, the contribution of each noise category to the PLPs was found to be more uneven: Staff (92%), alarms (6%), and device noise (2%). The current study suggests that most of the noise sources in ICUs may be associated with modifiable human factors.

  4. Audio scene segmentation for video with generic content

    Science.gov (United States)

    Niu, Feng; Goela, Naveen; Divakaran, Ajay; Abdel-Mottaleb, Mohamed

    2008-01-01

    In this paper, we present a content-adaptive audio texture based method to segment video into audio scenes. The audio scene is modeled as a semantically consistent chunk of audio data. Our algorithm is based on "semantic audio texture analysis." At first, we train GMM models for basic audio classes such as speech, music, etc. Then we define the semantic audio texture based on those classes. We study and present two types of scene changes, those corresponding to an overall audio texture change and those corresponding to a special "transition marker" used by the content creator, such as a short stretch of music in a sitcom or silence in dramatic content. Unlike prior work using genre specific heuristics, such as some methods presented for detecting commercials, we adaptively find out if such special transition markers are being used and if so, which of the base classes are being used as markers without any prior knowledge about the content. Our experimental results show that our proposed audio scene segmentation works well across a wide variety of broadcast content genres.

  5. Developing a Consensus-Driven, Core Competency Model to Shape Future Audio Engineering Technology Curriculum: A Web-Based Modified Delphi Study

    Science.gov (United States)

    Tough, David T.

    2009-01-01

    The purpose of this online study was to create a ranking of essential core competencies and technologies required by AET (audio engineering technology) programs 10 years in the future. The study was designed to facilitate curriculum development and improvement in the rapidly expanding number of small to medium sized audio engineering technology…

  6. Demonstration of optical steganography transmission using temporal phase coded optical signals with spectral notch filtering.

    Science.gov (United States)

    Hong, Xuezhi; Wang, Dawei; Xu, Lei; He, Sailing

    2010-06-07

    A novel approach is proposed and experimentally demonstrated for optical steganography transmission in WDM networks using temporal phase coded optical signals with spectral notch filtering. A temporal phase coded stealth channel is temporally and spectrally overlaid onto a public WDM channel. Direct detection of the public channel is achieved in the presence of the stealth channel. The interference from the public channel is suppressed by spectral notching before the detection of the optical stealth signal. The approach is shown to have good compatibility and robustness to the existing WDM network for optical steganography transmission.

  7. A secure approach for encrypting and compressing biometric information employing orthogonal code and steganography

    Science.gov (United States)

    Islam, Muhammad F.; Islam, Mohammed N.

    2012-04-01

    The objective of this paper is to develop a novel approach for encryption and compression of biometric information utilizing orthogonal coding and steganography techniques. Multiple biometric signatures are encrypted individually using orthogonal codes and then multiplexed together to form a single image, which is then embedded in a cover image using the proposed steganography technique. The proposed technique employs three least significant bits for this purpose and a secret key is developed to choose one from among these bits to be replaced by the corresponding bit of the biometric image. The proposed technique offers secure transmission of multiple biometric signatures in an identification document which will be protected from unauthorized steganalysis attempt.
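
    The key-driven choice among the three least significant bits described above can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not say how the secret key selects the bit, so the SHA-256 based selector, the function names and the sample pixel values below are all assumptions made for illustration.

```python
import hashlib


def _bit_position(key: bytes, index: int) -> int:
    """Pick one of the three least significant bit positions (0, 1 or 2).

    Assumption: the position is derived from a hash of the key and the
    pixel index; the paper's actual key schedule is not given in the abstract.
    """
    digest = hashlib.sha256(key + index.to_bytes(4, "big")).digest()
    return digest[0] % 3


def embed(cover: list[int], bits: list[int], key: bytes) -> list[int]:
    """Replace one key-selected low bit of each 8-bit cover pixel with a message bit."""
    if len(bits) > len(cover):
        raise ValueError("message is longer than the cover")
    stego = list(cover)
    for i, bit in enumerate(bits):
        pos = _bit_position(key, i)
        # Clear the selected bit, then write the message bit into it.
        stego[i] = (stego[i] & ~(1 << pos) & 0xFF) | (bit << pos)
    return stego


def extract(stego: list[int], n_bits: int, key: bytes) -> list[int]:
    """Read the message back using the same key-selected bit positions."""
    return [(stego[i] >> _bit_position(key, i)) & 1 for i in range(n_bits)]


key = b"secret"
cover = [200, 13, 77, 255, 0, 128, 64, 31]   # hypothetical 8-bit pixel values
message_bits = [1, 0, 1, 1, 0, 0, 1, 0]
stego = embed(cover, message_bits, key)
recovered = extract(stego, len(message_bits), key)
```

    Because only bits 0 to 2 are ever touched, each pixel changes by at most 4 grey levels, and a reader without the key does not know which of the three bits carries the payload.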

  8. Hiding phase-quantized biometrics: a case of steganography for reduced-complexity correlation filter classifiers

    Science.gov (United States)

    Hennings, Pablo; Savvides, Marios; Vijaya Kumar, B. V. K.

    2005-03-01

    This paper introduces an application of steganography for hiding cancelable biometric data based on quad-phase correlation filter classification. The proposed technique can perform two tasks: (1) embed an encrypted (cancelable) template for biometric recognition into a host image or (2) embed the biometric data required for remote (or later) classification, such as embedding a transformed face image into the host image, so that it can be transmitted for remote authentication or stored for later use. The novel approach is that we will encode quantized Fourier domain information of the template (or biometric) in the spatial representation of the host image. More importantly we show that we only need 2 bits per pixel in the frequency domain to represent the filter and biometric, making it compact and ideal for application of data hiding. To preserve the template (or biometric) from vulnerabilities to successful attacks, we encrypt the filter or biometric image by convolving it with a random kernel which essentially produces an image in the spatial domain which looks like white noise, so essentially both the frequency and spatial representations will have no visible exploitable structure. We also present results on reduced complexity correlation filter classification performance when using biometric images recovered from stego-images.

  9. Audio stream classification for multimedia database search

    Science.gov (United States)

    Artese, M.; Bianco, S.; Gagliardi, I.; Gasparini, F.

    2013-03-01

    Search and retrieval of huge archives of Multimedia data is a challenging task. A classification step is often used to reduce the number of entries on which to perform the subsequent search. In particular, when new entries of the database are continuously added, a fast classification based on simple threshold evaluation is desirable. In this work we present a CART-based (Classification And Regression Tree [1]) classification framework for audio streams belonging to multimedia databases. The database considered is the Archive of Ethnography and Social History (AESS) [2], which is mainly composed of popular songs and other audio records describing the popular traditions handed down generation by generation, such as traditional fairs, and customs. The peculiarities of this database are that it is continuously updated; the audio recordings are acquired in an unconstrained environment; and it is difficult for non-expert human users to create the ground truth labels. In our experiments, half of all the available audio files have been randomly extracted and used as training set. The remaining ones have been used as test set. The classifier has been trained to distinguish among three different classes: speech, music, and song. All the audio files in the dataset have been previously manually labeled into the three classes defined above by domain experts.

  10. Enhancing Navigation Skills through Audio Gaming.

    Science.gov (United States)

    Sánchez, Jaime; Sáenz, Mauricio; Pascual-Leone, Alvaro; Merabet, Lotfi

    2010-01-01

    We present the design, development and initial cognitive evaluation of an Audio-based Environment Simulator (AbES). This software allows a blind user to navigate through a virtual representation of a real space for the purposes of training orientation and mobility skills. Our findings indicate that users feel satisfied and self-confident when interacting with the audio-based interface, and the embedded sounds allow them to correctly orient themselves and navigate within the virtual world. Furthermore, users are able to transfer spatial information acquired through virtual interactions into real world navigation and problem solving tasks.

  11. Enhancing Navigation Skills through Audio Gaming

    Science.gov (United States)

    Sánchez, Jaime; Sáenz, Mauricio; Pascual-Leone, Alvaro; Merabet, Lotfi

    2014-01-01

    We present the design, development and initial cognitive evaluation of an Audio-based Environment Simulator (AbES). This software allows a blind user to navigate through a virtual representation of a real space for the purposes of training orientation and mobility skills. Our findings indicate that users feel satisfied and self-confident when interacting with the audio-based interface, and the embedded sounds allow them to correctly orient themselves and navigate within the virtual world. Furthermore, users are able to transfer spatial information acquired through virtual interactions into real world navigation and problem solving tasks. PMID:25505796

  12. Effective ASCII-HEX steganography for secure cloud

    International Nuclear Information System (INIS)

    Afghan, S.

    2015-01-01

    There are many reasons for the popularity of cloud computing; some of the most important are backup and recovery, cost effectiveness, nearly limitless storage, automatic software integration, and easy access to information. A pay-as-you-go model is followed to provide everything as a service. Data is secured using the standard security policies available at the cloud end. In spite of these benefits, cloud computing also has some security issues. Both provider and customer have to provide and collect data in a secure manner. Both of these issues, plus the efficient transmission of data over the cloud, are critical and need to be resolved. Security is needed while sensitive data that can be processed or stored by the customer travels over the network. Security for the customer's data at the provider end can be provided using current security algorithms, which are not known to the customer. There is a reliability problem due to the existence of multiple boundaries in cloud resource access. ASCII and HEX security with steganography is used to propose an algorithm that stores the encrypted data (cipher text) in an image file, which is then sent to the cloud end. This is done using the CDM (Common Deployment Model). In future, an algorithm should be proposed and implemented for the security of virtual images in cloud computing. (author)

  13. The Impact of Hard Disk Firmware Steganography on Computer Forensics

    Directory of Open Access Journals (Sweden)

    Iain Sutherland

    2009-06-01

    Full Text Available The hard disk drive is probably the predominant form of storage media and is a primary data source in a forensic investigation. The majority of available software tools and literature relating to the investigation of the structure and content contained within a hard disk drive concerns the extraction and analysis of evidence from the various file systems which can reside in the user accessible area of the disk. It is known that there are other areas of the hard disk drive which could be used to conceal information, such as the Host Protected Area and the Device Configuration Overlay. There are recommended methods for the detection and forensic analysis of these areas using appropriate tools and techniques. However, there are additional areas of a disk that have currently been overlooked. The Service Area or Platter Resident Firmware Area is used to store code and control structures responsible for the functionality of the drive and for logging failing or failed sectors. This paper provides an introduction to initial research into the investigation and identification of issues relating to the analysis of the Platter Resident Firmware Area. In particular, it considers the possibility that the Platter Resident Firmware Area could be manipulated and exploited to facilitate a form of steganography, enabling information to be concealed by a user and potentially from a digital forensic investigator.

  14. Audio Signal Decoder, Method for Decoding an Audio Signal and Computer Program Using Cascaded Audio Object Processing Stages

    OpenAIRE

    Hellmuth, O.; Falch, C.; Herre, J.; Hilpert, J.; Ridderbusch, F.; Terentiev, L.

    2010-01-01

    An audio signal decoder for providing an upmix signal representation in dependence on a downmix signal representation and an object-related parametric information comprises an object separator configured to decompose the downmix signal representation, to provide a first audio information describing a first set of one or more audio objects of a first audio object type and a second audio information describing a second set of one or more audio objects of a second audio object type, in dependenc...

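  15. Klasifikasi Bit-Plane Noise untuk Penyisipan Pesan pada Teknik Steganography BPCS Menggunakan Fuzzy Inference Sistem Mamdani (Bit-Plane Noise Classification for Message Embedding in the BPCS Steganography Technique Using a Mamdani Fuzzy Inference System)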

    Directory of Open Access Journals (Sweden)

    Rahmad Hidayat

    2015-04-01

    Full Text Available Bit-Plane Complexity Segmentation (BPCS) is a fairly new steganography technique. The most important process in BPCS is the calculation of the complexity value of a bit-plane. The bit-plane complexity is calculated by counting the bit changes contained in a bit-plane. If a bit-plane has a high complexity, it is categorized as a noise bit-plane that does not contain valuable information about the image. Classifying bit-planes with a crisp set (noise/not noise) is not fair, because a small difference in the value significantly changes the status of the bit-plane. The purpose of this study is to apply the principles of fuzzy sets to classify bit-planes into three sets: informative, partly informative, and noise regions. Classifying bit-planes into fuzzy sets is expected to be more objective, and ultimately the message capacity of the images can be improved by using Mamdani fuzzy inference to decide which bit-planes will be replaced with a message, based on the bit-plane classification and the size of the message to be inserted. This research increases the capability of the BPCS steganography technique to insert a message into bit-planes more precisely, so that the quality of the container image is better. The PSNR values of the original image and the stego-image are only slightly different.
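
    The complexity measure mentioned above, the number of bit changes in a bit-plane, can be sketched as follows. This is a minimal illustration of the crisp (noise/not noise) baseline that the study sets out to replace with fuzzy sets, not the study's Mamdani classifier; the normalisation by the maximum possible number of changes and the 0.3 threshold are assumptions commonly seen in BPCS descriptions, and the function names are hypothetical.

```python
def bit_plane(block: list[list[int]], k: int) -> list[list[int]]:
    """Extract bit-plane k (0 = least significant) from a block of 8-bit pixels."""
    return [[(pixel >> k) & 1 for pixel in row] for row in block]


def complexity(plane: list[list[int]]) -> float:
    """Count 0/1 changes between horizontal and vertical neighbours.

    The count is divided by the largest possible number of changes, so the
    result lies between 0.0 (flat plane) and 1.0 (checkerboard).
    """
    rows, cols = len(plane), len(plane[0])
    changes = 0
    for r in range(rows):
        for c in range(cols):
            if c + 1 < cols and plane[r][c] != plane[r][c + 1]:
                changes += 1
            if r + 1 < rows and plane[r][c] != plane[r + 1][c]:
                changes += 1
    max_changes = rows * (cols - 1) + cols * (rows - 1)
    return changes / max_changes


def is_noise(plane: list[list[int]], threshold: float = 0.3) -> bool:
    """Crisp classification; the threshold value is an assumption."""
    return complexity(plane) >= threshold


flat = [[0] * 8 for _ in range(8)]
checkerboard = [[(r + c) % 2 for c in range(8)] for r in range(8)]
flat_complexity = complexity(flat)
checker_complexity = complexity(checkerboard)
```

    A plane whose complexity sits just below the threshold is treated completely differently from one just above it, which is the unfairness of the crisp set that the abstract points out.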

  16. Small signal audio design

    CERN Document Server

    Self, Douglas

    2014-01-01

    Learn to use inexpensive and readily available parts to obtain state-of-the-art performance in all the vital parameters of noise, distortion, crosstalk and so on. With ample coverage of preamplifiers and mixers and a new chapter on headphone amplifiers, this practical handbook provides an extensive repertoire of circuits that can be put together to make almost any type of audio system. A resource packed full of valuable information, with virtually every page revealing nuggets of specialized knowledge not found elsewhere. Essential points of theory that bear on practical performance are lucidly

  17. Mixxing Audio Menggunakan FL Studio (Audio Mixing Using FL Studio)

    OpenAIRE

    Prawira, Yanheri

    2011-01-01

    This study aims to simplify the audio mixing process and reduce its cost by using only a laptop or computer running Windows 7 as the main medium, together with the applications FL Studio 9 and ASIO4ALL, without any additional equipment. The purpose of building this system is to make DJ audio mixing easier on a laptop or computer without great expense. 082406014

  18. Parametric Coding of Stereo Audio

    Directory of Open Access Journals (Sweden)

    Erik Schuijers

    2005-06-01

    Full Text Available Parametric-stereo coding is a technique to efficiently code a stereo audio signal as a monaural signal plus a small amount of parametric overhead to describe the stereo image. The stereo properties are analyzed, encoded, and reinstated in a decoder according to spatial psychoacoustical principles. The monaural signal can be encoded using any (conventional) audio coder. Experiments show that the parameterized description of spatial properties enables a highly efficient, high-quality stereo audio representation.
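
    The idea of a monaural signal plus a few stereo parameters can be sketched as below. This is a deliberately simplified illustration, not the coder described in the paper: it assumes a single broadband analysis over the whole signal, assumes the parameters are an inter-channel level difference and a correlation value (the abstract does not name them), and the decoder here restores only the level difference.

```python
import math


def analyze(left: list[float], right: list[float]):
    """Return a mono downmix plus two assumed stereo parameters."""
    mono = [(l + r) / 2 for l, r in zip(left, right)]
    energy_l = sum(l * l for l in left)
    energy_r = sum(r * r for r in right)
    cross = sum(l * r for l, r in zip(left, right))
    eps = 1e-12  # avoids division by zero and log of zero on silent input
    level_diff_db = 10 * math.log10((energy_l + eps) / (energy_r + eps))
    correlation = cross / math.sqrt((energy_l + eps) * (energy_r + eps))
    return mono, level_diff_db, correlation


def synthesize(mono: list[float], level_diff_db: float):
    """Rebuild two channels from the mono signal and the level difference.

    The gains satisfy gain_l / gain_r = amplitude ratio and
    gain_l + gain_r = 2, so the average of the outputs equals the mono input.
    """
    ratio = 10 ** (level_diff_db / 20)
    gain_r = 2 / (1 + ratio)
    gain_l = ratio * gain_r
    return [gain_l * m for m in mono], [gain_r * m for m in mono]


left = [0.5, -0.5, 0.25, -0.25]
right = [0.25, -0.25, 0.125, -0.125]   # same signal at half the amplitude
mono, ild_db, icc = analyze(left, right)
left_out, right_out = synthesize(mono, ild_db)
```

    For this toy input the two channels differ only in level, so one parameter is enough to rebuild them; a practical coder would estimate its parameters per frequency band and time frame and would also use the correlation value when re-creating the stereo image.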

  19. Audio Signal Quantization Companding Laws Comparative Analysis

    Directory of Open Access Journals (Sweden)

    Aleksei A. Matskaniuk

    2012-05-01

    Full Text Available We describe the results of research on the effectiveness of quantization scales for audio playback: the scale that is optimal in the sense of minimum error variance (Lloyd-Max algorithm), and scales based on A-law and Mu-law companding.
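
    As an illustration of one of the companding laws compared above, the following minimal sketch applies the standard continuous mu-law curve before a uniform quantizer. The choice of mu = 255, the 8-bit mid-rise quantizer and the function names are assumptions for the example; the A-law and Lloyd-Max scales from the abstract are not shown.

```python
import math

MU = 255  # common value for 8-bit mu-law; assumed here


def mu_compress(x: float, mu: int = MU) -> float:
    """Map x in [-1, 1] onto the companded scale: sgn(x) ln(1 + mu|x|) / ln(1 + mu)."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)


def mu_expand(y: float, mu: int = MU) -> float:
    """Inverse of mu_compress."""
    return math.copysign(((1 + mu) ** abs(y) - 1) / mu, y)


def quantize(x: float, bits: int, companded: bool = True) -> float:
    """Uniform mid-rise quantizer on [-1, 1], optionally wrapped in mu-law companding."""
    levels = 2 ** bits
    step = 2 / levels
    y = mu_compress(x) if companded else x
    index = min(levels - 1, int((y + 1) / step))
    y_hat = -1 + (index + 0.5) * step
    return mu_expand(y_hat) if companded else y_hat


quiet_sample = 0.01
error_uniform = abs(quantize(quiet_sample, 8, companded=False) - quiet_sample)
error_mu_law = abs(quantize(quiet_sample, 8, companded=True) - quiet_sample)
```

    For a quiet sample such as 0.01 the companded quantizer gives a smaller error than the plain uniform one, because the logarithmic curve spends more of the available levels on low amplitudes.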

  20. CERN automatic audio-conference service

    Science.gov (United States)

    Sierra Moral, Rodrigo

    2010-04-01

    Scientists from all over the world need to collaborate with CERN on a daily basis. They must be able to communicate effectively on their joint projects at any time; as a result telephone conferences have become indispensable and widely used. Managed by 6 operators, CERN already has more than 20000 hours and 5700 audio-conferences per year. However, the traditional telephone based audio-conference system needed to be modernized in three ways. Firstly, to provide the participants with more autonomy in the organization of their conferences; secondly, to eliminate the constraints of manual intervention by operators; and thirdly, to integrate the audio-conferences into a collaborative working framework. The large number, and hence cost, of the conferences prohibited externalization and so the CERN telecommunications team drew up a specification to implement a new system. It was decided to use a new commercial collaborative audio-conference solution based on the SIP protocol. The system was tested as the first European pilot and several improvements (such as billing, security, redundancy...) were implemented based on CERN's recommendations. The new automatic conference system has been operational since the second half of 2006. It is very popular with users and has doubled the number of conferences in the past two years.

  1. CERN automatic audio-conference service

    International Nuclear Information System (INIS)

    Sierra Moral, Rodrigo

    2010-01-01

    Scientists from all over the world need to collaborate with CERN on a daily basis. They must be able to communicate effectively on their joint projects at any time; as a result telephone conferences have become indispensable and widely used. Managed by 6 operators, CERN already has more than 20000 hours and 5700 audio-conferences per year. However, the traditional telephone based audio-conference system needed to be modernized in three ways. Firstly, to provide the participants with more autonomy in the organization of their conferences; secondly, to eliminate the constraints of manual intervention by operators; and thirdly, to integrate the audio-conferences into a collaborative working framework. The large number, and hence cost, of the conferences prohibited externalization and so the CERN telecommunications team drew up a specification to implement a new system. It was decided to use a new commercial collaborative audio-conference solution based on the SIP protocol. The system was tested as the first European pilot and several improvements (such as billing, security, redundancy...) were implemented based on CERN's recommendations. The new automatic conference system has been operational since the second half of 2006. It is very popular with users and has doubled the number of conferences in the past two years.

  2. The Lowdown on Audio Downloads

    Science.gov (United States)

    Farrell, Beth

    2010-01-01

    First offered to public libraries in 2004, downloadable audiobooks have grown by leaps and bounds. According to the Audio Publishers Association, their sales today account for 21% of the spoken-word audio market. It hasn't been easy, however. WMA. DRM. MP3. AAC. File extensions small on letters but very big on consequences for librarians,…

  3. Efficient audio power amplification - challenges

    Energy Technology Data Exchange (ETDEWEB)

    Andersen, Michael A.E.

    2005-07-01

    For more than a decade efficient audio power amplification has evolved, and today switch-mode audio power amplification in various forms is the state of the art. The technical steps that led to this evolution are described, and in addition many of the challenges still to be faced, and where extensive research and development are needed, are covered. (au)

  4. Efficient Audio Power Amplification - Challenges

    DEFF Research Database (Denmark)

    Andersen, Michael Andreas E.

    2005-01-01

    For more than a decade efficient audio power amplification has evolved, and today switch-mode audio power amplification in various forms is the state of the art. The technical steps that led to this evolution are described, and in addition many of the challenges still to be faced and where...... extensive research and development are needed are covered....

  5. Effect of using different cover image quality to obtain robust selective embedding in steganography

    Science.gov (United States)

    Abdullah, Karwan Asaad; Al-Jawad, Naseer; Abdulla, Alan Anwer

    2014-05-01

    One of the common types of steganography conceals an image as a secret message in another image, which is normally called a cover image; the resulting image is called a stego image. The aim of this paper is to investigate the effect of using cover images of different quality, and also to analyse the use of different bit-planes in terms of robustness against well-known active attacks such as gamma, statistical filters, and linear spatial filters. The secret messages are embedded in a higher bit-plane, i.e. one other than the Least Significant Bit (LSB), in order to resist active attacks. The embedding process is performed in three major steps. First, the embedding algorithm selectively identifies useful areas (blocks) for embedding based on their lighting condition. Second, it nominates the most useful blocks for embedding based on their entropy and average. Third, it selects the right bit-plane for embedding. This kind of block selection makes the embedding process scatter the secret message(s) randomly around the cover image. Different tests have been performed to select a proper block size, which is related to the nature of the cover image used. Our proposed method suggests a suitable embedding bit-plane as well as the right blocks for the embedding. Experimental results demonstrate that the image quality of the cover image has an effect when the stego image is subjected to different active attacks. Although the secret messages are embedded in a higher bit-plane, they cannot be recognised visually within the stego image.

  6. Class D audio amplifiers for high voltage capacitive transducers

    DEFF Research Database (Denmark)

    Nielsen, Dennis

    Audio reproduction systems contain two key components, the amplifier and the loudspeaker. In the last 20 – 30 years the technology of audio amplifiers has undergone a fundamental paradigm shift. Class D audio amplifiers have replaced the linear amplifiers, which suffer from the well-known issues...... of high volume, weight, and cost. Highly efficient class D amplifiers are now widely available, offering power densities that their linear counterparts cannot match. Unlike the technology of audio amplifiers, the loudspeaker is still based on the traditional electrodynamic transducer invented by C.W. Rice...... and E.W. Kellogg in 1925 [1]. The poor efficiency of the electrodynamic transducer remains a key issue, and a significant limit on the efficiency of complete audio reproduction systems. Also the geometric limits of the electrodynamic transducer impose significant limits on the design of loudspeakers...

  7. All About Audio Equalization: Solutions and Frontiers

    Directory of Open Access Journals (Sweden)

    Vesa Välimäki

    2016-05-01

    Full Text Available Audio equalization is a vast and active research area. The extent of research means that one often cannot identify the preferred technique for a particular problem. This review paper bridges those gaps, systemically providing a deep understanding of the problems and approaches in audio equalization, their relative merits and applications. Digital signal processing techniques for modifying the spectral balance in audio signals and applications of these techniques are reviewed, ranging from classic equalizers to emerging designs based on new advances in signal processing and machine learning. Emphasis is placed on putting the range of approaches within a common mathematical and conceptual framework. The application areas discussed herein are diverse, and include well-defined, solvable problems of filter design subject to constraints, as well as newly emerging challenges that touch on problems in semantics, perception and human computer interaction. Case studies are given in order to illustrate key concepts and how they are applied in practice. We also recommend preferred signal processing approaches for important audio equalization problems. Finally, we discuss current challenges and the uncharted frontiers in this field. The source code for methods discussed in this paper is made available at https://code.soundsoftware.ac.uk/projects/allaboutaudioeq.

  8. Species-specific audio detection: a comparison of three template-based detection algorithms using random forests

    Directory of Open Access Journals (Sweden)

    Carlos J. Corrada Bravo

    2017-04-01

    Full Text Available We developed a web-based cloud-hosted system that allows users to archive, listen to, visualize, and annotate recordings. The system also provides tools to convert these annotations into datasets that can be used to train a computer to detect the presence or absence of a species. The algorithm used by the system was selected after comparing the accuracy and efficiency of three variants of a template-based detection algorithm. The algorithm computes a similarity vector by comparing a template of a species call with time increments across the spectrogram. Statistical features are extracted from this vector and used as input for a Random Forest classifier that predicts presence or absence of the species in the recording. The fastest algorithm variant had the highest average accuracy and specificity; therefore, it was implemented in the ARBIMON web-based system.
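
    The similarity-vector step described in this abstract can be illustrated with a minimal sketch. The abstract does not say which similarity measure or which statistical features were used, so the Pearson correlation, the feature set, and all function names below are assumptions made for illustration. The spectrogram is modelled as a list of time frames, each a list of frequency-bin magnitudes.

```python
import math
from statistics import mean, pstdev


def _flatten(frames):
    """Concatenate a list of time frames into one flat list of magnitudes."""
    return [value for frame in frames for value in frame]


def _correlation(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mean_a, mean_b = mean(a), mean(b)
    dev_a = [x - mean_a for x in a]
    dev_b = [y - mean_b for y in b]
    denom = math.sqrt(sum(x * x for x in dev_a) * sum(y * y for y in dev_b))
    if denom == 0:
        return 0.0
    return sum(x * y for x, y in zip(dev_a, dev_b)) / denom


def similarity_vector(spectrogram, template):
    """Slide the template across the spectrogram one time frame at a time."""
    width = len(template)
    flat_template = _flatten(template)
    return [
        _correlation(_flatten(spectrogram[t:t + width]), flat_template)
        for t in range(len(spectrogram) - width + 1)
    ]


def summary_features(similarity):
    """Hypothetical statistical features that a classifier could take as input."""
    peak = max(similarity)
    return {
        "max": peak,
        "mean": mean(similarity),
        "std": pstdev(similarity),
        "argmax": similarity.index(peak),
    }


# Toy data: the template occurs exactly at time offset 2.
template = [[0, 1, 0], [1, 0, 1]]
spectrogram = [[0.1, 0, 0], [0, 0.1, 0], [0, 1, 0], [1, 0, 1], [0, 0, 0.1]]
features = summary_features(similarity_vector(spectrogram, template))
print(features)
```

    In a pipeline of the kind the abstract outlines, a feature dictionary like this would be computed per recording and passed to a Random Forest classifier; that training step is omitted here.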

  9. Audio-based, unsupervised machine learning reveals cyclic changes in earthquake mechanisms in the Geysers geothermal field, California

    Science.gov (United States)

    Holtzman, B. K.; Paté, A.; Paisley, J.; Waldhauser, F.; Repetto, D.; Boschi, L.

    2017-12-01

    The earthquake process reflects complex interactions of stress, fracture and frictional properties. New machine learning methods reveal patterns in time-dependent spectral properties of seismic signals and enable identification of changes in faulting processes. Our methods are based closely on those developed for music information retrieval and voice recognition, using the spectrogram instead of the waveform directly. Unsupervised learning involves identification of patterns based on differences among signals without any additional information provided to the algorithm. Clustering of 46,000 earthquakes of $0.3

  10. Efficient audio signal processing for embedded systems

    Science.gov (United States)

    Chiu, Leung Kin

    As mobile platforms continue to pack on more computational power, electronics manufacturers start to differentiate their products by enhancing the audio features. However, consumers also demand smaller devices that could operate for a longer time, hence imposing design constraints. In this research, we investigate two design strategies that would allow us to efficiently process audio signals on embedded systems such as mobile phones and portable electronics. In the first strategy, we exploit properties of the human auditory system to process audio signals. We designed a sound enhancement algorithm to make piezoelectric loudspeakers sound "richer" and "fuller." Piezoelectric speakers have a small form factor but exhibit poor response in the low-frequency region. In the algorithm, we combine psychoacoustic bass extension and dynamic range compression to improve the perceived bass coming out from the tiny speakers. We also developed an audio energy reduction algorithm for loudspeaker power management. The perceptually transparent algorithm extends the battery life of mobile devices and prevents thermal damage in speakers. This method is similar to audio compression algorithms, which encode audio signals in such a way that the compression artifacts are not easily perceivable. Instead of reducing the storage space, however, we suppress the audio contents that are below the hearing threshold, therefore reducing the signal energy. In the second strategy, we use low-power analog circuits to process the signal before digitizing it. We designed an analog front-end for sound detection and implemented it on a field programmable analog array (FPAA). The system is an example of an analog-to-information converter. The sound classifier front-end can be used in a wide range of applications because programmable floating-gate transistors are employed to store classifier weights. Moreover, we incorporated a feature selection algorithm to simplify the analog front-end. A machine

  11. A BIC Based Initial Training Set Selection Algorithm for Active Learning and Its Application in Audio Detection

    Directory of Open Access Journals (Sweden)

    Y. Leng

    2013-06-01

    Full Text Available To construct a classification system or a detection system, large amounts of labeled samples are needed. However, manual labeling is dull and time consuming, so researchers have proposed active learning. The initial training set selection is the first step of an active learning process, but there have been few studies on it so far. Most active learning algorithms adopt random sampling or algorithms like sampling by clustering (SBC) to select the initial training samples. But these two kinds of method lose their effectiveness in detecting events of small probability, because they sometimes select no samples, or too few samples, of the small probability events. To solve this problem, this paper proposes a BIC based initial training set selection algorithm. The BIC based algorithm first performs clustering on the whole training set. It then uses BIC to judge the status of the clusters. Finally, it adopts different selection strategies for clusters of different status. Experimental results on two real data sets show that, compared to random sampling and SBC, the proposed BIC based initial training set selection algorithm can efficiently solve the detection problem of small probability events. It also has obvious advantages in detecting events of non-small probability.

  12. Electrophysiological evidence for Audio-visuo-lingual speech integration.

    Science.gov (United States)

    Treille, Avril; Vilain, Coriandre; Schwartz, Jean-Luc; Hueber, Thomas; Sato, Marc

    2018-01-31

    Recent neurophysiological studies demonstrate that audio-visual speech integration partly operates through temporal expectations and speech-specific predictions. From these results, one common view is that the binding of auditory and visual, lipread, speech cues relies on their joint probability and prior associative audio-visual experience. The present EEG study examined whether visual tongue movements integrate with relevant speech sounds, despite little associative audio-visual experience between the two modalities. A second objective was to determine possible similarities and differences of audio-visual speech integration between unusual audio-visuo-lingual and classical audio-visuo-labial modalities. To this aim, participants were presented with auditory, visual, and audio-visual isolated syllables, with the visual presentation related to either a sagittal view of the tongue movements or a facial view of the lip movements of a speaker, with lingual and facial movements previously recorded by an ultrasound imaging system and a video camera. In line with previous EEG studies, our results revealed an amplitude decrease and a latency facilitation of P2 auditory evoked potentials in both audio-visuo-lingual and audio-visuo-labial conditions compared to the sum of unimodal conditions. These results argue against the view that auditory and visual speech cues solely integrate based on prior associative audio-visual perceptual experience. Rather, they suggest that dynamic and phonetic informational cues are sharable across sensory modalities, possibly through a cross-modal transfer of implicit articulatory motor knowledge. Copyright © 2017 Elsevier Ltd. All rights reserved.

  13. Face Detection Technique as Interactive Audio/Video Controller for a Mother-Tongue-Based Instructional Material

    Science.gov (United States)

    Guidang, Excel Philip B.; Llanda, Christopher John R.; Palaoag, Thelma D.

    2018-03-01

    Face Detection Technique as a strategy in controlling a multimedia instructional material was implemented in this study. Specifically, it achieved the following objectives: 1) developed a face detection application that controls an embedded mother-tongue-based instructional material for face-recognition configuration using Python; 2) determined the perceptions of the students using the Mutt Susan’s student app review rubric. The study concludes that face detection technique is effective in controlling an electronic instructional material. It can be used to change the method of interaction of the student with an instructional material. 90% of the students perceived the application to be a great app and 10% rated the application to be good.

  14. Music Genre Classification Using MIDI and Audio Features

    Science.gov (United States)

    Cataltepe, Zehra; Yaslan, Yusuf; Sonmez, Abdullah

    2007-12-01

    We report our findings on using MIDI files and audio features from MIDI, separately and combined together, for MIDI music genre classification. We use McKay and Fujinaga's 3-root and 9-leaf genre data set. In order to compute distances between MIDI pieces, we use normalized compression distance (NCD). NCD uses the compressed length of a string as an approximation to its Kolmogorov complexity and has previously been used for music genre and composer clustering. We convert the MIDI pieces to audio and then use the audio features to train different classifiers. MIDI and audio from MIDI classifiers alone achieve much lower accuracies than those reported by McKay and Fujinaga, who used not NCD but a number of domain-based MIDI features for their classification. Combining MIDI and audio from MIDI classifiers improves accuracy, which approaches but remains below that of McKay and Fujinaga. The best root genre accuracies achieved using MIDI, audio, and the combination of them are 0.75, 0.86, and 0.93, respectively, compared to 0.98 of McKay and Fujinaga. Successful classifier combination requires diversity of the base classifiers. We achieve diversity by using a certain number of seconds of the MIDI file, different sample rates and sizes for the audio file, and different classification algorithms.
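
    The normalized compression distance mentioned in this abstract can be sketched in a few lines. The abstract does not name the compressor the authors used, so zlib is an assumption here, and the byte strings below are toy stand-ins for serialized MIDI data. The sketch uses the standard definition NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), where C is the compressed length.

```python
import zlib


def compressed_len(data: bytes) -> int:
    """Compressed length in bytes, a computable stand-in for Kolmogorov complexity."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance between two byte strings."""
    cx, cy = compressed_len(x), compressed_len(y)
    cxy = compressed_len(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


# Toy "pieces": a and b share almost all of their material, c is unrelated.
a = b"C4 E4 G4 C5 " * 50
b = b"C4 E4 G4 C5 " * 49 + b"D4 F4 A4 D5 "
c = bytes(range(256)) * 3

print("similar pieces:  ", round(ncd(a, b), 3))
print("unrelated pieces:", round(ncd(a, c), 3))
```

    Smaller values indicate that one input helps compress the other, which is why the pair sharing most of its content scores lower than the unrelated pair. With real compressors the value is only approximately bounded by 0 and 1.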

  15. Music Genre Classification Using MIDI and Audio Features

    Directory of Open Access Journals (Sweden)

    Abdullah Sonmez

    2007-01-01

    Full Text Available We report our findings on using MIDI files and audio features from MIDI, separately and combined together, for MIDI music genre classification. We use McKay and Fujinaga's 3-root and 9-leaf genre data set. In order to compute distances between MIDI pieces, we use normalized compression distance (NCD). NCD uses the compressed length of a string as an approximation to its Kolmogorov complexity and has previously been used for music genre and composer clustering. We convert the MIDI pieces to audio and then use the audio features to train different classifiers. MIDI and audio from MIDI classifiers alone achieve much lower accuracies than those reported by McKay and Fujinaga, who used not NCD but a number of domain-based MIDI features for their classification. Combining MIDI and audio from MIDI classifiers improves accuracy, which approaches but remains below that of McKay and Fujinaga. The best root genre accuracies achieved using MIDI, audio, and the combination of them are 0.75, 0.86, and 0.93, respectively, compared to 0.98 of McKay and Fujinaga. Successful classifier combination requires diversity of the base classifiers. We achieve diversity by using a certain number of seconds of the MIDI file, different sample rates and sizes for the audio file, and different classification algorithms.

  16. Confidentiality Enhancement of Highly Sensitive Nuclear Data Using Steganography with Chaotic Encryption over OFDM Channel

    International Nuclear Information System (INIS)

    Mahmoud, S.; Ayad, N.; Elsayed, F.; Elbendary, M.

    2016-01-01

    Full text: Due to the widespread usage of the internet and other wired and wireless communication methods, the security of the transmitted data has become a major requirement. Nuclear knowledge is mainly built upon the exchange of nuclear information, which is considered highly sensitive, so its security has to be enhanced by using high level security mechanisms. Data confidentiality is concerned with the achievement of higher protection for confidential information from unauthorized disclosure or access. Cryptography and steganography are well-known and widely used techniques that process information in order to achieve its confidentiality, but sometimes, when used individually, they don't satisfy the level of security required for highly sensitive data. In this paper, cryptography is combined with steganography to constitute a multilayer security technique that can strengthen the level of security of highly confidential nuclear data that are archived or transmitted through different channel types and noise conditions. (author)

  17. Improved diagonal queue medical image steganography using Chaos theory, LFSR, and Rabin cryptosystem.

    Science.gov (United States)

    Jain, Mamta; Kumar, Anil; Choudhary, Rishabh Charan

    2017-06-01

    In this article, we have proposed an improved diagonal queue medical image steganography for patient secret medical data transmission using chaotic standard map, linear feedback shift register, and Rabin cryptosystem, for improvement of previous technique (Jain and Lenka in Springer Brain Inform 3:39-51, 2016). The proposed algorithm comprises four stages, generation of pseudo-random sequences (pseudo-random sequences are generated by linear feedback shift register and standard chaotic map), permutation and XORing using pseudo-random sequences, encryption using Rabin cryptosystem, and steganography using the improved diagonal queues. Security analysis has been carried out. Performance analysis is observed using MSE, PSNR, maximum embedding capacity, as well as by histogram analysis between various Brain disease stego and cover images.
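
    One of the stages listed in this abstract, XORing data with a pseudo-random sequence from a linear feedback shift register, can be sketched as follows. This is not the authors' construction: the 16-bit register width, the tap positions (16, 14, 13, 11, a commonly cited tap set), and the seed are all assumptions, and the chaotic standard map, the permutation, the Rabin encryption, and the diagonal-queue embedding are omitted. An LFSR keystream alone is not cryptographically secure.

```python
def lfsr16_bits(seed: int):
    """Yield feedback bits from a 16-bit Fibonacci LFSR (assumed taps 16, 14, 13, 11)."""
    state = seed & 0xFFFF
    if state == 0:
        raise ValueError("seed must be non-zero, otherwise the register stays at zero")
    while True:
        bit = (state ^ (state >> 2) ^ (state >> 3) ^ (state >> 5)) & 1
        state = (state >> 1) | (bit << 15)
        yield bit


def keystream(seed: int, length: int) -> bytes:
    """Pack LFSR output bits into `length` keystream bytes, most significant bit first."""
    bits = lfsr16_bits(seed)
    out = bytearray()
    for _ in range(length):
        byte = 0
        for _ in range(8):
            byte = (byte << 1) | next(bits)
        out.append(byte)
    return bytes(out)


def xor_mask(data: bytes, seed: int) -> bytes:
    """XOR data with the keystream; applying it twice with the same seed restores the data."""
    return bytes(d ^ k for d, k in zip(data, keystream(seed, len(data))))


secret = b"patient record 0042"   # hypothetical payload
masked = xor_mask(secret, 0xACE1)  # hypothetical seed
print(masked.hex())
print(xor_mask(masked, 0xACE1))
```

    Because XOR is its own inverse, the receiver recovers the payload by regenerating the same keystream from the shared seed.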

  18. Instrumental Landing Using Audio Indication

    Science.gov (United States)

    Burlak, E. A.; Nabatchikov, A. M.; Korsun, O. N.

    2018-02-01

    The paper proposes an audio indication method for presenting to a pilot information about the relative position of an aircraft in precision piloting tasks. The implementation of the method is presented, and the use of audio signal parameters such as loudness, frequency and modulation is discussed. To confirm the operability of the audio indication channel, experiments were carried out using a modern aircraft simulation facility. Instrument landings were performed in the simulator using the proposed audio method to indicate the aircraft's deviations from the glide path. The results proved compatible with the simulated instrument landings using the traditional glideslope pointers. This encourages further development of the method in order to solve other precision piloting tasks.

  19. A centralized audio presentation manager

    Energy Technology Data Exchange (ETDEWEB)

    Papp, A.L. III; Blattner, M.M.

    1994-05-16

    The centralized audio presentation manager addresses the problems which occur when multiple programs running simultaneously attempt to use the audio output of a computer system. Time dependence of sound means that certain auditory messages must be scheduled simultaneously, which can lead to perceptual problems due to psychoacoustic phenomena. Furthermore, the combination of speech and nonspeech audio is examined; each presents its own problems of perceptibility in an acoustic environment composed of multiple auditory streams. The centralized audio presentation manager receives abstract parameterized message requests from the currently running programs, and attempts to create and present a sonic representation in the most perceptible manner through the use of a theoretically and empirically designed rule set.

  20. Definition of audio (Definición de audio)

    OpenAIRE

    Montañez, Luis A.; Cabrera, Juan G.

    2015-01-01

    Description of the meaning of Audio as an object of study according to various authors, and how it differs from the meaning of Sound. Audio is defined as an electrical signal whose waveform has characteristics similar to those of a sound signal. The sound signal corresponds to pressure in a physical medium, while the Audio signal is a voltage defined as an analog signal. Audio is thus conceived as an electrical, analog signal, as opposed to a signal...

  1. Definition of audio (Definición de audio)

    OpenAIRE

    Montañez Carrillo, Luis A.; Cabrera, Juan G.

    2015-01-01

    Description of the meaning of Audio as an object of study according to various authors, and how it differs from the meaning of Sound. Audio is thus defined as an electrical signal whose waveform has characteristics similar to those of a sound signal, bearing in mind that the sound signal corresponds to pressure in a physical medium, while the Audio signal is a voltage defined as an analog signal. Accordingly, Audio is conceived as a signal...

  2. ENERGY STAR Certified Audio Video

    Data.gov (United States)

    U.S. Environmental Protection Agency — Certified models meet all ENERGY STAR requirements as listed in the Version 3.0 ENERGY STAR Program Requirements for Audio Video Equipment that are effective as of...

  3. Class D audio amplifier with 4th order output filter and self-oscillating full-state hysteresis based feedback driving capacitive transducers

    DEFF Research Database (Denmark)

    Nielsen, Dennis; Knott, Arnold; Andersen, Michael A. E.

    2014-01-01

    A practical solution is presented for the design of a non-isolated high voltage DC/AC power converter. The converter is intended to be used as a class D audio amplifier for a Dielectric Electro Active Polymer (DEAP) transducer. A simple and effective hysteretic control scheme for the converter...

  4. Realtime Audio with Garbage Collection

    OpenAIRE

    Matheussen, Kjetil Svalastog

    2010-01-01

    Two non-moving concurrent garbage collectors tailored for realtime audio processing are described. Both collectors work on copies of the heap to avoid cache misses and audio-disruptive synchronizations. Both collectors are targeted at multiprocessor personal computers. The first garbage collector works in uncooperative environments, and can replace Hans Boehm's conservative garbage collector for C and C++. The collector does not access the virtual memory system. Neither doe...

  5. Tourism research and audio methods

    DEFF Research Database (Denmark)

    Jensen, Martin Trandberg

    2016-01-01

    • Audio methods enrich sensuous tourism ethnographies. • The note suggests five research avenues for future auditory scholarship. • Sensuous tourism research has neglected the role of sounds in embodied tourism experiences.

  6. Steganalysis of LSB Image Steganography using Multiple Regression and Auto Regressive (AR) Model

    OpenAIRE

    Souvik Bhattacharyya; Gautam Sanyal

    2011-01-01

    The staggering growth in communication technology and usage of public domain channels (i.e. Internet) has greatly facilitated transfer of data. However, such open communication channels have greater vulnerability to security threats causing unauthorized information access. Traditionally, encryption is used to realize communication security. However, important information is not protected once decoded. Steganography is the art and science of communicating in a way which hides the existence o...

  7. Steganography algorithm multi pixel value differencing (MPVD) to increase message capacity and data security

    Science.gov (United States)

    Rojali, Siahaan, Ida Sri Rejeki; Soewito, Benfano

    2017-08-01

    Steganography is the art and science of hiding secret messages so that the existence of the message cannot be detected by human senses. Data concealment uses the Multi Pixel Value Differencing (MPVD) algorithm, which exploits the difference between pixel values. The algorithm was developed using six interval tables. Its objective is to increase the message capacity while maintaining data security.

  8. A Novel Steganography Technique for SDTV-H.264/AVC Encoded Video

    Directory of Open Access Journals (Sweden)

    Christian Di Laura

    2016-01-01

    Full Text Available Today, eavesdropping is becoming a common issue in the rapidly growing digital network and has created the need for secret communication channels embedded in digital media. In this paper, a novel steganography technique designed for Standard Definition Digital Television (SDTV) H.264/AVC encoded video sequences is presented. The algorithm introduced here makes use of the compression properties of the Context Adaptive Variable Length Coding (CAVLC) entropy encoder to achieve a low complexity and real-time insertion method. The chosen scheme hides the private message directly in the H.264/AVC bit stream by modifying the AC frequency quantized residual luminance coefficients of intrapredicted I-frames. In order to avoid error propagation in adjacent blocks, an interlaced embedding strategy is applied. Likewise, the proposed steganography technique allows self-detection of the hidden message at the target destination. The source code was implemented using a mix of the MATLAB 2010b and Java development environments. Finally, experimental results have been assessed through objective and subjective quality measures and reveal that the proposed technique produces fewer visible artifacts, reaching PSNR values above 40.0 dB and an average embedding bit rate per secret communication channel of 425 bits/sec. This exemplifies that steganography is affordable in digital television.
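
    The PSNR figure quoted in this abstract is an objective distortion measure between the cover and stego signals. A minimal sketch of the standard definition, PSNR = 10 log10(peak^2 / MSE), is given below for 8-bit samples. The flat sample lists and the toy embedding (flipping the least significant bit of every fourth sample) are illustrative assumptions and have nothing to do with the H.264/AVC coefficient scheme described above.

```python
import math


def mse(cover, stego):
    """Mean squared error between two equal-length sample sequences."""
    if len(cover) != len(stego) or not cover:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((c - s) ** 2 for c, s in zip(cover, stego)) / len(cover)


def psnr(cover, stego, peak=255):
    """Peak signal-to-noise ratio in dB; infinite when the signals are identical."""
    error = mse(cover, stego)
    if error == 0:
        return math.inf
    return 10 * math.log10(peak ** 2 / error)


# Toy data: 256 luminance samples, with the LSB flipped in every fourth one.
cover = list(range(256))
stego = [v ^ 1 if i % 4 == 0 else v for i, v in enumerate(cover)]

print(f"MSE  = {mse(cover, stego):.2f}")
print(f"PSNR = {psnr(cover, stego):.2f} dB")
```

    Higher PSNR means less distortion. Values above roughly 40 dB, as reported in the abstract, are commonly taken to indicate changes that are hard to see.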

  9. Modeling Audio Fingerprints : Structure, Distortion, Capacity

    NARCIS (Netherlands)

    Doets, P.J.O.

    2010-01-01

    An audio fingerprint is a compact low-level representation of a multimedia signal. An audio fingerprint can be used to identify audio files or fragments in a reliable way. The use of audio fingerprints for identification consists of two phases. In the enrollment phase known content is fingerprinted,

  10. An inconclusive digital audio authenticity examination: a unique case.

    Science.gov (United States)

    Koenig, Bruce E; Lacey, Douglas S

    2012-01-01

    This case report sets forth an authenticity examination of 35 encrypted, proprietary-format digital audio files containing recorded telephone conversations between two codefendants in a criminal matter. The codefendant who recorded the conversations did so on a recording system he developed; additionally, he was both a forensic audio authenticity examiner, who had published and presented in the field, and was the head of a professional audio society's writing group for authenticity standards. The authors conducted the examination of the recordings following nine laboratory steps of the peer-reviewed and published 11-step digital audio authenticity protocol. Based considerably on the codefendant's direct involvement with the development of the encrypted audio format, his experience in the field of forensic audio authenticity analysis, and the ease with which the audio files could be accessed, converted, edited in the gap areas, and reconstructed in such a way that the processes were undetected, the authors concluded that the recordings could not be scientifically authenticated through accepted forensic practices. © 2011 American Academy of Forensic Sciences.

  11. Optimized Audio Classification and Segmentation Algorithm by Using Ensemble Methods

    Directory of Open Access Journals (Sweden)

    Saadia Zahid

    2015-01-01

    Full Text Available Audio segmentation is a basis for multimedia content analysis which is the most important and widely used application nowadays. An optimized audio classification and segmentation algorithm is presented in this paper that segments a superimposed audio stream on the basis of its content into four main audio types: pure-speech, music, environment sound, and silence. An algorithm is proposed that preserves important audio content and reduces the misclassification rate without using large amount of training data, which handles noise and is suitable for use for real-time applications. Noise in an audio stream is segmented out as environment sound. A hybrid classification approach is used, bagged support vector machines (SVMs) with artificial neural networks (ANNs). Audio stream is classified, firstly, into speech and nonspeech segment by using bagged support vector machines; nonspeech segment is further classified into music and environment sound by using artificial neural networks and lastly, speech segment is classified into silence and pure-speech segments on the basis of rule-based classifier. Minimum data is used for training classifier; ensemble methods are used for minimizing misclassification rate and approximately 98% accurate segments are obtained. A fast and efficient algorithm is designed that can be used with real-time multimedia applications.

  12. Introduction to audio analysis a MATLAB approach

    CERN Document Server

    Giannakopoulos, Theodoros

    2014-01-01

    Introduction to Audio Analysis serves as a standalone introduction to audio analysis, providing theoretical background to many state-of-the-art techniques. It covers the essential theory necessary to develop audio engineering applications, but also uses programming techniques, notably MATLAB®, to take a more applied approach to the topic. Basic theory and reproducible experiments are combined to demonstrate theoretical concepts from a practical point of view and provide a solid foundation in the field of audio analysis. Audio feature extraction, audio classification, audio segmentation, au

  13. Optoelectronic audio link (Enlace optoelectrónico de audio)

    OpenAIRE

    García Lozano, Jesús

    2012-01-01

    This project designs and implements a system capable of transmitting audio by means of infrared light. The project has two main parts: the transmitter module and the receiver module. The signal is fed into the transmitter module from any audio player. It then undergoes FM modulation to improve communication between transmitter and receiver, since baseband transmission of the signal is more vulnerable to noise. Once modulated...

  14. Development of Audio-Visual Media for Learning to Write Short News (PENGEMBANGAN MEDIA AUDIO VISUAL PEMBELAJARAN MENULIS BERITA SINGKAT)

    OpenAIRE

    Sastri, Sastri; Wiryotinoyo, Mujiyono; Sudaryono, Sudaryono

    2015-01-01

    This article is based on developmental research aimed at constructing audio-visual media for learning to write news. The media were developed with a contextual approach: the materials and training tasks presented are designed to match the students' environment. Through this approach, students are expected to bring their own experiences into the learning situation. The design used in developing the audio-visual media for learning to write news follows the model of Alessi and Trollip...

  15. High quality scalable audio codec

    Science.gov (United States)

    Kim, Miyoung; Oh, Eunmi; Kim, JungHoe

    2007-09-01

    The MPEG-4 BSAC (Bit Sliced Arithmetic Coding) is a fine-grain scalable codec with a layered structure consisting of a single base layer and several enhancement layers. The scalable functionality allows us to decode subsets of a full bitstream and to deliver audio content adaptively under conditions of heterogeneous networks and devices, and user interaction. This bitrate scalability is provided at the cost of high frequency components. As a result, the decoded BSAC output sounds increasingly muffled as fewer layers are transmitted under constrained network and device conditions. The goal of the proposed technology is to compensate for the missing high frequency components while maintaining the fine grain scalability of BSAC. This paper describes the integration of the SBR (Spectral Band Replication) tool into the existing MPEG-4 BSAC. Listening test results show that the sound quality of BSAC is improved when the full bitstream is truncated for lower bitrates, and this quality is comparable to that of BSAC using the SBR tool without truncation at the same bitrate.

  16. Musical examination to bridge audio data and sheet music

    Science.gov (United States)

    Pan, Xunyu; Cross, Timothy J.; Xiao, Liangliang; Hei, Xiali

    2015-03-01

    The digitalization of audio is commonly implemented for the purpose of convenient storage and transmission of music and songs in today's digital age. Analyzing digital audio for an insightful look at a specific musical characteristic, however, can be quite challenging for various types of applications. Many existing musical analysis techniques can examine a particular piece of audio data. For example, the frequency of digital sound can be easily read and identified at a specific section in an audio file. Based on this information, we could determine the musical note being played at that instant, but what if you want to see a list of all the notes played in a song? While most existing methods help to provide information about a single piece of the audio data at a time, few of them can analyze the available audio file on a larger scale. The research conducted in this work considers how to further utilize the examination of audio data by storing more information from the original audio file. In practice, we develop a novel musical analysis system Musicians Aid to process musical representation and examination of audio data. Musicians Aid solves the previous problem by storing and analyzing the audio information as it reads it rather than tossing it aside. The system can provide professional musicians with an insightful look at the music they created and advance their understanding of their work. Amateur musicians could also benefit from using it solely for the purpose of obtaining feedback about a song they were attempting to play. By comparing our system's interpretation of traditional sheet music with their own playing, a musician could ensure what they played was correct. More specifically, the system could show them exactly where they went wrong and how to adjust their mistakes. In addition, the application could be extended over the Internet to allow users to play music with one another and then review the audio data they produced. This would be particularly
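
    The step this abstract mentions, determining the musical note from a detected frequency, can be sketched with the standard equal-temperament mapping. This is not the Musicians Aid implementation: the A4 = 440 Hz reference, the sharp-only note spelling, and the function name are assumptions made for illustration.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def frequency_to_note(freq_hz: float, a4_hz: float = 440.0) -> str:
    """Return the nearest equal-tempered note name (e.g. 'A4') for a frequency in Hz."""
    if freq_hz <= 0:
        raise ValueError("frequency must be positive")
    # MIDI note number: 69 is A4, and each semitone is a factor of 2**(1/12).
    midi = round(69 + 12 * math.log2(freq_hz / a4_hz))
    octave = midi // 12 - 1
    return f"{NOTE_NAMES[midi % 12]}{octave}"


# A short sequence of detected frequencies mapped to a note list.
detected = [261.63, 329.63, 440.0]
print([frequency_to_note(f) for f in detected])
```

    Applying the function to the dominant frequency of each analysis window, rather than to a single instant, yields the kind of note list for a whole recording that the abstract asks about.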

  17. Location audio simplified capturing your audio and your audience

    CERN Document Server

    Miles, Dean

    2014-01-01

    From the basics of using camera, handheld, lavalier, and shotgun microphones to camera calibration and mixer set-ups, Location Audio Simplified unlocks the secrets to clean and clear broadcast quality audio no matter what challenges you face. Author Dean Miles applies his twenty-plus years of experience as a professional location operator to teach the skills, techniques, tips, and secrets needed to produce high-quality production sound on location. Humorous and thoroughly practical, the book covers a wide array of topics, such as: * location selection * field mixing * boo

  18. A Joint Audio-Visual Approach to Audio Localization

    DEFF Research Database (Denmark)

    Jensen, Jesper Rindom; Christensen, Mads Græsbøll

    2015-01-01

    Localization of audio sources is an important research problem, e.g., to facilitate noise reduction. In recent years, the problem has been tackled using distributed microphone arrays (DMA). A common approach is to apply direction-of-arrival (DOA) estimation on each array (denoted as nodes...... time-of-flight cameras. Moreover, we propose an optimal method for weighting such DOA and range information for audio localization. Our experiments on both synthetic and real data show that there is a clear potential advantage in using the joint audiovisual localization framework....

  19. The Safeguard of Audio Collections: A Computer Science Based Approach to Quality Control—The Case of the Sound Archive of the Arena di Verona

    Directory of Open Access Journals (Sweden)

    Federica Bressan

    2013-01-01

    Full Text Available In the field of multimedia, very little attention is given to the activities involved in the preservation of audio documents. At the same time, more and more archives storing audio and video documents face the problem of obsolescing and degrading media, which could largely benefit from the instruments and the methodologies of research in multimedia. This paper presents the methodology and the results of the Italian project REVIVAL, aimed at the development of a hardware/software platform to support the active preservation of the audio collection of the Fondazione Arena di Verona, one of the finest in Europe for the operatic genre, with a special attention on protocols and tools for quality control. On the scientific side, the most significant objectives achieved by the project are (i) the setup of a working environment inside the archive, (ii) the knowledge transfer to the archival personnel, (iii) the realization of chemical analyses on magnetic tapes in collaboration with experts in the fields of materials science and chemistry, and (iv) the development of original open-source software tools. On the cultural side, the recovery, the safeguard, and the access to unique copies of unpublished live recordings of artists of the calibre of Domingo and Pavarotti are of great musicological and economical value.

  20. Investigating the impact of audio instruction and audio-visual biofeedback for lung cancer radiation therapy

    Science.gov (United States)

    George, Rohini

    Lung cancer accounts for 13% of all cancers in the United States and is the leading cause of deaths among both men and women. The five-year survival for lung cancer patients is approximately 15% (ACS facts & figures). Respiratory motion decreases the accuracy of thoracic radiotherapy during imaging and delivery. To account for respiration, margins are generally added during radiation treatment planning, which may cause a substantial dose delivery to normal tissues and increase the normal tissue toxicity. To alleviate the above-mentioned effects of respiratory motion, several motion management techniques are available which can reduce the doses to normal tissues, thereby reducing treatment toxicity and allowing dose escalation to the tumor. This may increase the survival probability of patients who have lung cancer and are receiving radiation therapy. However, the accuracy of these motion management techniques is inhibited by respiration irregularity. The rationale of this thesis was to study the improvement in regularity of respiratory motion by breathing coaching for lung cancer patients using audio instructions and audio-visual biofeedback. A total of 331 patient respiratory motion traces, each four minutes in length, were collected from 24 lung cancer patients enrolled in an IRB-approved breathing-training protocol. It was determined that audio-visual biofeedback significantly improved the regularity of respiratory motion compared to free breathing and audio instruction, thus improving the accuracy of respiratory gated radiotherapy. It was also observed that duty cycles below 30% showed insignificant reduction in residual motion, while above 50% there was a sharp increase in residual motion. The reproducibility of exhale-based gating was higher than that of inhale-based gating. Modeling the respiratory cycles, it was found that cosine and cosine 4 models had the best correlation with individual respiratory cycles. The overall respiratory motion probability distribution

  1. Perancangan Sistem Audio Mobil Berbasiskan Sistem Pakar dan Web

    Directory of Open Access Journals (Sweden)

    Djunaidi Santoso

    2011-12-01

    Full Text Available Designing car audio that fits the user’s needs is a fun activity. However, the design is often time-consuming and costly, since experts must be consulted several times. For easy access to information on designing a car audio system, as well as error prevention, a car audio system based on an expert system and the web is designed for those who do not have sufficient time and money to consult experts directly. This system consists of tutorial modules designed using the HyperText Preprocessor (PHP) with MySQL as the database. The car audio system design is evaluated using the black box testing method, which focuses on the functional requirements of the application. Tests are performed by providing inputs and checking that the outputs produced correspond to the function of each module. The test results confirm the correspondence between input and output, which means that the program meets the initial goals of the design.

  2. Perceptually controlled doping for audio source separation

    Science.gov (United States)

    Mahé, Gaël; Nadalin, Everton Z.; Suyama, Ricardo; Romano, João MT

    2014-12-01

    The separation of an underdetermined audio mixture can be performed through sparse component analysis (SCA), which relies, however, on the strong hypothesis that source signals are sparse in some domain. To overcome this difficulty in the case where the original sources are available before the mixing process, informed source separation (ISS) embeds a watermark in the mixture, whose information can help a later separation. Though powerful, this technique is generally specific to a particular mixing setup and may be compromised by an additional bitrate compression stage. Thus, instead of watermarking, we propose a `doping' method that makes the time-frequency representation of each source more sparse, while preserving its audio quality. This method is based on an iterative decrease of the distance between the distribution of the signal and a target sparse distribution, under a perceptual constraint. We aim to show that the proposed approach is robust to audio coding and that the use of the sparsified signals improves the source separation, in comparison with the original sources. In this work, the analysis is made only in instantaneous mixtures and focused on voice sources.

  3. Audio power amplifier design handbook

    CERN Document Server

    Self, Douglas

    2013-01-01

    This book is essential for audio power amplifier designers and engineers for one simple reason...it enables you as a professional to develop reliable, high-performance circuits. The Author Douglas Self covers the major issues of distortion and linearity, power supplies, overload, DC-protection and reactive loading. He also tackles unusual forms of compensation and distortion produced by capacitors and fuses. This completely updated fifth edition includes four NEW chapters including one on The XD Principle, invented by the author, and used by Cambridge Audio. Cro

  4. Digital Augmented Reality Audio Headset

    Directory of Open Access Journals (Sweden)

    Jussi Rämö

    2012-01-01

    Full Text Available Augmented reality audio (ARA) combines virtual sound sources with the real sonic environment of the user. An ARA system can be realized with a headset containing binaural microphones. Ideally, the ARA headset should be acoustically transparent, that is, it should not cause audible modification to the surrounding sound. A practical implementation of an ARA mixer requires a low-latency headphone reproduction system with additional equalization to compensate for the attenuation and the modified ear canal resonances caused by the headphones. This paper proposes digital IIR filters to realize the required equalization and evaluates a real-time prototype ARA system. Measurements show that the throughput latency of the digital prototype ARA system can be less than 1.4 ms, which is sufficiently small in practice. When the direct and processed sounds are combined in the ear, a comb filtering effect is brought about and appears as notches in the frequency response. The comb filter effect in speech and music signals was studied in a listening test and it was found to be inaudible when the attenuation is 20 dB. Insert ARA headphones have a sufficient attenuation at frequencies above about 1 kHz. The proposed digital ARA system enables several immersive audio applications, such as a virtual audio tourist guide and audio teleconferencing.
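
    The comb filtering described above can be illustrated with a simple two-path model. The sketch below is not taken from the paper: it assumes the processed sound arrives at unit gain after a pure delay, and the direct sound leaks through the headset with a frequency-independent gain. The function names and the use of the 1.4 ms and 20 dB figures as model parameters are illustrative assumptions.

```python
import math

def comb_response_db(freq_hz, delay_s, leak_gain):
    """Magnitude in dB of (leaked direct sound) + (delayed processed sound).

    Assumed model: H(f) = leak_gain + exp(-j * 2*pi * f * delay_s).
    """
    phase = 2.0 * math.pi * freq_hz * delay_s
    real = leak_gain + math.cos(phase)
    imag = -math.sin(phase)
    return 20.0 * math.log10(math.hypot(real, imag))

def notch_frequencies(delay_s, count):
    """Notches occur where the two paths are in antiphase: f = (2k + 1) / (2 * delay)."""
    return [(2 * k + 1) / (2.0 * delay_s) for k in range(count)]

# Illustrative parameters borrowed from the abstract's figures (assumption:
# 1.4 ms is the whole path delay, 20 dB is the leak attenuation).
delay_s = 1.4e-3
leak_gain = 10.0 ** (-20.0 / 20.0)

first_notch_hz = notch_frequencies(delay_s, 1)[0]
notch_db = comb_response_db(first_notch_hz, delay_s, leak_gain)
peak_db = comb_response_db(1.0 / delay_s, delay_s, leak_gain)
ripple_db = peak_db - notch_db
```

    Under this model, a leak 20 dB below the processed sound yields a peak-to-notch ripple of under 2 dB, which is at least consistent with the listening-test finding quoted above.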

  5. Engaging Students with Audio Feedback

    Science.gov (United States)

    Cann, Alan

    2014-01-01

    Students express widespread dissatisfaction with academic feedback. Teaching staff perceive a frequent lack of student engagement with written feedback, much of which goes uncollected or unread. Published evidence shows that audio feedback is highly acceptable to students but is underused. This paper explores methods to produce and deliver audio…

  6. Haptic and Audio Interaction Design

    DEFF Research Database (Denmark)

    This book constitutes the refereed proceedings of the 5th International Workshop on Haptic and Audio Interaction Design, HAID 2010 held in Copenhagen, Denmark, in September 2010. The 21 revised full papers presented were carefully reviewed and selected for inclusion in the book. The papers are or...

  7. Audio watermark a comprehensive foundation using Matlab

    CERN Document Server

    Lin, Yiqing

    2015-01-01

    This book illustrates commonly used and novel approaches to audio watermarking for copyright protection. The author provides a theoretical and practical step-by-step guide to data hiding in audio signals such as music, speech, and broadcast. New techniques developed by the authors are fully explained, together with MATLAB programs for audio watermarking and audio quality assessment, and the book also discusses methods for objectively predicting the perceptual quality of watermarked audio signals. The book explains the theoretical basics of the commonly used audio watermarking techniques; discusses the methods used to objectively and subjectively assess the quality of audio signals; and provides comprehensive, well-tested MATLAB programs that can be used efficiently to watermark any audio media.

  8. State-of-the-art soft computing techniques in image steganography domain

    Science.gov (United States)

    Hussain, Hanizan Shaker; Din, Roshidi; Samad, Hafiza Abdul; Yaacub, Mohd Hanafizah; Murad, Roslinda; Rukhiyah, A.; Sabdri, Noor Maizatulshima

    2016-08-01

    This paper reviews major works on soft computing (SC) techniques in image steganography and watermarking in the last ten years, focusing on three main SC techniques: neural networks, genetic algorithms, and fuzzy logic. The findings suggest that all these works applied SC techniques during the pre-processing, embedding, or extracting stage, or in more than one of these stages. Therefore, the presence of SC techniques with their diverse approaches and strengths can help researchers in future work to attain excellent quality of image information hiding that comprises both imperceptibility and robustness.

  9. Bit rates in audio source coding

    NARCIS (Netherlands)

    Veldhuis, Raymond N.J.

    1992-01-01

    The goal is to introduce and solve the audio coding optimization problem. Psychoacoustic results such as masking and excitation pattern models are combined with results from rate distortion theory to formulate the audio coding optimization problem. The solution of the audio optimization problem is a

  10. Audio Frequency Analysis in Mobile Phones

    Science.gov (United States)

    Aguilar, Horacio Munguía

    2016-01-01

    A new experiment using mobile phones is proposed in which the phone's audio frequency response is analyzed using the audio port to input an external signal and obtain a measurable output. This experiment shows how the limited audio bandwidth used in mobile telephony is the main cause of the poor speech quality in this service. A brief discussion is…

  11. 50 CFR 27.72 - Audio equipment.

    Science.gov (United States)

    2010-10-01

    ... 50 Wildlife and Fisheries 6 2010-10-01 2010-10-01 false Audio equipment. 27.72 Section 27.72 Wildlife and Fisheries UNITED STATES FISH AND WILDLIFE SERVICE, DEPARTMENT OF THE INTERIOR (CONTINUED) THE... Audio equipment. The operation or use of audio devices including radios, recording and playback devices...

  12. Audio Satellites – Overhearing Everyday Life

    DEFF Research Database (Denmark)

    Breinbjerg, Morten; Højlund, Marie Koldkjær; Riis, Morten S.

    2016-01-01

    The project “Audio Satellites – overhearing everyday life” consists of a number of mobile listening devices (audio satellites) from which sound is distributed in real time to a server and made available for listening and mixing through a web interface. The audio satellites can either be carried...

  13. 36 CFR 2.12 - Audio disturbances.

    Science.gov (United States)

    2010-07-01

    ... 36 Parks, Forests, and Public Property 1 2010-07-01 2010-07-01 false Audio disturbances. 2.12... RESOURCE PROTECTION, PUBLIC USE AND RECREATION § 2.12 Audio disturbances. (a) The following are prohibited..., motorized toy, or an audio device, such as a radio, television set, tape deck or musical instrument, in a...

  14. WebGL and web audio software lightweight components for multimedia education

    Science.gov (United States)

    Chang, Xin; Yuksel, Kivanc; Skarbek, Władysław

    2017-08-01

    The paper presents the results of our recent work on the development of the contemporary computing platform DC2 for multimedia education using WebGL and Web Audio, the W3C standards. Using the literate programming paradigm, the WEBSA educational tools were developed. They offer the user (student) access to an expandable collection of WebGL shaders and Web Audio scripts. The unique feature of DC2 is the option of literate programming, offered to both the author and the reader in order to improve interactivity with lightweight WebGL and Web Audio components. For instance, users can define source audio nodes including synthetic sources, destination audio nodes, and nodes for audio processing such as sound wave shaping, spectral band filtering, convolution-based modification, etc. In the case of WebGL, besides classic graphics effects based on mesh and fractal definitions, novel image processing and analysis by shaders is offered, such as nonlinear filtering, histogram of gradients, and Bayesian classifiers.

  15. Audio Format Change From Analog to Digital Audio Using the Sony Sound Forge 9.0

    OpenAIRE

    Faisal Safrudin; Yulina Yulina, SKom, MMSI

    2007-01-01

    Converting analog audio to digital audio is useful not only for journalists but also for general audiences. With earlier technology, almost everyone used analog audio in the form of cassettes. As technology has developed, the analog audio format has become rarely used with the arrival of digital audio, but this can be overcome by converting analog audio to digital audio using Sony Sound Forge 9.0. The author will discuss ...

  16. Building a Steganography Program Including How to Load, Process, and Save JPEG and PNG Files in Java

    Science.gov (United States)

    Courtney, Mary F.; Stix, Allen

    2006-01-01

    Instructors teaching beginning programming classes are often interested in exercises that involve processing photographs (i.e., files stored as .jpeg). They may wish to offer activities such as color inversion, the color manipulation effects achieved with pixel thresholding, or steganography, all of which Stevenson et al. [4] assert are sought by…

  17. Adaptive Modulation Approach for Robust MPEG-4 AAC Encoded Audio Transmission

    Science.gov (United States)

    2011-11-01

    Codec (AAC) - Main profile with the single channel element (SCE) syntax, and transmit it using the audio data transport stream (ADTS) format. Single... [report form fields: title ...4 AAC ENCODED AUDIO TRANSMISSION; contract number IN HOUSE; grant number FA8750-11-1-0048; program element number 62702F] ... audio data over fading wireless channels using Unequal Error Protection based on adaptive modulation and forward error correcting (FEC) codes. The

  18. Analog Audio Format Changes From Being Digital Audio Using Sony Sound Forge 9.0

    OpenAIRE

    Faisal Safrudin; Yulina Yulina

    2010-01-01

    Converting analog audio to digital audio is useful not only for journalists and reporters but also for the general public. With earlier technology, almost everyone used analog audio in the form of cassettes. As technology has developed, the analog audio format has become rarely used with the arrival of digital audio, but this can be overcome by converting the analog audio format to digital audio using Sony Sound Forge 9.0. The author...

  19. Audio-guided audiovisual data segmentation, indexing, and retrieval

    Science.gov (United States)

    Zhang, Tong; Kuo, C.-C. Jay

    1998-12-01

    While current approaches for video segmentation and indexing are mostly focused on visual information, audio signals may actually play a primary role in video content parsing. In this paper, we present an approach for automatic segmentation, indexing, and retrieval of audiovisual data, based on audio content analysis. The accompanying audio signal of audiovisual data is first segmented and classified into basic types, i.e., speech, music, environmental sound, and silence. This coarse-level segmentation and indexing step is based upon morphological and statistical analysis of several short-term features of the audio signals. Then, environmental sounds are classified into finer classes, such as applause, explosions, bird sounds, etc. This fine-level classification and indexing step is based upon time- frequency analysis of audio signals and the use of the hidden Markov model as the classifier. On top of this archiving scheme, an audiovisual data retrieval system is proposed. Experimental results show that the proposed approach has an accuracy rate higher than 90 percent for the coarse-level classification, and higher than 85 percent for the fine-level classification. Examples of audiovisual data segmentation and retrieval are also provided.

  20. Feature Representations for Neuromorphic Audio Spike Streams.

    Science.gov (United States)

    Anumula, Jithendar; Neil, Daniel; Delbruck, Tobi; Liu, Shih-Chii

    2018-01-01

    Event-driven neuromorphic spiking sensors such as the silicon retina and the silicon cochlea encode the external sensory stimuli as asynchronous streams of spikes across different channels or pixels. Combining state-of-art deep neural networks with the asynchronous outputs of these sensors has produced encouraging results on some datasets but remains challenging. While the lack of effective spiking networks to process the spike streams is one reason, the other reason is that the pre-processing methods required to convert the spike streams to frame-based features needed for the deep networks still require further investigation. This work investigates the effectiveness of synchronous and asynchronous frame-based features generated using spike count and constant event binning in combination with the use of a recurrent neural network for solving a classification task using N-TIDIGITS18 dataset. This spike-based dataset consists of recordings from the Dynamic Audio Sensor, a spiking silicon cochlea sensor, in response to the TIDIGITS audio dataset. We also propose a new pre-processing method which applies an exponential kernel on the output cochlea spikes so that the interspike timing information is better preserved. The results from the N-TIDIGITS18 dataset show that the exponential features perform better than the spike count features, with over 91% accuracy on the digit classification task. This accuracy corresponds to an improvement of at least 2.5% over the use of spike count features, establishing a new state of the art for this dataset.
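
    The difference between spike count features and exponential features can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes events are (timestamp in seconds, channel) pairs binned into fixed-duration frames, and that the exponential kernel weights each spike by its distance to the end of its frame. The time constant `tau_s` and all function names are hypothetical.

```python
import math

def spike_count_features(events, num_channels, frame_s, num_frames):
    """Count spikes per channel in each fixed-duration time frame."""
    frames = [[0.0] * num_channels for _ in range(num_frames)]
    for t, ch in events:
        idx = int(t // frame_s)
        if 0 <= idx < num_frames:
            frames[idx][ch] += 1.0
    return frames

def exponential_features(events, num_channels, frame_s, num_frames, tau_s):
    """Weight each spike by exp(-(frame_end - t) / tau_s) instead of by 1.

    Spikes close to the frame end contribute more, so the feature keeps
    some within-frame timing information that a plain count discards.
    """
    frames = [[0.0] * num_channels for _ in range(num_frames)]
    for t, ch in events:
        idx = int(t // frame_s)
        if 0 <= idx < num_frames:
            frame_end = (idx + 1) * frame_s
            frames[idx][ch] += math.exp(-(frame_end - t) / tau_s)
    return frames

# Toy event stream: (timestamp in seconds, channel index).
events = [(0.001, 0), (0.004, 0), (0.009, 1), (0.012, 1)]
counts = spike_count_features(events, num_channels=2, frame_s=0.005, num_frames=3)
exp_feats = exponential_features(events, num_channels=2, frame_s=0.005,
                                 num_frames=3, tau_s=0.005)
```

    In the toy stream, the two spikes on channel 0 give the same count whatever their spacing inside the frame, whereas the exponential feature changes with their timing.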

  1. Realization of guitar audio effects using methods of digital signal processing

    Science.gov (United States)

    Buś, Szymon; Jedrzejewski, Konrad

    2015-09-01

    The paper is devoted to studies on possibilities of realization of guitar audio effects by means of methods of digital signal processing. As a result of research, some selected audio effects corresponding to the specifics of guitar sound were realized as the real-time system called Digital Guitar Multi-effect. Before implementation in the system, the selected effects were investigated using the dedicated application with a graphical user interface created in Matlab environment. In the second stage, the real-time system based on a microcontroller and an audio codec was designed and realized. The system is designed to perform audio effects on the output signal of an electric guitar.

  2. Highlight summarization in golf videos using audio signals

    Science.gov (United States)

    Kim, Hyoung-Gook; Kim, Jin Young

    2008-01-01

    In this paper, we present an automatic summarization of highlights in golf videos based on audio information alone without video information. The proposed highlight summarization system is carried out based on semantic audio segmentation and detection on action units from audio signals. Studio speech, field speech, music, and applause are segmented by means of sound classification. Swing is detected by the methods of impulse onset detection. Sounds like swing and applause form a complete action unit, while studio speech and music parts are used to anchor the program structure. With the advantage of highly precise detection of applause, highlights are extracted effectively. Our experimental results obtain high classification precision on 18 golf games. It proves that the proposed system is very effective and computationally efficient to apply the technology to embedded consumer electronic devices.

  3. AudioRegent: Exploiting SimpleADL and SoX for Digital Audio Delivery

    Directory of Open Access Journals (Sweden)

    Nitin Arora

    2010-06-01

    Full Text Available AudioRegent is a command-line Python script currently being used by the University of Alabama Libraries’ Digital Services to create web-deliverable MP3s from regions within archival audio files. In conjunction with a small-footprint XML file called SimpleADL and SoX, an open-source command-line audio editor, AudioRegent batch processes archival audio files, allowing for one or many user-defined regions, particular to each audio file, to be extracted with additional audio processing in a transparent manner that leaves the archival audio file unaltered. Doing so has alleviated many of the tensions of cumbersome workflows, complicated documentation, preservation concerns, and reliance on expensive closed-source GUI audio applications.

  4. Technical Evaluation Report 31: Internet Audio Products (3/3)

    Directory of Open Access Journals (Sweden)

    Jim Rudolph

    2004-08-01

    Full Text Available Two contrasting additions to the online audio market are reviewed: iVocalize, a browser-based audio-conferencing software, and Skype, a PC-to-PC Internet telephone tool. These products are selected for review on the basis of their success in gaining rapid popular attention and usage during 2003-04. The iVocalize review emphasizes the product’s role in the development of a series of successful online audio communities – notably several serving visually impaired users. The Skype review stresses the ease with which the product may be used for simultaneous PC-to-PC communication among up to five users. Editor’s Note: This paper serves as an introduction to reports about online community building, and reviews of online products for disabled persons, in the next ten reports in this series. JPB, Series Ed.

  5. Efficiency Optimization in Class-D Audio Amplifiers

    DEFF Research Database (Denmark)

    Yamauchi, Akira; Knott, Arnold; Jørgensen, Ivan Harald Holger

    2015-01-01

    This paper presents a new power efficiency optimization routine for designing Class-D audio amplifiers. The proposed optimization procedure finds design parameters for the power stage and the output filter, and the optimum switching frequency such that the weighted power losses are minimized under...... the given constraints. The optimization routine is applied to minimize the power losses in a 130 W class-D audio amplifier based on consumer behavior investigations, where the amplifier operates at idle and low power levels most of the time. Experimental results demonstrate that the optimization method can...... lead to around 30 % of efficiency improvement at 1.3 W output power without significant effects on both audio performance and the efficiency at high power levels....

  6. Image and audio wavelet integration for home security video compression

    Science.gov (United States)

    Cheng, Yu-Shen; Huang, Gen-Dow

    2002-03-01

    We present a novel wavelet compression algorithm for both audio and images that yields results acceptable to human perception. It is well known that the Discrete Wavelet Transform (DWT) provides a global multiresolution decomposition, which is a significant feature for audio and image compression. Experimental simulations show that the proposed audio and image model can satisfy the current industrial communication requirements in terms of processing time and compression fidelity. Development of the wavelet-based compression algorithm considers the trade-offs for hardware implementation. As a result, this high-performance video codec can deliver compact, low-power, high-speed, portable, cost-effective, and lightweight video compression for multimedia and home security applications.

  7. Music Identification System Using MPEG-7 Audio Signature Descriptors

    Science.gov (United States)

    You, Shingchern D.; Chen, Wei-Hwa; Chen, Woei-Kae

    2013-01-01

    This paper describes a multiresolution system based on MPEG-7 audio signature descriptors for music identification. Such an identification system may be used to detect illegally copied music circulated over the Internet. In the proposed system, low-resolution descriptors are used to search likely candidates, and then full-resolution descriptors are used to identify the unknown (query) audio. With this arrangement, the proposed system achieves both high speed and high accuracy. To deal with the problem that a piece of query audio may not be inside the system's database, we suggest two different methods to find the decision threshold. Simulation results show that the proposed method II can achieve an accuracy of 99.4% for query inputs both inside and outside the database. Overall, it is highly possible to use the proposed system for copyright control. PMID:23533359
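
    The coarse-to-fine search described above can be sketched generically. The code below does not use real MPEG-7 audio signature descriptors: descriptors are plain lists of floats, the low-resolution version is a block average, and the distance measure, candidate count, and decision threshold are all hypothetical choices made for illustration.

```python
def downsample(descriptor, factor):
    """Low-resolution descriptor: average each group of `factor` values."""
    groups = [descriptor[i:i + factor] for i in range(0, len(descriptor), factor)]
    return [sum(g) / len(g) for g in groups]

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def identify(query, database, factor=4, num_candidates=2, threshold=0.05):
    """Return the best-matching key, or None if the query seems to be outside the database."""
    # Stage 1: cheap low-resolution comparison to shortlist likely candidates.
    q_low = downsample(query, factor)
    ranked = sorted(
        database,
        key=lambda name: squared_distance(q_low, downsample(database[name], factor)),
    )
    candidates = ranked[:num_candidates]
    if not candidates:
        return None
    # Stage 2: full-resolution comparison on the shortlist only.
    best = min(candidates, key=lambda name: squared_distance(query, database[name]))
    # Decision threshold: reject matches that are too far away.
    if squared_distance(query, database[best]) <= threshold:
        return best
    return None

database = {
    "song_a": [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7],
    "song_b": [0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.0],
    "song_c": [0.5] * 8,
}
noisy_a = [0.01, 0.1, 0.21, 0.3, 0.4, 0.49, 0.6, 0.7]
match_inside = identify(noisy_a, database)
match_outside = identify([5.0] * 8, database)
```

    In practice the low-resolution descriptors would be precomputed and stored rather than recomputed per query; the recomputation here only keeps the sketch short.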

  8. Music Identification System Using MPEG-7 Audio Signature Descriptors

    Directory of Open Access Journals (Sweden)

    Shingchern D. You

    2013-01-01

    Full Text Available This paper describes a multiresolution system based on MPEG-7 audio signature descriptors for music identification. Such an identification system may be used to detect illegally copied music circulated over the Internet. In the proposed system, low-resolution descriptors are used to search likely candidates, and then full-resolution descriptors are used to identify the unknown (query) audio. With this arrangement, the proposed system achieves both high speed and high accuracy. To deal with the problem that a piece of query audio may not be inside the system’s database, we suggest two different methods to find the decision threshold. Simulation results show that the proposed method II can achieve an accuracy of 99.4% for query inputs both inside and outside the database. Overall, it is highly possible to use the proposed system for copyright control.

  9. Design and implementation of a two-way real-time communication system for audio over CATV networks

    Science.gov (United States)

    Cho, Choong Sang; Oh, Yoo Rhee; Lee, Young Han; Kim, Hong Kook

    2007-09-01

    In this paper, we design and implement a two-way real-time communication system for audio over cable television (CATV) networks to provide an audio-based interaction between the CATV broadcasting station and CATV subscribers. The two-way real-time communication system consists of a real-time audio encoding/decoding module, a payload formatter based on a transmission control protocol/Internet protocol (TCP/IP), and a cable network. At the broadcasting station, audio signals from a microphone are encoded by an audio codec that is implemented using a digital signal processor (DSP), where the MPEG-2 Layer II audio codec is used for the audio codec and TMS320C6416 is used for a DSP. Next, a payload formatter constructs a TCP/IP packet from an audio bitstream for transmission to a cable modem. Another payload formatter at the subscriber unpacks the TCP/IP packet decoded from the cable modem into audio bitstream. This bitstream is decoded by the MPEG-2 Layer II audio decoder. Finally the decoded audio signals are played out to the speaker. We confirmed that the system worked in real-time, with a measured delay of around 150 ms including the algorithmic and processing time delays.

  10. Audio-visual gender recognition

    Science.gov (United States)

    Liu, Ming; Xu, Xun; Huang, Thomas S.

    2007-11-01

    Combining different modalities for pattern recognition tasks is a very promising field. Humans routinely fuse information from different modalities to recognize objects, perform inference, etc. Audio-visual gender recognition is one of the most common tasks in human social communication. Humans can identify gender by facial appearance, by speech, and also by body gait. Indeed, human gender recognition is a multi-modal data acquisition and processing procedure. However, computational multimodal gender recognition has not been extensively investigated in the literature. In this paper, speech and facial images are fused to perform multi-modal gender recognition in order to explore the improvement gained by combining different modalities.

  11. Digital audio watermarking fundamentals, techniques and challenges

    CERN Document Server

    Xiang, Yong; Yan, Bin

    2017-01-01

    This book offers comprehensive coverage on the most important aspects of audio watermarking, from classic techniques to the latest advances, from commonly investigated topics to emerging research subdomains, and from the research and development achievements to date, to current limitations, challenges, and future directions. It also addresses key topics such as reversible audio watermarking, audio watermarking with encryption, and imperceptibility control methods. The book sets itself apart from the existing literature in three main ways. Firstly, it not only reviews classical categories of audio watermarking techniques, but also provides detailed descriptions, analysis and experimental results of the latest work in each category. Secondly, it highlights the emerging research topic of reversible audio watermarking, including recent research trends, unique features, and the potentials of this subdomain. Lastly, the joint consideration of audio watermarking and encryption is also reviewed. With the help of this...

  12. Modified BTC Algorithm for Audio Signal Coding

    Directory of Open Access Journals (Sweden)

    TOMIC, S.

    2016-11-01

    Full Text Available This paper describes a modification of a well-known image coding algorithm, named Block Truncation Coding (BTC), and its application in audio signal coding. The BTC algorithm was originally designed for black and white image coding. Since black and white images and audio signals have different statistical characteristics, the application of this image coding algorithm to audio signals presents a novelty and a challenge. Several implementation modifications are described in this paper, while the original idea of the algorithm is preserved. The main modifications are performed in the area of signal quantization, by designing more adequate quantizers for audio signal processing. The result is a novel audio coding algorithm, whose performance is presented and analyzed in this research. The performance analysis indicates that this novel algorithm can be successfully applied in audio signal coding.
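
    For orientation, the original idea of BTC can be sketched on a one-dimensional block of samples. The code below shows only the classic two-level, moment-preserving form of BTC applied to a list of numbers; it does not reproduce the modified quantizers proposed in the paper, and the block size and sample values are arbitrary.

```python
import math

def btc_encode(block):
    """Classic BTC: one bit per sample plus two reconstruction levels.

    The levels are chosen so that the decoded block keeps the mean and
    variance of the original block.
    """
    m = len(block)
    mean = sum(block) / m
    std = math.sqrt(sum((x - mean) ** 2 for x in block) / m)
    bits = [1 if x >= mean else 0 for x in block]
    q = sum(bits)  # number of samples at or above the mean
    if q == 0 or q == m:
        # Degenerate (constant) block: both levels collapse to the mean.
        return mean, mean, bits
    low = mean - std * math.sqrt(q / (m - q))
    high = mean + std * math.sqrt((m - q) / q)
    return low, high, bits

def btc_decode(low, high, bits):
    """Rebuild the block from the bit plane and the two levels."""
    return [high if b else low for b in bits]

block = [2.0, 4.0, 4.0, 6.0, 8.0, 8.0, 10.0, 6.0]
low, high, bits = btc_encode(block)
decoded = btc_decode(low, high, bits)
decoded_mean = sum(decoded) / len(decoded)
decoded_var = sum((x - decoded_mean) ** 2 for x in decoded) / len(decoded)
```

    Each block is thus stored as a bit plane plus two numbers, and the decoded block has the same mean and variance as the original even though individual samples differ.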

  13. Presence and the utility of audio spatialization

    DEFF Research Database (Denmark)

    Bormann, Karsten

    2005-01-01

    The primary concern of this paper is whether the utility of audio spatialization, as opposed to the fidelity of audio spatialization, impacts presence. An experiment is reported that investigates the presence-performance relationship by decoupling spatial audio fidelity (realism) from task...... performance by varying the spatial fidelity of the audio independently of its relevance to performance on the search task that subjects were to perform. This was achieved by having conditions in which subjects searched for a music-playing radio (an active sound source) and having conditions in which...... supplied only nonattenuated audio was detrimental to performance. Even so, this group of subjects consistently had the largest increase in presence scores over the baseline experiment. Further, the Witmer and Singer (1998) presence questionnaire was more sensitive to whether the audio source was active...

  14. Toward Personal and Emotional Connectivity in Mobile Higher Education through Asynchronous Formative Audio Feedback

    Science.gov (United States)

    Rasi, Päivi; Vuojärvi, Hanna

    2018-01-01

    This study aims to develop asynchronous formative audio feedback practices for mobile learning in higher education settings. The development was conducted in keeping with the principles of design-based research. The research activities focused on an inter-university online course, within which the use of instructor audio feedback was tested,…

  15. Multilevel tracking power supply for switch-mode audio power amplifiers

    DEFF Research Database (Denmark)

    Iversen, Niels Elkjær; Lazarevic, Vladan; Vasic, Miroslav

    2018-01-01

    Switch-mode technology is the common choice for high efficiency audio power amplifiers. The dynamic nature of real audio reduces efficiency as less continuous output power can be achieved. Based on methods used for RF amplifiers this paper proposes to employ envelope tracking techniques...

  16. An Exploratory Evaluation of User Interfaces for 3D Audio Mixing

    DEFF Research Database (Denmark)

    Gelineck, Steven; Korsgaard, Dannie Michael

    2015-01-01

    The paper presents an exploratory evaluation comparing different versions of a mid-air gesture based interface for mixing 3D audio exploring: (1) how such an interface generally compares to a more traditional physical interface, (2) methods for grabbing/releasing audio channels in mid-air and (3)...

  17. An Analog I/O Interface Board for Audio Arduino Open Sound Card System

    DEFF Research Database (Denmark)

    Dimitrov, Smilen; Serafin, Stefania

    2011-01-01

    AudioArduino [1] is a system consisting of an ALSA (Advanced Linux Sound Architecture) audio driver and corresponding microcontroller code that can demonstrate full-duplex, mono, 8-bit, 44.1 kHz soundcard behavior on an FTDI based Arduino. While the basic operation as a soundcard can be demonstr...

  18. Making the Switch to Digital Audio

    Directory of Open Access Journals (Sweden)

    Shannon Gwin Mitchell

    2004-12-01

    Full Text Available In this article, the authors describe the process of converting from analog to digital audio data. They address the step-by-step decisions that they made in selecting hardware and software for recording and converting digital audio, issues of system integration, and cost considerations. The authors present a brief description of how digital audio is being used in their current research project and how it has enhanced the “quality” of their qualitative research.

  19. Audio-haptic interaction in simulated walking experiences

    DEFF Research Database (Denmark)

    Serafin, Stefania

    2011-01-01

    In this paper an overview of the work conducted on audio-haptic physically based simulation and evaluation of walking is provided. This work has been performed in the context of the Natural Interactive Walking (NIW) project, whose goal is to investigate possibilities for the integrated and interc...

  20. A Model of Distraction in an Audio-on-Audio Interference Situation with Music Program Material

    DEFF Research Database (Denmark)

    Francombe, J.; Mason, R.; Dewhirst, M.

    2015-01-01

    by a qualitative analysis of subject responses. Distraction ratings were collected for one hundred randomly created audio-on-audio interference situations with music target and interferer programs. The selected features were related to the overall loudness, loudness ratio, perceptual evaluation of audio source...

  1. Temporal phase mask encrypted optical steganography carried by amplified spontaneous emission noise.

    Science.gov (United States)

    Wu, Ben; Wang, Zhenxing; Shastri, Bhavin J; Chang, Matthew P; Frost, Nicholas A; Prucnal, Paul R

    2014-01-13

    A temporal phase mask encryption method is proposed and experimentally demonstrated to improve the security of the stealth channel in an optical steganography system. The stealth channel is protected in two levels. In the first level, the data is carried by amplified spontaneous emission (ASE) noise, which cannot be detected in either the time domain or spectral domain. In the second level, even if the eavesdropper suspects the existence of the stealth channel, each data bit is covered by a fast-changing phase mask. The phase mask code is always combined with the wide band noise from ASE. Without knowing the right phase mask code to recover the stealth data, the eavesdropper can only receive the noise-like signal with randomized phase.

  2. Parallel steganography framework for hiding a color image inside stereo images

    Science.gov (United States)

    Munoz-Ramirez, David O.; Ponomaryov, Volodymyr I.; Reyes-Reyes, Rogelio; Cruz-Ramos, Clara

    2017-05-01

    In this work, a robust steganography framework to hide a color image in stereo images is proposed. The embedding algorithm is performed via the Discrete Cosine Transform (DCT) and Quantization Index Modulation-Dither Modulation (QIM-DM), hiding the secret data. Additionally, the Arnold's Cat Map Transform is applied in order to scramble the secret color image, guaranteeing better security and robustness of the proposed system. The novel framework has demonstrated better performance against JPEG compression attacks than other existing approaches. Besides, the proposed algorithm is developed taking into account the parallel paradigm so that it can be implemented on multi-core CPUs, increasing the processing speed. The results obtained by the proposed framework show high values of PSNR and SSIM, which demonstrate imperceptibility and sufficient robustness against JPEG compression attacks.
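
    The scrambling step can be illustrated with the textbook form of Arnold's cat map on a square N x N image, where the pixel at (x, y) moves to ((x + y) mod N, (x + 2y) mod N). This is a minimal sketch only: the abstract does not state which variant or how many iterations the authors use, and the function names are hypothetical.

```python
def cat_map(image, iterations=1):
    """Scramble a square image (list of rows) with Arnold's cat map."""
    n = len(image)
    if any(len(row) != n for row in image):
        raise ValueError("image must be square")
    for _ in range(iterations):
        out = [[None] * n for _ in range(n)]
        for y in range(n):
            for x in range(n):
                # Pixel at (x, y) moves to (x + y, x + 2y) modulo n.
                out[(x + 2 * y) % n][(x + y) % n] = image[y][x]
        image = out
    return image

def inverse_cat_map(image, iterations=1):
    """Undo cat_map by applying the inverse transform the same number of times."""
    n = len(image)
    for _ in range(iterations):
        out = [[None] * n for _ in range(n)]
        for y in range(n):
            for x in range(n):
                # Inverse mapping: (x, y) goes back to (2x - y, y - x) modulo n.
                out[(y - x) % n][(2 * x - y) % n] = image[y][x]
        image = out
    return image

# Hypothetical 3 x 3 single-channel image.
secret = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
scrambled = cat_map(secret, iterations=2)
restored = inverse_cat_map(scrambled, iterations=2)
```

    Because the map is a bijection on the pixel grid, the receiver can recover the secret image exactly as long as the iteration count is known; for a color image the same permutation would be applied to each channel.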

  3. Reducing audio stimulus presentation latencies across studies, laboratories, and hardware and operating system configurations.

    Science.gov (United States)

    Babjack, Destiny L; Cernicky, Brandon; Sobotka, Andrew J; Basler, Lee; Struthers, Devon; Kisic, Richard; Barone, Kimberly; Zuccolotto, Anthony P

    2015-09-01

    Using differing computer platforms and audio output devices to deliver audio stimuli often introduces (1) substantial variability across labs and (2) variable time between the intended and actual sound delivery (the sound onset latency). Fast, accurate audio onset latencies are particularly important when audio stimuli need to be delivered precisely as part of studies that depend on accurate timing (e.g., electroencephalographic, event-related potential, or multimodal studies), or in multisite studies in which standardization and strict control over the computer platforms used is not feasible. This research describes the variability introduced by using differing configurations and introduces a novel approach to minimizing audio sound latency and variability. A stimulus presentation and latency assessment approach is presented using E-Prime and Chronos (a new multifunction, USB-based data presentation and collection device). The present approach reliably delivers audio stimuli with low latencies that vary by ≤1 ms, independent of hardware and Windows operating system (OS)/driver combinations. The Chronos audio subsystem adopts a buffering, aborting, querying, and remixing approach to the delivery of audio, to achieve a consistent 1-ms sound onset latency for single-sound delivery, and precise delivery of multiple sounds that achieves standard deviations of 1/10th of a millisecond without the use of advanced scripting. Chronos's sound onset latencies are small, reliable, and consistent across systems. Testing of standard audio delivery devices and configurations highlights the need for careful attention to consistency between labs, experiments, and multiple study sites in their hardware choices, OS selections, and adoption of audio delivery systems designed to sidestep the audio latency variability issue.

  4. Viral Computer Warfare via Activation Engine Employing Steganography

    National Research Council Canada - National Science Library

    Lathrop, Dale

    2000-01-01

    ... as a deployment system for cyber-attacks. The results of this research indicate that the use of a separate engine followed by an HTML-based electronic mail message containing a photographic image with a steganographically...

  5. Presence and the utility of audio spatialization

    DEFF Research Database (Denmark)

    Bormann, Karsten

    2005-01-01

    or not, while the presence questionnaire used by Slater and coworkers (see Tromp et al., 1998) was more sensitive to whether audio was fully spatialized or not. Finally, having the sound source active positively impacts the assessment of the audio while negatively impacting subjects' assessment...

  6. Synchronization and comparison of Lifelog audio recordings

    DEFF Research Database (Denmark)

    Nielsen, Andreas Brinch; Hansen, Lars Kai

    2008-01-01

    We investigate concurrent ‘Lifelog’ audio recordings to locate segments from the same environment. We compare two techniques earlier proposed for pattern recognition in extended audio recordings, namely cross-correlation and a fingerprinting technique. If successful, such alignment can be used...
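
    As an illustration of the cross-correlation technique named above, the sketch below finds the sample offset at which one recording best matches another by brute force. It is a simplified stand-in rather than the authors' procedure; the function name and the `max_lag` search window are assumptions, and real recordings would normally call for an FFT-based correlation.

```python
def best_lag(a, b, max_lag):
    """Return the lag d that maximizes sum(a[i] * b[i + d]) for |d| <= max_lag."""
    best, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, x in enumerate(a):
            j = i + lag
            if 0 <= j < len(b):
                # Only overlapping samples contribute to the correlation.
                score += x * b[j]
        if score > best_score:
            best, best_score = lag, score
    return best

# Hypothetical example: b contains the same event as a, two samples later.
a = [0, 0, 1, 2, 1, 0, 0, 0]
b = [0, 0, 0, 0, 1, 2, 1, 0]
lag = best_lag(a, b, max_lag=4)
```

    A positive result means the second recording lags the first by that many samples, so shifting it back by the same amount aligns the two.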

  7. Prediction of perceptual audio reproduction characteristics

    DEFF Research Database (Denmark)

    Volk, Christer Peter

    affects perception. In this project a number of audio metrics are presented, which describe perceptual characteristics in terms of properties of the physical acoustical output of headphones and loudspeakers. The audio metrics rely on perceptual models for estimations of how these acoustical outputs...

  8. Estimation of macro sleep stages from whole night audio analysis.

    Science.gov (United States)

    Dafna, E; Halevi, M; Ben Or, D; Tarasiuk, A; Zigel, Y

    2016-08-01

    During routine sleep diagnostic procedure, sleep is broadly divided into three states: rapid eye movement (REM), non-REM (NREM) states, and wake, frequently named macro-sleep stages (MSS). In this study, we present a pioneering attempt for MSS detection using full night audio analysis. Our working hypothesis is that there might be differences in sound properties within each MSS due to breathing efforts (or snores) and body movements in bed. In this study, audio signals of 35 patients referred to a sleep laboratory were recorded and analyzed. An additional 178 subjects were used to train a probabilistic time-series model for MSS staging across the night. The audio-based system was validated on 20 out of the 35 subjects. System accuracy for estimating (detecting) epoch-by-epoch wake/REM/NREM states for a given subject is 74% (69% for wake, 54% for REM, and 79% NREM). Mean error (absolute difference) was 36±34 min for detecting total sleep time, 17±21 min for sleep latency, 5±5% for sleep efficiency, and 7±5% for REM percentage. These encouraging results indicate that audio-based analysis can provide a simple and comfortable alternative method for ambulatory evaluation of sleep and its disorders.

  9. Digital signal processor for silicon audio playback devices; Silicon audio saisei kikiyo digital signal processor

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    2000-03-01

    The digital audio signal processor (DSP) TC9446F series has been developed for silicon audio playback devices with a memory medium such as flash memory, DVD players, and AV devices such as TV sets. It supports AAC (advanced audio coding) (2ch) and MP3 (MPEG1 Layer3), the audio compression techniques used for transmitting music over the Internet. It also supports compressed formats such as Dolby Digital, DTS (digital theater system) and MPEG2 audio, which are adopted for media such as DVDs. It can carry built-in audio signal processing programs such as Dolby ProLogic, equalizer, sound field control, and 3D sound. The TC9446XB has been newly added to the lineup. It adopts an FBGA (fine pitch ball grid array) package for portable audio devices. (translated by NEDO)

  10. Steganography on quantum pixel images using Shannon entropy

    Science.gov (United States)

    Laurel, Carlos Ortega; Dong, Shi-Hai; Cruz-Irisson, M.

    2016-07-01

    This paper presents a steganographic algorithm based on the least significant bit (LSB) derived from the most significant bit information (MSBI) and on the equivalence of a bit pixel image to a quantum pixel image, which allows information to be communicated secretly in quantum pixel images for secure transmission through insecure channels. This algorithm offers higher security since it exploits the Shannon entropy of an image.
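
    For readers unfamiliar with LSB embedding, the following sketch shows the plain classical version on integer pixel values. It does not model the quantum pixel representation, the MSBI step, or the entropy criterion described in the abstract; the function names and sample values are assumptions for illustration.

```python
def embed_lsb(cover, bits):
    """Write one message bit into the least significant bit of each cover value."""
    if len(bits) > len(cover):
        raise ValueError("message is longer than the cover")
    stego = list(cover)
    for i, bit in enumerate(bits):
        # Clear the lowest bit, then set it to the message bit.
        stego[i] = (stego[i] & ~1) | bit
    return stego

def extract_lsb(stego, n_bits):
    """Read the first n_bits message bits back from the stego values."""
    return [value & 1 for value in stego[:n_bits]]

# Hypothetical 8-bit pixel values and a 4-bit message.
cover = [200, 13, 54, 255]
message = [1, 0, 1, 0]
stego = embed_lsb(cover, message)
recovered = extract_lsb(stego, len(message))
```

    Each cover value changes by at most 1, which is why LSB embedding is hard to notice by eye but also easy to destroy with lossy processing.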

  11. Turkish Music Genre Classification using Audio and Lyrics Features

    Directory of Open Access Journals (Sweden)

    Önder ÇOBAN

    2017-05-01

    Full Text Available Music Information Retrieval (MIR) has become a popular research area in recent years. In this context, researchers have developed music information systems to find solutions for such major problems as automatic playlist creation, hit song detection, and music genre or mood classification. Meta-data information, lyrics, or melodic content of music have been used as feature resources in previous works. However, lyrics are not often used in MIR systems, and the number of works in this field is insufficient, especially for Turkish. In this paper, firstly, we have extended our previously created Turkish MIR (TMIR) dataset, which comprises Turkish lyrics, by including the audio file of each song. Secondly, we have investigated the effect of using audio and textual features together or separately on automatic Music Genre Classification (MGC). We have extracted textual features from lyrics using different feature extraction models such as word2vec and traditional Bag of Words. We have conducted our experiments with the Support Vector Machine (SVM) algorithm and analysed the impact of feature selection and different feature groups on MGC. We have considered lyrics based MGC as a text classification task and also investigated the effect of the term weighting method. Experimental results show that textual features can be as effective as audio features for Turkish MGC, especially when a supervised term weighting method is employed. We have achieved the highest success rate of 99.12% by using both audio and textual features together.

  12. Video genre categorization and representation using audio-visual information

    Science.gov (United States)

    Ionescu, Bogdan; Seyerlehner, Klaus; Rasche, Christoph; Vertan, Constantin; Lambert, Patrick

    2012-04-01

    We propose an audio-visual approach to video genre classification using content descriptors that exploit audio, color, temporal, and contour information. Audio information is extracted at block-level, which has the advantage of capturing local temporal information. At the temporal structure level, we consider action content in relation to human perception. Color perception is quantified using statistics of color distribution, elementary hues, color properties, and relationships between colors. Further, we compute statistics of contour geometry and relationships. The main contribution of our work lies in harnessing the descriptive power of the combination of these descriptors in genre classification. Validation was carried out on over 91 h of video footage encompassing 7 common video genres, yielding average precision and recall ratios of 87% to 100% and 77% to 100%, respectively, and an overall average correct classification of up to 97%. Also, experimental comparison as part of the MediaEval 2011 benchmarking campaign demonstrated the efficiency of the proposed audio-visual descriptors over other existing approaches. Finally, we discuss a 3-D video browsing platform that displays movies using feature-based coordinates and thus regroups them according to genre.

  13. High-Fidelity Piezoelectric Audio Device

    Science.gov (United States)

    Woodward, Stanley E.; Fox, Robert L.; Bryant, Robert G.

    2003-01-01

    ModalMax is a very innovative means of harnessing the vibration of a piezoelectric actuator to produce an energy efficient low-profile device with high-bandwidth high-fidelity audio response. The piezoelectric audio device outperforms many commercially available speakers made using speaker cones. The piezoelectric device weighs substantially less (4 g) than the speaker cones which use magnets (10 g). ModalMax devices have extreme fabrication simplicity. The entire audio device is fabricated by lamination. The simplicity of the design lends itself to lower cost. The piezoelectric audio device can be used without its acoustic chambers, resulting in a very low thickness of 0.023 in. (0.58 mm). The piezoelectric audio device can be completely encapsulated, which makes it very attractive for use in wet environments. Encapsulation does not significantly alter the audio response. Its small size (see Figure 1) is applicable to many consumer electronic products, such as pagers, portable radios, headphones, laptop computers, computer monitors, toys, and electronic games. The audio device can also be used in automobile or aircraft sound systems.

  14. Method for reading sensors and controlling actuators using audio interfaces of mobile devices.

    Science.gov (United States)

    Aroca, Rafael V; Burlamaqui, Aquiles F; Gonçalves, Luiz M G

    2012-01-01

    This article presents a novel closed loop control architecture based on audio channels of several types of computing devices, such as mobile phones and tablet computers, but not restricted to them. The communication is based on an audio interface that relies on the exchange of audio tones, allowing sensors to be read and actuators to be controlled. As an application example, the presented technique is used to build a low cost mobile robot, but the system can also be used in a variety of mechatronics applications and sensor networks, where smartphones are the basic building blocks.

  15. Method for Reading Sensors and Controlling Actuators Using Audio Interfaces of Mobile Devices

    Science.gov (United States)

    Aroca, Rafael V.; Burlamaqui, Aquiles F.; Gonçalves, Luiz M. G.

    2012-01-01

    This article presents a novel closed loop control architecture based on audio channels of several types of computing devices, such as mobile phones and tablet computers, but not restricted to them. The communication is based on an audio interface that relies on the exchange of audio tones, allowing sensors to be read and actuators to be controlled. As an application example, the presented technique is used to build a low cost mobile robot, but the system can also be used in a variety of mechatronics applications and sensor networks, where smartphones are the basic building blocks. PMID:22438726

  16. Musical Audio Synthesis Using Autoencoding Neural Nets

    OpenAIRE

    Sarroff, Andy; Casey, Michael A.

    2014-01-01

    With an optimal network topology and tuning of hyperparameters, artificial neural networks (ANNs) may be trained to learn a mapping from low level audio features to one or more higher-level representations. Such artificial neural networks are commonly used in classification and regression settings to perform arbitrary tasks. In this work we suggest repurposing autoencoding neural networks as musical audio synthesizers. We offer an interactive musical audio synt...

  17. Audio-Visual Classification of Sports Types

    DEFF Research Database (Denmark)

    Gade, Rikke; Abou-Zleikha, Mohamed; Christensen, Mads Græsbøll

    2015-01-01

    In this work we propose a method for classification of sports types from combined audio and visual features extracted from thermal video. From audio, Mel Frequency Cepstral Coefficients (MFCC) are extracted, and PCA is applied to reduce the feature space to 10 dimensions. From the visual modality...... short trajectories are constructed to represent the motion of players. From these, four motion features are extracted and combined directly with audio features for classification. A k-nearest neighbour classifier is applied for classification of 180 1-minute video sequences from three sports types...

  18. 3D Audio Acquisition and Reproduction Systems

    OpenAIRE

    Evrard, Marc; André, Cédric; Embrechts, Jean-Jacques; Verly, Jacques

    2011-01-01

    This presentation introduces two different research projects dealing with 3D audio for 3D-stereoscopic movies. The first project “3D audio acquisition for real time applications” studies the best method for acquiring a full 3D audio soundscape on location and for processing it in real-time for further reproduction. The second project “Adding 3D sound to 3D cinema” is aimed towards the study of reproducing a 3D soundscape consistent with the visual content of a 3D-stereoscopic movie. ...

  19. Amplitude Modulated Sinusoidal Signal Decomposition for Audio Coding

    DEFF Research Database (Denmark)

    Christensen, M. G.; Jacobson, A.; Andersen, S. V.

    2006-01-01

    In this paper, we present a decomposition for sinusoidal coding of audio, based on an amplitude modulation of sinusoids via a linear combination of arbitrary basis vectors. The proposed method, which incorporates a perceptual distortion measure, is based on a relaxation of a nonlinear least......-squares minimization. Rate-distortion curves and listening tests show that, compared to a constant-amplitude sinusoidal coder, the proposed decomposition offers perceptually significant improvements in critical transient signals....

  20. Virtual Microphones for Multichannel Audio Resynthesis

    Directory of Open Access Journals (Sweden)

    Athanasios Mouchtaris

    2003-09-01

    Full Text Available Multichannel audio offers significant advantages for music reproduction, including the ability to provide better localization and envelopment, as well as reduced imaging distortion. On the other hand, multichannel audio is a demanding media type in terms of transmission requirements. Often, bandwidth limitations prohibit transmission of multiple audio channels. In such cases, an alternative is to transmit only one or two reference channels and recreate the rest of the channels at the receiving end. Here, we propose a system capable of synthesizing the required signals from a smaller set of signals recorded in a particular venue. These synthesized “virtual” microphone signals can be used to produce multichannel recordings that accurately capture the acoustics of that venue. Applications of the proposed system include transmission of multichannel audio over the current Internet infrastructure and, as an extension of the methods proposed here, remastering existing monophonic and stereophonic recordings for multichannel rendering.

  1. Spatial audio reproduction with primary ambient extraction

    CERN Document Server

    He, JianJun

    2017-01-01

    This book first introduces the background of spatial audio reproduction, with different types of audio content and for different types of playback systems. A literature study on the classical and emerging Primary Ambient Extraction (PAE) techniques is presented. The emerging techniques aim to improve the extraction performance and also enhance the robustness of PAE approaches in dealing with more complex signals encountered in practice. The in-depth theoretical study helps readers to understand the rationales behind these approaches. Extensive objective and subjective experiments validate the feasibility of applying PAE in spatial audio reproduction systems. These experimental results, together with some representative audio examples and MATLAB codes of the key algorithms, illustrate clearly the differences among various approaches and also help readers gain insights on selecting different approaches for different applications.

  2. Parametric time-frequency domain spatial audio

    CERN Document Server

    Delikaris-Manias, Symeon; Politis, Archontis

    2018-01-01

    This book provides readers with the principles and best practices in spatial audio signal processing. It describes how sound fields and their perceptual attributes are captured and analyzed within the time-frequency domain, how essential representation parameters are coded, and how such signals are efficiently reproduced for practical applications. The book is split into four parts starting with an overview of the fundamentals. It then goes on to explain the reproduction of spatial sound before offering an examination of signal-dependent spatial filtering. The book finishes with coverage of both current and future applications and the direction that spatial audio research is heading in. Parametric Time-frequency Domain Spatial Audio focuses on applications in entertainment audio, including music, home cinema, and gaming--covering the capturing and reproduction of spatial sound as well as its generation, transduction, representation, transmission, and perception. This book will teach readers the tools needed...

  3. Audio production principles practical studio applications

    CERN Document Server

    Elmosnino, Stephane

    2018-01-01

    A new and fully practical guide to all of the key topics in audio production, this book covers the entire workflow from pre-production, to recording all kinds of instruments, to mixing theories and tools, and finally to mastering.

  4. Audio Description as a Pedagogical Tool

    OpenAIRE

    Georgina Kleege; Scott Wallin

    2015-01-01

    Audio description is the process of translating visual information into words for people who are blind or have low vision. Typically such description has focused on films, museum exhibitions, images and video on the internet, and live theater. Because it allows people with visual impairments to experience a variety of cultural and educational texts that would otherwise be inaccessible, audio description is a mandated aspect of disability inclusion, although it remains markedly underdeveloped ...

  5. Audio description as an accessibility enhancer

    OpenAIRE

    Martins, Cláudia Susana Nunes

    2012-01-01

    Audio description for the blind and visually-impaired has been around since people have described what is seen. Throughout time, it has evolved and developed in different contexts, starting with daily life, moving into the cinema and television, then across other performing arts, museums and galleries, historical sites and public places. Audio description is above all an issue of accessibility and of providing visually-impaired people with the same rights to have access to culture, e...

  6. The Effect Of 3D Audio And Other Audio Techniques On Virtual Reality Experience.

    Science.gov (United States)

    Brinkman, Willem-Paul; Hoekstra, Allart R D; van Egmond, René

    2015-01-01

    Three studies were conducted to examine the effect of audio on people's experience in a virtual world. The first study showed that people could distinguish between mono, stereo, Dolby surround and 3D audio of a wasp. The second study found significant effects for audio techniques on people's self-reported anxiety, presence, and spatial perception. The third study found that adding sound to a visual virtual world had a significant effect on people's experience (including heart rate), while it found no difference in experience between stereo and 3D audio.

  7. Learning Audio - Sheet Music Correspondences for Score Identification and Offline Alignment

    OpenAIRE

    Dorfer, Matthias; Arzt, Andreas; Widmer, Gerhard

    2017-01-01

    This work addresses the problem of matching short excerpts of audio with their respective counterparts in sheet music images. We show how to employ neural network-based cross-modality embedding spaces for solving the following two sheet music-related tasks: retrieving the correct piece of sheet music from a database when given a music audio as a search query; and aligning an audio recording of a piece with the corresponding images of sheet music. We demonstrate the feasibility of this in expe...

  8. Design and implementation of an audio indicator

    Science.gov (United States)

    Zheng, Shiyong; Li, Zhao; Li, Biqing

    2017-04-01

    This paper proposes an audio indicator designed using a C9014, an LED level indicator driven by an operational amplifier, and a CD4017 decade counter/distributor. The circuit can control neon and holiday lights from the audio signal. The input audio signal is amplified by an amplifier stage composed of the C9014; a potentiometer then sets the level of the amplified signal fed to the CD4017, which drives it to count, and the connected LEDs display the running state of the circuit. This simple audio indicator uses only U1 and can produce a two-color LED chase effect that follows the audio signal; from the running pattern of the LEDs, one can read the general variation of the audio signal, its frequency, and the corresponding level. The lights can achieve four forms (jump, slow change, atlas, and lighting), and the indicator has a wide range of uses in homes, hotels, discos, theaters, advertising and other fields, enriching life in modern society.

  9. Tensorial dynamic time warping with articulation index representation for efficient audio-template learning.

    Science.gov (United States)

    Le, Long N; Jones, Douglas L

    2018-03-01

    Audio classification techniques often depend on the availability of a large labeled training dataset for successful performance. However, in many application domains of audio classification (e.g., wildlife monitoring), obtaining labeled data is still a costly and laborious process. Motivated by this observation, a technique is proposed to efficiently learn a clean template from a few labeled, but likely corrupted (by noise and interferences), data samples. This learning can be done efficiently via tensorial dynamic time warping on the articulation index-based time-frequency representations of audio data. The learned template can then be used in audio classification following the standard template-based approach. Experimental results show that the proposed approach outperforms both (1) the recurrent neural network approach and (2) the state-of-the-art in the template-based approach on a wildlife detection application with few training samples.
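
    The alignment idea underlying this approach can be sketched with ordinary dynamic time warping (DTW) on two scalar sequences. This is the standard textbook recurrence, not the tensorial variant or the articulation index representation proposed by the authors; the function name and the absolute-difference cost are assumptions.

```python
def dtw_distance(a, b):
    """Classic DTW cost between two numeric sequences using |x - y| as local cost."""
    inf = float("inf")
    n, m = len(a), len(b)
    # d[i][j] holds the best cumulative cost of aligning a[:i] with b[:j].
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            d[i][j] = cost + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[n][m]

# Hypothetical feature tracks: the second repeats one frame of the first.
template = [1, 2, 3]
sample = [1, 2, 2, 3]
distance = dtw_distance(template, sample)
```

    Because warping absorbs differences in timing, a time-stretched copy of a template scores a low distance, which is what makes template-based classification workable with only a few labeled samples.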

  10. The Audio Description as a Physics Teaching Tool

    Science.gov (United States)

    Cozendey, Sabrina; Costa, Maria da Piedade

    2016-01-01

    This study analyses the use of audio description in teaching physics concepts, aiming to determine the variables that influence the understanding of the concept. One education resource was audio described. To make the audio description, the screen was frozen. The video with and without audio description should be presented to students, so that…

  11. Entropy coding of Quantized Spectral Components in FDLP audio codec

    OpenAIRE

    Motlicek, Petr; Ganapathy, Sriram; Hermansky, Hynek

    2008-01-01

    An audio codec based on Frequency Domain Linear Prediction (FDLP) exploits auto-regressive modeling to approximate instantaneous energy in critical frequency sub-bands of relatively long input segments. The current version of the FDLP codec operating at 66 kbps has been shown to provide subjective listening quality comparable to state-of-the-art codecs at similar bit-rates, even without employing strategic blocks such as entropy coding or simultaneous masking. This paper describes an experime...

  12. A high-capacity steganography scheme for JPEG2000 baseline system.

    Science.gov (United States)

    Zhang, Liang; Wang, Haili; Wu, Renbiao

    2009-08-01

    Hiding capacity is very important for efficient covert communications. For JPEG2000 compressed images, it is necessary to enlarge the hiding capacity because the available redundancy is very limited. In addition, the bitstream truncation makes it difficult to hide information. In this paper, a high-capacity steganography scheme is proposed for the JPEG2000 baseline system, which uses bit-plane encoding procedure twice to solve the problem due to bitstream truncation. Moreover, embedding points and their intensity are determined in a well defined quantitative manner via redundancy evaluation to increase hiding capacity. The redundancy is measured by bit, which is different from conventional methods which adjust the embedding intensity by multiplying a visual masking factor. High volumetric data is embedded into bit-planes as low as possible to keep message integrality, but at the cost of an extra bit-plane encoding procedure and slightly changed compression ratio. The proposed method can be easily integrated into the JPEG2000 image coder, and the produced stego-bitstream can be decoded normally. Simulation shows that the proposed method is feasible, effective, and secure.

  13. A graph theory practice on transformed image: a random image steganography.

    Science.gov (United States)

    Thanikaiselvan, V; Arulmozhivarman, P; Subashanthini, S; Amirtharajan, Rengarajan

    2013-01-01

    Modern day information age is enriched with the advanced network communication expertise but unfortunately at the same time encounters infinite security issues when dealing with secret and/or private information. The storage and transmission of the secret information become highly essential and have led to a deluge of research in this field. In this paper, an optimistic effort has been taken to combine graceful graph along with integer wavelet transform (IWT) to implement random image steganography for secure communication. The implementation part begins with the conversion of cover image into wavelet coefficients through IWT and is followed by embedding secret image in the randomly selected coefficients through graph theory. Finally stegoimage is obtained by applying inverse IWT. This method provides a maximum of 44 dB peak signal to noise ratio (PSNR) for 266646 bits. Thus, the proposed method gives high imperceptibility through high PSNR value and high embedding capacity in the cover image due to adaptive embedding scheme and high robustness against blind attack through graph theoretic random selection of coefficients.

  14. Comparison of Linear Prediction Models for Audio Signals

    Directory of Open Access Journals (Sweden)

    2009-03-01

    Full Text Available While linear prediction (LP) has become immensely popular in speech modeling, it does not seem to provide a good approach for modeling audio signals. This is somewhat surprising, since a tonal signal consisting of a number of sinusoids can be perfectly predicted based on an (all-pole) LP model with a model order that is twice the number of sinusoids. We provide an explanation why this result cannot simply be extrapolated to LP of audio signals. If noise is taken into account in the tonal signal model, a low-order all-pole model appears to be only appropriate when the tonal components are uniformly distributed in the Nyquist interval. Based on this observation, different alternatives to the conventional LP model can be suggested. Either the model should be changed to a pole-zero, a high-order all-pole, or a pitch prediction model, or the conventional LP model should be preceded by an appropriate frequency transform, such as a frequency warping or downsampling. By comparing these alternative LP models to the conventional LP model in terms of frequency estimation accuracy, residual spectral flatness, and perceptual frequency resolution, we obtain several new and promising approaches to LP-based audio modeling.
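
    The claim that K noiseless sinusoids are perfectly predicted by an all-pole model of order 2K can be checked numerically. The sketch below relies on the standard identity that a single sinusoid of angular frequency w satisfies x[n] = 2cos(w)x[n-1] - x[n-2], and multiplies one such second-order factor per sinusoid. The frequencies, amplitudes, and function names are assumptions chosen for illustration; the noisy case discussed in the abstract is not covered.

```python
import math


def sinusoid_predictor(freqs):
    """Coefficients a[0..2K-1] such that x[n] = sum_k a[k] * x[n-1-k]
    for a noiseless sum of K sinusoids with the given angular frequencies."""
    poly = [1.0]  # A(z) = prod_k (1 - 2cos(w_k) z^-1 + z^-2)
    for w in freqs:
        factor = [1.0, -2.0 * math.cos(w), 1.0]
        out = [0.0] * (len(poly) + 2)
        for i, p in enumerate(poly):
            for j, f in enumerate(factor):
                out[i + j] += p * f  # polynomial multiplication
        poly = out
    return [-c for c in poly[1:]]


def max_prediction_error(x, a):
    """Largest absolute one-step prediction error over the signal."""
    order = len(a)
    return max(
        abs(x[n] - sum(a[k] * x[n - 1 - k] for k in range(order)))
        for n in range(order, len(x))
    )


# Two sinusoids (hypothetical values), so an order-4 predictor should be exact.
freqs = [0.3, 1.1]
x = [math.sin(0.3 * n + 0.5) + 0.7 * math.cos(1.1 * n) for n in range(200)]
a_full = sinusoid_predictor(freqs)        # order 4
a_short = sinusoid_predictor(freqs[:1])   # order 2, too low for two sinusoids
err_full = max_prediction_error(x, a_full)
err_short = max_prediction_error(x, a_short)
```

    With the order-4 predictor the error is at floating-point rounding level, while the order-2 predictor leaves the second sinusoid in the residual.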

  15. Implementation of Audio signal by using wavelet transform

    OpenAIRE

    Chakresh kumar,; Chandra Shekhar; Ashu Soni; Bindu Thakral

    2010-01-01

    Audio coding is the technology to represent audio in digital form with as few bits as possible while maintaining the intelligibility and quality required for particular application. Interest in audio coding is motivated by the evolution to digital communications and the requirement to minimize bit rate, and hence conserve bandwidth. There is always a tradeoff between compression ratio and maintaining the delivered audio quality and intelligibility. Audio coding is widely used in application s...

  16. “Wrapping” X3DOM around Web Audio API

    Directory of Open Access Journals (Sweden)

    Andreas Stamoulias

    2015-12-01

    Full Text Available Spatial sound has a conceptual role in Web3D environments, due to the highly realistic scenes that it can provide. Lately, efforts have concentrated on the extension of X3D/X3DOM through spatial sound attributes. This paper presents a novel method for the introduction of spatial sound components in the X3DOM framework, based on the X3D specification and the Web Audio API. The proposed method incorporates the introduction of enhanced sound nodes for X3DOM which are derived from the implementation of the X3D standard components, enriched with additional features of the Web Audio API. Moreover, several example scenarios were developed for the evaluation of our approach. The implemented examples established the feasibility of the newly registered nodes in X3DOM, for spatial sound characteristics in Web3D virtual worlds.

  17. Audio teleconferencing: creative use of a forgotten innovation.

    Science.gov (United States)

    Mather, Carey; Marlow, Annette

    2012-06-01

    As part of a regional School of Nursing and Midwifery's commitment to addressing recruitment and retention issues, approximately 90% of second year undergraduate student nurses undertake clinical placements at: multipurpose centres; regional or district hospitals; aged care; or community centres based in rural and remote regions within the State. The remaining 10% undertake professional experience placement in urban areas only. This placement of a large cohort of students, in low numbers in a variety of clinical settings, initiated the need to provide consistent support to both students and staff at these facilities. Subsequently the development of an audio teleconferencing model of clinical facilitation to guide student teaching and learning and to provide support to registered nurse preceptors in clinical practice was developed. This paper draws on Weimer's 'Personal Accounts of Change' approach to describe, discuss and evaluate the modifications that have occurred since the inception of this audio teleconferencing model (Weimer, 2006).

  18. Underdetermined Blind Audio Source Separation Using Modal Decomposition

    Directory of Open Access Journals (Sweden)

    Abdeldjalil Aïssa-El-Bey

    2007-03-01

    Full Text Available This paper introduces new algorithms for the blind separation of audio sources using modal decomposition. Indeed, audio signals and, in particular, musical signals can be well approximated by a sum of damped sinusoidal (modal) components. Based on this representation, we propose a two-step approach consisting of a signal analysis (extraction of the modal components) followed by a signal synthesis (grouping of the components belonging to the same source) using vector clustering. For the signal analysis, two existing algorithms are considered and compared: namely the EMD (empirical mode decomposition) algorithm and a parametric estimation algorithm using the ESPRIT technique. A major advantage of the proposed method resides in its validity for both instantaneous and convolutive mixtures and its ability to separate more sources than sensors. Simulation results are given to compare and assess the performance of the proposed algorithms.

  19. Underdetermined Blind Audio Source Separation Using Modal Decomposition

    Directory of Open Access Journals (Sweden)

    Aïssa-El-Bey Abdeldjalil

    2007-01-01

    Full Text Available This paper introduces new algorithms for the blind separation of audio sources using modal decomposition. Indeed, audio signals and, in particular, musical signals can be well approximated by a sum of damped sinusoidal (modal) components. Based on this representation, we propose a two-step approach consisting of a signal analysis (extraction of the modal components) followed by a signal synthesis (grouping of the components belonging to the same source) using vector clustering. For the signal analysis, two existing algorithms are considered and compared: namely the EMD (empirical mode decomposition) algorithm and a parametric estimation algorithm using the ESPRIT technique. A major advantage of the proposed method resides in its validity for both instantaneous and convolutive mixtures and its ability to separate more sources than sensors. Simulation results are given to compare and assess the performance of the proposed algorithms.

  20. Audio Visual Media Components in Educational Game for Elementary Students

    Directory of Open Access Journals (Sweden)

    Meilani Hartono

    2016-12-01

    Full Text Available The purpose of this research was to review and implement interactive audio visual media used in an educational game to improve elementary students’ interest in learning mathematics. The game was developed for the desktop platform. The art of the game was set as 2D cartoon art with animation and audio in order to make students more interested. Four mini games were developed based on research on mathematics study. The development method used was the Multimedia Development Life Cycle (MDLC), which consists of requirement, design, development, testing, and implementation phases. The data collection methods used were questionnaire, literature study, and interview. The conclusion is that elementary students are interested in educational games that are fun and active (moving objects), with fast-tempo music and carefree colors like blue. This educational game is hoped to be an alternative teaching tool combined with conventional teaching methods.

  1. Using the ENF Criterion for Determining the Time of Recording of Short Digital Audio Recordings

    Science.gov (United States)

    Huijbregtse, Maarten; Geradts, Zeno

    The Electric Network Frequency (ENF) Criterion is a recently developed forensic technique for determining the time of recording of digital audio recordings, by matching the ENF pattern from a questioned recording with an ENF pattern database. In this paper we discuss its inherent limitations in the case of short - i.e., less than 10 minutes in duration - digital audio recordings. We also present a matching procedure based on the correlation coefficient, as a more robust alternative to squared error matching.
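
    Matching by correlation coefficient can be sketched as a sliding-window search over the reference database. The example below is an illustration under stated assumptions, not the authors' procedure: the database is synthetic (seeded Gaussian fluctuations around 50 Hz), the questioned pattern is a copy of one window with a constant 0.01 Hz bias added, and all names are hypothetical. The Pearson coefficient is unaffected by a constant offset, so the biased copy still scores close to 1 at the true position.

```python
import math
import random


def pearson(u, v):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    num = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    den = math.sqrt(sum((a - mu) ** 2 for a in u) * sum((b - mv) ** 2 for b in v))
    return num / den if den else 0.0


def best_match(query, database):
    """Slide the query over the database; return (offset, score) of the best window."""
    scores = [
        pearson(query, database[s:s + len(query)])
        for s in range(len(database) - len(query) + 1)
    ]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]


# Synthetic ENF database (assumption): one value per time step around 50 Hz.
rng = random.Random(0)
database = [50.0 + rng.gauss(0.0, 0.02) for _ in range(300)]
# Questioned recording: window starting at index 120, with a constant bias.
query = [v + 0.01 for v in database[120:150]]
offset, score = best_match(query, database)
```

    The returned offset corresponds to the estimated time of recording within the database.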

  2. Big Data Analytics: Challenges And Applications For Text, Audio, Video, And Social Media Data

    OpenAIRE

    Jai Prakash Verma; Smita Agrawal; Bankim Patel; Atul Patel

    2016-01-01

    All types of machine automated systems are generating large amounts of data in different forms, such as statistical, text, audio, video, sensor, and bio-metric data, which has given rise to the term Big Data. In this paper we discuss the issues, challenges, and applications of these types of Big Data with consideration of the big data dimensions. Here we are discussing social media data analytics, content based analytics, text data analytics, audio, and video data analytics their issues and expected applica...

  3. Automatic summarization of soccer highlights using audio-visual descriptors.

    Science.gov (United States)

    Raventós, A; Quijada, R; Torres, Luis; Tarrés, Francesc

    2015-01-01

    Automatic summarization generation of sports video content has been an object of great interest for many years. Although semantic description techniques have been proposed, many of the approaches still rely on low-level video descriptors that render quite limited results due to the complexity of the problem and to the low capability of the descriptors to represent semantic content. In this paper, a new approach for automatic highlights summarization generation of soccer videos using audio-visual descriptors is presented. The approach is based on the segmentation of the video sequence into shots that will be further analyzed to determine their relevance and interest. Of special interest in the approach is the use of the audio information that provides additional robustness to the overall performance of the summarization system. For every video shot a set of low and mid level audio-visual descriptors are computed and later combined in order to obtain different relevance measures based on empirical knowledge rules. The final summary is generated by selecting those shots with highest interest according to the specifications of the user and the results of relevance measures. A variety of results are presented with real soccer video sequences that prove the validity of the approach.

  4. FPGA-based implementation for steganalysis: a JPEG-compatibility algorithm

    Science.gov (United States)

    Gutierrez-Fernandez, E.; Portela-García, M.; Lopez-Ongil, C.; Garcia-Valderas, M.

    2013-05-01

    Steganalysis is a process to detect hidden data in cover documents, like digital images, videos, audio files, etc. This is the inverse process of steganography, which is the method used to hide secret messages. The wide use of computers and network technologies makes digital files a very easy-to-use means for storing secret data or transmitting secret messages through the Internet. Depending on the cover medium used to embed the data, there are different steganalysis methods. In case of images, many of the steganalysis and steganographic methods are focused on JPEG image formats, since JPEG is one of the most common formats. One of the main handicaps of steganalysis methods is the processing speed, since it is usually necessary to process huge amounts of data or it can be necessary to process the on-going internet traffic in real-time. In this paper, a JPEG steganalysis system is implemented in an FPGA in order to speed-up the detection process with respect to software-based implementations and to increase the throughput. In particular, the implemented method is the JPEG-compatibility detection algorithm that is based on the fact that when a JPEG image is modified, the resulting image is incompatible with the JPEG compression process.

  5. Audio-Visual Speech Recognition Using Lip Information Extracted from Side-Face Images

    Directory of Open Access Journals (Sweden)

    Koji Iwano

    2007-03-01

    Full Text Available This paper proposes an audio-visual speech recognition method using lip information extracted from side-face images as an attempt to increase noise robustness in mobile environments. Our proposed method assumes that lip images can be captured using a small camera installed in a handset. Two different kinds of lip features, lip-contour geometric features and lip-motion velocity features, are used individually or jointly, in combination with audio features. Phoneme HMMs modeling the audio and visual features are built based on the multistream HMM technique. Experiments conducted using Japanese connected digit speech contaminated with white noise in various SNR conditions show effectiveness of the proposed method. Recognition accuracy is improved by using the visual information in all SNR conditions. These visual features were confirmed to be effective even when the audio HMM was adapted to noise by the MLLR method.

  6. Robot Command Interface Using an Audio-Visual Speech Recognition System

    Science.gov (United States)

    Ceballos, Alexánder; Gómez, Juan; Prieto, Flavio; Redarce, Tanneguy

    In recent years audio-visual speech recognition has emerged as an active field of research thanks to advances in pattern recognition, signal processing and machine vision. Its ultimate goal is to allow human-computer communication using voice, taking into account the visual information contained in the audio-visual speech signal. This document presents a command's automatic recognition system using audio-visual information. The system is expected to control the laparoscopic robot da Vinci. The audio signal is treated using the Mel Frequency Cepstral Coefficients parametrization method. Besides, features based on the points that define the mouth's outer contour according to the MPEG-4 standard are used in order to extract the visual speech information.

  7. Real-Time Audio Processing on the T-CREST Multicore Platform

    DEFF Research Database (Denmark)

    Ausin, Daniel Sanz; Pezzarossa, Luca; Schoeberl, Martin

    2017-01-01

    Multicore platforms are nowadays widely used for audio processing applications, due to the improvement of computational power that they provide. However, some of these systems are not optimized for temporally constrained environments, which often leads to an undesired increase in the latency of the audio signal. This paper presents a real-time multicore audio processing system based on the T-CREST platform. T-CREST is a time-predictable multicore processor for real-time embedded systems. Multiple audio effect tasks have been implemented, which can be connected together in different configurations forming sequential and parallel effect chains, and using a network-on-chip for intercommunication between processors. The evaluation of the system shows that real-time processing of multiple effect configurations is possible, and that the estimation and control of latency ensures real-time behavior.

  8. Audio Description as a Pedagogical Tool

    Directory of Open Access Journals (Sweden)

    Georgina Kleege

    2015-05-01

    Full Text Available Audio description is the process of translating visual information into words for people who are blind or have low vision. Typically such description has focused on films, museum exhibitions, images and video on the internet, and live theater. Because it allows people with visual impairments to experience a variety of cultural and educational texts that would otherwise be inaccessible, audio description is a mandated aspect of disability inclusion, although it remains markedly underdeveloped and underutilized in our classrooms and in society in general. Along with increasing awareness of disability, audio description pushes students to practice close reading of visual material, deepen their analysis, and engage in critical discussions around the methodology, standards and values, language, and role of interpretation in a variety of academic disciplines. We outline a few pedagogical interventions that can be customized to different contexts to develop students' writing and critical thinking skills through guided description of visual material.

  9. Evaluation of Perceived Spatial Audio Quality

    Directory of Open Access Journals (Sweden)

    Jan Berg

    2006-04-01

    Full Text Available The increased use of audio applications capable of conveying enhanced spatial quality puts focus on how such a quality should be evaluated. Different approaches to evaluation of perceived quality are briefly discussed and a new technique is introduced. In a series of experiments, attributes were elicited from subjects, tested and subsequently used for derivation of evaluation scales that were feasible for subjective evaluation of the spatial quality of certain multichannel stimuli. The findings of these experiments led to the development of a novel method for evaluation of spatial audio in surround sound systems. Parts of the method were subsequently implemented in the OPAQUE software prototype designed to facilitate the elicitation process. The prototype was successfully tested in a pilot experiment. The experiments show that attribute scales derived from subjects' personal constructs are functional for evaluation of perceived spatial audio quality. Finally, conclusions on the importance of spatial quality evaluation of new applications are made.

  10. Ears on the hand: reaching 3D audio targets

    Directory of Open Access Journals (Sweden)

    Hanneton Sylvain

    2011-12-01

    Full Text Available We studied the ability of right-handed participants to reach 3D audio targets with their right hand. Our immersive audio environment was based on the OpenAL library and Fastrak magnetic sensors for motion capture. Participants listened to the target through a “virtual” listener linked to a sensor fixed either on the head or on the hand. We compared three experimental conditions in which the virtual listener was on the head, on the left hand, or on the right hand (the hand that reaches the target). We show that (1) participants are able to learn the task but (2) with a low success rate and long durations, (3) the individual levels of performance are highly variable, and (4) the best performances are achieved when the listener is on the right hand. Consequently, we concluded that our participants were able to learn to locate 3D audio sources even if their ears are transposed onto their hand, but we found behavioral differences between the three experimental conditions.

  11. Authenticity examination of compressed audio recordings using detection of multiple compression and encoders' identification.

    Science.gov (United States)

    Korycki, Rafal

    2014-05-01

    Since the appearance of digital audio recordings, audio authentication has become increasingly difficult. The currently available technologies and free editing software allow a forger to cut or paste any single word without audible artifacts. Nowadays, the only method referring to digital audio files commonly approved by forensic experts is the ENF criterion. It consists in fluctuation analysis of the mains frequency induced in electronic circuits of recording devices. Therefore, its effectiveness is strictly dependent on the presence of mains signal in the recording, which is a rare occurrence. Recently, much attention has been paid to authenticity analysis of compressed multimedia files and several solutions were proposed for detection of double compression in both digital video and digital audio. This paper addresses the problem of tampering detection in compressed audio files and discusses new methods that can be used for authenticity analysis of digital recordings. Presented approaches consist in evaluation of statistical features extracted from the MDCT coefficients as well as other parameters that may be obtained from compressed audio files. Calculated feature vectors are used for training selected machine learning algorithms. The detection of multiple compression covers up tampering activities as well as identification of traces of montage in digital audio recordings. To enhance the methods' robustness an encoder identification algorithm was developed and applied based on analysis of inherent parameters of compression. The effectiveness of tampering detection algorithms is tested on a predefined large music database consisting of nearly one million compressed audio files. The influence of compression algorithms' parameters on the classification performance is discussed, based on the results of the current study. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.

  12. Mobile video-to-audio transducer and motion detection for sensory substitution

    Directory of Open Access Journals (Sweden)

    Maxime eAmbard

    2015-10-01

    Full Text Available Visuo-auditory sensory substitution systems are augmented reality devices that translate a video stream into an audio stream in order to help the blind in daily tasks requiring visuo-spatial information. In this work, we present both a new mobile device and a transcoding method specifically designed to sonify moving objects. Frame differencing is used to extract spatial features from the video stream and two-dimensional spatial information is converted into audio cues using pitch, interaural time difference and interaural level difference. Using numerical methods, we attempt to reconstruct visuo-spatial information based on audio signals generated from various video stimuli. We show that despite a contrasted visual background and a highly lossy encoding method, the information in the audio signal is sufficient to allow object localization, object trajectory evaluation, object approach detection, and spatial separation of multiple objects. We also show that this type of audio signal can be interpreted by human users by asking ten subjects to discriminate trajectories based on generated audio signals.
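
    The pipeline described here, frame differencing followed by a mapping from image position to audio cues, can be sketched in a few lines. Everything specific below is an assumption made for illustration: the threshold, the frequency range, the linear mappings, and the function names are hypothetical, and only pitch and a left/right level difference are modeled (the interaural time difference the authors also use is omitted).

```python
def moving_pixels(prev, curr, threshold=30):
    """Frame differencing: coordinates whose intensity changed by more than threshold."""
    return [
        (r, c)
        for r, row in enumerate(curr)
        for c, value in enumerate(row)
        if abs(value - prev[r][c]) > threshold
    ]


def audio_cue(pixels, height, width, f_low=200.0, f_high=2000.0):
    """Map the centroid of the moving pixels to a pitch and left/right gains.

    Assumed mapping: higher in the image means higher pitch,
    further right means louder in the right channel.
    """
    if not pixels:
        return None  # nothing moved, no sound
    row = sum(r for r, _ in pixels) / len(pixels)
    col = sum(c for _, c in pixels) / len(pixels)
    pan = col / (width - 1)  # 0.0 = far left, 1.0 = far right
    pitch = f_high - (f_high - f_low) * row / (height - 1)
    return {"pitch_hz": pitch, "left_gain": 1.0 - pan, "right_gain": pan}


# Toy 4x5 grayscale frames: an object appears in the top-right corner.
prev = [[0] * 5 for _ in range(4)]
curr = [row[:] for row in prev]
curr[0][4] = 255
cue = audio_cue(moving_pixels(prev, curr), height=4, width=5)
```

    For the toy frames, the only changed pixel is in the top-right corner, so the cue is the highest pitch played entirely in the right channel.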

  13. Procedural Audio in Computer Games Using Motion Controllers: An Evaluation on the Effect and Perception

    Directory of Open Access Journals (Sweden)

    Niels Böttcher

    2013-01-01

    Full Text Available A study has been conducted into whether the use of procedural audio affects players in computer games using motion controllers. It was investigated whether or not (1) players perceive a difference between detailed and interactive procedural audio and prerecorded audio, (2) the use of procedural audio affects their motor-behavior, and (3) procedural audio affects their perception of control. Three experimental surveys were devised, two consisting of game sessions and the third consisting of watching videos of gameplay. A skiing game controlled by a Nintendo Wii balance board and a sword-fighting game controlled by a Wii remote were implemented with two versions of sound, one sample based and the other procedural based. The procedural models were designed using a perceptual approach and by alternative combinations of well-known synthesis techniques. The experimental results showed that, when being actively involved in playing or purely observing a video recording of a game, the majority of participants did not notice any difference in sound. Additionally, it was not possible to show that the use of procedural audio caused any consistent change in the motor behavior. In the skiing experiment, a portion of players perceived the control of the procedural version as being more sensitive.

  14. Frequency Hopping Method for Audio Watermarking

    Directory of Open Access Journals (Sweden)

    A. Anastasijević

    2012-11-01

    Full Text Available This paper evaluates the degradation of audio content for a perceptible removable watermark. Two different approaches to embedding the watermark in the spectral domain were investigated. The frequencies for watermark embedding are chosen according to a pseudorandom sequence, making the methods robust. Consequently, the lower quality audio can be used for promotional purposes. For a fee, the watermark can be removed with a secret watermarking key. Objective and subjective testing was conducted in order to measure degradation level for the watermarked music samples and to examine residual distortion for different parameters of the watermarking algorithm and different music genres.
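
    The idea of a key-driven, removable spectral watermark can be illustrated with a toy model. The sketch below is an assumption-laden illustration, not either of the two approaches evaluated in the paper: frames are plain lists of spectral magnitudes, the pseudorandom sequence comes from Python's seeded `random.Random`, and the watermark is a fixed additive disturbance in one hopped bin per frame. All names and parameter values are hypothetical.

```python
import random


def hop_bins(key, n_frames, n_bins):
    """Key-seeded pseudorandom bin index for each frame (hypothetical scheme)."""
    rng = random.Random(key)
    return [rng.randrange(n_bins) for _ in range(n_frames)]


def embed(frames, key, strength=5.0):
    """Add a disturbance of the given strength to one hopped bin per frame."""
    bins = hop_bins(key, len(frames), len(frames[0]))
    marked = [list(frame) for frame in frames]
    for frame, b in zip(marked, bins):
        frame[b] += strength
    return marked


def remove(frames, key, strength=5.0):
    """Undo the embedding; only exact with the key used for embedding."""
    return embed(frames, key, -strength)


def max_diff(x, y):
    """Largest absolute difference between two lists of frames."""
    return max(abs(a - b) for fx, fy in zip(x, y) for a, b in zip(fx, fy))


# Toy spectrum: 50 frames of 64 flat magnitude bins.
frames = [[1.0] * 64 for _ in range(50)]
marked = embed(frames, key=1234)
restored = remove(marked, key=1234)   # correct key
wrong = remove(marked, key=4321)      # wrong key leaves distortion behind
```

    With the correct key the original frames are recovered, while a wrong key subtracts from the wrong bins and adds further distortion instead of removing it.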

  15. Combining multiple observations of audio signals

    Science.gov (United States)

    Bayram, Ilker

    2013-09-01

    We consider the problem of reconstructing an audio signal from multiple observations, each of which is contaminated with time-varying noise. Assuming that the time-variation is different for each observation, we propose an estimation formulation that can adapt to these changes. Specifically, we postulate a parametric reconstruction and choose the parameters so that the reconstruction minimizes a cost function. The cost function is selected so that audio signals are penalized less compared to arbitrary signals with the same energy. As cost functions, we experiment with a recently proposed prior as well as mixed norms placed on the short time Fourier coefficients.

  16. Personalized Audio Systems - a Bayesian Approach

    DEFF Research Database (Denmark)

    Nielsen, Jens Brehm; Jensen, Bjørn Sand; Hansen, Toke Jansen

    2013-01-01

    Modern audio systems are typically equipped with several user-adjustable parameters unfamiliar to most users listening to the system. To obtain the best possible setting, the user is forced into multi-parameter optimization with respect to the user's own objective and preference. To address this, the present paper presents a general interactive framework for personalization of such audio systems. The framework builds on Bayesian Gaussian process regression in which a model of the user's objective function is updated sequentially. The parameter setting to be evaluated in a given trial is selected...

  17. Non Audio-Video gesture recognition system

    DEFF Research Database (Denmark)

    Craciunescu, Razvan; Mihovska, Albena Dimitrova; Kyriazakos, Sofoklis

    2016-01-01

    recognition from the face and hand gesture recognition. Gesture recognition enables humans to communicate with the machine and interact naturally without any mechanical devices. This paper investigates the possibility to use non-audio/video sensors in order to design a low-cost gesture recognition device that can be connected to any computer on the market. The paper proposes an equation that relates the distance and voltage for Sharp GP2Y0A21 and GP2D120 sensors in the situation that a hand is used as the reflective object. In the end, the presented system is compared with other audio/video system...

  18. Overview of the audio description in Spanish DTT channels

    Directory of Open Access Journals (Sweden)

    Francisco José González

    2014-09-01

    Full Text Available This paper presents an analysis of current practices in audio description in Spanish TV channels. The results of this research show that in some channels the audio description is broadcasted for ‘receiver mix audio description’ while in other channels the alternative used is ‘broadcaster mix audio description’. The problems detected for the activation of audio description in users’ TVs can be solved applying some enhancement to signaling information used by broadcasters in their DVB TV channels. Finally, some recommendations for the users are included to present the key aspects to audio description activation in their TVs.

  19. Rate-distortion analysis of steganography for conveying stereovision disparity maps

    Science.gov (United States)

    Umeda, Toshiyuki; Batolomeu, Ana B. D. T.; Francob, Filipe A. L.; Delannay, Damien; Macq, Benoit M. M.

    2004-06-01

    3-D images transmission in a way which is compliant with traditional 2-D representations can be done through the embedding of disparity maps within the 2-D signal. This approach enables the transmission of stereoscopic video sequences or images on traditional analogue TV channels (PAL or NTSC) or printed photographic images. The aim of this work is to study the achievable performances of such a technique. The embedding of disparity maps has to be seen as a global rate-distortion problem. The embedding capacity through steganography is determined by the transmission channel noise and by the bearable distortion on the watermarked image. The distortion of the 3-D image displayed as two stereo views depends on the rate allocated to the complementary information required to build those two views from one reference 2-D image. Results from the works on the scalar Costa scheme are used to optimize the embedding of the disparity map compressed bit stream into the reference image. A method for computing the optimal trade off between the disparity map distortion and embedding distortion as a function of the channel impairments is proposed. The goal is to get a similar distortion on the left (the reference image) and the right (the disparity compensated image) images. We show that in typical situations the embedding of 2 bits/pixels in the left image, while the disparity map is compressed at 1 bit per pixel leads to a good trade-off. The disparity map is encoded with a strong error correcting code, including synchronisation bits.

  20. Audio wiring guide how to wire the most popular audio and video connectors

    CERN Document Server

    Hechtman, John

    2012-01-01

    Whether you're a pro or an amateur, a musician or into multimedia, you can't afford to guess about audio wiring. The Audio Wiring Guide is a comprehensive, easy-to-use guide that explains exactly what you need to know. No matter the size of your wiring project or installation, this handy tool provides you with the essential information you need and the techniques to use it. Using The Audio Wiring Guide is like having an expert at your side. By following the clear, step-by-step directions, you can do professional-level work at a fraction of the cost.

  1. Emotion based segmentation of musical audio

    NARCIS (Netherlands)

    Aljanaki, A.; Wiering, F.; Veltkamp, R.C.

    2015-01-01

    The dominant approach to musical emotion variation detection tracks emotion over time continuously and usually deals with time resolutions of one second. In this paper we discuss the problems associated with this approach and propose to move to bigger time resolutions when tracking emotion over

  2. Audio-vocal interaction in single neurons of the monkey ventrolateral prefrontal cortex.

    Science.gov (United States)

    Hage, Steffen R; Nieder, Andreas

    2015-05-06

    Complex audio-vocal integration systems depend on a strong interconnection between the auditory and the vocal motor system. To gain cognitive control over audio-vocal interaction during vocal motor control, the PFC needs to be involved. Neurons in the ventrolateral PFC (VLPFC) have been shown to separately encode the sensory perceptions and motor production of vocalizations. It is unknown, however, whether single neurons in the PFC reflect audio-vocal interactions. We therefore recorded single-unit activity in the VLPFC of rhesus monkeys (Macaca mulatta) while they produced vocalizations on command or passively listened to monkey calls. We found that 12% of randomly selected neurons in VLPFC modulated their discharge rate in response to acoustic stimulation with species-specific calls. Almost three-fourths of these auditory neurons showed an additional modulation of their discharge rates either before and/or during the monkeys' motor production of vocalization. Based on these audio-vocal interactions, the VLPFC might be well positioned to combine higher order auditory processing with cognitive control of the vocal motor output. Such audio-vocal integration processes in the VLPFC might constitute a precursor for the evolution of complex learned audio-vocal integration systems, ultimately giving rise to human speech.

  3. Predistortion of a Bidirectional Cuk Audio Amplifier

    DEFF Research Database (Denmark)

    Birch, Thomas Hagen; Nielsen, Dennis; Knott, Arnold

    2014-01-01

    using predistortion. This paper suggests linearizing a nonlinear bidirectional Cuk audio amplifier using an analog predistortion approach. A prototype power stage was built and results show that a voltage gain of up to 9 dB and reduction in THD from 6% down to 3% was obtainable using this approach....

  4. Spatial audio quality perception (part 2)

    DEFF Research Database (Denmark)

    Conetta, R.; Brookes, T.; Rumsey, F.

    2015-01-01

    location, envelopment, coverage angle, ensemble width, and spaciousness. They can also impact timbre, and changes to timbre can then influence spatial perception. Previously obtained data was used to build a regression model of perceived spatial audio quality in terms of spatial and timbral metrics...

  5. Audio/Visual Ratios in Commercial Filmstrips.

    Science.gov (United States)

    Gulliford, Nancy L.

    Developed by the Westinghouse Electric Corporation, Video Audio Compressed (VIDAC) is a compressed time, variable rate, still picture television system. This technology made it possible for a centralized library of audiovisual materials to be transmitted over a television channel in very short periods of time. In order to establish specifications…

  6. Utilization of Nonlinear Converters for Audio Amplification

    DEFF Research Database (Denmark)

    Iversen, Niels; Birch, Thomas; Knott, Arnold

    2012-01-01

    The introduction of non-linear converters for audio amplification defeats this limitation. A Cuk converter, designed to deliver an AC peak output voltage twice the supply voltage, is presented in this paper. A 3V prototype has been developed to prove the concept. The prototype shows that it is possible to achieve...

  7. Providing Students with Formative Audio Feedback

    Science.gov (United States)

    Brearley, Francis Q.; Cullen, W. Rod

    2012-01-01

    The provision of timely and constructive feedback is increasingly challenging for busy academics. Ensuring effective student engagement with feedback is equally difficult. Increasingly, studies have explored provision of audio recorded feedback to enhance effectiveness and engagement with feedback. Few, if any, of these focus on purely formative…

  8. An ESL Audio-Script Writing Workshop

    Science.gov (United States)

    Miller, Carla

    2012-01-01

    The roles of dialogue, collaborative writing, and authentic communication have been explored as effective strategies in second language writing classrooms. In this article, the stages of an innovative, multi-skill writing method, which embeds students' personal voices into the writing process, are explored. A 10-step ESL Audio Script Writing Model…

  9. Agency Video, Audio and Imagery Library

    Science.gov (United States)

    Grubbs, Rodney

    2015-01-01

    The purpose of this presentation was to inform the ISS International Partners of the new NASA Agency Video, Audio and Imagery Library (AVAIL) website. AVAIL is a new resource for the public to search for and download NASA-related imagery, and is not intended to replace the current process by which the International Partners receive their Space Station imagery products.

  10. Frequency Compensation of an Audio Power Amplifier

    NARCIS (Netherlands)

    van der Zee, Ronan A.R.; van Heeswijk, R.

    2006-01-01

    A car audio power amplifier is presented that uses a frequency compensation scheme which avoids large compensation capacitors around the MOS power transistors, while retaining the bandwidth and stable load range of nested Miller compensation. THD is 0.005%@(1kHz, 10W), SNR is 108dB, and the

  11. Audio Journal in an ELT Context

    Directory of Open Access Journals (Sweden)

    Neşe Aysin Siyli

    2012-09-01

    Full Text Available It is widely acknowledged that one of the most serious problems students of English as a foreign language face is the lack of opportunity to practice the language outside the classroom. Generally, the classroom is the sole environment where they can practice English, which by its nature does not provide a rich setting to help students develop their competence by putting the language into practice. Motivated by this need, this descriptive study investigated the impact of audio dialog journals on students’ speaking skills. It also aimed to gain insights into students’ and teachers’ opinions on keeping audio dialog journals outside the class. The data of the study were drawn from student and teacher audio dialog journals, students’ written feedback, interviews held with the students, and teacher observations. The descriptive analysis of the data revealed that audio dialog journals served a number of functions ranging from cognitive to linguistic, from pedagogical to psychological, and social. The findings and pedagogical implications of the study are discussed in detail.

  12. Consuming audio: an introduction to Tweak Theory

    NARCIS (Netherlands)

    Perlman, Marc

    2014-01-01

    Audio technology is a medium for music, and when we pay attention to it we tend to speculate about its effects on the music it transmits. By now there are well-established traditions of commentary (many of them critical) about the impact of musical reproduction on musical production.

  13. Audible Aliasing Distortion in Digital Audio Synthesis

    Directory of Open Access Journals (Sweden)

    J. Schimmel

    2012-04-01

    Full Text Available This paper deals with aliasing distortion in digital audio signal synthesis of classic periodic waveforms with infinite Fourier series, for electronic musical instruments. When these waveforms are generated in the digital domain, aliasing appears because of their unlimited bandwidth. Several techniques for the synthesis of these signals have been designed to avoid or reduce the aliasing distortion. However, these techniques have high computing demands. One can say that today's computers have enough computing power to use these methods. However, we have to realize that today’s computer-aided music production requires tens of multi-timbre voices generated simultaneously by software synthesizers, and most of the computing power must be reserved for the hard-disc recording subsystem and real-time audio processing of many audio channels with a lot of audio effects. Trivially generated classic analog synthesizer waveforms are therefore still effective for sound synthesis. We cannot avoid the aliasing distortion, but spectral components produced by the aliasing can be masked by harmonic components and thus made inaudible if a sufficient oversampling ratio is used. This paper deals with the assessment of audible aliasing distortion with the help of a psychoacoustic model of simultaneous masking and compares the computing demands of trivial generation using oversampling with those of other methods.
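
The folding of out-of-band harmonics that this abstract describes can be illustrated numerically. The following is a minimal sketch, assuming an ideal sawtooth-like waveform whose k-th harmonic lies at k times the fundamental; the function names and the example values (a 5 kHz fundamental, 48 kHz and 192 kHz sampling rates) are illustrative assumptions and are not taken from the paper.

```python
def aliased_frequency(f, fs):
    """Frequency (Hz) at which a component at f appears after sampling at fs."""
    f = f % fs                            # sampling cannot tell f from f + n*fs
    return fs - f if f > fs / 2 else f    # fold the upper half back below Nyquist


def aliased_harmonics(f0, fs, n_harmonics):
    """List (harmonic number, original Hz, aliased Hz) for harmonics above Nyquist."""
    out = []
    for k in range(1, n_harmonics + 1):
        f = k * f0
        if f > fs / 2:                    # only these components fold back
            out.append((k, f, aliased_frequency(f, fs)))
    return out


# Hypothetical example: first six harmonics of a 5 kHz waveform.
print(aliased_harmonics(5000, 48000, 6))    # harmonics 5 and 6 fold back
print(aliased_harmonics(5000, 192000, 6))   # 4x oversampling: none fold back
```

With 4x oversampling the same six harmonics all lie below the Nyquist frequency, which is the effect the abstract relies on when it pairs trivial generation with oversampling; a real waveform has further harmonics that would still fold back at lower levels.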

  14. Restoration of Local Degradations in Audio Signals

    Directory of Open Access Journals (Sweden)

    M. Brejl

    1996-09-01

    Full Text Available The paper presents an algorithm for restoration of local degradations in audio signals. The theoretical foundations and basic suggestions of this algorithm were published in [1]. A complete description of restoration process and some improvements are presented here.

  15. Using Audio-Derived Affective Offset to Enhance TV Recommendation

    DEFF Research Database (Denmark)

    Shepstone, Sven Ewan; Tan, Zheng-Hua; Jensen, Søren Holdt

    2014-01-01

    First a user's mood profile is determined using 12-class audio-based emotion classifications. An initial TV content item is then displayed to the user based on the extracted mood profile. The user has the option to either accept the recommendation, or to critique the item once or several times, by navigating the emotion space to request an alternative match. The final match is then compared to the initial match, in terms of the difference in the items' affective parameterization. This offset is then utilized in future recommendation sessions. The system was evaluated by eliciting three different...

  16. Extracting meaning from audio signals - a machine learning approach

    DEFF Research Database (Denmark)

    Larsen, Jan

    2007-01-01

    * Machine learning framework for sound search * Genre classification * Music and audio separation * Wind noise suppression...

  17. Training of audio descriptors: the cinematographic aesthetics as basis for the learning of the audio description aesthetics – materials, methods and products

    Directory of Open Access Journals (Sweden)

    Soraya Ferreira Alves

    2016-12-01

    Full Text Available Audio description (AD), a resource used to make theater, cinema, TV, and visual works of art accessible to people with visual impairments, is slowly being implemented in Brazil and demands qualified professionals. Based on this statement, this article reports the results of research carried out during post-doctoral studies. The study compares film aesthetics with audio description techniques to check how knowledge of the former can contribute to the training of audio describers. Through action research, a short film was produced, adapted from O Peru de Natal (Christmas Turkey), a short story by the Brazilian writer Mario de Andrade. The film and its audio description were produced with students and teachers from the discipline Intersemiotic Translation at the State University of Ceará. Thus, we intended to suggest pedagogical procedures generated by the students' experiences by evaluating their choices and their implications.

  18. Audio Mining with emphasis on Music Genre Classification

    DEFF Research Database (Denmark)

    Meng, Anders

    2004-01-01

    etc. is receiving quite a lot of attention. The first breakthrough in audio mining was created by MuscleFish in 1996, a simple audio retrieval system. With the increasing amount of audio material being accessible through the web, e.g. Apple's iTunes (700,000+ songs), Sony, Amazon, new methods...

  19. 47 CFR 10.520 - Common audio attention signal.

    Science.gov (United States)

    2010-10-01

    ... 47 Telecommunication 1 2010-10-01 2010-10-01 false Common audio attention signal. 10.520 Section... Equipment Requirements § 10.520 Common audio attention signal. A Participating CMS Provider and equipment manufacturers may only market devices for public use under part 10 that include an audio attention signal that...

  20. Audio Books in the Nigerian Higher Educational System: To be ...

    African Journals Online (AJOL)

    This study discusses audio books from the point of view of an innovation. It discusses the advantages and disadvantages of audio books. It examined students' familiarization with audio books and their perception about its being introduced into the school system. It was found out that Nigerian students are already familiar ...

  1. Debugging of Class-D Audio Power Amplifiers

    DEFF Research Database (Denmark)

    Crone, Lasse; Pedersen, Jeppe Arnsdorf; Mønster, Jakob Døllner

    2012-01-01

    Determining and optimizing the performance of a Class-D audio power amplifier can be very difficult without knowledge of the use of audio performance measuring equipment and of how the various noise and distortion sources influence the audio performance. This paper gives an introduction on how to measure...

  2. Utilization of non-linear converters for audio amplification

    DEFF Research Database (Denmark)

    Iversen, Niels Elkjær; Birch, Thomas; Knott, Arnold

    2012-01-01

    Class D amplifiers fit the automotive demands quite well. The traditional buck-based amplifier has reduced both the cost and size of amplifiers. However, the buck topology is not without its limitations. The maximum peak AC output voltage produced by the power stage is only equal to the supply voltage. The introduction of non-linear converters for audio amplification defeats this limitation. A Cuk converter, designed to deliver an AC peak output voltage twice the supply voltage, is presented in this paper. A 3V prototype has been developed to prove the concept. The prototype shows that it is possible to achieve...

  3. Utilizing Domain Knowledge in End-to-End Audio Processing

    DEFF Research Database (Denmark)

    Tax, Tycho; Antich, Jose Luis Diez; Purwins, Hendrik

    2017-01-01

    End-to-end neural network based approaches to audio modelling are generally outperformed by models trained on high-level data representations. In this paper we present preliminary work that shows the feasibility of training the first layers of a deep convolutional neural network (CNN) model to learn the commonly-used log-scaled mel-spectrogram transformation. Secondly, we demonstrate that upon initializing the first layers of an end-to-end CNN classifier with the learned transformation, convergence and performance on the ESC-50 environmental sound classification dataset are similar to a CNN...

  4. Audio segmentation using Flattened Local Trimmed Range for ecological acoustic space analysis

    Directory of Open Access Journals (Sweden)

    Giovany Vega

    2016-06-01

    Full Text Available The acoustic space in a given environment is filled with footprints arising from three processes: biophony, geophony and anthrophony. Bioacoustic research using passive acoustic sensors can result in thousands of recordings. An important component of processing these recordings is to automate signal detection. In this paper, we describe a new spectrogram-based approach for extracting individual audio events. Spectrogram-based audio event detection (AED) relies on separating the spectrogram into background (i.e., noise) and foreground (i.e., signal) classes using a threshold such as a global threshold, a per-band threshold, or one given by a classifier. These methods are either too sensitive to noise, designed for an individual species, or require prior training data. Our goal is to develop an algorithm that is not sensitive to noise, does not need any prior training data and works with any type of audio event. To do this, we propose: (1) a spectrogram filtering method, the Flattened Local Trimmed Range (FLTR) method, which models the spectrogram as a mixture of stationary and non-stationary energy processes and mitigates the effect of the stationary processes, and (2) an unsupervised algorithm that uses the filter to detect audio events. We measured the performance of the algorithm using a set of six thoroughly validated audio recordings and obtained a sensitivity of 94% and a positive predictive value of 89%. These sensitivity and positive predictive values are very high, given that the validated recordings are diverse and obtained from field conditions. The algorithm was then used to extract audio events in three datasets. Features of these audio events were plotted and showed the unique aspects of the three acoustic communities.
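
The two performance figures quoted in this abstract follow standard definitions, which can be sketched as below. This is a minimal illustration only: the event counts are hypothetical values chosen to reproduce rates of roughly 94% and 89%, and they are not the counts reported in the paper.

```python
def sensitivity(true_positives, false_negatives):
    """Fraction of real audio events that the detector found."""
    return true_positives / (true_positives + false_negatives)


def positive_predictive_value(true_positives, false_positives):
    """Fraction of detected events that were real audio events."""
    return true_positives / (true_positives + false_positives)


# Hypothetical counts, assumed for illustration only.
tp, fn, fp = 94, 6, 12
print(round(sensitivity(tp, fn), 2))                 # 0.94
print(round(positive_predictive_value(tp, fp), 2))   # 0.89
```

Sensitivity penalizes missed events, while positive predictive value penalizes false detections, so reporting both shows that the detector is neither overly permissive nor overly strict.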

  5. Audio coding in wireless acoustic sensor networks

    DEFF Research Database (Denmark)

    Zahedi, Adel; Østergaard, Jan; Jensen, Søren Holdt

    2015-01-01

    In this paper, we consider the problem of source coding for a wireless acoustic sensor network where each node in the network makes its own noisy measurement of the sound field, and communicates with other nodes in the network by sending and receiving encoded versions of the measurements. To make use of the correlation between the sources available at the nodes, we consider the possibility of combining the measurement and the received messages into one single message at each node instead of forwarding the received messages and separate encoding of the measurement. Moreover, to exploit... ) for the resulting remote DSC problem under covariance matrix distortion constraints. We further show that for this problem, the Gaussian source is the worst to code. Thus, the Gaussian RDF provides an upper bound to other sources such as audio signals. We then turn our attention to audio signals. We consider...

  6. A Perceptually Reweighted Mixed-Norm Method for Sparse Approximation of Audio Signals

    DEFF Research Database (Denmark)

    Christensen, Mads Græsbøll; Sturm, Bob L.

    2011-01-01

    In this paper, we consider the problem of finding sparse representations of audio signals for coding purposes. In doing so, it is of utmost importance that when only a subset of the present components of an audio signal are extracted, it is the perceptually most important ones. To this end, we propose a new iterative algorithm based on two principles: 1) a reweighted l1-norm based measure of sparsity; and 2) a reweighted l2-norm based measure of perceptual distortion. Using these measures, the considered problem is posed as a constrained convex optimization problem that can be solved optimally using standard software. A prominent feature of the new method is that it solves a problem that is closely related to the objective of coding, namely rate-distortion optimization. In computer simulations, we demonstrate the properties of the algorithm and its application to real audio signals.

  7. Basic Concepts in Augmented Reality Audio

    OpenAIRE

    Lemordant, Jacques

    2010-01-01

    The basic difference between real and virtual sound environments is that virtual sounds originate from another environment or are artificially created, whereas real sounds are the naturally existing sounds in the user's own environment. Augmented Reality Audio combines these aspects so that real and virtual sound scenes are mixed and virtual sounds are perceived as an extension or a complement to the natural ones.

  8. Personalized Audio Systems - a Bayesian Approach

    DEFF Research Database (Denmark)

    Nielsen, Jens Brehm; Jensen, Bjørn Sand; Hansen, Toke Jansen

    2013-01-01

    , the present paper presents a general interactive framework for personalization of such audio systems. The framework builds on Bayesian Gaussian process regression in which a model of the user's objective function is updated sequentially. The parameter setting to be evaluated in a given trial is selected...... are optimized using the proposed framework. Twelve test subjects obtain a personalized setting with the framework, and these settings are significantly preferred to those obtained with random experimentation....

  9. New musical organology : the audio-games

    OpenAIRE

    Zénouda , Hervé

    2012-01-01

    This article aims to shed light on a new and emerging creative field: "Audio Games," a crossroad between video games and computer music. Today, a plethora of tiny applications, which propose entertaining audiovisual experiences with a preponderant sound dimension, are available for game consoles, computers, and mobile phones. These experiences represent a new universe where the gameplay of video games is applied to musical composition, hence creating new links betwee...

  10. Emerging topics in translation: Audio description

    OpenAIRE

    Perego, Elisa

    2012-01-01

    The volume deals with several aspects of audio description for the blind and sight impaired which came to the surface during the AD session of the conference Emerging topics in translation and interpreting held at the Department of Language, Translation and Interpreting Studies of the University of Trieste, 16-18 June 2010. The topics dealt with in the volume range from the more established (linguistic analysis of ADs in various languages, strategies to overcome possible obs...

  11. Digitisation of the CERN Audio Archives

    CERN Multimedia

    Maximilien Brice

    2006-01-01

    From the creation of CERN in 1954 until the mid-1980s, the audiovisual service recorded hundreds of hours of moments of life at CERN on audio tapes. These moments range from inaugurations of new facilities to VIP speeches and general-interest cultural seminars. The preservation process started in June 2005. In these pictures, we see Waltraud Hug working on an open-reel tape.

  12. Detection Of Alterations In Audio Files Using Spectrograph Analysis

    Directory of Open Access Journals (Sweden)

    Anandha Krishnan G

    2015-08-01

    Full Text Available This study was carried out to detect changes in audio files using a spectrograph. An audio file format is a file format for storing digital audio data on a computer system. A sound spectrograph is a laboratory instrument that displays a graphical representation of the strengths of the various component frequencies of a sound as time passes. The objectives of the study were to find the changes in the spectrograph of audio files after altering them, to compare these changes with the spectrograph of the original files, and to check for similarities and differences between MP3 and WAV. Five different alterations were carried out on each audio file to analyze the differences between the original and the altered file. To alter an MP3 or WAV audio file by cut/copy, the file was opened in Audacity. A different audio clip was then pasted into the audio file. This new file was analyzed to view the differences. By adjusting the necessary parameters, the noise was reduced. The differences between the new file and the original file were analyzed. By adjusting the parameters from the dialog box, the necessary changes were made. The edited audio file was opened in the software named Spek, which, after analysis, produces a graph of that particular file that is saved for further analysis. The graph of the original audio was combined with the graph of the edited audio file to see the alterations.

  13. Comparing audio and video data for rating communication.

    Science.gov (United States)

    Williams, Kristine; Herman, Ruth; Bontempo, Daniel

    2013-09-01

    Video recording has become increasingly popular in nursing research, adding rich nonverbal, contextual, and behavioral information. However, benefits of video over audio data have not been well established. We compared communication ratings of audio versus video data using the Emotional Tone Rating Scale. Twenty raters watched video clips of nursing care and rated staff communication on 12 descriptors that reflect dimensions of person-centered and controlling communication. Another group rated audio-only versions of the same clips. Interrater consistency was high within each group, with Intraclass Correlation Coefficient (ICC) (2,1) for audio = .91 and video = .94. Interrater consistency for both groups combined was also high, with ICC (2,1) for audio and video = .95. Communication ratings using audio and video data were highly correlated. The value of video being superior to audio-recorded data should be evaluated in designing studies evaluating nursing care.

  14. Securing Digital Audio using Complex Quadratic Map

    Science.gov (United States)

    Suryadi, MT; Satria Gunawan, Tjandra; Satria, Yudi

    2018-03-01

    In this digital era, exchanging data is common and easy to do, and the data are therefore vulnerable to attack and manipulation by unauthorized parties. One data type that is vulnerable to attack is digital audio. We therefore need a method for securing data that is both robust and fast. One method that meets these criteria is securing the data using a chaos function. The chaos function used in this research is the complex quadratic map (CQM). There are some parameter values that cause the key stream generated by the CQM function to pass all 15 NIST tests, which means that the key stream generated using this CQM is shown to be random. In addition, samples of encrypted digital sound are shown to be uniform when tested using a goodness of fit test, so securing digital audio using this method is not vulnerable to frequency analysis attack. The key space is very large, about 8.1×10^31 possible keys, and the key sensitivity is very small, about 10^-10; therefore this method is also not vulnerable to brute-force attack. Finally, the processing speed for both the encryption and decryption processes is on average about 450 times faster than the duration of the digital audio.
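
The general idea of encrypting audio samples with a chaos-generated key stream can be sketched as follows. This is an illustrative sketch only, not the scheme from the paper: it assumes the familiar quadratic iteration z → z² + c (which may differ from the map the authors use), a hypothetical rule for turning each iterate into a key byte, and parameter values restricted to the real axis (c = -2, starting point in [-2, 2]) so that the orbit stays bounded.

```python
def cqm_keystream(z0, c, n):
    """Yield n key bytes from the iteration z -> z*z + c (assumed form of the map)."""
    z = z0
    for _ in range(n):
        z = z * z + c
        # Hypothetical byte-extraction rule: scale the real part and reduce mod 256.
        yield int(abs(z.real) * 1e6) % 256


def xor_cipher(data, z0, c):
    """XOR each byte of data with the key stream; the same call decrypts."""
    keystream = cqm_keystream(z0, c, len(data))
    return bytes(b ^ k for b, k in zip(data, keystream))


# Assumed key: starting point and map parameter, both on the real axis.
key_z0, key_c = complex(0.3, 0.0), complex(-2.0, 0.0)
samples = bytes(range(16))            # stand-in for raw audio sample bytes
encrypted = xor_cipher(samples, key_z0, key_c)
decrypted = xor_cipher(encrypted, key_z0, key_c)
print(decrypted == samples)           # True
```

Because XOR is its own inverse, decryption regenerates the same key stream from the same starting point and parameter, which is why small changes in those values (the key sensitivity mentioned in the abstract) matter for security.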

  15. Audio Spatial Representation Around the Body.

    Science.gov (United States)

    Aggius-Vella, Elena; Campus, Claudio; Finocchietti, Sara; Gori, Monica

    2017-01-01

    Studies have found that portions of space around our body are differently coded by our brain. Numerous works have investigated visual and auditory spatial representation, focusing mostly on the spatial representation of stimuli presented at head level, especially in the frontal space. Only a few studies have investigated spatial representation around the entire body and its relationship with motor activity. Moreover, it is still not clear whether the space surrounding us is represented as a unitary dimension or whether it is split up into different portions, differently shaped by our senses and motor activity. To clarify these points, we investigated audio localization of dynamic and static sounds at different body levels. In order to understand the role of a motor action in auditory space representation, we asked subjects to localize sounds by pointing with the hand or the foot, or by giving a verbal answer. We found that audio sound localization was different depending on the body part considered. Moreover, a different pattern of response was observed when subjects were asked to perform actions than when they gave verbal responses. These results suggest that the audio space around our body is split into various spatial portions, which are perceived differently: front, back, around chest, and around foot, suggesting that these four areas could be differently modulated by our senses and our actions.

  16. Laser modified ZnO/CdSSe core-shell nanowire arrays for Micro-Steganography and improved photoconduction.

    Science.gov (United States)

    Lu, Junpeng; Liu, Hongwei; Zheng, Minrui; Zhang, Hongji; Lim, Sharon Xiaodai; Tok, Eng Soon; Sow, Chorng Haur

    2014-09-12

    Arrays of ZnO/CdSSe core/shell nanowires with shells of tunable band gaps represent a class of interesting hybrid nanomaterials with unique optical and photoelectrical properties due to their type II heterojunctions and chemical compositions. In this work, we demonstrate that direct focused laser beam irradiation is able to achieve localized modification of the hybrid structure and chemical composition of the nanowire arrays. As a result, the photoresponsivity of the laser modified hybrid is improved by a factor of ~3. A 3D photodetector with improved performance is demonstrated using laser modified nanowire arrays overlaid with monolayer graphene as the top electrode. Finally, by controlling the power of the scanning focused laser beam, micropatterns with different fluorescence emissions are created on a substrate covered with nanowire arrays. Such a pattern is not apparent when imaged under normal optical microscopy but the pattern becomes readily revealed under fluorescence microscopy i.e. a form of Micro-Steganography is achieved.

  17. Audre's daughter: Black lesbian steganography in Dee Rees' Pariah and Audre Lorde's Zami: A New Spelling of My Name.

    Science.gov (United States)

    Kang, Nancy

    2016-01-01

    This article argues that African-American director Dee Rees' critically acclaimed debut Pariah (2011) is a rewriting of lesbian poet-activist Audre Lorde's iconic "bio-mythography" Zami: A New Spelling of My Name (1982). The article examines how Rees' work creatively and subtly re-envisions Lorde's Zami by way of deeply rooted and often cleverly camouflaged patterns, resonances, and contrasts. Shared topics include naming, mother-daughter bonds, the role of clothing in identity formation, domestic abuse, queer time, and lesbian, gay, bisexual, and transgender legacy discourse construction. What emerges between the visual and written texts is a hidden language of connection--what may be termed Black lesbian steganography--which proves thought-provoking to viewers and readers alike.

  18. Effectiveness of increasing emergency department patients' self-perceived risk for being human immunodeficiency virus (HIV) infected through audio computer self-interview-based feedback about reported HIV risk behaviors.

    Science.gov (United States)

    Merchant, Roland C; Clark, Melissa A; Langan, Thomas J; Seage, George R; Mayer, Kenneth H; DeGruttola, Victor G

    2009-11-01

    Prior research has demonstrated that emergency department (ED) patient acceptance of human immunodeficiency virus (HIV) screening is partially dependent on patients' self-perceived risk of infection. The primary objective of this study was to determine the effectiveness of audio computer-assisted self-interview (ACASI)-based feedback. The intervention aimed to increase patients' self-perceived risk of being HIV infected by providing immediate feedback on their risk behaviors. This 1-year, randomized, controlled trial at a U.S. ED enrolled a random sample of 18- to 64-year-old subcritically ill or injured adult patients who were not known to be HIV infected. All participants completed an anonymous, ACASI-based questionnaire about their HIV risk behaviors related to injection drug use and sex, as well as their self-perceived risk for being HIV infected. Participants were randomly assigned to one of two study groups: an intervention group in which participants received immediate ACASI-based feedback in response to each of their reported risk behaviors or a no-intervention group without feedback. Participants were asked to indicate their level of HIV risk on a five-point scale before and after they answered the questions. Change in level of self-perceived HIV risk was calculated and compared by study group using Pearson's chi-square test. An HIV risk behavior score that summarized reported HIV risk behavior was devised. Because HIV risk behaviors differ by sex, scores were calculated separately for each sex. Linear regression models that adjusted for study group and same subject covariance were employed to determine if higher HIV risk behavior scores were associated with an increase in self-perceived HIV risk. Of the 566 trial participants, the median age was 29 years (interquartile range [IQR] = 22-43 years), 62.2% were females, and 66.9% had been tested previously for HIV. After answering the reported HIV risk behavior questions, 12.6% of participants had an increase

  19. Effectiveness of Increasing Emergency Department Patients’ Self-perceived Risk for Being Human Immunodeficiency Virus (HIV) Infected Through Audio Computer Self-interview–based Feedback About Reported HIV Risk Behaviors

    Science.gov (United States)

    Merchant, Roland C.; Clark, Melissa A.; Langan, Thomas J.; Seage, George R.; Mayer, Kenneth H.; DeGruttola, Victor G.

    2011-01-01

    Objectives: Prior research has demonstrated that emergency department (ED) patient acceptance of human immunodeficiency virus (HIV) screening is partially dependent on patients’ self-perceived risk of infection. The primary objective of this study was to determine the effectiveness of audio computer-assisted self-interview (ACASI)-based feedback. The intervention aimed to increase patients’ self-perceived risk of being HIV infected by providing immediate feedback on their risk behaviors. Methods: This 1-year, randomized, controlled trial at a U.S. ED enrolled a random sample of 18- to 64-year-old subcritically ill or injured adult patients who were not known to be HIV infected. All participants completed an anonymous, ACASI-based questionnaire about their HIV risk behaviors related to injection drug use and sex, as well as their self-perceived risk for being HIV infected. Participants were randomly assigned to one of two study groups: an intervention group in which participants received immediate ACASI-based feedback in response to each of their reported risk behaviors or a no-intervention group without feedback. Participants were asked to indicate their level of HIV risk on a five-point scale before and after they answered the questions. Change in level of self-perceived HIV risk was calculated and compared by study group using Pearson’s chi-square test. An HIV risk behavior score that summarized reported HIV risk behavior was devised. Because HIV risk behaviors differ by sex, scores were calculated separately for each sex. Linear regression models that adjusted for study group and same subject covariance were employed to determine if higher HIV risk behavior scores were associated with an increase in self-perceived HIV risk. Results: Of the 566 trial participants, the median age was 29 years (interquartile range [IQR] = 22–43 years), 62.2% were females, and 66.9% had been tested previously for HIV. After answering the reported HIV risk behavior questions, 12

  20. Concurrent emotional pictures modulate temporal order judgments of spatially separated audio-tactile stimuli.

    Science.gov (United States)

    Jia, Lina; Shi, Zhuanghua; Zang, Xuelian; Müller, Hermann J

    2013-11-06

    Although attention can be captured toward high-arousal stimuli, little is known about how perceiving emotion in one modality influences the temporal processing of non-emotional stimuli in other modalities. We addressed this issue by presenting observers spatially uninformative emotional pictures while they performed an audio-tactile temporal-order judgment (TOJ) task. In Experiment 1, audio-tactile stimuli were presented at the same location straight ahead of the participants, who had to judge "which modality came first?". In Experiments 2 and 3, the audio-tactile stimuli were delivered one to the left and the other to the right side, and participants had to judge "which side came first?". We found both negative and positive high-arousal pictures to significantly bias TOJs towards the tactile and away from the auditory event when the audio-tactile stimuli were spatially separated; by contrast, there was no such bias when the audio-tactile stimuli originated from the same location. To further examine whether this bias is attributable to the emotional meanings conveyed by the pictures or to their high arousal effect, we compared and contrasted the influences of near-body threat vs. remote threat (emotional) pictures on audio-tactile TOJs in Experiment 3. The bias manifested only in the near-body threat condition. Taken together, the findings indicate that visual stimuli conveying meanings of near-body interaction activate a sensorimotor functional link prioritizing the processing of tactile over auditory signals when these signals are spatially separated. In contrast, audio-tactile signals from the same location engender strong crossmodal integration, thus counteracting modality-based attentional shifts induced by the emotional pictures. © 2013 Published by Elsevier B.V.

  1. A high efficiency PWM CMOS class-D audio power amplifier

    Science.gov (United States)

    Zhangming, Zhu; Lianxi, Liu; Yintang, Yang; Han, Lei

    2009-02-01

    Based on the differential closed-loop feedback technique and the differential pre-amp, a high efficiency PWM CMOS class-D audio power amplifier is proposed. A rail-to-rail PWM comparator with window function has been embedded in the class-D audio power amplifier. Design results based on the CSMC 0.5 μm CMOS process show that the maximum efficiency is 90%, the PSRR is -75 dB, the power supply voltage range is 2.5-5.5 V, the THD+N at 1 kHz input frequency is less than 0.20%, the quiescent current with no load is 2.8 mA, and the shutdown current is 0.5 μA. The active area of the class-D audio power amplifier is about 1.47 × 1.52 mm². With this performance, the class-D audio power amplifier can be applied to several audio power systems.

  2. Audio-visual speech experience with age influences perceived audio-visual asynchrony in speech.

    Science.gov (United States)

    Alm, Magnus; Behne, Dawn

    2013-10-01

    Previous research indicates that perception of audio-visual (AV) synchrony changes in adulthood. Possible explanations for these age differences include a decline in hearing acuity, a decline in cognitive processing speed, and increased experience with AV binding. The current study aims to isolate the effect of AV experience by comparing synchrony judgments from 20 young adults (20 to 30 yrs) and 20 normal-hearing middle-aged adults (50 to 60 yrs), an age range for which a decline of cognitive processing speed is expected to be minimal. When presented with AV stop consonant syllables with asynchronies ranging from 440 ms audio-lead to 440 ms visual-lead, middle-aged adults showed significantly less tolerance for audio-lead than young adults. Middle-aged adults also showed a greater shift in their point of subjective simultaneity than young adults. Natural audio-lead asynchronies are arguably more predictable than natural visual-lead asynchronies, and this predictability may render audio-lead thresholds more prone to experience-related fine-tuning.

  3. The Sweet-Home project: audio processing and decision making in smart home to improve well-being and reliance.

    Science.gov (United States)

    Vacher, Michel; Chahuara, Pedro; Lecouteux, Benjamin; Istrate, Dan; Portet, Francois; Joubert, Thierry; Sehili, Mohamed; Meillon, Brigitte; Bonnefond, Nicolas; Fabre, Sébastien; Roux, Camille; Caffiau, Sybille

    2013-01-01

    The Sweet-Home project aims at providing audio-based interaction technology that lets the user have full control over their home environment, at detecting distress situations and at easing the social inclusion of the elderly and frail population. This paper presents an overview of the project focusing on the implemented techniques for speech and sound recognition as well as context-aware decision making with uncertainty. A user experiment in a smart home demonstrates the interest of this audio-based technology.

  4. GaN Power Stage for Switch-mode Audio Amplification

    DEFF Research Database (Denmark)

    Ploug, Rasmus Overgaard; Knott, Arnold; Poulsen, Søren Bang

    2015-01-01

    Gallium Nitride (GaN) based power transistors are gaining more and more attention since the introduction of the enhancement mode eGaN Field Effect Transistor (FET) which makes an adaptation from Metal-Oxide Semiconductor (MOSFET) to eGaN based technology less complex than by using depletion mode GaN FETs. This project seeks to investigate the possibilities of using eGaN FETs as the power switching device in a full bridge power stage intended for switch mode audio amplification. A 50 W 1 MHz power stage was built and provided promising audio performance. Future work includes optimization of dead...

  5. Power Parameters and Efficiency of Class B Audio Amplifiers in Real-World Scenario

    Directory of Open Access Journals (Sweden)

    H. Zhivomirov

    2017-04-01

    Full Text Available Consumer audio amplifiers are intended to operate with various loudspeaker loads, i.e. the load impedance profile of the audio amplifier is a priori unknown. We propose the power parameters analysis of the class B audio amplifiers to be carried out in the realistic worst-case (RWC) scenario of operation with the minimal value of the impedance and a RWC type of signal, instead of the nominal impedance of the loudspeaker and a sine-wave signal. Experimental validation, carried out for different types of signals and loudspeaker loads, demonstrates the advantages of the proposed RWC-based power parameters estimation. Furthermore, we provide a way of assessing the safe-operating area (SOA) boundaries, based on the output I-V loci of the amplifier and by means of an equivalent load line (ELL).

  6. Automatic processing of CERN video, audio and photo archives

    International Nuclear Information System (INIS)

    Kwiatek, M

    2008-01-01

    The digitalization of CERN audio-visual archives, a major task currently in progress, will generate over 40 TB of video, audio and photo files. Storing these files is one issue, but a far more important challenge is to provide long-time coherence of the archive and to make these files available on-line with minimum manpower investment. An infrastructure, based on standard CERN services, has been implemented, whereby master files, stored in the CERN Distributed File System (DFS), are discovered and scheduled for encoding into lightweight web formats based on predefined profiles. Changes in master files, conversion profiles or in the metadata database (read from CDS, the CERN Document Server) are automatically detected and the media re-encoded whenever necessary. The encoding processes are run on virtual servers provided on-demand by the CERN Server Self Service Centre, so that new servers can be easily configured to adapt to higher load. Finally, the generated files are made available from the CERN standard web servers with streaming implemented using Windows Media Services

  7. Measuring 3D Audio Localization Performance and Speech Quality of Conferencing Calls for a Multiparty Communication System

    Directory of Open Access Journals (Sweden)

    Mansoor Hyder

    2013-07-01

    Full Text Available Communication systems which support 3D (Three Dimensional) audio offer a couple of advantages to the users/customers. Firstly, within the virtual acoustic environments all participants could easily be recognized through their placement/sitting positions. Secondly, all participants can turn their focus on any particular talker when multiple participants start talking at the same time by taking advantage of the natural listening tendency which is called the Cocktail Party Effect. On the other hand, 3D audio is known as a decreasing factor for overall speech quality because of the commencement of reverberations and echoes within the listening environment. In this article, we study the tradeoff between speech quality and human natural ability of localizing audio events/or talkers within our three dimensional audio supported telephony and teleconferencing solution. Further, we performed subjective user studies by incorporating two different HRTFs (Head Related Transfer Functions), different placements of the teleconferencing participants and different layouts of the virtual environments. Moreover, subjective user studies results for audio event localization and subjective speech quality are presented in this article. This subjective user study would help the research community to optimize the existing 3D audio systems and to design new 3D audio supported teleconferencing solutions based on the quality of experience requirements of the users/customers for agriculture personal in particular and for all potential users in general.

  8. Measuring 3D Audio Localization Performance and Speech Quality of Conferencing Calls for a Multiparty Communication System

    International Nuclear Information System (INIS)

    Hyder, M.; Menghwar, G.D.; Qureshi, A.

    2013-01-01

    Communication systems which support 3D (Three Dimensional) audio offer a couple of advantages to the users/customers. Firstly, within the virtual acoustic environments all participants could easily be recognized through their placement/sitting positions. Secondly, all participants can turn their focus on any particular talker when multiple participants start talking at the same time by taking advantage of the natural listening tendency which is called the Cocktail Party Effect. On the other hand, 3D audio is known as a decreasing factor for overall speech quality because of the commencement of reverberations and echoes within the listening environment. In this article, we study the tradeoff between speech quality and human natural ability of localizing audio events/or talkers within our three dimensional audio supported telephony and teleconferencing solution. Further, we performed subjective user studies by incorporating two different HRTFs (Head Related Transfer Functions), different placements of the teleconferencing participants and different layouts of the virtual environments. Moreover, subjective user studies results for audio event localization and subjective speech quality are presented in this article. This subjective user study would help the research community to optimize the existing 3D audio systems and to design new 3D audio supported teleconferencing solutions based on the quality of experience requirements of the users/customers for agriculture personal in particular and for all potential users in general. (author)

  9. Differences in Human Audio Localization Performance between a HRTF- and a non-HRTF Audio System

    DEFF Research Database (Denmark)

    Larsen, Camilla Horne; Lauritsen, David Skødt; Larsen, Jacob Junker

    2013-01-01

    Spatial audio solutions have been around for a long time in real-time applications, but yielding spatial cues that more closely simulate real life accuracy has been a computational issue, and has often been solved by hardware solutions. This has long been a restriction, but now with more powerful computers this is becoming a lesser and lesser concern and software solutions are now applicable. Most current virtual environment applications do not take advantage of these implementations of accurate spatial cues, however. This paper compares a common implementation of spatial audio and a head-related transfer function (HRTF) system implementation in a study in relation to precision, speed and navigational performance in localizing audio sources in a virtual environment. We found that a system using HRTFs is significantly better at all three performance tasks than a system using panning.

  10. Le registrazioni audio dell’archivio Luigi Nono di Venezia

    Directory of Open Access Journals (Sweden)

    Luca Cossettini

    2009-11-01

    Full Text Available The audio recordings of the Luigi Nono Archive in Venice: guidelines for preservation and critical edition of audio documents. Studying audio recordings brings us back to ancient source verification problems that too often one thinks are overcome by the technical reproduction of sound. Audio signal is “fixed” on a specific carrier (tape, disc etc.) with a specific audio format (speed, number of tracks etc.); the choice of support and format during the first “memorizing” process and the following copying processes is a subjective and, in case of copying, an interpretative operation conducted within a continuously evolving audio technology. What we listen to today is the result of a transmission process that unavoidably transforms the original acoustic event and the documents that memorize it. Audio recording is in no way a timeless and immutable fixing process. It is therefore necessary to study the transmission processes and to reconstruct the audio document tradition. The re-recording of the tapes of the Archivio Luigi Nono, conducted by the Audio Labs of the DAMS Musica of the University of Udine, offers clear examples of the technical and musicological interpretative problems one can find when working with audio recordings.

  11. Elicitation of attributes for the evaluation of audio-on-audio interference

    DEFF Research Database (Denmark)

    Francombe, Jon; Mason, R.; Dewhirst, M.

    2014-01-01

    An experiment to determine the perceptual attributes of the experience of listening to a target audio program in the presence of an audio interferer was performed. The first stage was a free elicitation task in which a total of 572 phrases were produced. In the second stage, a consensus vocabulary procedure was used to reduce these phrases into a comprehensive set of attributes. Groups of experienced and inexperienced listeners determined nine and eight attributes, respectively. These attribute sets were combined by the listeners to produce a final set of 12 attributes: masking, calming, distraction...

  12. Real-Time Perceptual Model for Distraction in Interfering Audio-on-Audio Scenarios

    DEFF Research Database (Denmark)

    Rämö, Jussi; Bech, Søren; Jensen, Søren Holdt

    2017-01-01

    This letter proposes a real-time perceptual model predicting the experienced distraction occurring in interfering audio-on-audio situations. The proposed model improves the computational efficiency of a previous distraction model, which cannot provide predictions in real time. The chosen approach ... model. Thus, while providing similar accuracy as the previous model, the proposed model can be run in real time. The proposed distraction model can be used as a tool for evaluating and optimizing sound-zone systems. Furthermore, the real-time capability of the model introduces new possibilities...

  13. Audio analysis of statistically instantaneous signals with mixed Gaussian probability distributions

    Science.gov (United States)

    Naik, Ganesh R.; Wang, Wenwu

    2012-10-01

    In this article, a novel method is proposed to measure the separation qualities of statistically instantaneous audio signals with mixed Gaussian probability distributions. This study evaluates the impact of the Probability Distribution Function (PDF) of the mixed signals on the outcomes of both sub- and super-Gaussian distributions. Different Gaussian measures are evaluated by using various spectral-distortion measures. It aims to compare the different audio mixtures from both super-Gaussian and sub-Gaussian perspectives. Extensive computer simulation confirms that the separated sources always have super-Gaussian characteristics irrespective of the PDF of the signals or mixtures. The result based on the objective measures demonstrates the effectiveness of source separation in improving the quality of the separated audio sources.

  14. Improved Convolutive and Under-Determined Blind Audio Source Separation with MRF Smoothing.

    Science.gov (United States)

    Zdunek, Rafał

    2013-01-01

    Convolutive and under-determined blind audio source separation from noisy recordings is a challenging problem. Several computational strategies have been proposed to address this problem. This study is concerned with several modifications to the expectation-minimization-based algorithm, which iteratively estimates the mixing and source parameters. This strategy assumes that any entry in each source spectrogram is modeled using superimposed Gaussian components, which are mutually and individually independent across frequency and time bins. In our approach, we resolve this issue by considering a locally smooth temporal and frequency structure in the power source spectrograms. Local smoothness is enforced by incorporating a Gibbs prior in the complete data likelihood function, which models the interactions between neighboring spectrogram bins using a Markov random field. Simulations using audio files derived from stereo audio source separation evaluation campaign 2008 demonstrate high efficiency with the proposed improvement.

  15. Collusion-Resistant Audio Fingerprinting System in the Modulated Complex Lapped Transform Domain

    Science.gov (United States)

    Garcia-Hernandez, Jose Juan; Feregrino-Uribe, Claudia; Cumplido, Rene

    2013-01-01

    Collusion-resistant fingerprinting paradigm seems to be a practical solution to the piracy problem as it allows media owners to detect any unauthorized copy and trace it back to the dishonest users. Despite the billionaire losses in the music industry, most of the collusion-resistant fingerprinting systems are devoted to digital images and very few to audio signals. In this paper, state-of-the-art collusion-resistant fingerprinting ideas are extended to audio signals and the corresponding parameters and operation conditions are proposed. Moreover, in order to carry out fingerprint detection using just a fraction of the pirate audio clip, block-based embedding and its corresponding detector is proposed. Extensive simulations show the robustness of the proposed system against average collusion attack. Moreover, by using an efficient Fast Fourier Transform core and standard computer machines it is shown that the proposed system is suitable for real-world scenarios. PMID:23762455

  16. Collusion-resistant audio fingerprinting system in the modulated complex lapped transform domain.

    Directory of Open Access Journals (Sweden)

    Jose Juan Garcia-Hernandez

    Full Text Available Collusion-resistant fingerprinting paradigm seems to be a practical solution to the piracy problem as it allows media owners to detect any unauthorized copy and trace it back to the dishonest users. Despite the billionaire losses in the music industry, most of the collusion-resistant fingerprinting systems are devoted to digital images and very few to audio signals. In this paper, state-of-the-art collusion-resistant fingerprinting ideas are extended to audio signals and the corresponding parameters and operation conditions are proposed. Moreover, in order to carry out fingerprint detection using just a fraction of the pirate audio clip, block-based embedding and its corresponding detector is proposed. Extensive simulations show the robustness of the proposed system against average collusion attack. Moreover, by using an efficient Fast Fourier Transform core and standard computer machines it is shown that the proposed system is suitable for real-world scenarios.

  17. Audio feedback for student writing in online nursing courses: exploring student and instructor reactions.

    Science.gov (United States)

    Wood, Kathryn A; Moskovitz, Cary; Valiga, Theresa M

    2011-09-01

    Because scientific writing is an essential skill for advanced practice nurses, it is an important component of graduate education. Faculty typically provide written feedback about student writing, but this may not be the most effective choice for the distance-learning environment. This exploratory pilot study's aim was to compare spoken, recorded feedback with written feedback in three areas: which approach do students perceive as providing more useful guidance; which approach helps students feel more connected to the course; and which approach do instructors prefer? Students enrolled in an evidence-based practice graduate-level course received asynchronous audio feedback on their written assignments instead of the written feedback they received in other courses. Results from a survey completed by 30 students at completion of the course suggest a strong preference for audio feedback. This pilot study suggests that audio feedback may be preferable to written comments for distance learning courses. Copyright 2011, SLACK Incorporated.

  18. Audio-teleconferencing as a medium for distance learning: its application for continuing education in optometry.

    Science.gov (United States)

    Wildsoet, C; Wood, J; Parke, J

    1996-02-01

    This paper describes a highly successful pilot program of four audio-teleconferences that was offered in 1993 to optometrists based in rural and regional areas of Queensland. The program represents the first application of such technology for this purpose, either within Australia or overseas. It accessed the facilities of eight of a network of 38 open learning centres across Queensland and comprised an integrated package of three workshops, each of 3 hours duration, covering issues relating to the eye disease, glaucoma, and a stand-alone workshop covering general practice/legal issues. A 'complex directed conference model' of audio-teleconferencing was used, with each workshop incorporating a slide presentation and companion workbook that described individual pre-workshop preparatory activities and was structured to provide a focus for group discussions during the workshop. The program demonstrated audio-teleconferencing to be both a cost- and educationally-effective medium for the delivery of continuing education to a widely distributed audience.

  19. Walsh-Hadamard-Based 3-D Steganography for Protecting Sensitive Information in Point-of-Care.

    Science.gov (United States)

    Abuadbba, Alsharif; Khalil, Ibrahim

    2017-09-01

    Remote points-of-care have recently received a lot of attention for their advantages such as saving lives and cost reduction. The transmitted streams usually contain 1) normal biomedical signals (e.g., electrocardiograms) and 2) highly private information (e.g., patient identity). Despite the obvious advantages, the primary concerns are privacy and authenticity of the transferred data. Therefore, this paper introduces a novel steganographic mechanism that ensures 1) strong privacy preservation of private information by random concealing inside the transferred signals employing a key and 2) evidence of originality for the biomedical signals. To maximize hiding, fast Walsh-Hadamard transform is utilized to transform the signals into a group of coefficients. To ensure the lowest distortion, only less-significant values of coefficients are employed. To strengthen security, the key is utilized in a three-dimensional (3-D) random coefficients' reform to produce a 3-D order employed in the concealing process. The resultant distortion has been thoroughly measured in all stages. After extensive experiments on three types of signals, it has been shown that the algorithm has little impact on the genuine signals (< 1%). The security evaluation also confirms that unlawful retrieval of the hidden information within a reasonable time is highly improbable.
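    The transform stage mentioned in this abstract can be illustrated with a minimal sketch of an unnormalised fast Walsh-Hadamard transform. The function names `fwht` and `inverse_fwht` are hypothetical, and the sketch covers only the transform and its inverse; the key-driven 3-D reordering and the embedding step described by the authors are not reproduced here.

```python
def fwht(values):
    """Unnormalised fast Walsh-Hadamard transform (natural/Hadamard order).

    The input length must be a power of two. Returns a new list.
    """
    data = list(values)
    n = len(data)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    h = 1
    while h < n:
        # Butterfly step: combine elements that are h positions apart.
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                a, b = data[i], data[i + h]
                data[i], data[i + h] = a + b, a - b
        h *= 2
    return data


def inverse_fwht(coeffs):
    """Inverse transform: the same butterflies, scaled by 1/n."""
    n = len(coeffs)
    return [c / n for c in fwht(coeffs)]


signal = [1, 0, 1, 0, 0, 1, 1, 0]
coeffs = fwht(signal)           # [4, 2, 0, -2, 0, 2, 0, 2]
restored = inverse_fwht(coeffs)  # equals the original signal (as floats)
```

    Because the transform is its own inverse up to the factor 1/n, a hiding scheme of this kind can modify selected coefficients and then apply the same routine to return to the signal domain.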

  20. Calibration of an audio frequency noise generator

    DEFF Research Database (Denmark)

    Diamond, Joseph M.

    1966-01-01

    ...it is used for measurement purposes. The spectral density of a noise source may be found by measuring its rms output over a known noise bandwidth. Such a bandwidth may be provided by a passive filter using accurately known elements. For example, the parallel resonant circuit with purely parallel damping has a noise bandwidth Bn = π/2 × (3 dB bandwidth). To apply this method to low audio frequencies, the noise bandwidth of the low Q parallel resonant circuit has been found, including the effects of both series and parallel damping. The method has been used to calibrate a General Radio 1390-B noise generator...
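    The relation Bn = π/2 × (3 dB bandwidth) quoted in this abstract can be checked numerically with a short sketch. It assumes the textbook normalised power response of a parallel resonant circuit with purely parallel damping, |H(f)|² = 1 / (1 + Q²(f/f0 − f0/f)²), whose 3 dB bandwidth is f0/Q; the function names and the example values of f0 and Q are illustrative and do not come from the paper. Series damping, which the paper also treats, is not modelled.

```python
import math


def noise_bandwidth_numeric(f0, q, u_max=20.0, steps=40000):
    """Integrate |H(f)|^2 over 0 < f < inf with the substitution f = f0 * exp(u).

    With that substitution, f/f0 - f0/f = 2*sinh(u) and df = f0*exp(u) du,
    so the integrand decays exponentially in both directions.
    """
    h = 2.0 * u_max / steps
    total = 0.0
    for i in range(steps + 1):
        u = -u_max + i * h
        detune = 2.0 * math.sinh(u)
        integrand = math.exp(u) / (1.0 + (q * detune) ** 2)
        weight = 0.5 if i in (0, steps) else 1.0  # trapezoidal rule
        total += weight * integrand
    return f0 * total * h


def noise_bandwidth_closed_form(f0, q):
    """pi/2 times the 3 dB bandwidth f0/Q."""
    return (math.pi / 2.0) * (f0 / q)


f0_hz, q_factor = 100.0, 0.5  # a low-Q example at a low audio frequency
numeric = noise_bandwidth_numeric(f0_hz, q_factor)
closed = noise_bandwidth_closed_form(f0_hz, q_factor)
relative_error = abs(numeric - closed) / closed
```

    For this idealised response the agreement holds for any Q, which is consistent with the abstract's use of the formula for a low-Q circuit.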

  1. Mixing audio: concepts, practices and tools

    CERN Document Server

    Izhaki, Roey

    2013-01-01

    Your mix can make or break a record, and mixing is an essential catalyst for a record deal. Professional engineers with exceptional mixing skills can earn vast amounts of money and find that they are in demand by the biggest acts. To develop such skills, you need to master both the art and science of mixing. The new edition of this bestselling book offers all you need to know and put into practice in order to improve your mixes. Covering the entire process --from fundamental concepts to advanced techniques -- and offering a multitude of audio samples, tips and tricks, this boo

  2. Predistortion of a Bidirectional Cuk Audio Amplifier

    DEFF Research Database (Denmark)

    Birch, Thomas Hagen; Nielsen, Dennis; Knott, Arnold

    2014-01-01

    Some non-linear amplifier topologies are capable of providing a larger voltage gain than one from a DC source, which could make them suitable for various applications. However, the non-linearities introduce a significant amount of harmonic distortion (THD). Some of this distortion could be reduced using predistortion. This paper suggests linearizing a nonlinear bidirectional Cuk audio amplifier using an analog predistortion approach. A prototype power stage was built and results show that a voltage gain of up to 9 dB and reduction in THD from 6% down to 3% was obtainable using this approach.

  3. Spatial audio quality perception (part 1)

    DEFF Research Database (Denmark)

    Conetta, R.; Brookes, T.; Rumsey, F.

    2015-01-01

    resulting from 48 such SAPs. Perceived degradation also depends on the particular listeners, the program content, and the listening location. For example, combining off-center listener with another SAP can reduce spatial quality significantly when compared to listening to that SAP from a central location. The choice of the SAP can have a large influence on the degree of degradation. Taken together these findings and the quality-annotated database can guide the development of a regression model of perceived overall spatial audio quality, incorporating previously developed spatially-relevant feature...

  4. Effects for augmented reality audio headsets

    OpenAIRE

    Martí i Rabadán, Miquel

    2014-01-01

    Augmented reality is a real-time combination of real and virtual worlds. In augmented reality audio (ARA) real surrounding sounds are mixed with virtual sound sources. In this bachelor’s degree thesis a digital, real-time hear-through system (HTS) is implemented for the acoustical transparency of an ARA headset. It is achieved by adding back the sounds that have been attenuated by the isolation characteristics of the headphone itself. The surrounding sounds are recorded on both ears...

  5. Audio signal encryption using chaotic Hénon map and lifting wavelet transforms

    Science.gov (United States)

    Roy, Animesh; Misra, A. P.

    2017-12-01

    We propose an audio signal encryption scheme based on the chaotic Hénon map. The scheme mainly comprises two phases: one is the preprocessing stage where the audio signal is transformed into data by the lifting wavelet scheme and the other in which the transformed data is encrypted by chaotic data set and hyperbolic functions. Furthermore, we use dynamic keys and consider the key space size to be large enough to resist any kind of cryptographic attacks. A statistical investigation is also made to test the security and the efficiency of the proposed scheme.
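    As a rough illustration of how a chaotic Hénon map can drive encryption, the sketch below iterates the classical map x' = 1 − a·x² + y, y' = b·x with the standard parameters a = 1.4 and b = 0.3, turns the orbit into a byte keystream and XORs it with the data. This is a toy example and not the scheme proposed in the paper: the lifting wavelet preprocessing, the hyperbolic functions and the dynamic keys are omitted, the byte-extraction rule is an arbitrary assumption, and all function names are hypothetical. It should not be regarded as secure.

```python
def henon_keystream(length, x0=0.1, y0=0.1, a=1.4, b=0.3, burn_in=100):
    """Generate `length` keystream bytes from a Henon map orbit.

    The initial point (x0, y0) plays the role of the key. The first
    `burn_in` iterations are discarded so the orbit settles on the attractor.
    """
    x, y = x0, y0
    stream = []
    for step in range(burn_in + length):
        # Both right-hand sides use the previous x and y.
        x, y = 1.0 - a * x * x + y, b * x
        if step >= burn_in:
            # Illustrative quantisation of the state into one byte.
            stream.append(int(abs(x) * 1_000_000) % 256)
    return bytes(stream)


def xor_with_keystream(data, **key):
    """Encrypt or decrypt bytes; XOR makes the operation its own inverse."""
    stream = henon_keystream(len(data), **key)
    return bytes(d ^ k for d, k in zip(data, stream))


samples = bytes([0, 12, 250, 33, 90, 127, 128, 255])  # stand-in for audio data
cipher = xor_with_keystream(samples, x0=0.1, y0=0.1)
plain = xor_with_keystream(cipher, x0=0.1, y0=0.1)
```

    Decryption recovers the data only when the same initial point and parameters are used, which is the sense in which the map's sensitivity to initial conditions acts as a key.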

  6. Frequency dependent loss analysis and minimization of system losses in switchmode audio power amplifiers

    DEFF Research Database (Denmark)

    Yamauchi, Akira; Knott, Arnold; Jørgensen, Ivan Harald Holger

    2014-01-01

    In this paper, frequency dependent losses in switch-mode audio power amplifiers are analyzed and a loss model is improved by taking the voltage dependence of the parasitic capacitance of MOSFETs into account. The estimated power losses are compared to the measurement and great accuracy is achieved. By choosing the optimal switching frequency based on the proposed analysis, the experimental results show that system power losses of the reference design are minimized and an efficiency improvement of up to 8% is achieved without compromising audio performance.

  7. An Exploratory Evaluation of User Interfaces for 3D Audio Mixing

    DEFF Research Database (Denmark)

    Gelineck, Steven; Korsgaard, Dannie Michael

    2015-01-01

    The paper presents an exploratory evaluation comparing different versions of a mid-air gesture-based interface for mixing 3D audio, exploring: (1) how such an interface generally compares to a more traditional physical interface, (2) methods for grabbing/releasing audio channels in mid-air and (3) representation of sources in separate 3D views vs. in one shared 3D view. Results suggest that while the traditional physical interface is generally intuitive and easy to use, the 3D gesture interface provides an improved understanding of the 3D space and provides better control of especially moving sources...

  8. Audio Haptic Videogaming for Developing Wayfinding Skills in Learners Who are Blind.

    Science.gov (United States)

    Sánchez, Jaime; de Borba Campos, Marcia; Espinoza, Matías; Merabet, Lotfi B

    2014-01-01

    Interactive digital technologies are currently being developed as a novel tool for education and skill development. Audiopolis is an audio and haptic based videogame designed for developing orientation and mobility (O&M) skills in people who are blind. We have evaluated the cognitive impact of videogame play on O&M skills by assessing performance on a series of behavioral tasks carried out in both indoor and outdoor virtual spaces. Our results demonstrate that the use of Audiopolis had a positive impact on the development and use of O&M skills in school-aged learners who are blind. The impact of audio and haptic information on learning is also discussed.

  9. Acoustic Heritage and Audio Creativity: the Creative Application of Sound in the Representation, Understanding and Experience of Past Environments

    OpenAIRE

    Damian Murphy; Simon Shelley; Aglaia Foteinou; Jude Brereton; Helena Daffern

    2017-01-01

    Acoustic Heritage is one aspect of archaeoacoustics, and refers more specifically to the quantifiable acoustic properties of buildings, sites and landscapes from our architectural and archaeological past, forming an important aspect of our intangible cultural heritage. Auralisation, the audio equivalent of 3D visualization, enables these acoustic properties, captured via the process of measurement and survey, or computer based modelling, to form the basis of an audio reconstruction and presen...

  10. Managing exam stress using UMTS phones: the advantage of portable audio/video support.

    Science.gov (United States)

    Riva, Giuseppe; Grassi, Alessandra; Villani, Daniela; Gaggioli, Andrea; Preziosa, Alessandra

    2007-01-01

    Test-taking anxiety or stress is very common among university students. It can be very distressing and sometimes debilitating. Exam anxiety involves physical and emotional components that may be taken into account for managing and reducing anxiety. One approach to controlling exam anxiety is to learn how to regulate emotions. To help students manage exam stress, we developed a specific protocol based on mobile narratives, that is, multimedia narratives experienced on UMTS/3G phones. Thirty female university students (M=23.48; sd=1.24) who were going to take an exam within a week were included in the trial. They were randomly divided into five groups according to the type and mobility of the medium used: (1) audio only narrative (CD at home); (2) audio only narrative (portable MP3); (3) audio and video narrative (DVD at home); (4) audio and video narrative (UMTS based); (5) control group. Audio/video narratives induced a reduction in exam anxiety in more than 80% of the sample vs 50% of the MP3 sample and 0% of the CD sample. Further, all the users who experienced mobile narratives on UMTS phones were able to relax before the exam, against 50% of DVD users and 33% of audio-only users. The trial showed greater efficacy of mobile narratives experienced on UMTS phones in reducing the level of exam stress and in helping the student to relax. These results suggest that for the specific sample considered (Italian university students), the medium used for providing an anti-stress protocol has a clear impact on its efficacy.

  11. Elicitation of attributes for the evaluation of audio-on-audio interference.

    Science.gov (United States)

    Francombe, Jon; Mason, Russell; Dewhirst, Martin; Bech, Søren

    2014-11-01

    An experiment to determine the perceptual attributes of the experience of listening to a target audio program in the presence of an audio interferer was performed. The first stage was a free elicitation task in which a total of 572 phrases were produced. In the second stage, a consensus vocabulary procedure was used to reduce these phrases into a comprehensive set of attributes. Groups of experienced and inexperienced listeners determined nine and eight attributes, respectively. These attribute sets were combined by the listeners to produce a final set of 12 attributes: masking, calming, distraction, separation, confusion, annoyance, environment, chaotic, balance and blend, imagery, response to stimuli over time, and short-term response to stimuli. In the third stage, a simplified ranking procedure was used to select only the most useful and relevant attributes. Four attributes were selected: distraction, annoyance, balance and blend, and confusion. Ratings using these attributes were collected in the fourth stage, and a principal component analysis performed. This suggested two dimensions underlying the perception of an audio-on-audio interference situation: The first dimension was labeled "distraction" and accounted for 89% of the variance; the second dimension, accounting for 10% of the variance, was labeled "balance and blend."

  12. Audio-Visual, Visuo-Tactile and Audio-Tactile Correspondences in Preschoolers.

    Science.gov (United States)

    Nava, Elena; Grassi, Massimo; Turati, Chiara

    2016-01-01

    Interest in crossmodal correspondences has recently seen a renaissance thanks to numerous studies in human adults. Yet, still very little is known about crossmodal correspondences in children, particularly in sensory pairings other than audition and vision. In the current study, we investigated whether 4-5-year-old children match auditory pitch to the spatial motion of visual objects (audio-visual condition). In addition, we investigated whether this correspondence extends to touch, i.e., whether children also match auditory pitch to the spatial motion of touch (audio-tactile condition) and the spatial motion of visual objects to touch (visuo-tactile condition). In two experiments, two different groups of children were asked to indicate which of two stimuli fitted best with a centrally located third stimulus (Experiment 1), or to report whether two presented stimuli fitted together well (Experiment 2). We found sensitivity to the congruency of all of the sensory pairings only in Experiment 2, suggesting that only under specific circumstances can these correspondences be observed. Our results suggest that pitch-height correspondences for audio-visual and audio-tactile combinations may still be weak in preschool children, and we speculate that this could be due to immature linguistic and auditory cues that are still developing at age five.

  13. Newnes audio and Hi-Fi engineer's pocket book

    CERN Document Server

    Capel, Vivian

    2013-01-01

    Newnes Audio and Hi-Fi Engineer's Pocket Book, Second Edition provides concise discussion of several audio topics. The book is comprised of 10 chapters that cover different audio equipment. The coverage of the text includes microphones, gramophones, compact discs, and tape recorders. The book also covers high-quality radio, amplifiers, and loudspeakers. The book then reviews the concepts of sound and acoustics, and presents some facts and formulas relevant to audio. The text will be useful to sound engineers and other professionals whose work involves sound systems.

  14. Semantic Labeling of Nonspeech Audio Clips

    Directory of Open Access Journals (Sweden)

    Xiaojuan Ma

    2010-01-01

    Human communication about entities and events is primarily linguistic in nature. While visual representations of information are shown to be highly effective as well, relatively little is known about the communicative power of auditory nonlinguistic representations. We created a collection of short nonlinguistic auditory clips encoding familiar human activities, objects, animals, natural phenomena, machinery, and social scenes. We presented these sounds to a broad spectrum of anonymous human workers using Amazon Mechanical Turk and collected verbal sound labels. We analyzed the human labels in terms of their lexical and semantic properties to ascertain that the audio clips do evoke the information suggested by their pre-defined captions. We then measured the agreement with the semantically compatible labels for each sound clip. Finally, we examined which kinds of entities and events, when captured by nonlinguistic acoustic clips, appear to be well-suited to elicit information for communication, and which ones are less discriminable. Our work is set against the broader goal of creating resources that facilitate communication for people with some types of language loss. Furthermore, our data should prove useful for future research in machine analysis/synthesis of audio, such as computational auditory scene analysis, and annotating/querying large collections of sound effects.

  15. Simple Solutions for Space Station Audio Problems

    Science.gov (United States)

    Wood, Eric

    2016-01-01

    Throughout this summer, a number of different projects were supported relating to various NASA programs, including the International Space Station (ISS) and Orion. The primary project that was worked on was designing and testing an acoustic diverter which could be used on the ISS to increase sound pressure levels in Node 1, a module that does not have any Audio Terminal Units (ATUs) inside it. This acoustic diverter is not intended to be a permanent solution to providing audio to Node 1; it is simply intended to improve conditions while more permanent solutions are under development. One of the most exciting aspects of this project is that the acoustic diverter is designed to be 3D printed on the ISS, using the 3D printer that was set up earlier this year. Because of this, no new hardware needs to be sent up to the station, and no extensive hardware testing needs to be performed on the ground before sending it to the station. Instead, the 3D part file can simply be uploaded to the station's 3D printer, where the diverter will be made.

  16. Feature Selection for Audio Surveillance in Urban Environment

    Directory of Open Access Journals (Sweden)

    KIKTOVA Eva

    2014-05-01

    This paper presents the work leading to an acoustic event detection system designed to recognize two types of acoustic events (shot and breaking glass) in an urban environment. For this purpose, extensive front-end processing was performed to obtain an effective parametric representation of the input sound. MFCC features, features computed during their extraction (MELSPEC and FBANK), MPEG-7 audio descriptors, and other temporal and spectral characteristics were extracted. High-dimensional feature sets were created and, in the next phase, reduced by mutual information based selection algorithms. A Hidden Markov Model based classifier was applied and evaluated using the Viterbi decoding algorithm. In this way, very effective feature sets were identified, as were the less important features.
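
    The mutual information based reduction step mentioned in this abstract can be sketched as below. This is a simplified illustration, not the selection algorithms used in the paper: it assumes features have already been discretized into a few bins, scores each feature independently against the class label, and uses hypothetical function and feature names.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Mutual information (in bits) between two equally long discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * log2((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in pxy.items()
    )


def rank_features(columns, labels):
    """Return feature names sorted from most to least informative about labels.

    columns: dict mapping a feature name to its list of discretized values.
    """
    scores = {name: mutual_information(col, labels) for name, col in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)
```

    Keeping only the top-ranked names from `rank_features` gives a reduced feature set. Ranking features one at a time ignores redundancy between them, which more elaborate selection criteria try to account for.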

  17. Exploring Meaning Negotiation Patterns in Synchronous Audio and Video Conferencing English Classes in China

    Science.gov (United States)

    Li, Chenxi; Wu, Ligao; Li, Chen; Tang, Jinlan

    2017-01-01

    This work-in-progress doctoral research project aims to identify meaning negotiation patterns in synchronous audio and video Computer-Mediated Communication (CMC) environments based on the model of CMC text chat proposed by Smith (2003). The study was conducted in the Institute of Online Education at Beijing Foreign Studies University. Four dyads…

  18. Challenges of Using Audio-Visual Aids as Warm-Up Activity in Teaching Aviation English

    Science.gov (United States)

    Sahin, Mehmet; Sule, St.; Seçer, Y. E.

    2016-01-01

    This study aims to find out the challenges encountered in the use of video as audio-visual material as a warm-up activity in aviation English course at high school level. This study is based on a qualitative study in which focus group interview is used as the data collection procedure. The participants of focus group are four instructors teaching…

  19. Online Instructor's Use of Audio Feedback to Increase Social Presence and Student Satisfaction

    Science.gov (United States)

    Portolese Dias, Laura; Trumpy, Robert

    2014-01-01

    This study investigates the impact of written group feedback, versus audio feedback, based upon four student satisfaction measures in the online classroom environment. Undergraduate students in the control group were provided both individual written feedback and group written feedback, while undergraduate students in the experimental treatment…

  20. Transitioning from Analog to Digital Audio Recording in Childhood Speech Sound Disorders

    Science.gov (United States)

    Shriberg, Lawrence D.; Mcsweeny, Jane L.; Anderson, Bruce E.; Campbell, Thomas F.; Chial, Michael R.; Green, Jordan R.; Hauner, Katherina K.; Moore, Christopher A.; Rusiewicz, Heather L.; Wilson, David L.

    2005-01-01

    Few empirical findings or technical guidelines are available on the current transition from analog to digital audio recording in childhood speech sound disorders. Of particular concern in the present context was whether a transition from analog- to digital-based transcription and coding of prosody and voice features might require re-standardizing…

  1. Designing between Pedagogies and Cultures: Audio-Visual Chinese Language Resources for Australian Schools

    Science.gov (United States)

    Yuan, Yifeng; Shen, Huizhong

    2016-01-01

    This design-based study examines the creation and development of audio-visual Chinese language teaching and learning materials for Australian schools by incorporating users' feedback and content writers' input that emerged in the designing process. Data were collected from workshop feedback of two groups of Chinese-language teachers from primary…

  2. MPEG-7 audio-visual indexing test-bed for video retrieval

    Science.gov (United States)

    Gagnon, Langis; Foucher, Samuel; Gouaillier, Valerie; Brun, Christelle; Brousseau, Julie; Boulianne, Gilles; Osterrath, Frederic; Chapdelaine, Claude; Dutrisac, Julie; St-Onge, Francis; Champagne, Benoit; Lu, Xiaojian

    2003-12-01

    This paper reports on the development status of a Multimedia Asset Management (MAM) test-bed for content-based indexing and retrieval of audio-visual documents within the MPEG-7 standard. The project, called "MPEG-7 Audio-Visual Document Indexing System" (MADIS), specifically targets the indexing and retrieval of video shots and key frames from documentary film archives, based on audio-visual content like face recognition, motion activity, speech recognition and semantic clustering. The MPEG-7/XML encoding of the film database is done off-line. The description decomposition is based on a temporal decomposition into visual segments (shots), key frames and audio/speech sub-segments. The visible outcome will be a web site that allows video retrieval using a proprietary XQuery-based search engine and is accessible to members at the Canadian National Film Board (NFB) Cineroute site. For example, an end-user will be able to ask for movie shots in the database that were produced in a specific year, that contain the face of a specific actor who says a specific word, and in which there is no motion activity. Video streaming is performed over the high bandwidth CA*net network deployed by CANARIE, a public Canadian Internet development organization.

  3. Audio-vestibular signs and symptoms in Chiari malformation type I. Case series and literature review.

    Science.gov (United States)

    Guerra Jiménez, Gloria; Mazón Gutiérrez, Ángel; Marco de Lucas, Enrique; Valle San Román, Natalia; Martín Laez, Rubén; Morales Angulo, Carmelo

    2015-01-01

    Chiari malformation is an alteration of the base of the skull with herniation through the foramen magnum of the brain stem and cerebellum. Although the most common presentation is occipital headache, the association of audio-vestibular symptoms is not rare. The aim of our study was to describe audio-vestibular signs and symptoms in Chiari malformation type I (CM-I). We performed a retrospective observational study of patients referred to our unit during the last 5 years. We also carried out a literature review of audio-vestibular signs and symptoms in this disease. There were 9 patients (2 males and 7 females), with an average age of 42.8 years. Five patients presented a Ménière-like syndrome; 2 cases, a recurrent vertigo with peripheral features; one patient showed a sudden hearing loss; and one case suffered a sensorineural hearing loss with early childhood onset. The most common audio-vestibular symptom indicated in the literature in patients with CM-I is unsteadiness (49%), followed by dizziness (18%), nystagmus (15%) and hearing loss (15%). Nystagmus is frequently horizontal (74%) or down-beating (18%). Other audio-vestibular signs and symptoms are tinnitus (11%), aural fullness (10%) and hyperacusis (1%). Occipital headache that increases with Valsalva manoeuvres and hand paresthesias are very suggestive symptoms. The appearance of audio-vestibular manifestations in CM-I makes it common to refer these patients to neurotologists. Unsteadiness, vertiginous syndromes and sensorineural hearing loss are frequent. Nystagmus, especially horizontal and down-beating, is not rare. It is important for neurotologists to familiarise themselves with CM-I symptoms to be able to consider it in differential diagnosis. Copyright © 2014 Elsevier España, S.L.U. y Sociedad Española de Otorrinolaringología y Patología Cérvico-Facial. All rights reserved.

  4. A Novel Audio Cryptosystem Using Chaotic Maps and DNA Encoding

    Directory of Open Access Journals (Sweden)

    S. J. Sheela

    2017-01-01

    Chaotic maps have good potential in security applications due to their inherent characteristics relevant to cryptography. This paper introduces a new audio cryptosystem based on chaotic maps, hybrid chaotic shift transform (HCST), and deoxyribonucleic acid (DNA) encoding rules. The scheme uses chaotic maps such as the two-dimensional modified Henon map (2D-MHM) and the standard map. The 2D-MHM, which has sophisticated chaotic behavior for an extensive range of control parameters, is used to perform HCST. DNA encoding technology is used as an auxiliary tool which enhances the security of the cryptosystem. The performance of the algorithm is evaluated for various speech signals using different encryption/decryption quality metrics. The simulation and comparison results show that the algorithm can achieve good encryption results and is able to resist several cryptographic attacks. The various types of analysis revealed that the algorithm is suitable for narrow band radio communication and real-time speech encryption applications.
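
    To make the DNA encoding step in this abstract concrete, the sketch below maps each pair of bits to one nucleotide symbol and back. The abstract does not say which encoding rules the authors use; the mapping 00→A, 01→C, 10→G, 11→T and the function names here are assumptions chosen for illustration, and the chaotic maps and HCST are not modelled.

```python
# Assumed rule for illustration: one possible 2-bit-to-nucleotide mapping.
RULE = {"00": "A", "01": "C", "10": "G", "11": "T"}
INVERSE_RULE = {base: bits for bits, base in RULE.items()}


def dna_encode(data):
    """Encode bytes as a nucleotide string, four symbols per byte."""
    bits = "".join(format(byte, "08b") for byte in data)
    return "".join(RULE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def dna_decode(sequence):
    """Invert dna_encode: turn a nucleotide string back into bytes."""
    bits = "".join(INVERSE_RULE[base] for base in sequence)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
```

    For example, the byte 0x1B (binary 00011011) encodes to `ACGT`. On its own this is a reversible re-labelling of the data, so any security would have to come from how the rule is selected and combined with the chaotic stages.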

  5. Real Time Recognition Of Speakers From Internet Audio Stream

    Directory of Open Access Journals (Sweden)

    Weychan Radoslaw

    2015-09-01

    In this paper we present an automatic speaker recognition technique that uses Internet radio lossy (encoded) speech signal streams. We show the influence of the audio encoder (e.g., bitrate) on the speaker model quality. The model of each speaker was calculated using the Gaussian mixture model (GMM) approach. Both the speaker recognition and the further analysis were carried out on short utterances to facilitate real time processing. The neighborhoods of the speaker models were analyzed using the ISOMAP algorithm. The experiments were based on four 1-hour public debates with 7–8 speakers (including the moderator), acquired from the Polish radio Internet services. The presented software was developed in the MATLAB environment.

  6. Audio-haptic-virtual Mona Lisa (The Blind and Painting)

    Directory of Open Access Journals (Sweden)

    Aksinja Kermauner

    2014-03-01

    The purpose of the article is to explore in what ways the visual arts (with emphasis on painting) can be brought closer to the blind in the postmodern society, in which sight is perceived to be the chief sense and in which most information is based on images. The basic methods of presenting a work of art involve the remaining senses, mostly those of hearing and touch. It is of course not enough just to deliver a factual description of a painting or to transform it into tactile graphics; more complex techniques such as audio-description, the method of associations, and participation in role-playing, all with the aim of a holistic experience of the work of art, must be sought instead. In the world of virtual reality, additional equipment for the blind (e.g., data gloves) provides new opportunities.

  7. Content Discovery from Composite Audio : An unsupervised approach

    NARCIS (Netherlands)

    Lu, L.

    2009-01-01

    In this thesis, we developed and assessed a novel robust and unsupervised framework for semantic inference from composite audio signals. We focused on the problem of detecting audio scenes and grouping them into meaningful clusters. Our approach addressed all major steps in a general process of

  8. Tune in the Net with RealAudio.

    Science.gov (United States)

    Buchanan, Larry

    1997-01-01

    Describes how to connect to the RealAudio Web site to download a player that provides sound from Web pages to the computer through streaming technology. Explains hardware and software requirements and provides addresses for other RealAudio Web sites, including weather information and current news. (LRW)

  9. Teaching Audio Playwriting: The Pedagogy of Drama Podcasting

    Science.gov (United States)

    Eshelman, David J.

    2016-01-01

    This article suggests how teaching artists can develop practical coursework in audio playwriting. To prepare students to work in the reemergent audio drama medium, the author created a seminar course called Radio Theatre Writing, taught at Arkansas Tech University in the fall of 2014. The course had three sections. First, it focused on…

  10. Use of Video and Audio Texts in EFL Listening Test

    Science.gov (United States)

    Basal, Ahmet; Gülözer, Kaine; Demir, Ibrahim

    2015-01-01

    The study aims to discover whether audio or video modality in a listening test is more beneficial to test takers. In this study, the posttest-only control group design was utilized and quantitative data were collected in order to measure participant performances concerning two types of modality (audio or video) in a listening test. The…

  11. Effect of Audio vs. Video on Aural Discrimination of Vowels

    Science.gov (United States)

    McCrocklin, Shannon

    2012-01-01

    Despite the growing use of media in the classroom, the effects of using audio versus video in pronunciation teaching have been largely ignored. To analyze the impact of the use of audio or video training on aural discrimination of vowels, 61 participants (all students at a large American university) took a pre-test followed by two training…

  12. A Case Study on Audio Feedback with Geography Undergraduates

    Science.gov (United States)

    Rodway-Dyer, Sue; Knight, Jasper; Dunne, Elizabeth

    2011-01-01

    Several small-scale studies have suggested that audio feedback can help students to reflect on their learning and to develop deep learning approaches that are associated with higher attainment in assessments. For this case study, Geography undergraduates were given audio feedback on a written essay assignment, alongside traditional written…

  13. Automated Speech and Audio Analysis for Semantic Access to Multimedia

    NARCIS (Netherlands)

    Jong, F.M.G. de; Ordelman, R.; Huijbregts, M.

    2006-01-01

    The deployment and integration of audio processing tools can enhance the semantic annotation of multimedia content, and as a consequence, improve the effectiveness of conceptual access tools. This paper overviews the various ways in which automatic speech and audio analysis can contribute to

  14. Decision-level fusion for audio-visual laughter detection

    NARCIS (Netherlands)

    Reuderink, B.; Poel, M.; Truong, K.; Poppe, R.; Pantic, M.

    2008-01-01

    Laughter is a highly variable signal, which can be caused by a spectrum of emotions. This makes the automatic detection of laughter a challenging, but interesting task. We perform automatic laughter detection using audio-visual data from the AMI Meeting Corpus. Audio-visual laughter detection is

  15. Automated speech and audio analysis for semantic access to multimedia

    NARCIS (Netherlands)

    de Jong, Franciska M.G.; Ordelman, Roeland J.F.; Huijbregts, M.A.H.; Avrithis, Y.; Kompatsiaris, Y.; Staab, S.; O' Connor, N.E.

    2006-01-01

    The deployment and integration of audio processing tools can enhance the semantic annotation of multimedia content, and as a consequence, improve the effectiveness of conceptual access tools. This paper overviews the various ways in which automatic speech and audio analysis can contribute to

  16. On the Use of Memory Models in Audio Features

    DEFF Research Database (Denmark)

    Jensen, Karl Kristoffer

    2011-01-01

    Audio feature estimation is potentially improved by including higher-level models. One such model is the Short Term Memory (STM) model. A new paradigm of audio feature estimation is obtained by adding the influence of notes in the STM. These notes are identified when the perceptual spectral flux......, and an initial experiment with sensory dissonance has been undertaken with good results....

  17. Automatic processing of CERN video, audio and photo archives

    CERN Document Server

    Kwiatek, M

    2008-01-01

    The digitization of CERN audio-visual archives, a major task currently in progress, will generate over 40 TB of video, audio and photo files. Storing these files is one issue, but a far more important challenge is to provide long-term coherence of the archive and to make these files available on-line with minimum manpower investment.

  18. Improving audio chord transcription by exploiting harmonic and metric knowledge

    NARCIS (Netherlands)

    de Haas, W.B.; Rodrigues Magalhães, J.P.; Wiering, F.

    2012-01-01

    We present a new system for chord transcription from polyphonic musical audio that uses domain-specific knowledge about tonal harmony and metrical position to improve chord transcription performance. Low-level pulse and spectral features are extracted from an audio source using the Vamp plugin

  19. Prototype of a Lossless Audio Compression Codec Using Entropy Encoding (original title: PROTOTIPE KOMPRESI LOSSLESS AUDIO CODEC MENGGUNAKAN ENTROPY ENCODING)

    OpenAIRE

    Andreas Soegandi

    2010-01-01

    The purpose of this study was to perform lossless compression on uncompressed audio files to minimize file size without reducing quality. The application was developed using the entropy encoding compression method with the Rice coding technique. The resulting compression ratio is good enough, and the application is easy to develop further because the algorithm is quite simple.
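
    The Rice coding technique named in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions rather than the application described in the study: it works on a list of signed integers (for example, prediction residuals), maps them to non-negative values with a zigzag mapping, uses a fixed parameter `k`, and represents the output as a string of '0'/'1' characters for readability. All function names are hypothetical.

```python
def zigzag(n):
    """Map signed integers to non-negative ones: 0, -1, 1, -2, ... -> 0, 1, 2, 3, ..."""
    return 2 * n if n >= 0 else -2 * n - 1


def unzigzag(u):
    """Invert zigzag."""
    return u >> 1 if u % 2 == 0 else -((u + 1) >> 1)


def rice_encode(values, k):
    """Rice-code each value: quotient in unary (ones, then a zero), remainder in k bits."""
    parts = []
    for value in values:
        u = zigzag(value)
        quotient, remainder = u >> k, u & ((1 << k) - 1)
        parts.append("1" * quotient + "0")
        if k:
            parts.append(format(remainder, "0{}b".format(k)))
    return "".join(parts)


def rice_decode(bits, k, count):
    """Decode `count` values from a bit string produced by rice_encode with the same k."""
    values, pos = [], 0
    for _ in range(count):
        quotient = 0
        while bits[pos] == "1":  # read the unary quotient
            quotient += 1
            pos += 1
        pos += 1  # skip the terminating zero
        remainder = int(bits[pos:pos + k], 2) if k else 0
        pos += k
        values.append(unzigzag((quotient << k) | remainder))
    return values
```

    With `k = 2`, the value 3 maps to 6 and is written as `1010` (quotient 1 in unary, remainder 2 in two bits). Small residuals produce short codes, which is why the choice of `k` relative to the typical residual magnitude largely determines the compression ratio.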

  20. Prototype of a Lossless Audio Compression Codec Using Entropy Encoding (original title: Prototipe Kompresi Lossless Audio Codec Menggunakan Entropy Encoding)

    Directory of Open Access Journals (Sweden)

    Andreas Soegandi

    2010-12-01

    The purpose of this study was to perform lossless compression on uncompressed audio files to minimize file size without reducing quality. The application was developed using the entropy encoding compression method with the Rice coding technique. The resulting compression ratio is good enough, and the application is easy to develop further because the algorithm is quite simple.