streaming audio-visual content: Topics by WorldWideScience.org

Sample records for streaming audio-visual content

Extraction of Information of Audio-Visual Contents

Directory of Open Access Journals (Sweden)

Carlos Aguilar

2011-10-01

Full Text Available In this article we show how Channel Theory (Barwise and Seligman, 1997) can be used to model the process of information extraction carried out by audiences of audio-visual content. To do this, we rely on the concepts proposed by Channel Theory and, especially, its treatment of representational systems. We then show how the information that an agent is capable of extracting from the content depends on the number of channels he is able to establish between the content and the set of classifications he is able to discriminate. The agent can attempt to extract information through these channels from the content as a whole; however, we discuss the advantages of extracting from its constituents in order to obtain a greater number of informational items that represent it. After showing how the extraction process is carried out for each channel, we propose a method of representing all the informative values an agent can obtain from a content using a matrix constituted by the channels the agent is able to establish on the content (source classifications) and the ones he can understand as individual (destination classifications). We finally show how this representation makes it possible to reflect the evolution of the informative items through the evolution of the audio-visual content.
Audio-visual temporal recalibration can be constrained by content cues regardless of spatial overlap

Directory of Open Access Journals (Sweden)

Warrick Roseboom

2013-04-01

Full Text Available It has now been well established that the point of subjective synchrony for audio and visual events can be shifted following exposure to asynchronous audio-visual presentations, an effect often referred to as temporal recalibration. Recently it was further demonstrated that it is possible to concurrently maintain two such recalibrated, and opposing, estimates of audio-visual temporal synchrony. However, it remains unclear precisely what defines a given audio-visual pair such that it is possible to maintain a temporal relationship distinct from other pairs. It has been suggested that spatial separation of the different audio-visual pairs is necessary to achieve multiple distinct audio-visual synchrony estimates. Here we investigated if this was necessarily true. Specifically, we examined whether it is possible to obtain two distinct temporal recalibrations for stimuli that differed only in featural content. Using both complex (audio visual speech; Experiment 1) and simple stimuli (high and low pitch audio matched with either vertically or horizontally oriented Gabors; Experiment 2), we found concurrent, and opposite, recalibrations despite there being no spatial difference in presentation location at any point throughout the experiment. This result supports the notion that the content of an audio-visual pair can be used to constrain distinct audio-visual synchrony estimates regardless of spatial overlap.
Web Audio/Video Streaming Tool

Science.gov (United States)

Guruvadoo, Eranna K.

2003-01-01

In order to promote a NASA-wide educational outreach program to educate and inform the public about space exploration, NASA, at Kennedy Space Center, is seeking efficient ways to add more content to the web by streaming audio/video files. This project proposes a high-level overview of a framework for the creation, management, and scheduling of audio/video assets over the web. To support short-term goals, the prototype of a web-based tool is designed and demonstrated to automate the process of streaming audio/video files. The tool provides web-enabled user interfaces to manage video assets, create publishable schedules of video assets for streaming, and schedule the streaming events. These operations are performed on user-defined and system-derived metadata of audio/video assets stored in a relational database, while the assets reside in a separate repository. The prototype tool is designed using ColdFusion 5.0.
Robust audio-visual speech recognition under noisy audio-video conditions.

Science.gov (United States)

Stewart, Darryl; Seymour, Rowan; Pass, Adrian; Ming, Ji

2014-02-01

This paper presents the maximum weighted stream posterior (MWSP) model as a robust and efficient stream integration method for audio-visual speech recognition in environments where the audio or video streams may be subjected to unknown and time-varying corruption. A significant advantage of MWSP is that it does not require any specific measurements of the signal in either stream to calculate appropriate stream weights during recognition, and as such it is modality-independent. This also means that MWSP complements and can be used alongside many of the other approaches that have been proposed in the literature for this problem. For evaluation we used the large XM2VTS database for speaker-independent audio-visual speech recognition. The extensive tests include both clean and corrupted utterances, with corruption added to the video stream, the audio stream, or both, using a variety of types (e.g., MPEG-4 video compression) and levels of noise. The experiments show that this approach gives excellent performance in comparison to another well-known dynamic stream weighting approach and also compared to any fixed-weighted integration approach, both in clean conditions and when noise is added to either stream. Furthermore, our experiments show that the MWSP approach dynamically selects suitable integration weights on a frame-by-frame basis according to the level of noise in the streams and also according to the naturally fluctuating relative reliability of the modalities even in clean conditions. The MWSP approach is shown to maintain robust recognition performance in all tested conditions, while requiring no prior knowledge about the type or level of noise.
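The frame-by-frame weight selection described in this abstract can be illustrated with a small sketch. This is only one plausible reading of the idea and not the authors' implementation: the abstract does not give the exact formulation, so the weight grid, the log-linear combination, and the function names `softmax` and `fuse_frame` are all assumptions made for illustration.

```python
import math


def softmax(log_scores):
    """Turn unnormalised log-scores into a posterior distribution."""
    m = max(log_scores)
    exps = [math.exp(s - m) for s in log_scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_frame(audio_loglik, video_loglik, weights=(0.0, 0.25, 0.5, 0.75, 1.0)):
    """Pick the stream weight whose fused posterior is most confident.

    Assumption: streams are combined log-linearly, w * audio + (1 - w) * video,
    and the weight is chosen per frame by maximising the top class posterior.
    Returns (best_class, best_weight, best_posterior).
    """
    best = None
    for w in weights:
        combined = [w * a + (1.0 - w) * v
                    for a, v in zip(audio_loglik, video_loglik)]
        posterior = softmax(combined)
        cls = max(range(len(posterior)), key=posterior.__getitem__)
        if best is None or posterior[cls] > best[2]:
            best = (cls, w, posterior[cls])
    return best
```

Under these assumptions no noise estimate is needed: a stream that has become uninformative (flat likelihoods) simply loses out to the weight that favours the other stream.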
Audio-visual speech timing sensitivity is enhanced in cluttered conditions.

Directory of Open Access Journals (Sweden)

Warrick Roseboom

2011-04-01

Full Text Available Events encoded in separate sensory modalities, such as audition and vision, can seem to be synchronous across a relatively broad range of physical timing differences. This may suggest that the precision of audio-visual timing judgments is inherently poor. Here we show that this is not necessarily true. We contrast timing sensitivity for isolated streams of audio and visual speech, and for streams of audio and visual speech accompanied by additional, temporally offset, visual speech streams. We find that the precision with which synchronous streams of audio and visual speech are identified is enhanced by the presence of additional streams of asynchronous visual speech. Our data suggest that timing perception is shaped by selective grouping processes, which can result in enhanced precision in temporally cluttered environments. The imprecision suggested by previous studies might therefore be a consequence of examining isolated pairs of audio and visual events. We argue that when an isolated pair of cross-modal events is presented, they tend to group perceptually and to seem synchronous as a consequence. We have revealed greater precision by providing multiple visual signals, possibly allowing a single auditory speech stream to group selectively with the most synchronous visual candidate. The grouping processes we have identified might be important in daily life, such as when we attempt to follow a conversation in a crowded room.
Automated processing of massive audio/video content using FFmpeg

Directory of Open Access Journals (Sweden)

Kia Siang Hock

2014-01-01

Full Text Available Audio and video content forms an integral, important and expanding part of the digital collections in libraries and archives world-wide. While these memory institutions are familiar and well-versed in the management of more conventional materials such as books, periodicals, ephemera and images, the handling of audio (e.g., oral history recordings) and video content (e.g., audio-visual recordings, broadcast content) requires additional toolkits. In particular, a robust and comprehensive tool that provides a programmable interface is indispensable when dealing with tens of thousands of hours of audio and video content. FFmpeg is comprehensive and well-established open source software that is capable of the full range of audio/video processing tasks (such as encode, decode, transcode, mux, demux, stream and filter). It is also capable of handling a wide range of audio and video formats, a unique challenge in memory institutions. It comes with a command line interface, as well as a set of developer libraries that can be incorporated into applications.
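As a rough illustration of the programmable, batch-oriented use of the FFmpeg command line that this abstract describes, the sketch below builds one `ffmpeg` invocation per source file. The directory layout, the WAV-to-MP3 target, and the helper names `build_transcode_command` and `transcode_all` are hypothetical; only the standard options `-y` (overwrite), `-i` (input), `-vn` (drop video) and `-b:a` (audio bitrate) are used, and MP3 output assumes the local FFmpeg build includes an MP3 encoder.

```python
import subprocess
from pathlib import Path


def build_transcode_command(src, dst_dir, audio_bitrate="128k"):
    """Build an ffmpeg argument list that extracts/transcodes audio to MP3."""
    src = Path(src)
    dst = Path(dst_dir) / (src.stem + ".mp3")
    # The output container/codec is inferred by ffmpeg from the .mp3 extension.
    return ["ffmpeg", "-y", "-i", str(src), "-vn", "-b:a", audio_bitrate, str(dst)]


def transcode_all(src_dir, dst_dir, pattern="*.wav", run=subprocess.run):
    """Transcode every file in src_dir matching pattern; returns the commands run.

    `run` is injectable so the batch logic can be exercised without ffmpeg.
    """
    commands = [build_transcode_command(p, dst_dir)
                for p in sorted(Path(src_dir).glob(pattern))]
    for cmd in commands:
        run(cmd, check=True)  # raises CalledProcessError if ffmpeg fails
    return commands
```

Passing the arguments as a list rather than a shell string avoids quoting problems with unusual file names, which matters when processing large legacy collections.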
Audio-Visual Speech Recognition Using MPEG-4 Compliant Visual Features

Directory of Open Access Journals (Sweden)

Petar S. Aleksic

2002-11-01

Full Text Available We describe an audio-visual automatic continuous speech recognition system, which significantly improves speech recognition performance over a wide range of acoustic noise levels, as well as under clean audio conditions. The system utilizes facial animation parameters (FAPs) supported by the MPEG-4 standard for the visual representation of speech. We also describe a robust and automatic algorithm we have developed to extract FAPs from visual data, which does not require hand labeling or extensive training procedures. Principal component analysis (PCA) was performed on the FAPs in order to decrease the dimensionality of the visual feature vectors, and the derived projection weights were used as visual features in the audio-visual automatic speech recognition (ASR) experiments. Both single-stream and multistream hidden Markov models (HMMs) were used to model the ASR system, integrate audio and visual information, and perform relatively large vocabulary (approximately 1000 words) speech recognition experiments. The experiments performed use clean audio data and audio data corrupted by stationary white Gaussian noise at various SNRs. The proposed system reduces the word error rate (WER) by 20% to 23% relative to audio-only speech recognition WERs at various SNRs (0–30 dB) with additive white Gaussian noise, and by 19% relative to audio-only speech recognition WER under clean audio conditions.
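The "relative" WER reductions quoted in this abstract are ratios against the audio-only baseline, not differences in percentage points. A minimal sketch of that calculation follows; the function name and the example figures are hypothetical, since the abstract does not report the underlying absolute WERs.

```python
def relative_wer_reduction(wer_audio_only, wer_audio_visual):
    """Relative WER reduction of an audio-visual system over an audio-only baseline.

    Both arguments are word error rates on the same scale (fractions or percent).
    """
    if wer_audio_only <= 0:
        raise ValueError("baseline WER must be positive")
    return (wer_audio_only - wer_audio_visual) / wer_audio_only


# Hypothetical figures: a baseline WER of 50% reduced to 40% is a 20% relative
# reduction, although the absolute difference is only 10 percentage points.
example = relative_wer_reduction(0.50, 0.40)
```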
Knowledge-assisted cross-media analysis of audio-visual content in the news domain

NARCIS (Netherlands)

Mezaris, Vasileios; Gidaros, Spyros; Papadopoulos, Georgios Th.; Kasper, Walter; Ordelman, Roeland J.F.; de Jong, Franciska M.G.; Kompatsiaris, Ioannis

In this paper, a complete architecture for knowledge-assisted cross-media analysis of News-related multimedia content is presented, along with its constituent components. The proposed analysis architecture employs state-of-the-art methods for the analysis of each individual modality (visual, audio,
Audio-Visual Temporal Recalibration Can be Constrained by Content Cues Regardless of Spatial Overlap

OpenAIRE

Roseboom, Warrick; Kawabe, Takahiro; Nishida, Shin'ya

2013-01-01

It has now been well established that the point of subjective synchrony for audio and visual events can be shifted following exposure to asynchronous audio-visual presentations, an effect often referred to as temporal recalibration. Recently it was further demonstrated that it is possible to concurrently maintain two such recalibrated, and opposing, estimates of audio-visual temporal synchrony. However, it remains unclear precisely what defines a given audio-visual pair such that it is possib...
Semantic Indexing of Multimedia Content Using Visual, Audio, and Text Cues

Directory of Open Access Journals (Sweden)

W. H. Adams

2003-02-01

Full Text Available We present a learning-based approach to the semantic indexing of multimedia content using cues derived from audio, visual, and text features. We approach the problem by developing a set of statistical models for a predefined lexicon. Novel concepts are then mapped in terms of the concepts in the lexicon. To achieve robust detection of concepts, we exploit features from multiple modalities, namely, audio, video, and text. Concept representations are modeled using Gaussian mixture models (GMM), hidden Markov models (HMM), and support vector machines (SVM). Models such as Bayesian networks and SVMs are used in a late-fusion approach to model concepts that are not explicitly modeled in terms of features. Our experiments indicate promise in the proposed classification and fusion methodologies: our proposed fusion scheme achieves more than 10% relative improvement over the best unimodal concept detector.
Voice over: Audio-visual congruency and content recall in the gallery setting.

Science.gov (United States)

Fairhurst, Merle T; Scott, Minnie; Deroy, Ophelia

2017-01-01

Experimental research has shown that pairs of stimuli which are congruent and assumed to 'go together' are recalled more effectively than an item presented in isolation. Will this multisensory memory benefit occur when stimuli are richer and longer, in an ecological setting? In the present study, we focused on an everyday situation of audio-visual learning and manipulated the relationship between audio guide tracks and viewed portraits in the galleries of the Tate Britain. By varying the gender and narrative style of the voice-over, we examined how the perceived congruency and assumed unity of the audio guide track with painted portraits affected subsequent recall. We show that tracks perceived as best matching the viewed portraits led to greater recall of both sensory and linguistic content. We provide the first evidence that manipulating crossmodal congruence and unity assumptions can effectively impact memory in a multisensory ecological setting, even in the absence of precise temporal alignment between sensory cues.
A System for the Semantic Multimodal Analysis of News Audio-Visual Content

Directory of Open Access Journals (Sweden)

Michael G. Strintzis

2010-01-01

Full Text Available News-related content is nowadays among the most popular types of content for users in everyday applications. Although the generation and distribution of news content has become commonplace, due to the availability of inexpensive media capturing devices and the development of media sharing services targeting both professional and user-generated news content, the automatic analysis and annotation that is required for supporting intelligent search and delivery of this content remains an open issue. In this paper, a complete architecture for knowledge-assisted multimodal analysis of news-related multimedia content is presented, along with its constituent components. The proposed analysis architecture employs state-of-the-art methods for the analysis of each individual modality (visual, audio, text separately and proposes a novel fusion technique based on the particular characteristics of news-related content for the combination of the individual modality analysis results. Experimental results on news broadcast video illustrate the usefulness of the proposed techniques in the automatic generation of semantic annotations.
Audio/visual analysis for high-speed TV advertisement detection from MPEG bitstream

OpenAIRE

Sadlier, David A.

2002-01-01

Advertisement breaks during or between television programmes are typically flagged by series of black-and-silent video frames, which recurrently occur in order to audio-visually separate individual advertisement spots from one another. It is the regular prevalence of these flags that enables automatic differentiation between what is programme content and what is advertisement break. Detection of these audio-visual depressions within broadcast television content provides a basis on which advertise...
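The flagging idea in this abstract, runs of frames that are simultaneously dark and silent, can be sketched as follows. This is an illustrative simplification, not the author's method: it assumes per-frame mean luminance and audio energy have already been obtained, whereas the title indicates the original works directly on the MPEG bitstream. The thresholds, the minimum run length and the function name `find_flag_runs` are hypothetical.

```python
def find_flag_runs(luma, audio_energy, luma_max=16.0, energy_max=0.01, min_len=2):
    """Return [start, end) index pairs of consecutive black-and-silent frames.

    luma         -- per-frame mean luminance (assumed scale 0..255)
    audio_energy -- per-frame audio energy (assumed normalised to 0..1)
    A frame is flagged only when BOTH values fall at or below their thresholds.
    """
    runs = []
    start = None
    n = min(len(luma), len(audio_energy))
    for i in range(n):
        flagged = luma[i] <= luma_max and audio_energy[i] <= energy_max
        if flagged and start is None:
            start = i                      # a new run begins
        elif not flagged and start is not None:
            if i - start >= min_len:       # keep only sufficiently long runs
                runs.append((start, i))
            start = None
    if start is not None and n - start >= min_len:
        runs.append((start, n))            # run extends to the last frame
    return runs
```

Requiring both conditions avoids flagging dark scenes that still carry sound, and the minimum run length filters out isolated single-frame dips.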
Audio-visual biofeedback for respiratory-gated radiotherapy: Impact of audio instruction and audio-visual biofeedback on respiratory-gated radiotherapy

International Nuclear Information System (INIS)

George, Rohini; Chung, Theodore D.; Vedam, Sastry S.; Ramakrishnan, Viswanathan; Mohan, Radhe; Weiss, Elisabeth; Keall, Paul J.

2006-01-01

Purpose: Respiratory gating is a commercially available technology for reducing the deleterious effects of motion during imaging and treatment. The efficacy of gating is dependent on the reproducibility within and between respiratory cycles during imaging and treatment. The aim of this study was to determine whether audio-visual biofeedback can improve respiratory reproducibility by decreasing residual motion and therefore increasing the accuracy of gated radiotherapy. Methods and Materials: A total of 331 respiratory traces were collected from 24 lung cancer patients. The protocol consisted of five breathing training sessions spaced about a week apart. Within each session the patients initially breathed without any instruction (free breathing), with audio instructions and with audio-visual biofeedback. Residual motion was quantified by the standard deviation of the respiratory signal within the gating window. Results: Audio-visual biofeedback significantly reduced residual motion compared with free breathing and audio instruction. Displacement-based gating has lower residual motion than phase-based gating. Little reduction in residual motion was found for duty cycles less than 30%; for duty cycles above 50% there was a sharp increase in residual motion. Conclusions: The efficiency and reproducibility of gating can be improved by: incorporating audio-visual biofeedback, using a 30-50% duty cycle, gating during exhalation, and using displacement-based gating
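The residual-motion metric in this abstract (standard deviation of the respiratory signal within the gating window) can be illustrated with a short sketch. It assumes a simplified displacement-based gate that keeps the fraction of samples with the lowest displacement, and that low displacement corresponds to exhalation; both are assumptions for illustration, as is the synthetic sinusoidal trace. The function name `residual_motion` is hypothetical.

```python
import math
import statistics


def residual_motion(signal, duty_cycle):
    """Standard deviation of the samples inside a displacement-based gate.

    The gate keeps the `duty_cycle` fraction of samples with the lowest
    displacement (assumed here to be the exhale end of the trace).
    """
    if not 0.0 < duty_cycle <= 1.0:
        raise ValueError("duty_cycle must be in (0, 1]")
    ordered = sorted(signal)
    keep = max(2, round(duty_cycle * len(ordered)))
    gated = ordered[:keep]
    return statistics.pstdev(gated)


# Synthetic, perfectly regular breathing trace (hypothetical data):
# ten cycles of 100 samples each, with minimum displacement at exhale.
trace = [-math.cos(2 * math.pi * t / 100) for t in range(1000)]
rm_30 = residual_motion(trace, 0.30)
rm_50 = residual_motion(trace, 0.50)
```

On such a trace the residual motion grows as the duty cycle widens, which is the trade-off between treatment efficiency and accuracy that the study quantifies on patient data.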
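The effect of guidance and counseling information services assisted by audio-visual media on student empathy (Pengaruh layanan informasi bimbingan konseling berbantuan media audio visual terhadap empati siswa)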

Directory of Open Access Journals (Sweden)

Rita Kumalasari

2017-05-01

The research produced an effective and practical audio-visual media counseling technique for increasing student empathy, comprising: rational design, key concepts, understanding, purpose, content models, the expected role and qualifications of the tutor (counselor), procedures or steps for implementing the audio-visual media, evaluation, follow-up, and a support system. The technique proved effective in improving student behavior. Students' empathy behavior rose by 28.9 percentage points, from 45.08% to 73.98%. This increase occurred in all aspects of empathy. Keywords: Effective, Audio visual, Empathy
Summarizing Audiovisual Contents of a Video Program

Science.gov (United States)

Gong, Yihong

2003-12-01

In this paper, we focus on video programs that are intended to disseminate information and knowledge, such as news, documentaries, seminars, etc., and present an audiovisual summarization system that summarizes the audio and visual contents of the given video separately and then integrates the two summaries with a partial alignment. The audio summary is created by selecting spoken sentences that best present the main content of the audio speech, while the visual summary is created by eliminating duplicates/redundancies and preserving visually rich contents in the image stream. The alignment operation aims to synchronize each spoken sentence in the audio summary with its corresponding speaker's face and to preserve the rich content in the visual summary. A bipartite graph-based audiovisual alignment algorithm is developed to efficiently find the best alignment solution that satisfies these alignment requirements. With the proposed system, we strive to produce a video summary that: (1) provides a natural visual and audio content overview, and (2) maximizes the coverage for both audio and visual contents of the original video without having to sacrifice either of them.
Audio-Visual Tibetan Speech Recognition Based on a Deep Dynamic Bayesian Network for Natural Human Robot Interaction

Directory of Open Access Journals (Sweden)

Yue Zhao

2012-12-01

Full Text Available Audio-visual speech recognition is a natural and robust approach to improving human-robot interaction in noisy environments. Although multi-stream Dynamic Bayesian Networks and coupled HMMs are widely used for audio-visual speech recognition, they fail to learn the shared features between modalities and ignore the dependency of features among the frames within each discrete state. In this paper, we propose a Deep Dynamic Bayesian Network (DDBN) to perform unsupervised extraction of spatial-temporal multimodal features from Tibetan audio-visual speech data and build an accurate audio-visual speech recognition model without a frame-independency assumption. Experimental results on Tibetan speech data from some real-world environments showed that the proposed DDBN outperforms the state-of-the-art methods in word recognition accuracy.
pyAudioAnalysis: An Open-Source Python Library for Audio Signal Analysis.

Science.gov (United States)

Giannakopoulos, Theodoros

2015-01-01

Audio information plays a rather important role in the increasing digital content that is available today, resulting in a need for methodologies that automatically analyze such content: audio event recognition for home automation and surveillance systems, speech recognition, music information retrieval, multimodal analysis (e.g. audio-visual analysis of online videos for content-based recommendation), etc. This paper presents pyAudioAnalysis, an open-source Python library that provides a wide range of audio analysis procedures including: feature extraction, classification of audio signals, supervised and unsupervised segmentation and content visualization. pyAudioAnalysis is licensed under the Apache License and is available at GitHub (https://github.com/tyiannak/pyAudioAnalysis/). Here we present the theoretical background behind the wide range of the implemented methodologies, along with evaluation metrics for some of the methods. pyAudioAnalysis has already been used in several audio analysis research applications: smart-home functionalities through audio event detection, speech emotion recognition, depression classification based on audio-visual features, music segmentation, multimodal content-based movie recommendation and health applications (e.g. monitoring eating habits). The feedback provided from all these particular audio applications has led to practical enhancement of the library.
Market potential for interactive audio-visual media

NARCIS (Netherlands)

Leurdijk, A.; Limonard, S.

2005-01-01

NM2 (New Media for a New Millennium) develops tools for interactive, personalised and non-linear audio-visual content that will be tested in seven pilot productions. This paper looks at the market potential for these productions from a technological, a business and a users' perspective. It shows
Computationally Efficient Clustering of Audio-Visual Meeting Data

Science.gov (United States)

Hung, Hayley; Friedland, Gerald; Yeo, Chuohao

This chapter presents novel computationally efficient algorithms to extract semantically meaningful acoustic and visual events related to each of the participants in a group discussion using the example of business meeting recordings. The recording setup involves relatively few audio-visual sensors, comprising a limited number of cameras and microphones. We first demonstrate computationally efficient algorithms that can identify who spoke and when, a problem in speech processing known as speaker diarization. We also extract visual activity features efficiently from MPEG4 video by taking advantage of the processing that was already done for video compression. Then, we present a method of associating the audio-visual data together so that the content of each participant can be managed individually. The methods presented in this article can be used as a principal component that enables many higher-level semantic analysis tasks needed in search, retrieval, and navigation.

Consequence of audio visual collection in school libraries

OpenAIRE

Kuri, Ramesh

2016-01-01

An audio-visual collection in a library plays an important role in teaching and learning. The importance of audio-visual (AV) technology in education should not be underestimated. If the audio-visual collection in a library is carefully planned and designed, it can provide a rich learning environment. In this article, the author discusses the consequences of audio-visual collections in libraries, especially for students using school libraries.
[Intermodal timing cues for audio-visual speech recognition].

Science.gov (United States)

Hashimoto, Masahiro; Kumashiro, Masaharu

2004-06-01

The purpose of this study was to investigate the limitations of lip-reading advantages for Japanese young adults by desynchronizing visual and auditory information in speech. In the experiment, audio-visual speech stimuli were presented under six test conditions: audio-alone, and audio-visually with either 0, 60, 120, 240 or 480 ms of audio delay. The stimuli were video recordings of the face of a Japanese woman speaking long and short Japanese sentences. The intelligibility of the audio-visual stimuli was measured as a function of audio delay in sixteen untrained young subjects. Speech intelligibility under the audio-delay condition of less than 120 ms was significantly better than that under the audio-alone condition. On the other hand, the delay of 120 ms corresponded to the mean mora duration measured for the audio stimuli. The results implied that audio delays of up to 120 ms would not disrupt the lip-reading advantage, because visual and auditory information in speech seemed to be integrated on a syllabic time scale. Potential applications of this research include noisy workplaces in which a worker must extract relevant speech from all the other competing noises.
Fusion for Audio-Visual Laughter Detection

NARCIS (Netherlands)

Reuderink, B.

2007-01-01

Laughter is a highly variable signal, and can express a spectrum of emotions. This makes the automatic detection of laughter a challenging but interesting task. We perform automatic laughter detection using audio-visual data from the AMI Meeting Corpus. Audio-visual laughter detection is performed
Audio-Visual Classification of Sports Types

DEFF Research Database (Denmark)

Gade, Rikke; Abou-Zleikha, Mohamed; Christensen, Mads Græsbøll

2015-01-01

In this work we propose a method for classification of sports types from combined audio and visual features extracted from thermal video. From the audio, Mel Frequency Cepstral Coefficients (MFCC) are extracted, and PCA is applied to reduce the feature space to 10 dimensions. From the visual modali...
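The dimensionality-reduction step mentioned in this abstract (PCA down to 10 dimensions) can be sketched with NumPy. The sketch assumes the MFCC features have already been computed and arranged as one row per audio frame; the feature extraction itself, the input dimensionality, and the function name `pca_reduce` are assumptions, and this is a generic PCA rather than the authors' pipeline.

```python
import numpy as np


def pca_reduce(features, n_components=10):
    """Project feature vectors onto their first n_components principal axes.

    features -- array-like of shape (n_frames, n_features), e.g. MFCC vectors
    Returns an array of shape (n_frames, n_components).
    """
    X = np.asarray(features, dtype=float)
    centered = X - X.mean(axis=0)          # PCA operates on mean-centred data
    # Rows of vt are the principal axes, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:n_components]
    return centered @ components.T
```

Computing the axes through an SVD of the centred data avoids forming the covariance matrix explicitly and yields components already sorted by explained variance.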
Multimodal indexing of digital audio-visual documents: A case study for cultural heritage data

NARCIS (Netherlands)

Carmichael, J.; Larson, M.; Marlow, J.; Newman, E.; Clough, P.; Oomen, J.; Sav, S.

2008-01-01

This paper describes a multimedia multimodal information access sub-system (MIAS) for digital audio-visual documents, typically presented in streaming media format. The system is designed to provide both professional and general users with entry points into video documents that are relevant to their
Automatic summarization of soccer highlights using audio-visual descriptors.

Science.gov (United States)

Raventós, A; Quijada, R; Torres, Luis; Tarrés, Francesc

2015-01-01

Automatic summarization of sports video content has been an object of great interest for many years. Although semantic description techniques have been proposed, many of the approaches still rely on low-level video descriptors that render quite limited results due to the complexity of the problem and to the low capability of the descriptors to represent semantic content. In this paper, a new approach for automatic highlights summarization of soccer videos using audio-visual descriptors is presented. The approach is based on the segmentation of the video sequence into shots that are further analyzed to determine their relevance and interest. Of special interest in the approach is the use of the audio information, which provides additional robustness to the overall performance of the summarization system. For every video shot, a set of low- and mid-level audio-visual descriptors is computed and later combined in order to obtain different relevance measures based on empirical knowledge rules. The final summary is generated by selecting those shots with the highest interest according to the specifications of the user and the results of the relevance measures. A variety of results are presented with real soccer video sequences that prove the validity of the approach.
The effect of combined sensory and semantic components on audio-visual speech perception in older adults

Directory of Open Access Journals (Sweden)

Corrina Maguinness

2011-12-01

Full Text Available Previous studies have found that perception in older people benefits from multisensory over uni-sensory information. As normal speech recognition is affected by both the auditory input and the visual lip-movements of the speaker, we investigated the efficiency of audio and visual integration in an older population by manipulating the relative reliability of the auditory and visual information in speech. We also investigated the role of the semantic context of the sentence to assess whether audio-visual integration is affected by top-down semantic processing. We presented participants with audio-visual sentences in which the visual component was either blurred or not blurred. We found that there was a greater cost in recall performance for semantically meaningless speech in the audio-visual blur condition compared to the audio-visual no-blur condition, and this effect was specific to the older group. Our findings have implications for understanding how aging affects efficient multisensory integration for the perception of speech and suggest that multisensory inputs may benefit speech perception in older adults when the semantic content of the speech is unpredictable.
Voice activity detection using audio-visual information

DEFF Research Database (Denmark)

Petsatodis, Theodore; Pnevmatikakis, Aristodemos; Boukis, Christos

2009-01-01

An audio-visual voice activity detector that uses sensors positioned distantly from the speaker is presented. Its constituting unimodal detectors are based on the modeling of the temporal variation of audio and visual features using Hidden Markov Models; their outcomes are fused using a post...
Audio scene segmentation for video with generic content

Science.gov (United States)

Niu, Feng; Goela, Naveen; Divakaran, Ajay; Abdel-Mottaleb, Mohamed

2008-01-01

In this paper, we present a content-adaptive audio texture based method to segment video into audio scenes. The audio scene is modeled as a semantically consistent chunk of audio data. Our algorithm is based on "semantic audio texture analysis." At first, we train GMM models for basic audio classes such as speech, music, etc. Then we define the semantic audio texture based on those classes. We study and present two types of scene changes, those corresponding to an overall audio texture change and those corresponding to a special "transition marker" used by the content creator, such as a short stretch of music in a sitcom or silence in dramatic content. Unlike prior work using genre specific heuristics, such as some methods presented for detecting commercials, we adaptively find out if such special transition markers are being used and if so, which of the base classes are being used as markers without any prior knowledge about the content. Our experimental results show that our proposed audio scene segmentation works well across a wide variety of broadcast content genres.
Relationship between age at menarche and exposure to sexual content in audio-visual media and other factors in Islamic junior high school girls

Directory of Open Access Journals (Sweden)

Tity Wulandari

2018-01-01

Full Text Available Background In recent decades, girls have experienced menarche at earlier ages, which may have negative effects on health. Exposure to audio-visual media and other factors may influence the age at menarche, although past studies have produced inconsistent results. Objective To assess for relationships between the age at menarche and audio-visual media exposure, socio-economic status, nutritional status, physical activity, and psychosocial dysfunction in adolescent girls. Methods This cross-sectional study was conducted from August to October 2015 in students from two integrated Islamic junior high schools in Medan, North Sumatera. There were 216 students who met the inclusion criteria: aged 10-16 years and experienced menarche. They were asked to fill out questionnaires that had been previously validated, regarding their history of exposure to audio-visual media, physical activity, and psychosocial dysfunction. The data were analyzed by Chi-square and Fisher’s exact tests in order to assess for relationships between audio-visual media exposure and other potential factors with the age at menarche. Results Of 261 female students at the two schools, 216 had undergone menarche, with a mean age at menarche of 11.6 (SD 1.13) years. There was no significant relationship between age at menarche and audio-visual media exposure (P=0.68). Also, there were no significant relationships between factors such as socio-economic and psychosocial status with age at menarche (P=0.64 and P=0.28, respectively). However, there were significant relationships between earlier age at menarche and overweight/obese nutritional status (P=0.02) as well as low physical activity (P=0.01). Multivariate logistic regression analysis showed that low physical activity had the strongest influence on early menarche (RP=2.40; 95%CI 0.92 to 6.24). Conclusion Age at menarche is not significantly associated with sexual content of audio-visual media exposure. However, there were significant
Audio-Visual Processing in Meetings: Seven Questions and Current AMI Answers

NARCIS (Netherlands)

Al-Hamas, Marc; Hain, Thomas; Cernocky, Jan; Schreiber, Sascha; Poel, Mannes; Rienks, R.J.

2007-01-01

The project Augmented Multi-party Interaction (AMI) is concerned with the development of meeting browsers and remote meeting assistants for instrumented meeting rooms – and the required component technologies R&D themes: group dynamics, audio, visual, and multimodal processing, content abstraction,
Visual communication and the content and style of conversation.

Science.gov (United States)

Rutter, D R; Stephenson, G M; Dewey, M E

1981-02-01

Previous research suggests that visual communication plays a number of important roles in social interaction. In particular, it appears to influence the content of what people say in discussions, the style of their speech, and the outcomes they reach. However, the findings are based exclusively on comparisons between face-to-face conversations and audio conversations, in which subjects sit in separate rooms and speak over a microphone-headphone intercom which precludes visual communication. Interpretation is difficult, because visual communication is confounded with physical presence, which itself makes available certain cues denied to audio subjects. The purpose of this paper is to report two experiments in which the variables were separated and content and style were re-examined. The first made use of blind subjects, and again compared the face-to-face and audio conditions. The second returned to sighted subjects, and examined four experimental conditions: face-to-face; audio; a curtain condition in which subjects sat in the same room but without visual communication; and a video condition in which they sat in separate rooms and communicated over a television link. Neither visual communication nor physical presence proved to be the critical variable. Instead, the two sources of cues combined, such that content and style were influenced by the aggregate of available cues. The more cueless the settings, the more task-oriented, depersonalized and unspontaneous the conversation. The findings also suggested that the primary effect of cuelessness is to influence verbal content, and that its influence on both style and outcome occurs indirectly, through the mediation of content.
The Fungible Audio-Visual Mapping and its Experience

Directory of Open Access Journals (Sweden)

Adriana Sa

2014-12-01

Full Text Available This article draws a perceptual approach to audio-visual mapping. Clearly perceivable cause and effect relationships can be problematic if one desires the audience to experience the music. Indeed perception would bias those sonic qualities that fit previous concepts of causation, subordinating other sonic qualities, which may form the relations between the sounds themselves. The question is, how can an audio-visual mapping produce a sense of causation, and simultaneously confound the actual cause-effect relationships? We call this a fungible audio-visual mapping. Our aim here is to glean its constitution and aspect. We will report a study, which draws upon methods from experimental psychology to inform audio-visual instrument design and composition. The participants are shown several audio-visual mapping prototypes, after which we pose quantitative and qualitative questions regarding their sense of causation, and their sense of understanding the cause-effect relationships. The study shows that a fungible mapping requires both synchronized and seemingly non-related components – sufficient complexity to be confusing. As the specific cause-effect concepts remain inconclusive, the sense of causation embraces the whole.
Audio-visual onset differences are used to determine syllable identity for ambiguous audio-visual stimulus pairs.

Science.gov (United States)

Ten Oever, Sanne; Sack, Alexander T; Wheat, Katherine L; Bien, Nina; van Atteveldt, Nienke

2013-01-01

Content and temporal cues have been shown to interact during audio-visual (AV) speech identification. Typically, the most reliable unimodal cue is used more strongly to identify specific speech features; however, visual cues are only used if the AV stimuli are presented within a certain temporal window of integration (TWI). This suggests that temporal cues denote whether unimodal stimuli belong together, that is, whether they should be integrated. It is not known whether temporal cues also provide information about the identity of a syllable. Since spoken syllables have naturally varying AV onset asynchronies, we hypothesize that for suboptimal AV cues presented within the TWI, information about the natural AV onset differences can aid in speech identification. To test this, we presented low-intensity auditory syllables concurrently with visual speech signals, and varied the stimulus onset asynchronies (SOA) of the AV pair, while participants were instructed to identify the auditory syllables. We revealed that specific speech features (e.g., voicing) were identified by relying primarily on one modality (e.g., auditory). Additionally, we showed a wide window in which visual information influenced auditory perception, that seemed even wider for congruent stimulus pairs. Finally, we found a specific response pattern across the SOA range for syllables that were not reliably identified by the unimodal cues, which we explained as the result of the use of natural onset differences between AV speech signals. This indicates that temporal cues not only provide information about the temporal integration of AV stimuli, but additionally convey information about the identity of AV pairs. These results provide a detailed behavioral basis for further neuro-imaging and stimulation studies to unravel the neurofunctional mechanisms of the audio-visual-temporal interplay within speech perception.
Decision-level fusion for audio-visual laughter detection

NARCIS (Netherlands)

Reuderink, B.; Poel, M.; Truong, K.; Poppe, R.; Pantic, M.

2008-01-01

Laughter is a highly variable signal, which can be caused by a spectrum of emotions. This makes the automatic detection of laughter a challenging, but interesting task. We perform automatic laughter detection using audio-visual data from the AMI Meeting Corpus. Audio-visual laughter detection is
Decision-Level Fusion for Audio-Visual Laughter Detection

NARCIS (Netherlands)

Reuderink, B.; Poel, Mannes; Truong, Khiet Phuong; Poppe, Ronald Walter; Pantic, Maja; Popescu-Belis, Andrei; Stiefelhagen, Rainer

Laughter is a highly variable signal, which can be caused by a spectrum of emotions. This makes the automatic detection of laughter a challenging, but interesting task. We perform automatic laughter detection using audio-visual data from the AMI Meeting Corpus. Audio-visual laughter detection is
Stream specificity and asymmetries in feature binding and content-addressable access in visual encoding and memory.

Science.gov (United States)

Huynh, Duong L; Tripathy, Srimant P; Bedell, Harold E; Ögmen, Haluk

2015-01-01

Human memory is content addressable, i.e., contents of the memory can be accessed using partial information about the bound features of a stored item. In this study, we used a cross-feature cuing technique to examine how the human visual system encodes, binds, and retains information about multiple stimulus features within a set of moving objects. We sought to characterize the roles of three different features (position, color, and direction of motion, the latter two of which are processed preferentially within the ventral and dorsal visual streams, respectively) in the construction and maintenance of object representations. We investigated the extent to which these features are bound together across the following processing stages: during stimulus encoding, sensory (iconic) memory, and visual short-term memory. Whereas all features examined here can serve as cues for addressing content, their effectiveness shows asymmetries and varies according to cue-report pairings and the stage of information processing and storage. Position-based indexing theories predict that position should be more effective as a cue compared to other features. While we found a privileged role for position as a cue at the stimulus-encoding stage, position was not the privileged cue at the sensory and visual short-term memory stages. Instead, the pattern that emerged from our findings is one that mirrors the parallel processing streams in the visual system. This stream-specific binding and cuing effectiveness manifests itself in all three stages of information processing examined here. Finally, we find that the Leaky Flask model proposed in our previous study is applicable to all three features.
Audio stream classification for multimedia database search

Science.gov (United States)

Artese, M.; Bianco, S.; Gagliardi, I.; Gasparini, F.

2013-03-01

Search and retrieval of huge archives of multimedia data is a challenging task. A classification step is often used to reduce the number of entries on which to perform the subsequent search. In particular, when new entries are continuously added to the database, a fast classification based on simple threshold evaluation is desirable. In this work we present a CART-based (Classification And Regression Tree [1]) classification framework for audio streams belonging to multimedia databases. The database considered is the Archive of Ethnography and Social History (AESS) [2], which is mainly composed of popular songs and other audio records describing the popular traditions handed down generation by generation, such as traditional fairs and customs. The peculiarities of this database are that it is continuously updated; the audio recordings are acquired in unconstrained environments; and it is difficult for a non-expert human user to create the ground truth labels. In our experiments, half of all the available audio files have been randomly extracted and used as the training set. The remaining ones have been used as the test set. The classifier has been trained to distinguish among three different classes: speech, music, and song. All the audio files in the dataset have been previously manually labeled by domain experts into the three classes defined above.
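The threshold evaluation that a CART-style tree performs at each node can be sketched as below. This is only an illustration under stated assumptions, not the framework from the abstract: the feature values, the two-class labels, and the helper names `gini` and `best_threshold` are all hypothetical, and a single feature is split once rather than growing a full tree.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_threshold(values, labels):
    """Return (threshold, weighted_gini) for the best split `value <= threshold`.

    The threshold is None when no split lowers the impurity of the node.
    """
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = (None, gini(labels))
    for i in range(1, n):
        lo, hi = pairs[i - 1][0], pairs[i][0]
        if lo == hi:
            continue  # cannot split between identical feature values
        left = [lab for _, lab in pairs[:i]]
        right = [lab for _, lab in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = ((lo + hi) / 2.0, score)
    return best

# Made-up values of one hypothetical audio feature, with made-up labels.
feature = [0.10, 0.15, 0.20, 0.60, 0.70, 0.80]
classes = ["speech", "speech", "speech", "music", "music", "music"]
threshold, impurity = best_threshold(feature, classes)
```

Once such thresholds are learned, classifying a new entry only requires a handful of comparisons, which is why this kind of model suits a database that keeps growing.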
Rehabilitation of balance-impaired stroke patients through audio-visual biofeedback

DEFF Research Database (Denmark)

Gheorghe, Cristina; Nissen, Thomas; Juul Rosengreen Christensen, Daniel

2015-01-01

This study explored how audio-visual biofeedback influences physical balance of seven balance-impaired stroke patients, between 33–70 years-of-age. The setup included a bespoke balance board and a music rhythm game. The procedure was designed as follows: (1) a control group who performed a balance...... training exercise without any technological input, (2) a visual biofeedback group, performing via visual input, and (3) an audio-visual biofeedback group, performing via audio and visual input. Results retrieved from comparisons between the data sets (2) and (3) suggested superior postural stability...
Classifying laughter and speech using audio-visual feature prediction

NARCIS (Netherlands)

Petridis, Stavros; Asghar, Ali; Pantic, Maja

2010-01-01

In this study, a system that discriminates laughter from speech by modelling the relationship between audio and visual features is presented. The underlying assumption is that this relationship is different between speech and laughter. Neural networks are trained which learn the audio-to-visual and

Advances in audio source separation and multisource audio content retrieval

Science.gov (United States)

Vincent, Emmanuel

2012-06-01

Audio source separation aims to extract the signals of individual sound sources from a given recording. In this paper, we review three recent advances which improve the robustness of source separation in real-world challenging scenarios and enable its use for multisource content retrieval tasks, such as automatic speech recognition (ASR) or acoustic event detection (AED) in noisy environments. We present a Flexible Audio Source Separation Toolkit (FASST) and discuss its advantages compared to earlier approaches such as independent component analysis (ICA) and sparse component analysis (SCA). We explain how cues as diverse as harmonicity, spectral envelope, temporal fine structure or spatial location can be jointly exploited by this toolkit. We subsequently present the uncertainty decoding (UD) framework for the integration of audio source separation and audio content retrieval. We show how the uncertainty about the separated source signals can be accurately estimated and propagated to the features. Finally, we explain how this uncertainty can be efficiently exploited by a classifier, both at the training and the decoding stage. We illustrate the resulting performance improvements in terms of speech separation quality and speaker recognition accuracy.
Streaming Audio and Video: New Challenges and Opportunities for Museums.

Science.gov (United States)

Spadaccini, Jim

Streaming audio and video present new challenges and opportunities for museums. Streaming media is easier to author and deliver to Internet audiences than ever before; digital video editing is commonplace now that the tools--computers, digital video cameras, and hard drives--are so affordable; the cost of serving video files across the Internet…
Comparative evaluation of audio and audio-tactile methods to improve oral hygiene status of visually impaired school children

OpenAIRE

R Krishnakumar; Swarna Swathi Silla; Sugumaran K Durai; Mohan Govindarajan; Syed Shaheed Ahamed; Logeshwari Mathivanan

2016-01-01

Background: Visually impaired children are unable to maintain good oral hygiene, as their tactile abilities are often underdeveloped owing to their visual disturbances. Conventional brushing techniques are often poorly comprehended by these children and hence, it was decided to evaluate the effectiveness of audio and audio-tactile methods in improving the oral hygiene of these children. Objective: To evaluate and compare the effectiveness of audio and audio-tactile methods in improving oral h...
Documentary management of the sport audio-visual information in the generalist televisions

OpenAIRE

Jorge Caldera Serrano; Felipe Alonso

2007-01-01

The management of sport audio-visual documentation in the information systems of state, zonal and local television chains is analyzed. To this end, the documentary chain that sport audio-visual information passes through is traced, analyzing each of its parameters and offering a series of recommendations and norms for the preparation of the sport audio-visual record. Evidently the audio-visual sport documentation difference i...
Haptic and Audio-visual Stimuli: Enhancing Experiences and Interaction

NARCIS (Netherlands)

Nijholt, Antinus; Dijk, Esko O.; Lemmens, Paul M.C.; Luitjens, S.B.

2010-01-01

The intention of the symposium on Haptic and Audio-visual stimuli at the EuroHaptics 2010 conference is to deepen the understanding of the effect of combined Haptic and Audio-visual stimuli. The knowledge gained will be used to enhance experiences and interactions in daily life. To this end, a
Audio-visual synchrony and feature-selective attention co-amplify early visual processing.

Science.gov (United States)

Keitel, Christian; Müller, Matthias M

2016-05-01

Our brain relies on neural mechanisms of selective attention and converging sensory processing to efficiently cope with rich and unceasing multisensory inputs. One prominent assumption holds that audio-visual synchrony can act as a strong attractor for spatial attention. Here, we tested for a similar effect of audio-visual synchrony on feature-selective attention. We presented two superimposed Gabor patches that differed in colour and orientation. On each trial, participants were cued to selectively attend to one of the two patches. Over time, spatial frequencies of both patches varied sinusoidally at distinct rates (3.14 and 3.63 Hz), giving rise to pulse-like percepts. A simultaneously presented pure tone carried a frequency modulation at the pulse rate of one of the two visual stimuli to introduce audio-visual synchrony. Pulsed stimulation elicited distinct time-locked oscillatory electrophysiological brain responses. These steady-state responses were quantified in the spectral domain to examine individual stimulus processing under conditions of synchronous versus asynchronous tone presentation and when respective stimuli were attended versus unattended. We found that both, attending to the colour of a stimulus and its synchrony with the tone, enhanced its processing. Moreover, both gain effects combined linearly for attended in-sync stimuli. Our results suggest that audio-visual synchrony can attract attention to specific stimulus features when stimuli overlap in space.
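Quantifying a steady-state response in the spectral domain amounts to reading off the signal amplitude at each stimulation rate. The sketch below illustrates that step on a synthetic signal; it is not the authors' analysis pipeline. The sampling rate, duration, and component amplitudes are assumptions, and `amplitude_at` is a hypothetical helper name. Only the two rates (3.14 and 3.63 Hz) come from the abstract.

```python
import math

def amplitude_at(signal, fs, freq):
    """Amplitude of the `freq` Hz component of `signal` sampled at `fs` Hz.

    The signal is projected onto a cosine and a sine at that frequency,
    which is a single-bin discrete Fourier transform.
    """
    n = len(signal)
    re = sum(x * math.cos(2 * math.pi * freq * k / fs) for k, x in enumerate(signal))
    im = sum(x * math.sin(2 * math.pi * freq * k / fs) for k, x in enumerate(signal))
    return 2.0 * math.hypot(re, im) / n

# Synthetic stand-in for a recording: two tagged responses with assumed amplitudes.
fs = 100.0          # assumed sampling rate in Hz
n = 10000           # 100 s, so both rates complete a whole number of cycles
signal = [
    1.0 * math.sin(2 * math.pi * 3.14 * k / fs)
    + 0.5 * math.sin(2 * math.pi * 3.63 * k / fs)
    for k in range(n)
]
amp_first = amplitude_at(signal, fs, 3.14)
amp_second = amplitude_at(signal, fs, 3.63)
```

Because the two patches pulse at distinct rates, each one leaves its own peak in the spectrum, so the responses to superimposed stimuli can be measured separately.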
Fusion of audio and visual cues for laughter detection

NARCIS (Netherlands)

Petridis, Stavros; Pantic, Maja

Past research on automatic laughter detection has focused mainly on audio-based detection. Here we present an audio-visual approach to distinguishing laughter from speech and we show that integrating the information from audio and video channels leads to improved performance over single-modal
Audio-Visual Speaker Diarization Based on Spatiotemporal Bayesian Fusion.

Science.gov (United States)

Gebru, Israel D; Ba, Sileye; Li, Xiaofei; Horaud, Radu

2018-05-01

Speaker diarization consists of assigning speech signals to people engaged in a dialogue. An audio-visual spatiotemporal diarization model is proposed. The model is well suited for challenging scenarios that consist of several participants engaged in multi-party interaction while they move around and turn their heads towards the other participants rather than facing the cameras and the microphones. Multiple-person visual tracking is combined with multiple speech-source localization in order to tackle the speech-to-person association problem. The latter is solved within a novel audio-visual fusion method on the following grounds: binaural spectral features are first extracted from a microphone pair, then a supervised audio-visual alignment technique maps these features onto an image, and finally a semi-supervised clustering method assigns binaural spectral features to visible persons. The main advantage of this method over previous work is that it processes in a principled way speech signals uttered simultaneously by multiple persons. The diarization itself is cast into a latent-variable temporal graphical model that infers speaker identities and speech turns, based on the output of an audio-visual association process, executed at each time slice, and on the dynamics of the diarization variable itself. The proposed formulation yields an efficient exact inference procedure. A novel dataset, that contains audio-visual training data as well as a number of scenarios involving several participants engaged in formal and informal dialogue, is introduced. The proposed method is thoroughly tested and benchmarked with respect to several state-of-the-art diarization algorithms.
Optimized Audio Classification and Segmentation Algorithm by Using Ensemble Methods

Directory of Open Access Journals (Sweden)

Saadia Zahid

2015-01-01

Full Text Available Audio segmentation is a basis for multimedia content analysis, which is the most important and widely used application nowadays. An optimized audio classification and segmentation algorithm is presented in this paper that segments a superimposed audio stream on the basis of its content into four main audio types: pure-speech, music, environment sound, and silence. An algorithm is proposed that preserves important audio content and reduces the misclassification rate without using a large amount of training data, handles noise, and is suitable for real-time applications. Noise in an audio stream is segmented out as environment sound. A hybrid classification approach is used: bagged support vector machines (SVMs) with artificial neural networks (ANNs). The audio stream is first classified into speech and nonspeech segments by using bagged support vector machines; nonspeech segments are further classified into music and environment sound by using artificial neural networks; and lastly, speech segments are classified into silence and pure-speech segments by a rule-based classifier. Minimal data is used for training the classifier; ensemble methods are used to minimize the misclassification rate, and approximately 98% accurate segments are obtained. A fast and efficient algorithm is designed that can be used with real-time multimedia applications.
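The three-stage cascade described above can be outlined as follows. This is a structural sketch only: the SVMs and ANNs are replaced by toy threshold rules, and the feature names (`zcr`, `harmonicity`, `energy`), the thresholds, and the function names are all hypothetical stand-ins rather than anything taken from the paper.

```python
from collections import Counter

def bagged_vote(classifiers, features):
    """Majority vote over an ensemble (the aggregation step of bagging)."""
    votes = Counter(clf(features) for clf in classifiers)
    return votes.most_common(1)[0][0]

def classify_segment(features, speech_ensemble, music_vs_env, is_silence):
    """Cascade: speech/nonspeech first, then refine each branch."""
    if bagged_vote(speech_ensemble, features) == "nonspeech":
        return music_vs_env(features)  # "music" or "environment"
    return "silence" if is_silence(features) else "pure-speech"

def threshold_rule(key, threshold, above, below):
    """Build a toy classifier that compares one feature with a threshold."""
    return lambda f: above if f[key] > threshold else below

# Toy stand-ins for the trained models (assumed features and thresholds).
speech_ensemble = [
    threshold_rule("zcr", t, "speech", "nonspeech") for t in (0.2, 0.3, 0.4)
]
music_vs_env = threshold_rule("harmonicity", 0.5, "music", "environment")
is_silence = lambda f: f["energy"] < 0.01

label = classify_segment(
    {"zcr": 0.35, "harmonicity": 0.1, "energy": 0.5},
    speech_ensemble, music_vs_env, is_silence,
)
```

Splitting the decision into stages lets each classifier solve a simpler two-class problem, and the ensemble vote at the first stage limits how often an error propagates down the cascade.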
News video story segmentation method using fusion of audio-visual features

Science.gov (United States)

Wen, Jun; Wu, Ling-da; Zeng, Pu; Luan, Xi-dao; Xie, Yu-xiang

2007-11-01

News story segmentation is an important aspect of news video analysis. This paper presents a method for news video story segmentation. Different from prior works, which are based on visual features, the proposed technique uses audio features as a baseline and fuses visual features with them to refine the results. First, it selects silence clips as audio feature candidate points, and selects shot boundaries and anchor shots as two kinds of visual feature candidate points. Then it takes the audio feature candidates as cues and develops a fusion method that uses the different types of visual candidates to refine the audio candidates and obtain story boundaries. Experimental results show that this method has high efficiency and adaptability to different kinds of news video.
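One simple way to picture the fusion step is to keep an audio candidate only when a visual candidate lies close to it in time. The abstract does not state the actual fusion rule, so the sketch below is an assumed rule for illustration; the function name `fuse_candidates`, the `tolerance` parameter, and all timestamps are hypothetical.

```python
def fuse_candidates(silence_points, shot_boundaries, anchor_shots, tolerance=1.0):
    """Keep audio candidates (in seconds) confirmed by a nearby visual candidate.

    A silence point survives when a shot boundary or an anchor shot lies
    within `tolerance` seconds of it.
    """
    visual = sorted(shot_boundaries + anchor_shots)
    boundaries = []
    for t in sorted(silence_points):
        if any(abs(t - v) <= tolerance for v in visual):
            boundaries.append(t)
    return boundaries

# Made-up candidate times in seconds.
silences = [10.0, 42.5, 80.0]
shots = [9.6, 30.0]
anchors = [81.2]
story_boundaries = fuse_candidates(silences, shots, anchors)
```

Using the audio candidates as the baseline keeps the candidate set small, and the visual check removes pauses that occur inside a story.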
Selective attention modulates the direction of audio-visual temporal recalibration.

Science.gov (United States)

Ikumi, Nara; Soto-Faraco, Salvador

2014-01-01

Temporal recalibration of cross-modal synchrony has been proposed as a mechanism to compensate for timing differences between sensory modalities. However, far from the rich complexity of everyday life sensory environments, most studies to date have examined recalibration on isolated cross-modal pairings. Here, we hypothesize that selective attention might provide an effective filter to help resolve which stimuli are selected when multiple events compete for recalibration. We addressed this question by testing audio-visual recalibration following an adaptation phase where two opposing audio-visual asynchronies were present. The direction of voluntary visual attention, and therefore to one of the two possible asynchronies (flash leading or flash lagging), was manipulated using colour as a selection criterion. We found a shift in the point of subjective audio-visual simultaneity as a function of whether the observer had focused attention to audio-then-flash or to flash-then-audio groupings during the adaptation phase. A baseline adaptation condition revealed that this effect of endogenous attention was only effective toward the lagging flash. This hints at the role of exogenous capture and/or additional endogenous effects producing an asymmetry toward the leading flash. We conclude that selective attention helps promote selected audio-visual pairings to be combined and subsequently adjusted in time but, stimulus organization exerts a strong impact on recalibration. We tentatively hypothesize that the resolution of recalibration in complex scenarios involves the orchestration of top-down selection mechanisms and stimulus-driven processes.
Selective attention modulates the direction of audio-visual temporal recalibration.

Directory of Open Access Journals (Sweden)

Nara Ikumi

Full Text Available Temporal recalibration of cross-modal synchrony has been proposed as a mechanism to compensate for timing differences between sensory modalities. However, far from the rich complexity of everyday life sensory environments, most studies to date have examined recalibration on isolated cross-modal pairings. Here, we hypothesize that selective attention might provide an effective filter to help resolve which stimuli are selected when multiple events compete for recalibration. We addressed this question by testing audio-visual recalibration following an adaptation phase where two opposing audio-visual asynchronies were present. The direction of voluntary visual attention, and therefore to one of the two possible asynchronies (flash leading or flash lagging), was manipulated using colour as a selection criterion. We found a shift in the point of subjective audio-visual simultaneity as a function of whether the observer had focused attention to audio-then-flash or to flash-then-audio groupings during the adaptation phase. A baseline adaptation condition revealed that this effect of endogenous attention was only effective toward the lagging flash. This hints at the role of exogenous capture and/or additional endogenous effects producing an asymmetry toward the leading flash. We conclude that selective attention helps promote selected audio-visual pairings to be combined and subsequently adjusted in time but, stimulus organization exerts a strong impact on recalibration. We tentatively hypothesize that the resolution of recalibration in complex scenarios involves the orchestration of top-down selection mechanisms and stimulus-driven processes.
Aurally Aided Visual Search Performance Comparing Virtual Audio Systems

DEFF Research Database (Denmark)

Larsen, Camilla Horne; Lauritsen, David Skødt; Larsen, Jacob Junker

2014-01-01

Due to increased computational power, reproducing binaural hearing in real-time applications, through usage of head-related transfer functions (HRTFs), is now possible. This paper addresses the differences in aurally-aided visual search performance between an HRTF enhanced audio system (3D) and an...... with white dots. The results indicate that 3D audio yields faster search latencies than panning audio, especially with larger amounts of distractors. The applications of this research could fit virtual environments such as video games or virtual simulations.
Aurally Aided Visual Search Performance Comparing Virtual Audio Systems

DEFF Research Database (Denmark)

Larsen, Camilla Horne; Lauritsen, David Skødt; Larsen, Jacob Junker

2014-01-01

Due to increased computational power, reproducing binaural hearing in real-time applications, through usage of head-related transfer functions (HRTFs), is now possible. This paper addresses the differences in aurally-aided visual search performance between an HRTF enhanced audio system (3D) and an...... with white dots. The results indicate that 3D audio yields faster search latencies than panning audio, especially with larger amounts of distractors. The applications of this research could fit virtual environments such as video games or virtual simulations.
Designing between Pedagogies and Cultures: Audio-Visual Chinese Language Resources for Australian Schools

Science.gov (United States)

Yuan, Yifeng; Shen, Huizhong

2016-01-01

This design-based study examines the creation and development of audio-visual Chinese language teaching and learning materials for Australian schools by incorporating users' feedback and content writers' input that emerged in the designing process. Data were collected from workshop feedback of two groups of Chinese-language teachers from primary…
Audio-Visual Fusion for Sound Source Localization and Improved Attention

Energy Technology Data Exchange (ETDEWEB)

Lee, Byoung Gi; Choi, Jong Suk; Yoon, Sang Suk; Choi, Mun Taek; Kim, Mun Sang [Korea Institute of Science and Technology, Daejeon (Korea, Republic of); Kim, Dai Jin [Pohang University of Science and Technology, Pohang (Korea, Republic of)

2011-07-15

Service robots are equipped with various sensors such as vision camera, sonar sensor, laser scanner, and microphones. Although these sensors have their own functions, some of them can be made to work together and perform more complicated functions. Audio-visual fusion is a typical and powerful combination of audio and video sensors, because audio information is complementary to visual information and vice versa. Human beings also mainly depend on visual and auditory information in their daily life. In this paper, we conduct two studies using audio-vision fusion: one is on enhancing the performance of sound localization, and the other is on improving robot attention through sound localization and face detection.
Audio-Visual Fusion for Sound Source Localization and Improved Attention

International Nuclear Information System (INIS)

Lee, Byoung Gi; Choi, Jong Suk; Yoon, Sang Suk; Choi, Mun Taek; Kim, Mun Sang; Kim, Dai Jin

2011-01-01

Service robots are equipped with various sensors such as vision camera, sonar sensor, laser scanner, and microphones. Although these sensors have their own functions, some of them can be made to work together and perform more complicated functions. Audio-visual fusion is a typical and powerful combination of audio and video sensors, because audio information is complementary to visual information and vice versa. Human beings also mainly depend on visual and auditory information in their daily life. In this paper, we conduct two studies using audio-vision fusion: one is on enhancing the performance of sound localization, and the other is on improving robot attention through sound localization and face detection.
A conceptual framework for audio-visual museum media

DEFF Research Database (Denmark)

Kirkedahl Lysholm Nielsen, Mikkel

2017-01-01

In today's history museums, the past is communicated through many other means than original artefacts. This interdisciplinary and theoretical article suggests a new approach to studying the use of audio-visual media, such as film, video and related media types, in a museum context. The centre...... and museum studies, existing case studies, and real life observations, the suggested framework instead stresses particular characteristics of contextual use of audio-visual media in history museums, such as authenticity, virtuality, interactivity, social context and spatial attributes of the communication...
Parametric Packet-Layer Model for Evaluation Audio Quality in Multimedia Streaming Services

Science.gov (United States)

Egi, Noritsugu; Hayashi, Takanori; Takahashi, Akira

We propose a parametric packet-layer model for monitoring audio quality in multimedia streaming services such as Internet protocol television (IPTV). This model estimates audio quality of experience (QoE) on the basis of quality degradation due to coding and packet loss of an audio sequence. The input parameters of this model are audio bit rate, sampling rate, frame length, packet-loss frequency, and average burst length. Audio bit rate, packet-loss frequency, and average burst length are calculated from header information in received IP packets. For sampling rate, frame length, and audio codec type, the values or the names used in monitored services are input into this model directly. We performed a subjective listening test to examine the relationships between these input parameters and perceived audio quality. The codec used in this test was Advanced Audio Coding-Low Complexity (AAC-LC), which is one of the international standards for audio coding. On the basis of the test results, we developed an audio quality evaluation model. The verification results indicate that audio quality estimated by the proposed model has a high correlation with perceived audio quality.
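Two of the inputs named above, packet-loss frequency and average burst length, can be derived from the sequence numbers found in packet headers. The sketch below assumes that a loss event is a run of consecutive missing sequence numbers, that packet-loss frequency is the count of such events, and that sequence numbers do not wrap around; the abstract does not give these definitions, and `loss_statistics` is a hypothetical name.

```python
def loss_statistics(received_seq, first_seq, last_seq):
    """Return (loss_events, average_burst_length) over [first_seq, last_seq].

    A loss event is a run of consecutive missing sequence numbers, and its
    burst length is the number of packets in that run.
    """
    received = set(received_seq)
    bursts = []  # length of each run of consecutive lost packets
    run = 0
    for seq in range(first_seq, last_seq + 1):
        if seq in received:
            if run:
                bursts.append(run)
            run = 0
        else:
            run += 1
    if run:
        bursts.append(run)  # the stream ended inside a loss burst
    average = sum(bursts) / len(bursts) if bursts else 0.0
    return len(bursts), average

# Made-up trace: packets 3, 4, 7 and 9 never arrived.
events, avg_burst = loss_statistics([1, 2, 5, 6, 8, 10], 1, 10)
```

Separating the number of loss events from their average length matters because one long burst and several isolated losses can degrade perceived audio quality differently even at the same overall loss rate.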
Audio-Visual Perception System for a Humanoid Robotic Head

Directory of Open Access Journals (Sweden)

Raquel Viciana-Abad

2014-05-01

One of the main issues within the field of social robotics is to endow robots with the ability to direct attention to people with whom they are interacting. Different approaches follow bio-inspired mechanisms, merging audio and visual cues to localize a person using multiple sensors. However, most of these fusion mechanisms have been used in fixed systems, such as those used in video-conference rooms, and thus they may run into difficulties when constrained to the sensors with which a robot can be equipped. Besides, within the scope of interactive autonomous robots, the benefits of audio-visual attention mechanisms, compared to audio-only or visual-only approaches, have rarely been evaluated in real scenarios. Most of the tests conducted have been within controlled environments, at short distances and/or with off-line performance measurements. With the goal of demonstrating the benefit of fusing sensory information with Bayesian inference for interactive robotics, this paper presents a system for localizing a person by processing visual and audio data. Moreover, the performance of this system is evaluated and compared, taking into account the technical limitations of unimodal systems. The experiments show the promise of the proposed approach for the proactive detection and tracking of speakers in a human-robot interactive framework.

Audio-Visual and Meaningful Semantic Context Enhancements in Older and Younger Adults.

Directory of Open Access Journals (Sweden)

Kirsten E Smayda

Speech perception is critical to everyday life. Oftentimes noise can degrade a speech signal; however, because of the cues available to the listener, such as visual and semantic cues, noise rarely prevents conversations from continuing. The interaction of visual and semantic cues in aiding speech perception has been studied in young adults, but the extent to which these two cues interact for older adults has not been studied. To investigate the effect of visual and semantic cues on speech perception in older and younger adults, we recruited forty-five young adults (ages 18-35) and thirty-three older adults (ages 60-90) to participate in a speech perception task. Participants were presented with semantically meaningful and anomalous sentences in audio-only and audio-visual conditions. We hypothesized that young adults would outperform older adults across SNRs, modalities, and semantic contexts. In addition, we hypothesized that both young and older adults would receive a greater benefit from a semantically meaningful context in the audio-visual relative to audio-only modality. We predicted that young adults would receive greater visual benefit in semantically meaningful contexts relative to anomalous contexts. However, we predicted that older adults could receive a greater visual benefit in either semantically meaningful or anomalous contexts. Results suggested that in the most supportive context, that is, semantically meaningful sentences presented in the audio-visual modality, older adults performed similarly to young adults. In addition, both groups received the same amount of visual and meaningful benefit. Lastly, across groups, a semantically meaningful context provided more benefit in the audio-visual modality relative to the audio-only modality, and the presence of visual cues provided more benefit in semantically meaningful contexts relative to anomalous contexts. These results suggest that older adults can perceive speech as well as younger
Effects of Audio-Visual Information on the Intelligibility of Alaryngeal Speech

Science.gov (United States)

Evitts, Paul M.; Portugal, Lindsay; Van Dine, Ami; Holler, Aline

2010-01-01

Background: There is minimal research on the contribution of visual information on speech intelligibility for individuals with a laryngectomy (IWL). Aims: The purpose of this project was to determine the effects of mode of presentation (audio-only, audio-visual) on alaryngeal speech intelligibility. Method: Twenty-three naive listeners were…
Computationally efficient clustering of audio-visual meeting data

NARCIS (Netherlands)

Hung, H.; Friedland, G.; Yeo, C.; Shao, L.; Shan, C.; Luo, J.; Etoh, M.

2010-01-01

This chapter presents novel computationally efficient algorithms to extract semantically meaningful acoustic and visual events related to each of the participants in a group discussion using the example of business meeting recordings. The recording setup involves relatively few audio-visual sensors,
Cortical Integration of Audio-Visual Information

Science.gov (United States)

Vander Wyk, Brent C.; Ramsay, Gordon J.; Hudac, Caitlin M.; Jones, Warren; Lin, David; Klin, Ami; Lee, Su Mei; Pelphrey, Kevin A.

2013-01-01

We investigated the neural basis of audio-visual processing in speech and non-speech stimuli. Physically identical auditory stimuli (speech and sinusoidal tones) and visual stimuli (animated circles and ellipses) were used in this fMRI experiment. Relative to unimodal stimuli, each of the multimodal conjunctions showed increased activation in largely non-overlapping areas. The conjunction of Ellipse and Speech, which most resembles naturalistic audiovisual speech, showed higher activation in the right inferior frontal gyrus, fusiform gyri, left posterior superior temporal sulcus, and lateral occipital cortex. The conjunction of Circle and Tone, an arbitrary audio-visual pairing with no speech association, activated middle temporal gyri and lateral occipital cortex. The conjunction of Circle and Speech showed activation in lateral occipital cortex, and the conjunction of Ellipse and Tone did not show increased activation relative to unimodal stimuli. Further analysis revealed that middle temporal regions, although identified as multimodal only in the Circle-Tone condition, were more strongly active to Ellipse-Speech or Circle-Speech, but regions that were identified as multimodal for Ellipse-Speech were always strongest for Ellipse-Speech. Our results suggest that combinations of auditory and visual stimuli may together be processed by different cortical networks, depending on the extent to which speech or non-speech percepts are evoked. PMID:20709442
Investigating the impact of audio instruction and audio-visual biofeedback for lung cancer radiation therapy

Science.gov (United States)

George, Rohini

Lung cancer accounts for 13% of all cancers in the United States and is the leading cause of death among both men and women. The five-year survival for lung cancer patients is approximately 15% (ACS Facts & Figures). Respiratory motion decreases the accuracy of thoracic radiotherapy during imaging and delivery. To account for respiration, margins are generally added during radiation treatment planning, which may cause a substantial dose delivery to normal tissues and increase normal tissue toxicity. To alleviate these effects of respiratory motion, several motion management techniques are available that can reduce the doses to normal tissues, thereby reducing treatment toxicity and allowing dose escalation to the tumor. This may increase the survival probability of patients who have lung cancer and are receiving radiation therapy. However, the accuracy of these motion management techniques is limited by respiration irregularity. The rationale of this thesis was to study the improvement in regularity of respiratory motion by breathing coaching for lung cancer patients using audio instructions and audio-visual biofeedback. A total of 331 patient respiratory motion traces, each four minutes in length, were collected from 24 lung cancer patients enrolled in an IRB-approved breathing-training protocol. It was determined that audio-visual biofeedback significantly improved the regularity of respiratory motion compared to free breathing and audio instruction, thus improving the accuracy of respiratory gated radiotherapy. It was also observed that duty cycles below 30% showed insignificant reduction in residual motion, while above 50% there was a sharp increase in residual motion. The reproducibility of exhale-based gating was higher than that of inhale-based gating. Modeling the respiratory cycles, it was found that cosine and cosine^4 models had the best correlation with individual respiratory cycles. The overall respiratory motion probability distribution
Selected Audio-Visual Materials for Consumer Education. [New Version.]

Science.gov (United States)

Johnston, William L.

Ninety-two films, filmstrips, multi-media kits, slides, and audio cassettes, produced between 1964 and 1974, are listed in this selective annotated bibliography on consumer education. The major portion of the bibliography is devoted to films and filmstrips. The main topics of the audio-visual materials include purchasing, advertising, money…
Audio-Visual Speech Recognition Using Lip Information Extracted from Side-Face Images

Directory of Open Access Journals (Sweden)

Koji Iwano

2007-03-01

This paper proposes an audio-visual speech recognition method using lip information extracted from side-face images as an attempt to increase noise robustness in mobile environments. Our proposed method assumes that lip images can be captured using a small camera installed in a handset. Two different kinds of lip features, lip-contour geometric features and lip-motion velocity features, are used individually or jointly, in combination with audio features. Phoneme HMMs modeling the audio and visual features are built based on the multistream HMM technique. Experiments conducted using Japanese connected digit speech contaminated with white noise in various SNR conditions show the effectiveness of the proposed method. Recognition accuracy is improved by using the visual information in all SNR conditions. These visual features were confirmed to be effective even when the audio HMM was adapted to noise by the MLLR method.
Automatic Organisation and Quality Analysis of User-Generated Content with Audio Fingerprinting

OpenAIRE

Cavaco, Sofia; Magalhaes, Joao; Mordido, Gonçalo

2018-01-01

The increase of the quantity of user-generated content experienced in social media has boosted the importance of analysing and organising the content by its quality. Here, we propose a method that uses audio fingerprinting to organise and infer the quality of user-generated audio content. The proposed method detects the overlapping segments between different audio clips to organise and cluster the data according to events, and to infer the audio quality of the samples. A test setup with conce...
Mobile video-to-audio transducer and motion detection for sensory substitution

Directory of Open Access Journals (Sweden)

Maxime eAmbard

2015-10-01

Visuo-auditory sensory substitution systems are augmented reality devices that translate a video stream into an audio stream in order to help the blind in daily tasks requiring visuo-spatial information. In this work, we present both a new mobile device and a transcoding method specifically designed to sonify moving objects. Frame differencing is used to extract spatial features from the video stream and two-dimensional spatial information is converted into audio cues using pitch, interaural time difference and interaural level difference. Using numerical methods, we attempt to reconstruct visuo-spatial information based on audio signals generated from various video stimuli. We show that despite a contrasted visual background and a highly lossy encoding method, the information in the audio signal is sufficient to allow object localization, object trajectory evaluation, object approach detection, and spatial separation of multiple objects. We also show that this type of audio signal can be interpreted by human users by asking ten subjects to discriminate trajectories based on generated audio signals.
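The pipeline described above (frame differencing, then mapping position to pitch, interaural time difference and interaural level difference) can be sketched in a few lines. The mapping below is purely illustrative: the frequency range, the maximum ITD and ILD values, the threshold, and the function names are assumptions for this example, not parameters reported by the authors.

```python
def frame_difference(prev, cur, threshold=30):
    """Return (row, col) positions whose intensity changed by more than threshold.

    Frames are equally sized lists of rows of grey-level values.
    """
    return [
        (r, c)
        for r, (row_prev, row_cur) in enumerate(zip(prev, cur))
        for c, (p, q) in enumerate(zip(row_prev, row_cur))
        if abs(q - p) > threshold
    ]


def pixel_to_cue(r, c, height, width,
                 f_low=200.0, f_high=3200.0,
                 max_itd_s=0.0006, max_ild_db=10.0):
    """Map a pixel position to hypothetical audio cues.

    Vertical position -> pitch on a logarithmic scale (top row is highest).
    Horizontal position -> pan in [-1, 1], scaled to ITD and ILD.
    Requires height > 1 and width > 1.
    """
    v = 1.0 - r / (height - 1)
    pitch_hz = f_low * (f_high / f_low) ** v
    pan = 2.0 * c / (width - 1) - 1.0
    return {"pitch_hz": pitch_hz, "itd_s": pan * max_itd_s, "ild_db": pan * max_ild_db}


if __name__ == "__main__":
    previous = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
    current = [[0, 0, 255], [0, 0, 0], [0, 0, 0]]  # something moved top-right
    for r, c in frame_difference(previous, current):
        print((r, c), pixel_to_cue(r, c, height=3, width=3))
```

Only pixels that changed between frames produce sound in this sketch, which is one simple way a static background could be kept silent while moving objects are sonified.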
Feature Fusion Based Audio-Visual Speaker Identification Using Hidden Markov Model under Different Lighting Variations

Directory of Open Access Journals (Sweden)

Md. Rabiul Islam

2014-01-01

The aim of the paper is to propose a feature-fusion-based Audio-Visual Speaker Identification (AVSI) system under varied illumination conditions. Among the different fusion strategies, feature-level fusion has been used for the proposed AVSI system, where a Hidden Markov Model (HMM) is used for learning and classification. Since the feature set contains richer information about the raw biometric data than any other level, integration at the feature level is expected to provide better authentication results. In this paper, both Mel Frequency Cepstral Coefficients (MFCCs) and Linear Prediction Cepstral Coefficients (LPCCs) are combined to get the audio feature vectors, and Active Shape Model (ASM) based appearance and shape facial features are concatenated to form the visual feature vectors. These combined audio and visual features are used for the feature fusion. To reduce the dimension of the audio and visual feature vectors, Principal Component Analysis (PCA) is used. The VALID audio-visual database is used to measure the performance of the proposed system, where four different illumination levels are considered. Experimental results focus on the significance of the proposed audio-visual speaker identification system with various combinations of audio and visual features.
DESIGN OF AUDIO-VISUAL-BASED LEARNING MEDIA FOR THE TYPOGRAPHY COURSE IN THE VISUAL COMMUNICATION DESIGN STUDY PROGRAM AT UNIVERSITAS DIAN NUSWANTORO

Directory of Open Access Journals (Sweden)

Puri Sulistiyawati

2017-02-01

Typography is one of the courses in the field of visual communication design that prioritizes the visual aspect. However, observation showed that the learning media used so far have been less effective because of limited use of information technology, so students do not fully understand the course material delivered by lecturers. The development of information technology has had many positive effects on the advancement of education, including its use to support media in the learning process. The purpose of this research is to design learning media for the typography course by utilizing information technology, namely audio-visual media. The method used in this research is Research and Development with the ADDIE model (Analysis, Design, Development, Implementation, Evaluation). With the creation of this audio-visual learning medium, the learning process of the Typography course is expected to be more effective and the course material easier for students to understand. Keywords: audio visual, learning media, typography
StreamMap: Smooth Dynamic Visualization of High-Density Streaming Points.

Science.gov (United States)

Li, Chenhui; Baciu, George; Han, Yu

2018-03-01

Interactive visualization of streaming points for real-time scatterplots and linear blending of correlation patterns is increasingly becoming the dominant mode of visual analytics for both big data and streaming data from active sensors and broadcasting media. To better visualize and interact with inter-stream patterns, it is generally necessary to smooth out gaps or distortions in the streaming data. Previous approaches either animate the points directly or present a sampled static heat-map. We propose a new approach, called StreamMap, to smoothly blend high-density streaming points and create a visual flow that emphasizes the density pattern distributions. In essence, we present three new contributions for the visualization of high-density streaming points. The first contribution is a density-based method called super kernel density estimation that aggregates streaming points using an adaptive kernel to solve the overlapping problem. The second contribution is a robust density morphing algorithm that generates several smooth intermediate frames for a given pair of frames. The third contribution is a trend representation design that can help convey the flow directions of the streaming points. The experimental results on three datasets demonstrate the effectiveness of StreamMap when dynamic visualization and visual analysis of trend patterns on streaming points are required.
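To make the two core ideas concrete (aggregating points into a density field, then generating intermediate frames between two fields), here is a deliberately simplified sketch. It uses a fixed-bandwidth Gaussian kernel and plain linear blending as stand-ins; it is not the adaptive "super kernel density estimation" or the density morphing algorithm proposed in the paper, and all names and parameters are assumptions for illustration.

```python
import math


def density_grid(points, size, bandwidth=1.0):
    """Evaluate an unnormalised Gaussian kernel density on a size x size grid.

    points are (x, y) pairs in grid coordinates. A fixed bandwidth is used
    here; an adaptive kernel would vary it per point.
    """
    grid = []
    for gy in range(size):
        row = []
        for gx in range(size):
            value = sum(
                math.exp(-((gx - x) ** 2 + (gy - y) ** 2) / (2 * bandwidth ** 2))
                for x, y in points
            )
            row.append(value)
        grid.append(row)
    return grid


def blend_frames(frame_a, frame_b, steps):
    """Return `steps` intermediate grids between frame_a and frame_b.

    Linear cross-fade only: density fades out in one place and in at another,
    rather than being transported along a flow.
    """
    frames = []
    for k in range(1, steps + 1):
        t = k / (steps + 1)
        frames.append([
            [(1 - t) * va + t * vb for va, vb in zip(row_a, row_b)]
            for row_a, row_b in zip(frame_a, frame_b)
        ])
    return frames


if __name__ == "__main__":
    before = density_grid([(0, 0)], size=3)
    after = density_grid([(2, 2)], size=3)
    for frame in blend_frames(before, after, steps=2):
        print([[round(v, 3) for v in row] for row in frame])
```

The cross-fade shows why a dedicated morphing step is attractive: with linear blending a moving cluster appears as two fading blobs instead of one blob travelling between positions.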
Audio-visual materials usage preference among agricultural ...

African Journals Online (AJOL)

It was found that respondents preferred radio, television, poster, advert, photographs, specimen, bulletin, magazine, cinema, videotape, chalkboard, and bulletin board as audio-visual materials for extension work. These are the materials that can easily be manipulated and utilized for extension work. Nigerian Journal of ...
ANALYSIS OF MULTIMODAL FUSION TECHNIQUES FOR AUDIO-VISUAL SPEECH RECOGNITION

Directory of Open Access Journals (Sweden)

D.V. Ivanko

2016-05-01

This paper presents an analytical review covering the latest achievements in the field of audio-visual (AV) fusion (integration) of multimodal information. We discuss the main challenges and report on approaches to address them. One of the most important tasks of AV integration is to understand how the modalities interact and influence each other. The paper addresses this problem in the context of AV speech processing and speech recognition. In the first part of the review we set out the basic principles of AV speech recognition and give a classification of the audio and visual features of speech. Special attention is paid to the systematization of the existing techniques and AV data fusion methods. In the second part, based on our analysis of the research area, we provide a consolidated list of tasks and applications that use AV fusion. We also indicate the methods, techniques, and audio and video features used. We propose a classification of AV integration and discuss the advantages and disadvantages of different approaches. We draw conclusions and offer our assessment of the future of the field of AV fusion. In further research we plan to implement a system for audio-visual recognition of continuous Russian speech using advanced methods of multimodal fusion.
Semantic congruency but not temporal synchrony enhances long-term memory performance for audio-visual scenes.

Science.gov (United States)

Meyerhoff, Hauke S; Huff, Markus

2016-04-01

Human long-term memory for visual objects and scenes is tremendous. Here, we test how auditory information contributes to long-term memory performance for realistic scenes. In a total of six experiments, we manipulated the presentation modality (auditory, visual, audio-visual) as well as semantic congruency and temporal synchrony between auditory and visual information of brief filmic clips. Our results show that audio-visual clips generally elicit more accurate memory performance than unimodal clips. This advantage even increases with congruent visual and auditory information. However, violations of audio-visual synchrony hardly have any influence on memory performance. Memory performance remained intact even with a sequential presentation of auditory and visual information, but finally declined when the matching tracks of one scene were presented separately with intervening tracks during learning. With respect to memory performance, our results therefore show that audio-visual integration is sensitive to semantic congruency but remarkably robust against asymmetries between different modalities.
The audio and visual communication systems for suited engineering activities on JET

International Nuclear Information System (INIS)

Pearce, R.J.H.; Bruce, J.; Callaghan, C.; Hart, M.; Martin, P.; Middleton, R.; Tait, J.

2001-01-01

The beryllium and/or tritium contamination of the JET tokamak and auxiliary systems necessitates that many activities are carried out in air line fed pressurised suits. To enable often complex engineering activities to be performed, a number of novel audio and visual and communications systems have been designed. The paper describes these systems which give freedom of visual and audio communication between suited personnel, supervisors, operators and engineers. The system enhances the safety of the working environment as well as helping to minimise the radiation dose to personnel. It is concluded, from a number of years experience of using the audio and visual communications systems for suited operations, that safety and the progress of complex engineering tasks have been significantly enhanced
The audio and visual communication systems for suited engineering activities on JET

Energy Technology Data Exchange (ETDEWEB)

Pearce, R.J.H.; Bruce, J.; Callaghan, C.; Hart, M.; Martin, P.; Middleton, R.; Tait, J.

2001-11-01

The beryllium and/or tritium contamination of the JET tokamak and auxiliary systems necessitates that many activities are carried out in air line fed pressurised suits. To enable often complex engineering activities to be performed, a number of novel audio and visual and communications systems have been designed. The paper describes these systems which give freedom of visual and audio communication between suited personnel, supervisors, operators and engineers. The system enhances the safety of the working environment as well as helping to minimise the radiation dose to personnel. It is concluded, from a number of years experience of using the audio and visual communications systems for suited operations, that safety and the progress of complex engineering tasks have been significantly enhanced.
AudioPairBank: Towards A Large-Scale Tag-Pair-Based Audio Content Analysis

OpenAIRE

Sager, Sebastian; Elizalde, Benjamin; Borth, Damian; Schulze, Christian; Raj, Bhiksha; Lane, Ian

2016-01-01

Recently, sound recognition has been used to identify sounds, such as car and river. However, sounds have nuances that may be better described by adjective-noun pairs such as slow car, and verb-noun pairs such as flying insects, which are under explored. Therefore, in this work we investigate the relation between audio content and both adjective-noun pairs and verb-noun pairs. Due to the lack of datasets with these kinds of annotations, we collected and processed the AudioPairBank corpus cons...
Enhanced audio-visual interactions in the auditory cortex of elderly cochlear-implant users.

Science.gov (United States)

Schierholz, Irina; Finke, Mareike; Schulte, Svenja; Hauthal, Nadine; Kantzke, Christoph; Rach, Stefan; Büchner, Andreas; Dengler, Reinhard; Sandmann, Pascale

2015-10-01

Auditory deprivation and the restoration of hearing via a cochlear implant (CI) can induce functional plasticity in auditory cortical areas. How these plastic changes affect the ability to integrate combined auditory (A) and visual (V) information is not yet well understood. In the present study, we used electroencephalography (EEG) to examine whether age, temporary deafness and altered sensory experience with a CI can affect audio-visual (AV) interactions in post-lingually deafened CI users. Young and elderly CI users and age-matched NH listeners performed a speeded response task on basic auditory, visual and audio-visual stimuli. Regarding the behavioral results, a redundant signals effect, that is, faster response times to cross-modal (AV) than to both of the two modality-specific stimuli (A, V), was revealed for all groups of participants. Moreover, in all four groups, we found evidence for audio-visual integration. Regarding event-related responses (ERPs), we observed a more pronounced visual modulation of the cortical auditory response at N1 latency (approximately 100 ms after stimulus onset) in the elderly CI users when compared with young CI users and elderly NH listeners. Thus, elderly CI users showed enhanced audio-visual binding which may be a consequence of compensatory strategies developed due to temporary deafness and/or degraded sensory input after implantation. These results indicate that the combination of aging, sensory deprivation and CI facilitates the coupling between the auditory and the visual modality. We suggest that this enhancement in multisensory interactions could be used to optimize auditory rehabilitation, especially in elderly CI users, by the application of strong audio-visually based rehabilitation strategies after implant switch-on.
Binding and unbinding the auditory and visual streams in the McGurk effect.

Science.gov (United States)

Nahorna, Olha; Berthommier, Frédéric; Schwartz, Jean-Luc

2012-08-01

Subjects presented with coherent auditory and visual streams generally fuse them into a single percept. This results in enhanced intelligibility in noise, or in visual modification of the auditory percept in the McGurk effect. It is classically considered that processing is done independently in the auditory and visual systems before interaction occurs at a certain representational stage, resulting in an integrated percept. However, some behavioral and neurophysiological data suggest the existence of a two-stage process. A first stage would involve binding together the appropriate pieces of audio and video information before fusion per se in a second stage. Then it should be possible to design experiments leading to unbinding. It is shown here that if a given McGurk stimulus is preceded by an incoherent audiovisual context, the amount of McGurk effect is largely reduced. Various kinds of incoherent contexts (acoustic syllables dubbed on video sentences or phonetic or temporal modifications of the acoustic content of a regular sequence of audiovisual syllables) can significantly reduce the McGurk effect even when they are short (less than 4 s). The data are interpreted in the framework of a two-stage "binding and fusion" model for audiovisual speech perception.

PHYSIOLOGICAL MONITORING OF ACS OPERATORS IN AUDIO-VISUAL SIMULATION OF AN EMERGENCY

Directory of Open Access Journals (Sweden)

S. S. Aleksanin

2010-01-01

Using a ship simulator for automated control systems (ACS), we investigated how informative physiological monitoring of cardiac rhythm is for assessing the reliability and noise immunity of operators of various specializations during audio-visual simulation of an emergency. In parallel, we studied the effectiveness of protection against the adverse effects of electromagnetic fields. Monitoring cardiac rhythm during a virtual accident makes it possible to differentiate, by specialization, the degree of strain on the operators' systems regulating body functions, and to note the positive effect of using means of protection against exposure to electromagnetic fields.
Do gender differences in audio-visual benefit and visual influence in audio-visual speech perception emerge with age?

Directory of Open Access Journals (Sweden)

Magnus eAlm

2015-07-01

Gender and age have been found to affect adults’ audio-visual (AV) speech perception. However, research on adult aging focuses on adults over 60 years, who have an increasing likelihood of cognitive and sensory decline, which may confound positive effects of age-related AV experience and its interaction with gender. Observed age and gender differences in AV speech perception may also depend on measurement sensitivity and AV task difficulty. Consequently, both AV benefit and visual influence were used to measure visual contribution for gender-balanced groups of young (20-30 years) and middle-aged adults (50-60 years), with task difficulty varied using AV syllables from different talkers in alternative auditory backgrounds. Females had better speech-reading performance than males. Whereas no gender differences in AV benefit or visual influence were observed for young adults, visually influenced responses were significantly greater for middle-aged females than middle-aged males. That speech-reading performance did not influence AV benefit may be explained by visual speech extraction and AV integration constituting independent abilities. In contrast, the gender difference in visually influenced responses in middle adulthood may reflect an experience-related shift in females’ general AV perceptual strategy. Although young females’ speech-reading proficiency may not readily contribute to greater visual influence, between young and middle adulthood recurrent confirmation of the contribution of visual cues induced by speech-reading proficiency may gradually shift females’ AV perceptual strategy towards more visually dominated responses.
Audio-visual identification of place of articulation and voicing in white and babble noise.

Science.gov (United States)

Alm, Magnus; Behne, Dawn M; Wang, Yue; Eg, Ragnhild

2009-07-01

Research shows that noise and phonetic attributes influence the degree to which auditory and visual modalities are used in audio-visual speech perception (AVSP). Research has, however, mainly focused on white noise and single phonetic attributes, thus neglecting the more common babble noise and possible interactions between phonetic attributes. This study explores whether white and babble noise differentially influence AVSP and whether these differences depend on phonetic attributes. White and babble noise of 0 and -12 dB signal-to-noise ratio were added to congruent and incongruent audio-visual stop consonant-vowel stimuli. The audio (A) and video (V) of incongruent stimuli differed either in place of articulation (POA) or voicing. Responses from 15 young adults show that, compared to white noise, babble resulted in more audio responses for POA stimuli, and fewer for voicing stimuli. Voiced syllables received more audio responses than voiceless syllables. Results can be attributed to discrepancies in the acoustic spectra of both the noise and speech target. Voiced consonants may be more auditorily salient than voiceless consonants which are more spectrally similar to white noise. Visual cues contribute to identification of voicing, but only if the POA is visually salient and auditorily susceptible to the noise type.
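For readers unfamiliar with how stimuli at a fixed signal-to-noise ratio such as 0 or -12 dB are usually prepared, the sketch below scales a noise signal so that the RMS ratio of speech to noise matches a target SNR before mixing. It is a generic illustration under the common RMS-based definition of SNR; the abstract does not state how the authors computed their levels, and the function names are made up for this example.

```python
import math


def rms(signal):
    """Root-mean-square amplitude of a sequence of samples."""
    return math.sqrt(sum(v * v for v in signal) / len(signal))


def mix_at_snr(speech, noise, snr_db):
    """Mix speech with noise scaled to the requested SNR (in dB).

    SNR is taken as 20 * log10(rms(speech) / rms(scaled_noise)).
    Both signals must have the same length and non-zero RMS.
    Returns (mixed_signal, noise_gain).
    """
    if len(speech) != len(noise):
        raise ValueError("speech and noise must have the same length")
    gain = rms(speech) / (rms(noise) * 10 ** (snr_db / 20))
    mixed = [s + gain * n for s, n in zip(speech, noise)]
    return mixed, gain


if __name__ == "__main__":
    speech = [math.sin(0.1 * i) for i in range(200)]
    noise = [0.5 if i % 2 == 0 else -0.5 for i in range(200)]
    for target in (0.0, -12.0):
        _, gain = mix_at_snr(speech, noise, target)
        print(target, round(gain, 4))
```

At -12 dB the noise RMS is roughly four times the speech RMS, which gives a sense of how demanding the harder listening condition is.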
Neuromorphic Audio-Visual Sensor Fusion on a Sound-Localising Robot

Directory of Open Access Journals (Sweden)

Vincent Yue-Sek Chan

2012-02-01

This paper presents the first robotic system featuring audio-visual sensor fusion with neuromorphic sensors. We combine a pair of silicon cochleae and a silicon retina on a robotic platform to allow the robot to learn sound localisation through self-motion and visual feedback, using an adaptive ITD-based sound localisation algorithm. After training, the robot can localise sound sources (white or pink noise) in a reverberant environment with an RMS error of 4 to 5 degrees in azimuth. In the second part of the paper, we investigate the source binding problem. An experiment is conducted to test the effectiveness of matching an audio event with a corresponding visual event based on their onset time. The results show that this technique can be quite effective, despite its simplicity.
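As background on ITD-based localisation in general, the sketch below estimates the interaural time difference with a conventional cross-correlation search and converts it to an azimuth using a far-field two-microphone model. This is a textbook approach swapped in for illustration; it is not the adaptive, spike-based algorithm learned by the robot in the paper, and the microphone spacing, sign convention, and function names are assumptions.

```python
import math


def estimate_itd(left, right, sample_rate, max_lag):
    """Estimate the ITD in seconds by maximising the cross-correlation.

    A positive result means the right channel lags the left channel.
    Both channels must have the same length.
    """
    n = len(left)
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        # Sum over indices where both left[i] and right[i + lag] exist.
        score = sum(
            left[i] * right[i + lag]
            for i in range(max(0, -lag), min(n, n - lag))
        )
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag / sample_rate


def itd_to_azimuth(itd_s, mic_distance_m=0.2, speed_of_sound=343.0):
    """Convert an ITD to an azimuth in degrees with a far-field model.

    Uses sin(theta) = itd * c / d, clamped to [-1, 1].
    """
    s = max(-1.0, min(1.0, itd_s * speed_of_sound / mic_distance_m))
    return math.degrees(math.asin(s))


if __name__ == "__main__":
    left = [0.0] * 20
    right = [0.0] * 20
    left[5] = 1.0   # click reaches the left sensor first
    right[8] = 1.0  # and the right sensor three samples later
    itd = estimate_itd(left, right, sample_rate=48000, max_lag=10)
    print(itd, itd_to_azimuth(itd))
```

A fixed formula like this depends on knowing the sensor spacing and speed of sound; learning the ITD-to-azimuth mapping from visual feedback, as the abstract describes, avoids having to calibrate those quantities by hand.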
Proper Use of Audio-Visual Aids: Essential for Educators.

Science.gov (United States)

Dejardin, Conrad

1989-01-01

Criticizes educators as the worst users of audio-visual aids and among the worst public speakers. Offers guidelines for the proper use of an overhead projector and the development of transparencies. (DMM)
Audio-Visual Aid in Teaching "Fatty Liver"

Science.gov (United States)

Dash, Sambit; Kamath, Ullas; Rao, Guruprasad; Prakash, Jay; Mishra, Snigdha

2016-01-01

Use of audio visual tools to aid in medical education is ever on a rise. Our study intends to find the efficacy of a video prepared on "fatty liver," a topic that is often a challenge for pre-clinical teachers, in enhancing cognitive processing and ultimately learning. We prepared a video presentation of 11:36 min, incorporating various…
Audio-visual Classification and Fusion of Spontaneous Affect Data in Likelihood Space

NARCIS (Netherlands)

Nicolaou, Mihalis A.; Gunes, Hatice; Pantic, Maja

2010-01-01

This paper focuses on audio-visual (using facial expression, shoulder and audio cues) classification of spontaneous affect, utilising generative models for classification (i) in terms of Maximum Likelihood Classification with the assumption that the generative model structure in the classifier is
A Method to Detect AAC Audio Forgery

Directory of Open Access Journals (Sweden)

Qingzhong Liu

2015-08-01

Full Text Available Advanced Audio Coding (AAC), a standardized lossy compression scheme for digital audio that was designed to be the successor of the MP3 format, generally achieves better sound quality than MP3 at similar bit rates. Although AAC is the default or standard audio format for many devices and AAC audio files may be presented as important digital evidence, methods for authenticating such files are much needed but largely missing. In this paper, we propose a scheme to expose tampered AAC audio streams that are encoded at the same encoding bit rate. Specifically, we design a shift-recompression based method to retrieve the differential features between the re-encoded audio stream at each shift and the original audio stream; a learning classifier is then employed to recognize the different patterns of differential features of the doctored forgery files and the original (untouched) audio files. Experimental results show that our approach is very promising and effective in detecting forgery at the same encoding bit rate on AAC audio streams. Our study also shows that shift-recompression-based differential analysis is very effective for detecting MP3 forgery at the same bit rate.
The effectiveness of mnemonic audio-visual aids in teaching content words to EFL students at a Turkish university

OpenAIRE

Kılınç, A Reha

1996-01-01

Ankara : Institute of Economics and Social Sciences, Bilkent University, 1996. Thesis (Master's) -- Bilkent University, 1996. Includes bibliographical references (leaves 63-67). This experimental study aimed at investigating the effects of mnemonic audio-visual aids on recognition and recall of vocabulary items in comparison to a dictionary-using control group. The study was conducted at Middle East Technical University Department of Basic English. The participants were 64 beginner and u...
Concurrent audio-visual feedback for supporting drivers at intersections: A study using two linked driving simulators.

Science.gov (United States)

Houtenbos, M; de Winter, J C F; Hale, A R; Wieringa, P A; Hagenzieker, M P

2017-04-01

A large portion of road traffic crashes occur at intersections for the reason that drivers lack necessary visual information. This research examined the effects of an audio-visual display that provides real-time sonification and visualization of the speed and direction of another car approaching the crossroads on an intersecting road. The location of red blinking lights (left vs. right on the speedometer) and the lateral input direction of beeps (left vs. right ear in headphones) corresponded to the direction from where the other car approached, and the blink and beep rates were a function of the approaching car's speed. Two driving simulators were linked so that the participant and the experimenter drove in the same virtual world. Participants (N = 25) completed four sessions (two with the audio-visual display on, two with the audio-visual display off), each session consisting of 22 intersections at which the experimenter approached from the left or right and either maintained speed or slowed down. Compared to driving with the display off, the audio-visual display resulted in enhanced traffic efficiency (i.e., greater mean speed, less coasting) while not compromising safety (i.e., the time gap between the two vehicles was equivalent). A post-experiment questionnaire showed that the beeps were regarded as more useful than the lights. It is argued that the audio-visual display is a promising means of supporting drivers until fully automated driving is technically feasible. Copyright © 2016. Published by Elsevier Ltd.
Changes of the Prefrontal EEG (Electroencephalogram) Activities According to the Repetition of Audio-Visual Learning.

Science.gov (United States)

Kim, Yong-Jin; Chang, Nam-Kee

2001-01-01

Investigates the changes of neuronal response according to a four time repetition of audio-visual learning. Obtains EEG data from the prefrontal (Fp1, Fp2) lobe from 20 subjects at the 8th grade level. Concludes that the habituation of neuronal response shows up in repetitive audio-visual learning and brain hemisphericity can be changed by…
On the relative importance of audio and video in the presence of packet losses

DEFF Research Database (Denmark)

Korhonen, Jari; Reiter, Ulrich; Myakotnykh, Eugene

2010-01-01

In streaming applications, unequal protection of audio and video tracks may be necessary to maintain the optimal perceived overall quality. For this purpose, the application should be aware of the relative importance of audio and video in an audiovisual sequence. In this paper, we propose a subjective test arrangement for finding the optimal tradeoff between subjective audio and video qualities in situations when it is not possible to have perfect quality for both modalities concurrently. Our results show that content has a significant impact on the preferred compromise between audio and video quality, but also that the currently used classification criteria for content are not sufficient to predict the users’ preference...
StreamExplorer: A Multi-Stage System for Visually Exploring Events in Social Streams.

Science.gov (United States)

Wu, Yingcai; Chen, Zhutian; Sun, Guodao; Xie, Xiao; Cao, Nan; Liu, Shixia; Cui, Weiwei

2017-10-18

Analyzing social streams is important for many applications, such as crisis management. However, the considerable diversity, increasing volume, and high dynamics of social streams of large events continue to be significant challenges that must be overcome to ensure effective exploration. We propose a novel framework by which to handle complex social streams on a budget PC. This framework features two components: 1) an online method to detect important time periods (i.e., subevents), and 2) a tailored GPU-assisted Self-Organizing Map (SOM) method, which clusters the tweets of subevents stably and efficiently. Based on the framework, we present StreamExplorer to facilitate the visual analysis, tracking, and comparison of a social stream at three levels. At a macroscopic level, StreamExplorer uses a new glyph-based timeline visualization, which presents a quick multi-faceted overview of the ebb and flow of a social stream. At a mesoscopic level, a map visualization is employed to visually summarize the social stream from either a topical or geographical aspect. At a microscopic level, users can employ interactive lenses to visually examine and explore the social stream from different perspectives. Two case studies and a task-based evaluation are used to demonstrate the effectiveness and usefulness of StreamExplorer.
Teacher’s Voice on Metacognitive Strategy Based Instruction Using Audio Visual Aids for Listening

Directory of Open Access Journals (Sweden)

Salasiah Salasiah

2018-02-01

Full Text Available The paper explores the teacher’s voice on the application of a metacognitive strategy with audio-visual aids to improve listening comprehension. The metacognitive strategy model applied in the study was inspired by the Vandergrift and Tafaghodtari (2010) instructional model; it was modified in its procedure and applied with audio-visual aids for improving listening comprehension. The study was set at SMA Negeri 2 Parepare, South Sulawesi Province, Indonesia. The population of the research was the teacher of English at tenth grade at SMAN 2. The sample was taken using a random sampling technique. The data were collected through in-depth interviews during the research, recorded, and analyzed using qualitative analysis. This study explored the teacher’s response to the modified model of metacognitive strategy with audio-visual aids in the listening class, covering positive and negative responses to the strategy applied during the teaching of listening. The results showed that this strategy helped the teacher a lot in teaching listening comprehension, as the procedure has systematic steps toward students’ listening comprehension. It also makes it easier for the teacher to teach listening by making use of audio-visual aids such as videos taken from YouTube.
CREATING AUDIO VISUAL DIALOGUE TASK AS STUDENTS’ SELF ASSESSMENT TO ENHANCE THEIR SPEAKING ABILITY

Directory of Open Access Journals (Sweden)

Novia Trisanti

2017-04-01

Full Text Available The study gives an overview of employing an audio visual dialogue task as a students’ creativity task and self assessment in an EFL speaking class in tertiary education to enhance the students’ speaking ability. The qualitative research was done in one of the speaking classes at the English Department, Semarang State University, Central Java, Indonesia. The results from the self assessment rubric show that the oral performance in the recorded audio visual tasks done by the students as their self assessment gave positive evidence. The audio visual dialogue task can be very beneficial since it can motivate the students’ learning and increase their learning experiences. The self-assessment can be a valuable additional means to improve their speaking ability since it is one of the motives that drive self-evaluation, along with self-verification and self-enhancement.
An introduction to audio content analysis applications in signal processing and music informatics

CERN Document Server

Lerch, Alexander

2012-01-01

"With the proliferation of digital audio distribution over digital media, audio content analysis is fast becoming a requirement for designers of intelligent signal-adaptive audio processing systems. Written by a well-known expert in the field, this book provides quick access to different analysis algorithms and allows comparison between different approaches to the same task, making it useful for newcomers to audio signal processing and industry experts alike. A review of relevant fundamentals in audio signal processing, psychoacoustics, and music theory, as well as downloadable MATLAB files are also included"--
N1 enhancement in synesthesia during visual and audio-visual perception in semantic cross-modal conflict situations: an ERP study

Directory of Open Access Journals (Sweden)

Christopher eSinke

2014-01-01

Full Text Available Synesthesia entails a special kind of sensory perception, where stimulation in one sensory modality leads to an internally generated perceptual experience of another, not stimulated sensory modality. This phenomenon can be viewed as an abnormal multisensory integration process as here the synesthetic percept is aberrantly fused with the stimulated modality. Indeed, recent synesthesia research has focused on multimodal processing even outside of the specific synesthesia-inducing context and has revealed changed multimodal integration, thus suggesting perceptual alterations at a global level. Here, we focused on audio-visual processing in synesthesia using a semantic classification task in combination with visually or auditory-visually presented animated and inanimated objects in an audio-visual congruent and incongruent manner. Fourteen subjects with auditory-visual and/or grapheme-color synesthesia and 14 control subjects participated in the experiment. During presentation of the stimuli, event-related potentials were recorded from 32 electrodes. The analysis of reaction times and error rates revealed no group differences with best performance for audio-visually congruent stimulation indicating the well-known multimodal facilitation effect. We found an enhanced amplitude of the N1 component over occipital electrode sites for synesthetes compared to controls. The differences occurred irrespective of the experimental condition and therefore suggest a global influence on early sensory processing in synesthetes.
Computerized Audio-Visual Instructional Sequences (CAVIS): A Versatile System for Listening Comprehension in Foreign Language Teaching.

Science.gov (United States)

Aleman-Centeno, Josefina R.

1983-01-01

Discusses the development and evaluation of CAVIS, which consists of an Apple microcomputer used with audiovisual dialogs. Includes research on the effects of three conditions: (1) computer with audio and visual, (2) computer with audio alone and (3) audio alone in short-term and long-term recall. (EKN)
Stream.cz a jeho originální seriálová tvorba

OpenAIRE

Vašíčková, Dorota

2017-01-01

In general, the rise of the internet has brought many changes of production and distribution into the audio-visual industry. These changes triggered the development of specific internet portals that offer video content. The thesis focuses on the current development of internet televisions and especially on a particular platform and enriches the current theoretical list of forms with a specific example of a Czech internet television. The case study of the Czech internet television Stream.cz de...
Spatio-temporal distribution of brain activity associated with audio-visually congruent and incongruent speech and the McGurk Effect.

Science.gov (United States)

Pratt, Hillel; Bleich, Naomi; Mittelman, Nomi

2015-11-01

Spatio-temporal distributions of cortical activity to audio-visual presentations of meaningless vowel-consonant-vowels and the effects of audio-visual congruence/incongruence, with emphasis on the McGurk effect, were studied. The McGurk effect occurs when a clearly audible syllable with one consonant, is presented simultaneously with a visual presentation of a face articulating a syllable with a different consonant and the resulting percept is a syllable with a consonant other than the auditorily presented one. Twenty subjects listened to pairs of audio-visually congruent or incongruent utterances and indicated whether pair members were the same or not. Source current densities of event-related potentials to the first utterance in the pair were estimated and effects of stimulus-response combinations, brain area, hemisphere, and clarity of visual articulation were assessed. Auditory cortex, superior parietal cortex, and middle temporal cortex were the most consistently involved areas across experimental conditions. Early (visual cortex. Clarity of visual articulation impacted activity in secondary visual cortex and Wernicke's area. McGurk perception was associated with decreased activity in primary and secondary auditory cortices and Wernicke's area before 100 msec, increased activity around 100 msec which decreased again around 180 msec. Activity in Broca's area was unaffected by McGurk perception and was only increased to congruent audio-visual stimuli 30-70 msec following consonant onset. The results suggest left hemisphere prominence in the effects of stimulus and response conditions on eight brain areas involved in dynamically distributed parallel processing of audio-visual integration. Initially (30-70 msec) subcortical contributions to auditory cortex, superior parietal cortex, and middle temporal cortex occur. During 100-140 msec, peristriate visual influences and Wernicke's area join in the processing. Resolution of incongruent audio-visual inputs is then

Streaming Visual Analytics Workshop Report

Energy Technology Data Exchange (ETDEWEB)

Cook, Kristin A. [Pacific Northwest National Lab. (PNNL), Richland, WA (United States); Burtner, Edwin R. [Pacific Northwest National Lab. (PNNL), Richland, WA (United States); Kritzstein, Brian P. [Pacific Northwest National Lab. (PNNL), Richland, WA (United States); Brisbois, Brooke R. [Pacific Northwest National Lab. (PNNL), Richland, WA (United States); Mitson, Anna E. [Pacific Northwest National Lab. (PNNL), Richland, WA (United States)

2016-03-31

How can we best enable users to understand complex emerging events and make appropriate assessments from streaming data? This was the central question addressed at a three-day workshop on streaming visual analytics. This workshop was organized by Pacific Northwest National Laboratory for a government sponsor. It brought together forty researchers and subject matter experts from government, industry, and academia. This report summarizes the outcomes from that workshop. It describes elements of the vision for a streaming visual analytic environment and set of important research directions needed to achieve this vision. Streaming data analysis is in many ways the analysis and understanding of change. However, current visual analytics systems usually focus on static data collections, meaning that dynamically changing conditions are not appropriately addressed. The envisioned mixed-initiative streaming visual analytics environment creates a collaboration between the analyst and the system to support the analysis process. It raises the level of discourse from low-level data records to higher-level concepts. The system supports the analyst’s rapid orientation and reorientation as situations change. It provides an environment to support the analyst’s critical thinking. It infers tasks and interests based on the analyst’s interactions. The system works as both an assistant and a devil’s advocate, finding relevant data and alerts as well as considering alternative hypotheses. Finally, the system supports sharing of findings with others. Making such an environment a reality requires research in several areas. The workshop discussions focused on four broad areas: support for critical thinking, visual representation of change, mixed-initiative analysis, and the use of narratives for analysis and communication.
Auditory and audio-visual processing in patients with cochlear, auditory brainstem, and auditory midbrain implants: An EEG study.

Science.gov (United States)

Schierholz, Irina; Finke, Mareike; Kral, Andrej; Büchner, Andreas; Rach, Stefan; Lenarz, Thomas; Dengler, Reinhard; Sandmann, Pascale

2017-04-01

There is substantial variability in speech recognition ability across patients with cochlear implants (CIs), auditory brainstem implants (ABIs), and auditory midbrain implants (AMIs). To better understand how this variability is related to central processing differences, the current electroencephalography (EEG) study compared hearing abilities and auditory-cortex activation in patients with electrical stimulation at different sites of the auditory pathway. Three different groups of patients with auditory implants (Hannover Medical School; ABI: n = 6, CI: n = 6; AMI: n = 2) performed a speeded response task and a speech recognition test with auditory, visual, and audio-visual stimuli. Behavioral performance and cortical processing of auditory and audio-visual stimuli were compared between groups. ABI and AMI patients showed prolonged response times on auditory and audio-visual stimuli compared with normal-hearing (NH) listeners and CI patients. This was confirmed by prolonged N1 latencies and reduced N1 amplitudes in ABI and AMI patients. However, patients with central auditory implants showed a remarkable gain in performance when visual and auditory input was combined, in both speech and non-speech conditions, which was reflected by a strong visual modulation of auditory-cortex activation in these individuals. In sum, the results suggest that the behavioral improvement for audio-visual conditions in central auditory implant patients is based on enhanced audio-visual interactions in the auditory cortex. Their findings may provide important implications for the optimization of electrical stimulation and rehabilitation strategies in patients with central auditory prostheses. Hum Brain Mapp 38:2206-2225, 2017. © 2017 Wiley Periodicals, Inc.
Semantics and the multisensory brain: how meaning modulates processes of audio-visual integration.

Science.gov (United States)

Doehrmann, Oliver; Naumer, Marcus J

2008-11-25

By using meaningful stimuli, multisensory research has recently started to investigate the impact of stimulus content on crossmodal integration. Variations in this respect have often been termed as "semantic". In this paper we will review work related to the question for which tasks the influence of semantic factors has been found and which cortical networks are most likely to mediate these effects. More specifically, the focus of this paper will be on processing of object stimuli presented in the auditory and visual sensory modalities. Furthermore, we will investigate which cortical regions are particularly responsive to experimental variations of content by comparing semantically matching ("congruent") and mismatching ("incongruent") experimental conditions. In this context, recent neuroimaging studies point toward a possible functional differentiation of temporal and frontal cortical regions, with the former being more responsive to semantically congruent and the latter to semantically incongruent audio-visual (AV) stimulation. To account for these differential effects, we will suggest in the final section of this paper a possible synthesis of these data on semantic modulation of AV integration with findings from neuroimaging studies and theoretical accounts of semantic memory.
Effects of audio-visual aids on foreign language test anxiety, reading and listening comprehension, and retention in EFL learners.

Science.gov (United States)

Lee, Shu-Ping; Lee, Shin-Da; Liao, Yuan-Lin; Wang, An-Chi

2015-04-01

This study examined the effects of audio-visual aids on anxiety, comprehension test scores, and retention in reading and listening to short stories in English as a Foreign Language (EFL) classrooms. Reading and listening tests, general and test anxiety, and retention were measured in English-major college students in an experimental group with audio-visual aids (n=83) and a control group without audio-visual aids (n=94) with similar general English proficiency. Lower reading test anxiety, unchanged reading comprehension scores, and better reading short-term and long-term retention after four weeks were evident in the audiovisual group relative to the control group. In addition, lower listening test anxiety, higher listening comprehension scores, and unchanged short-term and long-term retention were found in the audiovisual group relative to the control group after the intervention. Audio-visual aids may help to reduce EFL learners' listening test anxiety and enhance their listening comprehension scores without facilitating retention of such materials. Although audio-visual aids did not increase reading comprehension scores, they helped reduce EFL learners' reading test anxiety and facilitated retention of reading materials.
An Annotated Guide to Audio-Visual Materials for Teaching Shakespeare.

Science.gov (United States)

Albert, Richard N.

Audio-visual materials, found in a variety of periodicals, catalogs, and reference works, are listed in this guide to expedite the process of finding appropriate classroom materials for a study of William Shakespeare in the classroom. Separate listings of films, filmstrips, and recordings are provided, with subdivisions for "The Plays"…
A scheme for racquet sports video analysis with the combination of audio-visual information

Science.gov (United States)

Xing, Liyuan; Ye, Qixiang; Zhang, Weigang; Huang, Qingming; Yu, Hua

2005-07-01

As a very important category in sports video, racquet sports video, e.g. table tennis, tennis and badminton, has been paid little attention in the past years. Considering the characteristics of this kind of sports video, we propose a new scheme for structure indexing and highlight generating based on the combination of audio and visual information. Firstly, a supervised classification method is employed to detect important audio symbols including impact (ball hit), audience cheers, commentator speech, etc. Meanwhile an unsupervised algorithm is proposed to group video shots into various clusters. Then, by taking advantage of temporal relationship between audio and visual signals, we can specify the scene clusters with semantic labels including rally scenes and break scenes. Thirdly, a refinement procedure is developed to reduce false rally scenes by further audio analysis. Finally, an exciting model is proposed to rank the detected rally scenes from which many exciting video clips such as game (match) points can be correctly retrieved. Experiments on two types of representative racquet sports video, table tennis video and tennis video, demonstrate encouraging results.
Comparison of animated jet stream visualizations

Science.gov (United States)

Nocke, Thomas; Hoffmann, Peter

2016-04-01

The visualization of 3D atmospheric phenomena in space and time is still a challenging problem. In particular, multiple solutions of animated jet stream visualizations have been produced in recent years, which were designed to visually analyze and communicate the jet and related impacts on weather circulation patterns and extreme weather events. This PICO integrates popular and new jet animation solutions and inter-compares them. The applied techniques (e.g. stream lines or line integral convolution) and parametrizations (color mapping, line lengths) are discussed with respect to visualization quality criteria and their suitability for certain visualization tasks (e.g. jet patterns and jet anomaly analysis, communicating its relevance for climate change).
Modular Sensor Environment : Audio Visual Industry Monitoring Applications

OpenAIRE

Guillot, Calvin

2017-01-01

This work was made for Electro Waves Oy. The company specializes in Audio-visual services and interactive systems. The purpose of this work is to design and implement a modular sensor environment for the company, which will be used for developing automated systems. This thesis begins with an introduction to sensor systems and their different topologies. It is followed by an introduction to the technologies used in this project. The system is divided in three parts. The client, tha...
The Dynamics and Neural Correlates of Audio-Visual Integration Capacity as Determined by Temporal Unpredictability, Proactive Interference, and SOA.

Science.gov (United States)

Wilbiks, Jonathan M P; Dyson, Benjamin J

2016-01-01

Over 5 experiments, we challenge the idea that the capacity of audio-visual integration need be fixed at 1 item. We observe that the conditions under which audio-visual integration is most likely to exceed 1 occur when stimulus change operates at a slow rather than fast rate of presentation and when the task is of intermediate difficulty such as when low levels of proactive interference (3 rather than 8 interfering visual presentations) are combined with the temporal unpredictability of the critical frame (Experiment 2), or, high levels of proactive interference are combined with the temporal predictability of the critical frame (Experiment 4). Neural data suggest that capacity might also be determined by the quality of perceptual information entering working memory. Experiment 5 supported the proposition that audio-visual integration was at play during the previous experiments. The data are consistent with the dynamic nature usually associated with cross-modal binding, and while audio-visual integration capacity likely cannot exceed uni-modal capacity estimates, performance may be better than being able to associate only one visual stimulus with one auditory stimulus.
Independent Interactive Inquiry-Based Learning Modules Using Audio-Visual Instruction In Statistics

OpenAIRE

McDaniel, Scott N.; Green, Lisa

2012-01-01

Simulations can make complex ideas easier for students to visualize and understand. It has been shown that guidance in the use of these simulations enhances students’ learning. This paper describes the implementation and evaluation of the Independent Interactive Inquiry-based (I3) Learning Modules, which use existing open-source Java applets, combined with audio-visual instruction. Students are guided to discover and visualize important concepts in post-calculus and algebra-based courses in p...
Impact of audio-visual storytelling in simulation learning experiences of undergraduate nursing students.

Science.gov (United States)

Johnston, Sandra; Parker, Christina N; Fox, Amanda

2017-09-01

Use of high fidelity simulation has become increasingly popular in nursing education to the extent that it is now an integral component of most nursing programs. Anecdotal evidence suggests that students have difficulty engaging with simulation manikins due to their unrealistic appearance. Introduction of the manikin as a 'real patient' with the use of an audio-visual narrative may engage students in the simulated learning experience and impact on their learning. A paucity of literature currently exists on the use of audio-visual narratives to enhance simulated learning experiences. This study aimed to determine if viewing an audio-visual narrative during a simulation pre-brief altered undergraduate nursing student perceptions of the learning experience. A quasi-experimental post-test design was utilised. A convenience sample of final year baccalaureate nursing students at a large metropolitan university. Participants completed a modified version of the Student Satisfaction with Simulation Experiences survey. This 12-item questionnaire contained questions relating to the ability to transfer skills learned in simulation to the real clinical world, the realism of the simulation and the overall value of the learning experience. Descriptive statistics were used to summarise demographic information. Two tailed, independent group t-tests were used to determine statistical differences within the categories. Findings indicated that students reported high levels of value, realism and transferability in relation to the viewing of an audio-visual narrative. Statistically significant results (t=2.38, psimulation to clinical practice. The subgroups of age and gender although not significant indicated some interesting results. High satisfaction with simulation was indicated by all students in relation to value and realism. There was a significant finding in relation to transferability on knowledge and this is vital to quality educational outcomes. Copyright © 2017. Published by
Tracing Trajectories of Audio-Visual Learning in the Infant Brain

Science.gov (United States)

Kersey, Alyssa J.; Emberson, Lauren L.

2017-01-01

Although infants begin learning about their environment before they are born, little is known about how the infant brain changes during learning. Here, we take the initial steps in documenting how the neural responses in the brain change as infants learn to associate audio and visual stimuli. Using functional near-infrared spectroscopy (fNIRS) to…
Real Time Recognition Of Speakers From Internet Audio Stream

Directory of Open Access Journals (Sweden)

Weychan Radoslaw

2015-09-01

Full Text Available In this paper we present an automatic speaker recognition technique with the use of the Internet radio lossy (encoded) speech signal streams. We show the influence of the audio encoder (e.g., bitrate) on the speaker model quality. The model of each speaker was calculated with the use of the Gaussian mixture model (GMM) approach. Both the speaker recognition and the further analysis were realized with the use of short utterances to facilitate real time processing. The neighborhoods of the speaker models were analyzed with the use of the ISOMAP algorithm. The experiments were based on four 1-hour public debates with 7–8 speakers (including the moderator), acquired from the Polish radio Internet services. The presented software was developed with the MATLAB environment.
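As a rough illustration of the GMM scoring step mentioned in this abstract, the sketch below picks the speaker whose mixture model gives the highest average log-likelihood for a short utterance. It is not the authors' MATLAB software: the one-dimensional features, the hand-written model parameters and the names `gmm_log_likelihood` and `identify_speaker` are hypothetical, and model training is assumed to have happened elsewhere.

```python
import math


def gmm_log_likelihood(frames, weights, means, variances):
    """Average log-likelihood of 1-D feature frames under a Gaussian mixture."""
    total = 0.0
    for x in frames:
        density = 0.0
        for w, mu, var in zip(weights, means, variances):
            norm = math.sqrt(2.0 * math.pi * var)
            density += w * math.exp(-((x - mu) ** 2) / (2.0 * var)) / norm
        total += math.log(density)
    return total / len(frames)


def identify_speaker(frames, models):
    """Return the name of the model with the highest average log-likelihood."""
    return max(models, key=lambda name: gmm_log_likelihood(frames, *models[name]))


# Hypothetical models: (weights, means, variances) for each speaker.
models = {
    "speaker_a": ([0.5, 0.5], [0.0, 1.0], [0.25, 0.25]),
    "speaker_b": ([0.5, 0.5], [4.0, 5.0], [0.25, 0.25]),
}
utterance = [4.2, 4.9, 5.1, 3.8]  # toy feature values for one short utterance
best = identify_speaker(utterance, models)
print(best)
```

Averaging over frames keeps scores comparable between utterances of different lengths, which matters when only short utterances are available.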
Spectacular Attractions: Museums, Audio-Visuals and the Ghosts of Memory

Directory of Open Access Journals (Sweden)

Mandelli Elisa

2015-12-01

Full Text Available In the last decades, moving images have become a common feature not only in art museums, but also in a wide range of institutions devoted to the conservation and transmission of memory. This paper focuses on the role of audio-visuals in the exhibition design of history and memory museums, arguing that they are privileged means to achieve the spectacular effects and the visitors’ emotional and “experiential” engagement that constitute the main objective of contemporary museums. I will discuss this topic through the concept of “cinematic attraction,” claiming that when embedded in displays, films and moving images often produce spectacular mises en scène with immersive effects, creating wonder and astonishment, and involving visitors on an emotional, visceral and physical level. Moreover, I will consider the diffusion of audio-visual witnesses of real or imaginary historical characters, presented in Phantasmagoria-like displays that simulate ghostly and uncanny apparitions, creating an ambiguous and often problematic coexistence of truth and illusion, subjectivity and objectivity, facts and imagination.
Finding the Correspondence of Audio-Visual Events by Object Manipulation

Science.gov (United States)

Nishibori, Kento; Takeuchi, Yoshinori; Matsumoto, Tetsuya; Kudo, Hiroaki; Ohnishi, Noboru

A human being understands the objects in the environment by integrating information obtained by the senses of sight, hearing and touch. In this integration, active manipulation of objects plays an important role. We propose a method for finding the correspondence of audio-visual events by manipulating an object. The method uses the general grouping rules in Gestalt psychology, i.e. “simultaneity” and “similarity” among motion command, sound onsets and motion of the object in images. In experiments, we used a microphone, a camera, and a robot which has a hand manipulator. The robot grasps an object like a bell and shakes it, or grasps an object like a stick and beats a drum, in a periodic or non-periodic motion. The object then emits periodic or non-periodic events. To create a more realistic scenario, we put another event source (a metronome) in the environment. As a result, we had a success rate of 73.8 percent in finding the correspondence between audio-visual events (afferent signal) that are related to robot motion (efferent signal).
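The "simultaneity" rule in this abstract can be sketched as a simple onset-matching score. This is a minimal illustration and not the authors' method: the tolerance value, the onset times and the function names `simultaneity_score` and `best_matching_source` are assumptions made for the example.

```python
def simultaneity_score(motion_onsets, sound_onsets, tolerance=0.05):
    """Fraction of motion onsets with a sound onset within `tolerance` seconds."""
    if not motion_onsets:
        return 0.0
    matched = sum(
        1
        for m in motion_onsets
        if any(abs(m - s) <= tolerance for s in sound_onsets)
    )
    return matched / len(motion_onsets)


def best_matching_source(motion_onsets, sources, tolerance=0.05):
    """Name of the sound source whose onsets best coincide with the motion."""
    return max(
        sources,
        key=lambda name: simultaneity_score(motion_onsets, sources[name], tolerance),
    )


# Hypothetical onset times in seconds.
motion = [0.50, 1.10, 1.90, 2.40]  # non-periodic shaking commands
sources = {
    "bell": [0.52, 1.13, 1.92, 2.43],  # follows the robot motion
    "metronome": [0.25, 0.75, 1.25, 1.75, 2.25, 2.75],  # independent distractor
}
best = best_matching_source(motion, sources)
print(best)
```

A non-periodic motion makes the distractor easier to reject, because a periodic metronome is unlikely to line up with irregular onsets by chance.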
A Joint Audio-Visual Approach to Audio Localization

DEFF Research Database (Denmark)

Jensen, Jesper Rindom; Christensen, Mads Græsbøll

2015-01-01

Localization of audio sources is an important research problem, e.g., to facilitate noise reduction. In recent years, the problem has been tackled using distributed microphone arrays (DMA). A common approach is to apply direction-of-arrival (DOA) estimation on each array (denoted as nodes), a...... time-of-flight cameras. Moreover, we propose an optimal method for weighting such DOA and range information for audio localization. Our experiments on both synthetic and real data show that there is a clear, potential advantage of using the joint audiovisual localization framework....
The Dynamics and Neural Correlates of Audio-Visual Integration Capacity as Determined by Temporal Unpredictability, Proactive Interference, and SOA.

Directory of Open Access Journals (Sweden)

Jonathan M P Wilbiks

Full Text Available Over 5 experiments, we challenge the idea that the capacity of audio-visual integration need be fixed at 1 item. We observe that the conditions under which audio-visual integration is most likely to exceed 1 occur when stimulus change operates at a slow rather than fast rate of presentation and when the task is of intermediate difficulty, such as when low levels of proactive interference (3 rather than 8 interfering visual presentations) are combined with the temporal unpredictability of the critical frame (Experiment 2), or high levels of proactive interference are combined with the temporal predictability of the critical frame (Experiment 4). Neural data suggest that capacity might also be determined by the quality of perceptual information entering working memory. Experiment 5 supported the proposition that audio-visual integration was at play during the previous experiments. The data are consistent with the dynamic nature usually associated with cross-modal binding, and while audio-visual integration capacity likely cannot exceed uni-modal capacity estimates, performance may be better than being able to associate only one visual stimulus with one auditory stimulus.
The effect of visualizing the flow of multimedia content among and inside devices.

Science.gov (United States)

Lee, Dong-Seok

2009-05-01

This study introduces a user interface, referred to as the flow interface, which provides a graphical representation of the movement of content among and inside audio/video devices. The proposed interface provides a different frame of reference with content-oriented visualization of the generation, manipulation, storage, and display of content as well as input and output. The flow interface was applied to a VCR/DVD recorder combo, one of the most complicated consumer products. A between-group experiment was performed to determine whether the flow interface helps users to perform various tasks and to examine the learning effect of the flow interface, particularly in regard to hooking up and recording tasks. The results showed that participants with access to the flow interface performed better in terms of success rate and elapsed time. In addition, the participants indicated that they could easily understand the flow interface. The potential of the flow interface for application to other audio video devices, and design issues requiring further consideration, are discussed.
No, there is no 150 ms lead of visual speech on auditory speech, but a range of audiovisual asynchronies varying from small audio lead to large audio lag.

Directory of Open Access Journals (Sweden)

Jean-Luc Schwartz

2014-07-01

Full Text Available An increasing number of neuroscience papers capitalize on the assumption published in this journal that visual speech would be typically 150 ms ahead of auditory speech. It happens that the estimation of audiovisual asynchrony in the reference paper is valid only in very specific cases, for isolated consonant-vowel syllables or at the beginning of a speech utterance, in what we call "preparatory gestures". However, when syllables are chained in sequences, as they are typically in most parts of a natural speech utterance, asynchrony should be defined in a different way. This is what we call "comodulatory gestures" providing auditory and visual events more or less in synchrony. We provide audiovisual data on sequences of plosive-vowel syllables (pa, ta, ka, ba, da, ga, ma, na) showing that audiovisual synchrony is actually rather precise, varying between 20 ms audio lead and 70 ms audio lag. We show how more complex speech material should result in a range typically varying between 40 ms audio lead and 200 ms audio lag, and we discuss how this natural coordination is reflected in the so-called temporal integration window for audiovisual speech perception. Finally we present a toy model of auditory and audiovisual predictive coding, showing that visual lead is actually not necessary for visual prediction.
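The two asynchrony ranges quoted in this abstract are easy to mix up because "lead" and "lag" point in opposite directions. The sketch below encodes them under one assumed sign convention (negative values mean audio lead, positive values mean audio lag); the constant and function names and the example onset times are hypothetical.

```python
# Ranges from the abstract, in milliseconds.
# Assumed convention for this sketch: negative = audio lead, positive = audio lag.
SYLLABLE_SEQUENCE_RANGE = (-20, 70)   # chained plosive-vowel syllables
COMPLEX_SPEECH_RANGE = (-40, 200)     # more complex speech material


def audio_lag_ms(audio_onset_ms, visual_onset_ms):
    """Signed asynchrony: positive when the audio event follows the visual one."""
    return audio_onset_ms - visual_onset_ms


def within_range(lag_ms, bounds):
    """True when the signed asynchrony falls inside the inclusive bounds."""
    low, high = bounds
    return low <= lag_ms <= high


lag = audio_lag_ms(audio_onset_ms=1150, visual_onset_ms=1000)
print(lag, within_range(lag, SYLLABLE_SEQUENCE_RANGE), within_range(lag, COMPLEX_SPEECH_RANGE))
```

Under this convention, the 150 ms visual lead assumed in earlier work corresponds to a lag of +150 ms, which lies outside the syllable-sequence range but inside the wider range for complex speech.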
Attention to affective audio-visual information: Comparison between musicians and non-musicians

NARCIS (Netherlands)

Weijkamp, J.; Sadakata, M.

2017-01-01

Individuals with more musical training repeatedly demonstrate enhanced auditory perception abilities. The current study examined how these enhanced auditory skills interact with attention to affective audio-visual stimuli. A total of 16 participants with more than 5 years of musical training

Attitude of medical students towards the use of audio visual aids during didactic lectures in pharmacology in a medical college of central India

OpenAIRE

Mehul Agrawal; Rajanish Kumar Sankdia

2016-01-01

Background: Students favour teaching methods employing audio visual aids over didactic lectures not using these aids. However, the optimum use of audio visual aids is essential for deriving their benefits. During a lecture, both the visual and auditory senses are used to absorb information. Different methods of lecture are: chalk and board, PowerPoint presentations (PPT) and a mix of aids. This study was done to know the students' preference regarding the various audio visual aids, ...
StreamSqueeze: a dynamic stream visualization for monitoring of event data

Science.gov (United States)

Mansmann, Florian; Krstajic, Milos; Fischer, Fabian; Bertini, Enrico

2012-01-01

While in clear-cut situations automated analytical solutions for data streams are already in place, only few visual approaches have been proposed in the literature for exploratory analysis tasks on dynamic information. However, due to the competitive or security-related advantages that real-time information gives in domains such as finance, business or networking, we are convinced that there is a need for exploratory visualization tools for data streams. Under the conditions that new events have higher relevance and that smooth transitions enable traceability of items, we propose a novel dynamic stream visualization called StreamSqueeze. In this technique the degree of interest of recent items is expressed through an increase in size and thus recent events can be shown with more details. The technique has two main benefits: First, the layout algorithm arranges items in several lists of various sizes and optimizes the positions within each list so that the transition of an item from one list to the other triggers the least visual change. Second, the animation scheme ensures that for 50 percent of the time an item has a static screen position where reading is most effective and then continuously shrinks and moves to its next static position in the subsequent list. To demonstrate the capability of our technique, we apply it to large and high-frequency news and syslog streams and show how it maintains optimal stability of the layout under the conditions given above.
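The idea of several lists in which recent items are drawn larger can be sketched as a cascade of fixed-capacity queues. This is a simplified illustration only: it omits the position optimization and the animation scheme described in the abstract, and the class name `SqueezeLists`, the capacities and the display sizes are hypothetical.

```python
from collections import deque


class SqueezeLists:
    """Cascade of fixed-capacity lists; newer items sit in earlier, larger lists."""

    def __init__(self, capacities, sizes):
        assert len(capacities) == len(sizes)
        self.capacities = capacities
        self.sizes = sizes  # display size used for items in each list
        self.lists = [deque() for _ in capacities]

    def push(self, item):
        """Insert a new item; return the item dropped from the last list, if any."""
        carry = item
        for level, capacity in enumerate(self.capacities):
            self.lists[level].appendleft(carry)
            if len(self.lists[level]) <= capacity:
                return None
            carry = self.lists[level].pop()  # oldest item moves to the next list
        return carry

    def layout(self):
        """Items from newest to oldest, each paired with its display size."""
        return [
            (item, self.sizes[level])
            for level, items in enumerate(self.lists)
            for item in items
        ]


stream = SqueezeLists(capacities=[1, 2, 4], sizes=[3.0, 2.0, 1.0])
for event in ["e1", "e2", "e3", "e4"]:
    stream.push(event)
print(stream.layout())
```

Each push moves at most one item per list, which is one simple way to keep the number of items changing position small.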
Identification of Sparse Audio Tampering Using Distributed Source Coding and Compressive Sensing Techniques

Directory of Open Access Journals (Sweden)

Valenzise G

2009-01-01

Full Text Available In the past few years, a large number of techniques have been proposed to identify whether multimedia content has been illegally tampered with or not. Nevertheless, very few efforts have been devoted to identifying which kind of attack has been carried out, especially due to the large amount of data required for this task. We propose a novel hashing scheme which exploits the paradigms of compressive sensing and distributed source coding to generate a compact hash signature, and we apply it to the case of audio content protection. The audio content provider produces a small hash signature by computing a limited number of random projections of a perceptual, time-frequency representation of the original audio stream; the audio hash is given by the syndrome bits of an LDPC code applied to the projections. At the content user side, the hash is decoded using distributed source coding tools. If the tampering is sparsifiable or compressible in some orthonormal basis or redundant dictionary, it is possible to identify the time-frequency position of the attack, with a hash size as small as 200 bits/second; the bit saving obtained by introducing distributed source coding ranges between 20% and 70%.
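The random-projection step of the hashing scheme can be illustrated with a much simpler stand-in. The sketch below keeps only the sign of each projection as a hash bit; it does not implement the LDPC syndrome coding, the distributed source decoding or the tampering localization described in the abstract. The feature values, the shared seed and the function names are assumptions for illustration.

```python
import random


def projection_hash(features, n_bits, seed=0):
    """Sign bits of `n_bits` Gaussian random projections of a feature vector.

    The seed is assumed to be shared, so that the content provider and the
    content user draw identical projection directions.
    """
    rng = random.Random(seed)
    bits = []
    for _ in range(n_bits):
        direction = [rng.gauss(0.0, 1.0) for _ in features]
        projection = sum(d * f for d, f in zip(direction, features))
        bits.append(1 if projection >= 0.0 else 0)
    return bits


def hamming_distance(a, b):
    """Number of positions at which two equal-length bit lists differ."""
    return sum(x != y for x, y in zip(a, b))


# Hypothetical time-frequency features for one block of audio.
original = [0.9, 0.1, 0.4, 0.7, 0.2, 0.8, 0.3, 0.6]
tampered = list(original)
tampered[3] = -5.0  # a localized modification

reference_hash = projection_hash(original, n_bits=32)
print(hamming_distance(reference_hash, projection_hash(original, n_bits=32)))
print(hamming_distance(reference_hash, projection_hash(tampered, n_bits=32)))
```

An unmodified block always reproduces the reference hash, so any nonzero distance signals a change; locating the change would need the sparse-recovery machinery the abstract refers to.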
Video Streaming in Online Learning

Science.gov (United States)

Hartsell, Taralynn; Yuen, Steve Chi-Yin

2006-01-01

The use of video in teaching and learning is a common practice in education today. As learning online becomes more of a common practice in education, streaming video and audio will play a bigger role in delivering course materials to online learners. This form of technology brings courses alive by allowing online learners to use their visual and…
Neural entrainment to rhythmically-presented auditory, visual and audio-visual speech in children

Directory of Open Access Journals (Sweden)

Alan James Power

2012-07-01

Full Text Available Auditory cortical oscillations have been proposed to play an important role in speech perception. It is suggested that the brain may take temporal ‘samples’ of information from the speech stream at different rates, phase-resetting ongoing oscillations so that they are aligned with similar frequency bands in the input (‘phase locking’). Information from these frequency bands is then bound together for speech perception. To date, there are no explorations of neural phase-locking and entrainment to speech input in children. However, it is clear from studies of language acquisition that infants use both visual speech information and auditory speech information in learning. In order to study neural entrainment to speech in typically-developing children, we use a rhythmic entrainment paradigm (underlying 2 Hz or delta rate) based on repetition of the syllable ba, presented in either the auditory modality alone, the visual modality alone, or as auditory-visual speech (via a talking head). To ensure attention to the task, children aged 13 years were asked to press a button as fast as possible when the ba stimulus violated the rhythm for each stream type. Rhythmic violation depended on delaying the occurrence of a ba in the isochronous stream. Neural entrainment was demonstrated for all stream types, and individual differences in standardized measures of language processing were related to auditory entrainment at the theta rate. Further, there was significant modulation of the preferred phase of auditory entrainment in the theta band when visual speech cues were present, indicating cross-modal phase resetting. The rhythmic entrainment paradigm developed here offers a method for exploring individual differences in oscillatory phase locking during development. In particular, a method for assessing neural entrainment and cross-modal phase resetting would be useful for exploring developmental learning difficulties thought to involve temporal sampling
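The stimulus timing of such a paradigm can be sketched in a few lines: an isochronous stream at 2 Hz has one onset every 500 ms, and a rhythmic violation delays a single onset. The size of the delay, the choice to leave later onsets on the original grid and the function name `isochronous_onsets` are assumptions for illustration, since the abstract does not specify them.

```python
def isochronous_onsets(n_stimuli, rate_hz=2.0, violation_index=None, delay_ms=0.0):
    """Onset times in ms for an isochronous stream, optionally delaying one stimulus.

    Assumption: only the violating stimulus is shifted; later stimuli stay on
    the original grid.
    """
    period_ms = 1000.0 / rate_hz
    onsets = [i * period_ms for i in range(n_stimuli)]
    if violation_index is not None:
        onsets[violation_index] += delay_ms
    return onsets


# Six repetitions of the syllable, with the fifth one delayed by a hypothetical 150 ms.
onsets = isochronous_onsets(6, rate_hz=2.0, violation_index=4, delay_ms=150.0)
print(onsets)
```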
Correspondence between audio and visual deep models for musical instrument detection in video recordings

OpenAIRE

Slizovskaia, Olga; Gómez, Emilia; Haro, Gloria

2017-01-01

This work aims at investigating cross-modal connections between audio and video sources in the task of musical instrument recognition. We also address in this work the understanding of the representations learned by convolutional neural networks (CNNs) and we study feature correspondence between audio and visual components of a multimodal CNN architecture. For each instrument category, we select the most activated neurons and investigate existing cross-correlations between neurons from the ...
BDVC (Bimodal Database of Violent Content): A database of violent audio and video

Science.gov (United States)

Rivera Martínez, Jose Luis; Mijes Cruz, Mario Humberto; Rodríguez Vázqu, Manuel Antonio; Rodríguez Espejo, Luis; Montoya Obeso, Abraham; García Vázquez, Mireya Saraí; Ramírez Acosta, Alejandro Álvaro

2017-09-01

Nowadays there is a trend towards the use of unimodal databases for multimedia content description, organization and retrieval applications of a single type of content like text, voice and images. Bimodal databases, by contrast, allow two different types of content, such as audio-video or image-text, to be associated semantically. The generation of a bimodal database of audio-video implies the creation of a connection between the multimedia content through the semantic relation that associates the actions of both types of information. This paper describes in detail the characteristics and methodology used for the creation of the bimodal database of violent content; the semantic relationship is established by the proposed concepts that describe the audiovisual information. The use of bimodal databases in applications related to audiovisual content processing allows an increase in semantic performance if and only if these applications process both types of content. This bimodal database comprises 580 annotated audiovisual segments, with a duration of 28 minutes, divided into 41 classes. Bimodal databases are a tool in the generation of applications for the semantic web.
Concurrent audio-visual feedback for supporting drivers at intersections : a study using two linked driving simulators.

NARCIS (Netherlands)

Houtenbos, M.; Winter, J.C.F. de; Hale, A.R.; Wieringa, P.A.; Hagenzieker, M.P.

2016-01-01

A large portion of road traffic crashes occur at intersections for the reason that drivers lack necessary visual information. This research examined the effects of an audio-visual display that provides real-time sonification and visualization of the speed and direction of another car approaching the
Effect of Nicotine on Audio and Visual Reaction Time in Dipping ...

African Journals Online (AJOL)

Nicotine through blood is harmful, and as there are fewer studies in India on nicotine's influence on reaction time, especially in smokeless tobacco users, we studied this. Reaction time is a measure of the sensorimotor integration in a person. We used a PC 1000 Hz reaction timer to record the audio and visual ...
Museums for all: evaluation of an audio descriptive guide for visually impaired visitors at the science museum

Directory of Open Access Journals (Sweden)

Silvia Soler Gallego

2014-12-01

Full Text Available Translation and interpreting are valuable tools to improve accessibility at museums. These tools permit the museum to communicate with visitors with different capabilities. The aim of this article is to show the results of a study carried out within the TACTO project, aimed at creating and evaluating an audio descriptive guide for visually impaired visitors at the Science Museum of Granada. The project focused on the linguistic aspects of the guide’s contents and its evaluation, which combined participatory observation with a survey and interview. The results from this study allow us to conclude that the proposed design improves visually impaired visitors’ access to the museum. However, the expectations and specific needs of each visitor change considerably depending on individual factors such as their level of disability and museum visiting habits.
Insects and the Kafkaesque: Insectuous Re-Writings in Visual and Audio-Visual Media

Directory of Open Access Journals (Sweden)

Damianos Grammatikopoulos

2017-09-01

Full Text Available In this article, I examine techniques at work in visual and audio-visual media that deal with the creative imitation of central Kafkan themes, particularly those related to hybrid insects and bodily deformity. In addition, the opening section of my study offers a detailed and thorough discussion of the concept of the “Kafkaesque”, and an attempt will be made to circumscribe its signifying limits. The main objective of the study is to explore the relationship between Kafka’s texts and the works of contemporary cartoonists, illustrators (Charles Burns), and filmmakers (David Cronenberg), and identify themes and motifs that they have in common. My approach is informed by transtextual practices and source studies, and I draw systematically on Gerard Genette’s Palimpsests and Harold Bloom’s The Anxiety of Influence.
Two different streams form the dorsal visual system: anatomy and functions.

Science.gov (United States)

Rizzolatti, Giacomo; Matelli, Massimo

2003-11-01

There are two radically different views on the functional role of the dorsal visual stream. One considers it as a system involved in space perception. The other is of a system that codes visual information for action organization. On the basis of new anatomical data and a reconsideration of previous functional and clinical data, we propose that the dorsal stream and its recipient parietal areas form two distinct functional systems: the dorso-dorsal stream (d-d stream) and the ventro-dorsal stream (v-d stream). The d-d stream is formed by area V6 (main d-d extrastriate visual node) and areas V6A and MIP of the superior parietal lobule. Its major functional role is the control of actions "on line". Its damage leads to optic ataxia. The v-d stream is formed by area MT (main v-d extrastriate visual node) and by the visual areas of the inferior parietal lobule. Like the d-d stream, the v-d stream is responsible for action organization. It, however, also plays a crucial role in space perception and action understanding. The putative mechanisms linking action and perception in the v-d stream are discussed.
Primary School Pupils' Response to Audio-Visual Learning Process in Port-Harcourt

Science.gov (United States)

Olube, Friday K.

2015-01-01

The purpose of this study is to examine primary school children's response on the use of audio-visual learning processes--a case study of Chokhmah International Academy, Port-Harcourt (owned by Salvation Ministries). It looked at the elements that enhance pupils' response to educational television programmes and their hindrances to these…
Deep learning, audio adversaries, and music content analysis

DEFF Research Database (Denmark)

Kereliuk, Corey Mose; Sturm, Bob L.; Larsen, Jan

2015-01-01

We present the concept of adversarial audio in the context of deep neural networks (DNNs) for music content analysis. An adversary is an algorithm that makes minor perturbations to an input that cause major repercussions to the system response. In particular, we design an adversary for a DNN...... that takes as input short-time spectral magnitudes of recorded music and outputs a high-level music descriptor. We demonstrate how this adversary can make the DNN behave in any way with only extremely minor changes to the music recording signal. We show that the adversary cannot be neutralised by a simple...... filtering of the input. Finally, we discuss adversaries in the broader context of the evaluation of music content analysis systems....
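The definition of an adversary given in this abstract (minor perturbations to the input that cause major changes in the system response) can be shown on a toy model. The sketch below uses a small linear classifier as a stand-in for the DNN and a sign-of-gradient step as the perturbation. Both are swapped-in simplifications, not the adversary designed by the authors, and the weights, input values and `epsilon` are hypothetical.

```python
def score(weights, bias, x):
    """Decision score of a linear classifier; a positive score means class A."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias


def sign(value):
    """Return -1, 0 or 1 according to the sign of `value`."""
    return (value > 0) - (value < 0)


def adversarial_example(weights, bias, x, epsilon):
    """Move every input bin by at most `epsilon` towards the opposite class.

    For a linear model the gradient of the score with respect to the input is
    the weight vector, so the step direction is the sign of each weight.
    """
    direction = -1 if score(weights, bias, x) > 0 else 1
    return [xi + direction * epsilon * sign(w) for w, xi in zip(weights, x)]


# Hypothetical model and spectral-magnitude input.
weights = [0.5, -0.25, 0.75, -0.5]
bias = 0.0
x = [1.0, 2.0, 1.0, 1.0]
x_adv = adversarial_example(weights, bias, x, epsilon=0.25)
print(score(weights, bias, x), score(weights, bias, x_adv))
```

In this toy case each bin changes by 0.25 and the decision flips; with many more input dimensions the same effect can be reached with a far smaller change per bin, because the small shifts add up across all dimensions.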
Unimodal Learning Enhances Crossmodal Learning in Robotic Audio-Visual Tracking

DEFF Research Database (Denmark)

Shaikh, Danish; Bodenhagen, Leon; Manoonpong, Poramate

2017-01-01

Crossmodal sensory integration is a fundamental feature of the brain that aids in forming a coherent and unified representation of observed events in the world. Spatiotemporally correlated sensory stimuli brought about by rich sensorimotor experiences drive the development of crossmodal integrat...... a non-holonomic robotic agent towards a moving audio-visual target. Simulation results demonstrate that unimodal learning enhances crossmodal learning and improves both the overall accuracy and precision of multisensory orientation response....
Unimodal Learning Enhances Crossmodal Learning in Robotic Audio-Visual Tracking

DEFF Research Database (Denmark)

Shaikh, Danish; Bodenhagen, Leon; Manoonpong, Poramate

2018-01-01

Crossmodal sensory integration is a fundamental feature of the brain that aids in forming a coherent and unified representation of observed events in the world. Spatiotemporally correlated sensory stimuli brought about by rich sensorimotor experiences drive the development of crossmodal integrat...... a non-holonomic robotic agent towards a moving audio-visual target. Simulation results demonstrate that unimodal learning enhances crossmodal learning and improves both the overall accuracy and precision of multisensory orientation response....
Audio visual information materials for risk communication

International Nuclear Information System (INIS)

Gunji, Ikuko; Tabata, Rimiko; Ohuchi, Naomi

2005-07-01

Japan Nuclear Cycle Development Institute (JNC), Tokai Works set up the Risk Communication Study Team in January, 2001 to promote mutual understanding between the local residents and JNC. The Team has studied risk communication from various viewpoints and developed new methods of public relations which are useful for the local residents' risk perception toward nuclear issues. We aim to develop more effective risk communication which promotes a better mutual understanding with the local residents, by providing the risk information of the nuclear fuel facilities such as a Reprocessing Plant and other research and development facilities. We explain the development process of audio visual information materials which describe our actual activities and devices for the risk management in nuclear fuel facilities, and our discussion through the effectiveness measurement. (author)
APLIKASI MEDIA AUDIO-VISUAL DALAM PEMBELAJARAN SPEAKING SKILL DENGAN PENDEKATAN AUDIOLINGUAL: Studi Kasus di MAN Batang

Directory of Open Access Journals (Sweden)

Slamet Untung

2012-10-01

Full Text Available This research studies the application of audio and visual media in learning speaking skills through the audiolingual approach. It contributes to education in senior high schools, and Islamic senior high schools in particular, by finding a way to improve the learning components directly related to the medium and method of learning speaking skills. The research aims to establish the significance and relevance of this application. The main variable of this research covers all activities in the application of audio and visual media in learning speaking skills through the audio-lingual approach. The data were collected through observation, interviews, questionnaires and documentation. This research took place at the state Islamic senior high school of Batang in Central Java. The result shows that the application helps the students to speak English correctly and accurately and stresses the message of the speaking skill learning.
Audio Visual Media Components in Educational Game for Elementary Students

Directory of Open Access Journals (Sweden)

Meilani Hartono

2016-12-01

Full Text Available The purpose of this research was to review and implement interactive audio visual media used in an educational game to improve elementary students’ interest in learning mathematics. The game was developed for the desktop platform. The art of the game was set as 2D cartoon art with animation and audio in order to make students more interested. There were four mini games developed based on research on the study of mathematics. The development method used was the Multimedia Development Life Cycle (MDLC), which consists of requirement, design, development, testing, and implementation phases. The data collection methods used were questionnaire, literature study, and interview. The conclusion is that elementary students are interested in educational games that have fun and active (moving) objects, fast-tempo music, and carefree colors like blue. This educational game is hoped to be an alternative teaching tool combined with conventional teaching methods.
Effects of Temporal Congruity Between Auditory and Visual Stimuli Using Rapid Audio-Visual Serial Presentation.

Science.gov (United States)

An, Xingwei; Tang, Jiabei; Liu, Shuang; He, Feng; Qi, Hongzhi; Wan, Baikun; Ming, Dong

2016-10-01

Combining visual and auditory stimuli in event-related potential (ERP)-based spellers has gained more attention in recent years. Few of these studies address the differences in ERP components and system efficiency caused by shifting the visual and auditory onsets. Here, we aim to study the effect of temporal congruity of auditory and visual stimuli onset on a bimodal brain-computer interface (BCI) speller. We designed five visual and auditory combined paradigms with different visual-to-auditory delays (-33 to +100 ms). Eleven participants took part in this study. ERPs were acquired and aligned according to visual and auditory stimuli onset, respectively. ERPs of Fz, Cz, and PO7 channels were studied through the statistical analysis of different conditions both from visual-aligned ERPs and audio-aligned ERPs. Based on the visual-aligned ERPs, classification accuracy was also analyzed to seek the effects of visual-to-auditory delays. The latencies of ERP components depended mainly on the visual stimuli onset. Auditory stimuli onsets mainly influenced early component accuracies, whereas visual stimuli onset determined later component accuracies. The latter, however, played a dominant role in overall classification. This study is important for further studies to achieve better explanations and ultimately determine the way to optimize the bimodal BCI application.

Auditory cross-modal reorganization in cochlear implant users indicates audio-visual integration.

Science.gov (United States)

Stropahl, Maren; Debener, Stefan

2017-01-01

There is clear evidence for cross-modal cortical reorganization in the auditory system of post-lingually deafened cochlear implant (CI) users. A recent report suggests that moderate sensori-neural hearing loss is already sufficient to initiate corresponding cortical changes. To what extent these changes are deprivation-induced or related to sensory recovery is still debated. Moreover, the influence of cross-modal reorganization on CI benefit is also still unclear. While reorganization during deafness may impede speech recovery, reorganization also has beneficial influences on face recognition and lip-reading. As CI users were observed to show differences in multisensory integration, the question arises if cross-modal reorganization is related to audio-visual integration skills. The current electroencephalography study investigated cortical reorganization in experienced post-lingually deafened CI users (n = 18), untreated mild to moderately hearing impaired individuals (n = 18) and normal hearing controls (n = 17). Cross-modal activation of the auditory cortex by means of EEG source localization in response to human faces and audio-visual integration, quantified with the McGurk illusion, were measured. CI users revealed stronger cross-modal activations compared to age-matched normal hearing individuals. Furthermore, CI users showed a relationship between cross-modal activation and audio-visual integration strength. This may further support a beneficial relationship between cross-modal activation and daily-life communication skills that may not be fully captured by laboratory-based speech perception tests. Interestingly, hearing impaired individuals showed behavioral and neurophysiological results that were numerically between the other two groups, and they showed a moderate relationship between cross-modal activation and the degree of hearing loss. This further supports the notion that auditory deprivation evokes a reorganization of the auditory system
Auditory cross-modal reorganization in cochlear implant users indicates audio-visual integration

Directory of Open Access Journals (Sweden)

Maren Stropahl

2017-01-01

Full Text Available There is clear evidence for cross-modal cortical reorganization in the auditory system of post-lingually deafened cochlear implant (CI) users. A recent report suggests that moderate sensori-neural hearing loss is already sufficient to initiate corresponding cortical changes. To what extent these changes are deprivation-induced or related to sensory recovery is still debated. Moreover, the influence of cross-modal reorganization on CI benefit is also still unclear. While reorganization during deafness may impede speech recovery, reorganization also has beneficial influences on face recognition and lip-reading. As CI users were observed to show differences in multisensory integration, the question arises whether cross-modal reorganization is related to audio-visual integration skills. The current electroencephalography study investigated cortical reorganization in experienced post-lingually deafened CI users (n = 18), untreated mild to moderately hearing impaired individuals (n = 18) and normal hearing controls (n = 17). Cross-modal activation of the auditory cortex by means of EEG source localization in response to human faces and audio-visual integration, quantified with the McGurk illusion, were measured. CI users revealed stronger cross-modal activations compared to age-matched normal hearing individuals. Furthermore, CI users showed a relationship between cross-modal activation and audio-visual integration strength. This may further support a beneficial relationship between cross-modal activation and daily-life communication skills that may not be fully captured by laboratory-based speech perception tests. Interestingly, hearing impaired individuals showed behavioral and neurophysiological results that were numerically between the other two groups, and they showed a moderate relationship between cross-modal activation and the degree of hearing loss. This further supports the notion that auditory deprivation evokes a reorganization of the
Impact of oral health education by audio aids, braille and tactile models on the oral health status of visually impaired children of Bhopal City.

Science.gov (United States)

Gautam, Anjali; Bhambal, Ajay; Moghe, Swapnil

2018-01-01

Children with special needs face unique challenges in day-to-day practice. They are dependent on their close ones for everything. To improve oral hygiene in such visually impaired children, undue training and education are required. Braille is an important language for reading and writing for the visually impaired. It helps them understand and visualize the world via touch. Audio aids are being used to impart health education to the visually impaired. Tactile models help them perceive things which they cannot visualize and hence are an important learning tool. This study aimed to assess the improvement in oral hygiene by audio aids and Braille and tactile models in visually impaired children aged 6-16 years of Bhopal city. This was a prospective study. Sixty visually impaired children aged 6-16 years were selected and randomly divided into three groups (20 children each). Group A: audio aids + Braille, Group B: audio aids + tactile models, and Group C: audio aids + Braille + tactile models. Instructions were given for maintaining good oral hygiene and brushing techniques were explained to all children. After 3 months' time, the oral hygiene status was recorded and compared using plaque and gingival index. An ANOVA test was used. The present study showed a statistically significant decrease in the mean plaque and gingival scores at all time intervals in each group as compared to the baseline. The study depicts that the combination of audio aids, Braille and tactile models is an effective way to provide oral health education and improve oral health status of visually impaired children.
The presentation of expert testimony via live audio-visual communication.

Science.gov (United States)

Miller, R D

1991-01-01

As part of a national effort to improve efficiency in court procedures, the American Bar Association has recommended, on the basis of a number of pilot studies, increased use of current audio-visual technology, such as telephone and live video communication, to eliminate delays caused by unavailability of participants in both civil and criminal procedures. Although these recommendations were made to facilitate court proceedings, and for the convenience of attorneys and judges, they also have the potential to save significant time for clinical expert witnesses. The author reviews the studies of telephone testimony that were done by the American Bar Association and other legal research groups, as well as the experience in one state forensic evaluation and treatment center. He also reviews the case law on the issue of remote testimony. He then presents data from a national survey of state attorneys general concerning the admissibility of testimony via audio-visual means, including video depositions. Finally, he concludes that the option to testify by telephone provides a significant savings in precious clinical time for forensic clinicians in public facilities, and urges that such clinicians work actively to convince courts and/or legislatures in states that do not permit such testimony (currently the majority) to consider accepting it, to improve the effective use of scarce clinical resources in public facilities.
Audio-Visual Feedback for Self-monitoring Posture in Ballet Training

DEFF Research Database (Denmark)

Knudsen, Esben Winther; Hølledig, Malte Lindholm; Bach-Nielsen, Sebastian Siem

2017-01-01

An application for ballet training is presented that monitors the posture position (straightness of the spine and rotation of the pelvis) deviation from the ideal position in real-time. The human skeletal data is acquired through a Microsoft Kinect v2. The movement of the student is mirrored......-coded. In an experiment with 9-12 year-old dance students from a ballet school, comparing the audio-visual feedback modality with no feedback leads to an increase in posture accuracy (p
Revealing the ecological content of long-duration audio-recordings of the environment through clustering and visualisation.

Science.gov (United States)

Phillips, Yvonne F; Towsey, Michael; Roe, Paul

2018-01-01

Audio recordings of the environment are an increasingly important technique to monitor biodiversity and ecosystem function. While the acquisition of long-duration recordings is becoming easier and cheaper, the analysis and interpretation of that audio remains a significant research area. The issue addressed in this paper is the automated reduction of environmental audio data to facilitate ecological investigations. We describe a method that first reduces environmental audio to vectors of acoustic indices, which are then clustered. This can reduce the audio data by six to eight orders of magnitude yet retain useful ecological information. We describe techniques to visualise sequences of cluster occurrence (using for example, diel plots, rose plots) that assist interpretation of environmental audio. Colour coding acoustic clusters allows months and years of audio data to be visualised in a single image. These techniques are useful in identifying and indexing the contents of long-duration audio recordings. They could also play an important role in monitoring long-term changes in species abundance brought about by habitat degradation and/or restoration.
Impact of oral health education by audio aids, braille and tactile models on the oral health status of visually impaired children of Bhopal City

Directory of Open Access Journals (Sweden)

Anjali Gautam

2018-01-01

Full Text Available Context: Children with special needs face unique challenges in day-to-day practice. They are dependent on their close ones for everything. To improve oral hygiene in such visually impaired children, undue training and education are required. Braille is an important language for reading and writing for the visually impaired. It helps them understand and visualize the world via touch. Audio aids are being used to impart health education to the visually impaired. Tactile models help them perceive things which they cannot visualize and hence are an important learning tool. Aim: This study aimed to assess the improvement in oral hygiene by audio aids and Braille and tactile models in visually impaired children aged 6–16 years of Bhopal city. Settings and Design: This was a prospective study. Materials and Methods: Sixty visually impaired children aged 6–16 years were selected and randomly divided into three groups (20 children each). Group A: audio aids + Braille, Group B: audio aids + tactile models, and Group C: audio aids + Braille + tactile models. Instructions were given for maintaining good oral hygiene and brushing techniques were explained to all children. After 3 months' time, the oral hygiene status was recorded and compared using plaque and gingival index. Statistical Analysis Used: An ANOVA test was used. Results: The present study showed a statistically significant decrease in the mean plaque and gingival scores at all time intervals in each group as compared to the baseline. Conclusions: The study depicts that the combination of audio aids, Braille and tactile models is an effective way to provide oral health education and improve oral health status of visually impaired children.
Content-based intermedia synchronization

Science.gov (United States)

Oh, Dong-Young; Sampath-Kumar, Srihari; Rangan, P. Venkat

1995-03-01

Inter-media synchronization methods developed until now have been based on syntactic timestamping of video frames and audio samples. These methods are not fully appropriate for the synchronization of multimedia objects which may have to be accessed individually by their contents, e.g. content-based data retrieval. We propose a content-based multimedia synchronization scheme in which a media stream is viewed as a hierarchical composition of smaller objects which are logically structured based on the contents, and the synchronization is achieved by deriving temporal relations among logical units of media objects. Content-based synchronization offers several advantages, such as elimination of the need for time stamping, freedom from limitations of jitter, synchronization of independently captured media objects in video editing, and compensation for inherent asynchronies in capture times of video and audio.
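The idea of deriving timing from a content hierarchy rather than from per-frame timestamps can be illustrated with a minimal sketch. Everything named here (the `Unit` class, `start_times`, `schedule`, and the "starts with" relation) is a hypothetical illustration and not the scheme defined in the paper; it only assumes that a stream is a tree of logical units whose leaf durations are known.

```python
from dataclasses import dataclass, field


@dataclass
class Unit:
    """A logical unit of a media stream (hypothetical structure)."""
    name: str
    duration: float = 0.0                      # leaf duration in seconds
    children: list = field(default_factory=list)

    def length(self):
        # A composite unit lasts as long as the sum of its children.
        if not self.children:
            return self.duration
        return sum(child.length() for child in self.children)


def start_times(unit, t0=0.0, out=None):
    """Derive the start time of every unit from the hierarchy alone."""
    out = {} if out is None else out
    out[unit.name] = t0
    t = t0
    for child in unit.children:
        start_times(child, t, out)
        t += child.length()
    return out


def schedule(video, relations):
    """Place audio units using (audio_name, video_name) 'starts with' pairs."""
    video_starts = start_times(video)
    return {audio: video_starts[vid] for audio, vid in relations}


video = Unit("news", children=[
    Unit("intro", 5.0),
    Unit("story", children=[Unit("shot1", 3.0), Unit("shot2", 4.0)]),
])
audio_plan = schedule(video, [("jingle", "intro"), ("narration", "story")])
```

In this sketch the audio units carry no timestamps of their own; their playback positions follow entirely from the relations to the video's logical units.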
The use of ambient audio to increase safety and immersion in location-based games

Science.gov (United States)

Kurczak, John Jason

The purpose of this thesis is to propose an alternative type of interface for mobile software being used while walking or running. Our work addresses the problem of visual user interfaces for mobile software being potentially unsafe for pedestrians, and not being very immersive when used for location-based games. In addition, location-based games and applications can be difficult to develop when directly interfacing with the sensors used to track the user's location. These problems need to be addressed because portable computing devices are becoming a popular tool for navigation, playing games, and accessing the internet while walking. This poses a safety problem for mobile users, who may be paying too much attention to their device to notice and react to hazards in their environment. The difficulty of developing location-based games and other location-aware applications may significantly hinder the prevalence of applications that explore new interaction techniques for ubiquitous computing. We created the TREC toolkit to address the issues with tracking sensors while developing location-based games and applications. We have developed functional location-based applications with TREC to demonstrate the amount of work that can be saved by using this toolkit. In order to have a safer and more immersive alternative to visual interfaces, we have developed ambient audio interfaces for use with mobile applications. Ambient audio uses continuous streams of sound over headphones to present information to mobile users without distracting them from walking safely. In order to test the effectiveness of ambient audio, we ran a study to compare ambient audio with handheld visual interfaces in a location-based game. We compared players' ability to safely navigate the environment, their sense of immersion in the game, and their performance at the in-game tasks. We found that ambient audio was able to significantly increase players' safety and sense of immersion compared to a
Focus on Hinduism: Audio-Visual Resources for Teaching Religion. Occasional Publication No. 23.

Science.gov (United States)

Dell, David; And Others

The guide presents annotated lists of audio and visual materials about the Hindu religion. The authors point out that Hinduism cannot be comprehended totally by reading books; thus the resources identified in this guide will enhance understanding based on reading. The guide is intended for use by high school and college students, teachers,…
A centralized audio presentation manager

Energy Technology Data Exchange (ETDEWEB)

Papp, A.L. III; Blattner, M.M.

1994-05-16

The centralized audio presentation manager addresses the problems which occur when multiple programs running simultaneously attempt to use the audio output of a computer system. Time dependence of sound means that certain auditory messages must be scheduled simultaneously, which can lead to perceptual problems due to psychoacoustic phenomena. Furthermore, the combination of speech and nonspeech audio is examined; each presents its own problems of perceptibility in an acoustic environment composed of multiple auditory streams. The centralized audio presentation manager receives abstract parameterized message requests from the currently running programs, and attempts to create and present a sonic representation in the most perceptible manner through the use of a theoretically and empirically designed rule set.
Use of audio-visual methods in radiology and physics courses

Energy Technology Data Exchange (ETDEWEB)

Holmberg, P

1987-03-15

Today's medicine utilizes sophisticated equipment for radiological, biochemical and microbiological investigation procedures and analyses. Hence it is necessary that physicians have adequate scientific and technical knowledge of the apparatus they are using so that the equipment can be used in the most effective way. Partly this knowledge is obtained from science-orientated courses in the preclinical stage of the study program for medical students. To increase the motivation to study science courses (medical physics), audio-visual methods are used to describe diagnostic and therapeutic procedures in the clinical routines.
Audio-visual feedback improves the BCI performance in the navigational control of a humanoid robot

Directory of Open Access Journals (Sweden)

Emmanuele eTidoni

2014-06-01

Full Text Available Advancement in brain computer interfaces (BCI) technology allows people to actively interact in the world through surrogates. Controlling real humanoid robots using BCI as intuitively as we control our body represents a challenge for current research in robotics and neuroscience. In order to successfully interact with the environment the brain integrates multiple sensory cues to form a coherent representation of the world. Cognitive neuroscience studies demonstrate that multisensory integration may imply a gain with respect to a single modality and ultimately improve the overall sensorimotor performance. For example, reactivity to simultaneous visual and auditory stimuli may be higher than to the sum of the same stimuli delivered in isolation or in temporal sequence. Yet, knowledge about whether audio-visual integration may improve the control of a surrogate is meager. To explore this issue, we provided human footstep sounds as audio feedback to BCI users while controlling a humanoid robot. Participants were asked to steer their robot surrogate and perform a pick-and-place task through BCI-SSVEPs. We found that audio-visual synchrony between the footstep sounds and the humanoid’s actual walk reduces the time required for steering the robot. Thus, auditory feedback congruent with the humanoid’s actions may improve the motor decisions of the BCI user and help in the feeling of control over it. Our results shed light on the possibility of increasing control of the robot through the combination of multisensory feedback to a BCI user.
A Bit Stream Scalable Speech/Audio Coder Combining Enhanced Regular Pulse Excitation and Parametric Coding

Directory of Open Access Journals (Sweden)

Albertus C. den Brinker

2007-01-01

Full Text Available This paper introduces a new audio and speech broadband coding technique based on the combination of a pulse excitation coder and a standardized parametric coder, namely, MPEG-4 high-quality parametric coder. After presenting a series of enhancements to regular pulse excitation (RPE) to make it suitable for the modeling of broadband signals, it is shown how pulse and parametric codings complement each other and how they can be merged to yield a layered bit stream scalable coder able to operate at different points in the quality bit rate plane. The performance of the proposed coder is evaluated in a listening test. The major result is that the extra functionality of the bit stream scalability does not come at the price of a reduced performance since the coder is competitive with standardized coders (MP3, AAC, SSC).
A Bit Stream Scalable Speech/Audio Coder Combining Enhanced Regular Pulse Excitation and Parametric Coding

Science.gov (United States)

Riera-Palou, Felip; den Brinker, Albertus C.

2007-12-01

This paper introduces a new audio and speech broadband coding technique based on the combination of a pulse excitation coder and a standardized parametric coder, namely, MPEG-4 high-quality parametric coder. After presenting a series of enhancements to regular pulse excitation (RPE) to make it suitable for the modeling of broadband signals, it is shown how pulse and parametric codings complement each other and how they can be merged to yield a layered bit stream scalable coder able to operate at different points in the quality bit rate plane. The performance of the proposed coder is evaluated in a listening test. The major result is that the extra functionality of the bit stream scalability does not come at the price of a reduced performance since the coder is competitive with standardized coders (MP3, AAC, SSC).
Interactive Football-Training Based on Rebounders with Hit Position Sensing and Audio-Visual Feedback

DEFF Research Database (Denmark)

Jensen, Mads Møller; Grønbæk, Kaj; Thomassen, Nikolaj

2014-01-01

. However, most of these tools are created with a single goal, either to measure or train, and are often used and tested in very controlled settings. In this paper, we present an interactive football-training platform, called Football Lab, featuring sensor- mounted rebounders as well as audio-visual...
MeetingVis: Visual Narratives to Assist in Recalling Meeting Context and Content.

Science.gov (United States)

Shi, Yang; Bryan, Chris; Bhamidipati, Sridatt; Zhao, Ying; Zhang, Yaoxue; Ma, Kwan-Liu

2018-06-01

In team-based workplaces, reviewing and reflecting on the content from a previously held meeting can lead to better planning and preparation. However, ineffective meeting summaries can impair this process, especially when participants have difficulty remembering what was said and what its context was. To assist with this process, we introduce MeetingVis, a visual narrative-based approach to meeting summarization. MeetingVis is composed of two primary components: (1) a data pipeline that processes the spoken audio from a group discussion, and (2) a visual-based interface that efficiently displays the summarized content. To design MeetingVis, we create a taxonomy of relevant meeting data points, identifying salient elements to promote recall and reflection. These are mapped to an augmented storyline visualization, which combines the display of participant activities, topic evolutions, and task assignments. For evaluation, we conduct a qualitative user study with five groups. Feedback from the study indicates that MeetingVis effectively triggers the recall of subtle details from prior meetings: all study participants were able to remember new details, points, and tasks compared to an unaided, memory-only baseline. This visual-based approach can also potentially enhance the productivity of both individuals and the whole team.
Tune in the Net with RealAudio.

Science.gov (United States)

Buchanan, Larry

1997-01-01

Describes how to connect to the RealAudio Web site to download a player that provides sound from Web pages to the computer through streaming technology. Explains hardware and software requirements and provides addresses for other RealAudio Web sites, including weather information and current news. (LRW)
Audio-based Age and Gender Identification to Enhance the Recommendation of TV Content

DEFF Research Database (Denmark)

Shepstone, Sven Ewan; Tan, Zheng-Hua; Jensen, Søren Holdt

2013-01-01

Recommending TV content to groups of viewers is best carried out when relevant information such as the demographics of the group is available. However, it can be difficult and time consuming to extract information for every user in the group. This paper shows how an audio analysis of the age...... and gender of a group of users watching the TV can be used for recommending a sequence of N short TV content items for the group. First, a state of the art audio-based classifier determines the age and gender of each user in an M-user group and creates a group profile. A genetic recommender algorithm...... profile, thus ensuring that items are proportionally allocated to users with respect to their demographic categorization. The proposed system is compared to an ideal system where the group demographics are provided explicitly. Results using real speaker utterances show that, in spite of the inaccuracies...
A Content-Adaptive Analysis and Representation Framework for Audio Event Discovery from "Unscripted" Multimedia

Science.gov (United States)

Radhakrishnan, Regunathan; Divakaran, Ajay; Xiong, Ziyou; Otsuka, Isao

2006-12-01

We propose a content-adaptive analysis and representation framework to discover events using audio features from "unscripted" multimedia such as sports and surveillance for summarization. The proposed analysis framework performs an inlier/outlier-based temporal segmentation of the content. It is motivated by the observation that "interesting" events in unscripted multimedia occur sparsely in a background of usual or "uninteresting" events. We treat the sequence of low/mid-level features extracted from the audio as a time series and identify subsequences that are outliers. The outlier detection is based on eigenvector analysis of the affinity matrix constructed from statistical models estimated from the subsequences of the time series. We define the confidence measure on each of the detected outliers as the probability that it is an outlier. Then, we establish a relationship between the parameters of the proposed framework and the confidence measure. Furthermore, we use the confidence measure to rank the detected outliers in terms of their departures from the background process. Our experimental results with sequences of low- and mid-level audio features extracted from sports video show that "highlight" events can be extracted effectively as outliers from a background process using the proposed framework. We proceed to show the effectiveness of the proposed framework in bringing out suspicious events from surveillance videos without any a priori knowledge. We show that such temporal segmentation into background and outliers, along with the ranking based on the departure from the background, can be used to generate content summaries of any desired length. Finally, we also show that the proposed framework can be used to systematically select "key audio classes" that are indicative of events of interest in the chosen domain.
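The outlier idea in this abstract can be sketched in a few lines, under strong simplifying assumptions. The abstract builds the affinity matrix from statistical models of subsequences; the sketch below instead assumes one scalar feature per subsequence and a Gaussian affinity, and it ranks items by their component in the leading eigenvector (found by power iteration) rather than reproducing the paper's procedure or its confidence measure. The function names and the `sigma` parameter are illustrative only.

```python
import math


def affinity_matrix(features, sigma=1.0):
    """Gaussian affinity between scalar subsequence features (assumed form)."""
    n = len(features)
    return [
        [math.exp(-((features[i] - features[j]) ** 2) / (2 * sigma ** 2))
         for j in range(n)]
        for i in range(n)
    ]


def leading_eigenvector(a, iters=200):
    """Power iteration; the matrix is positive, so the result stays positive."""
    n = len(a)
    v = [1.0 / math.sqrt(n)] * n
    for _ in range(iters):
        w = [sum(a[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


def rank_outliers(features, sigma=1.0):
    """Indices ordered from most to least outlying.

    A small eigenvector component means the subsequence is weakly tied
    to the dominant (background) cluster.
    """
    v = leading_eigenvector(affinity_matrix(features, sigma))
    return sorted(range(len(features)), key=lambda i: v[i])


# Five background-like subsequences and one that departs from them.
features = [0.1, 0.0, 0.2, 5.0, 0.05, 0.15]
ranking = rank_outliers(features)
```

Taking the first k entries of `ranking` gives a summary of adjustable length, which mirrors the abstract's point that ranking by departure from the background supports summaries of any desired size.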

Acceptance of online audio-visual cultural heritage archive services: a study of the general public

NARCIS (Netherlands)

Ongena, G.; van de Wijngaert, Lidwien; Huizer, E.

2013-01-01

Introduction. This study examines the antecedents of user acceptance of an audio-visual heritage archive for a wider audience (i.e., the general public) by extending the technology acceptance model with the concepts of perceived enjoyment, nostalgia proneness and personal innovativeness. Method. A
M-Stream Deficits and Reading-Related Visual Processes in Developmental Dyslexia

Science.gov (United States)

Boden, Catherine; Giaschi, Deborah

2007-01-01

Some visual processing deficits in developmental dyslexia have been attributed to abnormalities in the subcortical M stream and/or the cortical dorsal stream of the visual pathways. The nature of the relationship between these visual deficits and reading is unknown. The purpose of the present article was to characterize reading-related perceptual…
Visual saliency in MPEG-4 AVC video stream

Science.gov (United States)

Ammar, M.; Mitrea, M.; Hasnaoui, M.; Le Callet, P.

2015-03-01

Visual saliency maps have already proved their efficiency in a large variety of image/video communication application fields, ranging from selective compression and channel coding to watermarking. Such saliency maps are generally based on different visual characteristics (like color, intensity, orientation, motion,…) computed from the pixel representation of the visual content. This paper summarizes and extends our previous work devoted to the definition of a saliency map solely extracted from the MPEG-4 AVC stream syntax elements. The MPEG-4 AVC saliency map thus defined is a fusion of static and dynamic maps. The static saliency map is in turn a combination of intensity, color and orientation feature maps. Despite the particular way in which all these elementary maps are computed, the fusion technique used to combine them plays a critical role in the final result and is the object of the proposed study. A total of 48 fusion formulas (6 for combining static features and, for each of them, 8 to combine static with dynamic features) are investigated. The performances of the obtained maps are evaluated on a public database organized at IRCCyN, by computing two objective metrics: the Kullback-Leibler divergence and the area under curve.
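As a rough illustration of the two ingredients named in this abstract, the sketch below fuses a static and a dynamic map and scores the result with the Kullback-Leibler divergence. The weighted average in `fuse` is an assumed stand-in, not one of the 48 formulas studied in the paper, and the direction of the divergence (reference map first) and the `eps` smoothing term are likewise assumptions.

```python
import math


def normalize(m):
    """Scale a 2-D map (list of rows) so that its entries sum to 1."""
    total = sum(sum(row) for row in m)
    if total <= 0:
        raise ValueError("map must have positive total mass")
    return [[v / total for v in row] for row in m]


def fuse(static_map, dynamic_map, w=0.5):
    """Weighted average of the two normalized maps (assumed fusion rule)."""
    s, d = normalize(static_map), normalize(dynamic_map)
    return [
        [w * sv + (1 - w) * dv for sv, dv in zip(rs, rd)]
        for rs, rd in zip(s, d)
    ]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) between two maps treated as discrete distributions."""
    p, q = normalize(p), normalize(q)
    return sum(
        pv * math.log((pv + eps) / (qv + eps))
        for rp, rq in zip(p, q)
        for pv, qv in zip(rp, rq)
    )


static_map = [[1.0, 0.0], [0.0, 0.0]]      # toy 2x2 maps
dynamic_map = [[1.0, 1.0], [1.0, 1.0]]
fused = fuse(static_map, dynamic_map)
score = kl_divergence(static_map, fused)
```

A lower divergence against a reference density (for example, one built from recorded eye fixations) would indicate a closer match; the toy maps here only exercise the arithmetic.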
UNDERSTANDING PROSE THROUGH TASK ORIENTED AUDIO-VISUAL ACTIVITY: AN AMERICAN MODERN PROSE COURSE AT THE FACULTY OF LETTERS, PETRA CHRISTIAN UNIVERSITY

Directory of Open Access Journals (Sweden)

Sarah Prasasti

2001-01-01

Full Text Available The method presented here provides the basis for a course in American prose for EFL students. Understanding and appreciation of American prose is a difficult task for the students because they come into contact with works that are full of cultural baggage and far apart from their own world. The audio-visual aid is one of the alternatives to sensitize the students to the topic and the cultural background. Instead of providing ready-made audio-visual aids, teachers can involve students in a more task-oriented audio-visual project. Here, the teachers encourage their students to create their own audio-visual aids using colors, pictures, sound and gestures as a point of initiation for further discussion. The students can use color, which has become a strong element of fiction, to help them call up a forceful visual representation. Pictures can also stimulate the students to build their mental image. Sound and silence, which are a part of the fabric of literature, may also help them to increase the emotional impact.
Joint evaluation of communication quality and user experience in an audio-visual virtual reality meeting

DEFF Research Database (Denmark)

Møller, Anders Kalsgaard; Hoffmann, Pablo F.; Carrozzino, Marcello

2013-01-01

The state-of-the-art speech intelligibility tests are created with the purpose of evaluating acoustic communication devices and not for evaluating audio-visual virtual reality systems. This paper presents a novel method to evaluate a communication situation based on both the speech intelligibility...
Open-Loop Audio-Visual Stimulation (AVS): A Useful Tool for Management of Insomnia?

Science.gov (United States)

Tang, Hsin-Yi Jean; Riegel, Barbara; McCurry, Susan M; Vitiello, Michael V

2016-03-01

Audio Visual Stimulation (AVS), a form of neurofeedback, is a non-pharmacological intervention that has been used for both performance enhancement and symptom management. We review the history of AVS, its two sub-types (closed- and open-loop), and discuss its clinical implications. We also describe a promising new application of AVS to improve sleep, and potentially decrease pain. AVS research can be traced back to the late 1800s. AVS's efficacy has been demonstrated for both performance enhancement and symptom management. Although AVS is commonly used in clinical settings, there is limited literature evaluating clinical outcomes and mechanisms of action. One of the challenges to AVS research is the lack of standardized terms, which makes systematic review and literature consolidation difficult. Future studies using AVS as an intervention should: (1) use operational definitions that are consistent with the existing literature, such as AVS, Audio-visual Entrainment, or Light and Sound Stimulation, (2) provide a clear rationale for the chosen training frequency modality, (3) use a randomized controlled design, and (4) follow the Consolidated Standards of Reporting Trials and/or related guidelines when disseminating results.
Inner Sound: Altered States of Consciousness in Electronic Music and Audio-Visual Media

DEFF Research Database (Denmark)

Weinel, Jonathan

Over the last century, developments in electronic music and art have enabled new possibilities for creating audio and audio-visual artworks. With this new potential has come the possibility for representing subjective internal conscious states, such as the experience of hallucinations, using...... the creative influence of ASCs, from Amazonian chicha festivals to the synaesthetic assaults of neon raves; and from an immersive outdoor electroacoustic performance on an Athenian hilltop to a mushroom trip on a tropical island in virtual reality. Beginning with a discussion of consciousness, the book...... explores how our subjective realities may change during states of dream, psychedelic experience, meditation, and trance. Taking a broad view across a wide range of genres, Inner Sound draws connections between shamanic art and music, and the modern technoshamanism of psychedelic rock, electronic dance...
WLAN Technologies for Audio Delivery

Directory of Open Access Journals (Sweden)

Nicolas-Alexander Tatlas

2007-01-01

Full Text Available Audio delivery and reproduction for home or professional applications may greatly benefit from the adoption of digital wireless local area network (WLAN) technologies. The most challenging aspect of such integration relates to the synchronized and robust real-time streaming of multiple audio channels to multipoint receivers, for example, wireless active speakers. Here, it is shown that current WLAN solutions are susceptible to transmission errors. A detailed study of the IEEE802.11e protocol (currently under ratification) is also presented and all relevant distortions are assessed via an analytical and experimental methodology. A novel synchronization scheme is also introduced, allowing optimized playback for multiple receivers. The perceptual audio performance is assessed for both stereo and 5-channel applications based on either PCM or compressed audio signals.
Book review: An Introduction to Audio Content Analysis: Applications in Signal Processing and Music Informatics by Alexander Lerch

DEFF Research Database (Denmark)

Sturm, Bob L.

2013-01-01

A critical review of the book: An Introduction to Audio Content Analysis: Applications in Signal Processing and Music Informatics, by Alexander Lerch, October 2012, Wiley-IEEE Press. ISBN: 978-1-118-26682-3, Hardcover, 272 pages, 503 references. List price $125.00...
The use of audio-visual methods in radiology and physics courses

International Nuclear Information System (INIS)

Holmberg, P.

1987-01-01

Today's medicine utilizes sophisticated equipment for radiological, biochemical and microbiological investigation procedures and analyses. Hence it is necessary that physicians have adequate scientific and technical knowledge of the apparatus they are using so that the equipment can be used in the most effective way. Partly this knowledge is obtained from science-orientated courses in the preclinical stage of the study program for medical students. To increase the motivation to study science courses (medical physics), audio-visual methods are used to describe diagnostic and therapeutic procedures in the clinical routines. (orig.)
Sensitivity to audio-visual synchrony and its relation to language abilities in children with and without ASD.

Science.gov (United States)

Righi, Giulia; Tenenbaum, Elena J; McCormick, Carolyn; Blossom, Megan; Amso, Dima; Sheinkopf, Stephen J

2018-04-01

Autism Spectrum Disorder (ASD) is often accompanied by deficits in speech and language processing. Speech processing relies heavily on the integration of auditory and visual information, and it has been suggested that the ability to detect correspondence between auditory and visual signals helps to lay the foundation for successful language development. The goal of the present study was to examine whether young children with ASD show reduced sensitivity to temporal asynchronies in a speech processing task when compared to typically developing controls, and to examine how this sensitivity might relate to language proficiency. Using automated eye tracking methods, we found that children with ASD failed to demonstrate sensitivity to asynchronies of 0.3s, 0.6s, or 1.0s between a video of a woman speaking and the corresponding audio track. In contrast, typically developing children who were language-matched to the ASD group, were sensitive to both 0.6s and 1.0s asynchronies. We also demonstrated that individual differences in sensitivity to audiovisual asynchronies and individual differences in orientation to relevant facial features were both correlated with scores on a standardized measure of language abilities. Results are discussed in the context of attention to visual language and audio-visual processing as potential precursors to language impairment in ASD. Autism Res 2018, 11: 645-653. © 2018 International Society for Autism Research, Wiley Periodicals, Inc. Speech processing relies heavily on the integration of auditory and visual information, and it has been suggested that the ability to detect correspondence between auditory and visual signals helps to lay the foundation for successful language development. The goal of the present study was to explore whether children with ASD process audio-visual synchrony in ways comparable to their typically developing peers, and the relationship between preference for synchrony and language ability. Results showed that
Content Discovery from Composite Audio : An unsupervised approach

NARCIS (Netherlands)

Lu, L.

2009-01-01

In this thesis, we developed and assessed a novel robust and unsupervised framework for semantic inference from composite audio signals. We focused on the problem of detecting audio scenes and grouping them into meaningful clusters. Our approach addressed all major steps in a general process of
Shape representations in the primate dorsal visual stream

Directory of Open Access Journals (Sweden)

Tom eTheys

2015-04-01

Full Text Available The primate visual system extracts object shape information for object recognition in the ventral visual stream. Recent research has demonstrated that object shape is also processed in the dorsal visual stream, which is specialized for spatial vision and the planning of actions. A number of studies have investigated the coding of 2D shape in the anterior intraparietal area (AIP), one of the end-stage areas of the dorsal stream which has been implicated in the extraction of affordances for the purpose of grasping. These findings challenge the current understanding of area AIP as a critical stage in the dorsal stream for the extraction of object affordances. The representation of three-dimensional (3D) shape has been studied in two interconnected areas known to be critical for object grasping: area AIP and area F5a in the ventral premotor cortex (PMv), to which AIP projects. In both areas neurons respond selectively to 3D shape defined by binocular disparity, but the latency of the neural selectivity is approximately 10 ms longer in F5a compared to AIP, consistent with its higher position in the hierarchy of cortical areas. Furthermore, F5a neurons were more sensitive to small amplitudes of 3D curvature and could detect subtle differences in 3D structure more reliably than AIP neurons. In both areas, 3D-shape selective neurons were co-localized with neurons showing motor-related activity during object grasping in the dark, indicating a close convergence of visual and motor information on the same clusters of neurons.
Audio Description as a Pedagogical Tool

Directory of Open Access Journals (Sweden)

Georgina Kleege

2015-05-01

Full Text Available Audio description is the process of translating visual information into words for people who are blind or have low vision. Typically such description has focused on films, museum exhibitions, images and video on the internet, and live theater. Because it allows people with visual impairments to experience a variety of cultural and educational texts that would otherwise be inaccessible, audio description is a mandated aspect of disability inclusion, although it remains markedly underdeveloped and underutilized in our classrooms and in society in general. Along with increasing awareness of disability, audio description pushes students to practice close reading of visual material, deepen their analysis, and engage in critical discussions around the methodology, standards and values, language, and role of interpretation in a variety of academic disciplines. We outline a few pedagogical interventions that can be customized to different contexts to develop students' writing and critical thinking skills through guided description of visual material.
Modeling Audio Fingerprints : Structure, Distortion, Capacity

NARCIS (Netherlands)

Doets, P.J.O.

2010-01-01

An audio fingerprint is a compact low-level representation of a multimedia signal. An audio fingerprint can be used to identify audio files or fragments in a reliable way. The use of audio fingerprints for identification consists of two phases. In the enrollment phase known content is fingerprinted,
An integrated audio-visual impact tool for wind turbine installations

International Nuclear Information System (INIS)

Lymberopoulos, N.; Belessis, M.; Wood, M.; Voutsinas, S.

1996-01-01

An integrated software tool was developed for the design of wind parks that takes into account their visual and audio impact. The application is built on a powerful hardware platform and is fully operated through a graphic user interface. The topography, the wind turbines and the daylight conditions are realised digitally. The wind park can be animated in real time and the user can take virtual walks in it while the set-up of the park can be altered interactively. In parallel, the wind speed levels on the terrain, the emitted noise intensity, the annual energy output and the cash flow can be estimated at any stage of the session and prompt the user for rearrangements. The tool has been used to visually simulate existing wind parks in St. Breok, UK and Andros Island, Greece. The results lead to the conclusion that such a tool can assist in the public acceptance and licensing procedures of wind parks. (author)
Audio-Visual Integration Modifies Emotional Judgment in Music

Directory of Open Access Journals (Sweden)

Shen-Yuan Su

2011-10-01

Full Text Available The conventional view that perceived emotion in music is derived mainly from auditory signals has led to neglect of the contribution of visual image. In this study, we manipulated mode (major vs. minor) and examined the influence of a video image on emotional judgment in music. Melodies in either major or minor mode were controlled for tempo and rhythm and played to the participants. We found that Taiwanese participants, like Westerners, judged major melodies as expressing positive, and minor melodies negative, emotions. The major or minor melodies were then paired with video images of the singers, which were either emotionally congruent or incongruent with their modes. Results showed that participants perceived stronger positive or negative emotions with congruent audio-visual stimuli. Compared to listening to music alone, stronger emotions were perceived when an emotionally congruent video image was added and weaker emotions were perceived when an incongruent image was added. We therefore demonstrate that mode is important to perceive the emotional valence in music and that treating musical art as a purely auditory event might lose the enhanced emotional strength perceived in music, since going to a concert may lead to stronger perceived emotion than listening to the CD at home.
Online Dissection Audio-Visual Resources for Human Anatomy: Undergraduate Medical Students' Usage and Learning Outcomes

Science.gov (United States)

Choi-Lundberg, Derek L.; Cuellar, William A.; Williams, Anne-Marie M.

2016-01-01

In an attempt to improve undergraduate medical student preparation for and learning from dissection sessions, dissection audio-visual resources (DAVR) were developed. Data from e-learning management systems indicated DAVR were accessed by 28% ± 10 (mean ± SD for nine DAVR across three years) of students prior to the corresponding dissection…
On the Importance of Audiovisual Coherence for the Perceived Quality of Synthesized Visual Speech

Directory of Open Access Journals (Sweden)

Wesley Mattheyses

2009-01-01

Full Text Available Audiovisual text-to-speech systems convert a written text into an audiovisual speech signal. Typically, the visual mode of the synthetic speech is synthesized separately from the audio, the latter being either natural or synthesized speech. However, the perception of mismatches between these two information streams requires experimental exploration since it could degrade the quality of the output. In order to increase the intermodal coherence in synthetic 2D photorealistic speech, we extended the well-known unit selection audio synthesis technique to work with multimodal segments containing original combinations of audio and video. Subjective experiments confirm that the audiovisual signals created by our multimodal synthesis strategy are indeed perceived as being more synchronous than those of systems in which both modes are not intrinsically coherent. Furthermore, it is shown that the degree of coherence between the auditory mode and the visual mode has an influence on the perceived quality of the synthetic visual speech fragment. In addition, the audio quality was found to have only a minor influence on the perceived visual signal's quality.
Human Factors in Streaming Data Analysis: Challenges and Opportunities for Information Visualization: Human Factors in Streaming Data Analysis

Energy Technology Data Exchange (ETDEWEB)

Dasgupta, Aritra [Pacific Northwest National Laboratory, Richland Washington USA; Arendt, Dustin L. [Pacific Northwest National Laboratory, Richland Washington USA; Franklin, Lyndsey R. [Pacific Northwest National Laboratory, Richland Washington USA; Wong, Pak Chung [Pacific Northwest National Laboratory, Richland Washington USA; Cook, Kristin A. [Pacific Northwest National Laboratory, Richland Washington USA

2017-09-01

Real-world systems change continuously and across domains like traffic monitoring, cyber security, etc., such changes occur within short time scales. This leads to a streaming data problem and produces unique challenges for the human in the loop, as analysts have to ingest and make sense of dynamic patterns in real time. In this paper, our goal is to study how the state-of-the-art in streaming data visualization handles these challenges and reflect on the gaps and opportunities. To this end, we have three contributions: i) problem characterization for identifying domain-specific goals and challenges for handling streaming data, ii) a survey and analysis of the state-of-the-art in streaming data visualization research with a focus on the visualization design space, and iii) reflections on the perceptually motivated design challenges and potential research directions for addressing them.

The Improvement of Students’ Leadership Ethic in Studying History by Using Baratayuda Audio Visual Media

Directory of Open Access Journals (Sweden)

Wendhy Rachmadhany

2018-04-01

Full Text Available The purpose of this research is to determine whether students’ leadership ethic in studying History improves after the implementation of Baratayuda Audio Visual Media. The population of this research is the XI-Social Science-1 class of SMAN 1 Pare, Kediri Regency, in academic year 2016/2017, consisting of 39 students. This Classroom Action Research (CAR) is arranged as a Pre-test, Cycle-1 and Cycle-2, each consisting of several steps: planning, implementation, observation, and reflection. Data are collected using a leadership ethic questionnaire, interviews, and documentation. The method of data analysis is descriptive analysis, comparing the improvement from one cycle to another. The results show an improvement of leadership ethic in studying History after the implementation of Baratayuda Audio Visual media: the Pre-test indicates a passing score of about 17.95%, Cycle-1 indicates 46.1%, and Cycle-2 indicates a significant improvement to about 71.83%.
Linking Audio and Visual Information while Navigating in a Virtual Reality Kiosk Display

Science.gov (United States)

Sullivan, Briana; Ware, Colin; Plumlee, Matthew

2006-01-01

3D interactive virtual reality museum exhibits should be easy to use, entertaining, and informative. If the interface is intuitive, it will allow the user more time to learn the educational content of the exhibit. This research deals with interface issues concerning activating audio descriptions of images in such exhibits while the user is…
Huffman coding in advanced audio coding standard

Science.gov (United States)

Brzuchalski, Grzegorz

2012-05-01

This article presents several hardware architectures of an Advanced Audio Coding (AAC) Huffman noiseless encoder, their optimisations and a working implementation. Much attention has been paid to optimising the demand for hardware resources, especially memory size. The aim of the design was to produce as short a binary stream as possible within this standard. The Huffman encoder, together with the whole audio-video system, has been implemented in FPGA devices.
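As a rough illustration of the noiseless-coding principle mentioned in this record, the sketch below builds a Huffman prefix code in software. It is not the article's hardware architecture, and it is an assumption of this sketch that the code is derived from the data itself; AAC selects among predefined codebooks defined by the standard. The names `huffman_code` and `encode` and the sample values are hypothetical.

```python
import heapq
from collections import Counter


def huffman_code(symbols):
    """Return a prefix code {symbol: bit string} built from symbol frequencies."""
    freq = Counter(symbols)
    if not freq:
        return {}
    if len(freq) == 1:
        # A single distinct symbol still needs one bit per occurrence.
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, tie_breaker, partial code table).
    # The integer tie_breaker keeps Python from ever comparing the dicts.
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(sorted(freq.items()))]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        w1, _, c1 = heapq.heappop(heap)  # two lightest subtrees
        w2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (w1 + w2, counter, merged))
        counter += 1
    return heap[0][2]


def encode(symbols, code):
    """Concatenate the code words of all symbols into one bit string."""
    return "".join(code[s] for s in symbols)


# Hypothetical quantized values: frequent small magnitudes get short codes.
quantized = [0, 0, 0, 0, 1, 1, 2, -1]
table = huffman_code(quantized)
bitstream = encode(quantized, table)
```

Frequent symbols receive the shortest code words, which is what keeps the binary stream short; a hardware encoder with fixed codebooks would replace the tree construction with a table lookup.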
Quality models for audiovisual streaming

Science.gov (United States)

Thang, Truong Cong; Kim, Young Suk; Kim, Cheon Seog; Ro, Yong Man

2006-01-01

Quality is an essential factor in multimedia communication, especially in compression and adaptation. Quality metrics can be divided into three categories: within-modality quality, cross-modality quality, and multi-modality quality. Most research has so far focused on within-modality quality. Moreover, quality is normally just considered from the perceptual perspective. In practice, content may be drastically adapted, even converted to another modality. In this case, we should consider the quality from the semantic perspective as well. In this work, we investigate the multi-modality quality from the semantic perspective. To model the semantic quality, we apply the concept of "conceptual graph", which consists of semantic nodes and relations between the nodes. As a typical multi-modality example, we focus on audiovisual streaming service. Specifically, we evaluate the amount of information conveyed by audiovisual content where both video and audio channels may be strongly degraded, or the audio may even be converted to text. In the experiments, we also consider the perceptual quality model of audiovisual content, so as to see the difference from the semantic quality model.
Concurrent Unimodal Learning Enhances Multisensory Responses of Bi-Directional Crossmodal Learning in Robotic Audio-Visual Tracking

DEFF Research Database (Denmark)

Shaikh, Danish; Bodenhagen, Leon; Manoonpong, Poramate

2018-01-01

modalities to independently update modality-specific neural weights on a moment-by-moment basis, in response to dynamic changes in noisy sensory stimuli. The circuit is embodied as a non-holonomic robotic agent that must orient towards a moving audio-visual target. The circuit continuously learns the best...
Blindness alters the microstructure of the ventral but not the dorsal visual stream.

Science.gov (United States)

Reislev, Nina L; Kupers, Ron; Siebner, Hartwig R; Ptito, Maurice; Dyrby, Tim B

2016-07-01

Visual deprivation from birth leads to reorganisation of the brain through cross-modal plasticity. Although there is a general agreement that the primary afferent visual pathways are altered in congenitally blind individuals, our knowledge about microstructural changes within the higher-order visual streams, and how this is affected by onset of blindness, remains scant. We used diffusion tensor imaging and tractography to investigate microstructural features in the dorsal (superior longitudinal fasciculus) and ventral (inferior longitudinal and inferior fronto-occipital fasciculi) visual pathways in 12 congenitally blind, 15 late blind and 15 normal sighted controls. We also studied six prematurely born individuals with normal vision to control for the effects of prematurity on brain connectivity. Our data revealed a reduction in fractional anisotropy in the ventral but not the dorsal visual stream for both congenitally and late blind individuals. Prematurely born individuals, with normal vision, did not differ from normal sighted controls, born at term. Our data suggest that although the visual streams are structurally developing without normal visual input from the eyes, blindness selectively affects the microstructure of the ventral visual stream regardless of the time of onset. We suggest that the decreased fractional anisotropy of the ventral stream in the two groups of blind subjects is the combined result of both degenerative and cross-modal compensatory processes, affecting normal white matter development.
Automatic processing of CERN video, audio and photo archives

International Nuclear Information System (INIS)

Kwiatek, M

2008-01-01

The digitalization of CERN audio-visual archives, a major task currently in progress, will generate over 40 TB of video, audio and photo files. Storing these files is one issue, but a far more important challenge is to provide long-time coherence of the archive and to make these files available on-line with minimum manpower investment. An infrastructure, based on standard CERN services, has been implemented, whereby master files, stored in the CERN Distributed File System (DFS), are discovered and scheduled for encoding into lightweight web formats based on predefined profiles. Changes in master files, conversion profiles or in the metadata database (read from CDS, the CERN Document Server) are automatically detected and the media re-encoded whenever necessary. The encoding processes are run on virtual servers provided on-demand by the CERN Server Self Service Centre, so that new servers can be easily configured to adapt to higher load. Finally, the generated files are made available from the CERN standard web servers with streaming implemented using Windows Media Services
Automatic processing of CERN video, audio and photo archives

Energy Technology Data Exchange (ETDEWEB)

Kwiatek, M [CERN, Geneva (Switzerland)], E-mail: Michal.Kwiatek@cern.ch

2008-07-15

The digitalization of CERN audio-visual archives, a major task currently in progress, will generate over 40 TB of video, audio and photo files. Storing these files is one issue, but a far more important challenge is to provide long-time coherence of the archive and to make these files available on-line with minimum manpower investment. An infrastructure, based on standard CERN services, has been implemented, whereby master files, stored in the CERN Distributed File System (DFS), are discovered and scheduled for encoding into lightweight web formats based on predefined profiles. Changes in master files, conversion profiles or in the metadata database (read from CDS, the CERN Document Server) are automatically detected and the media re-encoded whenever necessary. The encoding processes are run on virtual servers provided on-demand by the CERN Server Self Service Centre, so that new servers can be easily configured to adapt to higher load. Finally, the generated files are made available from the CERN standard web servers with streaming implemented using Windows Media Services.
Exploring determinants of early user acceptance for an audio-visual heritage archive service using the vignette method

NARCIS (Netherlands)

Ongena, G.; van de Wijngaert, Lidwien; Huizer, E.

2013-01-01

The purpose of this study is to investigate factors that explain the behavioural intention to use a new audio-visual cultural heritage archive service. An online survey in combination with a factorial survey is utilised to investigate the predictive strength of technological, individual
ATLAS Live: Collaborative Information Streams

CERN Document Server

Goldfarb, S; The ATLAS collaboration

2011-01-01

I report on a pilot project launched in 2010 focusing on facilitating communication and information exchange within the ATLAS Collaboration, through the combination of digital signage software and webcasting. The project, called ATLAS Live, implements video streams of information, ranging from detailed detector and data status to educational and outreach material. The content, including text, images, video and audio, is collected, visualised and scheduled using digital signage software. The system is robust and flexible, utilizing scripts to input data from remote sources, such as the CERN Document Server, Indico, or any available URL, and to integrate these sources into professional-quality streams, including text scrolling, transition effects, inter and intra-screen divisibility. Information is published via the encoding and webcasting of standard video streams, viewable on all common platforms, using a web browser or other common video tool. Authorisation is enforced at the level of the streaming and at th...
Fuzzy logic control for selective hydrogenation of acetylene in ethylene rich streams using visual basic

International Nuclear Information System (INIS)

Malik, S.R.; Suleman, H.; Khan, J.R.

2010-01-01

The presence of acetylene is technically disadvantageous in the ethylene rich gas streams from steam crackers. Acetylene tends to polymerize and inactivates the transition metal catalysts, forming highly explosive compounds. The acetylene content has to be selectively reduced to less than one part per million for such streams. The acetylene hydrogenation unit requires stringent control parameters and needs specialized process control techniques for its operation. This study is concerned with the application of Fuzzy Logic Control to manipulate and control the process plant with higher precision and greater simplicity. The control program has been written in Visual Basic and covers all major scenarios of work modes for successful hydrogenation of acetylene. (author)
Crossmodal Recruitment of the Ventral Visual Stream in Congenital Blindness

Directory of Open Access Journals (Sweden)

Maurice Ptito

2012-01-01

Full Text Available We used functional MRI (fMRI) to test the hypothesis that blind subjects recruit the ventral visual stream during nonhaptic tactile-form recognition. Congenitally blind and blindfolded sighted control subjects were scanned after they had been trained during four consecutive days to perform a tactile-form recognition task with the tongue display unit (TDU). Both groups learned the task at the same rate. In line with our hypothesis, the fMRI data showed that during nonhaptic shape recognition, blind subjects activated large portions of the ventral visual stream, including the cuneus, precuneus, inferotemporal (IT) cortex, lateral occipital tactile vision area (LOtv), and fusiform gyrus. Control subjects activated area LOtv and precuneus but not cuneus, IT and fusiform gyrus. These results indicate that congenitally blind subjects recruit key regions in the ventral visual pathway during nonhaptic tactile shape discrimination. The activation of LOtv by nonhaptic tactile shape processing in blind and sighted subjects adds further support to the notion that this area subserves an abstract or supramodal representation of shape. Together with our previous findings, our data suggest that the segregation of the efferent projections of the primary visual cortex into a dorsal and ventral visual stream is preserved in individuals blind from birth.
106-17 Telemetry Standards Digitized Audio Telemetry Standard Chapter 5

Science.gov (United States)

2017-07-01

Digitized Audio Telemetry Standard 5.1 General This chapter defines continuously variable slope delta (CVSD) modulation as the standard for digitizing...audio signal. The CVSD modulator is, in essence, a 1-bit analog-to-digital converter. The output of this 1-bit encoder is a serial bit stream, where
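To make the "1-bit analog-to-digital converter" description concrete, here is a minimal CVSD sketch. All numeric parameters (`run_len`, `step_min`, `step_max`, `step_gain`, `step_decay`, `leak`) and the names `Cvsd`, `cvsd_encode` and `cvsd_decode` are illustrative assumptions of this sketch, not values or interfaces taken from the standard. The only ideas shown are the general CVSD ones: each output bit says whether the input lies above or below a running estimate, and the step size grows when several consecutive bits agree.

```python
class Cvsd:
    """Predictor state shared by encoder and decoder (illustrative parameters)."""

    def __init__(self, run_len=3, step_min=0.01, step_max=0.5,
                 step_gain=0.05, step_decay=0.98, leak=0.99):
        self.run_len = run_len        # bits that must agree to enlarge the step
        self.step_min = step_min
        self.step_max = step_max
        self.step_gain = step_gain
        self.step_decay = step_decay
        self.leak = leak              # leaky integrator coefficient
        self.estimate = 0.0
        self.step = step_min
        self.recent = []

    def update(self, bit):
        """Advance the estimate by one bit and return the new estimate."""
        self.recent = (self.recent + [bit])[-self.run_len:]
        if len(self.recent) == self.run_len and len(set(self.recent)) == 1:
            # A run of identical bits suggests slope overload: enlarge the step.
            self.step = min(self.step + self.step_gain, self.step_max)
        else:
            self.step = max(self.step * self.step_decay, self.step_min)
        delta = self.step if bit else -self.step
        self.estimate = self.leak * self.estimate + delta
        return self.estimate


def cvsd_encode(samples, **params):
    """Emit one bit per sample: 1 if the sample is at or above the estimate."""
    state = Cvsd(**params)
    bits = []
    for x in samples:
        bit = 1 if x >= state.estimate else 0
        bits.append(bit)
        state.update(bit)
    return bits


def cvsd_decode(bits, **params):
    """Rebuild the staircase approximation from the serial bit stream."""
    state = Cvsd(**params)
    return [state.update(bit) for bit in bits]


bits = cvsd_encode([1.0] * 5)
approx = cvsd_decode(bits)
```

Because encoder and decoder run the same `update` rule on the same bits, the decoder needs no side information beyond the bit stream itself.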
Analysis of sound data streamed over the network

Directory of Open Access Journals (Sweden)

Jiří Fejfar

2013-01-01

Full Text Available In this paper we inspect the difference between an original sound recording and the signal captured after streaming this original recording over a network loaded with heavy traffic. There are several kinds of failures occurring in the captured recording caused by network congestion. We try to find a method to evaluate the correctness of streamed audio. Usually there are metrics based on a human perception of a signal such as “signal is clear, without audible failures”, “signal is having some failures but it is understandable”, or “signal is inarticulate”. These approaches need to be statistically evaluated on a broad set of respondents, which is time and resource consuming. We try to propose some metrics based on signal properties allowing us to compare the original and captured recording. In this paper we use an algorithm called Dynamic Time Warping (Müller, 2007), commonly used for time series comparison. Some other time series exploration approaches can be found in (Fejfar, 2011) and (Fejfar, 2012). The data was acquired in our network laboratory by simulating network traffic: downloading files and streaming audio and video simultaneously. Our former experiment inspected Quality of Service (QoS) and its impact on failures of the received audio data stream. This experiment is focused on the comparison of sound recordings rather than on the network mechanism. We focus, in this paper, on a real time audio stream such as a telephone call, where it is not possible to stream audio in advance to a “pool”. Instead it is necessary to achieve as small a delay as possible (between speaker voice recording and listener voice replay). We use the RTP protocol for streaming audio.
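The comparison technique named in this record, Dynamic Time Warping, can be sketched in a few lines. This is a textbook formulation under assumptions of this sketch: the two recordings are given as plain lists of numbers (for example per-frame energy values), the local cost is the absolute difference, and `dtw_distance` is a hypothetical name. The abstract does not say which features or local cost the authors used.

```python
def dtw_distance(a, b):
    """Accumulated cost of the best monotonic alignment between a and b."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b repeats its last element
                cost[i][j - 1],      # b advances, a repeats its last element
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


original = [1, 2, 3]
delayed = [1, 2, 2, 3]    # same shape, one value held longer
degraded = [1, 0, 0, 3]   # values lost in transit
```

A recording that is only stretched in time aligns at zero cost, while missing or corrupted segments raise the distance, which is the property that makes the measure a candidate for signal-based comparison of streamed audio.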
Effects of virtual speaker density and room reverberation on spatiotemporal thresholds of audio-visual motion coherence.

Directory of Open Access Journals (Sweden)

Narayan Sankaran

Full Text Available The present study examined the effects of spatial sound-source density and reverberation on the spatiotemporal window for audio-visual motion coherence. Three different acoustic stimuli were generated in Virtual Auditory Space: two acoustically "dry" stimuli via the measurement of anechoic head-related impulse responses recorded at either 1° or 5° spatial intervals (Experiment 1), and a reverberant stimulus rendered from binaural room impulse responses recorded at 5° intervals in situ in order to capture reverberant acoustics in addition to head-related cues (Experiment 2). A moving visual stimulus with invariant localization cues was generated by sequentially activating LEDs along the same radial path as the virtual auditory motion. Stimuli were presented at 25°/s, 50°/s and 100°/s with a random spatial offset between audition and vision. In a 2AFC task, subjects made a judgment of the leading modality (auditory or visual). No significant differences were observed in the spatial threshold based on the point of subjective equivalence (PSE) or the slope of psychometric functions (β) across all three acoustic conditions. Additionally, both the PSE and β did not significantly differ across velocity, suggesting a fixed spatial window of audio-visual separation. Findings suggest that there was no loss in spatial information accompanying the reduction in spatial cues and reverberation levels tested, and establish a perceptual measure for assessing the veracity of motion generated from discrete locations and in echoic environments.
Marketing engagement through visual content

Directory of Open Access Journals (Sweden)

Marius MANIC

2015-12-01

Full Text Available Engaging visuals are a must in the modern marketing world. Wide access to mass communication devices, with extended visual enhancements, has made visual content an important point of interest for any publisher, on all media channels. The decreasing costs and huge variety of types are premises for an easy and effective marketing investment, with strong benefits for any company and its brands. Loyal customers are achieved and kept through visual content; the lack of it in the general marketing
A Network and Visual Quality Aware N-Screen Content Recommender System Using Joint Matrix Factorization

Directory of Open Access Journals (Sweden)

Farman Ullah

2014-01-01

Full Text Available We propose a network and visual quality aware N-Screen content recommender system. N-Screen provides more ways than ever before to access multimedia content through multiple devices and heterogeneous access networks. The heterogeneity of devices and access networks presents new questions of QoS (quality of service) in the realm of user experience with content. We propose a recommender system that ensures a better visual quality on the user’s N-Screen devices and the efficient utilization of available access network bandwidth in line with user preferences. The proposed system estimates the available bandwidth and visual quality on the user's N-Screen devices and integrates them with the user's preferences and content genre information to personalize their N-Screen content. The objective is to recommend content that the user’s N-Screen device and access network are capable of displaying and streaming in line with the user's preferences, which has not been supported in existing systems. Furthermore, we suggest a joint matrix factorization approach to jointly factorize the users' rating matrix with the users' N-Screen device similarity and program genre similarity. Finally, the experimental results show that we also improve prediction and recommendation accuracy and mitigate sparsity and cold-start issues.
A network and visual quality aware N-screen content recommender system using joint matrix factorization.

Science.gov (United States)

Ullah, Farman; Sarwar, Ghulam; Lee, Sungchang

2014-01-01

We propose a network and visual quality aware N-Screen content recommender system. N-Screen provides more ways than ever before to access multimedia content through multiple devices and heterogeneous access networks. The heterogeneity of devices and access networks presents new questions of QoS (quality of service) in the realm of user experience with content. We propose a recommender system that ensures better visual quality on the user's N-Screen devices and efficient utilization of the available access network bandwidth, in line with user preferences. The proposed system estimates the available bandwidth and visual quality on the user's N-Screen devices and integrates them with the user's preferences and content genre information to personalize the user's N-Screen content. The objective is to recommend content that the user's N-Screen device and access network are capable of displaying and streaming, matched to the user's preferences, which has not been supported in existing systems. Furthermore, we suggest a joint matrix factorization approach to jointly factorize the users' rating matrix with the users' N-Screen device similarity and program genre similarity. Finally, the experimental results show that we also improve prediction and recommendation accuracy and mitigate the sparsity and cold-start issues.
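The joint factorization idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract gives no update rules, so the code assumes a plain stochastic-gradient scheme in which a rating matrix R and a user-to-user device similarity matrix S share the same user factors U. The function name `joint_mf`, the weight `alpha`, and all hyperparameters are hypothetical. Program genre similarity could be attached to the item factors in the same way.

```python
import random


def joint_mf(R, S, rank=2, lr=0.01, reg=0.02, alpha=0.5, epochs=300, seed=0):
    """Jointly factorize R (users x items) and S (users x users).

    R[u][i] is a rating, or None when unobserved.
    S[u][v] is an assumed device-similarity score in [0, 1].
    Both approximations share the user factors U:
        R ~ U V^T   and   S ~ U Z^T
    Returns (U, V, Z, history); history holds the squared rating error
    measured after each epoch.
    """
    rng = random.Random(seed)
    n_users, n_items = len(R), len(R[0])
    U = [[rng.uniform(0.1, 1.0) for _ in range(rank)] for _ in range(n_users)]
    V = [[rng.uniform(0.1, 1.0) for _ in range(rank)] for _ in range(n_items)]
    Z = [[rng.uniform(0.1, 1.0) for _ in range(rank)] for _ in range(n_users)]

    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    history = []
    for _ in range(epochs):
        for u in range(n_users):
            # Rating term: update U[u] and V[i] on observed entries only.
            for i in range(n_items):
                if R[u][i] is None:
                    continue
                err = R[u][i] - dot(U[u], V[i])
                for k in range(rank):
                    uk, vk = U[u][k], V[i][k]
                    U[u][k] += lr * (err * vk - reg * uk)
                    V[i][k] += lr * (err * uk - reg * vk)
            # Similarity term: the same U[u] must also explain S[u][v].
            for v in range(n_users):
                err = S[u][v] - dot(U[u], Z[v])
                for k in range(rank):
                    uk, zk = U[u][k], Z[v][k]
                    U[u][k] += lr * alpha * (err * zk - reg * uk)
                    Z[v][k] += lr * alpha * (err * uk - reg * zk)
        loss = sum(
            (R[u][i] - dot(U[u], V[i])) ** 2
            for u in range(n_users)
            for i in range(n_items)
            if R[u][i] is not None
        )
        history.append(loss)
    return U, V, Z, history
```

Because U is updated by both terms, users with similar devices are pulled toward similar factors, which is one plausible way side information can help when a user has few ratings.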
Semantic Context Detection Using Audio Event Fusion

Directory of Open Access Journals (Sweden)

Cheng Wen-Huang

2006-01-01

Full Text Available Semantic-level content analysis is a crucial issue in achieving efficient content retrieval and management. We propose a hierarchical approach that models audio events over a time series in order to accomplish semantic context detection. Two levels of modeling, audio event and semantic context modeling, are devised to bridge the gap between physical audio features and semantic concepts. In this work, hidden Markov models (HMMs) are used to model four representative audio events, that is, gunshot, explosion, engine, and car braking, in action movies. At the semantic context level, generative (ergodic hidden Markov model) and discriminative (support vector machine (SVM)) approaches are investigated to fuse the characteristics and correlations among audio events, which provide cues for detecting gunplay and car-chasing scenes. The experimental results demonstrate the effectiveness of the proposed approaches and provide a preliminary framework for information mining by using audio characteristics.
ATLAS Live: Collaborative Information Streams

Energy Technology Data Exchange (ETDEWEB)

Goldfarb, Steven [Department of Physics, University of Michigan, Ann Arbor, MI 48109 (United States); Collaboration: ATLAS Collaboration

2011-12-23

I report on a pilot project launched in 2010 focusing on facilitating communication and information exchange within the ATLAS Collaboration, through the combination of digital signage software and webcasting. The project, called ATLAS Live, implements video streams of information, ranging from detailed detector and data status to educational and outreach material. The content, including text, images, video and audio, is collected, visualised and scheduled using digital signage software. The system is robust and flexible, utilizing scripts to input data from remote sources, such as the CERN Document Server, Indico, or any available URL, and to integrate these sources into professional-quality streams, including text scrolling, transition effects, inter and intra-screen divisibility. Information is published via the encoding and webcasting of standard video streams, viewable on all common platforms, using a web browser or other common video tool. Authorisation is enforced at the level of the streaming and at the web portals, using the CERN SSO system.

ATLAS Live: Collaborative Information Streams

International Nuclear Information System (INIS)

Goldfarb, Steven

2011-01-01

I report on a pilot project launched in 2010 focusing on facilitating communication and information exchange within the ATLAS Collaboration, through the combination of digital signage software and webcasting. The project, called ATLAS Live, implements video streams of information, ranging from detailed detector and data status to educational and outreach material. The content, including text, images, video and audio, is collected, visualised and scheduled using digital signage software. The system is robust and flexible, utilizing scripts to input data from remote sources, such as the CERN Document Server, Indico, or any available URL, and to integrate these sources into professional-quality streams, including text scrolling, transition effects, inter and intra-screen divisibility. Information is published via the encoding and webcasting of standard video streams, viewable on all common platforms, using a web browser or other common video tool. Authorisation is enforced at the level of the streaming and at the web portals, using the CERN SSO system.
Real-time analytics techniques to analyze and visualize streaming data

CERN Document Server

Ellis, Byron

2014-01-01

Construct a robust end-to-end solution for analyzing and visualizing streaming data. Real-time analytics is the hottest topic in data analytics today. In Real-Time Analytics: Techniques to Analyze and Visualize Streaming Data, expert Byron Ellis teaches data analysts technologies to build an effective real-time analytics platform. This platform can then be used to make sense of the constantly changing data that is beginning to outpace traditional batch-based analysis platforms. The author is among a very few leading experts in the field. He has a prestigious background in research, development,
Distortion Estimation in Compressed Music Using Only Audio Fingerprints

NARCIS (Netherlands)

Doets, P.J.O.; Lagendijk, R.L.

2008-01-01

An audio fingerprint is a compact yet very robust representation of the perceptually relevant parts of an audio signal. It can be used for content-based audio identification, even when the audio is severely distorted. Audio compression changes the fingerprint slightly. We show that these small
Literary Genres in Social Life: A Narrative, Audio-visual and Poetic Approach

Directory of Open Access Journals (Sweden)

Luis Felipe González Gutiérrez

2008-05-01

Full Text Available The proposal, "Literary Genres in Social Life: a Narrative, Audio-visual and Poetic Approach", aims to present to the academic psychology community and related social science disciplines the main contributions of literary genre theory, through a social constructionist understanding of narrations and daily stories and by means of an interactive construction of narrative collage. This work, supported by an investigation financed by the University Santo Tomás in Bogota, Colombia, "Understanding of structuralist literary theories in the development of the narrative 'I' within the social constructionist approach", proposes alternative spaces for presenting its research results through the expression of metaphors, visual narrative sequences and interactive artistic forms, which invite the spectator to share in and understand important concepts in the consolidation of social forms of construction of the quotidian. URN: urn:nbn:de:0114-fqs0802373
Simulator study of the effect of visual-motion time delays on pilot tracking performance with an audio side task

Science.gov (United States)

Riley, D. R.; Miller, G. K., Jr.

1978-01-01

The effect of time delays in the visual and motion cues of a flight simulator on pilot performance was determined for the task of tracking a target aircraft that was oscillating sinusoidally in altitude only. An audio side task was used to ensure the subject was fully occupied at all times. The results indicate that, within the test grid employed, about the same acceptable time delay (250 msec) was obtained for a single aircraft (fighter type) by each of two subjects for both fixed-base and motion-base conditions. Acceptable time delay is defined as the largest amount of delay that can be inserted simultaneously into the visual and motion cues before performance degradation occurs. A statistical analysis of the data was made to establish this value of time delay. The audio side task provided quantitative data that documented the subject's work level.
Integration of Audio Visual Multimedia for Special Education Pre-Service Teachers' Self Reflections in Developing Teaching Competencies

Science.gov (United States)

Sediyani, Tri; Yufiarti; Hadi, Eko

2017-01-01

This study aims to develop a model of learning by integrating multimedia and audio-visual self-reflective learners. This multimedia was developed as a tool for prospective teachers as learners in the education of children with special needs to reflect on their teaching competencies before entering the world of education. Research methods to…
Designing Promotion Strategy of Malang Raya’s Tourism Destination Branding through Audio Visual Media

Directory of Open Access Journals (Sweden)

Chanira Nuansa

2014-04-01

Full Text Available This study examines how well the concept of destination branding fits existing models of Malang tourism promotion. The research is qualitative, taking its data directly from existing promotional models for Malang, namely information portal sites, blogs, social networking, and video on the Internet. A SWOT analysis is used to find the strengths, weaknesses, opportunities, and threats of the existing tourism promotion models. The data are analyzed against indicators of the destination branding concept. The results of the analysis serve as a basis for designing solutions for Malang tourism promotion through a new integrated tourism advertising model. The analysis found that video is the most suitable medium for promoting Malang tourism in the form of advertisements. Videos can present facts more completely and objectively in audio-visual form, making it easier for viewers to associate their thoughts with the destination. Moreover, video on Malang tourism that is conceptualized as an advertisement is still rare. This is an opportunity, because the audio-visual advertising models produced by this study are expected to serve as an example for the parties concerned when conceptualizing future Malang tourism advertising. Keywords: Advertise, SWOT Analysis, Malang City, tourism promotion
Advanced content delivery, streaming, and cloud services

CERN Document Server

Sitaraman, Ramesh Kumar; Robinson, Dom

2014-01-01

While other books on the market provide limited coverage of advanced CDNs and streaming technologies, concentrating solely on the fundamentals, this book provides up-to-date, comprehensive coverage of state-of-the-art advancements in CDNs, with a special focus on Cloud-based CDNs. The book includes CDN and media streaming basics, performance models, practical applications, and business analysis. It features industry case studies, CDN applications, and open research issues to aid practitioners and researchers, and a market analysis to provide a reference point for commercial entities. The book covers Adaptive Bitrate Streaming (ABR), Content Delivery Cloud (CDC), Web Acceleration, Front End Optimization (FEO), Transparent Caching, Next Generation CDNs, CDN Business Intelligence and more.
STEGANOGRAPHY USAGE TO CONTROL MULTIMEDIA STREAM

Directory of Open Access Journals (Sweden)

Grzegorz Koziel

2014-03-01

Full Text Available This paper proposes a new application for steganography: using steganographic techniques to control multimedia stream playback. Special control markers can be embedded in the sound signal; the player detects the markers and modifies the playback parameters according to the hidden instructions. This solution allows user preferences to be stored within the audio track and allows various versions of the same content to be prepared at the production level.
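To make the idea of hidden control markers concrete, here is a minimal sketch under stated assumptions. The abstract does not say which embedding technique or marker format the paper uses; the code below swaps in the simplest well-known option, least-significant-bit (LSB) embedding in 16-bit PCM samples, with a hypothetical one-byte command. The names `embed_marker` and `extract_marker` and the meaning of any command value are illustrative only.

```python
MARKER_BITS = 8  # assumed marker format: one control byte, most significant bit first


def embed_marker(samples, command, offset=0):
    """Return a copy of `samples` with `command` (0-255) hidden in the LSBs.

    `samples` is a sequence of signed 16-bit PCM values. Each of the eight
    samples starting at `offset` carries one bit of the command, so every
    sample changes by at most 1.
    """
    if not 0 <= command <= 255:
        raise ValueError("command must fit in one byte")
    if offset < 0 or offset + MARKER_BITS > len(samples):
        raise ValueError("not enough samples for the marker")
    out = list(samples)
    for bit_index in range(MARKER_BITS):
        bit = (command >> (MARKER_BITS - 1 - bit_index)) & 1
        # Clear the least significant bit, then set it to the command bit.
        out[offset + bit_index] = (out[offset + bit_index] & ~1) | bit
    return out


def extract_marker(samples, offset=0):
    """Read back the control byte hidden by embed_marker."""
    command = 0
    for bit_index in range(MARKER_BITS):
        command = (command << 1) | (samples[offset + bit_index] & 1)
    return command
```

A player built on this sketch would call `extract_marker` at agreed positions in the stream and map the byte to an action such as a volume change. Plain LSB embedding does not survive lossy compression, so a real system would need a more robust scheme.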
ATLAS Live: Collaborative Information Streams

CERN Document Server

Goldfarb, S; The ATLAS collaboration

2010-01-01

I report on a pilot project launched in 2010 focusing on facilitating communication and information exchange within the ATLAS Collaboration, through the combination of digital signage software and webcasting. The project, called ATLAS Live, implements video streams of information, ranging from detailed detector and data status to educational and outreach material. The content, including text, images, video and audio, is collected, visualised and scheduled using the SCALA digital signage software system. The system is robust and flexible, allowing for the usage of scripts to input data from remote sources, such as the CERN Document Server, Indico, or any available URL, and to integrate these sources into professional-quality streams, including text scrolling, transition effects, inter and intrascreen divisibility. The video is made available to the collaboration or public through the encoding and webcasting of standard video streams, viewable on all common platforms, using a web browser or other common video t...
Use of Effective Audio in E-learning Courseware

OpenAIRE

Ray, Kisor

2015-01-01

E-learning uses electronic media and information and communication technologies to provide education to the masses. E-learning delivers hypertext, text, audio, images, animation and video through standalone desktop computers, local area network based intranets and internet-based content. While producing e-learning content or courseware, a major decision-making factor is whether to use audio for the benefit of the end users. Generally, three types of audio can be used in e-learning: narration, mus...
The Effect of Visual Cueing and Control Design on Children's Reading Achievement of Audio E-Books with Tablet Computers

Science.gov (United States)

Wang, Pei-Yu; Huang, Chung-Kai

2015-01-01

This study aims to explore the impact of learner grade, visual cueing, and control design on children's reading achievement of audio e-books with tablet computers. This research was a three-way factorial design where the first factor was learner grade (grade four and six), the second factor was e-book visual cueing (word-based, line-based, and…
The Use of Audio-Visual Media to Improve Learning Outcomes for the Cartwheel (Meroda) in Floor Gymnastics, Grade VIII, SMP Negeri 13 Semarang, 2013/2014

Directory of Open Access Journals (Sweden)

Sigit Budi Prastyyo

2015-01-01

Full Text Available The purpose of this study was to determine whether the use of audio-visual media as a teaching aid in school physical education improves learning outcomes for the cartwheel (meroda) in floor gymnastics among eighth-grade students of SMP Negeri 13 Semarang. The study was classroom action research (CAR) conducted in two cycles of action. Data were collected through documentation, observation, and testing, and analyzed descriptively by examining student learning outcomes after the action. The results show that using audio-visual media to teach the cartwheel in floor gymnastics can improve the learning outcomes of eighth-grade students at SMP Negeri 13 Semarang in 2013/2014. This is evidenced by the increase in learning outcome scores from one cycle to the next: the average test score was 70.51 in the first cycle and 84.72 in the second. Classical completeness was 54.84% in the first cycle and 90.32% in the second. From these results it can be concluded that teaching the cartwheel in floor gymnastics with audio-visual media can improve the learning outcomes of students of SMP Negeri 13 Semarang.
Functional Imaging of Audio-Visual Selective Attention in Monkeys and Humans: How do Lapses in Monkey Performance Affect Cross-Species Correspondences?

Science.gov (United States)

Rinne, Teemu; Muers, Ross S; Salo, Emma; Slater, Heather; Petkov, Christopher I

2017-06-01

The cross-species correspondences and differences in how attention modulates brain responses in humans and animal models are poorly understood. We trained 2 monkeys to perform an audio-visual selective attention task during functional magnetic resonance imaging (fMRI), rewarding them to attend to stimuli in one modality while ignoring those in the other. Monkey fMRI identified regions strongly modulated by auditory or visual attention. Surprisingly, auditory attention-related modulations were much more restricted in monkeys than humans performing the same tasks during fMRI. Further analyses ruled out trivial explanations, suggesting that labile selective-attention performance was associated with inhomogeneous modulations in wide cortical regions in the monkeys. The findings provide initial insights into how audio-visual selective attention modulates the primate brain, identify sources for "lost" attention effects in monkeys, and carry implications for modeling the neurobiology of human cognition with nonhuman animals. © The Author 2017. Published by Oxford University Press.
Content congruency and its interplay with temporal synchrony modulate integration between rhythmic audiovisual streams

Directory of Open Access Journals (Sweden)

Yi-Huang eSu

2014-12-01

Full Text Available Both lower-level stimulus factors (e.g., temporal proximity) and higher-level cognitive factors (e.g., content congruency) are known to influence multisensory integration. The former can direct attention in a converging manner, and the latter can indicate whether information from the two modalities belongs together. The present research investigated whether and how these two factors interacted in the perception of rhythmic, audiovisual streams derived from a human movement scenario. Congruency here was based on sensorimotor correspondence pertaining to rhythm perception. Participants attended to bimodal stimuli consisting of a humanlike figure moving regularly to a sequence of auditory beat, and detected a possible auditory temporal deviant. The figure moved either downwards (congruently) or upwards (incongruently) to the downbeat, while in both situations the movement was either synchronous with the beat, or lagging behind it. Greater cross-modal binding was expected to hinder deviant detection. Results revealed poorer detection for congruent than for incongruent streams, suggesting stronger integration in the former. False alarms increased in asynchronous stimuli only for congruent streams, indicating greater tendency for deviant report due to visual capture of asynchronous auditory events. In addition, a greater increase in perceived synchrony was associated with a greater reduction in false alarms for congruent streams, while the pattern was reversed for incongruent ones. These results demonstrate that content congruency as a top-down factor not only promotes integration, but also modulates bottom-up effects of synchrony. Results are also discussed regarding how theories of integration and attentional entrainment may be combined in the context of rhythmic multisensory stimuli.
Content congruency and its interplay with temporal synchrony modulate integration between rhythmic audiovisual streams.

Science.gov (United States)

Su, Yi-Huang

2014-01-01

Both lower-level stimulus factors (e.g., temporal proximity) and higher-level cognitive factors (e.g., content congruency) are known to influence multisensory integration. The former can direct attention in a converging manner, and the latter can indicate whether information from the two modalities belongs together. The present research investigated whether and how these two factors interacted in the perception of rhythmic, audiovisual (AV) streams derived from a human movement scenario. Congruency here was based on sensorimotor correspondence pertaining to rhythm perception. Participants attended to bimodal stimuli consisting of a humanlike figure moving regularly to a sequence of auditory beat, and detected a possible auditory temporal deviant. The figure moved either downwards (congruently) or upwards (incongruently) to the downbeat, while in both situations the movement was either synchronous with the beat, or lagging behind it. Greater cross-modal binding was expected to hinder deviant detection. Results revealed poorer detection for congruent than for incongruent streams, suggesting stronger integration in the former. False alarms increased in asynchronous stimuli only for congruent streams, indicating greater tendency for deviant report due to visual capture of asynchronous auditory events. In addition, a greater increase in perceived synchrony was associated with a greater reduction in false alarms for congruent streams, while the pattern was reversed for incongruent ones. These results demonstrate that content congruency as a top-down factor not only promotes integration, but also modulates bottom-up effects of synchrony. Results are also discussed regarding how theories of integration and attentional entrainment may be combined in the context of rhythmic multisensory stimuli.
The effect of visual information on verbal communication process in remote conversation

OpenAIRE

國田, 祥子; 中條, 和光

2005-01-01

This article examined how visual information affects verbal communication process in remote communication. In the experiment twenty pairs of subjects performed a collaborative task remotely via video and audio links or audio link only. During the task used in this experiment one of a pair (an instruction-giver) gave direction with a map to the other of the pair (an instruction-receiver). We recorded and analyzed contents of utterances. Consequently, the existence of visual information did not...
Semantic Analysis of Multimedial Information Using Both Audio and Visual Clues

Directory of Open Access Journals (Sweden)

Andrej Lukac

2008-01-01

Full Text Available Nowadays, databases hold a great deal of information (in text, audio/video form, etc.). It is important to be able to describe these data so that users can navigate them more easily. This requires audio/video properties, which are used for metadata management, segmenting a document into semantically meaningful units, classifying each unit into a predefined scene type, indexing, and summarizing the document for efficient retrieval and browsing. The data can be used by systems that automatically search for a specific person in a sequence, as well as for special video sequences. Audio/video properties are represented by descriptors and description schemes. Many features can be used to characterize multimedia signals. Audio and video sequences can be analyzed jointly or considered completely separately. Our aim is to explore the possibilities of combining multimedia features. The focus is on discussion programs, because they offer more options for combining audio features with video sequences.
Rhythmic synchronization tapping to an audio-visual metronome in budgerigars.

Science.gov (United States)

Hasegawa, Ai; Okanoya, Kazuo; Hasegawa, Toshikazu; Seki, Yoshimasa

2011-01-01

In all ages and countries, music and dance have constituted a central part in human culture and communication. Recently, vocal-learning animals such as parrots and elephants have been found to share rhythmic ability with humans. Thus, we investigated the rhythmic synchronization of budgerigars, a vocal-mimicking parrot species, under controlled conditions and a systematically designed experimental paradigm as a first step in understanding the evolution of musical entrainment. We trained eight budgerigars to perform isochronous tapping tasks in which they pecked a key to the rhythm of audio-visual metronome-like stimuli. The budgerigars showed evidence of entrainment to external stimuli over a wide range of tempos. They seemed to be inherently inclined to tap at fast tempos, which have a similar time scale to the rhythm of budgerigars' natural vocalizations. We suggest that vocal learning might have contributed to their performance, which resembled that of humans.
Application of the Treffinger Learning Model Assisted by Audio-Visual Media to Improve Activity and Learning Outcomes in Integrated Science among Grade VII Students of SMP Frater Makassar

Directory of Open Access Journals (Sweden)

Nur Indah Sari

2016-08-01

Full Text Available This study is classroom action research that aims to improve student activity and learning outcomes in Integrated Science by applying the Treffinger learning model assisted by audio-visual media, on ecosystem material, for Class VII students of SMP Frater Makassar. Data were collected by observing student learning activity and by an evaluation at the end of each cycle. The collected data were analyzed using descriptive statistical analysis, supplemented with frequency and percentage tables. The learning activities produced an increase in student learning outcomes, from 14 students (37.83%) in cycle I to 32 students (86.48%) in cycle II. Student learning activity also increased: students' enthusiasm in following Integrated Science lessons was 50.15% in cycle I and rose to 80.05% in cycle II. The results indicate that applying the Treffinger learning model with audio-visual media can improve Integrated Science learning outcomes on ecosystem material among Class VII A students of SMP Frater Makassar. Keywords: Treffinger learning model, learning outcomes, integrated science.

Knitting Relational Documentary Networks: The Database Meta-Documentary Filming Revolution as a paradigm of bringing interactive audio-visual archives alive

NARCIS (Netherlands)

Wiehl, Anna

2016-01-01

One phenomenon in the emerging field of digital documentary is experimentation with rhizomatic interfaces and database logics to bring audio-visual archives 'alive'. A paradigm of this is Filming Revolution (2015), an interactive platform which gathers and interlinks films of the uprisings in
Securing Digital Audio using Complex Quadratic Map

Science.gov (United States)

Suryadi, MT; Satria Gunawan, Tjandra; Satria, Yudi

2018-03-01

In this digital era, exchanging data is common and easy, which makes the data vulnerable to attack and manipulation by unauthorized parties. One data type that is vulnerable to attack is digital audio, so a method of securing data is needed that is both fast and resistant to attack. One method that meets these criteria is securing the data using a chaos function. The chaos function used in this research is the complex quadratic map (CQM). For some parameter values, the key stream generated by the CQM function passes all 15 NIST tests, which means that the key stream generated using this CQM is proven to be random. In addition, samples of encrypted digital sound are shown by a goodness-of-fit test to be uniformly distributed, so securing digital audio using this method is not vulnerable to frequency analysis attack. The key space is very large, about 8.1 × 10^31 possible keys, and the key sensitivity is very fine, about 10^-10, so this method is also not vulnerable to brute-force attack. Finally, the processing speed for both encryption and decryption is on average about 450 times faster than the duration of the digital audio.
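The general shape of a chaos-based stream cipher can be sketched as follows. This is an illustration under stated assumptions, not the scheme from the paper: the abstract gives neither the map parameters nor the rule for turning iterates into key bytes. The sketch iterates the quadratic map z → z² + c with the assumed parameter c = -2 and a real seed in [-2, 2], where the orbit stays on the interval [-2, 2] and behaves chaotically. The byte-extraction rule, the burn-in length, and the names `cqm_keystream` and `xor_audio` are all hypothetical, and the construction has not been vetted for cryptographic strength.

```python
def cqm_keystream(seed, length, c=complex(-2.0, 0.0), burn_in=100):
    """Yield `length` key bytes from the orbit of z -> z*z + c.

    `seed` is the secret key: a real starting value in [-2, 2].
    With the assumed c = -2 a real orbit never leaves [-2, 2].
    """
    if not -2.0 <= seed <= 2.0:
        raise ValueError("seed must lie in [-2, 2]")
    z = complex(seed, 0.0)
    for _ in range(burn_in):  # discard the first iterates
        z = z * z + c
    for _ in range(length):
        z = z * z + c
        # Assumed extraction rule: low-order digits of the scaled iterate.
        yield int(abs(z.real) * 1e6) % 256


def xor_audio(data, seed, burn_in=100):
    """Encrypt or decrypt raw audio bytes by XOR with the key stream."""
    stream = cqm_keystream(seed, len(data), burn_in=burn_in)
    return bytes(b ^ k for b, k in zip(data, stream))
```

XOR is its own inverse, so applying `xor_audio` twice with the same seed restores the original bytes. One pass over the data with a few arithmetic operations per byte is consistent with the kind of speed the abstract reports, although the figures there refer to the authors' implementation.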
A system for the semantic multimodal analysis of news audio-visual content

NARCIS (Netherlands)

Mezaris, Vasileios; Gidaros, Spyros; Papadopoulos, Georgios Th.; Kasper, Walter; Ordelman, Roeland J.F.; Steffen, Jörg; Huijbregts, M.A.H.; de Jong, Franciska M.G.; Kompatsiaris, Ioannis; Strintzis, Michael G.

News-related content is nowadays among the most popular types of content for users in everyday applications. Although the generation and distribution of news content has become commonplace, due to the availability of inexpensive media capturing devices and the development of media sharing services
Design guidelines for audio presentation of graphs and tables

OpenAIRE

Brown, L.M.; Brewster, S.A.; Ramloll, S.A.; Burton, R.; Riedel, B.

2003-01-01

Audio can be used to make visualisations accessible to blind and visually impaired people. The MultiVis Project has carried out research into suitable methods for presenting graphs and tables to blind people through the use of both speech and non-speech audio. This paper presents guidelines extracted from this research. These guidelines will enable designers to implement visualisation systems for blind and visually impaired users, and will provide a framework for researchers wishing to invest...
An interactive audio-visual installation using ubiquitous hardware and web-based software deployment

Directory of Open Access Journals (Sweden)

Tiago Fernandes Tavares

2015-05-01

Full Text Available This paper describes an interactive audio-visual musical installation, namely MOTUS, that aims at being deployed using low-cost hardware and software. This was achieved by writing the software as a web application and using only hardware that is built into most modern personal computers. This scenario implies specific technical restrictions, which lead to solutions combining both technical and artistic aspects of the installation. The resulting system is versatile and can be freely used from any computer with Internet access. Spontaneous feedback from the audience has shown that the provided experience is interesting and engaging, despite the use of minimal hardware.
Documentary management of sport audio-visual information in generalist television

Directory of Open Access Journals (Sweden)

Jorge Caldera Serrano

2005-01-01

Full Text Available The management of sports audio-visual information is analyzed within the framework of the Documentary Information Systems of national, regional and local television channels. To this end, the documentary chain that sports audio-visual information passes through is traced in order to analyze each of its parameters, yielding a series of recommendations and standards for preparing the sports audio-visual record. Sports audio-visual documentation evidently does not differ greatly from the analysis of other types of television documents, so its management and dissemination are examined in greater depth and breadth, showing the information flow within the System.
MP3 audio-editing software for the department of radiology

International Nuclear Information System (INIS)

Hong Qingfen; Sun Canhui; Li Ziping; Meng Quanfei; Jiang Li

2006-01-01

Objective: To evaluate the MP3 audio-editing software in the daily work in the department of radiology. Methods: The audio content of the daily consultation seminar, held in the department of radiology every morning, was recorded and converted into MP3 audio format by a computer-integrated recording device. The audio data were edited, archived, and eventually saved on computer storage media, from which they were experimentally replayed and used in research and teaching. Results: MP3 audio-editing was a simple process and convenient for saving and searching the data. The record could be easily replayed. Conclusion: MP3 audio-editing perfectly records and saves the contents of the consultation seminar, and has replaced conventional handwritten notes. It is a valuable tool in both research and teaching in the department. (authors)
A comparative evaluation of oral hygiene using Braille and audio instructions among institutionalized visually impaired children aged between 6 years and 20 years: A 3-month follow-up study

OpenAIRE

Mahantesha, Taranatha; Nara, Asha; Kumari, Parveen Reddy; Halemani, Praveen Kumar Nugadoni; Buddiga, Vinutna; Mythri, Sarpangala

2015-01-01

Aim: The aim of this study is to compare the oral hygiene status among institutionalized visually impaired children of age between 6 and 20 years given with Braille and audio instructions in Raichur city of Karnataka. Materials and Methods: A total of 50 children aged between 6 to 20 years were included in this study from a residential school for visually impaired children. These children were randomly divided into two equal groups. One group was given oral hygiene instructions by audio recor...
Applying Spatial Audio to Human Interfaces: 25 Years of NASA Experience

Science.gov (United States)

Begault, Durand R.; Wenzel, Elizabeth M.; Godfrey, Martine; Miller, Joel D.; Anderson, Mark R.

2010-01-01

From the perspective of human factors engineering, the inclusion of spatial audio within a human-machine interface is advantageous from several perspectives. Demonstrated benefits include the ability to monitor multiple streams of speech and non-speech warning tones using a cocktail party advantage, and for aurally-guided visual search. Other potential benefits include the spatial coordination and interaction of multimodal events, and evaluation of new communication technologies and alerting systems using virtual simulation. Many of these technologies were developed at NASA Ames Research Center, beginning in 1985. This paper reviews examples and describes the advantages of spatial sound in NASA-related technologies, including space operations, aeronautics, and search and rescue. The work has involved hardware and software development as well as basic and applied research.
Streaming Media Seminar--Effective Development and Distribution of Streaming Multimedia in Education

Science.gov (United States)

Mainhart, Robert; Gerraughty, James; Anderson, Kristine M.

2004-01-01

Concisely defined, "streaming media" is moving video and/or audio transmitted over the Internet for immediate viewing/listening by an end user. However, at Saint Francis University's Center of Excellence for Remote and Medically Under-Served Areas (CERMUSA), streaming media is approached from a broader perspective. The working definition includes…
Pitch contour impairment in congenital amusia: New insights from the Self-paced Audio-visual Contour Task (SACT).

Science.gov (United States)

Lu, Xuejing; Sun, Yanan; Ho, Hao Tam; Thompson, William Forde

2017-01-01

Individuals with congenital amusia usually exhibit impairments in melodic contour processing when asked to compare pairs of melodies that may or may not be identical to one another. However, it is unclear whether the impairment observed in contour processing is caused by an impairment of pitch discrimination, or is a consequence of poor pitch memory. To help resolve this ambiguity, we designed a novel Self-paced Audio-visual Contour Task (SACT) that evaluates sensitivity to contour while placing minimal burden on memory. In this task, participants control the pace of an auditory contour that is simultaneously accompanied by a visual contour, and they are asked to judge whether the two contours are congruent or incongruent. In Experiment 1, melodic contours varying in pitch were presented with a series of dots that varied in spatial height. Amusics exhibited reduced sensitivity to audio-visual congruency in comparison to control participants. To exclude the possibility that the impairment arises from a general deficit in cross-modal mapping, Experiment 2 examined sensitivity to cross-modal mapping for two other auditory dimensions: timbral brightness and loudness. Amusics and controls were significantly more sensitive to large than small contour changes, and to changes in loudness than changes in timbre. However, there were no group differences in cross-modal mapping, suggesting that individuals with congenital amusia can comprehend spatial representations of acoustic information. Taken together, the findings indicate that pitch contour processing in congenital amusia remains impaired even when pitch memory is relatively unburdened.
Audio-visual aid in teaching "fatty liver".

Science.gov (United States)

Dash, Sambit; Kamath, Ullas; Rao, Guruprasad; Prakash, Jay; Mishra, Snigdha

2016-05-06

Use of audio-visual tools to aid in medical education is ever on the rise. Our study intends to find the efficacy of a video prepared on "fatty liver," a topic that is often a challenge for pre-clinical teachers, in enhancing cognitive processing and ultimately learning. We prepared a video presentation of 11:36 min, incorporating various concepts of the topic, while keeping in view Mayer's and Ellaway's guidelines for multimedia presentation. A pre-post test study on subject knowledge was conducted for 100 students with the video shown as intervention. A retrospective pre-study was conducted as a survey which inquired about students' understanding of the key concepts of the topic, and feedback on our video was taken. Students performed significantly better in the post-test (mean score 8.52 vs. 5.45 in pre-test), responded positively in the retrospective pre-test and gave positive feedback on our video presentation. Well-designed multimedia tools can aid in cognitive processing and enhance working memory capacity, as shown in our study. In times when "smart" device penetration is high, information and communication tools in medical education, which can act as an essential aid and not as a replacement for traditional curricula, can be beneficial to students. © 2015 by The International Union of Biochemistry and Molecular Biology, 44:241-245, 2016.
No double-dissociation between optic ataxia and visual agnosia: Multiple sub-streams for multiple visuo-manual integrations

NARCIS (Netherlands)

Pisella, L.; Binkofski, F.; Lasek, K.; Toni, I.; Rossetti, Y.

2006-01-01

The current dominant view of the visual system is marked by the functional and anatomical dissociation between a ventral stream specialised for perception and a dorsal stream specialised for action. The "double-dissociation" between visual agnosia (VA), a deficit of visual recognition, and optic
Imagination and Modern Audio Visual Form

Directory of Open Access Journals (Sweden)

Ana Đurković

2017-09-01

Full Text Available The three episodes of Archetype of modern fairy tales, the mysterious world of fantasy and reality, tell a serious story about archetypes, symbols, and the knowledge of good and evil. RTS editor: Natasa Neskovic. Written and directed by: Suncica Jergovic. Editing: Ana Djurkovic. How can we illuminate the concept of fantasy and the affective factors in our imagination when the subject is, a priori and by its genetic provenance, something as imaginary as a movie scene or a digital picture and sound? One cannot always avoid the association with the familiar phrase of arnhajm's truth: mass age, massage: the medium is the message. In the elementary and terse definition of "the shot" in Plaževsky's film language there is the term "le cadre"; these are, however, selected bits of reality, an immanent frame that contains the individual act of the image divided from the continent's view of reality, carrying a specific code of semantic value and, when it is imaginative, governed of course by aesthetic categories and evaluations. In this type of positive simulacrum, there can be no better material for current thinking about the limits of imagination and truth in contemporary media and the contemporary global environment than original audio-visual forms, through whose prism we examine the fairy tale as both myth and imagination and explore its overall impact on the personality. Everything can be a fairy tale, even the false, amoral platitudes politicized by political lobbies in contemporary power systems, but in these there is no fairy-tale authenticity, no creative act, and none of the humanity and the artificial and historical entity of man that is always present in the ethical effort of a true artist. We therefore investigate the conditions of creative images and the modalities of audiovisual media in film language, and in particular the archetype of the fairy tale, which, with its psychodynamics, still exists and which is removed when the modern man is tired of lies and simulations during his global
Contribution of Prosody in Audio-Visual Integration to Emotional Perception of Virtual Characters

Directory of Open Access Journals (Sweden)

Ekaterina Volkova

2011-10-01

Full Text Available Recent technology provides us with realistic looking virtual characters. Motion capture and elaborate mathematical models supply data for natural looking, controllable facial and bodily animations. With the help of computational linguistics and artificial intelligence, we can automatically assign emotional categories to appropriate stretches of text for a simulation of those social scenarios where verbal communication is important. All this makes virtual characters a valuable tool for creation of versatile stimuli for research on the integration of emotion information from different modalities. We conducted an audio-visual experiment to investigate the differential contributions of emotional speech and facial expressions on emotion identification. We used recorded and synthesized speech as well as dynamic virtual faces, all enhanced for seven emotional categories. The participants were asked to recognize the prevalent emotion of paired faces and audio. Results showed that when the voice was recorded, the vocalized emotion influenced participants' emotion identification more than the facial expression. However, when the voice was synthesized, facial expression influenced participants' emotion identification more than vocalized emotion. Additionally, individuals did worse on identifying either the facial expression or vocalized emotion when the voice was synthesized. Our experimental method can help to determine how to improve synthesized emotional speech.
Television and the Internet: The Role Digital Technologies Play in Adolescents’ Audio-Visual Media Consumption. Young Television Audiences in Catalonia (Spain)

Directory of Open Access Journals (Sweden)

Meritxell Roca

2014-03-01

Full Text Available The aim of this reported study was to investigate adolescents’ TV consumption habits and perceptions. Although there appears to be no general consensus on how the Internet affects TV consumption by teenagers, and data vary depending on the country, according to our study, Spanish adolescents perceive television as a habit “of the past” and find the computer a device more suited to their recreational and audio-visual consumption needs. The data obtained from eight focus groups of teenagers aged between 12 and 18 and an online survey sent to their parents show that watching TV is an activity usually linked to the home’s communal spaces. In contrast, online audio-visual consumption (understood as a wider term not limited to just TV shows) is perceived by adolescents as a more convenient activity as it adapts to their own schedules and needs.
Matisse: A Visual Analytics System for Exploring Emotion Trends in Social Media Text Streams

Energy Technology Data Exchange (ETDEWEB)

Steed, Chad A [ORNL; Drouhard, Margaret MEG G [ORNL; Beaver, Justin M [ORNL; Pyle, Joshua M [ORNL; BogenII, Paul L. [Google Inc.

2015-01-01

Dynamically mining textual information streams to gain real-time situational awareness is especially challenging with social media systems where throughput and velocity properties push the limits of a static analytical approach. In this paper, we describe an interactive visual analytics system, called Matisse, that aids with the discovery and investigation of trends in streaming text. Matisse addresses the challenges inherent to text stream mining through the following technical contributions: (1) robust stream data management, (2) automated sentiment/emotion analytics, (3) interactive coordinated visualizations, and (4) a flexible drill-down interaction scheme that accesses multiple levels of detail. In addition to positive/negative sentiment prediction, Matisse provides fine-grained emotion classification based on Valence, Arousal, and Dominance dimensions and a novel machine learning process. Information from the sentiment/emotion analytics is fused with raw data and summary information to feed temporal, geospatial, term frequency, and scatterplot visualizations using a multi-scale, coordinated interaction model. After describing these techniques, we conclude with a practical case study focused on analyzing the Twitter sample stream during the week of the 2013 Boston Marathon bombings. The case study demonstrates the effectiveness of Matisse at providing guided situational awareness of significant trends in social media streams by orchestrating computational power and human cognition.
About subjective evaluation of adaptive video streaming

Science.gov (United States)

Tavakoli, Samira; Brunnström, Kjell; Garcia, Narciso

2015-03-01

The usage of HTTP Adaptive Streaming (HAS) technology by content providers is increasing rapidly. With video content available in multiple qualities, HAS makes it possible to adapt the quality of the downloaded video to current network conditions, providing smooth video playback. However, the time-varying video quality by itself introduces a new type of impairment. The quality adaptation can be done in different ways. In order to find the adaptation strategy that maximizes users' perceptual quality, it is necessary to investigate the subjective perception of adaptation-related impairments. However, the novelty of these impairments and their comparatively long duration make most standardized assessment methodologies less suited to studying HAS degradation. Furthermore, in traditional testing methodologies, the quality of the video in audiovisual services is often evaluated separately and not in the presence of audio. Jointly evaluating audio and video within a subjective test nevertheless remains a relatively under-explored research field. In this work, we address the research question of determining the appropriate assessment methodology to evaluate sequences with time-varying quality due to adaptation. This was done by studying the influence of different adaptation-related parameters through two different subjective experiments using a methodology developed to evaluate long test sequences. In order to study the impact of audio presence on quality assessment by the test subjects, one of the experiments was done in the presence of audio stimuli. The experimental results were subsequently compared with another experiment using the standardized single-stimulus Absolute Category Rating (ACR) methodology.
Learning the lay-up shoot using basic lay-up shoot audio-visual media to improve lay-up shoot learning outcomes of class VIIIA students at SMP Kanisius Pati in 2013/2014
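The adaptation idea in this abstract, switching between pre-encoded quality levels as network conditions change, can be illustrated with a minimal sketch. The bitrate ladder, the safety margin, and the function names `select_bitrate` and `simulate` are hypothetical and are not taken from the paper; deployed HAS clients typically combine throughput estimates with buffer state and other signals.

```python
from typing import Iterable, List, Sequence

# Hypothetical bitrate ladder in kbit/s (an assumption, not from the paper).
LADDER_KBPS = (600, 1000, 3000, 5000)


def select_bitrate(throughput_kbps: float,
                   ladder: Sequence[int] = LADDER_KBPS,
                   safety: float = 0.8) -> int:
    """Pick the highest rung that fits within a fraction of the measured throughput."""
    budget = throughput_kbps * safety
    candidates = [rate for rate in ladder if rate <= budget]
    # Fall back to the lowest rung when even that exceeds the budget.
    return max(candidates) if candidates else min(ladder)


def simulate(throughputs: Iterable[float]) -> List[int]:
    """Return the rung chosen for each successive throughput measurement."""
    return [select_bitrate(t) for t in throughputs]
```

A sequence of choices produced by `simulate` is exactly the kind of time-varying quality pattern whose perceptual impact a subjective test would need to rate.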

Directory of Open Access Journals (Sweden)

Frendy Nurochwan Febryanto

2015-01-01

Full Text Available The purpose of this study was to determine whether teaching the lay-up shoot using basic lay-up shoot audio-visual media can improve lay-up shoot learning outcomes in class VIIIA of SMP Kanisius Pati in 2013/2014. This study uses Classroom Action Research (CAR). Data were collected through observation and assessment of basketball lay-up shoot learning outcomes. The data analysis technique used in this research is descriptive. At the end of the first cycle, teacher activity in teaching basic lay-up shoot techniques using audio-visual media reached 76.19%, whereas student activity during the lay-up shoot learning process using audio-visual media reached 78.57%. At the end of the second cycle, teacher activity in teaching basic lay-up shoot techniques using audio-visual media reached 85.71%, whereas student activity during the learning process reached 92.86%. Based on the results of the study, it can be concluded that teaching the lay-up shoot using basic lay-up shoot audio-visual media can improve student learning outcomes in class VIIIA of SMP Kanisius Pati in 2013/2014.
Audio pacemaker : Walking, talking indigenous knowledge

CSIR Research Space (South Africa)

Bidwell, NJ

2012-10-01

Impact of oral health education by audio aids, braille and tactile models on the oral health status of visually impaired children of Bhopal City

OpenAIRE

Anjali Gautam; Ajay Bhambal; Swapnil Moghe

2018-01-01

Context: Children with special needs face unique challenges in day-to-day practice. They are dependent on their close ones for everything. To improve oral hygiene in such visually impaired children, undue training and education are required. Braille is an important language for reading and writing for the visually impaired. It helps them understand and visualize the world via touch. Audio aids are being used to impart health education to the visually impaired. Tactile models help them perceiv...
Horatio Audio-Describes Shakespeare's "Hamlet": Blind and Low-Vision Theatre-Goers Evaluate an Unconventional Audio Description Strategy

Science.gov (United States)

Udo, J. P.; Acevedo, B.; Fels, D. I.

2010-01-01

Audio description (AD) has been introduced as one solution for providing people who are blind or have low vision with access to live theatre, film and television content. However, there is little research to inform the process, user preferences and presentation style. We present a study of a single live audio-described performance of Hart House…
A Low-Cost Audio Prescription Labeling System Using RFID for Thai Visually-Impaired People.

Science.gov (United States)

Lertwiriyaprapa, Titipong; Fakkheow, Pirapong

2015-01-01

This research aims to develop a low-cost audio prescription labeling (APL) system for visually-impaired people by using the RFID system. The developed APL system includes the APL machine and APL software. The APL machine is for visually-impaired people while APL software allows caregivers to record all important information into the APL machine. The main objective of the development of the APL machine is to reduce costs and size by designing all of the electronic devices to fit onto one printed circuit board. Also, it is designed so that it is easy to use and can become an electronic aid for daily living. The developed APL software is based on Java and MySQL, both of which can operate on various operating platforms and are easy to develop as commercial software. The developed APL system was first evaluated by 5 experts. The APL system was also evaluated by 50 actual visually-impaired people (30 elders and 20 blind individuals) and 20 caregivers, pharmacists and nurses. After using the APL system, evaluations were carried out, and it can be concluded from the evaluation results that this proposed APL system can be effectively used for helping visually-impaired people in terms of self-medication.
Pitch contour impairment in congenital amusia: New insights from the Self-paced Audio-visual Contour Task (SACT).

Directory of Open Access Journals (Sweden)

Xuejing Lu

Full Text Available Individuals with congenital amusia usually exhibit impairments in melodic contour processing when asked to compare pairs of melodies that may or may not be identical to one another. However, it is unclear whether the impairment observed in contour processing is caused by an impairment of pitch discrimination, or is a consequence of poor pitch memory. To help resolve this ambiguity, we designed a novel Self-paced Audio-visual Contour Task (SACT) that evaluates sensitivity to contour while placing minimal burden on memory. In this task, participants control the pace of an auditory contour that is simultaneously accompanied by a visual contour, and they are asked to judge whether the two contours are congruent or incongruent. In Experiment 1, melodic contours varying in pitch were presented with a series of dots that varied in spatial height. Amusics exhibited reduced sensitivity to audio-visual congruency in comparison to control participants. To exclude the possibility that the impairment arises from a general deficit in cross-modal mapping, Experiment 2 examined sensitivity to cross-modal mapping for two other auditory dimensions: timbral brightness and loudness. Amusics and controls were significantly more sensitive to large than small contour changes, and to changes in loudness than changes in timbre. However, there were no group differences in cross-modal mapping, suggesting that individuals with congenital amusia can comprehend spatial representations of acoustic information. Taken together, the findings indicate that pitch contour processing in congenital amusia remains impaired even when pitch memory is relatively unburdened.
Bayesian networks and information theory for audio-visual perception modeling.

Science.gov (United States)

Besson, Patricia; Richiardi, Jonas; Bourdin, Christophe; Bringoux, Lionel; Mestre, Daniel R; Vercher, Jean-Louis

2010-09-01

Thanks to their different senses, human observers acquire multiple pieces of information from their environment. Complex cross-modal interactions occur during this perceptual process. This article proposes a framework to analyze and model these interactions through a rigorous and systematic data-driven process. This requires considering the general relationships between the physical events or factors involved in the process, not only in quantitative terms, but also in terms of the influence of one factor on another. We use tools from information theory and probabilistic reasoning to derive relationships between the random variables of interest, where the central notion is that of conditional independence. Using mutual information analysis to guide the model elicitation process, a probabilistic causal model encoded as a Bayesian network is obtained. We exemplify the method by using data collected in an audio-visual localization task for human subjects, and we show that it yields a well-motivated model with good predictive ability. The model elicitation process offers new prospects for the investigation of the cognitive mechanisms of multisensory perception.
Design of an Educational Streaming Radio (Case Study: Balai Pengembangan Media Radio Yogyakarta)
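The role that mutual information and conditional independence play in this abstract can be made concrete with a small sketch. The estimators below are plain empirical (plug-in) estimates for discrete variables; the function names are hypothetical, and the paper's actual estimation and model-elicitation procedure is not described in the abstract and is not reproduced here.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Empirical mutual information I(X;Y) in bits for two discrete sequences."""
    n = len(xs)
    pxy = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    mi = 0.0
    for (x, y), c in pxy.items():
        # p(x,y) * log2( p(x,y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * math.log2(c * n / (px[x] * py[y]))
    return mi


def conditional_mutual_information(xs, ys, zs):
    """Empirical I(X;Y|Z): average of I(X;Y) within each value of Z, weighted by p(z)."""
    n = len(zs)
    total = 0.0
    for z, cz in Counter(zs).items():
        idx = [i for i in range(n) if zs[i] == z]
        total += (cz / n) * mutual_information([xs[i] for i in idx],
                                               [ys[i] for i in idx])
    return total
```

When X and Y are both determined by a common factor Z, `mutual_information(xs, ys)` is large while `conditional_mutual_information(xs, ys, zs)` drops to zero; a pattern like this suggests that a network structure need not link X and Y directly once Z is included.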

OpenAIRE

Nurwulan, Ayu Isni; Paputungan, Irving Vitra

2009-01-01

Quality education should rightly be equally accessible to everyone. Audio-based learning media delivered to date still have many limitations, especially in their geographic reach. This paper proposes an alternative form of audio-based educational media, called streaming radio. Building a streaming radio service requires extensive analysis so that its design is sound. The results of the analysis and design presented in this paper...
Adapting models of visual aesthetics for personalized content creation

DEFF Research Database (Denmark)

Liapis, Antonios; Yannakakis, Georgios N.; Togelius, Julian

2012-01-01

This paper introduces a search-based approach to personalized content generation with respect to visual aesthetics. The approach is based on a two-step adaptation procedure where (1) the evaluation function that characterizes the content is adjusted to match the visual aesthetics of users and (2......) the content itself is optimized based on the personalized evaluation function. To test the efficacy of the approach we design fitness functions based on universal properties of visual perception, inspired by psychological and neurobiological research. Using these visual properties we generate aesthetically...... spaceships according to their visual taste: the impact of the various visual properties is adjusted based on player preferences and new content is generated online based on the updated computational model of visual aesthetics of the player. Results are presented which show the potential of the approach...
Audio-Visual and Autogenic Relaxation Alter Amplitude of Alpha EEG Band, Causing Improvements in Mental Work Performance in Athletes.

Science.gov (United States)

Mikicin, Mirosław; Kowalczyk, Marek

2015-09-01

The aim of the present study was to investigate the effect of regular audio-visual relaxation combined with Schultz's autogenic training on: (1) the results of behavioral tests that evaluate work performance during burdensome cognitive tasks (Kraepelin test), (2) changes in classical EEG alpha frequency band, neocortex (frontal, temporal, occipital, parietal), hemisphere (left, right) versus condition (only relaxation 7-12 Hz). Both the experimental group (EG) and the age- and skill-matched control group (CG) consisted of eighteen athletes (ten males and eight females). After 7 months of training, EG demonstrated changes in the amplitude of mean electrical activity of the EEG alpha band at rest and a significant improvement in almost all components of the Kraepelin test. The same examined variables in CG were unchanged following the period without the intervention. Summing up, combining audio-visual relaxation with autogenic training significantly improves athletes' ability to perform a prolonged mental effort. These changes are accompanied by greater amplitude of waves in the alpha band in the relaxed state. The results suggest the usefulness of relaxation techniques during performance of mentally difficult sports tasks (sports based on speed and stamina, sports games, combat sports) and during relaxation of athletes.
Investigating category- and shape-selective neural processing in ventral and dorsal visual stream under interocular suppression.

Science.gov (United States)

Ludwig, Karin; Kathmann, Norbert; Sterzer, Philipp; Hesselmann, Guido

2015-01-01

Recent behavioral and neuroimaging studies using continuous flash suppression (CFS) have suggested that action-related processing in the dorsal visual stream might be independent of perceptual awareness, in line with the "vision-for-perception" versus "vision-for-action" distinction of the influential dual-stream theory. It remains controversial if evidence suggesting exclusive dorsal stream processing of tool stimuli under CFS can be explained by their elongated shape alone or by action-relevant category representations in dorsal visual cortex. To approach this question, we investigated category- and shape-selective functional magnetic resonance imaging-blood-oxygen level-dependent responses in both visual streams using images of faces and tools. Multivariate pattern analysis showed enhanced decoding of elongated relative to non-elongated tools, both in the ventral and dorsal visual stream. The second aim of our study was to investigate whether the depth of interocular suppression might differentially affect processing in dorsal and ventral areas. However, parametric modulation of suppression depth by varying the CFS mask contrast did not yield any evidence for differential modulation of category-selective activity. Together, our data provide evidence for shape-selective processing under CFS in both dorsal and ventral stream areas and, therefore, do not support the notion that dorsal "vision-for-action" processing is exclusively preserved under interocular suppression. © 2014 Wiley Periodicals, Inc.
High End Visualization of Geophysical Datasets Using Immersive Technology: The SIO Visualization Center.

Science.gov (United States)

Newman, R. L.

2002-12-01

How many images can you display at one time with Power Point without getting "postage stamps"? Do you have fantastic datasets that you cannot view because your computer is too slow/small? Do you assume a few 2-D images of a 3-D picture are sufficient? High-end visualization centers can minimize and often eliminate these problems. The new visualization center [http://siovizcenter.ucsd.edu] at Scripps Institution of Oceanography [SIO] immerses users into a virtual world by projecting 3-D images onto a Panoram GVR-120E wall-sized floor-to-ceiling curved screen [7' x 23'] that has 3.2 mega-pixels of resolution. The Infinite Reality graphics subsystem is driven by a single-pipe SGI Onyx 3400 with a system bandwidth of 44 Gbps. The Onyx is powered by 16 MIPS R12K processors and 16 GB of addressable memory. The system is also equipped with transmitters and LCD shutter glasses which permit stereographic 3-D viewing of high-resolution images. This center is ideal for groups of up to 60 people who can simultaneously view these large-format images. A wide range of hardware and software is available, giving the users a totally immersive working environment in which to display, analyze, and discuss large datasets. The system enables simultaneous display of video and audio streams from sources such as SGI megadesktop and stereo megadesktop, S-VHS video, DVD video, and video from a Macintosh or PC. For instance, one-third of the screen might be displaying S-VHS video from a remotely-operated-vehicle [ROV], while the remaining portion of the screen might be used for an interactive 3-D flight over the same parcel of seafloor. The video and audio combinations using this system are numerous, allowing users to combine and explore data and images in innovative ways, greatly enhancing scientists' ability to visualize, understand and collaborate on complex datasets. In the not-distant future, with the rapid growth in networking speeds in the US, it will be possible for Earth Sciences
Teach Yourself VISUALLY iPad

CERN Document Server

Watson, Lonzell

2010-01-01

An ideal, visual guide for the image-driven iPad. Whether your interests veer towards movies, games, books, or music—the iPad is the computing device for dazzling graphics, crisp and clear audio, and effortless portability. If ever there existed a device that demanded a reading companion for the visual learner, it's the iPad—and this resource is perfectly suited for the visual audience. Veteran VISUAL author Lonzell Watson walks you through all the features unique to the iPad and shows you how to download books, apps, music, and video content, as well as send photos and e-mails. Plus, you'll d
Automatic Detection and Classification of Audio Events for Road Surveillance Applications

Directory of Open Access Journals (Sweden)

Noor Almaadeed

2018-06-01

Full Text Available This work investigates the problem of detecting hazardous events on roads by designing an audio surveillance system that automatically detects perilous situations such as car crashes and tire skidding. In recent years, several visual surveillance systems have been proposed for road monitoring to detect accidents, with the aim of improving safety procedures in emergency cases. However, visual information alone cannot detect certain events such as car crashes and tire skidding, especially under adverse and visually cluttered weather conditions such as snowfall, rain, and fog. Consequently, the incorporation of microphones and audio event detectors based on audio processing can significantly enhance the detection accuracy of such surveillance systems. This paper proposes to combine time-domain, frequency-domain, and joint time-frequency features extracted from a class of quadratic time-frequency distributions (QTFDs) to detect events on roads through audio analysis and processing. Experiments were carried out using a publicly available dataset. The experimental results confirm the effectiveness of the proposed approach for detecting hazardous events on roads, as demonstrated by a 7% improvement in accuracy rate when compared against methods that use individual temporal and spectral features.
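To give a feel for what combining temporal and spectral descriptors means in practice, the sketch below computes two common time-domain features and one frequency-domain feature for a single audio frame and concatenates them. This is an illustration under stated assumptions only: the specific features, the naive DFT, and all function names are hypothetical, and the QTFD-based joint time-frequency features proposed in the paper are not implemented here.

```python
import cmath
import math


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ (time domain)."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def short_time_energy(frame):
    """Mean squared amplitude of the frame (time domain)."""
    return sum(s * s for s in frame) / len(frame)


def spectral_centroid(frame, sample_rate):
    """Magnitude-weighted mean frequency in Hz, via a naive O(n^2) DFT."""
    n = len(frame)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        mags.append(abs(acc))
    total = sum(mags)
    if total == 0:
        return 0.0
    return sum(k * sample_rate / n * m for k, m in enumerate(mags)) / total


def feature_vector(frame, sample_rate):
    """Concatenate temporal and spectral descriptors into one vector."""
    return [zero_crossing_rate(frame),
            short_time_energy(frame),
            spectral_centroid(frame, sample_rate)]
```

A classifier trained on such concatenated vectors sees both kinds of evidence at once, which is the general motivation for feature combination; the naive DFT is only practical for short frames and would normally be replaced by an FFT routine.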
A reference web architecture and patterns for real-time visual analytics on large streaming data

Science.gov (United States)

Kandogan, Eser; Soroker, Danny; Rohall, Steven; Bak, Peter; van Ham, Frank; Lu, Jie; Ship, Harold-Jeffrey; Wang, Chun-Fu; Lai, Jennifer

2013-12-01

Monitoring and analysis of streaming data, such as social media, sensors, and news feeds, has become increasingly important for business and government. The volume and velocity of incoming data are key challenges. To effectively support monitoring and analysis, statistical and visual analytics techniques need to be seamlessly integrated; analytic techniques for a variety of data types (e.g., text, numerical) and scope (e.g., incremental, rolling-window, global) must be properly accommodated; interaction, collaboration, and coordination among several visualizations must be supported in an efficient manner; and the system should support the use of different analytics techniques in a pluggable manner. Especially in web-based environments, these requirements pose restrictions on the basic visual analytics architecture for streaming data. In this paper we report on our experience of building a reference web architecture for real-time visual analytics of streaming data, identify and discuss architectural patterns that address these challenges, and report on applying the reference architecture for real-time Twitter monitoring and analysis.
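The three analytic scopes named in this abstract (incremental, rolling-window, global) can be illustrated with a minimal sketch. It is not the architecture described in the paper; the class name `RollingWindowCounter` and its methods are hypothetical, and it only counts timestamped events held in memory.

```python
from collections import deque


class RollingWindowCounter:
    """Counts events in the most recent time window (hypothetical helper)."""

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self.timestamps = deque()  # events currently inside the window
        self.total_seen = 0        # global scope: every event ever observed

    def add(self, timestamp):
        """Incremental update: record one event, then evict expired ones."""
        self.timestamps.append(timestamp)
        self.total_seen += 1
        cutoff = timestamp - self.window_seconds
        # Timestamps are assumed to arrive in non-decreasing order.
        while self.timestamps and self.timestamps[0] <= cutoff:
            self.timestamps.popleft()

    def count(self):
        """Rolling-window scope: number of events still inside the window."""
        return len(self.timestamps)


counter = RollingWindowCounter(window_seconds=10)
for t in [0, 3, 9, 12, 25]:
    counter.add(t)
print(counter.count(), counter.total_seen)
```

Keeping the per-event update cheap and evicting old events lazily is one common way to serve both rolling-window and global summaries from the same stream; a production system would also need to handle out-of-order arrivals.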
Auditory, Visual and Audiovisual Speech Processing Streams in Superior Temporal Sulcus.

Science.gov (United States)

Venezia, Jonathan H; Vaden, Kenneth I; Rong, Feng; Maddox, Dale; Saberi, Kourosh; Hickok, Gregory

2017-01-01

The human superior temporal sulcus (STS) is responsive to visual and auditory information, including sounds and facial cues during speech recognition. We investigated the functional organization of STS with respect to modality-specific and multimodal speech representations. Twenty younger adult participants were instructed to perform an oddball detection task and were presented with auditory, visual, and audiovisual speech stimuli, as well as auditory and visual nonspeech control stimuli in a block fMRI design. Consistent with a hypothesized anterior-posterior processing gradient in STS, auditory, visual and audiovisual stimuli produced the largest BOLD effects in anterior, posterior and middle STS (mSTS), respectively, based on whole-brain, linear mixed effects and principal component analyses. Notably, the mSTS exhibited preferential responses to multisensory stimulation, as well as speech compared to nonspeech. Within the mid-posterior and mSTS regions, response preferences changed gradually from visual, to multisensory, to auditory moving posterior to anterior. Post hoc analysis of visual regions in the posterior STS revealed that a single subregion bordering the mSTS was insensitive to differences in low-level motion kinematics yet distinguished between visual speech and nonspeech based on multi-voxel activation patterns. These results suggest that auditory and visual speech representations are elaborated gradually within anterior and posterior processing streams, respectively, and may be integrated within the mSTS, which is sensitive to more abstract speech information within and across presentation modalities. The spatial organization of STS is consistent with processing streams that are hypothesized to synthesize perceptual speech representations from sensory signals that provide convergent information from visual and auditory modalities.
Real-time decreased sensitivity to an audio-visual illusion during goal-directed reaching.

Directory of Open Access Journals (Sweden)

Luc Tremblay

Full Text Available In humans, sensory afferences are combined and integrated by the central nervous system (Ernst MO, Bülthoff HH (2004) Trends Cogn. Sci. 8: 162-169) and appear to provide a holistic representation of the environment. Empirical studies have repeatedly shown that vision dominates the other senses, especially for tasks with spatial demands. In contrast, it has also been observed that sound can strongly alter the perception of visual events. For example, when presented with 2 flashes and 1 beep in a very brief period of time, humans often report seeing 1 flash (i.e. the fusion illusion; Andersen TS, Tiippana K, Sams M (2004) Brain Res. Cogn. Brain Res. 21: 301-308). However, it is not known how an unfolding movement modulates the contribution of vision to perception. Here, we used the audio-visual illusion to demonstrate that goal-directed movements can alter visual information processing in real-time. Specifically, the fusion illusion was linearly reduced as a function of limb velocity. These results suggest that cue combination and integration can be modulated in real-time by goal-directed behaviors, perhaps through sensory gating (Chapman CE, Beauchamp E (2006) J. Neurophysiol. 96: 1664-1675) and/or altered sensory noise (Ernst MO, Bülthoff HH (2004) Trends Cogn. Sci. 8: 162-169) during limb movements.
Streamer Motives and User-Generated Content on Social Live-Streaming Services

Directory of Open Access Journals (Sweden)

Friedlander, Mathilde B.

2017-03-01

Full Text Available Three of the most popular information services, Periscope, Ustream, and YouNow, taken as representative of all Social Live-Streaming Services (SLSSs), are investigated to analyze their streamers' motivations and the user-generated content. Additionally, we collected demographic data (gender and age). More than 7,500 streams by users from the U.S., Germany, and Japan were observed. Main streamer motivations on SLSSs are boredom, socializing, the need to reach a specific group, the need to communicate, and fun. Important content categories on all three SLSSs are chatting, sharing information, 24/7, and 'slice of life.' We were able to identify differences between users from the U.S., Germany, and Japan as well as between the users of Periscope, Ustream, and YouNow. The main motive to stream in the U.S. is to reach a specific group, while in Japan it is socializing, and in Germany boredom. The top content category for both YouNow and Periscope is chatting; on Ustream it is 24/7 (i.e., webcams).
A Preliminary Investigation into the Search Behaviour of Users in a Collection of Digitized Broadcast Audio

DEFF Research Database (Denmark)

Lund, Haakon; Skov, Mette; Larsen, Birger

2014-01-01

An increasing number of large digitized audio-visual collections within digital humanities have recently been made available for users. Often access to digitized audio-visual collections is hampered by little and inconsistent metadata. This paper presents the preliminary findings from a study of ...
Audio-visual assistance in co-creating transition knowledge

Science.gov (United States)

Hezel, Bernd; Broschkowski, Ephraim; Kropp, Jürgen P.

2013-04-01

Earth system and climate impact research results point to the tremendous ecologic, economic and societal implications of climate change. Specifically, people will have to adopt lifestyles that are very different from those they currently strive for in order to mitigate severe changes of our known environment. It will most likely not suffice to transfer the scientific findings into international agreements and appropriate legislation. A transition rather relies on pioneers that define new role models, on change agents that mainstream the concept of sufficiency and on narratives that make different futures appealing. In order for the research community to be able to provide sustainable transition pathways that are viable, an integration of the physical constraints and the societal dynamics is needed. Hence the necessary transition knowledge is to be co-created by social and natural science and society. To this end, the Climate Media Factory, in itself a massively transdisciplinary venture, strives to provide an audio-visual connection between the different scientific cultures and a bi-directional link to stakeholders and society. Since the methodology, specialized language and knowledge level of those involved differ, we develop new entertaining formats on the basis of a "complexity on demand" approach. They present scientific information in an integrated and entertaining way with different levels of detail that provide entry points to users with different requirements. Two examples shall illustrate the advantages and restrictions of the approach.
Training of audio descriptors: the cinematographic aesthetics as basis for the learning of the audio description aesthetics – materials, methods and products

Directory of Open Access Journals (Sweden)

Soraya Ferreira Alves

2016-12-01

Full Text Available Audio description (AD), a resource used to make theater, cinema, TV, and visual works of art accessible to people with visual impairments, is slowly being implemented in Brazil and demands qualified professionals. Against this background, this article reports the results of research carried out during post-doctoral studies. The study confronts film aesthetics with audio description techniques to check how knowledge of the former can contribute to audio describer training. Through action research, a short film was produced, adapted from the short story O Peru de Natal (Christmas Turkey) by the Brazilian writer Mario de Andrade. The film and its audio description were produced with students and teachers from the discipline Intersemiotic Translation at the State University of Ceará. We thus intended to suggest pedagogical procedures generated by the students' experiences by evaluating their choices and their implications.
Bringing Legacy Visualization Software to Modern Computing Devices via Application Streaming

Science.gov (United States)

Fisher, Ward

2014-05-01

Planning software compatibility across forthcoming generations of computing platforms is a problem commonly encountered in software engineering and development. While this problem can affect any class of software, data analysis and visualization programs are particularly vulnerable. This is due in part to their inherent dependency on specialized hardware and computing environments. A number of strategies and tools have been designed to aid software engineers with this task. While generally embraced by developers at 'traditional' software companies, these methodologies are often dismissed by the scientific software community as unwieldy, inefficient and unnecessary. As a result, many important and storied scientific software packages can struggle to adapt to a new computing environment; for example, one in which much work is carried out on sub-laptop devices (such as tablets and smartphones). Rewriting these packages for a new platform often requires significant investment in terms of development time and developer expertise. In many cases, porting older software to modern devices is neither practical nor possible. As a result, replacement software must be developed from scratch, wasting resources better spent on other projects. Enabled largely by the rapid rise and adoption of cloud computing platforms, 'Application Streaming' technologies allow legacy visualization and analysis software to be operated wholly from a client device (be it laptop, tablet or smartphone) while retaining full functionality and interactivity. It mitigates much of the developer effort required by other more traditional methods while simultaneously reducing the time it takes to bring the software to a new platform. This work will provide an overview of Application Streaming and how it compares against other technologies which allow scientific visualization software to be executed from a remote computer. We will discuss the functionality and limitations of existing application streaming

Design and analysis of ultrasonic monaural audio guiding device for the visually impaired.

Science.gov (United States)

Kim, Keonwook; Kim, Hyunjai; Yun, Gihun; Kim, Myungsoo

2009-01-01

The novel ultrasound-based Audio Guiding Device (AGD), named SonicID, has been developed to localize points of interest for the visually impaired. The SonicID requires an infrastructure of transmitters that broadcast location information over an ultrasonic carrier. A user wearing an ultrasonic headset receives the information with an amplitude that varies with the user's location and direction, owing to the ultrasonic characteristics and the modulation method. This paper proposes a monaural headset form factor for the SonicID, which improves the daily life of the beneficiary compared to the previous version, which uses both ears. Experimental results from SonicID, Bluetooth, and audible sound show that the SonicID demonstrates localization performance comparable to audible sound while remaining silent to others.
Identification and annotation of erotic film based on content analysis

Science.gov (United States)

Wang, Donghui; Zhu, Miaoliang; Yuan, Xin; Qian, Hui

2005-02-01

The paper puts forward a new method for identifying and annotating erotic films based on content analysis. First, the film is decomposed into video and audio streams. Then, the video stream is segmented into shots and key frames are extracted from each shot. We filter the shots that include potential erotic content by finding the nude human body in key frames. A Gaussian model in YCbCr color space for detecting skin regions is presented. An external polygon that covers the skin regions is used to approximate the human body. Last, we give the degree of nudity by calculating the ratio of skin area to whole body area with weighted parameters. The result of the experiment shows the effectiveness of our method.
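The skin-detection step described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the mean, covariance and threshold values are hypothetical placeholders, the colour conversion uses the full-range BT.601 (JFIF) coefficients, and the ratio is taken over all supplied pixels rather than over a body polygon with weighted parameters.

```python
import math

# Hypothetical model parameters (NOT taken from the paper). In practice the
# mean and covariance of (Cb, Cr) would be fitted to labeled skin pixels.
SKIN_MEAN = (105.0, 150.0)
SKIN_COV = ((100.0, 0.0), (0.0, 100.0))
LIKELIHOOD_THRESHOLD = 0.5


def rgb_to_cbcr(r, g, b):
    """Convert 8-bit RGB to chrominance (Cb, Cr), full-range BT.601."""
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return cb, cr


def skin_likelihood(r, g, b):
    """Unnormalized 2D Gaussian likelihood of a pixel being skin."""
    cb, cr = rgb_to_cbcr(r, g, b)
    dx, dy = cb - SKIN_MEAN[0], cr - SKIN_MEAN[1]
    (s11, s12), (s21, s22) = SKIN_COV
    det = s11 * s22 - s12 * s21
    # Squared Mahalanobis distance using the closed-form 2x2 inverse.
    d2 = (dx * (s22 * dx - s12 * dy) + dy * (s11 * dy - s21 * dx)) / det
    return math.exp(-0.5 * d2)


def skin_ratio(pixels):
    """Fraction of pixels classified as skin in a region of a key frame."""
    if not pixels:
        return 0.0
    skin = sum(1 for p in pixels if skin_likelihood(*p) >= LIKELIHOOD_THRESHOLD)
    return skin / len(pixels)


region = [(220, 170, 140), (0, 0, 255), (255, 255, 255), (220, 170, 140)]
print(skin_ratio(region))
```

Working in the (Cb, Cr) plane and ignoring luma is a common choice for skin models because it reduces sensitivity to brightness; whether the paper discards luma in the same way is not stated in the abstract.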
Crossmodal recruitment of the ventral visual stream in congenital blindness

DEFF Research Database (Denmark)

Ptito, Maurice; Matteau, Isabelle; Zhi Wang, Arthur

2012-01-01

We used functional MRI (fMRI) to test the hypothesis that blind subjects recruit the ventral visual stream during nonhaptic tactile-form recognition. Congenitally blind and blindfolded sighted control subjects were scanned after they had been trained during four consecutive days to perform......, inferotemporal (IT), cortex, lateral occipital tactile vision area (LOtv), and fusiform gyrus. Control subjects activated area LOtv and precuneus but not cuneus, IT and fusiform gyrus. These results indicate that congenitally blind subjects recruit key regions in the ventral visual pathway during nonhaptic...
Perceptual Audio Hashing Functions

Directory of Open Access Journals (Sweden)

Emin Anarım

2005-07-01

Full Text Available Perceptual hash functions provide a tool for fast and reliable identification of content. We present new audio hash functions based on summarization of the time-frequency spectral characteristics of an audio document. The proposed hash functions are based on the periodicity series of the fundamental frequency and on singular-value description of the cepstral frequencies. They are found, on one hand, to perform very satisfactorily in identification and verification tests, and on the other hand, to be very resilient to a large variety of attacks. Moreover, we address the issue of security of hashes and propose a keying technique, and thereby a key-dependent hash function.
Increasing Valid Profiles in Phallometric Assessment of Sex Offenders with Child Victims: Combining the Strengths of Audio Stimuli and Synthetic Characters.

Science.gov (United States)

Marschall-Lévesque, Shawn; Rouleau, Joanne-Lucine; Renaud, Patrice

2018-02-01

Penile plethysmography (PPG) is a measure of sexual interests that relies heavily on the stimuli it uses to generate valid results. Ethical considerations surrounding the use of real images in PPG have further limited the content admissible for these stimuli. To palliate this limitation, the current study aimed to combine audio and visual stimuli by incorporating computer-generated characters to create new stimuli capable of accurately classifying sex offenders with child victims, while also increasing the number of valid profiles. Three modalities (audio, visual, and audiovisual) were compared using two groups (15 sex offenders with child victims and 15 non-offenders). Both the new visual and audiovisual stimuli resulted in a 13% increase in the number of valid profiles at 2.5 mm, when compared to the standard audio stimuli. Furthermore, the new audiovisual stimuli generated a 34% increase in penile responses. All three modalities were able to discriminate between the two groups by their responses to the adult and child stimuli. Lastly, sexual interest indices for all three modalities could accurately classify participants in their appropriate groups, as demonstrated by ROC curve analysis (i.e., audio AUC = .81, 95% CI [.60, 1.00]; visual AUC = .84, 95% CI [.66, 1.00], and audiovisual AUC = .83, 95% CI [.63, 1.00]). Results suggest that computer-generated characters allow accurate discrimination of sex offenders with child victims and can be added to already validated stimuli to increase the number of valid profiles. The implications of audiovisual stimuli using computer-generated characters and their possible use in PPG evaluations are also discussed.
IST BENOGO (IST – 2001-39184) Deliverable I-AAU-05-01: Role of sound in VR and Audio Visual Preferences

DEFF Research Database (Denmark)

Nordahl, Rolf

This Periodic Progress Report (PPR) reports on the studies done at Aalborg University in December 2004 concerning the role of sound in VR, audio-visual correlations and attention triggering. The report contains a description and evaluation of the experiments run, together with the analysis of the data captured by the head tracker, which provide valuable insights on the role of sound events in VR....
Adapting Stream and XML technologies to television documentation centers

Directory of Open Access Journals (Sweden)

Pérez Agüera, José Ramón

2004-12-01

Full Text Available This article presents the potential of media streaming technology for disseminating information both on corporate intranets and over the Internet. To this end, it defines media streaming and its scope as a means of broadcasting visual and audio information, both on demand and live. It highlights the importance of the technology for the documentation departments of broadcasters, both for dissemination among internal users and for its extension to external users as an economic asset for the enterprise. It also outlines the main lines of evolution of this technology in the documentation field, in combination with XML technologies, for the standards-based management of audiovisual content.
Feature-based memory-driven attentional capture: Visual working memory content affects visual attention.

NARCIS (Netherlands)

Olivers, C.N.L.; Meijer, F.; Theeuwes, J.

2006-01-01

In 7 experiments, the authors explored whether visual attention (the ability to select relevant visual information) and visual working memory (the ability to retain relevant visual information) share the same content representations. The presence of singleton distractors interfered more strongly
Spatial audio reproduction with primary ambient extraction

CERN Document Server

He, JianJun

2017-01-01

This book first introduces the background of spatial audio reproduction, with different types of audio content and for different types of playback systems. A literature study on the classical and emerging Primary Ambient Extraction (PAE) techniques is presented. The emerging techniques aim to improve the extraction performance and also enhance the robustness of PAE approaches in dealing with more complex signals encountered in practice. The in-depth theoretical study helps readers to understand the rationales behind these approaches. Extensive objective and subjective experiments validate the feasibility of applying PAE in spatial audio reproduction systems. These experimental results, together with some representative audio examples and MATLAB codes of the key algorithms, illustrate clearly the differences among various approaches and also help readers gain insights on selecting different approaches for different applications.
Wavelet-based audio embedding and audio/video compression

Science.gov (United States)

Mendenhall, Michael J.; Claypoole, Roger L., Jr.

2001-12-01

Watermarking, traditionally used for copyright protection, is used in a new and exciting way. An efficient wavelet-based watermarking technique embeds audio information into a video signal. Several effective compression techniques are applied to compress the resulting audio/video signal in an embedded fashion. This wavelet-based compression algorithm incorporates bit-plane coding, index coding, and Huffman coding. To demonstrate the potential of this audio embedding and audio/video compression algorithm, we embed an audio signal into a video signal and then compress. Results show that overall compression rates of 15:1 can be achieved. The video signal is reconstructed with a median PSNR of nearly 33 dB. Finally, the audio signal is extracted from the compressed audio/video signal without error.
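The reconstruction quality quoted above is a peak signal-to-noise ratio. As a reminder of how that figure is commonly computed, here is a minimal sketch using the standard definition, PSNR = 10 log10(MAX^2 / MSE); the function name `psnr` and the flat-list pixel representation are illustrative assumptions, and the abstract does not say how the authors aggregated frames beyond reporting a median.

```python
import math


def psnr(original, reconstructed, max_value=255):
    """Peak signal-to-noise ratio in dB between two equal-length sample lists."""
    if not original or len(original) != len(reconstructed):
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error between corresponding samples.
    mse = sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)
    if mse == 0:
        return math.inf  # identical signals
    return 10.0 * math.log10(max_value ** 2 / mse)


frame = [10, 20, 30, 40]
decoded = [11, 21, 31, 41]  # every sample off by one, so MSE = 1
print(round(psnr(frame, decoded), 2))
```

For 8-bit samples, an error of one grey level everywhere gives about 48.13 dB, so a value near 33 dB corresponds to a noticeably larger mean squared error.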
Feature-Based Memory-Driven Attentional Capture: Visual Working Memory Content Affects Visual Attention

Science.gov (United States)

Olivers, Christian N. L.; Meijer, Frank; Theeuwes, Jan

2006-01-01

In 7 experiments, the authors explored whether visual attention (the ability to select relevant visual information) and visual working memory (the ability to retain relevant visual information) share the same content representations. The presence of singleton distractors interfered more strongly with a visual search task when it was accompanied by…
Content-Adaptive Packetization and Streaming of Wavelet Video over IP Networks

Directory of Open Access Journals (Sweden)

Chien-Peng Ho

2007-03-01

Full Text Available This paper presents a framework for a content-adaptive packetization scheme for streaming 3D wavelet-based video content over lossy IP networks. The tradeoff between rate and distortion is controlled by jointly adapting the scalable source coding rate and the level of forward error correction (FEC) protection. A content-dependent packetization mechanism with data interleaving and Reed-Solomon protection for wavelet-based video codecs is proposed to provide unequal error protection. This paper also tries to answer an important question for scalable video streaming systems: given extra bandwidth, should one increase the level of channel protection for the most important packets, or transmit more scalable source data? Experimental results show that the proposed framework achieves a good balance between the quality of the received video and the level of error protection under bandwidth-varying lossy IP networks.
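The role of data interleaving next to an erasure code can be shown with a small sketch. It implements only a generic block interleaver, not the packetization scheme of the paper, and it does not implement Reed-Solomon coding; the function names are hypothetical. The idea is that symbols of several codewords are transmitted column by column, so a burst of consecutive losses is spread across codewords instead of wiping out one of them.

```python
def interleave(codewords):
    """Transmit equal-length codewords column by column."""
    length = len(codewords[0])
    return [cw[j] for j in range(length) for cw in codewords]


def deinterleave(stream, num_codewords):
    """Invert interleave(): rebuild the original codewords."""
    return [stream[i::num_codewords] for i in range(num_codewords)]


def losses_per_codeword(burst_start, burst_len, num_codewords):
    """Count how many symbols each codeword loses to one burst in the stream."""
    losses = [0] * num_codewords
    for k in range(burst_start, burst_start + burst_len):
        losses[k % num_codewords] += 1  # stream index k belongs to codeword k mod n
    return losses


codewords = [[0, 1, 2], [10, 11, 12], [20, 21, 22]]
stream = interleave(codewords)
print(stream)
print(losses_per_codeword(burst_start=2, burst_len=3, num_codewords=3))
```

With three codewords interleaved, a burst of three consecutive losses costs each codeword a single symbol, which an erasure code with at least one parity symbol per codeword could then repair. Unequal error protection would assign more parity to the more important codewords.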
A Contents Encryption Mechanism Using Reused Key in IPTV

Science.gov (United States)

Jeong, Yoon-Su; Kim, Yong-Tae; Cho, Young-Bok; Lee, Ki-Jeong; Park, Gil-Cheol; Lee, Sang-Ho

Recently, IPTV has been in the spotlight as a new streaming service that stably delivers video, audio and control signals to subscribers over the IP protocol. However, the IPTV system faces more security threats than traditional TV. This study proposes a multicast encryption mechanism for secure transmission of IPTV content, in which the content provider encrypts its content and sends the encrypted content, together with the key used for encryption, to the user. In order to reduce the time and cost at the Head-End, the proposed mechanism encrypts the media contents at the Head-End, embeds the code of the IPTV terminal used at the Head-End in the media contents for user tracking, and performs desynchronization to protect the media contents from various attacks.
The Effect of Audio and Animation in Multimedia Instruction

Science.gov (United States)

Koroghlanian, Carol; Klein, James D.

2004-01-01

This study investigated the effects of audio, animation, and spatial ability in a multimedia computer program for high school biology. Participants completed a multimedia program that presented content by way of text or audio with lean text. In addition, several instructional sequences were presented either with static illustrations or animations.…
The Effect of Bio/Neurofeedback Training on Performance, Audio and Visual Attention in Elite Shooters

Directory of Open Access Journals (Sweden)

Farzaneh Bagheri asl

2017-10-01

Full Text Available The aim of this study was to examine the effect of bio/neurofeedback training on the performance and the audio and visual attention of elite shooters. Thirty-six elite shooters from Kermanshah Province participated and were divided into three groups: two experimental groups, which received biofeedback and neurofeedback training, and one control group. An effort was made to closely control all participants' training and number of shots in order to assure their physical and specialized training. Attention was assessed with the computerized Integrated Visual and Auditory test (IVA), administered as a pretest and as a posttest after the intervention in all three groups. Shooting scores were also collected before and after the intervention. Each athlete in the neurofeedback group completed 20 neurofeedback sessions, each lasting 45 minutes. For these sessions, both auricles and the T3 and PZ sites of each individual were cleaned using alcohol and new-perp gel. The biofeedback training consisted of heart rate and respiratory training. The dependent t-test was used to compare pretest and posttest results within each group, and ANOVA was used to compare the three groups. The significance level was set at 0.05. The results indicated a significant difference among the three groups and a significant increase in the total attention score after the biofeedback and neurofeedback training. Mean attention scores for the visual, audio, and total variables were higher in the posttest than in the pretest for the two experimental groups. Shooting scores also improved after training. According to the research findings, it can be said that neurofeedback and biofeedback training act on the waves of the sensory-motor beats and which are responsible
Automated Speech and Audio Analysis for Semantic Access to Multimedia

NARCIS (Netherlands)

Jong, F.M.G. de; Ordelman, R.; Huijbregts, M.

2006-01-01

The deployment and integration of audio processing tools can enhance the semantic annotation of multimedia content, and as a consequence, improve the effectiveness of conceptual access tools. This paper overviews the various ways in which automatic speech and audio analysis can contribute to
Automated speech and audio analysis for semantic access to multimedia

NARCIS (Netherlands)

de Jong, Franciska M.G.; Ordelman, Roeland J.F.; Huijbregts, M.A.H.; Avrithis, Y.; Kompatsiaris, Y.; Staab, S.; O' Connor, N.E.

2006-01-01

The deployment and integration of audio processing tools can enhance the semantic annotation of multimedia content, and as a consequence, improve the effectiveness of conceptual access tools. This paper overviews the various ways in which automatic speech and audio analysis can contribute to
Temporal visual cues aid speech recognition

DEFF Research Database (Denmark)

Zhou, Xiang; Ross, Lars; Lehn-Schiøler, Tue

2006-01-01

BACKGROUND: It is well known that under noisy conditions, viewing a speaker's articulatory movement aids the recognition of spoken words. Conventionally it is thought that the visual input disambiguates otherwise confusing auditory input. HYPOTHESIS: In contrast we hypothesize that it is the temporal synchronicity of the visual input that aids parsing of the auditory stream. More specifically, we expected that purely temporal information, which does not convey information such as place of articulation, may facilitate word recognition. METHODS: To test this prediction we used temporal features of audio to generate an artificial talking-face video and measured word recognition performance on simple monosyllabic words. RESULTS: When presenting words together with the artificial video we find that word recognition is improved over purely auditory presentation. The effect is significant (p...
Text Stream Trend Analysis using Multiscale Visual Analytics with Applications to Social Media Systems

Energy Technology Data Exchange (ETDEWEB)

Steed, Chad A [ORNL; Beaver, Justin M [ORNL; BogenII, Paul L. [Google Inc.; Drouhard, Margaret MEG G [ORNL; Pyle, Joshua M [ORNL

2015-01-01

In this paper, we introduce a new visual analytics system, called Matisse, that allows exploration of global trends in textual information streams with specific application to social media platforms. Despite the potential for real-time situational awareness using these services, interactive analysis of such semi-structured textual information is a challenge due to the high-throughput and high-velocity properties. Matisse addresses these challenges through the following contributions: (1) robust stream data management, (2) automated sentiment/emotion analytics, (3) inferential temporal, geospatial, and term-frequency visualizations, and (4) a flexible drill-down interaction scheme that progresses from macroscale to microscale views. In addition to describing these contributions, our work-in-progress paper concludes with a practical case study focused on the analysis of Twitter 1% sample stream information captured during the week of the Boston Marathon bombings.
Conceptual Content and Unattended Visual Features

Directory of Open Access Journals (Sweden)

Francisco Pereira

2009-08-01

Full Text Available McDowell (1994) proposed a philosophical theory about perceptual content (call it “conceptualism”) that states that in every case the content of a visual experience necessarily involves concepts that fully specify every single feature consciously and simultaneously available during the experience. In this paper I will question conceptualism, arguing that some visual experiences carry information about so many objects, properties and relations at the same time that it is unlikely for subjects to possess and implement concepts for every feature represented simultaneously by the experience at that time. If this is the case, then McDowell’s conceptualism is insufficiently grounded.

Cardiac and pulmonary dose reduction for tangentially irradiated breast cancer, utilizing deep inspiration breath-hold with audio-visual guidance, without compromising target coverage

International Nuclear Information System (INIS)

Vikstroem, Johan; Hjelstuen, Mari H.B.; Mjaaland, Ingvil; Dybvik, Kjell Ivar

2011-01-01

Background and purpose. Cardiac disease and pulmonary complications are documented risk factors in tangential breast irradiation. Respiratory gating radiotherapy provides a possibility to substantially reduce cardiopulmonary doses. This CT planning study quantifies the reduction of radiation doses to the heart and lung, using deep inspiration breath-hold (DIBH). Patients and methods. Seventeen patients with early breast cancer, referred for adjuvant radiotherapy, were included. For each patient two CT scans were acquired; the first during free breathing (FB) and the second during DIBH. The scans were monitored by the Varian RPM respiratory gating system. Audio coaching and visual feedback (audio-visual guidance) were used. The treatment planning of the two CT studies was performed with conformal tangential fields, focusing on good coverage (V95>98%) of the planning target volume (PTV). Dose-volume histograms were calculated and compared. Doses to the heart, left anterior descending (LAD) coronary artery, ipsilateral lung and the contralateral breast were assessed. Results. Compared to FB, the DIBH-plans obtained lower cardiac and pulmonary doses, with equal coverage of PTV. The average mean heart dose was reduced from 3.7 to 1.7 Gy and the number of patients with >5% heart volume receiving 25 Gy or more was reduced from four to one of the 17 patients. With DIBH the heart was completely out of the beam portals for ten patients, with FB this could not be achieved for any of the 17 patients. The average mean dose to the LAD coronary artery was reduced from 18.1 to 6.4 Gy. The average ipsilateral lung volume receiving more than 20 Gy was reduced from 12.2 to 10.0%. Conclusion. Respiratory gating with DIBH, utilizing audio-visual guidance, reduces cardiac and pulmonary doses for tangentially treated left sided breast cancer patients without compromising the target coverage
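The dose metrics quoted in this abstract (mean dose, V95, volume receiving more than 20 Gy) are all read off a cumulative dose-volume histogram. The following sketch shows the arithmetic under simplifying assumptions: every voxel has the same volume, the helper names are hypothetical, and the prescribed dose of 50 Gy and the voxel values are made-up illustration data, not figures from the study.

```python
def volume_fraction_at_or_above(voxel_doses_gy, threshold_gy):
    """Fraction of a structure's volume receiving at least threshold_gy."""
    if not voxel_doses_gy:
        raise ValueError("structure has no voxels")
    hit = sum(1 for d in voxel_doses_gy if d >= threshold_gy)
    return hit / len(voxel_doses_gy)


def v_percent(voxel_doses_gy, prescribed_dose_gy, percent):
    """V_x: volume fraction receiving at least x% of the prescribed dose."""
    return volume_fraction_at_or_above(
        voxel_doses_gy, prescribed_dose_gy * percent / 100.0
    )


def mean_dose(voxel_doses_gy):
    """Mean dose over equal-volume voxels."""
    return sum(voxel_doses_gy) / len(voxel_doses_gy)


# Made-up voxel doses for a target volume, with a hypothetical 50 Gy prescription.
ptv = [50.0, 49.0, 48.0, 47.0, 40.0]
print(v_percent(ptv, prescribed_dose_gy=50.0, percent=95))
print(mean_dose(ptv))
```

In this toy target, three of five voxels reach 47.5 Gy, so V95 is 0.6, well short of the coverage criterion V95>98% used in the planning study.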
Cardiac and pulmonary dose reduction for tangentially irradiated breast cancer, utilizing deep inspiration breath-hold with audio-visual guidance, without compromising target coverage

Energy Technology Data Exchange (ETDEWEB)

Vikstroem, Johan; Hjelstuen, Mari H.B.; Mjaaland, Ingvil; Dybvik, Kjell Ivar (Dept. of Radiotherapy, Stavanger Univ. Hospital, Stavanger (Norway)), e-mail: vijo@sus.no

2011-01-15

Background and purpose. Cardiac disease and pulmonary complications are documented risk factors in tangential breast irradiation. Respiratory gating radiotherapy provides a possibility to substantially reduce cardiopulmonary doses. This CT planning study quantifies the reduction of radiation doses to the heart and lung, using deep inspiration breath-hold (DIBH). Patients and methods. Seventeen patients with early breast cancer, referred for adjuvant radiotherapy, were included. For each patient two CT scans were acquired; the first during free breathing (FB) and the second during DIBH. The scans were monitored by the Varian RPM respiratory gating system. Audio coaching and visual feedback (audio-visual guidance) were used. The treatment planning of the two CT studies was performed with conformal tangential fields, focusing on good coverage (V95>98%) of the planning target volume (PTV). Dose-volume histograms were calculated and compared. Doses to the heart, left anterior descending (LAD) coronary artery, ipsilateral lung and the contralateral breast were assessed. Results. Compared to FB, the DIBH-plans obtained lower cardiac and pulmonary doses, with equal coverage of PTV. The average mean heart dose was reduced from 3.7 to 1.7 Gy and the number of patients with >5% heart volume receiving 25 Gy or more was reduced from four to one of the 17 patients. With DIBH the heart was completely out of the beam portals for ten patients, with FB this could not be achieved for any of the 17 patients. The average mean dose to the LAD coronary artery was reduced from 18.1 to 6.4 Gy. The average ipsilateral lung volume receiving more than 20 Gy was reduced from 12.2 to 10.0%. Conclusion. Respiratory gating with DIBH, utilizing audio-visual guidance, reduces cardiac and pulmonary doses for tangentially treated left sided breast cancer patients without compromising the target coverage
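The dose-volume quantities quoted in the abstract (a mean organ dose, and the share of an organ's volume receiving at least a given dose) can be illustrated with a minimal Python sketch. The voxel doses below are invented for illustration and equal voxel volumes are assumed; the function names are hypothetical and nothing here reproduces the study's planning system.

```python
def mean_dose(doses_gy):
    """Mean dose over equally sized voxels, in Gy."""
    return sum(doses_gy) / len(doses_gy)


def volume_fraction_at_or_above(doses_gy, threshold_gy):
    """Fraction of the structure volume receiving at least threshold_gy."""
    hits = sum(1 for d in doses_gy if d >= threshold_gy)
    return hits / len(doses_gy)


# Hypothetical per-voxel doses for one structure (illustrative values only).
lung_voxels_gy = [1.0, 2.0, 5.0, 22.0, 30.0, 0.5, 0.5, 3.0, 25.0, 1.0]

print(mean_dose(lung_voxels_gy))                          # mean dose in Gy
print(volume_fraction_at_or_above(lung_voxels_gy, 20.0))  # share of volume at >= 20 Gy
```

Comparing two plans then amounts to evaluating the same two functions on the free-breathing and the breath-hold dose grids for the same structure.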
Design Issues and Information Contents of the Provincial Government Websites of Indonesia: A Content Analysis on Visual Messages

Directory of Open Access Journals (Sweden)

Achmad Syarief

2009-07-01

Full Text Available A website does not merely display information; it is also a contextual medium of communication through visuals and content. The interplay of website design elements builds up meanings that affect users beyond what previous communication practices have uncovered. Previous research acknowledges that visuals and content have significant effects in attracting users' attention and trust. The ability of a website to provide credible information to its target users through visuals and content is therefore of great importance to its success. However, although a considerable amount of research on website design has been performed, studies of the visual appearance and information content of sites intended to promote local investment in Indonesia have been very limited. This paper addresses visual design issues and information contents of eighteen provincial government websites of Indonesia. Through content analysis, the paper comparatively examines the visual appearance, information content, and functions of each website in order to determine the visual characteristics and content that suit the purpose of promoting local potential. The paper focuses on commonalities, discrepancies, and patterns of content, and provides suggestions to improve the design of Indonesian provincial government websites.
Exclusively visual analysis of classroom group interactions

Science.gov (United States)

Tucker, Laura; Scherr, Rachel E.; Zickler, Todd; Mazur, Eric

2016-12-01

Large-scale audiovisual data that measure group learning are time consuming to collect and analyze. As an initial step towards scaling qualitative classroom observation, we qualitatively coded classroom video using an established coding scheme with and without its audio cues. We find that interrater reliability is as high when using visual data only—without audio—as when using both visual and audio data to code. Also, interrater reliability is high when comparing use of visual and audio data to visual-only data. We see a small bias to code interactions as group discussion when visual and audio data are used compared with video-only data. This work establishes that meaningful educational observation can be made through visual information alone. Further, it suggests that after initial work to create a coding scheme and validate it in each environment, computer-automated visual coding could drastically increase the breadth of qualitative studies and allow for meaningful educational analysis on a far greater scale.
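The abstract does not say which interrater reliability statistic was used. As one common choice, Cohen's kappa for two coders can be sketched as follows; the code labels and the example codings are hypothetical.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("raters must label the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: share of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from each rater's own label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical codings of the same four video segments.
audio_visual = ["discussion", "discussion", "lecture", "lecture"]
visual_only = ["discussion", "lecture", "lecture", "lecture"]
print(cohens_kappa(audio_visual, visual_only))
```

Kappa corrects raw agreement for the agreement expected by chance, which matters when one code (for example group discussion) dominates the data.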
Exclusively visual analysis of classroom group interactions

Directory of Open Access Journals (Sweden)

Laura Tucker

2016-11-01

Full Text Available Large-scale audiovisual data that measure group learning are time consuming to collect and analyze. As an initial step towards scaling qualitative classroom observation, we qualitatively coded classroom video using an established coding scheme with and without its audio cues. We find that interrater reliability is as high when using visual data only—without audio—as when using both visual and audio data to code. Also, interrater reliability is high when comparing use of visual and audio data to visual-only data. We see a small bias to code interactions as group discussion when visual and audio data are used compared with video-only data. This work establishes that meaningful educational observation can be made through visual information alone. Further, it suggests that after initial work to create a coding scheme and validate it in each environment, computer-automated visual coding could drastically increase the breadth of qualitative studies and allow for meaningful educational analysis on a far greater scale.
EFEKTIVITAS MODEL PROBLEM BASED LEARNING BERBANTUAN MEDIA AUDIO VISUAL DITINJAU DARI HASIL BELAJAR IPA SISWA KELAS 5 SDN 1 GADU SAMBONG - BLORA SEMESTER 2 TAHUN 2014/2015

Directory of Open Access Journals (Sweden)

Andhini Virgiana

2016-05-01

Full Text Available The aim of this study is to determine the difference in learning outcomes between the problem based learning model assisted by audio-visual media and the think pair share learning model assisted by visual media in science (IPA) instruction for grade 5 students of SDN 1 Gadu, Sambong, Blora Regency, in semester 2 of the 2014/2015 school year. The study is a quasi-experiment with a nonequivalent control group design. The subjects were the grade 5 students of SDN 1 Gadu and the grade 5 students of SDN 2 Gagakan. Data were collected through tests and observation. The data were analysed using descriptive statistics, parametric statistics, and an independent-samples t-test at the 5% significance level (α = 0.05). Based on the results and discussion, it can be concluded that there is a difference in effectiveness between the problem based learning model assisted by audio-visual media and the think pair share learning model assisted by visual media with respect to the science learning outcomes of grade 5 students of SDN 1 Gadu, Sambong District, Blora Regency, in semester 2 of 2014/2015. This is shown by a t-test result of 3.603 > 1.999 and a significance of 0.001 ... the control class mean, namely 87.0588 > 80.2000.
Audio Papers

DEFF Research Database (Denmark)

Groth, Sanne Krogh; Samson, Kristine

2016-01-01

With this special issue of Seismograf we are happy to present a new format of articles: Audio Papers. Audio papers resemble the regular essay or the academic text in that they deal with a certain topic of interest, but presented in the form of an audio production. The audio paper is an extension...
Streamer Motives and User-Generated Content on Social Live-Streaming Services

OpenAIRE

Friedlander, Mathilde B.

2017-01-01

The three most popular information services, Periscope, Ustream, and YouNow, taken as representative of all Social Live-Streaming Services (SLSSs), are investigated to analyze their streamers' motivations and the user-generated content. Additionally, we collected demographic data (gender and age). More than 7,500 streams by users from the U.S., Germany, and Japan were observed. Main streamer motivations on SLSSs are boredom, socializing, the need to reach a specific group, the need to communicate, and fun. Im...
Live Aircraft Encounter Visualization at FutureFlight Central

Science.gov (United States)

Murphy, James R.; Chinn, Fay; Monheim, Spencer; Otto, Neil; Kato, Kenji; Archdeacon, John

2018-01-01

Researchers at the National Aeronautics and Space Administration (NASA) have developed an aircraft data streaming capability that can be used to visualize live aircraft in near real-time. During a joint Federal Aviation Administration (FAA)/NASA Airborne Collision Avoidance System flight series, test sorties between unmanned aircraft and manned intruder aircraft were shown in real-time at NASA Ames' FutureFlight Central tower facility as a virtual representation of the encounter. This capability leveraged existing live surveillance, video, and audio data streams distributed through a Live, Virtual, Constructive test environment, then depicted the encounter from the point of view of any aircraft in the system showing the proximity of the other aircraft. For the demonstration, position report data were sent to the ground from on-board sensors on the unmanned aircraft. The point of view can be changed dynamically, allowing encounters from all angles to be observed. Visualizing the encounters in real-time provides a safe and effective method for observation of live flight testing and a strong alternative to travel to the remote test range.
Estudi i implementació del protocol de streaming http live streaming per un client i-phone

OpenAIRE

Núñez Vera, Jordi

2013-01-01

The aim of this project is, on the one hand, the analysis of Apple's HTTP Live Streaming protocol, which is an adaptive video and audio streaming protocol able to change the streams' bit rate according to the capacity of the medium through which it is being transmitted. On the other hand, the project shows a client development of this protocol for the iPhone mobile device, describing this platform from scratch. I trace here the necessary steps for developing applications on iOS and I...
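The bit-rate adaptation mentioned in the abstract can be sketched as a simple selection rule: pick the highest-bitrate variant that fits within the measured throughput. This is a minimal illustration, not the project's client code. The bitrate ladder, the safety factor, and the function name are assumptions; in HTTP Live Streaming the available bitrates are advertised through the BANDWIDTH attribute of the variant entries in the master playlist.

```python
def pick_variant(variant_bitrates_bps, measured_throughput_bps, safety_factor=0.8):
    """Return the highest variant bitrate that fits within the usable throughput.

    Falls back to the lowest variant when none fits.
    """
    if not variant_bitrates_bps:
        raise ValueError("at least one variant is required")
    # Keep some headroom so short throughput dips do not stall playback.
    budget = measured_throughput_bps * safety_factor
    fitting = [b for b in variant_bitrates_bps if b <= budget]
    return max(fitting) if fitting else min(variant_bitrates_bps)


variants = [400_000, 1_200_000, 3_500_000]  # hypothetical bitrate ladder, bits/s
print(pick_variant(variants, 2_000_000))    # mid-quality variant fits the budget
print(pick_variant(variants, 300_000))      # nothing fits, so use the lowest variant
```

A real client would re-run a rule like this as each media segment finishes downloading, using the observed download speed as the throughput estimate.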
Visual Analysis of Weblog Content

Energy Technology Data Exchange (ETDEWEB)

Gregory, Michelle L.; Payne, Deborah A.; McColgin, Dave; Cramer, Nick O.; Love, Douglas V.

2007-03-26

In recent years, one of the advances of the World Wide Web is social media and one of the fastest growing aspects of social media is the blogosphere. Blogs make content creation easy and are highly accessible through web pages and syndication. With their growing influence, a need has arisen to be able to monitor the opinions and insight revealed within their content. In this paper we describe a technical approach for analyzing the content of blog data using a visual analytic tool, IN-SPIRE, developed by Pacific Northwest National Laboratory. We highlight the capabilities of this tool that are particularly useful for information gathering from blog data.
Local Control of Audio Environment: A Review of Methods and Applications

Directory of Open Access Journals (Sweden)

Jussi Kuutti

2014-02-01

Full Text Available The concept of a local audio environment is to have sound playback locally restricted such that, ideally, adjacent regions of an indoor or outdoor space could exhibit their own individual audio content without interfering with each other. This would enable people to listen to their content of choice without disturbing others next to them, yet without any headphones to block conversation. In practice, perfect sound containment in free air cannot be attained, but a local audio environment can still be satisfactorily approximated using directional speakers. Directional speakers may be based on regular audible frequencies or they may employ modulated ultrasound. Planar, parabolic, and array form factors are commonly used. The directivity of a speaker improves as its surface area and sound frequency increase, making these the main design factors for directional audio systems. Even directional speakers radiate some sound outside the main beam, and sound can also reflect from objects. Therefore, directional speaker systems perform best when there is enough ambient noise to mask the leaking sound. Possible areas of application for local audio include information and advertisement audio feed in commercial facilities, guiding and narration in museums and exhibitions, office space personalization, control room messaging, rehabilitation environments, and entertainment audio systems.
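To illustrate why radiator size and frequency dominate directivity, the sketch below uses the textbook far-field model of a rigid circular piston in an infinite baffle, whose first null lies at sin θ = 1.22 λ / D. The model is an assumption chosen for illustration and is not taken from the review; the speed of sound and the example dimensions are likewise illustrative.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumed value for air at about 20 degrees C


def first_null_angle_deg(diameter_m, frequency_hz):
    """Angle of the first null for a baffled circular piston, or None if there is none."""
    wavelength = SPEED_OF_SOUND_M_S / frequency_hz
    s = 1.22 * wavelength / diameter_m
    if s > 1.0:
        return None  # wavelength too long for this diameter: no null, broad radiation
    return math.degrees(math.asin(s))


# Same 0.3 m radiator at an audible low, an audible high, and an ultrasonic frequency.
for f in (1_000.0, 4_000.0, 40_000.0):
    print(f, first_null_angle_deg(0.3, f))
```

Under this model the main lobe narrows as either the diameter or the frequency grows, which is consistent with the use of large panels or ultrasonic carriers described in the abstract.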
Evolution of broadcast content distribution

CERN Document Server

Beutler, Roland

2017-01-01

This book discusses opportunities for broadcasters that arise with the advent of broadband networks, both fixed and mobile. It discusses how the traditional way of distributing audio-visual content over broadcasting networks has been complemented by the usage of broadband networks. The author shows how this also gives the possibility to offer new types of interactive or so-called nonlinear services. The book illustrates how change in distribution technology is accelerating the need for broadcasters around the world to adapt their content distribution strategy and how it will impact the portfolios of content they offer. Features: outlines the shift in broadcast content distribution paradigms and related strategic issues; provides an overview of the new broadcasting ecosystem encompassing new types of content, user habits, expectations, and devices; discusses complementary usage of different distribution technologies and platforms.
Exploring the connectome: Petascale volume visualization of microscopy data streams

KAUST Repository

Beyer, Johanna; Hadwiger, Markus; Al-Awami, Ali K.; Jeong, Wonki; Kasthuri, Narayanan; Lichtman, Jeff W M D; Pfister, Hanspeter

2013-01-01

Recent advances in high-resolution microscopy let neuroscientists acquire neural-tissue volume data of extremely large sizes. However, the tremendous resolution and the high complexity of neural structures present big challenges to storage, processing, and visualization at interactive rates. A proposed system provides interactive exploration of petascale (petavoxel) volumes resulting from high-throughput electron microscopy data streams. The system can concurrently handle multiple volumes and can support the simultaneous visualization of high-resolution voxel segmentation data. Its visualization-driven design restricts most computations to a small subset of the data. It employs a multiresolution virtual-memory architecture for better scalability than previous approaches and for handling incomplete data. Researchers have employed it for a 1-teravoxel mouse cortex volume, of which several hundred axons and dendrites as well as synapses have been segmented and labeled.
Exploring the connectome: Petascale volume visualization of microscopy data streams

KAUST Repository

Beyer, Johanna

2013-07-01

Recent advances in high-resolution microscopy let neuroscientists acquire neural-tissue volume data of extremely large sizes. However, the tremendous resolution and the high complexity of neural structures present big challenges to storage, processing, and visualization at interactive rates. A proposed system provides interactive exploration of petascale (petavoxel) volumes resulting from high-throughput electron microscopy data streams. The system can concurrently handle multiple volumes and can support the simultaneous visualization of high-resolution voxel segmentation data. Its visualization-driven design restricts most computations to a small subset of the data. It employs a multiresolution virtual-memory architecture for better scalability than previous approaches and for handling incomplete data. Researchers have employed it for a 1-teravoxel mouse cortex volume, of which several hundred axons and dendrites as well as synapses have been segmented and labeled.
An accurate analysis for guaranteed performance of multiprocessor streaming applications

NARCIS (Netherlands)

Poplavko, P.

2008-01-01

Already for more than a decade, consumer electronic devices have been available for entertainment, educational, or telecommunication tasks based on multimedia streaming applications, i.e., applications that process streams of audio and video samples in digital form. Multimedia capabilities are
Frequency Hopping Method for Audio Watermarking

Directory of Open Access Journals (Sweden)

A. Anastasijević

2012-11-01

Full Text Available This paper evaluates the degradation of audio content for a perceptible removable watermark. Two different approaches to embedding the watermark in the spectral domain were investigated. The frequencies for watermark embedding are chosen according to a pseudorandom sequence, making the methods robust. Consequently, the lower quality audio can be used for promotional purposes. For a fee, the watermark can be removed with a secret watermarking key. Objective and subjective testing was conducted in order to measure the degradation level for the watermarked music samples and to examine residual distortion for different parameters of the watermarking algorithm and different music genres.
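The idea of choosing embedding frequencies from a key-dependent pseudorandom sequence, and removing the mark again with the same key, can be sketched in a few lines of Python. This is a toy illustration under strong assumptions and not either of the paper's two methods: frames are represented as plain lists of spectral magnitudes, one bin is disturbed per frame, and the function names, key, and strength value are all hypothetical.

```python
import random


def hop_sequence(key, n_frames, n_bins):
    """Key-dependent pseudorandom bin index for each frame."""
    rng = random.Random(key)
    return [rng.randrange(n_bins) for _ in range(n_frames)]


def embed(frames, key, strength):
    """Add a disturbance to one pseudorandomly chosen bin per frame."""
    hops = hop_sequence(key, len(frames), len(frames[0]))
    marked = [list(frame) for frame in frames]  # leave the input untouched
    for frame, bin_index in zip(marked, hops):
        frame[bin_index] += strength
    return marked


def remove(frames, key, strength):
    """Undo embed(); needs the same key and strength that were used to embed."""
    return embed(frames, key, -strength)


clean = [[1.0, 2.0, 3.0, 4.0] for _ in range(5)]  # five frames, four bins each
marked = embed(clean, "secret-key", 0.5)
restored = remove(marked, "secret-key", 0.5)
print(restored == clean)
```

Without the key, the positions of the disturbed bins are not known, so the disturbance cannot simply be subtracted; that is the property the hopping sequence is meant to provide in this sketch.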
Virtual environment display for a 3D audio room simulation

Science.gov (United States)

Chapin, William L.; Foster, Scott

1992-06-01

Recent developments in virtual 3D audio and synthetic aural environments have produced a complex acoustical room simulation. The acoustical simulation models a room with walls, ceiling, and floor of selected sound reflecting/absorbing characteristics and unlimited independent localizable sound sources. This non-visual acoustic simulation, implemented with 4 audio Convolvotrons™ by Crystal River Engineering and coupled to the listener with a Polhemus Isotrak™, tracking the listener's head position and orientation, and stereo headphones returning binaural sound, is quite compelling to most listeners with eyes closed. This immersive effect should be reinforced when properly integrated into a full, multi-sensory virtual environment presentation. This paper discusses the design of an interactive, visual virtual environment, complementing the acoustic model and specified to: 1) allow the listener to freely move about the space, a room of manipulable size, shape, and audio character, while interactively relocating the sound sources; 2) reinforce the listener's feeling of telepresence into the acoustical environment with visual and proprioceptive sensations; 3) enhance the audio with the graphic and interactive components, rather than overwhelm or reduce it; and 4) serve as a research testbed and technology transfer demonstration. The hardware/software design of two demonstration systems, one installed and one portable, is discussed through the development of four iterative configurations. The installed system implements a head-coupled, wide-angle, stereo-optic tracker/viewer and multi-computer simulation control. The portable demonstration system implements a head-mounted wide-angle, stereo-optic display, separate head and pointer electro-magnetic position trackers, a heterogeneous parallel graphics processing system, and object oriented C++ program code.
Intelligent audio analysis

CERN Document Server

Schuller, Björn W

2013-01-01

This book provides the reader with the knowledge necessary for comprehension of the field of Intelligent Audio Analysis. It first introduces standard methods and discusses the typical Intelligent Audio Analysis chain going from audio data to audio features to audio recognition. Further, introductions to audio source separation, enhancement, and robustness are given. After the introductory parts, the book shows several applications for the three types of audio: speech, music, and general sound. Each task is briefly introduced, followed by a description of the specific data and methods applied, experiments and results, and a conclusion for this specific task. The book provides benchmark results and standardized test-beds for a broader range of audio analysis tasks. The main focus thereby lies on the parallel advancement of realism in audio analysis, as too often today’s results are overly optimistic owing to idealized testing conditions, and it serves to stimulate synergies arising from transfer of ...
Pengaruh Model Pembelajaran Kooperatif Tipe Stad Berbantuan Media Audio Visual Terhadap Hasil Belajar IPA Siswa Kelas III SD Negeri 42 Pekanbaru

OpenAIRE

Oktarianda, Ranty; Alpusari, Mahmud; Noviana, Eddy

2017-01-01

This research is motivated by teachers who still use older teaching methods and by students' difficulty in understanding abstract science lessons, which lead to low science scores among the students. Implementing the STAD cooperative learning method with audio-visual media is expected to improve science achievements. This research uses a quasi-experimental method with a nonequivalent control group design. The purpose of this study is to determine th...

Haptic and Visual feedback in 3D Audio Mixing Interfaces

DEFF Research Database (Denmark)

Gelineck, Steven; Overholt, Daniel

2015-01-01

This paper describes the implementation and informal evaluation of a user interface that explores haptic feedback for 3D audio mixing. The implementation compares different approaches using either the LEAP Motion for mid-air hand gesture control, or the Novint Falcon for active haptic feedback...
Fusion in computer vision understanding complex visual content

CERN Document Server

Ionescu, Bogdan; Piatrik, Tomas

2014-01-01

This book presents a thorough overview of fusion in computer vision, from an interdisciplinary and multi-application viewpoint, describing successful approaches, evaluated in the context of international benchmarks that model realistic use cases. Features: examines late fusion approaches for concept recognition in images and videos; describes the interpretation of visual content by incorporating models of the human visual system with content understanding methods; investigates the fusion of multi-modal features of different semantic levels, as well as results of semantic concept detections, fo
The audio expert everything you need to know about audio

CERN Document Server

Winer, Ethan

2012-01-01

The Audio Expert is a comprehensive reference that covers all aspects of audio, with many practical, as well as theoretical, explanations. Providing in-depth descriptions of how audio really works, using common sense plain-English explanations and mechanical analogies with minimal math, the book is written for people who want to understand audio at the deepest, most technical level, without needing an engineering degree. It's presented in an easy-to-read, conversational tone, and includes more than 400 figures and photos augmenting the text. The Audio Expert takes th
Studies on a Spatialized Audio Interface for Sonar

Science.gov (United States)

2011-10-03

addition of spatialized audio to visual displays for sonar is much akin to the development of talking movies in the early days of cinema and can be...than using the brute-force approach. PCA is one among several techniques that share similarities with the computational architecture of a
Applying the EBU R128 loudness standard in live-streaming sound sculptures

DEFF Research Database (Denmark)

Højlund, Marie Koldkjær; Riis, Morten S.; Rothmann, Daniel

2017-01-01

This paper describes the development of a loudness-based compressor for live audio streams. The need for this device arose while developing the public sound art project The Overheard, which involves mixing together several live audio streams through a web based mixing interface. In order to preserve a natural sounding dynamic image from the varying sound sources that can be played back under varying conditions, an adaptation of the EBU R128 loudness measurement recommendation, originally developed for levelling non-real-time broadcast material, has been applied. The paper describes the Pure...
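A loudness-driven gain stage of the general kind described can be sketched very roughly in Python. The sketch assumes mono samples that have already been K-weighted and it omits the gating that ITU-R BS.1770 (the measurement algorithm EBU R128 builds on) prescribes, so it is a simplification rather than a compliant meter, and it is not the compressor developed in the paper. The -23 LUFS figure is the R128 target level; the function names are hypothetical.

```python
import math

TARGET_LUFS = -23.0  # EBU R128 target level


def block_loudness_lufs(k_weighted_samples):
    """Ungated loudness of one block of already K-weighted mono samples."""
    mean_square = sum(s * s for s in k_weighted_samples) / len(k_weighted_samples)
    if mean_square == 0.0:
        return float("-inf")  # digital silence has no finite loudness
    return -0.691 + 10.0 * math.log10(mean_square)


def gain_to_target_db(loudness_lufs, target_lufs=TARGET_LUFS):
    """Gain in dB that would bring a block to the target loudness."""
    return target_lufs - loudness_lufs


block = [1.0] * 480  # hypothetical block with a mean square of exactly 1
loudness = block_loudness_lufs(block)
print(loudness, gain_to_target_db(loudness))
```

For live streams the measurement has to run on short sliding blocks rather than on a whole programme, which is presumably why an adaptation of the recommendation was needed; a practical device would also smooth the resulting gain over time to avoid audible pumping.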
Delivering Instruction via Streaming Media: A Higher Education Perspective.

Science.gov (United States)

Mortensen, Mark; Schlieve, Paul; Young, Jon

2000-01-01

Describes streaming media, an audio/video presentation that is delivered across a network so that it is viewed while being downloaded onto the user's computer, including a continuous stream of video that can be pre-recorded or live. Discusses its use for nontraditional students in higher education and reports on implementation experiences. (LRW)
MRI-compatible audio/visual system: impact on pediatric sedation

International Nuclear Information System (INIS)

Harned, R.K. II; Strain, J.D.

2001-01-01

Background. While sedation is necessary for much pediatric imaging, there are new alternatives that may help patients hold still without medication. Objective. We examined the effect of an audio/visual system consisting of video goggles and earphones on the need for sedation during magnetic resonance imaging (MRI). Materials and methods. All MRI examinations from May 1999 to October 1999 performed after installation of the MRVision 2000 (Resonance Technology, Inc.) were compared to the same 6-month period in 1998. Imaging and sedation protocols remained constant. Data collected included: patient age, type of examination, use of intravenous contrast enhancement, and need for sedation. The average supply charge and nursing cost per sedated patient were calculated. Results. The 955 patients from 1998 and 1,112 patients from 1999 were similar in demographics and examination distribution. There was an overall reduction in the percent of patients requiring sedation in the group using the video goggle system from 49 to 40 % (P < 0.001). There was no significant change for 0-2 years (P = 0.805), but there was a reduction from 53 to 40 % for age 3-10 years (P < 0.001) and 16 to 8 % for those older than 10 years (P < 0.001). There was a 17 % decrease in MRI room time for those patients whose examinations could be performed without sedation. Sedation costs per patient were $80 for nursing and $29 for supplies. Conclusion. The use of this video system reduced the number of children requiring sedation for MRI examination by 18 %. In addition to reducing patient risk, this can potentially reduce cost. (orig.)
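The 18 % quoted in the conclusion appears to be a relative reduction, whereas the overall change from 49 % to 40 % reported in the results is a drop of 9 percentage points. A short check of that arithmetic, using only the percentages given in the abstract (the function name is illustrative):

```python
def relative_reduction(before_pct, after_pct):
    """Reduction expressed as a fraction of the starting rate."""
    return (before_pct - after_pct) / before_pct


overall = relative_reduction(49.0, 40.0)       # all ages
age_3_to_10 = relative_reduction(53.0, 40.0)   # children aged 3-10 years
older_than_10 = relative_reduction(16.0, 8.0)  # children older than 10 years

print(round(overall * 100))        # 18
print(round(older_than_10 * 100))  # 50
```

The same calculation shows that the relative effect was largest in the oldest group, where the sedation rate halved.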
The challenge of reducing scientific complexity for different target groups (without losing the essence) - experiences from interdisciplinary audio-visual media production

Science.gov (United States)

Hezel, Bernd; Broschkowski, Ephraim; Kropp, Jürgen

2013-04-01

The Climate Media Factory originates from an interdisciplinary media lab run by the Film and Television University "Konrad Wolf" Potsdam-Babelsberg (HFF) and the Potsdam Institute for Climate Impact Research (PIK). Climate scientists, authors, producers and media scholars work together to develop media products on climate change and sustainability. We strive towards communicating scientific content via different media platforms, reconciling the communication needs of scientists and the audience's need to understand the complexity of topics that are relevant in their everyday life. By presenting four audio-visual examples that have been designed for very different target groups, we show (i) the interdisciplinary challenges during the production process and the lessons learnt and (ii) possibilities to reach the required degree of simplification without the need for dumbing down the content. "We know enough about climate change" is a short animated film that was produced for the German Agency for International Cooperation (GIZ) for training programs and conferences on adaptation in the target countries including Indonesia, Tunisia and Mexico. "Earthbook" is a short animation produced for "The Year of Science" to raise awareness for the topics of sustainability among digital natives. "What is Climate Engineering?", produced for the Institute for Advanced Sustainability Studies (IASS), is a film meant for an informed and interested public. "Wimmelwelt Energie!" is a prototype of an iPad application for children from 4-6 years of age to help them learn about different forms of energy and related greenhouse gas emissions.
An Interactive Concert Program Based on Infrared Watermark and Audio Synthesis

Science.gov (United States)

Wang, Hsi-Chun; Lee, Wen-Pin Hope; Liang, Feng-Ju

The objective of this research is to propose a video/audio system which allows the user to listen to the typical music notes in the concert program under infrared detection. The system synthesizes audio with different pitches and tempi in accordance with the encoded data in a 2-D barcode embedded in the infrared watermark. The digital halftoning technique has been used to fabricate the infrared watermark composed of halftone dots by both amplitude modulation (AM) and frequency modulation (FM). The results show that this interactive system successfully recognizes the barcode and synthesizes audio under infrared detection of a concert program whose contents remain readable by human observers. This interactive video/audio system greatly expands the capability of printed paper to include audio display and also has many potential value-added applications.
An Interactive Mobile Application for the Visually Impaired to Have Access to Listening Audio Books with Handy Books Portal

Directory of Open Access Journals (Sweden)

Avanthika Meenakshi

2015-01-01

Full Text Available Mobile phones are used by people in almost all aspects of life. Visually impaired people, however, are still a step behind in using smartphones for various purposes. Although there are interactive Android OS features, navigation and travel-aid apps using sensors, and voice user interfaces (VUI) or voice response systems, an application for educational purposes is still lacking. This paper proposes a new idea: a portal where visually impaired users can store audio books, aided by an interactive system, so that they can use them whenever needed.
Large Scale Functional Brain Networks Underlying Temporal Integration of Audio-Visual Speech Perception: An EEG Study.

Science.gov (United States)

Kumar, G Vinodh; Halder, Tamesh; Jaiswal, Amit K; Mukherjee, Abhishek; Roy, Dipanjan; Banerjee, Arpan

2016-01-01

Observable lip movements of the speaker influence perception of auditory speech. A classical example of this influence is reported by listeners who perceive an illusory (cross-modal) speech sound (McGurk-effect) when presented with incongruent audio-visual (AV) speech stimuli. Recent neuroimaging studies of AV speech perception accentuate the role of frontal, parietal, and the integrative brain sites in the vicinity of the superior temporal sulcus (STS) for multisensory speech perception. However, whether and how the network across the whole brain participates in multisensory perception processing remains an open question. We posit that large-scale functional connectivity among the neural populations situated in distributed brain sites may provide valuable insights into the processing and fusing of AV speech. Varying the psychophysical parameters in tandem with electroencephalogram (EEG) recordings, we exploited the trial-by-trial perceptual variability of incongruent audio-visual (AV) speech stimuli to identify the characteristics of the large-scale cortical network that facilitates multisensory perception during synchronous and asynchronous AV speech. We evaluated the spectral landscape of EEG signals during multisensory speech perception at varying AV lags. Functional connectivity dynamics for all sensor pairs were computed using the time-frequency global coherence, the vector sum of pairwise coherence changes over time. During synchronous AV speech, we observed enhanced global gamma-band coherence and decreased alpha and beta-band coherence underlying cross-modal (illusory) perception compared to unisensory perception around a temporal window of 300-600 ms following onset of stimuli. During asynchronous speech stimuli, a global broadband coherence was observed during cross-modal perception at earlier times along with pre-stimulus decreases of lower frequency power, e.g., alpha rhythms for positive AV lags and theta rhythms for negative AV lags. Thus, our
Transcript of Audio Narrative Portion of: Scandinavian Heritage. A Set of Five Audio-Visual Film Strip/Cassette Presentations.

Science.gov (United States)

Anderson, Gerald D.; Olson, David B.

The document presents the transcript of the audio narrative portion of approximately 100 interviews with first and second generation Scandinavian immigrants to the United States. The document is intended for use by secondary school classroom teachers as they develop and implement educational programs related to the Scandinavian heritage in…
Robust Detection and Visualization of Jet-Stream Core Lines in Atmospheric Flow.

Science.gov (United States)

Kern, Michael; Hewson, Tim; Sadlo, Filip; Westermann, Rudiger; Rautenhaus, Marc

2018-01-01

Jet-streams, their core lines and their role in atmospheric dynamics have been subject to considerable meteorological research since the first half of the twentieth century. Yet, until today no consistent automated feature detection approach has been proposed to identify jet-stream core lines from 3D wind fields. Such 3D core lines can facilitate meteorological analyses previously not possible. Although jet-stream cores can be manually analyzed by meteorologists in 2D as height ridges in the wind speed field, to the best of our knowledge no automated ridge detection approach has been applied to jet-stream core detection. In this work, we, a team of visualization scientists and meteorologists, propose a method that exploits directional information in the wind field to extract core lines in a robust and numerically less involved manner than traditional 3D ridge detection. For the first time, we apply the extracted 3D core lines to meteorological analysis, considering real-world case studies and demonstrating our method's benefits for weather forecasting and meteorological research.
Effectiveness of braille and audio-tactile performance technique for improving oral hygiene status of visually impaired adolescents

Directory of Open Access Journals (Sweden)

Sushmita Deshpande

2017-01-01

Full Text Available Background: Visually impaired people encounter numerous challenges in their daily life which makes it a cumbersome task to pay special attention to oral health needs. Furthermore, there is little knowledge about oral health practices among caretakers and visually impaired individuals, due to which oral health is often neglected when compared to the general health. Hence, there was a need to educate visually challenged individuals about oral hygiene practices in a customized format so that the comprehension of brushing techniques could be conveyed at its best. Materials and Methods: The present study was a randomized control trial of sixty visually impaired adolescents who were divided into three groups of 20 each. In Group 1, Braille was used, whereas in Group 2, audio-tactile performance (ATP) technique and in Group 3, a combination of both the methods were used to teach tooth brushing as a part of oral health education. Pre- and post-plaque index score using Silness and Loe (1967) after health education were calculated and tabulated for statistical analysis. Results: The postintervention mean plaque index score increased in Group 1 from 29.45 to 42.98, whereas the mean plaque score decreased in Groups 2 and 3 from 30.83 to 29.9 and from 30.23 to 18.73, respectively. Intergroup comparison of postplaque index score using Kruskal–Wallis and ANOVA analysis showed significant difference among all three study groups. Conclusion: The combination of Braille and ATP technique of health education served as the most effective medium to teach oral hygiene methods to visually impaired adolescents.
Effectiveness of braille and audio-tactile performance technique for improving oral hygiene status of visually impaired adolescents.

Science.gov (United States)

Deshpande, Sushmita; Rajpurohit, Ladusingh; Kokka, Vivian Varghese

2017-01-01

Visually impaired people encounter numerous challenges in their daily life which makes it a cumbersome task to pay special attention to oral health needs. Furthermore, there is little knowledge about oral health practices among caretakers and visually impaired individuals, due to which oral health is often neglected when compared to the general health. Hence, there was a need to educate visually challenged individuals about oral hygiene practices in a customized format so that the comprehension of brushing techniques could be conveyed at its best. The present study was a randomized control trial of sixty visually impaired adolescents who were divided into three groups of 20 each. In Group 1, Braille was used, whereas in Group 2, audio-tactile performance (ATP) technique and in Group 3, a combination of both the methods were used to teach tooth brushing as a part of oral health education. Pre- and post-plaque index score using Silness and Loe (1967) after health education were calculated and tabulated for statistical analysis. The postintervention mean plaque index score increased in Group 1 from 29.45 to 42.98, whereas the mean plaque score decreased in Groups 2 and 3 from 30.83 to 29.9 and from 30.23 to 18.73, respectively. Intergroup comparison of postplaque index score using Kruskal-Wallis and ANOVA analysis showed significant difference among all three study groups. The combination of Braille and ATP technique of health education served as the most effective medium to teach oral hygiene methods to visually impaired adolescents.
Technical Evaluation Report 31: Internet Audio Products (3/3)

Directory of Open Access Journals (Sweden)

Jim Rudolph

2004-08-01

Full Text Available Two contrasting additions to the online audio market are reviewed: iVocalize, a browser-based audio-conferencing software, and Skype, a PC-to-PC Internet telephone tool. These products are selected for review on the basis of their success in gaining rapid popular attention and usage during 2003-04. The iVocalize review emphasizes the product’s role in the development of a series of successful online audio communities – notably several serving visually impaired users. The Skype review stresses the ease with which the product may be used for simultaneous PC-to-PC communication among up to five users. Editor’s Note: This paper serves as an introduction to reports about online community building, and reviews of online products for disabled persons, in the next ten reports in this series. JPB, Series Ed.
How visual working memory contents influence priming of visual attention.

Science.gov (United States)

Carlisle, Nancy B; Kristjánsson, Árni

2017-04-12

Recent evidence shows that when the contents of visual working memory overlap with targets and distractors in a pop-out search task, intertrial priming is inhibited (Kristjánsson, Sævarsson & Driver, Psychon Bull Rev 20(3):514-521, 2013, Experiment 2, Psychonomic Bulletin and Review). This may reflect an interesting interaction between implicit short-term memory, thought to underlie intertrial priming, and explicit visual working memory. Evidence from a non-pop-out search task suggests that it may specifically be holding distractors in visual working memory that disrupts intertrial priming (Cunningham & Egeth, Psychol Sci 27(4):476-485, 2016, Experiment 2, Psychological Science). We examined whether the inhibition of priming depends on whether feature values in visual working memory overlap with targets or distractors in the pop-out search, and we found that the inhibition of priming resulted from holding distractors in visual working memory. These results are consistent with separate mechanisms of target and distractor effects in intertrial priming, and support the notion that the impact of implicit short-term memory and explicit visual working memory can interact when each provides conflicting attentional signals.
A Fundamental Study on Influence of Concurrently Presented Visual Stimulus Upon Loudness Perception

Directory of Open Access Journals (Sweden)

Koji Abe

2011-10-01

Full Text Available As a basic study on the influence of the dynamic properties of audio-visual stimuli upon the interaction between audition and vision, the effect of simple movement in the visual stimulus on the loudness perception of the audio stimulus was investigated via a psychophysical experiment. In this experiment, the visual stimulus given to subjects along with the audio stimulus was a bar appearing on a display, one side of which flexibly expanded and contracted. The loudness of the audio stimulus with such a visual effect concurrently presented was rated as an absolute numerical value by using the Magnitude Estimation method. The reference bar length was determined so as to correspond to the Zwicker loudness calculated for the given audio stimulus. As a result, the visual stimulus did not affect the loudness perception when the bar was presented with its length equal to the reference. On the other hand, the loudness rating for the same audio stimulus was significantly increased when the bar length was longer than the reference. This indicates that a change in the correspondence between the audio and the visual stimuli affects the loudness perception.
The Transfer of Learning Associated with Audio Feedback on Written Work

Directory of Open Access Journals (Sweden)

Tanya Martini

2014-11-01

Full Text Available This study examined whether audio feedback provided to undergraduates (N=51) about one paper would prove beneficial in terms of improving their grades on another, unrelated paper of the same type. We examined this issue both in terms of student beliefs about learning transfer, as well as their actual ability to transfer what had been learned on one assignment to another, subsequent assignment. Results indicated that students believed that they would be able to transfer what they had learned via audio feedback. Moreover, results also suggested that students actually did generalize the overarching comments about content and structure made in the audio files to a subsequent paper, the content of which differed substantially from the initial one. Both students and teaching assistants demonstrated very favourable responses to this type of feedback, suggesting that it was both clear and comprehensive.
The time-course of activation in the dorsal and ventral visual streams during landmark cueing and perceptual discrimination tasks.

Science.gov (United States)

Lambert, Anthony J; Wootton, Adrienne

2017-08-01

Different patterns of high density EEG activity were elicited by the same peripheral stimuli, in the context of Landmark Cueing and Perceptual Discrimination tasks. The C1 component of the visual event-related potential (ERP) at parietal - occipital electrode sites was larger in the Landmark Cueing task, and source localisation suggested greater activation in the superior parietal lobule (SPL) in this task, compared to the Perceptual Discrimination task, indicating stronger early recruitment of the dorsal visual stream. In the Perceptual Discrimination task, source localisation suggested widespread activation of the inferior temporal gyrus (ITG) and fusiform gyrus (FFG), structures associated with the ventral visual stream, during the early phase of the P1 ERP component. Moreover, during a later epoch (171-270ms after stimulus onset) increased temporal-occipital negativity, and stronger recruitment of ITG and FFG were observed in the Perceptual Discrimination task. These findings illuminate the contrasting functions of the dorsal and ventral visual streams, to support rapid shifts of attention in response to contextual landmarks, and conscious discrimination, respectively. Copyright © 2017 Elsevier Ltd. All rights reserved.

Audio Conferencing Enhancements

OpenAIRE

VESTERINEN, LEENA

2006-01-01

Audio conferencing allows multiple people in distant locations to interact in a single voice call. Whilst it can be a very useful service, it also has several key disadvantages. This thesis study investigated the options for improving the user experience of mobile teleconferencing applications. In particular, the use of 3D, spatial audio and visual-interactive functionality was investigated as the means of improving the intelligibility and audio perception during the audio...
Audio-visual speech perception in prelingually deafened Japanese children following sequential bilateral cochlear implantation.

Science.gov (United States)

Yamamoto, Ryosuke; Naito, Yasushi; Tona, Risa; Moroto, Saburo; Tamaya, Rinko; Fujiwara, Keizo; Shinohara, Shogo; Takebayashi, Shinji; Kikuchi, Masahiro; Michida, Tetsuhiko

2017-11-01

An effect of audio-visual (AV) integration is observed when the auditory and visual stimuli are incongruent (the McGurk effect). In general, AV integration is helpful especially in subjects wearing hearing aids or cochlear implants (CIs). However, the influence of AV integration on spoken word recognition in individuals with bilateral CIs (Bi-CIs) has not been fully investigated so far. In this study, we investigated AV integration in children with Bi-CIs. The study sample included thirty one prelingually deafened children who underwent sequential bilateral cochlear implantation. We assessed their responses to congruent and incongruent AV stimuli with three CI-listening modes: only the 1st CI, only the 2nd CI, and Bi-CIs. The responses were assessed in the whole group as well as in two sub-groups: a proficient group (syllable intelligibility ≥80% with the 1st CI) and a non-proficient group (syllable intelligibility effect in each of the three CI-listening modes. AV integration responses were observed in a subset of incongruent AV stimuli, and the patterns observed with the 1st CI and with Bi-CIs were similar. In the proficient group, the responses with the 2nd CI were not significantly different from those with the 1st CI whereas in the non-proficient group the responses with the 2nd CI were driven by visual stimuli more than those with the 1st CI. Our results suggested that prelingually deafened Japanese children who underwent sequential bilateral cochlear implantation exhibit AV integration abilities, both in monaural listening as well as in binaural listening. We also observed a higher influence of visual stimuli on speech perception with the 2nd CI in the non-proficient group, suggesting that Bi-CIs listeners with poorer speech recognition rely on visual information more compared to the proficient subjects to compensate for poorer auditory input. Nevertheless, poorer quality auditory input with the 2nd CI did not interfere with AV integration with binaural
Blindness alters the microstructure of the ventral but not the dorsal visual stream

DEFF Research Database (Denmark)

Reislev, Nina L; Kupers, Ron; Siebner, Hartwig R

2016-01-01

Visual deprivation from birth leads to reorganisation of the brain through cross-modal plasticity. Although there is a general agreement that the primary afferent visual pathways are altered in congenitally blind individuals, our knowledge about microstructural changes within the higher-order visual streams, and how this is affected by onset of blindness, remains scant. We used diffusion tensor imaging and tractography to investigate microstructural features in the dorsal (superior longitudinal fasciculus) and ventral (inferior longitudinal and inferior fronto-occipital fasciculi) visual pathways in 12 congenitally blind, 15 late blind and 15 normal sighted controls. We also studied six prematurely born individuals with normal vision to control for the effects of prematurity on brain connectivity. Our data revealed a reduction in fractional anisotropy in the ventral but not the dorsal...
Efforts to Improve Learning Activity and Learning Outcomes for the Material on Appreciating the Uniqueness of Local Regional Music Using Audio-Visual Media among Class VII A Students of SMP Negeri 3 Randudongkal

Directory of Open Access Journals (Sweden)

Rina Muktinurasih

2014-02-01

Full Text Available Folk music is an element of simplicity and regionalism. Improving activities toward the appreciation on the work of art, especially folk music, was carried out by identifying the variety of folk songs, according to the personal view of most students. Over the course of the years, most students can only enjoy music. Because it takes an interest in advance so that students can express the music. Music learning needs a lot of practice; however, most of the time teachers dominate the classroom time allocation, while the students do not have adequate time to practice. The problems addressed in this study are: (1) whether or not the use of audio-visual media can improve students' learning activity in folk music appreciation; (2) whether or not the use of audio-visual media can improve students' learning outcomes in folk music appreciation material. The method used in this study was classroom action research with two cycles; each cycle consists of 4 phases: (1) planning, (2) implementation, (3) observation/evaluation, (4) reflection. The research results show that there were improvements in both the students' learning activities and outcomes from the use of audio-visual learning media in folk music appreciation material. During the pre-cycle only 16 out of 34 students passed (47.07%), on the first cycle 20 out of 34 students passed (74.24%), and finally on the second cycle 28 out of 34 students passed (82.35%). Therefore it can be concluded that by the end of this second cycle, the indicator of the overall success has achieved the required frequency.
Analysis and Implementation of Gossip-Based P2P Streaming with Distributed Incentive Mechanisms for Peer Cooperation

Directory of Open Access Journals (Sweden)

Sachin Agarwal

2007-10-01

Full Text Available Peer-to-peer (P2P) systems are becoming a popular means of streaming audio and video content but they are prone to bandwidth starvation if selfish peers do not contribute bandwidth to other peers. We prove that an incentive mechanism can be created for a live streaming P2P protocol while preserving the asymptotic properties of randomized gossip-based streaming. In order to show the utility of our result, we adapt a distributed incentive scheme from P2P file storage literature to the live streaming scenario. We provide simulation results that confirm the ability to achieve a constant download rate (in time, per peer) that is needed for streaming applications on peers. The incentive scheme fairly differentiates peers' download rates according to the amount of useful bandwidth they contribute back to the P2P system, thus creating a powerful quality-of-service incentive for peers to contribute bandwidth to other peers. We propose a functional architecture and protocol format for a gossip-based streaming system with incentive mechanisms, and present evaluation data from a real implementation of a P2P streaming application.
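The idea of tying a peer's download rate to the bandwidth it contributes can be illustrated with a toy simulation. The following is only a minimal sketch under stated assumptions, not the incentive scheme of the paper: the partner-selection rule (targets drawn with probability proportional to a contribution score), the peer names, the contribution values, and the fanout are all hypothetical.

```python
import random

def gossip_round(contrib, rng, fanout=2):
    """One push-gossip round: every peer sends `fanout` chunks.

    Hypothetical rule: each target is drawn with probability proportional
    to its contribution score, so generous peers receive more chunks.
    """
    peers = list(contrib)
    received = {p: 0 for p in peers}
    for sender in peers:
        others = [p for p in peers if p != sender]
        weights = [contrib[p] for p in others]
        # Draw `fanout` targets (with replacement), weighted by contribution.
        for target in rng.choices(others, weights=weights, k=fanout):
            received[target] += 1
    return received

def simulate(contrib, rounds=2000, seed=0):
    """Average number of chunks received per round by each peer."""
    rng = random.Random(seed)
    totals = {p: 0 for p in contrib}
    for _ in range(rounds):
        for peer, count in gossip_round(contrib, rng).items():
            totals[peer] += count
    return {p: totals[p] / rounds for p in contrib}

# Hypothetical contribution scores (arbitrary units of upload bandwidth).
contrib = {"generous": 8.0, "average": 4.0, "selfish": 1.0}
rates = simulate(contrib)
for peer, rate in rates.items():
    print(f"{peer}: {rate:.2f} chunks/round")
```

In this toy model the total number of chunks sent per round is fixed, so a higher rate for contributing peers comes directly at the expense of selfish ones, which is the differentiation effect the abstract describes.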
Communicating Risk Information in Direct-to-Consumer Prescription Drug Television Ads: A Content Analysis.

Science.gov (United States)

Sullivan, Helen W; Aikin, Kathryn J; Poehlman, Jon

2017-11-10

Direct-to-consumer (DTC) television ads for prescription drugs are required to disclose the product's major risks in the audio or audio and visual parts of the presentation (sometimes referred to as the "major statement"). The objective of this content analysis was to determine how the major statement of risks is presented in DTC television ads, including what risk information is presented, how easy or difficult it is to understand the risk information, and the audio and visual characteristics of the major statement. We identified 68 DTC television ads for branded prescription drugs, which included a unique major statement and that aired between July 2012 and August 2014. We used subjective and objective measures to code 50 ads randomly selected from the main sample. Major statements often presented numerous risks, usually in order of severity, with no quantitative information about the risks' severity or prevalence. The major statements required a high school reading level, and many included long and complex sentences. The major statements were often accompanied by competing non-risk information in the visual images, presented with moderately fast-paced music, and read at a faster pace than benefit information. Overall, we discovered several ways in which the communication of risk information could be improved.
Basin Visual Estimation Technique (BVET) and Representative Reach Approaches to Wadeable Stream Surveys: Methodological Limitations and Future Directions

Science.gov (United States)

Lance R. Williams; Melvin L. Warren; Susan B. Adams; Joseph L. Arvai; Christopher M. Taylor

2004-01-01

Basin Visual Estimation Techniques (BVET) are used to estimate abundance for fish populations in small streams. With BVET, independent samples are drawn from natural habitat units in the stream rather than sampling "representative reaches." This sampling protocol provides an alternative to traditional reach-level surveys, which are criticized for their lack...
Delayed action does not always require the ventral stream: a study on a patient with visual form agnosia.

Science.gov (United States)

Hesse, Constanze; Schenk, Thomas

2014-05-01

It has been suggested that while movements directed at visible targets are processed within the dorsal stream, movements executed after delay rely on the visual representations of the ventral stream (Milner & Goodale, 2006). This interpretation is supported by the observation that a patient with ventral stream damage (D.F.) has trouble performing accurate movements after a delay, but performs normally when the target is visible during movement programming. We tested D.F.'s visuomotor performance in a letter-posting task whilst varying the amount of visual feedback available. Additionally, we also varied whether D.F. received tactile feedback at the end of each trial (posting through a letter box vs posting on a screen) and whether environmental cues were available during the delay period (removing the target only vs suppressing vision completely with shutter glasses). We found that in the absence of environmental cues patient D.F. was unaffected by the introduction of delay and performed as accurately as healthy controls. However, when environmental cues and vision of the moving hand were available during and after the delay period, D.F.'s visuomotor performance was impaired. Thus, while healthy controls benefit from the availability of environmental landmarks and/or visual feedback of the moving hand, such cues seem less beneficial to D.F. Taken together our findings suggest that ventral stream damage does not always impact the ability to make delayed movements but compromises the ability to use environmental landmarks and visual feedback efficiently. Copyright © 2014 Elsevier Ltd. All rights reserved.
A comparative evaluation of oral hygiene using Braille and audio instructions among institutionalized visually impaired children aged between 6 years and 20 years: A 3-month follow-up study.

Science.gov (United States)

Mahantesha, Taranatha; Nara, Asha; Kumari, Parveen Reddy; Halemani, Praveen Kumar Nugadoni; Buddiga, Vinutna; Mythri, Sarpangala

2015-12-01

The aim of this study is to compare the oral hygiene status among institutionalized visually impaired children aged between 6 and 20 years given Braille and audio instructions in Raichur city of Karnataka. A total of 50 children aged 6 to 20 years from a residential school for visually impaired children were included in this study. These children were randomly divided into two equal groups. One group was given oral hygiene instructions by audio recordings and the other was given instructions written in Braille, and both were instructed to practice the same. After three months the oral hygiene status and dental caries experience were recorded and compared using patient performance index. Statistical analysis was done by Student's paired t test and multiple comparison by Tukey's HSD (honest significant difference) test. The mean PHP (Patient Hygiene Performance) score of group A at baseline was 3.88 compared to 3.90 of group B. At 7 days the PHP score of group A and group B was 3.42 and 3.45 respectively. At 3 months the PHP score of group A and group B was 2.47 and 2.86 respectively. Even though the mean PHP index score decreased over time, the difference in scores between the two groups was not statistically significant. In group A the mean difference of PHP score between baseline and 7 days was 0.46, and between baseline and 3 months it was 1.40. The difference in PHP score between 7 days and 3 months was 0.94. All the above values were statistically significant. An effective dental health education method has to be instituted for visually impaired children. The present study shows improvement of oral health status in both study groups by a decrease in the mean plaque score. Hence continuous motivation and reinforcement in the form of Braille and audio instruction is beneficial to achieve good oral hygiene levels in visually impaired children.
Open soundcard as a platform for practical, laboratory study of digital audio

DEFF Research Database (Denmark)

Dimitrov, Smilen; Serafin, Stefania

2014-01-01

This article investigates how lacking suitable platforms for laboratory exercises becomes a learning problem, limiting the practical experience students gain. In engineering education, laboratory demonstration difficulty of issues like real-time streaming in digital signal and audio processing...... afforded by such laboratories, and their open nature, could testably improve the diversity of demonstrated practical topics, while maintaining engineering students' motivation....
Visualizing the semantic content of large text databases using text maps

Science.gov (United States)

Combs, Nathan

1993-01-01

A methodology for generating text map representations of the semantic content of text databases is presented. Text maps provide a graphical metaphor for conceptualizing and visualizing the contents and data interrelationships of large text databases. Described are a set of experiments conducted against the TIPSTER corpora of Wall Street Journal articles. These experiments provide an introduction to current work in the representation and visualization of documents by way of their semantic content.
Real-Time Transmission and Storage of Video, Audio, and Health Data in Emergency and Home Care Situations

Directory of Open Access Journals (Sweden)

Riccardo Stagnaro

2007-01-01

Full Text Available The increase in the availability of bandwidth for wireless links, network integration, and the computational power on fixed and mobile platforms at affordable costs allows nowadays for the handling of audio and video data, their quality making them suitable for medical application. These information streams can support both continuous monitoring and emergency situations. According to this scenario, the authors have developed and implemented the mobile communication system which is described in this paper. The system is based on the ITU-T H.323 multimedia terminal recommendation, suitable for real-time data/video/audio and telemedical applications. The video and audio codecs, respectively H.264 and G.723.1, were implemented and optimized in order to obtain high performance on the system target processors. Offline media streaming storage and retrieval functionalities were supported by integrating a relational database in the hospital central system. The system is based on low-cost consumer technologies such as general packet radio service (GPRS) and wireless local area network (WLAN or WiFi) for lowband data/video transmission. Implementation and testing were carried out for medical emergency and telemedicine application. In this paper, the emergency case study is described.
Early Local Activity in Temporal Areas Reflects Graded Content of Visual Perception

Directory of Open Access Journals (Sweden)

Chiara Francesca Tagliabue

2016-04-01

Full Text Available In visual cognitive neuroscience the debate on consciousness is focused on two major topics: the search for the neural correlates of the different properties of visual awareness and the controversy on the graded versus dichotomous nature of visual conscious experience. The aim of this study is to search for the possible neural correlates of different grades of visual awareness investigating the Event Related Potentials (ERPs) to reduced contrast visual stimuli whose perceptual clarity was rated on the four-point Perceptual Awareness Scale (PAS). Results revealed a left centro-parietal negative deflection (Visual Awareness Negativity; VAN) peaking at 280-320 ms from stimulus onset, related to the perceptual content of the stimulus, followed by a bilateral positive deflection (Late Positivity; LP) peaking at 510-550 ms over almost all electrodes, reflecting post-perceptual processes performed on such content. Interestingly, the amplitude of both deflections gradually increased as a function of visual awareness. Moreover, the intracranial generators of the phenomenal content (VAN) were found to be located in the left temporal lobe. The present data thus seem to suggest (1) that visual conscious experience is characterized by a gradual increase of perceived clarity at both behavioral and neural level and (2) that the actual content of perceptual experiences emerges from early local activation in temporal areas, without the need of later widespread frontal engagement.
Content-based TV sports video retrieval using multimodal analysis

Science.gov (United States)

Yu, Yiqing; Liu, Huayong; Wang, Hongbin; Zhou, Dongru

2003-09-01

In this paper, we propose content-based video retrieval, which is a kind of retrieval by semantic content. Because video data is composed of multimodal information streams such as video, auditory and textual streams, we describe a strategy of using multimodal analysis for automatically parsing sports video. The paper first defines the basic structure of a sports video database system, and then introduces a new approach that integrates visual stream analysis, speech recognition, speech signal processing and text extraction to realize video retrieval. The experimental results for TV sports video of football games indicate that the multimodal analysis is effective for video retrieval by quickly browsing tree-like video clips or inputting keywords within a predefined domain.
Audio Twister

DEFF Research Database (Denmark)

Cermak, Daniel; Moreno Garcia, Rodrigo; Monastiridis, Stefanos

2015-01-01

Daniel Cermak-Sassenrath, Rodrigo Moreno Garcia, Stefanos Monastiridis. Audio Twister. Installation. P-Hack Copenhagen 2015, Copenhagen, DK, Apr 24, 2015.
Back to basics audio

CERN Document Server

Nathan, Julian

1998-01-01

Back to Basics Audio is a thorough, yet approachable handbook on audio electronics theory and equipment. The first part of the book discusses electrical and audio principles. Those principles form a basis for understanding the operation of equipment and systems, covered in the second section. Finally, the author addresses planning and installation of a home audio system. Julian Nathan joined the audio service and manufacturing industry in 1954 and moved into motion picture engineering and production in 1960. He installed and operated recording theaters in Sydney, Austra
Reduction in time-to-sleep through EEG based brain state detection and audio stimulation.

Science.gov (United States)

Zhuo Zhang; Cuntai Guan; Ti Eu Chan; Juanhong Yu; Aung Aung Phyo Wai; Chuanchu Wang; Haihong Zhang

2015-08-01

We developed an EEG- and audio-based sleep sensing and enhancing system, called iSleep (interactive Sleep enhancement apparatus). The system adopts a closed-loop approach which optimizes the audio recording selection based on the user's sleep status detected through our online EEG computing algorithm. The iSleep prototype comprises two major parts: 1) a sleeping mask integrated with a single channel EEG electrode and amplifier, a pair of stereo earphones and a microcontroller with wireless circuit for control and data streaming; 2) a mobile app to receive EEG signals for online sleep monitoring and audio playback control. In this study we attempt to validate our hypothesis that appropriate audio stimulation in relation to brain state can induce faster onset of sleep and improve the quality of a nap. We conduct experiments on 28 healthy subjects, each undergoing two nap sessions: one with a quiet background and one with our audio stimulation. We compare the time-to-sleep in both sessions between two groups of subjects, namely fast and slow sleep onset groups. The p-value obtained from the Wilcoxon Signed Rank Test is 1.22e-04 for the slow onset group, which demonstrates that iSleep can significantly reduce the time-to-sleep for people with difficulty in falling asleep.
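For readers unfamiliar with the Wilcoxon signed-rank test mentioned above, the following is a minimal sketch of how such a paired comparison might be computed. It is not the authors' analysis code: the function name, the normal approximation (without tie or continuity correction), and the time-to-sleep values are all assumptions made for illustration, and the data are invented rather than taken from the study.

```python
import math

def wilcoxon_signed_rank(x, y):
    """Two-sided Wilcoxon signed-rank test via the normal approximation.

    Zero differences are dropped and tied absolute differences receive
    average ranks; no tie or continuity correction is applied.
    Returns (W, p), where W is the smaller of the two signed-rank sums.
    Requires at least one non-zero paired difference.
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the run of tied absolute differences starting at position i.
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    w = min(w_plus, w_minus)
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w - mean) / sd
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability
    return w, p

# Invented time-to-sleep values in minutes for ten hypothetical subjects.
quiet = [22.0, 18.5, 25.0, 30.5, 19.0, 27.5, 24.0, 21.0, 29.0, 26.5]
audio = [15.0, 16.0, 17.5, 20.0, 18.0, 19.5, 20.5, 22.5, 18.0, 17.0]
w, p = wilcoxon_signed_rank(quiet, audio)
print(f"W = {w}, p = {p:.4f}")
```

The normal approximation is rough for samples this small; an exact test, as offered by standard statistics packages, would be preferable in practice.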
Effect of Cartoon Illustrations on the Comprehension and Evaluation of Information Presented in the Print and Audio Mode.

Science.gov (United States)

Sewell, Edward H., Jr.

This study investigates the effects of cartoon illustrations on female and male college student comprehension and evaluation of information presented in several combinations of print, audio, and visual formats. Subjects were assigned to one of five treatment groups: printed text, printed text with cartoons, audiovisual presentations, audio only…
Using Audio Description to Improve FLL Students' Oral Competence in MALL: Methodological Preliminaries

Science.gov (United States)

Ibáñez Moreno, Ana; Vermeulen, Anna; Jordano, Maria

2016-01-01

During the last decades of the 20th century, audiovisual products began to be audio described in order to make them accessible to blind and visually impaired people (Benecke, 2004). This means that visual information is orally described in the gaps between dialogues. In order to meet the wishes of the so-called On Demand (OD) generation that wants…
MEL-IRIS: An Online Tool for Audio Analysis and Music Indexing

Directory of Open Access Journals (Sweden)

Dimitrios Margounakis

2009-01-01

Full Text Available Chroma is an important attribute of music and sound, although it has not yet been adequately defined in the literature. As such, it can be used for further analysis of sound, resulting in interesting colorful representations that can be used in many tasks: indexing, classification, and retrieval. Especially in Music Information Retrieval (MIR), the visualization of the chromatic analysis can be used for comparison, pattern recognition, melodic sequence prediction, and color-based searching. MEL-IRIS is the tool which has been developed in order to analyze audio files and characterize music based on chroma. The tool implements specially designed algorithms and a unique way of visualization of the results. The tool is network-oriented and can be installed in audio servers, in order to manipulate large music collections. Several samples from world music have been tested and processed, in order to demonstrate the possible uses of such an analysis.

Speech and audio processing for coding, enhancement and recognition

CERN Document Server

Togneri, Roberto; Narasimha, Madihally

2015-01-01

This book describes the basic principles underlying the generation, coding, transmission and enhancement of speech and audio signals, including advanced statistical and machine learning techniques for speech and speaker recognition with an overview of the key innovations in these areas. Key research undertaken in speech coding, speech enhancement, speech recognition, emotion recognition and speaker diarization are also presented, along with recent advances and new paradigms in these areas. · Offers readers a single-source reference on the significant applications of speech and audio processing to speech coding, speech enhancement and speech/speaker recognition. Enables readers involved in algorithm development and implementation issues for speech coding to understand the historical development and future challenges in speech coding research; · Discusses speech coding methods yielding bit-streams that are multi-rate and scalable for Voice-over-IP (VoIP) Networks; · ...
APPLICATION OF CONTROLLED SOURCE AUDIO MAGNETOTELLURIC (CSAMT) AT GEOTHERMAL

Directory of Open Access Journals (Sweden)

Susilawati S.

2017-04-01

Full Text Available CSAMT, or Controlled Source Audio-Magnetotelluric, is a geophysical method for determining the resistivity of rock beneath the Earth's surface. The CSAMT method uses an artificial current injected into the ground, with source frequencies ranging from 0.1 Hz to 10 kHz. The CSAMT data are corrected for the source effect and then inverted. The inversion results show a layer with resistivity values between 2.5 Ω.m and 15 Ω.m, which is interpreted as clay.
Comprehending News Videotexts: The Influence of the Visual Content

Science.gov (United States)

Cross, Jeremy

2011-01-01

Informed by dual coding theory, this study explores the role of the visual content in L2 listeners' comprehension of news videotexts. L1 research into the visual characteristics and comprehension of news videotexts is outlined, subsequently informing the quantitative analysis of audiovisual correspondence in the news videotexts used. In each of…
Audio Quality Assurance : An Application of Cross Correlation

DEFF Research Database (Denmark)

Jurik, Bolette Ammitzbøll; Nielsen, Jesper Asbjørn Sindahl

2012-01-01

We describe algorithms for automated quality assurance on content of audio files in context of preservation actions and access. The algorithms use cross correlation to compare the sound waves. They are used to do overlap analysis in an access scenario, where preserved radio broadcasts are used in...
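A rough illustration of the overlap analysis mentioned above: if the end of one recording and the start of the next contain the same broadcast material, a normalized cross-correlation computed at each candidate offset peaks where the second recording begins inside the first. This is a minimal pure-Python sketch, not the authors' algorithm; the function names and the `min_overlap` parameter are hypothetical.

```python
import math


def normalized_xcorr(a, b, lag):
    """Pearson correlation between a[lag:] and the start of b."""
    n = min(len(a) - lag, len(b))
    if n <= 0:
        return 0.0
    x = a[lag:lag + n]
    y = b[:n]
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    num = sum((p - mean_x) * (q - mean_y) for p, q in zip(x, y))
    den = math.sqrt(
        sum((p - mean_x) ** 2 for p in x) * sum((q - mean_y) ** 2 for q in y)
    )
    return num / den if den else 0.0


def find_overlap(a, b, min_overlap=16):
    """Return (lag, score): the offset into `a` where `b` most plausibly starts.

    Only lags that leave at least `min_overlap` samples in common are tried,
    since very short overlaps give unreliable correlation scores.
    """
    best_lag, best_score = 0, -1.0
    for lag in range(0, len(a) - min_overlap + 1):
        score = normalized_xcorr(a, b, lag)
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag, best_score
```

A score close to 1 at the best lag suggests the two files share content there, while a low best score suggests no overlap. The brute-force search is quadratic in the signal length; on real audio an FFT-based correlation would normally replace it.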
The transnational appeal of Danish TV series

DEFF Research Database (Denmark)

Jensen, Pia Majbritt

because it challenges existing theories on global media geography, import/export of audio-visual content, transnational media reception and the importance of transnational TV viewing. According to these theories, non-Anglophone audio-visual content rarely exports outside its geo-linguistic region...... – in Denmark’s case the Nordic region – because audiences in other regions would be too far removed culturally and linguistically, and hence feel alienated. Similarly, theories on the consumption of audio-visual content have neglected transnational, ‘non-resident’, viewing – i.e. when audiences engage...... with audio-visual content removed from their own (cultural) context as would be the case with international audiences engaging with Danish series – and instead emphasized the importance of geo-linguistic, national or ‘resident’ viewing. Even in cases when transnational viewing has been theorized...
“A Real China” on User-Generated Videos? Audio-Visual Narratives of Confucianism

Directory of Open Access Journals (Sweden)

Jianxiu Hao

2014-03-01

Full Text Available Beneath the “Chinese successful story”, social stratification, class polarization, and cultural displacement have accelerated. The Chinese Communist Party has not found a coherent solution to the challenges of reconciling social interests, since Communism has increasingly become mere “lip service”. However, it has been claimed that Confucian values can provide resources to counter the downsides of modernization in contemporary Chinese society. This study investigates the revival of Confucianism, as a source for criticism and construction in Chinese socio-culture, as portrayed in user-generated videos that are produced and consumed by the largest Internet-using population in the world, under a Chinese authoritarian regime that controls communication. By means of a thematic audio-visual narrative analysis, this study examined 20 hours of Youku Paike videos published between 2007 and 2013. It found that (1) about one third of the user-generated videos can be interpreted as Confucian thematic narratives, with a slightly increasing trend in the portrayal of Confucian values; and (2) Confucianism can become a source for the formation of a new online socio-culture, in the circumstances of China’s modernization and cyberization, to advocate social actors’ cultivation and humanity’s flourishing.
Content and user-based music visual analysis

Science.gov (United States)

Guo, Xiaochun; Tang, Lei

2015-12-01

In recent years, people's ability to collect music has improved greatly. Many people who prefer listening to music offline store thousands of tracks on local storage or portable devices. However, their ability to manage music information has not improved accordingly, which results in two problems: how to find favourite songs in a large music dataset while satisfying different individuals, and how to compose a playlist quickly. To solve these problems, the authors propose a content- and user-based music visual analysis approach. They first develop a new recommendation algorithm based on the content of the music and the user's behaviour, which caters to individual preferences. They then use visualization and interaction tools to illustrate the relationships between songs and help people compose a suitable playlist. The paper closes with a survey indicating that the system is usable and effective.
Online dissection audio-visual resources for human anatomy: Undergraduate medical students' usage and learning outcomes.

Science.gov (United States)

Choi-Lundberg, Derek L; Cuellar, William A; Williams, Anne-Marie M

2016-11-01

In an attempt to improve undergraduate medical student preparation for and learning from dissection sessions, dissection audio-visual resources (DAVR) were developed. Data from e-learning management systems indicated DAVR were accessed by 28% ± 10 (mean ± SD for nine DAVR across three years) of students prior to the corresponding dissection sessions, representing at most 58% ± 20 of assigned dissectors. Approximately 50% of students accessed all available DAVR by the end of semester, while 10% accessed none. Ninety percent of survey respondents (response rate 58%) generally agreed that DAVR improved their preparation for and learning from dissection when used. Of several learning resources, only DAVR usage had a significant positive correlation (P = 0.002) with feeling prepared for dissection. Results on cadaveric anatomy practical examination questions in year 2 (Y2) and year 3 (Y3) cohorts were 3.9% (P learning outcomes of more students. Anat Sci Educ 9: 545-554. © 2016 American Association of Anatomists.
Integrating sentiment analysis and term associations with geo-temporal visualizations on customer feedback streams

Science.gov (United States)

Hao, Ming; Rohrdantz, Christian; Janetzko, Halldór; Keim, Daniel; Dayal, Umeshwar; Haug, Lars-Erik; Hsu, Mei-Chun

2012-01-01

Twitter currently receives over 190 million tweets (small text-based Web posts) and manufacturing companies receive over 10 thousand web product surveys a day, in which people share their thoughts regarding a wide range of products and their features. A large number of tweets and customer surveys include opinions about products and services. However, with Twitter being a relatively new phenomenon, these tweets are underutilized as a source for determining customer sentiments. To explore high-volume customer feedback streams, we integrate three time series-based visual analysis techniques: (1) feature-based sentiment analysis that extracts, measures, and maps customer feedback; (2) a novel idea of term associations that identify attributes, verbs, and adjectives frequently occurring together; and (3) new pixel cell-based sentiment calendars, geo-temporal map visualizations and self-organizing maps to identify co-occurring and influential opinions. We have combined these techniques into a well-fitted solution for an effective analysis of large customer feedback streams such as for movie reviews (e.g., Kung-Fu Panda) or web surveys (buyers).
Publicación de materiales audiovisuales a través de un servidor de video-streaming Publication of audio-visual materials through a streaming video server

Directory of Open Access Journals (Sweden)

Acevedo Clavijo Edwin Jovanny

2010-07-01

Full Text Available This proposal studies several streaming-server alternatives in order to determine the best tool for publishing educational audio-visual material. The most widely used platforms were evaluated on the features and benefits of each server, among them Helix Universal Server, Microsoft Windows Media Server, Peer Cast and Darwin Server. The aim is to implement a server with greater capabilities and benefits for publishing videos for academic purposes over the intranet of the Cooperative University of Colombia, Barrancabermeja branch.
The Effects of Audio-Visual Recorded and Audio Recorded Listening Tasks on the Accuracy of Iranian EFL Learners' Oral Production

Science.gov (United States)

Drood, Pooya; Asl, Hanieh Davatgari

2016-01-01

The ways in which tasks in classrooms have developed and proceeded have received great attention in the field of language teaching and learning, in the sense that they draw learners' attention to competing features such as accuracy, fluency, and complexity. English audiovisual and audio recorded materials have been widely used by teachers and…
Optimal bus and buffer allocation for a set of leaky-bucket-controlled streams

NARCIS (Netherlands)

Boef, den E.; Korst, J.H.M.; Verhaegh, W.F.J.; De Souza, J.N.; Dini, P.; Lorenz, P.

2004-01-01

In an in-home digital network (IHDN) it may be expected that several variable-bit-rate streams (audio, video) run simultaneously over a shared communication device, e.g. a bus. The data supply and demand of most of these streams will not be exactly known in advance, but only a coarse traffic
Rapid discrimination of visual scene content in the human brain

Science.gov (United States)

Anokhin, Andrey P.; Golosheykin, Simon; Sirevaag, Erik; Kristjansson, Sean; Rohrbaugh, John W.; Heath, Andrew C.

2007-01-01

The rapid evaluation of complex visual environments is critical for an organism's adaptation and survival. Previous studies have shown that emotionally significant visual scenes, both pleasant and unpleasant, elicit a larger late positive wave in the event-related brain potential (ERP) than emotionally neutral pictures. The purpose of the present study was to examine whether neuroelectric responses elicited by complex pictures discriminate between specific, biologically relevant contents of the visual scene and to determine how early in the picture processing this discrimination occurs. Subjects (n=264) viewed 55 color slides differing in both scene content and emotional significance. No categorical judgments or responses were required. Consistent with previous studies, we found that emotionally arousing pictures, regardless of their content, produce a larger late positive wave than neutral pictures. However, when pictures were further categorized by content, anterior ERP components in a time window between 200−600 ms following stimulus onset showed a high selectivity for pictures with erotic content compared to other pictures regardless of their emotional valence (pleasant, neutral, and unpleasant) or emotional arousal. The divergence of ERPs elicited by erotic and non-erotic contents started at 185 ms post-stimulus in the fronto-central midline regions, with a later onset in parietal regions. This rapid, selective, and content-specific processing of erotic materials and its dissociation from other pictures (including emotionally positive pictures) suggests the existence of a specialized neural network for prioritized processing of a distinct category of biologically relevant stimuli with high adaptive and evolutionary significance. PMID:16712815
Weak surround suppression of the attentional focus characterizes visual selection in the ventral stream in autism

Directory of Open Access Journals (Sweden)

Luca Ronconi

Full Text Available Neurophysiological findings in the typical population demonstrate that spatial scrutiny for visual selection determines a center-surround profile of the attentional focus, which is the result of recurrent processing in the visual system. Individuals with autism spectrum disorder (ASD) manifest several anomalies in their visual selection, with strengths in detail-oriented tasks, but also difficulties in distractor inhibition tasks. Here, we asked whether contradictory aspects of perception in ASD might be due to a different center-surround profile of their attentional focus. In two experiments, we tested two independent samples of children with ASD, comparing them with typically developing (TD) peers. In Experiment 1, we used a psychophysical task that mapped the entire spatial profile of the attentional focus. In Experiment 2, we used dense-array electroencephalography (EEG) to explore its neurophysiological underpinnings. Experiment 1 results showed that the suppression, surrounding the attentional focus, was markedly reduced in children with ASD. Experiment 2 showed that the center-surround profile in TD children resulted in a modulation of the posterior N2 ERP component, with cortical sources in the lateral-occipital and medial/inferior temporal areas. In contrast, children with ASD did not show modulation of the N2 and related activations in the ventral visual stream. Furthermore, behavioural and neurophysiological measures of weaker suppression predicted more severe autistic symptomatology. The present findings, showing an altered center-surround profile during attentional selection, give an important insight to understand superior visual processing in autism as well as the experiencing of sensory overload. Keywords: EEG, Source analysis, Ventral visual stream, Perception, Rehabilitation
Online Class Review: Using Streaming-Media Technology

Science.gov (United States)

Loudon, Marc; Sharp, Mark

2006-01-01

We present an automated system that allows students to replay both audio and video from a large nonmajors' organic chemistry class as streaming RealMedia. Once established, this system requires no technical intervention and is virtually transparent to the instructor. This gives students access to online class review at any time. Assessment has…
Audio feature extraction using probability distribution function

Science.gov (United States)

Suhaib, A.; Wan, Khairunizam; Aziz, Azri A.; Hazry, D.; Razlan, Zuradzman M.; Shahriman A., B.

2015-05-01

Voice recognition has been one of the popular applications in the robotics field. It has also recently been used in biometric and multimedia information retrieval systems. This technology builds on successive research into audio feature extraction. The Probability Distribution Function (PDF) is a statistical method that is usually used as one of the steps in complex feature extraction methods such as GMM and PCA. In this paper, a new method for audio feature extraction is proposed that uses the PDF alone as the feature extraction method for speech analysis. Certain pre-processing techniques are performed prior to the proposed feature extraction method. Subsequently, the PDF values for each frame of sampled voice signals obtained from a number of individuals are plotted. From the experimental results, it can be seen visually from the plotted data that each individual's voice has comparable PDF values and shapes.
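One simple way to read "PDF values for each frame" is as a normalized amplitude histogram computed per frame. The sketch below illustrates that reading only; the abstract does not specify the paper's pre-processing, bin count, or amplitude range, so `frame_len`, `hop`, `n_bins`, `lo` and `hi` are all assumed parameters.

```python
def frame_signal(samples, frame_len, hop):
    """Split a sample sequence into fixed-length, possibly overlapping frames."""
    return [
        samples[i:i + frame_len]
        for i in range(0, len(samples) - frame_len + 1, hop)
    ]


def pdf_features(frame, n_bins=8, lo=-1.0, hi=1.0):
    """Histogram estimate of the amplitude distribution of one frame.

    Returns the fraction of samples falling in each of `n_bins` equal-width
    bins over [lo, hi]; out-of-range samples are clamped to the edge bins.
    """
    if not frame:
        raise ValueError("frame must contain at least one sample")
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for s in frame:
        idx = int((s - lo) / width)
        idx = min(max(idx, 0), n_bins - 1)
        counts[idx] += 1
    return [c / len(frame) for c in counts]


def extract_features(samples, frame_len=256, hop=128, n_bins=8):
    """One feature vector (a list of bin probabilities) per frame."""
    return [pdf_features(f, n_bins) for f in frame_signal(samples, frame_len, hop)]
```

Each returned vector sums to 1, so vectors from different speakers can be plotted or compared directly regardless of frame length. Dividing each entry by the bin width would turn the bin probabilities into density values.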
Do the Contents of Visual Working Memory Automatically Influence Attentional Selection During Visual Search?

OpenAIRE

Woodman, Geoffrey F.; Luck, Steven J.

2007-01-01

In many theories of cognition, researchers propose that working memory and perception operate interactively. For example, in previous studies researchers have suggested that sensory inputs matching the contents of working memory will have an automatic advantage in the competition for processing resources. The authors tested this hypothesis by requiring observers to perform a visual search task while concurrently maintaining object representations in visual working memory. The hypothesis that ...
Cambridge English First 2 audio CDs : authentic examination papers

CERN Document Server

2016-01-01

Four authentic Cambridge English Language Assessment examination papers for the Cambridge English: First (FCE) exam. These examination papers for the Cambridge English: First (FCE) exam provide the most authentic exam preparation available, allowing candidates to familiarise themselves with the content and format of the exam and to practise useful exam techniques. The Audio CDs contain the recorded material to allow thorough preparation for the Listening paper and are designed to be used with the Student's Book. A Student's Book with or without answers and a Student's Book with answers and downloadable Audio are available separately. These tests are also available as Cambridge English: First Tests 5-8 on Testbank.org.uk
Application of MPEG-7 descriptors for content-based indexing of sports videos

Science.gov (United States)

Hoeynck, Michael; Auweiler, Thorsten; Ohm, Jens-Rainer

2003-06-01

The amount of multimedia data available worldwide is increasing every day. There is a vital need to annotate multimedia data in order to allow universal content access and to provide content-based search-and-retrieval functionalities. Since supervised video annotation can be time consuming, an automatic solution is desirable. We review recent approaches to content-based indexing and annotation of videos for different kinds of sports, and present our application for the automatic annotation of equestrian sports videos. In doing so, we concentrate especially on MPEG-7 based feature extraction and content description. We apply different visual descriptors for cut detection. Further, we extract the temporal positions of single obstacles on the course by analyzing MPEG-7 edge information and taking specific domain knowledge into account. Having determined single shot positions as well as the visual highlights, we store this information together with additional textual information in an MPEG-7 description scheme. Using this information, we generate content summaries which can be utilized in a user front-end to provide content-based access to the video stream, as well as further content-based queries and navigation on a video-on-demand streaming server.
Information matching the content of visual working memory is prioritized for conscious access.

Science.gov (United States)

Gayet, Surya; Paffen, Chris L E; Van der Stigchel, Stefan

2013-12-01

Visual working memory (VWM) is used to retain relevant information for imminent goal-directed behavior. In the experiments reported here, we found that VWM helps to prioritize relevant information that is not yet available for conscious experience. In five experiments, we demonstrated that information matching VWM content reaches visual awareness faster than does information not matching VWM content. Our findings suggest a functional link between VWM and visual awareness: The content of VWM is recruited to funnel down the vast amount of sensory input to that which is relevant for subsequent behavior and therefore requires conscious access.

Digital signal processor for silicon audio playback devices; Silicon audio saisei kikiyo digital signal processor

Energy Technology Data Exchange (ETDEWEB)

NONE

2000-03-01

The TC9446F series of digital audio signal processors (DSPs) has been developed for silicon audio playback devices that use a memory medium such as flash memory, for DVD players, and for AV devices such as TV sets. It supports AAC (advanced audio coding) (2ch) and MP3 (MPEG1 Layer3), the audio compression techniques used for transmitting music over the Internet. It also supports compressed formats such as Dolby Digital, DTS (digital theater system) and MPEG2 audio, which are adopted for DVDs and similar media. It can carry built-in audio signal processing programs such as Dolby ProLogic, an equalizer, sound field control, and 3D sound. The TC9446XB has been newly added to the lineup; it adopts an FBGA (fine pitch ball grid array) package for portable audio devices. (translated by NEDO)
The Success of Free to Play Games and Possibilities of Audio Monetization

OpenAIRE

Hahl, Kalle

2014-01-01

Video games are a huge business – nearly four times greater than film and music business combined. Free to play is the fastest growing category in video gaming. Game audio is part of the development of every game, so the growth of the gaming industry correlates directly with the growth of the game audio industry. Games have inherently different goals for the players and the developers. Players are consumers seeking entertainment. Developers are content producers trying to moneti...
Using online handwriting and audio streams for mathematical expressions recognition: a bimodal approach

Science.gov (United States)

Medjkoune, Sofiane; Mouchère, Harold; Petitrenaud, Simon; Viard-Gaudin, Christian

2013-01-01

The work reported in this paper concerns the problem of mathematical expressions recognition. This task is known to be a very hard one. We propose to alleviate the difficulties by taking into account two complementary modalities. The modalities referred to are handwriting and audio ones. To combine the signals coming from both modalities, various fusion methods are explored. Performances evaluated on the HAMEX dataset show a significant improvement compared to a single modality (handwriting) based system.
Publicación de materiales audiovisuales a través de un servidor de video-streaming Publication of audio-visual materials through a streaming video server

OpenAIRE

Acevedo Clavijo Edwin Jovanny; Parra Toloza Dina Julieth; Winlker Hernadez Walter

2010-01-01

This proposal aims to study several streaming-server alternatives in order to determine the best tool for publishing educational audio-visual material. The most widely used platforms were evaluated, taking into account the features and benefits of each server, among them Helix Universal Server, Microsoft Windows Media Server, Peer Cast and Darwin Server. Implementing a server with greater capabilities and benefits for...
Creating Accessible Science Museums with User-Activated Environmental Audio Beacons (Ping!)

Science.gov (United States)

Landau, Steven; Wiener, William; Naghshineh, Koorosh; Giusti, Ellen

2005-01-01

In 2003, Touch Graphics Company carried out research on a new invention that promises to improve accessibility to science museums for visitors who are visually impaired. The system, nicknamed Ping!, allows users to navigate an exhibit area, listen to audio descriptions, and interact with exhibits using a cell phone-based interface. The system…
Low-cost synchronization of high-speed audio and video recordings in bio-acoustic experiments.

Science.gov (United States)

Laurijssen, Dennis; Verreycken, Erik; Geipel, Inga; Daems, Walter; Peremans, Herbert; Steckel, Jan

2018-02-27

In this paper, we present a method for synchronizing high-speed audio and video recordings of bio-acoustic experiments. By embedding a random signal into the recorded video and audio data, robust synchronization of a diverse set of sensor streams can be performed without the need to keep detailed records. The synchronization can be performed using recording devices without dedicated synchronization inputs. We demonstrate the efficacy of the approach in two sets of experiments: behavioral experiments on different species of echolocating bats and the recordings of field crickets. We present the general operating principle of the synchronization method, discuss its synchronization strength and provide insights into how to construct such a device using off-the-shelf components. © 2018. Published by The Company of Biologists Ltd.
Capturing lived experiences in movement educational contexts through videographic participation and visual narratives

DEFF Research Database (Denmark)

Svendler Nielsen, Charlotte; Degerbøl, Stine Mikés

visualizing and communicating the meaning-making of the participants and emphasizes the role of the researcher’s embodied involvement when ‘looking for lived experiences’. The paper exemplifies the use of videographic participation and presents (audio)visual narratives from two educational contexts: children...... of how meaning-making of the participants can be captured and disseminated through (audio)visual narratives....
How to understand Film, Video, Television

DEFF Research Database (Denmark)

Juel, Henrik

2017-01-01

Theory on how camera work, cuts, and audio-visual montage (horizontal and vertical) define content and impact of media with moving images...
Categorizing Video Game Audio

DEFF Research Database (Denmark)

Westerberg, Andreas Rytter; Schoenau-Fog, Henrik

2015-01-01

This paper dives into the subject of video game audio and how it can be categorized in order to deliver a message to a player in the most precise way. A new categorization, with a new take on the diegetic spaces, can be used as a tool of inspiration for sound- and game-designers to rethink how they can use audio in video games. The conclusion of this study is that the current models' view of the diegetic spaces, used to categorize video game audio, is not fit to categorize all sounds. This can however possibly be changed through a rethinking of how the player interprets audio.
Analysis, Retrieval and Delivery of Multimedia Content

CERN Document Server

Cavallaro, Andrea; Leonardi, Riccardo; Migliorati, Pierangelo

2013-01-01

Covering some of the most cutting-edge research on the delivery and retrieval of interactive multimedia content, this volume of specially chosen contributions provides the most updated perspective on one of the hottest contemporary topics. The material represents extended versions of papers presented at the 11th International Workshop on Image Analysis for Multimedia Interactive Services, a vital international forum on this fast-moving field. Logically organized in discrete sections that approach the subject from its various angles, the content deals in turn with content analysis, motion and activity analysis, high-level descriptors and video retrieval, 3-D and multi-view, and multimedia delivery. The chapters cover the finest detail of emerging techniques such as the use of high-level audio information in improving scene segmentation and the use of subjective logic for forensic visual surveillance. On content delivery, the book examines both images and video, focusing on key subjects including an efficient p...
Interactive volume exploration of petascale microscopy data streams using a visualization-driven virtual memory approach

KAUST Repository

Hadwiger, Markus; Beyer, Johanna; Jeong, Wonki; Pfister, Hanspeter

2012-01-01

This paper presents the first volume visualization system that scales to petascale volumes imaged as a continuous stream of high-resolution electron microscopy images. Our architecture scales to dense, anisotropic petascale volumes because it: (1) decouples construction of the 3D multi-resolution representation required for visualization from data acquisition, and (2) decouples sample access time during ray-casting from the size of the multi-resolution hierarchy. Our system is designed around a scalable multi-resolution virtual memory architecture that handles missing data naturally, does not pre-compute any 3D multi-resolution representation such as an octree, and can accept a constant stream of 2D image tiles from the microscopes. A novelty of our system design is that it is visualization-driven: we restrict most computations to the visible volume data. Leveraging the virtual memory architecture, missing data are detected during volume ray-casting as cache misses, which are propagated backwards for on-demand out-of-core processing. 3D blocks of volume data are only constructed from 2D microscope image tiles when they have actually been accessed during ray-casting. We extensively evaluate our system design choices with respect to scalability and performance, compare to previous best-of-breed systems, and illustrate the effectiveness of our system for real microscopy data from neuroscience. © 1995-2012 IEEE.
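The visualization-driven idea described above (blocks are built only after the ray-caster has actually asked for them and missed) can be sketched as a small cache with deferred construction. This is an illustrative toy, not the system's architecture: the class name, the `build_block` callback standing in for assembling a 3D block from 2D tiles, and the least-recently-used eviction policy are all assumptions.

```python
from collections import OrderedDict


class BlockCache:
    """Toy cache of 3D blocks that are constructed only after a miss."""

    def __init__(self, capacity, build_block):
        self.capacity = capacity          # maximum number of resident blocks
        self.build_block = build_block    # hypothetical: builds a block from 2D tiles
        self.blocks = OrderedDict()       # block_id -> block, oldest first
        self.misses = []                  # block ids requested but not resident

    def sample(self, block_id):
        """Called during (hypothetical) ray-casting; returns None on a miss."""
        if block_id in self.blocks:
            self.blocks.move_to_end(block_id)  # mark as recently used
            return self.blocks[block_id]
        self.misses.append(block_id)           # report the miss, do not build yet
        return None

    def resolve_misses(self):
        """Build every block that was missed since the last call."""
        for block_id in dict.fromkeys(self.misses):  # de-duplicate, keep order
            self.blocks[block_id] = self.build_block(block_id)
            self.blocks.move_to_end(block_id)
            while len(self.blocks) > self.capacity:
                self.blocks.popitem(last=False)      # evict least recently used
        self.misses.clear()
```

In a render loop, a frame would call `sample` for every block its rays touch, then `resolve_misses` once, so that blocks nobody looks at are never constructed and the work scales with the visible data rather than with the full volume.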
Interactive volume exploration of petascale microscopy data streams using a visualization-driven virtual memory approach

KAUST Repository

Hadwiger, Markus

2012-12-01

This paper presents the first volume visualization system that scales to petascale volumes imaged as a continuous stream of high-resolution electron microscopy images. Our architecture scales to dense, anisotropic petascale volumes because it: (1) decouples construction of the 3D multi-resolution representation required for visualization from data acquisition, and (2) decouples sample access time during ray-casting from the size of the multi-resolution hierarchy. Our system is designed around a scalable multi-resolution virtual memory architecture that handles missing data naturally, does not pre-compute any 3D multi-resolution representation such as an octree, and can accept a constant stream of 2D image tiles from the microscopes. A novelty of our system design is that it is visualization-driven: we restrict most computations to the visible volume data. Leveraging the virtual memory architecture, missing data are detected during volume ray-casting as cache misses, which are propagated backwards for on-demand out-of-core processing. 3D blocks of volume data are only constructed from 2D microscope image tiles when they have actually been accessed during ray-casting. We extensively evaluate our system design choices with respect to scalability and performance, compare to previous best-of-breed systems, and illustrate the effectiveness of our system for real microscopy data from neuroscience. © 1995-2012 IEEE.
Audio-visual perception of 3D cinematography: an fMRI study using condition-based and computation-based analyses.

Directory of Open Access Journals (Sweden)

Akitoshi Ogawa

Full Text Available The use of naturalistic stimuli to probe sensory functions in the human brain is gaining increasing interest. Previous imaging studies examined brain activity associated with the processing of cinematographic material using both standard "condition-based" designs, as well as "computational" methods based on the extraction of time-varying features of the stimuli (e.g. motion). Here, we exploited both approaches to investigate the neural correlates of complex visual and auditory spatial signals in cinematography. In the first experiment, the participants watched a piece of a commercial movie presented in four blocked conditions: 3D vision with surround sounds (3D-Surround), 3D with monaural sound (3D-Mono), 2D-Surround, and 2D-Mono. In the second experiment, they watched two different segments of the movie both presented continuously in 3D-Surround. The blocked presentation served for standard condition-based analyses, while all datasets were submitted to computation-based analyses. The latter assessed where activity co-varied with visual disparity signals and the complexity of auditory multi-sources signals. The blocked analyses associated 3D viewing with the activation of the dorsal and lateral occipital cortex and superior parietal lobule, while the surround sounds activated the superior and middle temporal gyri (S/MTG). The computation-based analyses revealed the effects of absolute disparity in dorsal occipital and posterior parietal cortices and of disparity gradients in the posterior middle temporal gyrus plus the inferior frontal gyrus. The complexity of the surround sounds was associated with activity in specific sub-regions of S/MTG, even after accounting for changes of sound intensity. These results demonstrate that the processing of naturalistic audio-visual signals entails an extensive set of visual and auditory areas, and that computation-based analyses can track the contribution of complex spatial aspects characterizing such life
Audio-visual perception of 3D cinematography: an fMRI study using condition-based and computation-based analyses.

Science.gov (United States)

Ogawa, Akitoshi; Bordier, Cecile; Macaluso, Emiliano

2013-01-01

The use of naturalistic stimuli to probe sensory functions in the human brain is gaining increasing interest. Previous imaging studies examined brain activity associated with the processing of cinematographic material using both standard "condition-based" designs, as well as "computational" methods based on the extraction of time-varying features of the stimuli (e.g. motion). Here, we exploited both approaches to investigate the neural correlates of complex visual and auditory spatial signals in cinematography. In the first experiment, the participants watched a piece of a commercial movie presented in four blocked conditions: 3D vision with surround sounds (3D-Surround), 3D with monaural sound (3D-Mono), 2D-Surround, and 2D-Mono. In the second experiment, they watched two different segments of the movie both presented continuously in 3D-Surround. The blocked presentation served for standard condition-based analyses, while all datasets were submitted to computation-based analyses. The latter assessed where activity co-varied with visual disparity signals and the complexity of auditory multi-sources signals. The blocked analyses associated 3D viewing with the activation of the dorsal and lateral occipital cortex and superior parietal lobule, while the surround sounds activated the superior and middle temporal gyri (S/MTG). The computation-based analyses revealed the effects of absolute disparity in dorsal occipital and posterior parietal cortices and of disparity gradients in the posterior middle temporal gyrus plus the inferior frontal gyrus. The complexity of the surround sounds was associated with activity in specific sub-regions of S/MTG, even after accounting for changes of sound intensity. These results demonstrate that the processing of naturalistic audio-visual signals entails an extensive set of visual and auditory areas, and that computation-based analyses can track the contribution of complex spatial aspects characterizing such life-like stimuli.
Warmth and competence in your face! Visual encoding of stereotype content

NARCIS (Netherlands)

Imhoff, R.; Woelki, J.; Hanke, S.; Dotsch, R.

2013-01-01

Previous research suggests that stereotypes about a group's warmth bias our visual representation of group members. Based on the stereotype content model (SCM) the current research explored whether the second big dimension of social perception, competence, is also reflected in visual stereotypes. To
Cultural diversity in the digital age: EU competences, policies and regulations for diverse audio-visual and online content

NARCIS (Netherlands)

Irion, K.; Valcke, P.; Psychogiopoulou, E.

2015-01-01

Cultural diversity is a multifaceted concept that differs from the notion of media pluralism. However, the two concepts share important concerns particularly as regards content production, content distribution and access to content. This chapter considers the EU’s role in contributing to diverse
High-Fidelity Piezoelectric Audio Device

Science.gov (United States)

Woodward, Stanley E.; Fox, Robert L.; Bryant, Robert G.

2003-01-01

ModalMax is a very innovative means of harnessing the vibration of a piezoelectric actuator to produce an energy efficient low-profile device with high-bandwidth high-fidelity audio response. The piezoelectric audio device outperforms many commercially available speakers made using speaker cones. The piezoelectric device weighs substantially less (4 g) than the speaker cones which use magnets (10 g). ModalMax devices have extreme fabrication simplicity. The entire audio device is fabricated by lamination. The simplicity of the design lends itself to lower cost. The piezoelectric audio device can be used without its acoustic chambers, resulting in a very low thickness of 0.023 in. (0.58 mm). The piezoelectric audio device can be completely encapsulated, which makes it very attractive for use in wet environments. Encapsulation does not significantly alter the audio response. Its small size (see Figure 1) is applicable to many consumer electronic products, such as pagers, portable radios, headphones, laptop computers, computer monitors, toys, and electronic games. The audio device can also be used in automobile or aircraft sound systems.
Short-term memory for scenes with affective content

OpenAIRE

Maljkovic, Vera; Martini, Paolo

2005-01-01

The emotional content of visual images can be parameterized along two dimensions: valence (pleasantness) and arousal (intensity of emotion). In this study we ask how these distinct emotional dimensions affect the short-term memory of human observers viewing a rapid stream of images and trying to remember their content. We show that valence and arousal modulate short-term memory as independent factors. Arousal influences dramatically the average speed of data accumulation in memory: Higher aro...
Multisensory and Modality Specific Processing of Visual Speech in Different Regions of the Premotor Cortex

Directory of Open Access Journals (Sweden)

Daniel eCallan

2014-05-01

Full Text Available Behavioral and neuroimaging studies have demonstrated that brain regions involved with speech production also support speech perception, especially under degraded conditions. The premotor cortex has been shown to be active during both observation and execution of action (‘Mirror System’ properties), and may facilitate speech perception by mapping unimodal and multimodal sensory features onto articulatory speech gestures. For this functional magnetic resonance imaging (fMRI) study, participants identified vowels produced by a speaker in audio-visual (saw the speaker’s articulating face and heard her voice), visual only (only saw the speaker’s articulating face), and audio only (only heard the speaker’s voice) conditions with varying audio signal-to-noise ratios in order to determine the regions of the premotor cortex involved with multisensory and modality specific processing of visual speech gestures. The task was designed so that identification could be made with a high level of accuracy from visual only stimuli to control for task difficulty and differences in intelligibility. The results of the fMRI analysis for visual only and audio-visual conditions showed overlapping activity in inferior frontal gyrus and premotor cortex. The left ventral inferior premotor cortex showed properties of multimodal (audio-visual) enhancement with a degraded auditory signal. The left inferior parietal lobule and right cerebellum also showed these properties. The left ventral superior and dorsal premotor cortex did not show this multisensory enhancement effect, but there was greater activity for the visual only over audio-visual conditions in these areas. The results suggest that the inferior regions of the ventral premotor cortex are involved with integrating multisensory information, whereas, more superior and dorsal regions of the premotor cortex are involved with mapping unimodal (in this case visual) sensory features of the speech signal with
Continuity-Aware Scheduling Algorithm for Scalable Video Streaming

Directory of Open Access Journals (Sweden)

Atinat Palawan

2016-05-01

Full Text Available The consumer demand for retrieving and delivering visual content through consumer electronic devices has increased rapidly in recent years. The quality of video in packet networks is susceptible to certain traffic characteristics: average bandwidth availability, loss, delay and delay variation (jitter). This paper presents a scheduling algorithm that modifies the stream of scalable video to combat jitter. The algorithm provides unequal look-ahead by safeguarding the base layer (without the need for overhead) of the scalable video. The results of the experiments show that our scheduling algorithm reduces the number of frames with a violated deadline and significantly improves the continuity of the video stream without compromising the average Y Peak Signal-to-Noise Ratio (PSNR).
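The two quantities this abstract reports, luma PSNR and the number of frames that miss their deadline, can be computed as in the following minimal sketch. It uses the standard definition PSNR = 10 log10(peak^2 / MSE) with peak = 255 for 8-bit samples. The function names `y_psnr` and `late_frames` and the sample values are illustrative assumptions, not taken from the paper, and the sketch does not reproduce the scheduling algorithm itself.

```python
import math
from typing import Sequence


def y_psnr(reference: Sequence[int], decoded: Sequence[int], peak: int = 255) -> float:
    """PSNR in dB between two luma (Y) planes given as flat lists of samples."""
    if len(reference) != len(decoded) or not reference:
        raise ValueError("planes must be non-empty and the same size")
    mse = sum((r - d) ** 2 for r, d in zip(reference, decoded)) / len(reference)
    if mse == 0:
        return math.inf                      # identical planes
    return 10.0 * math.log10(peak ** 2 / mse)


def late_frames(arrival_ms: Sequence[float], deadline_ms: Sequence[float]) -> int:
    """Count frames that arrive after their playout deadline."""
    return sum(1 for a, d in zip(arrival_ms, deadline_ms) if a > d)


# Illustrative values only: every decoded sample is off by one, so MSE = 1.
ref = [100, 120, 140, 160]
dec = [101, 119, 141, 159]
psnr_db = y_psnr(ref, dec)

# The second frame arrives at 85 ms against an 80 ms deadline.
missed = late_frames([10.0, 85.0, 90.0], [40.0, 80.0, 120.0])
```

Averaging `y_psnr` over all frames of a sequence gives the "average Y PSNR" figure, while `late_frames` gives a simple continuity indicator.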

STREAMTO: Streaming Content using a Tamper-Resistant Token

NARCIS (Netherlands)

Cheng, Jieyin; Chong, C.N.; Doumen, J.M.; Etalle, Sandro; Hartel, Pieter H.; Nikolaus, Stefan

2004-01-01

StreamTo uses tamper resistant hardware tokens to generate the key stream needed to decrypt encrypted streaming music. The combination of a hardware token and streaming media effectively brings tried and tested PayTV technology to the Internet. We provide a security analysis and present two prototype
Introduction to audio analysis a MATLAB approach

CERN Document Server

Giannakopoulos, Theodoros

2014-01-01

Introduction to Audio Analysis serves as a standalone introduction to audio analysis, providing theoretical background to many state-of-the-art techniques. It covers the essential theory necessary to develop audio engineering applications, but also uses programming techniques, notably MATLAB®, to take a more applied approach to the topic. Basic theory and reproducible experiments are combined to demonstrate theoretical concepts from a practical point of view and provide a solid foundation in the field of audio analysis. Audio feature extraction, audio classification, audio segmentation, au
Relative Effectiveness of Audio Tools for Fighter Pilots in Simulated Operational Flights: A Human Factors Approach

National Research Council Canada - National Science Library

Hourlier, Sylvain; Meehan, James; Leger, Alain; Roumes, Corinne

2005-01-01

.... Increasing use of audio has been suggested as a means to reduce visual workload, to enhance situation awareness, and mitigate the manual and cognitive demands of HOTAS and existing command-and-display concepts...
The value of filmed interviews: issues of visualization, visual transcripts and the reading of visual texts

NARCIS (Netherlands)

Witteveen, L.; Lie, R.

2013-01-01

The increased access to video technology advances the use of visual methodologies in research. Following these developments, researchers are confronted with new challenges. Video recordings of interviews, as compared to audio recordings, are gaining interest in qualitative field research in the
Investigating emotional top down modulation of ambiguous faces by single pulse TMS on early visual cortices

Directory of Open Access Journals (Sweden)

Zachary Adam Yaple

2016-06-01

Full Text Available Top-down processing is a mechanism in which memory, context and expectation are used to perceive stimuli. For this study we investigated how emotion content, induced by music mood, influences perception of happy and sad emoticons. Using single pulse TMS we stimulated right occipital face area (rOFA), primary visual cortex (V1) and vertex while subjects performed a face-detection task and listened to happy and sad music. At baseline, incongruent audio-visual pairings decreased performance, demonstrating dependence of emotion while perceiving ambiguous faces. However, performance of face identification decreased during rOFA stimulation regardless of emotional content. No effects were found between Cz and V1 stimulation. These results suggest that while rOFA is important for processing faces regardless of emotion, V1 stimulation had no effect. Our findings suggest that early visual cortex activity may not integrate emotional auditory information with visual information during emotion top-down modulation of faces.
Roundtable Audio Discussion

Directory of Open Access Journals (Sweden)

Chris Bigum

2007-01-01

Full Text Available RoundTable on Technology, Teaching and Tools. This is a roundtable audio interview conducted by James Farmer, founder of Edublogs, with Anne Bartlett-Bragg (University of Technology Sydney) and Chris Bigum (Deakin University). Skype was used to make and record the audio conference and the resulting sound file was edited by Andrew McLauchlan.
Coexistence issues for a 2.4 GHz wireless audio streaming in presence of bluetooth paging and WLAN

Science.gov (United States)

Pfeiffer, F.; Rashwan, M.; Biebl, E.; Napholz, B.

2015-11-01

Nowadays, customers expect to integrate their mobile electronic devices (smartphones and laptops) in a vehicle to form a wireless network. Typically, IEEE 802.11 is used to provide a high-speed wireless local area network (WLAN) and Bluetooth is used for cable replacement applications in a wireless personal area network (PAN). In addition, Daimler uses KLEER as a third wireless technology in the unlicensed (UL) 2.4 GHz ISM band to transmit full CD-quality digital audio. As Bluetooth, IEEE 802.11 and KLEER operate in the same frequency band, it has to be ensured that all three technologies can be used simultaneously without interference. In this paper, we focus on the impact of Bluetooth and IEEE 802.11 as interferers in the presence of a KLEER audio transmission.
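To make the shared-band problem concrete, the sketch below counts how many Bluetooth hop channels fall inside one WLAN channel. It relies on the standard channel plans: Bluetooth BR/EDR uses 79 channels centred at 2402 + k MHz, and 2.4 GHz WLAN channel n (1 to 13) is centred at 2407 + 5n MHz. The 22 MHz occupied width is an assumption that fits DSSS-style 802.11 channels, and the function names are illustrative. KLEER is deliberately left out because the abstract gives no channel plan for it.

```python
from typing import List

BT_CHANNELS = 79          # Bluetooth BR/EDR hop channels, 1 MHz spacing


def bt_center_mhz(k: int) -> int:
    """Centre frequency of Bluetooth BR/EDR channel k (0..78)."""
    return 2402 + k


def wlan_center_mhz(channel: int) -> int:
    """Centre frequency of 2.4 GHz WLAN channel 1..13."""
    if not 1 <= channel <= 13:
        raise ValueError("this sketch covers WLAN channels 1 to 13 only")
    return 2407 + 5 * channel


def overlapping_bt_channels(wlan_channel: int, wlan_width_mhz: float = 22.0) -> List[int]:
    """Bluetooth channels whose centre lies inside the WLAN channel.

    wlan_width_mhz is an assumed occupied bandwidth, not a measured value.
    """
    center = wlan_center_mhz(wlan_channel)
    half = wlan_width_mhz / 2.0
    return [k for k in range(BT_CHANNELS) if abs(bt_center_mhz(k) - center) <= half]


hit = overlapping_bt_channels(6)          # WLAN channel 6 is centred at 2437 MHz
fraction = len(hit) / BT_CHANNELS         # share of hop channels inside it
```

Under these assumptions a single WLAN channel covers 23 of the 79 Bluetooth hop channels, which is why simultaneous operation in the band needs to be checked rather than assumed.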
Akamai Streaming

OpenAIRE

ECT Team, Purdue

2007-01-01

Akamai offers world-class streaming media services that enable Internet content providers and enterprises to succeed in today's Web-centric marketplace. They deliver live event Webcasts (complete with video production, encoding, and signal acquisition services), streaming media on demand, 24/7 Webcasts and a variety of streaming application services based upon their EdgeAdvantage.
Control and Innovation on Digital Platforms : the case of Netflix and streaming of video content

OpenAIRE

Vigeland, Eirik

2012-01-01

In this thesis I investigate innovation processes on innovation platforms, and look at the role played by content release for innovation in digital distribution of home entertainment. I argue that innovation platforms rely on several aspects of innovation in order to succeed, and this thesis is concerned with one of these, namely release of digital entertainment content. I use the American video streaming service Netflix as a case and example of such an innovation platform. By using techno...
Claroscura Representation: An Audio-visual and Theoretical Exploration of the Representation of the Past Through Documentary Filmmaking

Directory of Open Access Journals (Sweden)

Gerrit Stollbrock Trujillo

2017-09-01

Full Text Available At the nexus between audio-visual production and theoretical research, this article is based on the experience of producing a documentary on the history of a cement plant in Colombia: La Siberia. The tensions between the narratives constructed in the documentary and the immensity of the discarded archives from the plant drive a theoretical quest to respond to its own iconoclast and the post-structuralist critique of history. This brought us to the formulation of the concept of claroscura representation, defined as representation that is transparent about its own limitations. I put this concept to the test through the medium of documentary film, talking specifically about the making of La Siberia, and suggest its relevance in other projects that attempt to represent the past or history through film. I suggest that this theory drives us towards the formulation of a new artistic project. The research process, and the dialogue between theory and practice, is interpreted using the model of abduction proposed by Charles Sanders Peirce.
Multisensory guidance of goal-oriented behaviour of legged robots

DEFF Research Database (Denmark)

Shaikh, Danish; Manoonpong, Poramate; Tuxworth, Gervase

2017-01-01

Biological systems often combine cues from two different sensory modalities to execute goal-oriented sensorimotor tasks, which otherwise cannot be accurately executed with either sensory stream in isolation. When auditory cues alone are not sufficient to accurately localise an audio-visual target...... is tasked with localising an audio-visual target by turning towards it. The architecture extracts sound direction information with a model of the peripheral auditory system of lizards to modulate locomotion control parameters driving the turning behaviour. The visual information adaptively changes...... the strength of the acoustomotor coupling to adjust turning speed of the robot. Our experiments demonstrate improved orientation towards the audio-visual target emitting a tone of frequency 2.2kHz located at an angular offset of 45 degrees from the robot....
Visual search, visual streams, and visual architectures.

Science.gov (United States)

Green, M

1991-10-01

Most psychological, physiological, and computational models of early vision suggest that retinal information is divided into a parallel set of feature modules. The dominant theories of visual search assume that these modules form a "blackboard" architecture: a set of independent representations that communicate only through a central processor. A review of research shows that blackboard-based theories, such as feature-integration theory, cannot easily explain the existing data. The experimental evidence is more consistent with a "network" architecture, which stresses that: (1) feature modules are directly connected to one another, (2) features and their locations are represented together, (3) feature detection and integration are not distinct processing stages, and (4) no executive control process, such as focal attention, is needed to integrate features. Attention is not a spotlight that synthesizes objects from raw features. Instead, it is better to conceptualize attention as an aperture which masks irrelevant visual information.
Visual Analytics in Public Safety: Example Capabilities for Example Government Agencies

Science.gov (United States)

2011-10-01

called "visual analytics", which combines and deepens the fields of data visualization and computational analytics...Data Sources Data Analysis Analytic Reasoning Information Sharing Data types • Transaction • Image • Video • Text • Audio • Spatial...Broadcast Monitoring System creates a continuous, searchable, one-year archive of international television broadcasts. The real-time audio stream is
Low Latency Audio Video: Potentials for Collaborative Music Making through Distance Learning

Science.gov (United States)

Riley, Holly; MacLeod, Rebecca B.; Libera, Matthew

2016-01-01

The primary purpose of this study was to examine the potential of LOw LAtency (LOLA), a low latency audio visual technology designed to allow simultaneous music performance, as a distance learning tool for musical styles in which synchronous playing is an integral aspect of the learning process (e.g., jazz, folk styles). The secondary purpose was…
Data-Proximate Analysis and Visualization in the Cloud using Cloudstream, an Open-Source Application Streaming Technology Stack

Science.gov (United States)

Fisher, W. I.

2017-12-01

The rise in cloud computing, coupled with the growth of "Big Data", has led to a migration away from local scientific data storage. The increasing size of remote scientific data sets, however, makes it difficult for scientists to subject them to large-scale analysis and visualization. These large datasets can take an inordinate amount of time to download; subsetting is a potential solution, but subsetting services are not yet ubiquitous. Data providers may also pay steep prices, as many cloud providers meter data based on how much data leaves their cloud service. The solution to this problem is a deceptively simple one: move data analysis and visualization tools to the cloud, so that scientists may perform data-proximate analysis and visualization. This results in increased transfer speeds, while egress costs are lowered or completely eliminated. Moving standard desktop analysis and visualization tools to the cloud is enabled via a technique called "Application Streaming". This technology allows a program to run entirely on a remote virtual machine while still allowing for interactivity and dynamic visualizations. When coupled with containerization technology such as Docker, we are able to easily deploy legacy analysis and visualization software to the cloud whilst retaining access via a desktop, netbook, smartphone, or the next generation of hardware, whatever it may be. Unidata has created a Docker-based solution for easily adapting legacy software for Application Streaming. This technology stack, dubbed Cloudstream, allows desktop software to run in the cloud with little-to-no effort. The Docker container is configured by editing text files, and the legacy software does not need to be modified in any way. This work will discuss the underlying technologies used by Cloudstream, and outline how to use Cloudstream to run and access an existing desktop application in the cloud.
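The download-time and egress-cost argument can be made concrete with a back-of-the-envelope calculation. Every number below is a hypothetical placeholder (dataset size, link speed, price per gigabyte, and the volume of streamed application output), chosen only to show the arithmetic; none comes from the abstract or from any provider's price list.

```python
def transfer_hours(size_gb: float, link_mbps: float) -> float:
    """Hours needed to move size_gb (decimal gigabytes) over a link_mbps link."""
    if link_mbps <= 0:
        raise ValueError("link speed must be positive")
    bits = size_gb * 8e9
    return bits / (link_mbps * 1e6) / 3600.0


def egress_cost(size_gb: float, price_per_gb: float) -> float:
    """Metered cost of data leaving the cloud, in the same currency as the price."""
    return size_gb * price_per_gb


# Hypothetical scenario: a 1000 GB dataset versus 5 GB of streamed application output.
DATASET_GB = 1000.0
STREAMED_GB = 5.0
LINK_MBPS = 100.0
PRICE_PER_GB = 0.09       # assumed price, for illustration only

full_hours = transfer_hours(DATASET_GB, LINK_MBPS)       # download everything
streamed_hours = transfer_hours(STREAMED_GB, LINK_MBPS)  # stream the session only
full_cost = egress_cost(DATASET_GB, PRICE_PER_GB)
streamed_cost = egress_cost(STREAMED_GB, PRICE_PER_GB)
```

With these placeholder inputs the full download takes roughly 22 hours while the streamed session moves a small fraction of that volume, which is the trade-off the data-proximate approach is meant to exploit.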
Big Data Analytics: Challenges And Applications For Text, Audio, Video, And Social Media Data

OpenAIRE

Jai Prakash Verma; Smita Agrawal; Bankim Patel; Atul Patel

2016-01-01

All types of machine automated systems are generating large amounts of data in different forms, such as statistical, text, audio, video, sensor, and bio-metric data, which has given rise to the term Big Data. In this paper we discuss the issues, challenges, and applications of these types of Big Data with consideration of the big data dimensions. Here we are discussing social media data analytics, content based analytics, text data analytics, audio, and video data analytics their issues and expected applica...
Lateralized visual behavior in bottlenose dolphins (Tursiops truncatus) performing audio-visual tasks: the right visual field advantage.

Science.gov (United States)

Delfour, F; Marten, K

2006-01-10

Analyzing cerebral asymmetries in various species helps in understanding brain organization. The left and right sides of the brain (lateralization) are involved in different cognitive and sensory functions. This study focuses on dolphin visual lateralization as expressed by spontaneous eye preference when performing a complex cognitive task; we examine lateralization when processing different visual stimuli displayed on an underwater touch-screen (two-dimensional figures, three-dimensional figures and dolphin/human video sequences). Three female bottlenose dolphins (Tursiops truncatus) were submitted to a 2-, 3- or 4-choice visual/auditory discrimination problem, without any food reward: the subjects had to correctly match visual and acoustic stimuli together. In order to visualize and to touch the underwater target, the dolphins had to come close to the touch-screen and to position themselves using monocular vision (left or right eye) and/or binocular naso-ventral vision. The results showed an ability to associate simple visual forms and auditory information using an underwater touch-screen. Moreover, the subjects showed a spontaneous tendency to use monocular vision. Contrary to previous findings, our results did not clearly demonstrate right eye preference in spontaneous choice. However, the individuals' scores of correct answers were correlated with right eye vision, demonstrating the advantage of this visual field in visual information processing and suggesting a left hemispheric dominance. We also demonstrated that the nature of the presented visual stimulus does not seem to have any influence on the animals' monocular vision choice.
Multimodal integration in statistical learning

DEFF Research Database (Denmark)

Mitchell, Aaron; Christiansen, Morten Hyllekvist; Weiss, Dan

2014-01-01

, we investigated the ability of adults to integrate audio and visual input during statistical learning. We presented learners with a speech stream synchronized with a video of a speaker’s face. In the critical condition, the visual (e.g., /gi/) and auditory (e.g., /mi/) signals were occasionally...... facilitated participants’ ability to segment the speech stream. Our results therefore demonstrate that participants can integrate audio and visual input to perceive the McGurk illusion during statistical learning. We interpret our findings as support for modality-interactive accounts of statistical learning.......Recent advances in the field of statistical learning have established that learners are able to track regularities of multimodal stimuli, yet it is unknown whether the statistical computations are performed on integrated representations or on separate, unimodal representations. In the present study...
Visual impact in the digital press: a Spanish empirical research

Directory of Open Access Journals (Sweden)

Joan Francesc Fondevila Gascón

2010-12-01

Full Text Available The inclusion of visual resources (photography and video) in digital journalism is gaining importance in the multimedia area. The principal resources of the digital press are multimedia, hypertext and interactivity. Multimedia is in an initial process of evolution. The objective of this research is to observe empirically the use of visual resources by the digital pure player press. These media try to take advantage of the new multimedia possibilities in the development and presentation of the contents. We have analyzed empirically video and photography inclusion in the multimedia framework (text, photography, video, audio, infograph and animation programs) in four digital newspapers (Libertad Digital and El Plural, in Spanish, and Vilaweb.cat and e-Noticies, in Catalan) analyzed according to journalistic genres.
Could Audio-Described Films Benefit from Audio Introductions? An Audience Response Study

Science.gov (United States)

Romero-Fresco, Pablo; Fryer, Louise

2013-01-01

Introduction: Time constraints limit the quantity and type of information conveyed in audio description (AD) for films, in particular the cinematic aspects. Inspired by introductory notes for theatre AD, this study developed audio introductions (AIs) for "Slumdog Millionaire" and "Man on Wire." Each AI comprised 10 minutes of…

Semantic Labeling of Nonspeech Audio Clips

Directory of Open Access Journals (Sweden)

Xiaojuan Ma

2010-01-01

Full Text Available Human communication about entities and events is primarily linguistic in nature. While visual representations of information are shown to be highly effective as well, relatively little is known about the communicative power of auditory nonlinguistic representations. We created a collection of short nonlinguistic auditory clips encoding familiar human activities, objects, animals, natural phenomena, machinery, and social scenes. We presented these sounds to a broad spectrum of anonymous human workers using Amazon Mechanical Turk and collected verbal sound labels. We analyzed the human labels in terms of their lexical and semantic properties to ascertain that the audio clips do evoke the information suggested by their pre-defined captions. We then measured the agreement with the semantically compatible labels for each sound clip. Finally, we examined which kinds of entities and events, when captured by nonlinguistic acoustic clips, appear to be well-suited to elicit information for communication, and which ones are less discriminable. Our work is set against the broader goal of creating resources that facilitate communication for people with some types of language loss. Furthermore, our data should prove useful for future research in machine analysis/synthesis of audio, such as computational auditory scene analysis, and annotating/querying large collections of sound effects.
Relationship between the dough quality and content of specific glutenin proteins in wheat mill streams, and its application to making flour suitable for instant Chinese noodles.

Science.gov (United States)

Yahata, Eriko; Maruyama-Funatsuki, Wakako; Nishio, Zenta; Yamamoto, Yoshihiko; Hanaoka, Akihiro; Sugiyama, Hisashi; Tanida, Masatoshi; Saruyama, Haruo

2006-04-01

The content of specific proteins such as high-molecular-weight glutenin subunits HMW-GS 5+10 and low-molecular-weight glutenin subunits LMW-GS KS2 in wheat mill streams of extra-strong Kachikei 33 wheat was quantified by SDS-PAGE and 2D-PAGE. The mill streams showed varied quantities of HMW-GS 5+10 (0.077 to 2.007 mg/g of mill stream), LMW-GS KS2 (0.018 to 0.586 mg/g of mill stream) and total protein (9.42% to 18.98%). The contents of these specific proteins in the mill streams were significantly correlated with the SDS sedimentation volume and the mixing properties, which are respective indices of specific loaf volume and dough strength. The contents of these specific glutenin proteins in the mill streams were therefore found to be significantly important for improving the dough quality suitable for bread and Chinese noodles. Accordingly, we present here the application of this information to the development of an effective method for producing mill streams with high quality and yield that are suitable for instant Chinese noodles.
Location audio simplified capturing your audio and your audience

CERN Document Server

Miles, Dean

2014-01-01

From the basics of using camera, handheld, lavalier, and shotgun microphones to camera calibration and mixer set-ups, Location Audio Simplified unlocks the secrets to clean and clear broadcast quality audio no matter what challenges you face. Author Dean Miles applies his twenty-plus years of experience as a professional location operator to teach the skills, techniques, tips, and secrets needed to produce high-quality production sound on location. Humorous and thoroughly practical, the book covers a wide array of topics, such as:* location selection* field mixing* boo
Perceptual Coding of Audio Signals Using Adaptive Time-Frequency Transform

Directory of Open Access Journals (Sweden)

Umapathy Karthikeyan

2007-01-01

Full Text Available Wide band digital audio signals have a very high data-rate associated with them due to their complex nature and demand for high-quality reproduction. Although recent technological advancements have significantly reduced the cost of bandwidth and miniaturized storage facilities, the rapid increase in the volume of digital audio content constantly compels the need for better compression algorithms. Over the years various perceptually lossless compression techniques have been introduced, and transform-based compression techniques have made a significant impact in recent years. In this paper, we propose one such transform-based compression technique, where the joint time-frequency (TF) properties of the nonstationary nature of the audio signals were exploited in creating a compact energy representation of the signal in fewer coefficients. The decomposition coefficients were processed and perceptually filtered to retain only the relevant coefficients. Perceptual filtering (psychoacoustics) was applied in a novel way by analyzing and performing TF-specific psychoacoustics experiments. An added advantage of the proposed technique is that, due to its signal-adaptive nature, it does not need predetermined segmentation of audio signals for processing. Eight stereo audio signal samples of different varieties were used in the study. Subjective (mean opinion score, MOS) listening tests were performed and the subjective difference grades (SDG) were used to compare the performance of the proposed coder with MP3, AAC, and HE-AAC encoders. Compression ratios in the range of 8 to 40 were achieved by the proposed technique with subjective difference grades (SDG) ranging from -0.53 to -2.27.
Smartphone audio port data collection cookbook

Directory of Open Access Journals (Sweden)

Kyle Forinash

2018-06-01

Full Text Available The audio port of a smartphone is designed to send and receive audio but can be harnessed for portable, economical, and accurate data collection from a variety of sources. While smartphones have internal sensors to measure a number of physical phenomena such as acceleration, magnetism and illumination levels, measurement of other phenomena such as voltage, external temperature, or accurate timing of moving objects is excluded. The audio port can be employed to sense such external phenomena, and it has the additional advantage of timing precision: because audio is recorded or played at a controlled rate separated from other smartphone activities, timings based on audio can be highly accurate. The following outlines unpublished details of the audio port technical elements for data collection, a general data collection recipe and an example timing application for Android devices.
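The timing argument in this abstract can be illustrated with a minimal sketch. It assumes a recording already loaded as a list of float samples at a known sample rate; the function names, the threshold and the synthetic pulses are hypothetical and are not taken from the article. Because samples arrive at a fixed rate, the index of a detected pulse converts directly to a time.

```python
def pulse_onsets(samples, threshold):
    """Return the sample indices where |sample| first rises to the threshold."""
    onsets = []
    above = False
    for i, x in enumerate(samples):
        is_above = abs(x) >= threshold
        if is_above and not above:
            onsets.append(i)  # rising edge: a new pulse starts here
        above = is_above
    return onsets


def onset_times(samples, sample_rate, threshold):
    """Convert pulse onset indices to seconds using the fixed sample rate."""
    return [i / sample_rate for i in pulse_onsets(samples, threshold)]


# Synthetic one-second recording with two short pulses (hypothetical data).
sample_rate = 44100
samples = [0.0] * sample_rate
for start in (4410, 26460):
    for i in range(start, start + 100):
        samples[i] = 0.9

times = onset_times(samples, sample_rate, threshold=0.5)
interval = times[1] - times[0]  # time between the two pulses, in seconds
```

With a 44.1 kHz recording, the resolution of such a measurement is one sample period, roughly 23 microseconds.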
Turkish Music Genre Classification using Audio and Lyrics Features

Directory of Open Access Journals (Sweden)

Önder ÇOBAN

2017-05-01

Full Text Available Music Information Retrieval (MIR) has become a popular research area in recent years. In this context, researchers have developed music information systems to find solutions for such major problems as automatic playlist creation, hit song detection, and music genre or mood classification. Meta-data information, lyrics, or melodic content of music have been used as feature resources in previous works. However, lyrics are not often used in MIR systems, and the number of works in this field is insufficient, especially for Turkish. In this paper, firstly, we have extended our previously created Turkish MIR (TMIR) dataset, which comprises Turkish lyrics, by including the audio file of each song. Secondly, we have investigated the effect of using audio and textual features together or separately on automatic Music Genre Classification (MGC). We have extracted textual features from lyrics using different feature extraction models such as word2vec and traditional Bag of Words. We have conducted our experiments with the Support Vector Machine (SVM) algorithm and analysed the impact of feature selection and different feature groups on MGC. We have treated lyrics-based MGC as a text classification task and also investigated the effect of the term weighting method. Experimental results show that textual features can be as effective as audio features for Turkish MGC, especially when a supervised term weighting method is employed. We have achieved the highest success rate, 99.12%, by using both audio and textual features together.
Structure Learning in Audio

DEFF Research Database (Denmark)

Nielsen, Andreas Brinch

By having information about the setting a user is in, a computer is able to make decisions proactively to facilitate tasks for the user. Two approaches are taken in this thesis to achieve more information about an audio environment. One approach is that of classifying audio, and a new approach...... investigated. A fast and computationally simple approach that compares recordings and classifies if they are from the same audio environment have been developed, and shows very high accuracy and the ability to synchronize recordings in the case of recording devices which are not connected. A more general model...
Listeners' expectation of room acoustical parameters based on visual cues

Science.gov (United States)

Valente, Daniel L.

Despite many studies investigating auditory spatial impressions in rooms, few have addressed the impact of simultaneous visual cues on localization and the perception of spaciousness. The current research presents an immersive audio-visual study, in which participants are instructed to make spatial congruency and quantity judgments in dynamic cross-modal environments. The results of these psychophysical tests suggest the importance of consilient audio-visual presentation to the legibility of an auditory scene. Several studies have looked into audio-visual interaction in room perception in recent years, but these studies rely on static images, speech signals, or photographs alone to represent the visual scene. Building on these studies, the aim is to propose a testing method that uses monochromatic compositing (blue-screen technique) to position a studio recording of a musical performance in a number of virtual acoustical environments and ask subjects to assess these environments. In the first experiment of the study, video footage was taken from five rooms varying in physical size from a small studio to a small performance hall. Participants were asked to perceptually align two distinct acoustical parameters---early-to-late reverberant energy ratio and reverberation time---of two solo musical performances in five contrasting visual environments according to their expectations of how the room should sound given its visual appearance. In the second experiment in the study, video footage shot from four different listening positions within a general-purpose space was coupled with sounds derived from measured binaural impulse responses (IRs). The relationship between the presented image, sound, and virtual receiver position was examined. It was found that many visual cues caused different perceived events of the acoustic environment. This included the visual attributes of the space in which the performance was located as well as the visual attributes of the performer
Audio-visual interactions in product sound design

NARCIS (Netherlands)

Özcan, E.; Van Egmond, R.

2010-01-01

Consistent product experience requires congruity between product properties such as visual appearance and sound. Therefore, for designing appropriate product sounds by manipulating their spectral-temporal structure, product sounds should preferably not be considered in isolation but as an integral
Instrumentation in Support of Interactive Visualization, Computation and Simulation

National Research Council Canada - National Science Library

Wegman, Edward

1997-01-01

... and related spatial and volumetric visualization problems. By virtual environments, we meant an immersive visual and audio technology such that experimenter has little or no awareness of the real environment...
Instrumental Landing Using Audio Indication

Science.gov (United States)

Burlak, E. A.; Nabatchikov, A. M.; Korsun, O. N.

2018-02-01

The paper proposes an audio indication method for presenting to a pilot information about the relative position of an aircraft in precision piloting tasks. The implementation of the method is presented, and the use of audio signal parameters such as loudness, frequency and modulation is discussed. To confirm the operability of the audio indication channel, experiments were carried out using a modern aircraft simulation facility. Simulated instrument landings were performed using the proposed audio method to indicate the aircraft deviations relative to the glide path. The results proved comparable with simulated instrument landings using the traditional glideslope pointers. This encourages further development of the method to solve other precision piloting tasks.
Bit rates in audio source coding

NARCIS (Netherlands)

Veldhuis, Raymond N.J.

1992-01-01

The goal is to introduce and solve the audio coding optimization problem. Psychoacoustic results such as masking and excitation pattern models are combined with results from rate distortion theory to formulate the audio coding optimization problem. The solution of the audio optimization problem is a
Hierarchical structure for audio-video based semantic classification of sports video sequences

Science.gov (United States)

Kolekar, M. H.; Sengupta, S.

2005-07-01

A hierarchical structure for sports event classification based on audio and video content analysis is proposed in this paper. Compared to the event classifications in other games, those of cricket are very challenging and as yet unexplored. We have successfully solved the cricket video classification problem using a six-level hierarchical structure. The first level performs event detection based on the audio energy and Zero Crossing Rate (ZCR) of the short-time audio signal. In the subsequent levels, we classify the events based on video features using a Hidden Markov Model implemented through Dynamic Programming (HMM-DP) using color or motion as a likelihood function. For some of the game-specific decisions, a rule-based classification is also performed. Our proposed hierarchical structure can easily be applied to any other sport. Our results are very promising and we have moved a step forward towards addressing semantic classification problems in general.
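The first level described above relies on two standard short-time audio features. The following is a minimal sketch of how they can be computed, not the authors' implementation: the frame length, hop size, energy threshold and function names are all assumptions chosen for illustration, and the input is a synthetic signal.

```python
import math


def short_time_features(samples, frame_len=400, hop=200):
    """Return a list of (energy, zcr) pairs, one per overlapping frame."""
    features = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        # Mean squared amplitude of the frame.
        energy = sum(x * x for x in frame) / frame_len
        # Fraction of adjacent sample pairs whose signs differ.
        crossings = sum(
            1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
        )
        zcr = crossings / (frame_len - 1)
        features.append((energy, zcr))
    return features


def detect_events(features, energy_threshold):
    """Return indices of frames whose energy exceeds a hypothetical threshold."""
    return [i for i, (energy, _) in enumerate(features) if energy > energy_threshold]


# Synthetic signal: a quiet 100 Hz tone followed by a loud 1000 Hz tone.
rate = 8000
quiet = [0.05 * math.sin(2 * math.pi * 100 * n / rate) for n in range(800)]
loud = [0.8 * math.sin(2 * math.pi * 1000 * n / rate) for n in range(800)]
feats = short_time_features(quiet + loud)
loud_frames = detect_events(feats, energy_threshold=0.05)
```

In this toy signal the loud segment has both higher energy and a higher zero crossing rate than the quiet one. A real detector would tune the threshold on labelled audio.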
Implementing Audio-CASI on Windows’ Platforms

Science.gov (United States)

Cooley, Philip C.; Turner, Charles F.

2011-01-01

Audio computer-assisted self interviewing (Audio-CASI) technologies have recently been shown to provide important and sometimes dramatic improvements in the quality of survey measurements. This is particularly true for measurements requiring respondents to divulge highly sensitive information such as their sexual, drug use, or other sensitive behaviors. However, DOS-based Audio-CASI systems that were designed and adopted in the early 1990s have important limitations. Most salient is the poor control they provide for manipulating the video presentation of survey questions. This article reports our experiences adapting Audio-CASI to Microsoft Windows 3.1 and Windows 95 platforms. Overall, our Windows-based system provided the desired control over video presentation and afforded other advantages including compatibility with a much wider array of audio devices than our DOS-based Audio-CASI technologies. These advantages came at the cost of increased system requirements, including the need for both more RAM and larger hard disks. While these costs will be an issue for organizations converting large inventories of PCs to Windows Audio-CASI today, this will not be a serious constraint for organizations and individuals with small inventories of machines to upgrade or those purchasing new machines today. PMID:22081743
Study of audio and video streaming over heterogeneous networks (Estudio del streaming de audio y vídeo sobre redes heterogéneas)

OpenAIRE

Gómez Cruz, María del Carmen

2009-01-01

Operators have found business niches in the integration of multiple advanced services, such as Voice over IP, video on demand, high-capacity data, and high-definition television distribution. This requires constant adaptation of their broadband communication networks to support higher bandwidths, with greater reach and lower losses, in short with better performance; and it drives the development by manufacturers of audio and vid...
DEVELOPING VISUAL NOVEL GAME WITH SPEECH-RECOGNITION INTERACTIVITY TO ENHANCE STUDENTS’ MASTERY ON ENGLISH EXPRESSIONS

OpenAIRE

Elizabeth Anggraeni Amalo; Imam Dui Agusalim; Citra Devi Murdaningtyas

2017-01-01

The teaching of English-expressions has always been done through conversation samples in form of written texts, audio recordings, and videos. In the meantime, the development of computer-aided learning technology has made autonomous language learning possible. Game, as one of computer-aided learning technology products, can serve as a medium to provide educational contents like that of language teaching and learning. Visual Novel is considered as a conversational game that is suitable to be c...
Visual and visuomotor processing of hands and tools as a case study of cross talk between the dorsal and ventral streams.

Science.gov (United States)

Almeida, Jorge; Amaral, Lénia; Garcea, Frank E; Aguiar de Sousa, Diana; Xu, Shan; Mahon, Bradford Z; Martins, Isabel Pavão

2018-05-24

A major principle of organization of the visual system is between a dorsal stream that processes visuomotor information and a ventral stream that supports object recognition. Most research has focused on dissociating processing across these two streams. Here we focus on how the two streams interact. We tested neurologically-intact and impaired participants in an object categorization task over two classes of objects that depend on processing within both streams-hands and tools. We measured how unconscious processing of images from one of these categories (e.g., tools) affects the recognition of images from the other category (i.e., hands). Our findings with neurologically-intact participants demonstrated that processing an image of a hand hampers the subsequent processing of an image of a tool, and vice versa. These results were not present in apraxic patients (N = 3). These findings suggest local and global inhibitory processes working in tandem to co-register information across the two streams.
Perceptual Coding of Audio Signals Using Adaptive Time-Frequency Transform

Directory of Open Access Journals (Sweden)

Karthikeyan Umapathy

2007-08-01

Full Text Available Wide band digital audio signals have a very high data-rate associated with them due to their complex nature and demand for high-quality reproduction. Although recent technological advancements have significantly reduced the cost of bandwidth and miniaturized storage facilities, the rapid increase in the volume of digital audio content constantly compels the need for better compression algorithms. Over the years various perceptually lossless compression techniques have been introduced, and transform-based compression techniques have made a significant impact in recent years. In this paper, we propose one such transform-based compression technique, where the joint time-frequency (TF) properties of the nonstationary nature of the audio signals were exploited in creating a compact energy representation of the signal in fewer coefficients. The decomposition coefficients were processed and perceptually filtered to retain only the relevant coefficients. Perceptual filtering (psychoacoustics) was applied in a novel way by analyzing and performing TF-specific psychoacoustics experiments. An added advantage of the proposed technique is that, due to its signal-adaptive nature, it does not need predetermined segmentation of audio signals for processing. Eight stereo audio signal samples of different varieties were used in the study. Subjective (mean opinion score, MOS) listening tests were performed and the subjective difference grades (SDG) were used to compare the performance of the proposed coder with MP3, AAC, and HE-AAC encoders. Compression ratios in the range of 8 to 40 were achieved by the proposed technique with subjective difference grades (SDG) ranging from -0.53 to -2.27.
Audio wiring guide how to wire the most popular audio and video connectors

CERN Document Server

Hechtman, John

2012-01-01

Whether you're a pro or an amateur, a musician or into multimedia, you can't afford to guess about audio wiring. The Audio Wiring Guide is a comprehensive, easy-to-use guide that explains exactly what you need to know. No matter the size of your wiring project or installation, this handy tool provides you with the essential information you need and the techniques to use it. Using The Audio Wiring Guide is like having an expert at your side. By following the clear, step-by-step directions, you can do professional-level work at a fraction of the cost.
The role of automated speech and audio analysis in semantic multimedia annotation

NARCIS (Netherlands)

de Jong, Franciska M.G.; Ordelman, Roeland J.F.; van Hessen, Adrianus J.

This paper overviews the various ways in which automatic speech and audio analysis can be deployed to enhance the semantic annotation of multimedia content, and as a consequence to improve the effectiveness of conceptual access tools. A number of techniques will be presented, including the alignment

Audio Frequency Analysis in Mobile Phones

Science.gov (United States)

Aguilar, Horacio Munguía

2016-01-01

A new experiment using mobile phones is proposed in which their audio frequency response is analyzed using the audio port for inputting an external signal and getting a measurable output. This experiment shows how the limited audio bandwidth used in mobile telephony is the main cause of the poor speech quality in this service. A brief discussion is…
Presence and the utility of audio spatialization

DEFF Research Database (Denmark)

Bormann, Karsten

2005-01-01

The primary concern of this paper is whether the utility of audio spatialization, as opposed to the fidelity of audio spatialization, impacts presence. An experiment is reported that investigates the presence-performance relationship by decoupling spatial audio fidelity (realism) from task...... performance by varying the spatial fidelity of the audio independently of its relevance to performance on the search task that subjects were to perform. This was achieved by having conditions in which subjects searched for a music-playing radio (an active sound source) and having conditions in which...... supplied only nonattenuated audio was detrimental to performance. Even so, this group of subjects consistently had the largest increase in presence scores over the baseline experiment. Further, the Witmer and Singer (1998) presence questionnaire was more sensitive to whether the audio source was active...
Modified BTC Algorithm for Audio Signal Coding

Directory of Open Access Journals (Sweden)

TOMIC, S.

2016-11-01

Full Text Available This paper describes modification of a well-known image coding algorithm, named Block Truncation Coding (BTC and its application in audio signal coding. BTC algorithm was originally designed for black and white image coding. Since black and white images and audio signals have different statistical characteristics, the application of this image coding algorithm to audio signal presents a novelty and a challenge. Several implementation modifications are described in this paper, while the original idea of the algorithm is preserved. The main modifications are performed in the area of signal quantization, by designing more adequate quantizers for audio signal processing. The result is a novel audio coding algorithm, whose performance is presented and analyzed in this research. The performance analysis indicates that this novel algorithm can be successfully applied in audio signal coding.
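For orientation, the classic moment-preserving form of BTC can be sketched on a one-dimensional block of samples. This shows only the well-known baseline scheme (one bit per sample plus the block mean and standard deviation); it is not the modified quantizer design the paper proposes, and the function names and sample block are hypothetical.

```python
import math


def btc_encode(block):
    """Encode a block as (mean, standard deviation, one bit per sample)."""
    m = len(block)
    mean = sum(block) / m
    std = math.sqrt(sum((x - mean) ** 2 for x in block) / m)
    bits = [1 if x >= mean else 0 for x in block]
    return mean, std, bits


def btc_decode(mean, std, bits):
    """Rebuild a two-level block that preserves the original mean and variance."""
    m = len(bits)
    q = sum(bits)  # number of samples at or above the mean
    if q == 0 or q == m:
        return [mean] * m  # flat block: a single level is enough
    low = mean - std * math.sqrt(q / (m - q))
    high = mean + std * math.sqrt((m - q) / q)
    return [high if b else low for b in bits]


# Hypothetical block of eight audio samples.
block = [0.1, 0.4, -0.3, -0.2, 0.5, 0.0, -0.4, 0.3]
mean, std, bits = btc_encode(block)
decoded = btc_decode(mean, std, bits)
```

The two reconstruction levels are chosen so that the decoded block has the same mean and variance as the original, which is the defining property of the baseline algorithm.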
Visual analytics for semantic queries of TerraSAR-X image content

Science.gov (United States)

Espinoza-Molina, Daniela; Alonso, Kevin; Datcu, Mihai

2015-10-01

With the continuous image product acquisition of satellite missions, the size of the image archives is increasing considerably every day, as are the variety and complexity of their content, surpassing the end-user capacity to analyse and exploit them. Advances in the image retrieval field have contributed to the development of tools for interactive exploration and extraction of the images from huge archives using different parameters like metadata, key-words, and basic image descriptors. Even though we count on more powerful tools for automated image retrieval and data analysis, we still face the problem of understanding and analyzing the results. Thus, a systematic computational analysis of these results is required in order to provide the end-user with a summary of the archive content in comprehensible terms. In this context, visual analytics combines automated analysis with interactive visualization techniques for effective understanding, reasoning and decision making on the basis of very large and complex datasets. Moreover, several research efforts currently focus on associating the content of the images with semantic definitions for describing the data in a format that is easily understood by the end-user. In this paper, we present our approach for computing visual analytics and semantically querying the TerraSAR-X archive. Our approach is mainly composed of four steps: 1) the generation of a data model that explains the information contained in a TerraSAR-X product, formed by primitive descriptors and metadata entries, 2) the storage of this model in a database system, 3) the semantic definition of the image content based on machine learning algorithms and relevance feedback, and 4) querying the image archive using semantic descriptors as query parameters and computing the statistical analysis of the query results. The experimental results show that with the help of visual analytics and semantic definitions we are able to explain
Digital audio watermarking fundamentals, techniques and challenges

CERN Document Server

Xiang, Yong; Yan, Bin

2017-01-01

This book offers comprehensive coverage on the most important aspects of audio watermarking, from classic techniques to the latest advances, from commonly investigated topics to emerging research subdomains, and from the research and development achievements to date, to current limitations, challenges, and future directions. It also addresses key topics such as reversible audio watermarking, audio watermarking with encryption, and imperceptibility control methods. The book sets itself apart from the existing literature in three main ways. Firstly, it not only reviews classical categories of audio watermarking techniques, but also provides detailed descriptions, analysis and experimental results of the latest work in each category. Secondly, it highlights the emerging research topic of reversible audio watermarking, including recent research trends, unique features, and the potentials of this subdomain. Lastly, the joint consideration of audio watermarking and encryption is also reviewed. With the help of this...
The visual neuroscience of robotic grasping achieving sensorimotor skills through dorsal-ventral stream integration

CERN Document Server

Chinellato, Eris

2016-01-01

This book presents interdisciplinary research that pursues the mutual enrichment of neuroscience and robotics. Building on experimental work, and on the wealth of literature regarding the two cortical pathways of visual processing - the dorsal and ventral streams - we define and implement, computationally and on a real robot, a functional model of the brain areas involved in vision-based grasping actions. Grasping in robotics is largely an unsolved problem, and we show how the bio-inspired approach is successful in dealing with some fundamental issues of the task. Our robotic system can safely perform grasping actions on different unmodeled objects, denoting especially reliable visual and visuomotor skills. The computational model and the robotic experiments help in validating theories on the mechanisms employed by the brain areas more directly involved in grasping actions. This book offers new insights and research hypotheses regarding such mechanisms, especially for what concerns the interaction between the...
Discriminating native from non-native speech using fusion of visual cues

NARCIS (Netherlands)

Georgakis, Christos; Petridis, Stavros; Pantic, Maja

2014-01-01

The task of classifying accent, as belonging to a native language speaker or a foreign language speaker, has been so far addressed by means of the audio modality only. However, features extracted from the visual modality have been successfully used to extend or substitute audio-only approaches
Monitoring the sulfur content of coal streams by thermal-neutron-capture gamma-ray analysis

International Nuclear Information System (INIS)

Martin, J.W.; Hall, A.W.

1976-07-01

A theory was developed for evaluating a complex, prompt gamma ray spectrum to serve as the basis for an instrument to monitor continuously the sulfur content of tonnage streams of coal. Equations for the energies and intensities of prompt gamma rays emitted from 13 most significant elements in coal are combined into a single equation that defines the basic electronic design of the meter. The sulfur content of up to 10 tons per hour of coal was determined in pilot plant tests with a prototype meter. The precision of 0.04 percent sulfur substantiates the validity of the theory. In subsequent industrial plant tests the precision was determined to be a comparable 0.05 percent sulfur
Audio-Visual Biofeedback Does Not Improve the Reliability of Target Delineation Using Maximum Intensity Projection in 4-Dimensional Computed Tomography Radiation Therapy Planning

International Nuclear Information System (INIS)

Lu, Wei; Neuner, Geoffrey A.; George, Rohini; Wang, Zhendong; Sasor, Sarah; Huang, Xuan; Regine, William F.; Feigenberg, Steven J.; D'Souza, Warren D.

2014-01-01

Purpose: To investigate whether coaching patients' breathing would improve the match between ITV_MIP (internal target volume generated by contouring in the maximum intensity projection scan) and ITV_10 (generated by combining the gross tumor volumes contoured in 10 phases of a 4-dimensional CT [4DCT] scan). Methods and Materials: Eight patients with a thoracic tumor and 5 patients with an abdominal tumor were included in an institutional review board-approved prospective study. Patients underwent 3 4DCT scans with: (1) free breathing (FB); (2) coaching using audio-visual (AV) biofeedback via the Real-Time Position Management system; and (3) coaching via a spirometer system (Active Breathing Coordinator or ABC). One physician contoured all scans to generate the ITV_10 and ITV_MIP. The match between ITV_MIP and ITV_10 was quantitatively assessed with volume ratio, centroid distance, root mean squared distance, and overlap/Dice coefficient. We investigated whether coaching (AV or ABC) or uniform expansions (1, 2, 3, or 5 mm) of ITV_MIP improved the match. Results: Although both AV and ABC coaching techniques improved frequency reproducibility and ABC improved displacement regularity, neither improved the match between ITV_MIP and ITV_10 over FB. On average, ITV_MIP underestimated ITV_10 by 19%, 19%, and 21%, with centroid distance of 1.9, 2.3, and 1.7 mm and Dice coefficient of 0.87, 0.86, and 0.88 for FB, AV, and ABC, respectively. Separate analyses indicated a better match for lung cancers or tumors not adjacent to high-intensity tissues. Uniform expansions of ITV_MIP did not correct for the mismatch between ITV_MIP and ITV_10. Conclusions: In this pilot study, audio-visual biofeedback did not improve the match between ITV_MIP and ITV_10. In general, ITV_MIP should be limited to lung cancers, and modification of ITV_MIP in each phase of the 4DCT data set is recommended.
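Three of the match metrics named in the abstract (volume ratio, centroid distance and Dice coefficient) can be sketched as follows. This assumes each contoured volume is represented as a set of integer voxel coordinates with unit voxel size; the representation, function names and toy volumes are assumptions for illustration, not the study's software or data.

```python
import math


def dice_coefficient(a, b):
    """Dice overlap: 2|A ∩ B| / (|A| + |B|) for two sets of voxels."""
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


def volume_ratio(a, b):
    """Ratio of voxel counts, volume of a divided by volume of b."""
    return len(a) / len(b)


def centroid(voxels):
    """Mean (x, y, z) coordinate of a set of voxels."""
    n = len(voxels)
    return tuple(sum(v[i] for v in voxels) / n for i in range(3))


def centroid_distance(a, b):
    """Euclidean distance between the centroids of two voxel sets."""
    ca, cb = centroid(a), centroid(b)
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(ca, cb)))


def box(x0, x1):
    """Hypothetical box-shaped volume spanning x0..x1-1 with a 4x4 cross-section."""
    return {(x, y, z) for x in range(x0, x1) for y in range(4) for z in range(4)}


itv_10 = box(0, 10)   # toy reference volume, 160 voxels
itv_mip = box(0, 8)   # toy smaller volume, 128 voxels

ratio = volume_ratio(itv_mip, itv_10)
dice = dice_coefficient(itv_mip, itv_10)
distance = centroid_distance(itv_mip, itv_10)
```

In this toy case the smaller volume lies entirely inside the reference, so the Dice coefficient is reduced only by the size difference, while the centroid distance reflects the one-sided shortfall along x.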
Audio-Visual Biofeedback Does Not Improve the Reliability of Target Delineation Using Maximum Intensity Projection in 4-Dimensional Computed Tomography Radiation Therapy Planning

Energy Technology Data Exchange (ETDEWEB)

Lu, Wei, E-mail: wlu@umm.edu [Department of Radiation Oncology, University of Maryland School of Medicine, Baltimore, Maryland (United States); Neuner, Geoffrey A.; George, Rohini; Wang, Zhendong; Sasor, Sarah [Department of Radiation Oncology, University of Maryland School of Medicine, Baltimore, Maryland (United States); Huang, Xuan [Research and Development, Care Management Department, Johns Hopkins HealthCare LLC, Glen Burnie, Maryland (United States); Regine, William F.; Feigenberg, Steven J.; D'Souza, Warren D. [Department of Radiation Oncology, University of Maryland School of Medicine, Baltimore, Maryland (United States)

2014-01-01

Purpose: To investigate whether coaching patients' breathing would improve the match between ITV_MIP (internal target volume generated by contouring in the maximum intensity projection scan) and ITV_10 (generated by combining the gross tumor volumes contoured in 10 phases of a 4-dimensional CT [4DCT] scan). Methods and Materials: Eight patients with a thoracic tumor and 5 patients with an abdominal tumor were included in an institutional review board-approved prospective study. Patients underwent 3 4DCT scans with: (1) free breathing (FB); (2) coaching using audio-visual (AV) biofeedback via the Real-Time Position Management system; and (3) coaching via a spirometer system (Active Breathing Coordinator or ABC). One physician contoured all scans to generate the ITV_10 and ITV_MIP. The match between ITV_MIP and ITV_10 was quantitatively assessed with volume ratio, centroid distance, root mean squared distance, and overlap/Dice coefficient. We investigated whether coaching (AV or ABC) or uniform expansions (1, 2, 3, or 5 mm) of ITV_MIP improved the match. Results: Although both AV and ABC coaching techniques improved frequency reproducibility and ABC improved displacement regularity, neither improved the match between ITV_MIP and ITV_10 over FB. On average, ITV_MIP underestimated ITV_10 by 19%, 19%, and 21%, with centroid distance of 1.9, 2.3, and 1.7 mm and Dice coefficient of 0.87, 0.86, and 0.88 for FB, AV, and ABC, respectively. Separate analyses indicated a better match for lung cancers or tumors not adjacent to high-intensity tissues. Uniform expansions of ITV_MIP did not correct for the mismatch between ITV_MIP and ITV_10. Conclusions: In this pilot study, audio-visual biofeedback did not improve the match between ITV_MIP and ITV_10. In general, ITV_MIP should be limited to lung cancers, and modification of ITV_MIP in each phase of the 4DCT data set is recommended.
A content analysis of visual cancer information: prevalence and use of photographs and illustrations in printed health materials.

Science.gov (United States)

King, Andy J

2015-01-01

Researchers and practitioners have an increasing interest in visual components of health information and health communication messages. This study contributes to this evolving body of research by providing an account of the visual images and information featured in printed cancer communication materials. Using content analysis, 147 pamphlets and 858 images were examined to determine how frequently images are used in printed materials, what types of images are used, what information is conveyed visually, and whether or not current recommendations for the inclusion of visual content were being followed. Although visual messages were found to be common in printed health materials, existing recommendations about the inclusion of visual content were only partially followed. Results are discussed in terms of how relevant theoretical frameworks in the areas of behavior change and visual persuasion seem to be used in these materials, as well as how more theory-oriented research is necessary in visual messaging efforts.
Parietal and early visual cortices encode working memory content across mental transformations.

Science.gov (United States)

Christophel, Thomas B; Cichy, Radoslaw M; Hebart, Martin N; Haynes, John-Dylan

2015-02-01

Active and flexible manipulations of memory contents "in the mind's eye" are believed to occur in a dedicated neural workspace, frequently referred to as visual working memory. Such a neural workspace should have two important properties: The ability to store sensory information across delay periods and the ability to flexibly transform sensory information. Here we used a combination of functional MRI and multivariate decoding to identify such neural representations. Subjects were required to memorize a complex artificial pattern for an extended delay, then rotate the mental image as instructed by a cue and memorize this transformed pattern. We found that patterns of brain activity already in early visual areas and posterior parietal cortex encode not only the initially remembered image, but also the transformed contents after mental rotation. Our results thus suggest that the flexible and general neural workspace supporting visual working memory can be realized within posterior brain regions.
Detecting double compression of audio signal

Science.gov (United States)

Yang, Rui; Shi, Yun Q.; Huang, Jiwu

2010-01-01

MP3 is currently the most popular audio format in daily life; for example, music downloaded from the Internet and files saved in digital recorders are often in MP3 format. However, low-bitrate MP3s are often transcoded to a high bitrate because high-bitrate files have higher commercial value. Audio recorded on a digital recorder can also be doctored easily with widely available audio editing software. This paper presents two methods for detecting double MP3 compression. The methods are essential for identifying fake-quality MP3s and for audio forensics. The proposed methods use support vector machine classifiers with feature vectors formed from the distributions of the first digits of the quantized MDCT (modified discrete cosine transform) coefficients. Extensive experiments demonstrate the effectiveness of the proposed methods. To the best of our knowledge, this work is the first to detect double compression of audio signals.
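The feature described in this abstract, a distribution of first digits, can be sketched as follows. This is only an illustration of the first-digit histogram itself: it assumes the quantized MDCT coefficients are already available as a list of integers, and it does not decode MP3 or train a support vector machine. The function name and the sample values are hypothetical.

```python
from collections import Counter


def first_digit_distribution(coefficients):
    """Relative frequency of leading digits 1..9 among the nonzero integers."""
    digits = [int(str(abs(c))[0]) for c in coefficients if c != 0]
    if not digits:
        return [0.0] * 9
    counts = Counter(digits)
    return [counts.get(d, 0) / len(digits) for d in range(1, 10)]


# Hypothetical quantized coefficient values, for illustration only.
coeffs = [12, -3, 0, 145, 1, -27, 9, 18]
features = first_digit_distribution(coeffs)
print(features)
```

The resulting nine-element vector is the kind of input a classifier could consume; how the paper's two methods assemble their feature vectors in detail is not stated in the abstract.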
Deep Complementary Bottleneck Features for Visual Speech Recognition

NARCIS (Netherlands)

Petridis, Stavros; Pantic, Maja

Deep bottleneck features (DBNFs) have been used successfully in the past for acoustic speech recognition from audio. However, research on extracting DBNFs for visual speech recognition is very limited. In this work, we present an approach to extract deep bottleneck visual features based on deep
Elicitation of attributes for the evaluation of audio-on-audio interference

DEFF Research Database (Denmark)

Francombe, Jon; Mason, R.; Dewhirst, M.

2014-01-01

An experiment to determine the perceptual attributes of the experience of listening to a target audio program in the presence of an audio interferer was performed. The first stage was a free elicitation task in which a total of 572 phrases were produced. In the second stage, a consensus vocabulary procedure was used to reduce these phrases into a comprehensive set of attributes. Groups of experienced and inexperienced listeners determined nine and eight attributes, respectively. These attribute sets were combined by the listeners to produce a final set of 12 attributes: masking, calming, distraction...
Landscaping Considerations for Urban Stream Restoration Projects

National Research Council Canada - National Science Library

Bailey, Pam

2004-01-01

... after restoration and its functionality for public use. The landscaping component of such stream and riparian restoration projects must be emphasized given its importance of visual success and public perception. The purpose of this technical note is to address landscaping considerations associated with urban stream and riparian restoration projects, and provide ideas to managers for enhancing the visual appeal and aesthetic qualities of urban projects.
Software architecture for an indigenous knowledge management system

CSIR Research Space (South Africa)

Fogwill, T

2011-11-01

Full Text Available in technological landscape. It preserves IK in a digital form that remains accessible to the original IK holders. It does so by recording and archiving audio/visual content that captures, to the extent possible, the oral, visual and performed aspects of IK...-end (described in section 6.2). The knowledge repository and the Fedora technology at its core are described in this section. The DKR stores and maintains digital audio/visual recordings of the captured IK. While these audio/visual recordings may...
CERN automatic audio-conference service

CERN Multimedia

Sierra Moral, R

2009-01-01

Scientists from all over the world need to collaborate with CERN on a daily basis. They must be able to communicate effectively on their joint projects at any time; as a result telephone conferences have become indispensable and widely used. Managed by 6 operators, CERN already has more than 20000 hours and 5700 audio-conferences per year. However, the traditional telephone based audio-conference system needed to be modernized in three ways. Firstly, to provide the participants with more autonomy in the organization of their conferences; secondly, to eliminate the constraints of manual intervention by operators; and thirdly, to integrate the audio-conferences into a collaborative working framework. The large number, and hence cost, of the conferences prohibited externalization and so the CERN telecommunications team drew up a specification to implement a new system. It was decided to use a new commercial collaborative audio-conference solution based on the SIP protocol. The system was tested as the first Euro...
CERN automatic audio-conference service

CERN Document Server

Sierra Moral, R

2010-01-01

Scientists from all over the world need to collaborate with CERN on a daily basis. They must be able to communicate effectively on their joint projects at any time; as a result telephone conferences have become indispensable and widely used. Managed by 6 operators, CERN already has more than 20000 hours and 5700 audio-conferences per year. However, the traditional telephone based audio-conference system needed to be modernized in three ways. Firstly, to provide the participants with more autonomy in the organization of their conferences; secondly, to eliminate the constraints of manual intervention by operators; and thirdly, to integrate the audio-conferences into a collaborative working framework. The large number, and hence cost, of the conferences prohibited externalization and so the CERN telecommunications team drew up a specification to implement a new system. It was decided to use a new commercial collaborative audio-conference solution based on the SIP protocol. The system was tested as the first Euro...
Debugging of Class-D Audio Power Amplifiers

DEFF Research Database (Denmark)

Crone, Lasse; Pedersen, Jeppe Arnsdorf; Mønster, Jakob Døllner

2012-01-01

Determining and optimizing the performance of a Class-D audio power amplifier can be very difficult without knowledge of the use of audio performance measuring equipment and of how the various noise and distortion sources influence the audio performance. This paper gives an introduction on how to measure...

Classification of dual language audio-visual content: Introduction to the VideoCLEF 2008 pilot benchmark evaluation task

NARCIS (Netherlands)

Larson, M.; Newman, E.; Jones, G.J.F.; Köhler, J.; Larson, M.; de Jong, F.M.G.; Kraaij, W.; Ordelman, R.J.F.

2008-01-01

VideoCLEF is a new track for the CLEF 2008 campaign. This track aims to develop and evaluate tasks in analyzing multilingual video content. A pilot of a Vid2RSS task involving assigning thematic class labels to video kicks off the VideoCLEF track in 2008. Task participants deliver classification
Audio-visual interactions uniquely contribute to resolution of visual conflict in people possessing absolute pitch.

Directory of Open Access Journals (Sweden)

Sujin Kim

Full Text Available Individuals possessing absolute pitch (AP) are able to identify a given musical tone or to reproduce it without reference to another tone. The present study sought to learn whether this exceptional auditory ability impacts visual perception under stimulus conditions that provoke visual competition in the form of binocular rivalry. Nineteen adult participants with 3-19 years of musical training were divided into two groups according to their performance on a task involving identification of the specific note associated with hearing a given musical pitch. During test trials lasting just over half a minute, participants dichoptically viewed a scrolling musical score presented to one eye and a drifting sinusoidal grating presented to the other eye; throughout the trial they pressed buttons to track the alternations in visual awareness produced by these dissimilar monocular stimuli. On "pitch-congruent" trials, participants heard an auditory melody that was congruent in pitch with the visual score, on "pitch-incongruent" trials they heard a transposed auditory melody that was congruent with the score in melody but not in pitch, and on "melody-incongruent" trials they heard an auditory melody completely different from the visual score. For both groups, the visual musical scores predominated over the gratings when the auditory melody was congruent compared to when it was incongruent. Moreover, the AP participants experienced greater predominance of the visual score when it was accompanied by the pitch-congruent melody compared to the same melody transposed in pitch; for non-AP musicians, pitch-congruent and pitch-incongruent trials yielded equivalent predominance. Analysis of individual durations of dominance revealed differential effects on dominance and suppression durations for AP and non-AP participants. These results reveal that AP is accompanied by a robust form of bisensory interaction between tonal frequencies and musical notation that boosts
Design of an audio advertisement dataset

Science.gov (United States)

Fu, Yutao; Liu, Jihong; Zhang, Qi; Geng, Yuting

2015-12-01

As more and more advertisements are broadcast on radio, it is necessary to establish an audio advertising dataset that can be used to analyze and classify advertisements. This paper presents a method for establishing a complete audio advertising dataset. The dataset is divided into four kinds of advertisements. Each advertisement sample is given in *.wav file format and annotated with a txt file that contains its file name, sampling frequency, channel number, broadcasting time and class. The soundness of the classification is verified by clustering the different advertisements based on Principal Component Analysis (PCA). The experimental results show that this audio advertisement dataset offers a reliable set of samples for related audio advertisement experiments.
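The annotation scheme described here (file name, sampling frequency, channel number, broadcasting time and class) can be sketched with Python's standard `wave` module. The tab-separated layout, the function names, the sample file name and the broadcast time and class values are all assumptions made for illustration; the abstract does not specify the txt file's exact format.

```python
import os
import tempfile
import wave


def write_silent_wav(path, rate=16000, channels=1, n_frames=160):
    """Create a short silent 16-bit WAV file to stand in for a real recording."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit audio
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * n_frames * channels)


def annotation_line(path, broadcast_time, label):
    """Read rate and channel count from the WAV header and format one record."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        channels = wav.getnchannels()
    fields = [os.path.basename(path), str(rate), str(channels), broadcast_time, label]
    return "\t".join(fields)


with tempfile.TemporaryDirectory() as tmp:
    wav_path = os.path.join(tmp, "ad_0001.wav")  # hypothetical file name
    write_silent_wav(wav_path)
    line = annotation_line(wav_path, "2015-01-01 08:00", "class_1")
    print(line)
```

Sampling frequency and channel count come from the file header, so only the broadcasting time and the class need to be supplied by hand in a scheme like this.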
Making the Switch to Digital Audio

Directory of Open Access Journals (Sweden)

Shannon Gwin Mitchell

2004-12-01

Full Text Available In this article, the authors describe the process of converting from analog to digital audio data. They address the step-by-step decisions that they made in selecting hardware and software for recording and converting digital audio, issues of system integration, and cost considerations. The authors present a brief description of how digital audio is being used in their current research project and how it has enhanced the “quality” of their qualitative research.
Streaming simplification of tetrahedral meshes.

Science.gov (United States)

Vo, Huy T; Callahan, Steven P; Lindstrom, Peter; Pascucci, Valerio; Silva, Cláudio T

2007-01-01

Unstructured tetrahedral meshes are commonly used in scientific computing to represent scalar, vector, and tensor fields in three dimensions. Visualization of these meshes can be difficult to perform interactively due to their size and complexity. By reducing the size of the data, we can accomplish real-time visualization necessary for scientific analysis. We propose a two-step approach for streaming simplification of large tetrahedral meshes. Our algorithm arranges the data on disk in a streaming, I/O-efficient format that allows coherent access to the tetrahedral cells. A quadric-based simplification is sequentially performed on small portions of the mesh in-core. Our output is a coherent streaming mesh which facilitates future processing. Our technique is fast, produces high quality approximations, and operates out-of-core to process meshes too large for main memory.
Efficient Audio Power Amplification - Challenges

DEFF Research Database (Denmark)

Andersen, Michael Andreas E.

2005-01-01

For more than a decade efficient audio power amplification has evolved, and today switch-mode audio power amplification in various forms is the state of the art. The technical steps that led to this evolution are described and in addition many of the challenges still to be faced and where...
The Build-Up Course of Visuo-Motor and Audio-Motor Temporal Recalibration

Directory of Open Access Journals (Sweden)

Yoshimori Sugano

2011-10-01

Full Text Available The sensorimotor timing is recalibrated after a brief exposure to a delayed feedback of voluntary actions (temporal recalibration effect: TRE) (Heron et al., 2009; Stetson et al., 2006; Sugano et al., 2010). We introduce a new paradigm, namely ‘synchronous tapping’ (ST), which allows us to investigate how the TRE builds up during adaptation. In each experimental trial, participants were repeatedly exposed to a constant lag (∼150 ms) between their voluntary action (pressing a mouse) and a feedback stimulus (a visual flash / an auditory click) 10 times. Immediately after that, they performed a ST task with the same stimulus as a pace signal (7 flashes / clicks). A subjective ‘no-delay condition’ (∼50 ms) served as control. The TRE manifested itself as a change in the tap-stimulus asynchrony that compensated the exposed lag (eg, after lag adaptation, the tap preceded the stimulus more than in control) and built up quickly (∼3–6 trials, ∼23–45 sec) in both the visuo- and audio-motor domain. The audio-motor TRE was bigger and built up faster than the visuo-motor one. To conclude, the TRE is comparable between visuo- and audio-motor domain, though they are slightly different in size and build-up rate.
Decentralized Cloud Method For Multicasting Media Stream

Directory of Open Access Journals (Sweden)

D M B N Bandara

2015-08-01

Full Text Available With the advancement of information technology, the concept of idea sharing has advanced. In presentations, a personal computer and a projector have become essential, but on most occasions cables and physical devices are used to connect this equipment. This is inefficient and time consuming, and if a problem occurs, someone with technical knowledge is needed to resolve it. The objective of this research is to use wireless technology to reduce manual configuration and build a platform where one can easily share files, visual media and feedback. A system has been developed that detects all the devices on a network and, once permission is granted, shares video, audio and access controls. The final outcome of the research was a collaborative software bundle whose parts work together on a network. One part of the system is a desktop network application; the other is a mobile application. The desktop application can detect all other devices in the network that provide the same facility and, if required, can allocate a group, share its screen and files, and send a message stream to each device using multicasting. The mobile application can act as a mobile remote for the host computer of the group, detecting any input from the user and passing it to the system.
Low Delay Video Streaming on the Internet of Things Using Raspberry Pi

Directory of Open Access Journals (Sweden)

Ulf Jennehag

2016-09-01

Full Text Available The Internet of Things is predicted to consist of over 50 billion devices aiming to solve problems in most areas of our digital society. A large part of the data communicated is expected to consist of various multimedia contents, such as live audio and video. This article presents a solution for the communication of high definition video in low-delay scenarios (<200 ms) under the constraints of devices with limited hardware resources, such as the Raspberry Pi. We verify that it is possible to enable low delay video streaming between Raspberry Pi devices using a distributed Internet of Things system called the SensibleThings platform. Specifically, our implementation transfers a 6 Mbps H.264 video stream of 1280 × 720 pixels at 25 frames per second between devices with a total delay of 181 ms on the public Internet, of which the overhead of the distributed Internet of Things communication platform only accounts for 18 ms of this delay. We have found that the most significant bottleneck of video transfer on limited Internet of Things devices is the video coding and not the distributed communication platform, since the video coding accounts for 90% of the total delay.
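The figures quoted in this abstract can be put side by side with a few lines of arithmetic. The sketch below only restates the reported numbers (181 ms total delay, 18 ms platform overhead, a 90% coding share, 6 Mbps at 25 frames per second, and the 200 ms target); the variable names are illustrative, and the average frame size assumes the bitrate is spread evenly over frames, which real H.264 streams do not do.

```python
# Values reported in the abstract.
total_delay_ms = 181
platform_overhead_ms = 18
coding_share = 0.90
bitrate_bps = 6_000_000
frames_per_second = 25
delay_target_ms = 200

# Derived quantities.
platform_share = platform_overhead_ms / total_delay_ms  # fraction of total delay
coding_delay_ms = coding_share * total_delay_ms         # delay attributed to coding
headroom_ms = delay_target_ms - total_delay_ms          # margin under the target
frame_interval_ms = 1000 / frames_per_second            # time between frames
bits_per_frame = bitrate_bps / frames_per_second        # average, assuming even spread

print(f"platform share of delay: {platform_share:.1%}")
print(f"coding delay: {coding_delay_ms:.0f} ms")
print(f"headroom under target: {headroom_ms} ms")
print(f"frame interval: {frame_interval_ms:.0f} ms")
print(f"average frame size: {bits_per_frame / 8 / 1000:.0f} kB")
```

The platform overhead works out to roughly a tenth of the total delay, which is consistent with the abstract attributing about 90% of the delay to video coding.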
Visual tables of contents: structure and navigation of digital video material

NARCIS (Netherlands)

Janse, M.D.; Das, D.A.D.; Tang, H.K.; Paassen, van R.L.F.

1997-01-01

This paper presents a study that was initiated to address the relationship between visualization of content information, the structure of this information and the effective traversal and navigation for users of digital video storage systems in domestic environments. Preliminary results in two topic
New audio applications of beryllium metal

International Nuclear Information System (INIS)

Sato, M.

1977-01-01

The major applications of beryllium metal in the field of audio appliances are the vibrating cones of two types of speakers, the 'TWITTER' for high-range sound and the 'SQUAWKER' for mid-range sound, and the beryllium cantilever tube assembled in stereo cartridges. These new applications are based on beryllium's characteristically high ratio of modulus of elasticity to specific gravity. The production of these audio parts is described, and the audio response is shown. (author)
Visual input that matches the content of visual working memory requires less (not faster) evidence sampling to reach conscious access

NARCIS (Netherlands)

Gayet, S.; van Maanen, L.; Heilbron, M.; Paffen, C.L.E.; Van Der Stigchel, S.

2016-01-01

The content of visual working memory (VWM) affects the processing of concurrent visual input. Recently, it has been demonstrated that stimuli are released from interocular suppression faster when they match rather than mismatch a color that is memorized for subsequent recall. In order to investigate
Visually induced gains in pitch discrimination: Linking audio-visual processing with auditory abilities.

Science.gov (United States)

Møller, Cecilie; Højlund, Andreas; Bærentsen, Klaus B; Hansen, Niels Chr; Skewes, Joshua C; Vuust, Peter

2018-05-01

Perception is fundamentally a multisensory experience. The principle of inverse effectiveness (PoIE) states how the multisensory gain is maximal when responses to the unisensory constituents of the stimuli are weak. It is one of the basic principles underlying multisensory processing of spatiotemporally corresponding crossmodal stimuli that are well established at behavioral as well as neural levels. It is not yet clear, however, how modality-specific stimulus features influence discrimination of subtle changes in a crossmodally corresponding feature belonging to another modality. Here, we tested the hypothesis that reliance on visual cues to pitch discrimination follows the PoIE at the interindividual level (i.e., varies with varying levels of auditory-only pitch discrimination abilities). Using an oddball pitch discrimination task, we measured the effect of varying visually perceived vertical position in participants exhibiting a wide range of pitch discrimination abilities (i.e., musicians and nonmusicians). Visual cues significantly enhanced pitch discrimination as measured by the sensitivity index d', and more so in the crossmodally congruent than incongruent condition. The magnitude of gain caused by compatible visual cues was associated with individual pitch discrimination thresholds, as predicted by the PoIE. This was not the case for the magnitude of the congruence effect, which was unrelated to individual pitch discrimination thresholds, indicating that the pitch-height association is robust to variations in auditory skills. Our findings shed light on individual differences in multisensory processing by suggesting that relevant multisensory information that crucially aids some perceivers' performance may be of less importance to others, depending on their unisensory abilities.
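For readers unfamiliar with the sensitivity index d' mentioned in this abstract, the sketch below shows the textbook signal-detection formula, d' = z(hit rate) − z(false-alarm rate). Whether the study computed d' exactly this way, or applied any correction for extreme rates, is not stated in the abstract, so this is an assumption; the function name and the example rates are hypothetical.

```python
from statistics import NormalDist


def d_prime(hit_rate, false_alarm_rate):
    """Sensitivity index d' = z(hit rate) - z(false-alarm rate).

    Both rates must lie strictly between 0 and 1, because the inverse
    normal CDF is undefined at the endpoints.
    """
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(false_alarm_rate)


# Hypothetical rates: a listener at chance, and a more sensitive listener.
print(d_prime(0.5, 0.5))
print(d_prime(0.9, 0.1))
```

A larger d' means the listener separates deviant from standard tones more reliably, independently of any overall bias toward responding "different".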
Visual cues and listening effort: individual variability.

Science.gov (United States)

Picou, Erin M; Ricketts, Todd A; Hornsby, Benjamin W Y

2011-10-01

To investigate the effect of visual cues on listening effort as well as whether predictive variables such as working memory capacity (WMC) and lipreading ability affect the magnitude of listening effort. Twenty participants with normal hearing were tested using a paired-associates recall task in 2 conditions (quiet and noise) and 2 presentation modalities (audio only [AO] and auditory-visual [AV]). Signal-to-noise ratios were adjusted to provide matched speech recognition across audio-only and AV noise conditions. Also measured were subjective perceptions of listening effort and 2 predictive variables: (a) lipreading ability and (b) WMC. Objective and subjective results indicated that listening effort increased in the presence of noise, but on average the addition of visual cues did not significantly affect the magnitude of listening effort. Although there was substantial individual variability, on average participants who were better lipreaders or had larger WMCs demonstrated reduced listening effort in noise in AV conditions. Overall, the results support the hypothesis that integrating auditory and visual cues requires cognitive resources in some participants. The data indicate that low lipreading ability or low WMC is associated with relatively effortful integration of auditory and visual information in noise.
Efficient audio power amplification - challenges

Energy Technology Data Exchange (ETDEWEB)

Andersen, Michael A.E.

2005-07-01

For more than a decade efficient audio power amplification has evolved, and today switch-mode audio power amplification in various forms is the state of the art. The technical steps that led to this evolution are described, and in addition many of the challenges still to be faced and the areas where extensive research and development are needed are covered. (au)
Detection Of Alterations In Audio Files Using Spectrograph Analysis

Directory of Open Access Journals (Sweden)

Anandha Krishnan G

2015-08-01

Full Text Available This study was carried out to detect changes in audio files using a spectrograph. An audio file format is a file format for storing digital audio data on a computer system. A sound spectrograph is a laboratory instrument that displays a graphical representation of the strengths of the various component frequencies of a sound as time passes. The objectives of the study were to find the changes in the spectrograph of audio files after altering them, to compare the altered files with the spectrograph of the original files, and to check for similarities and differences between MP3 and WAV. Five different alterations were carried out on each audio file to analyze the differences between the original and the altered file. To alter an MP3 or WAV audio file by cut/copy, the file was opened in Audacity and a different audio segment was pasted into it. This new file was analyzed to view the differences. By adjusting the necessary parameters, the noise was reduced. The differences between the new file and the original file were analyzed. By adjusting the parameters from the dialog box, the necessary changes were made. The edited audio file was opened in the software named Spek, which after analysis produces a graph of that particular file that is saved for further analysis. The original audio graph was combined with the edited audio file graph to see the alterations.
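The comparison described in this abstract, an original spectrogram set against that of an altered file, can be sketched in a few lines. This is not the workflow the authors used (they worked with Audacity and Spek); it is a minimal pure-Python illustration that uses a naive DFT without windowing, synthetic sine tones in place of real recordings, and hypothetical function names.

```python
import cmath
import math


def magnitude_spectrum(frame):
    """Magnitudes of DFT bins 0..N/2 for one frame (naive O(N^2) DFT, no window)."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]


def spectrogram(samples, frame_size=64, hop=32):
    """List of magnitude spectra, one per overlapping frame."""
    starts = range(0, len(samples) - frame_size + 1, hop)
    return [magnitude_spectrum(samples[s:s + frame_size]) for s in starts]


def frame_differences(spec_a, spec_b):
    """Mean absolute difference between corresponding frames of two spectrograms."""
    return [
        sum(abs(x - y) for x, y in zip(fa, fb)) / len(fa)
        for fa, fb in zip(spec_a, spec_b)
    ]


def sine(bin_index, length, frame_size=64):
    """Sine tone whose frequency falls exactly on one DFT bin of a frame."""
    return [math.sin(2 * math.pi * bin_index * t / frame_size) for t in range(length)]


# Synthetic "original", and an "altered" copy with a pasted-in segment.
original = sine(4, 256)
altered = original[:128] + sine(10, 64) + original[192:]

orig_spec = spectrogram(original)
diffs = frame_differences(orig_spec, spectrogram(altered))
changed_frames = [i for i, d in enumerate(diffs) if d > 1e-6]
print(changed_frames)
```

Frames that overlap the pasted segment show a nonzero difference while untouched frames match exactly, which is the kind of localized change a visual comparison of two spectrograph images is meant to reveal.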
Audio-Visual Speech in Noise Perception in Dyslexia

Science.gov (United States)

van Laarhoven, Thijs; Keetels, Mirjam; Schakel, Lemmy; Vroomen, Jean

2018-01-01

Individuals with developmental dyslexia (DD) may experience, besides reading problems, other speech-related processing deficits. Here, we examined the influence of visual articulatory information (lip-read speech) at various levels of background noise on auditory word recognition in children and adults with DD. We found that children with a…
Evolution-based Virtual Content Insertion with Visually Virtual Interactions in Videos

Science.gov (United States)

Chang, Chia-Hu; Wu, Ja-Ling

With the development of content-based multimedia analysis, virtual content insertion has been widely used and studied for video enrichment and multimedia advertising. However, how to automatically insert a user-selected virtual content into personal videos in a less-intrusive manner, with an attractive representation, is a challenging problem. In this chapter, we present an evolution-based virtual content insertion system which can insert virtual contents into videos with evolved animations according to predefined behaviors emulating the characteristics of evolutionary biology. The videos are considered not only as carriers of message conveyed by the virtual content but also as the environment in which the lifelike virtual contents live. Thus, the inserted virtual content will be affected by the videos to trigger a series of artificial evolutions and evolve its appearances and behaviors while interacting with video contents. By inserting virtual contents into videos through the system, users can easily create entertaining storylines and turn their personal videos into visually appealing ones. In addition, it would bring a new opportunity to increase the advertising revenue for video assets of the media industry and online video-sharing websites.
AudioMUD: a multiuser virtual environment for blind people.

Science.gov (United States)

Sánchez, Jaime; Hassler, Tiago

2007-03-01

A number of virtual environments have been developed in recent years. Among them are some applications for blind people based on different types of audio, from simple sounds to 3-D audio. In this study, we pursued a different approach. We designed AudioMUD by using spoken text to describe the environment, navigation, and interaction. We have also introduced some collaborative features into the interaction between blind users. The core of a multiuser MUD game is a networked textual virtual environment. We developed AudioMUD by adding some collaborative features to the basic idea of a MUD and placed a simulated virtual environment inside the human body. This paper presents the design and usability evaluation of AudioMUD. Blind learners were motivated when interacting with AudioMUD and helped to improve the interaction through audio and interface design elements.
Kinesthetic working memory and action control within the dorsal stream.

Science.gov (United States)

Fiehler, Katja; Burke, Michael; Engel, Annerose; Bien, Siegfried; Rösler, Frank

2008-02-01

There is wide agreement that the "dorsal (action) stream" processes visual information for movement control. However, movements depend not only on vision but also on tactile and kinesthetic information (=haptics). Using functional magnetic resonance imaging, the present study investigates to what extent networks within the dorsal stream are also utilized for kinesthetic action control and whether they are also involved in kinesthetic working memory. Fourteen blindfolded participants performed a delayed-recognition task in which right-handed movements had to be encoded, maintained, and later recognized without any visual feedback. Encoding of hand movements activated somatosensory areas, superior parietal lobe (dorsodorsal stream), anterior intraparietal sulcus (aIPS) and adjoining areas (ventrodorsal stream), premotor cortex, and occipitotemporal cortex (ventral stream). Short-term maintenance of kinesthetic information elicited load-dependent activity in the aIPS and adjacent anterior portion of the superior parietal lobe (ventrodorsal stream) of the left hemisphere. We propose that the action representation system of the dorsodorsal and ventrodorsal stream is utilized not only for visual but also for kinesthetic action control. Moreover, the present findings demonstrate that networks within the ventrodorsal stream, in particular the left aIPS and closely adjacent areas, are also engaged in working memory maintenance of kinesthetic information.

[Symptoms and lesion localization in visual agnosia].

Science.gov (United States)

Suzuki, Kyoko

2004-11-01

There are two cortical visual processing streams, the ventral and dorsal stream. The ventral visual stream plays the major role in constructing our perceptual representation of the visual world and the objects within it. Disturbance of visual processing at any stage of the ventral stream could result in impairment of visual recognition. Thus we need systematic investigations to diagnose visual agnosia and its type. Two types of category-selective visual agnosia, prosopagnosia and landmark agnosia, are different from others in that patients could recognize a face as a face and buildings as buildings, but could not identify an individual person or building. Neuronal bases of prosopagnosia and landmark agnosia are distinct. Importance of the right fusiform gyrus for face recognition was confirmed by both clinical and neuroimaging studies. Landmark agnosia is related to lesions in the right parahippocampal gyrus. Enlarged lesions including both the right fusiform and parahippocampal gyri can result in prosopagnosia and landmark agnosia at the same time. Category non-selective visual agnosia is related to bilateral occipito-temporal lesions, which is in agreement with the results of neuroimaging studies that revealed activation of the bilateral occipito-temporal regions during object recognition tasks.
Warmth and competence in your face! Visual encoding of stereotype content

Directory of Open Access Journals (Sweden)

Roland eImhoff

2013-06-01

Full Text Available Previous research suggests that stereotypes about a group’s warmth bias our visual representation of group members. Based on the Stereotype Content Model (SCM), the current research explored whether the second big dimension of social perception, competence, is also reflected in visual stereotypes. To test this, participants created typical faces for groups either high in warmth and low in competence (male nursery teachers) or vice versa (managers) in a reverse correlation image classification task, which allows for the visualization of stereotypes without any a priori assumptions about relevant dimensions. In support of the independent encoding of both SCM dimensions, hypotheses-blind raters judged the resulting visualizations of nursery teachers as warmer but less competent than the resulting image for managers, even when statistically controlling for judgments on one dimension. People thus seem to use facial cues indicating both relevant dimensions to make sense of social groups in a parsimonious, non-verbal and spontaneous manner.
Audio Recording of Children with Dyslalia

OpenAIRE

Stefan Gheorghe Pentiuc; Maria D. Schipor; Ovidiu A. Schipor

2008-01-01

In this paper we present our research on the automatic parsing of audio recordings. These recordings are obtained from children with dyslalia and are necessary for an accurate identification of speech problems. We develop a software application that helps parse audio recordings in real time.
The Two Visual Systems Hypothesis: new challenges and insights from visual form agnosic patient DF

Directory of Open Access Journals (Sweden)

Robert Leslie Whitwell

2014-12-01

Full Text Available Patient DF, who developed visual form agnosia following carbon monoxide poisoning, is still able to use vision to adjust the configuration of her grasping hand to the geometry of a goal object. This striking dissociation between perception and action in DF provided a key piece of evidence for the formulation of Goodale and Milner’s Two Visual Systems Hypothesis (TVSH). According to the TVSH, the ventral stream plays a critical role in constructing our visual percepts, whereas the dorsal stream mediates the visual control of action, such as visually guided grasping. In this review, we discuss recent studies of DF that provide new insights into the functional organization of the dorsal and ventral streams. We confirm recent evidence that DF has dorsal as well as ventral brain damage – and that her dorsal-stream lesions and surrounding atrophy have increased in size since her first published brain scan. We argue that the damage to DF’s dorsal stream explains her deficits in directing actions at targets in the periphery. We then focus on DF’s ability to accurately adjust her in-flight hand aperture to changes in the width of goal objects (grip scaling) whose dimensions she cannot explicitly report. An examination of several studies of DF’s grip scaling under natural conditions reveals a modest though significant deficit. Importantly, however, she continues to show a robust dissociation between form vision for perception and form vision for action. We also review recent studies that explore the role of online visual feedback and terminal haptic feedback in the programming and control of her grasping. These studies make it clear that DF is no more reliant on visual or haptic feedback than are neurologically-intact individuals. In short, we argue that her ability to grasp objects depends on visual feedforward processing carried out by visuomotor networks in her dorsal stream that function in much the same way as they do in neurologically
Parametric time-frequency domain spatial audio

CERN Document Server

Delikaris-Manias, Symeon; Politis, Archontis

2018-01-01

This book provides readers with the principles and best practices in spatial audio signal processing. It describes how sound fields and their perceptual attributes are captured and analyzed within the time-frequency domain, how essential representation parameters are coded, and how such signals are efficiently reproduced for practical applications. The book is split into four parts starting with an overview of the fundamentals. It then goes on to explain the reproduction of spatial sound before offering an examination of signal-dependent spatial filtering. The book finishes with coverage of both current and future applications and the direction that spatial audio research is heading in. Parametric Time-frequency Domain Spatial Audio focuses on applications in entertainment audio, including music, home cinema, and gaming--covering the capturing and reproduction of spatial sound as well as its generation, transduction, representation, transmission, and perception. This book will teach readers the tools needed...
Audio-Visual Speech Perception: A Developmental ERP Investigation

Science.gov (United States)

Knowland, Victoria C. P.; Mercure, Evelyne; Karmiloff-Smith, Annette; Dick, Fred; Thomas, Michael S. C.

2014-01-01

Being able to see a talking face confers a considerable advantage for speech perception in adulthood. However, behavioural data currently suggest that children fail to make full use of these available visual speech cues until age 8 or 9. This is particularly surprising given the potential utility of multiple informational cues during language…
Predicting the Overall Spatial Quality of Automotive Audio Systems

Science.gov (United States)

Koya, Daisuke

The spatial quality of automotive audio systems is often compromised due to their unideal listening environments. Automotive audio systems need to be developed quickly due to industry demands. A suitable perceptual model could evaluate the spatial quality of automotive audio systems with similar reliability to formal listening tests but take less time. Such a model is developed in this research project by adapting an existing model of spatial quality for automotive audio use. The requirements for the adaptation were investigated in a literature review. A perceptual model called QESTRAL was reviewed, which predicts the overall spatial quality of domestic multichannel audio systems. It was determined that automotive audio systems are likely to be impaired in terms of the spatial attributes that were not considered in developing the QESTRAL model, but metrics are available that might predict these attributes. To establish whether the QESTRAL model in its current form can accurately predict the overall spatial quality of automotive audio systems, MUSHRA listening tests using headphone auralisation with head tracking were conducted to collect results to be compared against predictions by the model. Based on guideline criteria, the model in its current form could not accurately predict the overall spatial quality of automotive audio systems. To improve prediction performance, the QESTRAL model was recalibrated and modified using existing metrics of the model, those that were proposed from the literature review, and newly developed metrics. The most important metrics for predicting the overall spatial quality of automotive audio systems included those that were interaural cross-correlation (IACC) based, relate to localisation of the frontal audio scene, and account for the perceived scene width in front of the listener. Modifying the model for automotive audio systems did not invalidate its use for domestic audio systems. The resulting model predicts the overall spatial
Energy content of suspended detritus from Arabian Sea

Digital Repository Service at National Institute of Oceanography (India)

Krishnakumari, L.; Sumitra-Vijayaraghavan; Royan, J

When pictures waste a thousand words: analysis of the 2009 H1N1 pandemic on television news.

Science.gov (United States)

Luth, Westerly; Jardine, Cindy; Bubela, Tania

2013-01-01

Effective communication by public health agencies during a pandemic promotes the adoption of recommended health behaviours. However, more information is not always the solution. Rather, attention must be paid to how information is communicated. Our study examines the television news, which combines video and audio content. We analyse (1) the content of television news about the H1N1 pandemic and vaccination campaign in Alberta, Canada; (2) the extent to which television news content conveyed key public health agency messages; (3) the extent of discrepancies in audio versus visual content. We searched for "swine flu" and "H1N1" in local English news broadcasts from the CTV online video archive. We coded the audio and visual content of 47 news clips during the peak period of coverage from April to November 2009 and identified discrepancies between audio and visual content. The dominant themes on CTV news were the vaccination rollout, vaccine shortages, long line-ups (queues) at vaccination clinics and defensive responses by public health officials. There were discrepancies in the priority groups identified by the provincial health agency (Alberta Health and Wellness) and television news coverage as well as discrepancies between audio and visual content of news clips. Public health officials were presented in official settings rather than as public health practitioners. The news footage did not match the main public health messages about risk levels and priority groups. Public health agencies lost control of their message as the media focused on failures in the rollout of the vaccination campaign. Spokespeople can enhance their local credibility by emphasizing their role as public health practitioners. Public health agencies need to learn from the H1N1 pandemic so that future television communications do not add to public confusion, demonstrate bureaucratic ineffectiveness and contribute to low vaccination rates.
Editing Audio with Audacity

Directory of Open Access Journals (Sweden)

Brandon Walsh

2016-08-01

Full Text Available For those interested in audio, basic sound editing skills go a long way. Being able to handle and manipulate the materials can help you take control of your object of study: you can zoom in and extract particular moments to analyze, process the audio, and upload the materials to a server to complement a blog post on the topic. On a more practical level, these skills could also allow you to record and package recordings of yourself or others for distribution. That guest lecture taking place in your department? Record it and edit it yourself! Doing so is a lightweight way to distribute resources among various institutions, and it also helps make the materials more accessible for readers and listeners with a wide variety of learning needs. In this lesson you will learn how to use Audacity to load, record, edit, mix, and export audio files. Sound editing platforms are often expensive and offer extensive capabilities that can be overwhelming to the first-time user, but Audacity is a free and open source alternative that offers powerful capabilities for sound editing with a low barrier for entry. For this lesson we will work with two audio files: a recording of Bach’s Goldberg Variations available from MusOpen and another recording of your own voice that will be made in the course of the lesson. This tutorial uses Audacity 2.1.2, released January 2016.
Tourism research and audio methods

DEFF Research Database (Denmark)

Jensen, Martin Trandberg

2016-01-01

• Audio methods enrich sensuous tourism ethnographies. • The note suggests five research avenues for future auditory scholarship. • Sensuous tourism research has neglected the role of sounds in embodied tourism experiences.
Newnes audio and Hi-Fi engineer's pocket book

CERN Document Server

Capel, Vivian

2013-01-01

Newnes Audio and Hi-Fi Engineer's Pocket Book, Second Edition provides concise discussion of several audio topics. The book is comprised of 10 chapters that cover different audio equipment. The coverage of the text includes microphones, gramophones, compact discs, and tape recorders. The book also covers high-quality radio, amplifiers, and loudspeakers. The book then reviews the concepts of sound and acoustics, and presents some facts and formulas relevant to audio. The text will be useful to sound engineers and other professionals whose work involves sound systems.
DAFX Digital Audio Effects

CERN Document Server

Zö

2011-01-01

The rapid development in various fields of Digital Audio Effects, or DAFX, has led to new algorithms and this second edition of the popular book, DAFX: Digital Audio Effects has been updated throughout to reflect progress in the field. It maintains a unique approach to DAFX with a lecture-style introduction into the basics of effect processing. Each effect description begins with the presentation of the physical and acoustical phenomena, an explanation of the signal processing techniques to achieve the effect, followed by a discussion of musical applications and the control of effect parameter
Accumulation and Decay of Visual Capture and the Ventriloquism Aftereffect Caused by Brief Audio-Visual Disparities

Science.gov (United States)

Bosen, Adam K.; Fleming, Justin T.; Allen, Paul D.; O’Neill, William E.; Paige, Gary D.

2016-01-01

Visual capture and the ventriloquism aftereffect resolve spatial disparities of incongruent auditory-visual (AV) objects by shifting auditory spatial perception to align with vision. Here, we demonstrated the distinct temporal characteristics of visual capture and the ventriloquism aftereffect in response to brief AV disparities. In a set of experiments, subjects localized either the auditory component of AV targets (A within AV) or a second sound presented at varying delays (1-20s) after AV exposure (A2 after AV). AV targets were trains of brief presentations (1 or 20), covering a ±30° azimuthal range, and with ±8° (R or L) disparity. We found that the magnitude of visual capture generally reached its peak within a single AV pair and did not dissipate with time, while the ventriloquism aftereffect accumulated with repetitions of AV pairs and dissipated with time. Additionally, the magnitude of the auditory shift induced by each phenomenon was uncorrelated across listeners and visual capture was unaffected by subsequent auditory targets, indicating that visual capture and the ventriloquism aftereffect are separate mechanisms with distinct effects on auditory spatial perception. Our results indicate that visual capture is a ‘sample-and-hold’ process that binds related objects and stores the combined percept in memory, whereas the ventriloquism aftereffect is a ‘leaky integrator’ process that accumulates with experience and decays with time to compensate for cross-modal disparities. PMID:27837258
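The contrast drawn in the abstract above between a ‘sample-and-hold’ process and a ‘leaky integrator’ can be illustrated with a toy simulation. This is only a sketch of the two generic update rules, not the authors' model: the function names, the `gain`, `rate` and `leak` values, and the discrete time step are all assumptions chosen for illustration.

```python
def sample_and_hold(disparities, gain=0.8):
    """Toy 'sample-and-hold': the shift jumps to a value set by the latest
    AV disparity and stays there until another AV pair arrives."""
    shift = 0.0
    trace = []
    for d in disparities:
        if d is not None:       # an AV pair was presented on this step
            shift = gain * d    # sample: take the new value at once
        trace.append(shift)     # hold: no decay between presentations
    return trace


def leaky_integrator(disparities, rate=0.1, leak=0.05):
    """Toy 'leaky integrator': the shift builds up over repeated AV pairs
    and decays on every time step."""
    shift = 0.0
    trace = []
    for d in disparities:
        shift *= (1.0 - leak)             # leak: decay with time
        if d is not None:
            shift += rate * (d - shift)   # accumulate toward the disparity
        trace.append(shift)
    return trace


# Hypothetical exposure: 20 steps with an 8 degree disparity, then 20 silent steps.
exposure = [8.0] * 20 + [None] * 20
held = sample_and_hold(exposure)
leaky = leaky_integrator(exposure)
```

Under these assumed parameters the held trace is at its full value after the first AV pair and does not change afterwards, while the leaky trace grows across the 20 repetitions and then falls back toward zero, which mirrors the qualitative pattern the abstract reports.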
AUTOMATIC SEGMENTATION OF BROADCAST AUDIO SIGNALS USING AUTO ASSOCIATIVE NEURAL NETWORKS

Directory of Open Access Journals (Sweden)

P. Dhanalakshmi

2010-12-01

Full Text Available In this paper, we describe automatic segmentation methods for audio broadcast data. Today, digital audio applications are part of our everyday lives. Since there are more and more digital audio databases in place these days, the importance of effective management of audio databases has become prominent. Broadcast audio data is recorded from television and comprises various categories of audio signals. Efficient algorithms for segmenting the audio broadcast data into predefined categories are proposed. Audio features, namely linear prediction coefficients (LPC), linear prediction cepstral coefficients, and Mel frequency cepstral coefficients (MFCC), are extracted to characterize the audio data. Auto Associative Neural Networks are used to segment the audio data into predefined categories using the extracted features. Experimental results indicate that the proposed algorithms can produce satisfactory results.
Patient DF's visual brain in action: Visual feedforward control in visual form agnosia.

Science.gov (United States)

Whitwell, Robert L; Milner, A David; Cavina-Pratesi, Cristiana; Barat, Masihullah; Goodale, Melvyn A

2015-05-01

Patient DF, who developed visual form agnosia following ventral-stream damage, is unable to discriminate the width of objects, performing at chance, for example, when asked to open her thumb and forefinger a matching amount. Remarkably, however, DF adjusts her hand aperture to accommodate the width of objects when reaching out to pick them up (grip scaling). While this spared ability to grasp objects is presumed to be mediated by visuomotor modules in her relatively intact dorsal stream, it is possible that it may rely abnormally on online visual or haptic feedback. We report here that DF's grip scaling remained intact when her vision was completely suppressed during grasp movements, and it still dissociated sharply from her poor perceptual estimates of target size. We then tested whether providing trial-by-trial haptic feedback after making such perceptual estimates might improve DF's performance, but found that they remained significantly impaired. In a final experiment, we re-examined whether DF's grip scaling depends on receiving veridical haptic feedback during grasping. In one condition, the haptic feedback was identical to the visual targets. In a second condition, the haptic feedback was of a constant intermediate width while the visual target varied trial by trial. Despite this incongruent feedback, DF still scaled her grip aperture to the visual widths of the target blocks, showing only normal adaptation to the false haptically-experienced width. Taken together, these results strengthen the view that DF's spared grasping relies on a normal mode of dorsal-stream functioning, based chiefly on visual feedforward processing. Copyright © 2014 Elsevier B.V. All rights reserved.
The Vibe: A Versatile Vision-to-Audition Sensory Substitution Device

Directory of Open Access Journals (Sweden)

Sylvain Hanneton

2010-01-01

Full Text Available We describe a sensory substitution scheme that converts a video stream into an audio stream in real-time. It was initially developed as a research tool for studying human ability to learn new ways of perceiving the world: the Vibe can give us the ability to learn a kind of ‘vision’ by audition. It converts a video stream into a continuous stereophonic audio signal that conveys information coded from the video stream. The conversion from the video stream to the audio stream uses a kind of retina with receptive fields. Each receptive field controls a sound source and the user listens to a sound that is a mixture of all these sound sources. Compared to other existing vision-to-audition sensory substitution devices, the Vibe is highly versatile in particular because it uses a set of configurable units working in parallel. In order to demonstrate the validity and interest of this method of vision to audition conversion, we give the results of an experiment involving a pointing task to targets memorised through visual perception or through their auditory conversion by the Vibe. This article is also an opportunity to precisely draw the general specifications of this scheme in order to prepare its implementation on an autonomous/mobile hardware.
47 CFR 10.520 - Common audio attention signal.

Science.gov (United States)

2010-10-01

... 47 Telecommunication 1 2010-10-01 2010-10-01 false Common audio attention signal. 10.520 Section... Equipment Requirements § 10.520 Common audio attention signal. A Participating CMS Provider and equipment manufacturers may only market devices for public use under part 10 that include an audio attention signal that...
Audio Recording of Children with Dyslalia

Directory of Open Access Journals (Sweden)

Stefan Gheorghe Pentiuc

2008-01-01

Full Text Available In this paper we present our research on the automatic parsing of audio recordings. These recordings are obtained from children with dyslalia and are necessary for an accurate identification of speech problems. We develop a software application that helps parse audio recordings in real time.
Audio Journal in an ELT Context

Directory of Open Access Journals (Sweden)

Neşe Aysin Siyli

2012-09-01

Full Text Available It is widely acknowledged that one of the most serious problems students of English as a foreign language face is their deprivation of practicing the language outside the classroom. Generally, the classroom is the sole environment where they can practice English, which by its nature does not provide rich setting to help students develop their competence by putting the language into practice. Motivated by this need, this descriptive study investigated the impact of audio dialog journals on students’ speaking skills. It also aimed to gain insights into students’ and teacher’s opinions on keeping audio dialog journals outside the class. The data of the study developed from student and teacher audio dialog journals, student written feedbacks, interviews held with the students, and teacher observations. The descriptive analysis of the data revealed that audio dialog journals served a number of functions ranging from cognitive to linguistic, from pedagogical to psychological, and social. The findings and pedagogical implications of the study are discussed in detail.

Virtual Microphones for Multichannel Audio Resynthesis

Directory of Open Access Journals (Sweden)

Athanasios Mouchtaris

2003-09-01

Full Text Available Multichannel audio offers significant advantages for music reproduction, including the ability to provide better localization and envelopment, as well as reduced imaging distortion. On the other hand, multichannel audio is a demanding media type in terms of transmission requirements. Often, bandwidth limitations prohibit transmission of multiple audio channels. In such cases, an alternative is to transmit only one or two reference channels and recreate the rest of the channels at the receiving end. Here, we propose a system capable of synthesizing the required signals from a smaller set of signals recorded in a particular venue. These synthesized “virtual” microphone signals can be used to produce multichannel recordings that accurately capture the acoustics of that venue. Applications of the proposed system include transmission of multichannel audio over the current Internet infrastructure and, as an extension of the methods proposed here, remastering existing monophonic and stereophonic recordings for multichannel rendering.
Identification of Values of Ornaments in Indonesian Batik in Visual Content of Nitiki Game

Directory of Open Access Journals (Sweden)

Chandra Tresnadi

2015-08-01

Full Text Available Batik is a form of visual art on textile materials produced using traditional drawing techniques originating from Indonesia. For the Javanese, batik is a traditional cloth integral to their cultural identity. Visuals on ornaments of batik cloths illustrate the life sayings and values upon which the life of the community is laid. The study focuses on identifying the values found in Indonesian batik ornaments which are adapted as visual content on the Nitiki game. The findings are then used to reconstruct the values that represent the real batik culture. This study employs the qualitative descriptive method by collecting dozens of batik ornaments on the Nitiki game, exploring the values mentioned in literature, sorting out the dominant values, and reconstructing them. The findings suggest that the values found in Indonesian batik ornaments in the Nitiki game clearly show the patterns of how traditional culture of batik survives and thrives in Indonesian society, as well as show the flexibility of batik against the current development of modern culture, including its integration as culture-based content in interactive media. This study contributes to the dissertation research on aesthetical interaction in cultural content-based game.
More than a House of Cards: Developing a Firm Foundation for Streaming Media and Consumer-Licensed Content in the Library

Directory of Open Access Journals (Sweden)

William Cross

2016-09-01

Full Text Available This article will introduce traditional library practice for licensing multimedia content and discuss the way that consumer-licensing and streaming services disrupt that practice. Sections II and III describe the statutory copyright regime designed by Congress to facilitate the socially-valuable work done by libraries and the impact of the move from ownership to licensed content. Collecting multimedia materials has always presented special legal challenges for libraries, particularly as licensed content has replaced the traditional practice of purchasing and circulation based on the first sale doctrine. These issues have grown even more complex as streaming services like Netflix and Amazon and video game downloads through services like Steam have come to dominate the landscape. Section IV will describe the way that consumer-licensed materials, which not only remove the ownership that undergirds library practice, but also the ability to negotiate for library use, imperil the congressionally-designed balance. Section V will present a path forward for libraries to develop robust, cutting-edge collections that reflect a sophisticated understanding of the contractual and copyright issues at play.
Conditioning Influences Audio-Visual Integration by Increasing Sound Saliency

Directory of Open Access Journals (Sweden)

Fabrizio Leo

2011-10-01

Full Text Available We investigated the effect of prior conditioning of an auditory stimulus on audiovisual integration in a series of four psychophysical experiments. The experiments factorially manipulated the conditioning procedure (picture vs monetary conditioning) and multisensory paradigm (2AFC visual detection vs redundant target paradigm). In the conditioning sessions, subjects were presented with three pure tones (= conditioned stimulus, CS) that were paired with neutral, positive, or negative unconditioned stimuli (US; monetary: +50 euro cents, –50 cents, 0 cents; pictures: highly pleasant, unpleasant, and neutral IAPS). In a 2AFC visual selective attention paradigm, detection of near-threshold Gabors was improved by concurrent sounds that had previously been paired with a positive (monetary) or negative (picture) outcome relative to neutral sounds. In the redundant target paradigm, sounds previously paired with positive (monetary) or negative (picture) outcomes increased response speed to both auditory and audiovisual targets similarly. Importantly, prior conditioning did not increase the multisensory response facilitation (i.e., (A + V)/2 – AV) or the race model violation. Collectively, our results suggest that prior conditioning primarily increases the saliency of the auditory stimulus per se rather than influencing audiovisual integration directly. In turn, conditioned sounds are rendered more potent for increasing response accuracy or speed in detection of visual targets.
Music Genre Classification Using MIDI and Audio Features

Science.gov (United States)

Cataltepe, Zehra; Yaslan, Yusuf; Sonmez, Abdullah

2007-12-01

We report our findings on using MIDI files and audio features from MIDI, separately and combined together, for MIDI music genre classification. We use McKay and Fujinaga's 3-root and 9-leaf genre data set. In order to compute distances between MIDI pieces, we use normalized compression distance (NCD). NCD uses the compressed length of a string as an approximation to its Kolmogorov complexity and has previously been used for music genre and composer clustering. We convert the MIDI pieces to audio and then use the audio features to train different classifiers. MIDI and audio from MIDI classifiers alone achieve much lower accuracies than those reported by McKay and Fujinaga, who used not NCD but a number of domain-based MIDI features for their classification. Combining MIDI and audio from MIDI classifiers improves accuracy, yielding accuracies that are closer to, but still worse than, McKay and Fujinaga's. The best root genre accuracies achieved using MIDI, audio, and the combination of them are 0.75, 0.86, and 0.93, respectively, compared to 0.98 of McKay and Fujinaga. Successful classifier combination requires diversity of the base classifiers. We achieve diversity by using a certain number of seconds of the MIDI file, different sample rates and sizes for the audio file, and different classification algorithms.
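The normalized compression distance mentioned in the abstract above has a standard definition, NCD(x, y) = (C(xy) − min(C(x), C(y))) / max(C(x), C(y)), where C(·) is the compressed length. A minimal sketch follows, assuming `zlib` as the compressor and raw bytes as input; the abstract does not say which compressor or MIDI encoding the authors used, so both choices and the function names are illustrative assumptions.

```python
import zlib


def compressed_len(data: bytes) -> int:
    """Length in bytes of the zlib-compressed input (stand-in for C(x))."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance between two byte strings.

    Values near 0 suggest the inputs share most of their structure;
    values near 1 suggest they share very little.
    """
    cx = compressed_len(x)
    cy = compressed_len(y)
    cxy = compressed_len(x + y)   # compress the concatenation
    return (cxy - min(cx, cy)) / max(cx, cy)
```

In a setting like the one described, `x` and `y` could be the raw bytes of two MIDI files, and the resulting pairwise distances could feed a nearest-neighbour classifier or a clustering step. Note that zlib's 32 KB window limits how much shared structure it can detect in long inputs.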
High contents of rare earth elements (REEs) in stream waters of a Cu-Pb-Zn mining area.

Science.gov (United States)

Protano, G; Riccobono, F

2002-01-01

Stream waters draining an old mining area present very high rare earth element (REE) contents, reaching 928 µg/l as the maximum total value (ΣREE). The middle rare earth elements (MREEs) are usually enriched with respect to both the light (LREEs) and heavy (HREEs) elements of this group, producing a characteristic "roof-shaped" pattern of the Post-Archean Australian Shales-normalized concentrations. At the Fenice Capanne Mine (FCM), the most important base metal mine of the study area, the REE source coincides with the mine tailings, mostly the oldest ones composed of iron-rich materials. The geochemical history of the REEs released into Noni stream from wastes in the FCM area is strictly determined by the pH, which controls the REE speciation and in-stream processes. The formation of Al-rich and mainly Fe-rich flocs effectively scavenges the REEs, which are readily and drastically removed from the solution when the pH approaches neutrality. Leaching experiments performed on flocs and waste materials demonstrate that Fe-oxides/oxyhydroxides play a key role in the release of lanthanide elements into stream waters. The origin of the "roof-shaped" REE distribution pattern as well as the peculiar geochemical behavior of some lanthanide elements in the aqueous system are discussed.
The effect of early visual deprivation on the neural bases of multisensory processing

OpenAIRE

Guerreiro, Maria J. S.; Putzar, Lisa; Röder, Brigitte

2015-01-01

Animal studies have shown that congenital visual deprivation reduces the ability of neurons to integrate cross-modal inputs. Guerreiro et al. reveal that human patients who suffer transient congenital visual deprivation because of cataracts lack multisensory integration in auditory and multisensory areas as adults, and suppress visual processing during audio-visual stimulation.
Visual information constrains early and late stages of spoken-word recognition in sentence context.

Science.gov (United States)

Brunellière, Angèle; Sánchez-García, Carolina; Ikumi, Nara; Soto-Faraco, Salvador

2013-07-01

Audiovisual speech perception has been frequently studied considering phoneme, syllable and word processing levels. Here, we examined the constraints that visual speech information might exert during the recognition of words embedded in a natural sentence context. We recorded event-related potentials (ERPs) to words that could be either strongly or weakly predictable on the basis of the prior semantic sentential context and, whose initial phoneme varied in the degree of visual saliency from lip movements. When the sentences were presented audio-visually (Experiment 1), words weakly predicted from semantic context elicited a larger long-lasting N400, compared to strongly predictable words. This semantic effect interacted with the degree of visual saliency over a late part of the N400. When comparing audio-visual versus auditory alone presentation (Experiment 2), the typical amplitude-reduction effect over the auditory-evoked N100 response was observed in the audiovisual modality. Interestingly, a specific benefit of high- versus low-visual saliency constraints occurred over the early N100 response and at the late N400 time window, confirming the result of Experiment 1. Taken together, our results indicate that the saliency of visual speech can exert an influence over both auditory processing and word recognition at relatively late stages, and thus suggest strong interactivity between audio-visual integration and other (arguably higher) stages of information processing during natural speech comprehension. Copyright © 2013 Elsevier B.V. All rights reserved.
Visual short-term memory: activity supporting encoding and maintenance in retinotopic visual cortex.

Science.gov (United States)

Sneve, Markus H; Alnæs, Dag; Endestad, Tor; Greenlee, Mark W; Magnussen, Svein

2012-10-15

Recent studies have demonstrated that retinotopic cortex maintains information about visual stimuli during retention intervals. However, the process by which transient stimulus-evoked sensory responses are transformed into enduring memory representations is unknown. Here, using fMRI and short-term visual memory tasks optimized for univariate and multivariate analysis approaches, we report differential involvement of human retinotopic areas during memory encoding of the low-level visual feature orientation. All visual areas show weaker responses when memory encoding processes are interrupted, possibly due to effects in orientation-sensitive primary visual cortex (V1) propagating across extrastriate areas. Furthermore, intermediate areas in both dorsal (V3a/b) and ventral (LO1/2) streams are significantly more active during memory encoding compared with non-memory (active and passive) processing of the same stimulus material. These effects in intermediate visual cortex are also observed during memory encoding of a different stimulus feature (spatial frequency), suggesting that these areas are involved in encoding processes on a higher level of representation. Using pattern-classification techniques to probe the representational content in visual cortex during delay periods, we further demonstrate that simply initiating memory encoding is not sufficient to produce long-lasting memory traces. Rather, active maintenance appears to underlie the observed memory-specific patterns of information in retinotopic cortex. Copyright © 2012 Elsevier Inc. All rights reserved.
Realtime Audio with Garbage Collection

OpenAIRE

Matheussen, Kjetil Svalastog

2010-01-01

Two non-moving concurrent garbage collectors tailored for realtime audio processing are described. Both collectors work on copies of the heap to avoid cache misses and audio-disruptive synchronizations. Both collectors are targeted at multiprocessor personal computers. The first garbage collector works in uncooperative environments, and can replace Hans Boehm's conservative garbage collector for C and C++. The collector does not access the virtual memory system. Neither doe...
Experimental determination of the empirical formula and energy content of unknown organics in waste streams

Energy Technology Data Exchange (ETDEWEB)

Shizas, I. [Univ. of Toronto, Dept. of Civil Engineering, Toronto, Ontario (Canada); Kosmatos, A. [Ontario Power Generation, Toronto, Ontario (Canada); Bagley, D.M. [Univ. of Toronto, Dept. of Civil Engineering, Toronto, Ontario (Canada)

2002-06-15

Two experimental methods are described in this paper: one for determining the empirical formula, and one for determining the energy content of unknown organics in waste streams. The empirical formula method requires volatile solids (VS), chemical oxygen demand (COD), total organic carbon (TOC), and total Kjeldahl nitrogen (TKN) to be measured for the waste; the formula can then be calculated from these values. To determine the energy content of the organic waste, bomb calorimetry was used with benzoic acid as a combustion aid. The results for standard compounds (glucose, propionic acid, L-arginine, and benzoic acid) were relatively good. The energy content measurement for wastewater and sludges had good reproducibility (i.e. 1.0 to 3.2% relative standard deviation for triplicate samples). Trouble encountered in the measurement of the empirical formulae of the waste samples was possibly due to difficulties with the TOC test; further analysis of this is required. (author)
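The abstract states that the empirical formula can be calculated from VS, COD, TOC and TKN but does not give the calculation. The following is a minimal sketch of one way such a calculation could be set up, not the authors' published procedure. It assumes that the volatile solids consist only of C, H, O and N, that TKN represents organic nitrogen only, and that nitrogen ends up as ammonia in the COD test, so that one mole of C_n H_a O_b N_c consumes (n + a/4 - b/2 - 3c/4) moles of O2. The function name `empirical_formula` and the helper `per_carbon` are hypothetical.

```python
# Approximate atomic masses in g/mol.
C, H, O, N = 12.011, 1.008, 15.999, 14.007


def empirical_formula(vs, cod, toc, tkn):
    """Estimate (n, a, b, c) for C_n H_a O_b N_c from four measurements.

    All inputs must share one mass-based unit (for example mg/L):
    vs  = volatile solids, cod = chemical oxygen demand (as O2),
    toc = total organic carbon (as C), tkn = Kjeldahl nitrogen (as N).
    Assumption: VS contains only C, H, O and N, and N leaves as NH3.
    """
    n = toc / C                  # moles of carbon
    c = tkn / N                  # moles of nitrogen
    o2 = cod / (2 * O)           # moles of O2 consumed in the COD test
    q = o2 - n + 0.75 * c        # equals a/4 - b/2 from the oxidation balance
    m = vs - toc - tkn           # mass of hydrogen plus oxygen
    b = (m - 4 * q * H) / (2 * H + O)   # moles of oxygen
    a = 4 * q + 2 * b                   # moles of hydrogen
    return n, a, b, c


def per_carbon(formula):
    """Scale the subscripts so that carbon has subscript 1."""
    n = formula[0]
    return tuple(x / n for x in formula)


if __name__ == "__main__":
    # Synthetic check with glucose, C6H12O6: 6 mol O2 per mol.
    vs = 6 * C + 12 * H + 6 * O
    print(empirical_formula(vs, cod=6 * 2 * O, toc=6 * C, tkn=0.0))
```

Because TOC and TKN fix the carbon and nitrogen subscripts directly, the hydrogen and oxygen subscripts follow from two linear equations (the mass balance on VS and the oxygen balance on COD). Under these assumptions, an error in the TOC measurement propagates into every subscript.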
Experimental determination of the empirical formula and energy content of unknown organics in waste streams

International Nuclear Information System (INIS)

Shizas, I.; Kosmatos, A.; Bagley, D.M.

2002-01-01

Two experimental methods are described in this paper: one for determining the empirical formula, and one for determining the energy content of unknown organics in waste streams. The empirical formula method requires volatile solids (VS), chemical oxygen demand (COD), total organic carbon (TOC), and total Kjeldahl nitrogen (TKN) to be measured for the waste; the formula can then be calculated from these values. To determine the energy content of the organic waste, bomb calorimetry was used with benzoic acid as a combustion aid. The results for standard compounds (glucose, propionic acid, L-arginine, and benzoic acid) were relatively good. The energy content measurement for wastewater and sludges had good reproducibility (i.e. 1.0 to 3.2% relative standard deviation for triplicate samples). Trouble encountered in the measurement of the empirical formulae of the waste samples was possibly due to difficulties with the TOC test; further analysis of this is required. (author)
Defining the cortical visual systems: "what", "where", and "how"

Science.gov (United States)

Creem, S. H.; Proffitt, D. R.; Kaiser, M. K. (Principal Investigator)

2001-01-01

The visual system historically has been defined as consisting of at least two broad subsystems subserving object and spatial vision. These visual processing streams have been organized both structurally as two distinct pathways in the brain, and functionally for the types of tasks that they mediate. The classic definition by Ungerleider and Mishkin labeled a ventral "what" stream to process object information and a dorsal "where" stream to process spatial information. More recently, Goodale and Milner redefined the two visual systems with a focus on the different ways in which visual information is transformed for different goals. They relabeled the dorsal stream as a "how" system for transforming visual information using an egocentric frame of reference in preparation for direct action. This paper reviews recent research from psychophysics, neurophysiology, neuropsychology and neuroimaging to define the roles of the ventral and dorsal visual processing streams. We discuss a possible solution that allows for both "where" and "how" systems that are functionally and structurally organized within the posterior parietal lobe.
CERN automatic audio-conference service

International Nuclear Information System (INIS)

Sierra Moral, Rodrigo

2010-01-01

Scientists from all over the world need to collaborate with CERN on a daily basis. They must be able to communicate effectively on their joint projects at any time; as a result telephone conferences have become indispensable and widely used. Managed by 6 operators, CERN already has more than 20000 hours and 5700 audio-conferences per year. However, the traditional telephone based audio-conference system needed to be modernized in three ways. Firstly, to provide the participants with more autonomy in the organization of their conferences; secondly, to eliminate the constraints of manual intervention by operators; and thirdly, to integrate the audio-conferences into a collaborative working framework. The large number, and hence cost, of the conferences prohibited externalization and so the CERN telecommunications team drew up a specification to implement a new system. It was decided to use a new commercial collaborative audio-conference solution based on the SIP protocol. The system was tested as the first European pilot and several improvements (such as billing, security, redundancy...) were implemented based on CERN's recommendations. The new automatic conference system has been operational since the second half of 2006. It is very popular for the users and has doubled the number of conferences in the past two years.
CERN automatic audio-conference service

Energy Technology Data Exchange (ETDEWEB)

Sierra Moral, Rodrigo, E-mail: Rodrigo.Sierra@cern.c [CERN, IT Department 1211 Geneva-23 (Switzerland)

2010-04-01

Scientists from all over the world need to collaborate with CERN on a daily basis. They must be able to communicate effectively on their joint projects at any time; as a result telephone conferences have become indispensable and widely used. Managed by 6 operators, CERN already has more than 20000 hours and 5700 audio-conferences per year. However, the traditional telephone based audio-conference system needed to be modernized in three ways. Firstly, to provide the participants with more autonomy in the organization of their conferences; secondly, to eliminate the constraints of manual intervention by operators; and thirdly, to integrate the audio-conferences into a collaborative working framework. The large number, and hence cost, of the conferences prohibited externalization and so the CERN telecommunications team drew up a specification to implement a new system. It was decided to use a new commercial collaborative audio-conference solution based on the SIP protocol. The system was tested as the first European pilot and several improvements (such as billing, security, redundancy...) were implemented based on CERN's recommendations. The new automatic conference system has been operational since the second half of 2006. It is very popular for the users and has doubled the number of conferences in the past two years.
CERN automatic audio-conference service

Science.gov (United States)

Sierra Moral, Rodrigo

2010-04-01

Scientists from all over the world need to collaborate with CERN on a daily basis. They must be able to communicate effectively on their joint projects at any time; as a result telephone conferences have become indispensable and widely used. Managed by 6 operators, CERN already has more than 20000 hours and 5700 audio-conferences per year. However, the traditional telephone based audio-conference system needed to be modernized in three ways. Firstly, to provide the participants with more autonomy in the organization of their conferences; secondly, to eliminate the constraints of manual intervention by operators; and thirdly, to integrate the audio-conferences into a collaborative working framework. The large number, and hence cost, of the conferences prohibited externalization and so the CERN telecommunications team drew up a specification to implement a new system. It was decided to use a new commercial collaborative audio-conference solution based on the SIP protocol. The system was tested as the first European pilot and several improvements (such as billing, security, redundancy...) were implemented based on CERN's recommendations. The new automatic conference system has been operational since the second half of 2006. It is very popular for the users and has doubled the number of conferences in the past two years.
Near-field Localization of Audio

DEFF Research Database (Denmark)

Jensen, Jesper Rindom; Christensen, Mads Græsbøll

2014-01-01

Localization of audio sources using microphone arrays has been an important research problem for more than two decades. Many traditional methods for solving the problem are based on a two-stage procedure: first, information about the audio source, such as time differences-of-arrival (TDOAs) and gain ratios-of-arrival (GROAs) between microphones is estimated, and, second, this knowledge is used to localize the audio source. These methods often have a low computational complexity, but this comes at the cost of a limited estimation accuracy. Therefore, we propose a new localization approach, where the desired signal is modeled using TDOAs and GROAs, which are determined by the source location. This facilitates the derivation of one-stage, maximum likelihood methods under a white Gaussian noise assumption that is applicable in both near- and far-field scenarios. Simulations show...
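To illustrate the first stage of the traditional two-stage procedure mentioned in this abstract, the sketch below estimates a TDOA and a gain ratio between two microphone signals. It is not the one-stage maximum likelihood method the authors propose. It uses a plain cross-correlation peak search for the delay and an energy ratio for the gain, which is a common textbook choice assumed here for illustration. The function names `estimate_tdoa` and `estimate_gain_ratio` are hypothetical.

```python
import math


def estimate_tdoa(ref, other, fs, max_lag):
    """Return the delay of `other` relative to `ref` in seconds.

    The delay is the lag (in samples) that maximizes the
    cross-correlation of the two signals, divided by the sample rate fs.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, x in enumerate(ref):
            j = i + lag
            if 0 <= j < len(other):
                score += x * other[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag / fs


def estimate_gain_ratio(ref, other):
    """Return the amplitude ratio other/ref from the signal energies."""
    e_ref = sum(x * x for x in ref)
    e_other = sum(x * x for x in other)
    return math.sqrt(e_other / e_ref)


if __name__ == "__main__":
    fs = 8000                                      # assumed sample rate in Hz
    ref = [0.0] * 10 + [1.0, 2.0, 1.0] + [0.0] * 10
    # Second microphone: same pulse, 3 samples later and at half amplitude.
    other = [0.0] * 3 + [0.5 * x for x in ref[:-3]]
    print(estimate_tdoa(ref, other, fs, max_lag=8))
    print(estimate_gain_ratio(ref, other))
```

In a two-stage method, estimates like these from several microphone pairs would then be passed to a separate localization step. The abstract's point is that this split limits estimation accuracy.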
Flow visualization

CERN Document Server

Merzkirch, Wolfgang

1974-01-01

Flow Visualization describes the most widely used methods for visualizing flows. Flow visualization evaluates certain properties of a flow field directly accessible to visual perception. Organized into five chapters, this book first presents the methods that create a visible flow pattern that could be investigated by visual inspection, such as simple dye and density-sensitive visualization methods. It then deals with the application of electron beams and streaming birefringence. Optical methods for compressible flows, hydraulic analogy, and high-speed photography are discussed in other cha
Musical Audio Synthesis Using Autoencoding Neural Nets

OpenAIRE

Sarroff, Andy; Casey, Michael A.

2014-01-01

With an optimal network topology and tuning of hyperparameters, artificial neural networks (ANNs) may be trained to learn a mapping from low level audio features to one or more higher-level representations. Such artificial neural networks are commonly used in classification and regression settings to perform arbitrary tasks. In this work we suggest repurposing autoencoding neural networks as musical audio synthesizers. We offer an interactive musical audio synt...
PROGRAMA DE CAPACITACIÓN TECNOLÓGICA PARA PERSONAS CON DISCAPACIDAD VISUAL // TECHNOLOGY TRAINING PROGRAM FOR PEOPLE WITH VISUAL DISABILITY

Directory of Open Access Journals (Sweden)

Ninfa Barón Méndez

2012-12-01

Full Text Available People with disabilities need to rely on technology to carry out daily and work activities effectively and on a par with people without disabilities. In this sense, the objective of this experience was to develop an educational proposal for training people with visual impairments in the use of office automation tools. The program was developed at the request of the staff of the Palavecino Integration Team and was based on documentary monographic research. The program design considered the conditions of people with visual impairments and the conditions the classroom must meet, addresses their special educational needs, and is based on the use of advanced technological tools that enable them to access the curriculum. The PCTecnoVisual was divided into 5 modules: Basic use of computers, Word, Excel, Power Point, Internet browsers and search engines, and has an estimated duration of 81 hours. Facilitators received 21 hours of training; Jaws 9.0 was used as the screen reader and Audio Testi 3.0 to convert the manuals to audio format.
