WorldWideScience

Sample records for video nature scenes

  1. Categorization of natural dynamic audiovisual scenes.

    Directory of Open Access Journals (Sweden)

    Olli Rummukainen

    This work analyzed the perceptual attributes of natural dynamic audiovisual scenes. We presented thirty participants with 19 natural scenes in a similarity categorization task, followed by a semi-structured interview. The scenes were reproduced with an immersive audiovisual display. Natural scene perception has been studied mainly with unimodal settings, which have identified motion as one of the most salient attributes related to visual scenes, and sound intensity along with pitch trajectories related to auditory scenes. However, controlled laboratory experiments with natural multimodal stimuli are still scarce. Our results show that humans pay attention to similar perceptual attributes in natural scenes, and a two-dimensional perceptual map of the stimulus scenes and perceptual attributes was obtained in this work. The exploratory results show the amount of movement, perceived noisiness, and eventfulness of the scene to be the most important perceptual attributes in naturalistically reproduced real-world urban environments. We found the scene gist properties openness and expansion to remain as important factors in scenes with no salient auditory or visual events. We propose that the study of scene perception should move forward to understand better the processes behind multimodal scene processing in real-world environments. We publish our stimulus scenes as spherical video recordings and sound field recordings in a publicly available database.

  2. Presentation of 3D Scenes Through Video Example.

    Science.gov (United States)

    Baldacci, Andrea; Ganovelli, Fabio; Corsini, Massimiliano; Scopigno, Roberto

    2017-09-01

    Using synthetic videos to present a 3D scene is a common requirement for architects, designers, engineers, and Cultural Heritage professionals; however, it is usually time-consuming and, to obtain high-quality results, requires the support of a filmmaker or computer animation expert. We introduce an alternative approach that takes the 3D scene of interest and an example video as input, and automatically produces a video of the input scene that resembles the given video example. In other words, our algorithm allows the user to "replicate" an existing video on a different 3D scene. We build on the intuition that a video sequence of a static environment is strongly characterized by its optical flow, or, in other words, that two videos are similar if their optical flows are similar. We therefore recast the problem as producing a video of the input scene whose optical flow is similar to the optical flow of the input video. Our intuition is supported by a user study specifically designed to verify this statement. We have successfully tested our approach on several scenes and input videos, some of which are reported in the accompanying material of this paper.
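
    The idea that two videos are similar when their optical flows are similar can be illustrated with a simple distance between flow fields. The sketch below is only an assumption about what such a comparison might look like: the abstract does not say which similarity measure the authors use, and the function name `mean_endpoint_error` and the flat list-of-vectors layout are hypothetical choices made for this example.

```python
import math


def mean_endpoint_error(flow_a, flow_b):
    """Average Euclidean distance between corresponding flow vectors.

    Each flow is a flat list of (u, v) displacement pairs, one per pixel.
    Lower values mean the two flow fields are more alike.
    """
    if len(flow_a) != len(flow_b) or not flow_a:
        raise ValueError("flows must be non-empty and of equal length")
    total = 0.0
    for (ua, va), (ub, vb) in zip(flow_a, flow_b):
        total += math.hypot(ua - ub, va - vb)
    return total / len(flow_a)


# Hypothetical two-pixel flow fields, for illustration only.
example_video_flow = [(1.0, 0.0), (0.0, 0.0)]
rendered_video_flow = [(0.0, 0.0), (3.0, 4.0)]
print(mean_endpoint_error(example_video_flow, rendered_video_flow))
```

    Under this assumed measure, a rendering whose flow field gives a smaller value would count as a closer match to the example video.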

  3. Blind prediction of natural video quality.

    Science.gov (United States)

    Saad, Michele A; Bovik, Alan C; Charrier, Christophe

    2014-03-01

    We propose a blind (no reference or NR) video quality evaluation model that is nondistortion specific. The approach relies on a spatio-temporal model of video scenes in the discrete cosine transform domain, and on a model that characterizes the type of motion occurring in the scenes, to predict video quality. We use the models to define video statistics and perceptual features that are the basis of a video quality assessment (VQA) algorithm that does not require the presence of a pristine video to compare against in order to predict a perceptual quality score. The contributions of this paper are threefold. 1) We propose a spatio-temporal natural scene statistics (NSS) model for videos. 2) We propose a motion model that quantifies motion coherency in video scenes. 3) We show that the proposed NSS and motion coherency models are appropriate for quality assessment of videos, and we utilize them to design a blind VQA algorithm that correlates highly with human judgments of quality. The proposed algorithm, called video BLIINDS, is tested on the LIVE VQA database and on the EPFL-PoliMi video database and shown to perform close to the level of top performing reduced and full reference VQA algorithms.

  4. Automatic video surveillance of outdoor scenes using track before detect

    DEFF Research Database (Denmark)

    Hansen, Morten; Sørensen, Helge Bjarup Dissing; Birkemark, Christian M.

    2005-01-01

    This paper concerns automatic video surveillance of outdoor scenes using a single camera. The first step in automatic interpretation of the video stream is activity detection based on background subtraction. Usually, this process will generate a large number of false alarms in outdoor scenes due...
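
    Activity detection by background subtraction, the first step named in this abstract, can be sketched generically as follows. This is a minimal illustration assuming a running-average background model and a fixed threshold; it is not the paper's track-before-detect method, and the function names and the `alpha` and `threshold` values are hypothetical.

```python
def update_background(background, frame, alpha=0.05):
    """Blend the new frame into the background with an exponential running average."""
    return [[(1 - alpha) * b + alpha * f for b, f in zip(brow, frow)]
            for brow, frow in zip(background, frame)]


def foreground_mask(background, frame, threshold=25):
    """Mark pixels whose grey level differs from the background by more than threshold."""
    return [[abs(f - b) > threshold for b, f in zip(brow, frow)]
            for brow, frow in zip(background, frame)]


# Hypothetical 3x3 grey-level frames, for illustration only.
background = [[100.0] * 3 for _ in range(3)]
frame = [row[:] for row in background]
frame[1][1] = 200.0  # a single bright pixel appears

mask = foreground_mask(background, frame)
background = update_background(background, frame)
print(mask)
```

    In outdoor scenes, a plain threshold like this fires on many irrelevant changes, which is the false-alarm problem the abstract refers to.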

  5. Video Scene Parsing with Predictive Feature Learning

    OpenAIRE

    Jin, Xiaojie; Li, Xin; Xiao, Huaxin; Shen, Xiaohui; Lin, Zhe; Yang, Jimei; Chen, Yunpeng; Dong, Jian; Liu, Luoqi; Jie, Zequn; Feng, Jiashi; Yan, Shuicheng

    2016-01-01

    In this work, we address the challenging video scene parsing problem by developing effective representation learning methods given limited parsing annotations. In particular, we contribute two novel methods that constitute a unified parsing framework. (1) Predictive feature learning from nearly unlimited unlabeled video data. Different from existing methods learning features from single frame parsing, we learn spatiotemporal discriminative features by enforcing a parsing network to ...

  6. Realistic generation of natural phenomena based on video synthesis

    Science.gov (United States)

    Wang, Changbo; Quan, Hongyan; Li, Chenhui; Xiao, Zhao; Chen, Xiao; Li, Peng; Shen, Liuwei

    2009-10-01

    Research on the generation of natural phenomena has many applications, including movie special effects, battlefield simulation, and virtual reality. Based on a video synthesis technique, a new approach is proposed for the synthesis of natural phenomena, including flowing water and fire flames. From the fire and flow videos, a seamless video of arbitrary length is generated. Then, the interaction between wind and fire flame is achieved through the skeleton of the flame. Later, the flow is also synthesized by extending the video textures using an edge resample method. Finally, we can integrate the synthesized natural phenomena into a virtual scene.

  7. From Theatre Improvisation To Video Scenes

    DEFF Research Database (Denmark)

    Larsen, Henry; Hvidt, Niels Christian; Friis, Preben

    2018-01-01

    At Sygehus Lillebaelt, a Danish hospital, there has been a focus for several years on patient communication. This paper reflects on a course focusing on engaging with the patient’s existential themes in particular the negotiations around the creation of video scenes. In the initial workshops, w...

  8. Automating the construction of scene classifiers for content-based video retrieval

    NARCIS (Netherlands)

    Khan, L.; Israël, Menno; Petrushin, V.A.; van den Broek, Egon; van der Putten, Peter

    2004-01-01

    This paper introduces a real time automatic scene classifier within content-based video retrieval. In our envisioned approach end users like documentalists, not image processing experts, build classifiers interactively, by simply indicating positive examples of a scene. Classification consists of a

  9. The effects of scene characteristics, resolution, and compression on the ability to recognize objects in video

    Science.gov (United States)

    Dumke, Joel; Ford, Carolyn G.; Stange, Irena W.

    2011-03-01

    Public safety practitioners increasingly use video for object recognition tasks. These end users need guidance regarding how to identify the level of video quality necessary for their application. The quality of video used in public safety applications must be evaluated in terms of its usability for specific tasks performed by the end user. The Public Safety Communication Research (PSCR) project performed a subjective test as one of the first in a series to explore visual intelligibility in video: a user's ability to recognize an object in a video stream given various conditions. The test sought to measure the effects on visual intelligibility of three scene parameters (target size, scene motion, scene lighting), several compression rates, and two resolutions (VGA (640x480) and CIF (352x288)). Seven similarly sized objects were used as targets in nine sets of near-identical source scenes, where each set was created using a different combination of the parameters under study. Viewers were asked to identify the objects via multiple choice questions. Objective measurements were performed on each of the scenes, and the ability of the measurement to predict visual intelligibility was studied.

  10. Audio scene segmentation for video with generic content

    Science.gov (United States)

    Niu, Feng; Goela, Naveen; Divakaran, Ajay; Abdel-Mottaleb, Mohamed

    2008-01-01

    In this paper, we present a content-adaptive audio texture based method to segment video into audio scenes. The audio scene is modeled as a semantically consistent chunk of audio data. Our algorithm is based on "semantic audio texture analysis." At first, we train GMM models for basic audio classes such as speech, music, etc. Then we define the semantic audio texture based on those classes. We study and present two types of scene changes, those corresponding to an overall audio texture change and those corresponding to a special "transition marker" used by the content creator, such as a short stretch of music in a sitcom or silence in dramatic content. Unlike prior work using genre specific heuristics, such as some methods presented for detecting commercials, we adaptively find out if such special transition markers are being used and if so, which of the base classes are being used as markers without any prior knowledge about the content. Our experimental results show that our proposed audio scene segmentation works well across a wide variety of broadcast content genres.

  11. Colour agnosia impairs the recognition of natural but not of non-natural scenes.

    Science.gov (United States)

    Nijboer, Tanja C W; Van Der Smagt, Maarten J; Van Zandvoort, Martine J E; De Haan, Edward H F

    2007-03-01

    Scene recognition can be enhanced by appropriate colour information, yet the level of visual processing at which colour exerts its effects is still unclear. It has been suggested that colour supports low-level sensory processing, while others have claimed that colour information aids semantic categorization and recognition of objects and scenes. We investigated the effect of colour on scene recognition in a case of colour agnosia, M.A.H. In a scene identification task, participants had to name images of natural or non-natural scenes in six different formats. Irrespective of scene format, M.A.H. was much slower on the natural than on the non-natural scenes. As expected, neither M.A.H. nor control participants showed any difference in performance for the non-natural scenes. However, for the natural scenes, appropriate colour facilitated scene recognition in control participants (i.e., shorter reaction times), whereas M.A.H.'s performance did not differ across formats. Our data thus support the hypothesis that the effect of colour occurs at the level of learned associations.

  12. Maxwellian Eye Fixation during Natural Scene Perception

    Directory of Open Access Journals (Sweden)

    Jean Duchesne

    2012-01-01

    When we explore a visual scene, our eyes make saccades to jump rapidly from one area to another and fixate regions of interest to extract useful information. While the role of fixation eye movements in vision has been widely studied, their random nature has been a hitherto neglected issue. Here we conducted two experiments to examine the Maxwellian nature of eye movements during fixation. In Experiment 1, eight participants were asked to perform free viewing of natural scenes displayed on a computer screen while their eye movements were recorded. For each participant, the probability density function (PDF) of eye movement amplitude during fixation obeyed the law established by Maxwell for describing molecule velocity in gas. Only the mean amplitude of eye movements varied with expertise, which was lower in experts than novice participants. In Experiment 2, two participants underwent fixed time, free viewing of natural scenes and of their scrambled version while their eye movements were recorded. Again, the PDF of eye movement amplitude during fixation obeyed Maxwell’s law for each participant and for each scene condition (normal or scrambled). The results suggest that eye fixation during natural scene perception describes a random motion regardless of top-down or of bottom-up processes.

  13. Maxwellian Eye Fixation during Natural Scene Perception

    Science.gov (United States)

    Duchesne, Jean; Bouvier, Vincent; Guillemé, Julien; Coubard, Olivier A.

    2012-01-01

    When we explore a visual scene, our eyes make saccades to jump rapidly from one area to another and fixate regions of interest to extract useful information. While the role of fixation eye movements in vision has been widely studied, their random nature has been a hitherto neglected issue. Here we conducted two experiments to examine the Maxwellian nature of eye movements during fixation. In Experiment 1, eight participants were asked to perform free viewing of natural scenes displayed on a computer screen while their eye movements were recorded. For each participant, the probability density function (PDF) of eye movement amplitude during fixation obeyed the law established by Maxwell for describing molecule velocity in gas. Only the mean amplitude of eye movements varied with expertise, which was lower in experts than novice participants. In Experiment 2, two participants underwent fixed time, free viewing of natural scenes and of their scrambled version while their eye movements were recorded. Again, the PDF of eye movement amplitude during fixation obeyed Maxwell's law for each participant and for each scene condition (normal or scrambled). The results suggest that eye fixation during natural scene perception describes a random motion regardless of top-down or of bottom-up processes. PMID:23226987
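
    For readers unfamiliar with the law mentioned in this abstract, the sketch below evaluates the standard Maxwell speed distribution for molecules in a gas, f(x) = sqrt(2/pi) * x^2 * exp(-x^2 / (2a^2)) / a^3, with scale parameter a. The abstract does not give the formula or the parameterization the authors fitted, so treating eye movement amplitude as x and using this three-dimensional form is an assumption; the function names are hypothetical.

```python
import math


def maxwell_pdf(x, a):
    """Maxwell speed distribution density at x for scale parameter a > 0."""
    if a <= 0:
        raise ValueError("scale parameter a must be positive")
    if x < 0:
        return 0.0  # the distribution is defined for non-negative values only
    return math.sqrt(2 / math.pi) * x * x * math.exp(-x * x / (2 * a * a)) / a ** 3


def maxwell_mean(a):
    """Mean of the Maxwell speed distribution, 2a * sqrt(2/pi)."""
    return 2 * a * math.sqrt(2 / math.pi)


# Hypothetical scale value, for illustration only.
a = 1.0
print(maxwell_pdf(math.sqrt(2) * a, a))  # density at the mode x = sqrt(2) * a
print(maxwell_mean(a))
```

    Because the mean is proportional to a, a difference in mean amplitude between groups, as reported for experts and novices, corresponds to a different scale parameter under the same distribution shape.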

  14. The time course of natural scene perception with reduced attention.

    Science.gov (United States)

    Groen, Iris I A; Ghebreab, Sennay; Lamme, Victor A F; Scholte, H Steven

    2016-02-01

    Attention is thought to impose an informational bottleneck on vision by selecting particular information from visual scenes for enhanced processing. Behavioral evidence suggests, however, that some scene information is extracted even when attention is directed elsewhere. Here, we investigated the neural correlates of this ability by examining how attention affects electrophysiological markers of scene perception. In two electro-encephalography (EEG) experiments, human subjects categorized real-world scenes as manmade or natural (full attention condition) or performed tasks on unrelated stimuli in the center or periphery of the scenes (reduced attention conditions). Scene processing was examined in two ways: traditional trial averaging was used to assess the presence of a categorical manmade/natural distinction in event-related potentials, whereas single-trial analyses assessed whether EEG activity was modulated by scene statistics that are diagnostic of naturalness of individual scenes. The results indicated that evoked activity up to 250 ms was unaffected by reduced attention, showing intact categorical differences between manmade and natural scenes and strong modulations of single-trial activity by scene statistics in all conditions. Thus initial processing of both categorical and individual scene information remained intact with reduced attention. Importantly, however, attention did have profound effects on later evoked activity; full attention on the scene resulted in prolonged manmade/natural differences, increased neural sensitivity to scene statistics, and enhanced scene memory. These results show that initial processing of real-world scene information is intact with diminished attention but that the depth of processing of this information does depend on attention. Copyright © 2016 the American Physiological Society.

  15. Video Texture Synthesis Based on Flow-Like Stylization Painting

    Directory of Open Access Journals (Sweden)

    Qian Wenhua

    2014-01-01

    The paper presents an NP-video rendering system based on natural phenomena. It provides a simple nonphotorealistic video synthesis system in which the user can obtain a flow-like stylized painting and an infinite video scene. First, anisotropic Kuwahara filtering in conjunction with line integral convolution renders the natural-phenomena video scene as a flow-like stylized painting. Second, frame division and patch synthesis are used to synthesize an infinitely playing video. By selecting examples from different natural video textures, the system can generate flow-like, stylized, infinite video scenes. The visual discontinuities between neighboring frames are decreased, and the features and details of the frames are preserved. The rendering system is simple to implement.
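
    The anisotropic Kuwahara filtering named in this abstract builds on the classic Kuwahara filter, which smooths an image while keeping edges. The sketch below shows only that classic, isotropic filter on a small grey-level image, as an assumption-laden illustration: it does not implement the anisotropic variant or the line integral convolution step, and the function name and list-of-lists image layout are hypothetical.

```python
from statistics import mean, pvariance


def kuwahara(img, r=1):
    """Classic Kuwahara filter on a 2-D list of grey levels.

    For each interior pixel, the four (r+1) x (r+1) quadrants that contain
    the pixel are examined, and the pixel is replaced by the mean of the
    quadrant with the smallest variance. Border pixels are left unchanged.
    """
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(r, h - r):
        for x in range(r, w - r):
            best_var, best_mean = None, None
            # Top-left corners of the four quadrants around (y, x).
            for y0, x0 in ((y - r, x - r), (y - r, x), (y, x - r), (y, x)):
                vals = [img[j][i]
                        for j in range(y0, y0 + r + 1)
                        for i in range(x0, x0 + r + 1)]
                v = pvariance(vals)
                if best_var is None or v < best_var:
                    best_var, best_mean = v, mean(vals)
            out[y][x] = best_mean
    return out


# Hypothetical 5x5 image with a vertical step edge, for illustration only.
edge = [[0, 0, 0, 100, 100] for _ in range(5)]
print(kuwahara(edge))
```

    Choosing the lowest-variance quadrant is what keeps the step edge sharp: a quadrant that straddles the edge has high variance and is never selected when a uniform one is available.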

  16. Video Inpainting of Complex Scenes

    OpenAIRE

    Newson, Alasdair; Almansa, Andrés; Fradet, Matthieu; Gousseau, Yann; Pérez, Patrick

    2015-01-01

    We propose an automatic video inpainting algorithm which relies on the optimisation of a global, patch-based functional. Our algorithm is able to deal with a variety of challenging situations which naturally arise in video inpainting, such as the correct reconstruction of dynamic textures, multiple moving objects and moving background. Furthermore, we achieve this in an order of magnitude less execution time with respect to the state-of-the-art. We are also able to achieve good quality result...

  17. The influence of action video game playing on eye movement behaviour during visual search in abstract, in-game and natural scenes.

    Science.gov (United States)

    Azizi, Elham; Abel, Larry A; Stainer, Matthew J

    2017-02-01

    Action game playing has been associated with several improvements in visual attention tasks. However, it is not clear how such changes might influence the way we overtly select information from our visual world (i.e. eye movements). We examined whether action-video-game training changed eye movement behaviour in a series of visual search tasks including conjunctive search (relatively abstracted from natural behaviour), game-related search, and more naturalistic scene search. Forty nongamers were trained in either an action first-person shooter game or a card game (control) for 10 hours. As a further control, we recorded eye movements of 20 experienced action gamers on the same tasks. The results did not show any change in duration of fixations or saccade amplitude either from before to after the training or between all nongamers (pretraining) and experienced action gamers. However, we observed a change in search strategy, reflected by a reduction in the vertical distribution of fixations for the game-related search task in the action-game-trained group. This might suggest learning the likely distribution of targets. In other words, game training only skilled participants to search game images for targets important to the game, with no indication of transfer to the more natural scene search. Taken together, these results suggest no modification in overt allocation of attention. Either the skills that can be trained with action gaming are not powerful enough to influence information selection through eye movements, or action-game-learned skills are not used when deciding where to move the eyes.

  18. ViCoMo : visual context modeling for scene understanding in video surveillance

    NARCIS (Netherlands)

    Creusen, I.M.; Javanbakhti, S.; Loomans, M.J.H.; Hazelhoff, L.; Roubtsova, N.S.; Zinger, S.; With, de P.H.N.

    2013-01-01

    The use of contextual information can significantly aid scene understanding of surveillance video. Just detecting people and tracking them does not provide sufficient information to detect situations that require operator attention. We propose a proof-of-concept system that uses several sources of

  19. A Method of Sharing Tacit Knowledge by a Bulletin Board Link to Video Scene and an Evaluation in the Field of Nursing Skill

    Science.gov (United States)

    Shimada, Satoshi; Azuma, Shouzou; Teranaka, Sayaka; Kojima, Akira; Majima, Yukie; Maekawa, Yasuko

    We developed a system in which knowledge can be discovered and shared cooperatively within an organization, based on the SECI model of knowledge management. The system realizes three processes as follows. (1) A video that demonstrates a skill is segmented into a number of scenes according to its contents. Tacit knowledge is shared in each scene. (2) Tacit knowledge is extracted through a bulletin board linked to each scene. (3) Knowledge is acquired by repeatedly viewing the video scene together with the comments that describe the technical content to be practiced. We conducted experiments in which nurses working at general hospitals used the system. The experimental results show that practical nursing know-how can be collected by using a bulletin board linked to video scenes. The results of this study confirmed the possibility of expressing the tacit knowledge of nurses' empirical nursing skills sensitively, with video images as a cue.

  20. A hierarchical probabilistic model for rapid object categorization in natural scenes.

    Directory of Open Access Journals (Sweden)

    Xiaofu He

    Humans can categorize objects in complex natural scenes within 100-150 ms. This amazing ability of rapid categorization has motivated many computational models. Most of these models require extensive training to obtain a decision boundary in a very high dimensional (e.g., ∼6,000 in a leading model) feature space and often categorize objects in natural scenes by categorizing the context that co-occurs with objects when objects do not occupy large portions of the scenes. It is thus unclear how humans achieve rapid scene categorization. To address this issue, we developed a hierarchical probabilistic model for rapid object categorization in natural scenes. In this model, a natural object category is represented by a coarse hierarchical probability distribution (PD), which includes PDs of object geometry and spatial configuration of object parts. Object parts are encoded by PDs of a set of natural object structures, each of which is a concatenation of local object features. Rapid categorization is performed as statistical inference. Since the model uses a very small number (∼100) of structures for even complex object categories such as animals and cars, it requires little training and is robust in the presence of large variations within object categories and in their occurrences in natural scenes. Remarkably, we found that the model categorized animals in natural scenes and cars in street scenes with a near human-level performance. We also found that the model located animals and cars in natural scenes, thus overcoming a flaw in many other models which is to categorize objects in natural context by categorizing contextual features. These results suggest that coarse PDs of object categories based on natural object structures and statistical operations on these PDs may underlie the human ability to rapidly categorize scenes.

  1. Detection of chromatic and luminance distortions in natural scenes.

    Science.gov (United States)

    Jennings, Ben J; Wang, Karen; Menzies, Samantha; Kingdom, Frederick A A

    2015-09-01

    A number of studies have measured visual thresholds for detecting spatial distortions applied to images of natural scenes. In one study, Bex [J. Vis. 10(2), 1 (2010); doi:10.1167/10.2.23; ISSN 1534-7362] measured sensitivity to sinusoidal spatial modulations of image scale. Here, we measure sensitivity to sinusoidal scale distortions applied to the chromatic, luminance, or both layers of natural scene images. We first established that sensitivity does not depend on whether the undistorted comparison image was of the same or of a different scene. Next, we found that, when the luminance but not chromatic layer was distorted, performance was the same regardless of whether the chromatic layer was present, absent, or phase-scrambled; in other words, the chromatic layer, in whatever form, did not affect sensitivity to the luminance layer distortion. However, when the chromatic layer was distorted, sensitivity was higher when the luminance layer was intact compared to when absent or phase-scrambled. These detection threshold results complement the appearance of periodic distortions of the image scale: when the luminance layer is distorted visibly, the scene appears distorted, but when the chromatic layer is distorted visibly, there is little apparent scene distortion. We conclude that (a) observers have a built-in sense of how a normal image of a natural scene should appear, and (b) the detection of distortion in, as well as the apparent distortion of, natural scene images is mediated predominantly by the luminance layer and not chromatic layer.

  2. Three-directional motion-compensation mask-based novel look-up table on graphics processing units for video-rate generation of digital holographic videos of three-dimensional scenes.

    Science.gov (United States)

    Kwon, Min-Woo; Kim, Seung-Cheol; Kim, Eun-Soo

    2016-01-20

    A three-directional motion-compensation mask-based novel look-up table method is proposed and implemented on graphics processing units (GPUs) for video-rate generation of digital holographic videos of three-dimensional (3D) scenes. Since the proposed method is designed to be well matched with the software and memory structures of GPUs, the number of compute-unified-device-architecture kernel function calls can be significantly reduced. This results in a great increase of the computational speed of the proposed method, allowing video-rate generation of the computer-generated hologram (CGH) patterns of 3D scenes. Experimental results reveal that the proposed method can generate 39.8 frames of Fresnel CGH patterns with 1920×1080 pixels per second for the test 3D video scenario with 12,088 object points on dual GPU boards of NVIDIA GTX TITANs, and they confirm the feasibility of the proposed method in the practical application fields of electroholographic 3D displays.

  3. Viewing nature scenes positively affects recovery of autonomic function following acute-mental stress.

    Science.gov (United States)

    Brown, Daniel K; Barton, Jo L; Gladwell, Valerie F

    2013-06-04

    A randomized crossover study explored whether viewing different scenes prior to a stressor altered autonomic function during the recovery from the stressor. The two scenes were (a) nature (composed of trees, grass, fields) or (b) built (composed of man-made, urban scenes lacking natural characteristics) environments. Autonomic function was assessed using noninvasive techniques of heart rate variability; in particular, time domain analyses evaluated parasympathetic activity, using root-mean-square of successive differences (RMSSD). During stress, secondary cardiovascular markers (heart rate, systolic and diastolic blood pressure) showed significant increases from baseline which did not differ between the two viewing conditions. Parasympathetic activity, however, was significantly higher in recovery following the stressor in the viewing scenes of nature condition compared to viewing scenes depicting built environments (RMSSD; 50.0 ± 31.3 vs 34.8 ± 14.8 ms). Thus, viewing nature scenes prior to a stressor alters autonomic activity in the recovery period. The secondary aim was to examine autonomic function during viewing of the two scenes. Standard deviation of R-R intervals (SDRR), as change from baseline, during the first 5 min of viewing nature scenes was greater than during built scenes. Overall, this suggests that nature can elicit improvements in the recovery process following a stressor.
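
    The two heart rate variability measures named in this abstract have simple definitions that can be sketched in code. RMSSD is the root mean square of the differences between successive R-R intervals, and SDRR is the standard deviation of the R-R intervals. The sketch assumes intervals in milliseconds and uses the sample standard deviation for SDRR; the abstract does not state which variant the authors used, and the function names and example values are hypothetical.

```python
import math
from statistics import stdev


def rmssd(rr_ms):
    """Root mean square of successive differences between R-R intervals (ms)."""
    if len(rr_ms) < 2:
        raise ValueError("need at least two R-R intervals")
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


def sdrr(rr_ms):
    """Sample standard deviation of the R-R intervals (ms)."""
    return stdev(rr_ms)


# Hypothetical R-R series in milliseconds, for illustration only.
rr = [800, 810, 790, 805, 795]
print(rmssd(rr), sdrr(rr))
```

    RMSSD depends only on beat-to-beat changes, which is why it is used in the abstract as a time domain index of parasympathetic activity, while SDRR reflects overall variability across the recording window.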

  4. Fixations on objects in natural scenes: dissociating importance from salience

    Directory of Open Access Journals (Sweden)

    Bernard Marius ’t Hart

    2013-07-01

    The relation of selective attention to understanding of natural scenes has been subject to intense behavioral research and computational modeling, and gaze is often used as a proxy for such attention. The probability of an image region to be fixated typically correlates with its contrast. However, this relation does not imply a causal role of contrast. Rather, contrast may relate to an object’s importance for a scene, which in turn drives attention. Here we operationalize importance by the probability that an observer names the object as characteristic for a scene. We modify luminance contrast of either a frequently named (common/important) or a rarely named (rare/unimportant) object, track the observers’ eye movements during scene viewing and ask them to provide keywords describing the scene immediately after. When no object is modified relative to the background, important objects draw more fixations than unimportant ones. Increases of contrast make an object more likely to be fixated, irrespective of whether it was important for the original scene, while decreases in contrast have little effect on fixations. Any contrast modification makes originally unimportant objects more important for the scene. Finally, important objects are fixated more centrally than unimportant objects, irrespective of contrast. Our data suggest a dissociation between object importance (relevance for the scene) and salience (relevance for attention). If an object obeys natural scene statistics, important objects are also salient. However, when natural scene statistics are violated, importance and salience are differentially affected. Object salience is modulated by the expectation about object properties (e.g., formed by context or gist), and importance by the violation of such expectations. In addition, the dependence of fixated locations within an object on the object’s importance suggests an analogy to the effects of word frequency on landing positions in reading.

  5. Usability of aerial video footage for 3-D scene reconstruction and structural damage assessment

    Science.gov (United States)

    Cusicanqui, Johnny; Kerle, Norman; Nex, Francesco

    2018-06-01

    Remote sensing has evolved into the most efficient approach to assess post-disaster structural damage in extensively affected areas through the use of spaceborne data. For smaller and, in particular, complex urban disaster scenes, multi-perspective aerial imagery obtained with unmanned aerial vehicles and derived dense color 3-D models are increasingly being used. These types of data allow the direct and automated recognition of damage-related features, supporting an effective post-disaster structural damage assessment. However, the rapid collection and sharing of multi-perspective aerial imagery is still limited due to tight or lacking regulations and legal frameworks. A potential alternative is aerial video footage, which is typically acquired and shared by civil protection institutions or news media and which tends to be the first type of airborne data available. Nevertheless, inherent artifacts and the lack of suitable processing means have long limited its potential use in structural damage assessment and other post-disaster activities. In this research the usability of modern aerial video data was evaluated based on a comparative quality and application analysis of video data and multi-perspective imagery (photos), and their derivative 3-D point clouds created using current photogrammetric techniques. Additionally, the effects of external factors, such as topography and the presence of smoke and moving objects, were determined by analyzing two different earthquake-affected sites: Tainan (Taiwan) and Pescara del Tronto (Italy). Results demonstrated similar usabilities for video and photos. This is shown by the small difference of 2 cm between the accuracies of video- and photo-based 3-D point clouds. Despite the low video resolution, the usability of these data was compensated for by a small ground sampling distance. Instead of video characteristics, low quality and application resulted from non-data-related factors, such as changes in the scene, lack of

  6. Using Video Vignettes of Historical Episodes for Promoting Pre-Service Teachers' Ideas about the Nature of Science

    Science.gov (United States)

    Cakmakci, Gultekin

    2017-01-01

    This study used video vignettes of historical episodes from documentary films as a context and instructional tool to promote pre-service science teachers' (PSTs) conceptions of the nature of science (NOS). The participants received explicit-reflective NOS instruction, and were introduced to techniques to be able to use scenes from documentary…

  7. Motivational Objects in Natural Scenes (MONS): A Database of >800 Objects

    Directory of Open Access Journals (Sweden)

    Judith Schomaker

    2017-09-01

    Full Text Available In daily life, we are surrounded by objects with pre-existing motivational associations. However, these are rarely controlled for in experiments with natural stimuli. Research on natural stimuli would therefore benefit from stimuli with well-defined motivational properties; in turn, such stimuli also open new paths in research on motivation. Here we introduce a database of Motivational Objects in Natural Scenes (MONS). The database consists of 107 scenes. Each scene contains 2 to 7 objects placed at approximately equal distance from the scene center. Each scene was photographed creating 3 versions, with one object (“critical object”) being replaced to vary the overall motivational value of the scene (appetitive, aversive, and neutral), while maintaining high visual similarity between the three versions. Ratings on motivation, valence, arousal and recognizability were obtained using internet-based questionnaires. Since the main objective was to provide stimuli of well-defined motivational value, three motivation scales were used: (1) Desire to own the object; (2) Approach/Avoid; (3) Desire to interact with the object. Three sets of ratings were obtained in independent sets of observers: for all 805 objects presented on a neutral background, for 321 critical objects presented in their scene context, and for the entire scenes. On the basis of the motivational ratings, objects were subdivided into aversive, neutral, and appetitive categories. The MONS database will provide a standardized basis for future studies on motivational value under realistic conditions.

  8. Motivational Objects in Natural Scenes (MONS): A Database of >800 Objects.

    Science.gov (United States)

    Schomaker, Judith; Rau, Elias M; Einhäuser, Wolfgang; Wittmann, Bianca C

    2017-01-01

    In daily life, we are surrounded by objects with pre-existing motivational associations. However, these are rarely controlled for in experiments with natural stimuli. Research on natural stimuli would therefore benefit from stimuli with well-defined motivational properties; in turn, such stimuli also open new paths in research on motivation. Here we introduce a database of Motivational Objects in Natural Scenes (MONS). The database consists of 107 scenes. Each scene contains 2 to 7 objects placed at approximately equal distance from the scene center. Each scene was photographed creating 3 versions, with one object ("critical object") being replaced to vary the overall motivational value of the scene (appetitive, aversive, and neutral), while maintaining high visual similarity between the three versions. Ratings on motivation, valence, arousal and recognizability were obtained using internet-based questionnaires. Since the main objective was to provide stimuli of well-defined motivational value, three motivation scales were used: (1) Desire to own the object; (2) Approach/Avoid; (3) Desire to interact with the object. Three sets of ratings were obtained in independent sets of observers: for all 805 objects presented on a neutral background, for 321 critical objects presented in their scene context, and for the entire scenes. On the basis of the motivational ratings, objects were subdivided into aversive, neutral, and appetitive categories. The MONS database will provide a standardized basis for future studies on motivational value under realistic conditions.

  9. Number of perceptually distinct surface colors in natural scenes.

    Science.gov (United States)

    Marín-Franch, Iván; Foster, David H

    2010-09-30

    The ability to perceptually identify distinct surfaces in natural scenes by virtue of their color depends not only on the relative frequency of surface colors but also on the probabilistic nature of observer judgments. Previous methods of estimating the number of discriminable surface colors, whether based on theoretical color gamuts or recorded from real scenes, have taken a deterministic approach. Thus, a three-dimensional representation of the gamut of colors is divided into elementary cells or points which are spaced at one discrimination-threshold unit intervals and which are then counted. In this study, information-theoretic methods were used to take into account both differing surface-color frequencies and observer response uncertainty. Spectral radiances were calculated from 50 hyperspectral images of natural scenes and were represented in a perceptually almost uniform color space. The average number of perceptually distinct surface colors was estimated as 7.3 × 10^3, much smaller than that based on counting methods. This number is also much smaller than the number of distinct points in a scene that are, in principle, available for reliable identification under illuminant changes, suggesting that color constancy, or the lack of it, does not generally determine the limit on the use of color for surface identification.

  10. Overt attention in natural scenes: objects dominate features.

    Science.gov (United States)

    Stoll, Josef; Thrun, Michael; Nuthmann, Antje; Einhäuser, Wolfgang

    2015-02-01

    Whether overt attention in natural scenes is guided by object content or by low-level stimulus features has become a matter of intense debate. Experimental evidence seemed to indicate that once object locations in a scene are known, salience models provide little extra explanatory power. This approach has recently been criticized for using inadequate models of early salience; and indeed, state-of-the-art salience models outperform trivial object-based models that assume a uniform distribution of fixations on objects. Here we propose to use object-based models that take a preferred viewing location (PVL) close to the centre of objects into account. In experiment 1, we demonstrate that, when including this comparably subtle modification, object-based models again are at par with state-of-the-art salience models in predicting fixations in natural scenes. One possible interpretation of these results is that objects rather than early salience dominate attentional guidance. In this view, early-salience models predict fixations through the correlation of their features with object locations. To test this hypothesis directly, in two additional experiments we reduced low-level salience in image areas of high object content. For these modified stimuli, the object-based model predicted fixations significantly better than early salience. This finding held in an object-naming task (experiment 2) and a free-viewing task (experiment 3). These results provide further evidence for object-based fixation selection--and by inference object-based attentional guidance--in natural scenes. Copyright © 2014 The Authors. Published by Elsevier Ltd. All rights reserved.

  11. PC Scene Generation

    Science.gov (United States)

    Buford, James A., Jr.; Cosby, David; Bunfield, Dennis H.; Mayhall, Anthony J.; Trimble, Darian E.

    2007-04-01

    AMRDEC has successfully tested hardware and software for Real-Time Scene Generation for IR and SAL Sensors on COTS PC based hardware and video cards. AMRDEC personnel worked with nVidia and Concurrent Computer Corporation to develop a Scene Generation system capable of frame rates of at least 120Hz while frame locked to an external source (such as a missile seeker) with no dropped frames. Latency measurements and image validation were performed using COTS and in-house developed hardware and software. Software for the Scene Generation system was developed using OpenSceneGraph.

  12. HDR video synthesis for vision systems in dynamic scenes

    Science.gov (United States)

    Shopovska, Ivana; Jovanov, Ljubomir; Goossens, Bart; Philips, Wilfried

    2016-09-01

    High dynamic range (HDR) image generation from a number of differently exposed low dynamic range (LDR) images has been extensively explored in the past few decades, and as a result of these efforts a large number of HDR synthesis methods have been proposed. Since HDR images are synthesized by combining well-exposed regions of the input images, one of the main challenges is dealing with camera or object motion. In this paper we propose a method for the synthesis of HDR video from a single camera using multiple, differently exposed video frames, with circularly alternating exposure times. One of the potential applications of the system is in driver assistance systems and autonomous vehicles, involving significant camera and object movement, non-uniform and temporally varying illumination, and the requirement of real-time performance. To achieve these goals simultaneously, we propose an HDR synthesis approach based on weighted averaging of aligned radiance maps. The computational complexity of high-quality optical flow methods for motion compensation is still prohibitively high for real-time applications. Instead, we rely on more efficient global projective transformations to solve camera movement, while moving objects are detected by thresholding the differences between the transformed and brightness adapted images in the set. To attain temporal consistency of the camera motion in the consecutive HDR frames, the parameters of the perspective transformation are stabilized over time by means of computationally efficient temporal filtering. We evaluated our results on several reference HDR videos, on synthetic scenes, and using 14-bit raw images taken with a standard camera.
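
    The weighted averaging of radiance maps mentioned in the abstract above can be illustrated with a small sketch. This is not the authors' implementation: it assumes the frames are already aligned, the camera response is linear, pixel values are 8-bit, and the weight is a simple hat function. The names hat_weight and merge_hdr are hypothetical and chosen only for this illustration.

```python
def hat_weight(z, z_max=255):
    """Hypothetical weight: largest for mid-range pixels, zero at 0 and z_max."""
    mid = z_max / 2.0
    return 1.0 - abs(z - mid) / mid


def merge_hdr(frames, exposure_times, z_max=255):
    """Merge aligned LDR frames (flat lists of pixel values) into a radiance map.

    Assumes a linear camera response, so radiance is estimated as
    pixel_value / exposure_time for each frame.
    """
    merged = []
    for i in range(len(frames[0])):
        num = 0.0
        den = 0.0
        for frame, t in zip(frames, exposure_times):
            w = hat_weight(frame[i], z_max)
            num += w * frame[i] / t
            den += w
        if den > 0.0:
            merged.append(num / den)
        else:
            # Every exposure is clipped at this pixel: fall back to a plain mean.
            estimates = [f[i] / t for f, t in zip(frames, exposure_times)]
            merged.append(sum(estimates) / len(estimates))
    return merged


# Two exposures of a two-pixel image; the long exposure saturates pixel 1.
long_exposure = [100, 255]
short_exposure = [50, 128]
radiance = merge_hdr([long_exposure, short_exposure], [2.0, 1.0])
```

    In this toy input the saturated value 255 receives zero weight, so the second pixel's radiance comes entirely from the shorter exposure.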

  13. The perception of naturalness correlates with low-level visual features of environmental scenes.

    Directory of Open Access Journals (Sweden)

    Marc G Berman

    Full Text Available Previous research has shown that interacting with natural environments vs. more urban or built environments can have salubrious psychological effects, such as improvements in attention and memory. Even viewing pictures of nature vs. pictures of built environments can produce similar effects. A major question is: What is it about natural environments that produces these benefits? Problematically, there are many differing qualities between natural and urban environments, making it difficult to narrow down the dimensions of nature that may lead to these benefits. In this study, we set out to uncover visual features that related to individuals' perceptions of naturalness in images. We quantified naturalness in two ways: first, implicitly using a multidimensional scaling analysis and second, explicitly with direct naturalness ratings. Features that seemed most related to perceptions of naturalness were related to the density of contrast changes in the scene, the density of straight lines in the scene, the average color saturation in the scene and the average hue diversity in the scene. We then trained a machine-learning algorithm to predict whether a scene was perceived as being natural or not based on these low-level visual features and we could do so with 81% accuracy. As such we were able to reliably predict subjective perceptions of naturalness with objective low-level visual features. Our results can be used in future studies to determine if these features, which are related to naturalness, may also lead to the benefits attained from interacting with nature.

  14. Age-related changes in visual exploratory behavior in a natural scene setting.

    Science.gov (United States)

    Hamel, Johanna; De Beukelaer, Sophie; Kraft, Antje; Ohl, Sven; Audebert, Heinrich J; Brandt, Stephan A

    2013-01-01

    Diverse cognitive functions decline with increasing age, including the ability to process central and peripheral visual information in a laboratory testing situation (useful visual field of view). To investigate whether and how this influences activities of daily life, we studied age-related changes in visual exploratory behavior in a natural scene setting: a driving simulator paradigm of variable complexity was tested in subjects of varying ages with simultaneous eye- and head-movement recordings via a head-mounted camera. Detection and reaction times were also measured by visual fixation and manual reaction. We considered video computer game experience as a possible influence on performance. Data of 73 participants of varying ages were analyzed, driving two different courses. We analyzed the influence of route difficulty level, age, and eccentricity of test stimuli on oculomotor and driving behavior parameters. No significant age effects were found regarding saccadic parameters. In the older subjects head-movements increasingly contributed to gaze amplitude. More demanding courses and more peripheral stimuli locations induced longer reaction times in all age groups. Deterioration of the functionally useful visual field of view with increasing age was not suggested in our study group. However, video game-experienced subjects revealed larger saccade amplitudes and a broader distribution of fixations on the screen. They reacted faster to peripheral objects suggesting the notion of a general detection task rather than perceiving driving as a central task. As the video game-experienced population consisted of younger subjects, our study indicates that effects due to video game experience can easily be misinterpreted as age effects if not accounted for. We therefore view it as essential to consider video game experience in all testing methods using virtual media.

  15. Age-related changes in visual exploratory behavior in a natural scene setting

    Directory of Open Access Journals (Sweden)

    Johanna Hamel

    2013-06-01

    Full Text Available Diverse cognitive functions decline with increasing age, including the ability to process central and peripheral visual information in a laboratory testing situation (useful visual field of view). To investigate whether and how this influences activities of daily life, we studied age-related changes in visual exploratory behavior in a natural scene setting: a driving simulator paradigm of variable complexity was tested in subjects of varying ages with simultaneous eye- and head-movement recordings via a head-mounted camera. Detection and reaction times were also measured by visual fixation and manual reaction. We considered video computer game experience as a possible influence on performance. Data of 73 participants of varying ages were analyzed, driving two different courses. We analyzed the influence of route difficulty level, age and eccentricity of test stimuli on oculomotor and driving behavior parameters. No significant age effects were found regarding saccadic parameters. In the older subjects head-movements increasingly contributed to gaze amplitude. More demanding courses and more peripheral stimuli locations induced longer reaction times in all age groups. Deterioration of the functionally useful visual field of view with increasing age was not suggested in our study group. However, video game-experienced subjects revealed larger saccade amplitudes and a broader distribution of fixations on the screen. They reacted faster to peripheral objects suggesting the notion of a general detection task rather than perceiving driving as a central task. As the video game-experienced population consisted of younger subjects, our study indicates that effects due to video game experience can easily be misinterpreted as age effects if not accounted for. We therefore view it as essential to consider video game experience in all testing methods using virtual media.

  16. Scene text recognition in mobile applications by character descriptor and structure configuration.

    Science.gov (United States)

    Yi, Chucai; Tian, Yingli

    2014-07-01

    Text characters and strings in natural scenes can provide valuable information for many applications. Extracting text directly from natural scene images or videos is a challenging task because of diverse text patterns and variant background interferences. This paper proposes a method of scene text recognition from detected text regions. In text detection, our previously proposed algorithms are applied to obtain text regions from scene images. First, we design a discriminative character descriptor by combining several state-of-the-art feature detectors and descriptors. Second, we model character structure at each character class by designing stroke configuration maps. Our algorithm design is compatible with the application of scene text extraction in smart mobile devices. An Android-based demo system is developed to show the effectiveness of our proposed method on scene text information extraction from nearby objects. The demo system also provides us some insight into algorithm design and performance improvement of scene text extraction. The evaluation results on benchmark data sets demonstrate that our proposed scheme of text recognition is comparable with the best existing methods.

  17. The influence of color on emotional perception of natural scenes.

    Science.gov (United States)

    Codispoti, Maurizio; De Cesarei, Andrea; Ferrari, Vera

    2012-01-01

    Is color a critical factor when processing the emotional content of natural scenes? Under challenging perceptual conditions, such as when pictures are briefly presented, color might facilitate scene segmentation and/or function as a semantic cue via association with scene-relevant concepts (e.g., red and blood/injury). To clarify the influence of color on affective picture perception, we compared the late positive potentials (LPP) to color versus grayscale pictures, presented for very brief (24 ms) and longer (6 s) exposure durations. Results indicated that removing color information had no effect on the affective modulation of the LPP, regardless of exposure duration. These findings imply that the recognition of the emotional content of scenes, even when presented very briefly, does not critically rely on color information. Copyright © 2011 Society for Psychophysiological Research.

  18. Naturalness and image quality: chroma and hue variation in color images of natural scenes

    NARCIS (Netherlands)

    Ridder, de H.; Blommaert, F.J.J.; Fedorovskaya, E.A.; Rogowitz, B.E.; Allebach, J.P.

    1995-01-01

    The relation between perceptual image quality and naturalness was investigated by varying the colorfulness and hue of color images of natural scenes. These variations were created by digitizing the images, subsequently determining their color point distributions in the CIELUV color space and finally

  19. Naturalness and image quality: Chroma and hue variation in color images of natural scenes

    NARCIS (Netherlands)

    Ridder, de H.; Blommaert, F.J.J.; Fedorovskaya, E.A.; Eschbach, R.; Braun, K.

    1997-01-01

    The relation between perceptual image quality and naturalness was investigated by varying the colorfulness and hue of color images of natural scenes. These variations were created by digitizing the images, subsequently determining their color point distributions in the CIELUV color space and

  20. Motion video analysis using planar parallax

    Science.gov (United States)

    Sawhney, Harpreet S.

    1994-04-01

    Motion and structure analysis in video sequences can lead to efficient descriptions of objects and their motions. Interesting events in videos can be detected using such an analysis--for instance independent object motion when the camera itself is moving, figure-ground segregation based on the saliency of a structure compared to its surroundings. In this paper we present a method for 3D motion and structure analysis that uses a planar surface in the environment as a reference coordinate system to describe a video sequence. The motion in the video sequence is described as the motion of the reference plane, and the parallax motion of all the non-planar components of the scene. It is shown how this method simplifies the otherwise hard general 3D motion analysis problem. In addition, a natural coordinate system in the environment is used to describe the scene which can simplify motion based segmentation. This work is a part of an ongoing effort in our group towards video annotation and analysis for indexing and retrieval. Results from a demonstration system being developed are presented.

  1. Guidance of Attention to Objects and Locations by Long-Term Memory of Natural Scenes

    Science.gov (United States)

    Becker, Mark W.; Rasmussen, Ian P.

    2008-01-01

    Four flicker change-detection experiments demonstrate that scene-specific long-term memory guides attention to both behaviorally relevant locations and objects within a familiar scene. Participants performed an initial block of change-detection trials, detecting the addition of an object to a natural scene. After a 30-min delay, participants…

  2. Cortical networks dynamically emerge with the interplay of slow and fast oscillations for memory of a natural scene.

    Science.gov (United States)

    Mizuhara, Hiroaki; Sato, Naoyuki; Yamaguchi, Yoko

    2015-05-01

    Neural oscillations are crucial for revealing dynamic cortical networks and for serving as a possible mechanism of inter-cortical communication, especially in association with mnemonic function. The interplay of the slow and fast oscillations might dynamically coordinate the mnemonic cortical circuits to rehearse stored items during working memory retention. We recorded simultaneous EEG-fMRI during a working memory task involving a natural scene to verify whether the cortical networks emerge with the neural oscillations for memory of the natural scene. The slow EEG power was enhanced in association with the better accuracy of working memory retention, and accompanied cortical activities in the mnemonic circuits for the natural scene. Fast oscillation showed a phase-amplitude coupling to the slow oscillation, and its power was tightly coupled with the cortical activities for representing the visual images of natural scenes. The mnemonic cortical circuit with the slow neural oscillations would rehearse the distributed natural scene representations with the fast oscillation for working memory retention. The coincidence of the natural scene representations could be obtained by the slow oscillation phase to create a coherent whole of the natural scene in the working memory. Copyright © 2015 Elsevier Inc. All rights reserved.

  3. Object tracking mask-based NLUT on GPUs for real-time generation of holographic videos of three-dimensional scenes.

    Science.gov (United States)

    Kwon, M-W; Kim, S-C; Yoon, S-E; Ho, Y-S; Kim, E-S

    2015-02-09

    A new object tracking mask-based novel-look-up-table (OTM-NLUT) method is proposed and implemented on graphics-processing-units (GPUs) for real-time generation of holographic videos of three-dimensional (3-D) scenes. Since the proposed method is designed to be matched with software and memory structures of the GPU, the number of compute-unified-device-architecture (CUDA) kernel function calls and the computer-generated hologram (CGH) buffer size of the proposed method have been significantly reduced. It therefore results in a great increase of the computational speed of the proposed method and enables real-time generation of CGH patterns of 3-D scenes. Experimental results show that the proposed method can generate 31.1 frames of Fresnel CGH patterns with 1,920 × 1,080 pixels per second, on average, for three test 3-D video scenarios with 12,666 object points on three GPU boards of NVIDIA GTX TITAN, and confirm the feasibility of the proposed method in the practical application of electro-holographic 3-D displays.

  4. Scintillation mitigation for long-range surveillance video

    CSIR Research Space (South Africa)

    Delport, JP

    2010-09-01

    Full Text Available Atmospheric turbulence is a naturally occurring phenomenon that can severely degrade the quality of long-range surveillance video footage. Major effects include image blurring, image warping and temporal wavering of objects in the scene. Mitigating...

  5. Parallel programming of saccades during natural scene viewing: evidence from eye movement positions.

    Science.gov (United States)

    Wu, Esther X W; Gilani, Syed Omer; van Boxtel, Jeroen J A; Amihai, Ido; Chua, Fook Kee; Yen, Shih-Cheng

    2013-10-24

    Previous studies have shown that saccade plans during natural scene viewing can be programmed in parallel. This evidence comes mainly from temporal indicators, i.e., fixation durations and latencies. In the current study, we asked whether eye movement positions recorded during scene viewing also reflect parallel programming of saccades. As participants viewed scenes in preparation for a memory task, their inspection of the scene was suddenly disrupted by a transition to another scene. We examined whether saccades after the transition were invariably directed immediately toward the center or were contingent on saccade onset times relative to the transition. The results, which showed a dissociation in eye movement behavior between two groups of saccades after the scene transition, supported the parallel programming account. Saccades with relatively long onset times (>100 ms) after the transition were directed immediately toward the center of the scene, probably to restart scene exploration. Saccades with short onset times (programming of saccades during scene viewing. Additionally, results from the analyses of intersaccadic intervals were also consistent with the parallel programming hypothesis.

  6. AR goggles make crime scene investigation a desk job

    OpenAIRE

    Aron, Jacob; Northfield, Dean

    2012-01-01

    CRIME scene investigators could one day help solve murders without leaving the office. A pair of augmented reality glasses could allow local police to virtually tag objects in a crime scene, and build a clean record of the scene in 3D video before evidence is removed for processing. The system, being developed by Oytun Akman and colleagues at the Delft University of Technology in the Netherlands, consists of a head-mounted display receiving 3D video from a pair of attached cameras controll...

  7. Face, Body, and Center of Gravity Mediate Person Detection in Natural Scenes

    Science.gov (United States)

    Bindemann, Markus; Scheepers, Christoph; Ferguson, Heather J.; Burton, A. Mike

    2010-01-01

    Person detection is an important prerequisite of social interaction, but is not well understood. Following suggestions that people in the visual field can capture a viewer's attention, this study examines the role of the face and the body for person detection in natural scenes. We observed that viewers tend first to look at the center of a scene,…

  8. S3-2: Colorfulness Perception Adapting to Natural Scenes

    Directory of Open Access Journals (Sweden)

    Yoko Mizokami

    2012-10-01

    Full Text Available Our visual system has the ability to adapt to the color characteristics of the environment and maintain a stable color appearance. Many studies of chromatic adaptation and color constancy have suggested that different levels of visual processing are involved in the adaptation mechanism. In the case of colorfulness perception, it has been shown that the perception changes with adaptation to chromatic contrast modulation and to surrounding chromatic variance. However, it is still not clear how the perception changes in natural scenes and what levels of visual mechanisms contribute to the perception. Here, I will mainly present our recent work on colorfulness adaptation in natural images. In the experiment, we examined whether the colorfulness perception of an image was influenced by adaptation to natural images with different degrees of saturation. Natural and unnatural (shuffled or phase-scrambled) images were used as adapting and test images, and all combinations of adapting and test images were tested (e.g., the combination of natural adapting images and a shuffled test image). The results show that colorfulness perception was influenced by adaptation to the saturation of images. A test image appeared less colorful after adaptation to saturated images, and vice versa. The effect of colorfulness adaptation was the strongest for the combination of natural adapting and natural test images. The fact that the naturalness of the spatial structure in an image affects the strength of the adaptation effect implies that the recognition of natural scenes would play an important role in the adaptation mechanism.

  9. Neural Correlates of Divided Attention in Natural Scenes.

    Science.gov (United States)

    Fagioli, Sabrina; Macaluso, Emiliano

    2016-09-01

    Individuals are able to split attention between separate locations, but divided spatial attention incurs the additional requirement of monitoring multiple streams of information. Here, we investigated divided attention using photos of natural scenes, where the rapid categorization of familiar objects and prior knowledge about the likely positions of objects in the real world might affect the interplay between these spatial and nonspatial factors. Sixteen participants underwent fMRI during an object detection task. They were presented with scenes containing either a person or a car, located on the left or right side of the photo. Participants monitored either one or both object categories, in one or both visual hemifields. First, we investigated the interplay between spatial and nonspatial attention by comparing conditions of divided attention between categories and/or locations. We then assessed the contribution of top-down processes versus stimulus-driven signals by separately testing the effects of divided attention in target and nontarget trials. The results revealed activation of a bilateral frontoparietal network when dividing attention between the two object categories versus attending to a single category but no main effect of dividing attention between spatial locations. Within this network, the left dorsal premotor cortex and the left intraparietal sulcus were found to combine task- and stimulus-related signals. These regions showed maximal activation when participants monitored two categories at spatially separate locations and the scene included a nontarget object. We conclude that the dorsal frontoparietal cortex integrates top-down and bottom-up signals in the presence of distractors during divided attention in real-world scenes.

  10. User and Device Adaptation in Summarizing Sports Videos

    Science.gov (United States)

    Nitta, Naoko; Babaguchi, Noboru

    Video summarization is defined as creating a video summary which includes only important scenes in the original video streams. In order to realize automatic video summarization, the significance of each scene needs to be determined. When targeted especially on broadcast sports videos, a play scene, which corresponds to a play, can be considered as a scene unit. The significance of every play scene can generally be determined based on the importance of the play in the game. Furthermore, the following two issues should be considered: 1) what is important depends on each user's preferences, and 2) the summaries should be tailored for media devices that each user has. Considering the above issues, this paper proposes a unified framework for user and device adaptation in summarizing broadcast sports videos. The proposed framework summarizes sports videos by selecting play scenes based on not only the importance of each play itself but also the users' preferences by using the metadata, which describes the semantic content of videos with keywords, and user profiles, which describe users' preference degrees for the keywords. The selected scenes are then presented in a proper way using various types of media such as video, image, or text according to device profiles which describe the device type. We experimentally verified the effectiveness of user adaptation by examining how the generated summaries are changed by different preference degrees and by comparing our results with/without using user profiles. The validity of device adaptation is also evaluated by conducting questionnaires using PCs and mobile phones as the media devices.
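
    The user-adaptation step described in the abstract above (selecting play scenes from their importance and from keyword preferences in a user profile) can be sketched as follows. The scoring rule, the duration budget, and every name here (select_scenes, the dictionary fields) are assumptions made for illustration; the abstract does not specify how the framework combines importance and preference.

```python
def select_scenes(scenes, user_profile, max_duration):
    """Greedily pick play scenes for a summary within a duration budget.

    scenes: list of dicts with 'id', 'start', 'duration', 'importance', 'keywords'.
    user_profile: dict mapping keyword -> preference degree in [0, 1].
    The score (importance scaled by the strongest matching preference) is an
    assumed rule, not the one used in the paper.
    """
    def score(scene):
        prefs = [user_profile.get(k, 0.0) for k in scene["keywords"]]
        best = max(prefs) if prefs else 0.0
        return scene["importance"] * (1.0 + best)

    chosen = []
    total = 0.0
    for scene in sorted(scenes, key=score, reverse=True):
        if total + scene["duration"] <= max_duration:
            chosen.append(scene)
            total += scene["duration"]
    # Present the selected scenes in their original temporal order.
    return sorted(chosen, key=lambda s: s["start"])


plays = [
    {"id": "a", "start": 0, "duration": 10, "importance": 0.5, "keywords": ["home run"]},
    {"id": "b", "start": 20, "duration": 10, "importance": 0.6, "keywords": ["strikeout"]},
    {"id": "c", "start": 40, "duration": 10, "importance": 0.4, "keywords": ["walk"]},
]
with_profile = [s["id"] for s in select_scenes(plays, {"home run": 1.0}, 10)]
without_profile = [s["id"] for s in select_scenes(plays, {}, 10)]
```

    With a 10-second budget, the profile that favors "home run" selects scene "a", while an empty profile falls back to the most important play, scene "b".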

  11. Scene analysis in the natural environment

    DEFF Research Database (Denmark)

    Lewicki, Michael S; Olshausen, Bruno A; Surlykke, Annemarie

    2014-01-01

    The problem of scene analysis has been studied in a number of different fields over the past decades. These studies have led to important insights into problems of scene analysis, but not all of these insights are widely appreciated, and there remain critical shortcomings in current approaches th...... ill-posed problems, (2) the ability to integrate and store information across time and modality, (3) efficient recovery and representation of 3D scene structure, and (4) the use of optimal motor actions for acquiring information to progress toward behavioral goals....

  12. Combination of Morphological Operations with Structure based Partitioning and grouping for Text String detection from Natural Scenes

    OpenAIRE

    Vyankatesh V. Rampurkar; Gyankamal J. Chhajed

    2014-01-01

    Text information in natural scene images serves as important clues for many image-based applications such as scene perceptive, content-based image retrieval, assistive direction-finding and automatic geocoding. Nowadays, different approaches such as contour-based, image binarization and enhancement-based, gradient-based, and colour reduction-based techniques can be used for text detection from natural scenes. In this paper the combination of morphological operations with structure based part...

  13. Iconic memory for the gist of natural scenes.

    Science.gov (United States)

    Clarke, Jason; Mack, Arien

    2014-11-01

    Does iconic memory contain the gist of multiple scenes? Three experiments were conducted. In the first, four scenes from different basic-level categories were briefly presented in one of two conditions: a cue or a no-cue condition. The cue condition was designed to provide an index of the contents of iconic memory of the display. Subjects were more sensitive to scene gist in the cue condition than in the no-cue condition. In the second, the scenes came from the same basic-level category. We found no difference in sensitivity between the two conditions. In the third, six scenes from different basic level categories were presented in the visual periphery. Subjects were more sensitive to scene gist in the cue condition. These results suggest that scene gist is contained in iconic memory even in the visual periphery; however, iconic representations are not sufficiently detailed to distinguish between scenes coming from the same category. Copyright © 2014 Elsevier Inc. All rights reserved.

  14. Fixation and saliency during search of natural scenes: the case of visual agnosia.

    Science.gov (United States)

    Foulsham, Tom; Barton, Jason J S; Kingstone, Alan; Dewhurst, Richard; Underwood, Geoffrey

    2009-07-01

    Models of eye movement control in natural scenes often distinguish between stimulus-driven processes (which guide the eyes to visually salient regions) and those based on task and object knowledge (which depend on expectations or identification of objects and scene gist). In the present investigation, the eye movements of a patient with visual agnosia were recorded while she searched for objects within photographs of natural scenes and compared to those made by students and age-matched controls. Agnosia is assumed to disrupt the top-down knowledge available in this task, and so may increase the reliance on bottom-up cues. The patient's deficit in object recognition was seen in poor search performance and inefficient scanning. The low-level saliency of target objects had an effect on responses in visual agnosia, and the most salient region in the scene was more likely to be fixated by the patient than by controls. An analysis of model-predicted saliency at fixation locations indicated a closer match between fixations and low-level saliency in agnosia than in controls. These findings are discussed in relation to saliency-map models and the balance between high and low-level factors in eye guidance.

  15. Common and Innovative Visuals: A sparsity modeling framework for video.

    Science.gov (United States)

    Abdolhosseini Moghadam, Abdolreza; Kumar, Mrityunjay; Radha, Hayder

    2014-05-02

    Efficient video representation models are critical for many video analysis and processing tasks. In this paper, we present a framework based on the concept of finding the sparsest solution to model video frames. To model the spatio-temporal information, frames from one scene are decomposed into two components: (i) a common frame, which describes the visual information common to all the frames in the scene/segment, and (ii) a set of innovative frames, which depicts the dynamic behaviour of the scene. The proposed approach exploits and builds on recent results in the field of compressed sensing to jointly estimate the common frame and the innovative frames for each video segment. We refer to the proposed modeling framework by CIV (Common and Innovative Visuals). We show how the proposed model can be utilized to find scene change boundaries and extend CIV to videos from multiple scenes. Furthermore, the proposed model is robust to noise and can be used for various video processing applications without relying on motion estimation and detection or image segmentation. Results for object tracking, video editing (object removal, inpainting) and scene change detection are presented to demonstrate the efficiency and the performance of the proposed model.
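
To make the common/innovative split concrete, here is a deliberately simplified sketch. It substitutes a per-pixel median for the paper's compressed-sensing estimation, so it illustrates the decomposition idea only and is not the CIV algorithm; frames are assumed to be equally sized flat lists of pixel values.

```python
from statistics import median


def decompose(frames):
    """Split frames into one common frame and a per-frame innovation.

    The common frame is the per-pixel median (an assumed stand-in for the
    sparsity-based estimate); each innovation is the frame minus the common frame.
    """
    common = [median(pixels) for pixels in zip(*frames)]
    innovations = [[p - c for p, c in zip(frame, common)] for frame in frames]
    return common, innovations


def innovation_energy(innovation):
    """Sum of squared innovation values; a spike suggests a scene change."""
    return sum(v * v for v in innovation)
```

A segment boundary could then be declared wherever `innovation_energy` exceeds a chosen threshold, with the threshold being another assumption of this sketch.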

  16. A Comparative Study of Registration Methods for RGB-D Video of Static Scenes

    Directory of Open Access Journals (Sweden)

    Vicente Morell-Gimenez

    2014-05-01

    Full Text Available The use of RGB-D sensors for mapping and recognition tasks in robotics or, in general, for virtual reconstruction has increased in recent years. The key aspect of these kinds of sensors is that they provide both depth and color information using the same device. In this paper, we present a comparative analysis of the most important methods used in the literature for the registration of subsequent RGB-D video frames in static scenarios. The analysis begins by explaining the characteristics of the registration problem, dividing it into two representative applications: scene modeling and object reconstruction. Then, a detailed experimentation is carried out to determine the behavior of the different methods depending on the application. For both applications, we used standard datasets and a new one built for object reconstruction.

  17. 3D Traffic Scene Understanding From Movable Platforms.

    Science.gov (United States)

    Geiger, Andreas; Lauer, Martin; Wojek, Christian; Stiller, Christoph; Urtasun, Raquel

    2014-05-01

    In this paper, we present a novel probabilistic generative model for multi-object traffic scene understanding from movable platforms which reasons jointly about the 3D scene layout as well as the location and orientation of objects in the scene. In particular, the scene topology, geometry, and traffic activities are inferred from short video sequences. Inspired by the impressive driving capabilities of humans, our model does not rely on GPS, lidar, or map knowledge. Instead, it takes advantage of a diverse set of visual cues in the form of vehicle tracklets, vanishing points, semantic scene labels, scene flow, and occupancy grids. For each of these cues, we propose likelihood functions that are integrated into a probabilistic generative model. We learn all model parameters from training data using contrastive divergence. Experiments conducted on videos of 113 representative intersections show that our approach successfully infers the correct layout in a variety of very challenging scenarios. To evaluate the importance of each feature cue, experiments using different feature combinations are conducted. Furthermore, we show how by employing context derived from the proposed method we are able to improve over the state-of-the-art in terms of object detection and object orientation estimation in challenging and cluttered urban environments.

  18. Is the preference of natural versus man-made scenes driven by bottom-up processing of the visual features of nature?

    Directory of Open Access Journals (Sweden)

    Omid Kardan

    2015-04-01

    Full Text Available Previous research has shown that viewing images of nature scenes can have a beneficial effect on memory, attention and mood. In this study we aimed to determine whether the preference of natural versus man-made scenes is driven by bottom-up processing of the low-level visual features of nature. We used participants’ ratings of perceived naturalness as well as aesthetic preference for 307 images with varied natural and urban content. We then quantified ten low-level image features for each image (a combination of spatial and color properties). These features were used to predict aesthetic preference in the images, as well as to decompose perceived naturalness to its predictable (modelled by the low-level visual features) and non-modelled aspects. Interactions of these separate aspects of naturalness with the time it took to make a preference judgment showed that naturalness based on low-level features related more to preference when the judgment was faster (bottom-up). On the other hand, perceived naturalness that was not modelled by low-level features was related more to preference when the judgment was slower. A quadratic discriminant classification analysis showed how relevant each aspect of naturalness (modelled and non-modelled) was to predicting preference ratings, as well as the image features on their own. Finally, we compared the effect of color-related and structure-related modelled naturalness, and the remaining unmodelled naturalness in predicting aesthetic preference. In summary, bottom-up (color and spatial) properties of natural images captured by our features and the non-modelled naturalness are important to aesthetic judgments of natural and man-made scenes, with each predicting unique variance.

  19. The Influence of Familiarity on Affective Responses to Natural Scenes

    Science.gov (United States)

    Sanabria Z., Jorge C.; Cho, Youngil; Yamanaka, Toshimasa

    This kansei study explored how familiarity with image-word combinations influences affective states. Stimuli were obtained from Japanese print advertisements (ads), and consisted of images (e.g., natural-scene backgrounds) and their corresponding headlines (advertising copy). Initially, a group of subjects evaluated their level of familiarity with images and headlines independently, and stimuli were filtered based on the results. In the main experiment, a different group of subjects rated their pleasure and arousal to, and familiarity with, image-headline combinations. The Self-Assessment Manikin (SAM) scale was used to evaluate pleasure and arousal, and a bipolar scale was used to evaluate familiarity. The results showed a high correlation between familiarity and pleasure, but low correlation between familiarity and arousal. The characteristics of the stimuli, and their effect on the variables of pleasure, arousal and familiarity, were explored through ANOVA. It is suggested that, in the case of natural-scene ads, familiarity with image-headline combinations may increase the pleasure response to the ads, and that certain components in the images (e.g., water) may increase arousal levels.

  20. Attention in natural scenes: Affective-motivational factors guide gaze independently of visual salience.

    Science.gov (United States)

    Schomaker, Judith; Walper, Daniel; Wittmann, Bianca C; Einhäuser, Wolfgang

    2017-04-01

    In addition to low-level stimulus characteristics and current goals, our previous experience with stimuli can also guide attentional deployment. It remains unclear, however, if such effects act independently or whether they interact in guiding attention. In the current study, we presented natural scenes including every-day objects that differed in affective-motivational impact. In the first free-viewing experiment, we presented visually-matched triads of scenes in which one critical object was replaced that varied mainly in terms of motivational value, but also in terms of valence and arousal, as confirmed by ratings by a large set of observers. Treating motivation as a categorical factor, we found that it affected gaze. A linear-effect model showed that arousal, valence, and motivation predicted fixations above and beyond visual characteristics, like object size, eccentricity, or visual salience. In a second experiment, we experimentally investigated whether the effects of emotion and motivation could be modulated by visual salience. In a medium-salience condition, we presented the same unmodified scenes as in the first experiment. In a high-salience condition, we retained the saturation of the critical object in the scene, and decreased the saturation of the background, and in a low-salience condition, we desaturated the critical object while retaining the original saturation of the background. We found that highly salient objects guided gaze, but still found additional additive effects of arousal, valence and motivation, confirming that higher-level factors can also guide attention, as measured by fixations towards objects in natural scenes. Copyright © 2017 Elsevier Ltd. All rights reserved.

  1. SAMPEG: a scene-adaptive parallel MPEG-2 software encoder

    NARCIS (Netherlands)

    Farin, D.S.; Mache, N.; With, de P.H.N.; Girod, B.; Bouman, C.A.; Steinbach, E.G.

    2001-01-01

    This paper presents a fully software-based MPEG-2 encoder architecture, which uses scene-change detection to optimize the Group-of-Picture (GOP) structure for the actual video sequence. This feature enables easy, lossless edit cuts at scene-change positions and it also improves overall picture
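
A scene-adaptive GOP structure of the kind described can be sketched roughly as below. The abstract does not say how SAMPEG detects scene changes, so the mean-absolute-difference test, the `threshold` and `max_gop` values, and the restriction to I and P frames are all assumptions for illustration.

```python
def mean_abs_diff(a, b):
    """Mean absolute pixel difference between two equally sized flat frames."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def assign_frame_types(frames, threshold=30.0, max_gop=12):
    """Start a new GOP (I-frame) at every detected scene change or when the
    current GOP reaches max_gop frames; all other frames become P-frames."""
    types = []
    gop_len = 0
    for i, frame in enumerate(frames):
        cut = i > 0 and mean_abs_diff(frames[i - 1], frame) > threshold
        if i == 0 or cut or gop_len >= max_gop:
            types.append("I")
            gop_len = 1
        else:
            types.append("P")
            gop_len += 1
    return "".join(types)
```

Because every cut begins with an I-frame, a clip can be split at those positions without re-encoding the frames that follow.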

  2. Stages as models of scene geometry.

    Science.gov (United States)

    Nedović, Vladimir; Smeulders, Arnold W M; Redert, André; Geusebroek, Jan-Mark

    2010-09-01

    Reconstruction of 3D scene geometry is an important element for scene understanding, autonomous vehicle and robot navigation, image retrieval, and 3D television. We propose accounting for the inherent structure of the visual world when trying to solve the scene reconstruction problem. Consequently, we identify geometric scene categorization as the first step toward robust and efficient depth estimation from single images. We introduce 15 typical 3D scene geometries called stages, each with a unique depth profile, which roughly correspond to a large majority of broadcast video frames. Stage information serves as a first approximation of global depth, narrowing down the search space in depth estimation and object localization. We propose different sets of low-level features for depth estimation, and perform stage classification on two diverse data sets of television broadcasts. Classification results demonstrate that stages can often be efficiently learned from low-dimensional image representations.

  3. Estimating 3D tilt from local image cues in natural scenes

    OpenAIRE

    Burge, Johannes; McCann, Brian C.; Geisler, Wilson S.

    2016-01-01

    Estimating three-dimensional (3D) surface orientation (slant and tilt) is an important first step toward estimating 3D shape. Here, we examine how three local image cues from the same location (disparity gradient, luminance gradient, and dominant texture orientation) should be combined to estimate 3D tilt in natural scenes. We collected a database of natural stereoscopic images with precisely co-registered range images that provide the ground-truth distance at each pixel location. We then ana...

  4. Toward brain correlates of natural behavior: fMRI during violent video games.

    Science.gov (United States)

    Mathiak, Klaus; Weber, René

    2006-12-01

    Modern video games represent highly advanced virtual reality simulations and often contain virtual violence. In a significant amount of young males, playing video games is a quotidian activity, making it an almost natural behavior. Recordings of brain activation with functional magnetic resonance imaging (fMRI) during gameplay may reflect neuronal correlates of real-life behavior. We recorded 13 experienced gamers (18-26 years; average 14 hrs/week playing) while playing a violent first-person shooter game (a violent computer game played in self-perspective) by means of distortion and dephasing reduced fMRI (3 T; single-shot triple-echo echo-planar imaging [EPI]). Content analysis of the video and sound with 100 ms time resolution achieved relevant behavioral variables. These variables explained significant signal variance across large distributed networks. Occurrence of violent scenes revealed significant neuronal correlates in an event-related design. Activation of dorsal and deactivation of rostral anterior cingulate and amygdala characterized the mid-frontal pattern related to virtual violence. Statistics and effect sizes can be considered large at these areas. Optimized imaging strategies allowed for single-subject and for single-trial analysis with good image quality at basal brain structures. We propose that virtual environments can be used to study neuronal processes involved in semi-naturalistic behavior as determined by content analysis. Importantly, the activation pattern reflects brain-environment interactions rather than stimulus responses as observed in classical experimental designs. We relate our findings to the general discussion on social effects of playing first-person shooter games. (c) 2006 Wiley-Liss, Inc.

  5. Perceptual salience affects the contents of working memory during free-recollection of objects from natural scenes

    Directory of Open Access Journals (Sweden)

    Tiziana Pedale

    2015-02-01

    Full Text Available One of the most important issues in the study of cognition is to understand which are the factors determining internal representation of the external world. Previous literature has started to highlight the impact of low-level sensory features (indexed by saliency-maps) in driving attention selection, hence increasing the probability for objects presented in complex and natural scenes to be successfully encoded into working memory (WM) and then correctly remembered. Here we asked whether the probability of retrieving high-saliency objects modulates the overall contents of WM, by decreasing the probability of retrieving other, lower-saliency objects. We presented pictures of natural scenes for 4 secs. After a retention period of 8 secs, we asked participants to verbally report as many objects/details as possible of the previous scenes. We then computed how many times the objects located at either the peak of maximal or minimal saliency in the scene (as indexed by a saliency-map; Itti et al., 1998) were recollected by participants. Results showed that maximal-saliency objects were recollected more often and earlier in the stream of successfully reported items than minimal-saliency objects. This indicates that bottom-up sensory salience increases the recollection probability and facilitates the access to memory representation at retrieval, respectively. Moreover, recollection of the maximal- (but not the minimal-) saliency objects predicted the overall amount of successfully recollected objects: The higher the probability of having successfully reported the most-salient object in the scene, the lower the amount of recollected objects. These findings highlight that bottom-up sensory saliency modulates the current contents of WM during recollection of objects from natural scenes, most likely by reducing available resources to encode and then retrieve other (lower-saliency) objects.

  6. Temporal dynamics of motor cortex excitability during perception of natural emotional scenes

    NARCIS (Netherlands)

    Borgomaneri, Sara; Gazzola, Valeria; Avenanti, Alessio

    2014-01-01

    Although it is widely assumed that emotions prime the body for action, the effects of visual perception of natural emotional scenes on the temporal dynamics of the human motor system have scarcely been investigated. Here, we used single-pulse transcranial magnetic stimulation (TMS) to assess motor

  7. A scheme for racquet sports video analysis with the combination of audio-visual information

    Science.gov (United States)

    Xing, Liyuan; Ye, Qixiang; Zhang, Weigang; Huang, Qingming; Yu, Hua

    2005-07-01

    As a very important category in sports video, racquet sports video, e.g. table tennis, tennis and badminton, has been paid little attention in the past years. Considering the characteristics of this kind of sports video, we propose a new scheme for structure indexing and highlight generating based on the combination of audio and visual information. Firstly, a supervised classification method is employed to detect important audio symbols including impact (ball hit), audience cheers, commentator speech, etc. Meanwhile an unsupervised algorithm is proposed to group video shots into various clusters. Then, by taking advantage of temporal relationship between audio and visual signals, we can specify the scene clusters with semantic labels including rally scenes and break scenes. Thirdly, a refinement procedure is developed to reduce false rally scenes by further audio analysis. Finally, an exciting model is proposed to rank the detected rally scenes from which many exciting video clips such as game (match) points can be correctly retrieved. Experiments on two types of representative racquet sports video, table tennis video and tennis video, demonstrate encouraging results.
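
The step that attaches semantic labels to shot clusters by using audio timing could look roughly like this. The rule used here (a cluster is a rally cluster if enough of its shots contain a detected ball-impact sound) and the `min_fraction` value are assumptions for illustration, not the authors' method.

```python
def overlaps(shot, event_time):
    """True if an audio event time falls inside a (start, end) shot interval."""
    start, end = shot
    return start <= event_time < end


def label_clusters(clusters, impact_times, min_fraction=0.5):
    """Label each shot cluster 'rally' or 'break' from the share of its shots
    that contain at least one detected ball-impact sound."""
    labels = {}
    for cluster_id, shots in clusters.items():
        hits = sum(1 for s in shots if any(overlaps(s, t) for t in impact_times))
        is_rally = bool(shots) and hits / len(shots) >= min_fraction
        labels[cluster_id] = "rally" if is_rally else "break"
    return labels
```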

  8. Individual differences in the spontaneous recruitment of brain regions supporting mental state understanding when viewing natural social scenes.

    Science.gov (United States)

    Wagner, Dylan D; Kelley, William M; Heatherton, Todd F

    2011-12-01

    People are able to rapidly infer complex personality traits and mental states even from the most minimal person information. Research has shown that when observers view a natural scene containing people, they spend a disproportionate amount of their time looking at the social features (e.g., faces, bodies). Does this preference for social features merely reflect the biological salience of these features or are observers spontaneously attempting to make sense of complex social dynamics? Using functional neuroimaging, we investigated neural responses to social and nonsocial visual scenes in a large sample of participants (n = 48) who varied on an individual difference measure assessing empathy and mentalizing (i.e., empathizing). Compared with other scene categories, viewing natural social scenes activated regions associated with social cognition (e.g., dorsomedial prefrontal cortex and temporal poles). Moreover, activity in these regions during social scene viewing was strongly correlated with individual differences in empathizing. These findings offer neural evidence that observers spontaneously engage in social cognition when viewing complex social material but that the degree to which people do so is mediated by individual differences in trait empathizing.

  9. Underwater Scene Composition

    Science.gov (United States)

    Kim, Nanyoung

    2009-01-01

    In this article, the author describes an underwater scene composition for elementary-education majors. This project deals with watercolor with crayon or oil-pastel resist (medium); the beauty of nature represented by fish in the underwater scene (theme); texture and pattern (design elements); drawing simple forms (drawing skill); and composition…

  10. A Conceptual Characterization of Online Videos Explaining Natural Selection

    Science.gov (United States)

    Bohlin, Gustav; Göransson, Andreas; Höst, Gunnar E.; Tibell, Lena A. E.

    2017-01-01

    Educational videos on the Internet comprise a vast and highly diverse source of information. Online search engines facilitate access to numerous videos claiming to explain natural selection, but little is known about the degree to which the video content match key evolutionary content identified as important in evolution education research. In…

  11. Text Detection in Natural Scene Images by Stroke Gabor Words.

    Science.gov (United States)

    Yi, Chucai; Tian, Yingli

    2011-01-01

    In this paper, we propose a novel algorithm, based on stroke components and descriptive Gabor filters, to detect text regions in natural scene images. Text characters and strings are constructed by stroke components as basic units. Gabor filters are used to describe and analyze the stroke components in text characters or strings. We define a suitability measurement to analyze the confidence of Gabor filters in describing stroke component and the suitability of Gabor filters on an image window. From the training set, we compute a set of Gabor filters that can describe principal stroke components of text by their parameters. Then a K-means algorithm is applied to cluster the descriptive Gabor filters. The clustering centers are defined as Stroke Gabor Words (SGWs) to provide a universal description of stroke components. By suitability evaluation on positive and negative training samples respectively, each SGW generates a pair of characteristic distributions of suitability measurements. On a testing natural scene image, heuristic layout analysis is applied first to extract candidate image windows. Then we compute the principal SGWs for each image window to describe its principal stroke components. Characteristic distributions generated by principal SGWs are used to classify text or nontext windows. Experimental results on benchmark datasets demonstrate that our algorithm can handle complex backgrounds and variant text patterns (font, color, scale, etc.).

  12. Dynamic Textures Modeling via Joint Video Dictionary Learning.

    Science.gov (United States)

    Wei, Xian; Li, Yuanxiang; Shen, Hao; Chen, Fang; Kleinsteuber, Martin; Wang, Zhongfeng

    2017-04-06

    Video representation is an important and challenging task in the computer vision community. In this paper, we consider the problem of modeling and classifying video sequences of dynamic scenes which could be modeled in a dynamic textures (DT) framework. At first, we assume that image frames of a moving scene can be modeled as a Markov random process. We propose a sparse coding framework, named joint video dictionary learning (JVDL), to model a video adaptively. By treating the sparse coefficients of image frames over a learned dictionary as the underlying "states", we learn an efficient and robust linear transition matrix between two adjacent frames of sparse events in time series. Hence, a dynamic scene sequence is represented by an appropriate transition matrix associated with a dictionary. In order to ensure the stability of JVDL, we impose several constraints on such transition matrix and dictionary. The developed framework is able to capture the dynamics of a moving scene by exploring both sparse properties and the temporal correlations of consecutive video frames. Moreover, such learned JVDL parameters can be used for various DT applications, such as DT synthesis and recognition. Experimental results demonstrate the strong competitiveness of the proposed JVDL approach in comparison with state-of-the-art video representation methods. Especially, it performs significantly better in dealing with DT synthesis and recognition on heavily corrupted data.
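
One way to read the model described above is as a generic sparse linear dynamical system. The notation below (frames $y_t$, dictionary $D$, sparse states $x_t$, transition matrix $A$, weights $\lambda$ and $\mu$) is assumed here for illustration; the abstract does not state the paper's actual objective or constraints.

```latex
\begin{aligned}
y_t &\approx D\,x_t, \qquad x_t \ \text{sparse},\\
x_{t+1} &\approx A\,x_t,\\
\min_{D,\,A,\,\{x_t\}} \ &\sum_{t=1}^{T} \lVert y_t - D x_t \rVert_2^2
  + \lambda \sum_{t=1}^{T} \lVert x_t \rVert_1
  + \mu \sum_{t=1}^{T-1} \lVert x_{t+1} - A x_t \rVert_2^2 .
\end{aligned}
```

Under this reading, a stability constraint would keep repeated application of $A$ from blowing up, for example by bounding its spectral radius, $\rho(A) \le 1$; which constraints JVDL actually imposes is not specified in the abstract.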

  13. Accumulating and remembering the details of neutral and emotional natural scenes.

    Science.gov (United States)

    Melcher, David

    2010-01-01

    In contrast to our rich sensory experience with complex scenes in everyday life, the capacity of visual working memory is thought to be quite limited. Here our memory has been examined for the details of naturalistic scenes as a function of display duration, emotional valence of the scene, and delay before test. Individual differences in working memory and long-term memory for pictorial scenes were examined in experiment 1. The accumulation of memory for emotional scenes and the retention of these details in long-term memory were investigated in experiment 2. Although there were large individual differences in performance, memory for scene details generally exceeded the traditional working memory limit within a few seconds. Information about positive scenes was learned most quickly, while negative scenes showed the worst memory for details. The overall pattern of results was consistent with the idea that both short-term and long-term representations are mixed together in a medium-term 'online' memory for scenes.

  14. The scene and the unseen: manipulating photographs for experiments on change blindness and scene memory: image manipulation for change blindness.

    Science.gov (United States)

    Ball, Felix; Elzemann, Anne; Busch, Niko A

    2014-09-01

    The change blindness paradigm, in which participants often fail to notice substantial changes in a scene, is a popular tool for studying scene perception, visual memory, and the link between awareness and attention. Some of the most striking and popular examples of change blindness have been demonstrated with digital photographs of natural scenes; in most studies, however, much simpler displays, such as abstract stimuli or "free-floating" objects, are typically used. Although simple displays have undeniable advantages, natural scenes remain a very useful and attractive stimulus for change blindness research. To assist researchers interested in using natural-scene stimuli in change blindness experiments, we provide here a step-by-step tutorial on how to produce changes in natural-scene images with a freely available image-processing tool (GIMP). We explain how changes in a scene can be made by deleting objects or relocating them within the scene or by changing the color of an object, in just a few simple steps. We also explain how the physical properties of such changes can be analyzed using GIMP and MATLAB (a high-level scientific programming tool). Finally, we present an experiment confirming that scenes manipulated according to our guidelines are effective in inducing change blindness and demonstrating the relationship between change blindness and the physical properties of the change and inter-individual differences in performance measures. We expect that this tutorial will be useful for researchers interested in studying the mechanisms of change blindness, attention, or visual memory using natural scenes.

  15. Contextual analysis of videos

    CERN Document Server

    Thida, Myo; Monekosso, Dorothy

    2013-01-01

    Video context analysis is an active and vibrant research area, which provides means for extracting, analyzing and understanding behavior of a single target and multiple targets. Over the last few decades, computer vision researchers have been working to improve the accuracy and robustness of algorithms to analyse the context of a video automatically. In general, the research work in this area can be categorized into three major topics: 1) counting number of people in the scene 2) tracking individuals in a crowd and 3) understanding behavior of a single target or multiple targets in the scene.

  16. Deep hierarchical attention network for video description

    Science.gov (United States)

    Li, Shuohao; Tang, Min; Zhang, Jun

    2018-03-01

    Pairing video to natural language description remains a challenge in computer vision and machine translation. Inspired by image description, which uses an encoder-decoder model for reducing visual scene into a single sentence, we propose a deep hierarchical attention network for video description. The proposed model uses convolutional neural network (CNN) and bidirectional LSTM network as encoders while a hierarchical attention network is used as the decoder. Compared to encoder-decoder models used in video description, the bidirectional LSTM network can capture the temporal structure among video frames. Moreover, the hierarchical attention network has an advantage over single-layer attention network on global context modeling. To make a fair comparison with other methods, we evaluate the proposed architecture with different types of CNN structures and decoders. Experimental results on the standard datasets show that our model outperforms the state-of-the-art techniques.

  17. Luminance cues constrain chromatic blur discrimination in natural scene stimuli.

    Science.gov (United States)

    Sharman, Rebecca J; McGraw, Paul V; Peirce, Jonathan W

    2013-03-22

    Introducing blur into the color components of a natural scene has very little effect on its percept, whereas blur introduced into the luminance component is very noticeable. Here we quantify the dominance of luminance information in blur detection and examine a number of potential causes. We show that the interaction between chromatic and luminance information is not explained by reduced acuity or spatial resolution limitations for chromatic cues, the effective contrast of the luminance cue, or chromatic and achromatic statistical regularities in the images. Regardless of the quality of chromatic information, the visual system gives primacy to luminance signals when determining edge location. In natural viewing, luminance information appears to be specialized for detecting object boundaries while chromatic information may be used to determine surface properties.

  18. A distributed code for colour in natural scenes derived from centre-surround filtered cone signals

    Directory of Open Access Journals (Sweden)

    Christian Johannes Kellner

    2013-09-01

    Full Text Available In the retina of trichromatic primates, chromatic information is encoded in an opponent fashion and transmitted to the lateral geniculate nucleus (LGN) and visual cortex via parallel pathways. Chromatic selectivities of neurons in the LGN form two separate clusters, corresponding to two classes of cone opponency. In the visual cortex, however, the chromatic selectivities are more distributed, which is in accordance with a population code for colour. Previous studies of cone signals in natural scenes typically found opponent codes with chromatic selectivities corresponding to two directions in colour space. Here we investigated how the nonlinear spatiochromatic filtering in the retina influences the encoding of colour signals. Cone signals were derived from hyperspectral images of natural scenes and pre-processed by centre-surround filtering and rectification, resulting in parallel ON and OFF channels. Independent Component Analysis on these signals yielded a highly sparse code with basis functions that showed spatio-chromatic selectivities. In contrast to previous analyses of linear transformations of cone signals, chromatic selectivities were not restricted to two main chromatic axes, but were more continuously distributed in colour space, similar to the population code of colour in the early visual cortex. Our results indicate that spatiochromatic processing in the retina leads to a more distributed and more efficient code for natural scenes.

  19. Large Capacity of Conscious Access for Incidental Memories in Natural Scenes.

    Science.gov (United States)

    Kaunitz, Lisandro N; Rowe, Elise G; Tsuchiya, Naotsugu

    2016-09-01

    When searching a crowd, people can detect a target face only by direct fixation and attention. Once the target is found, it is consciously experienced and remembered, but what is the perceptual fate of the fixated nontarget faces? Whereas introspection suggests that one may remember nontargets, previous studies have proposed that almost no memory should be retained. Using a gaze-contingent paradigm, we asked subjects to visually search for a target face within a crowded natural scene and then tested their memory for nontarget faces, as well as their confidence in those memories. Subjects remembered up to seven fixated, nontarget faces with more than 70% accuracy. Memory accuracy was correlated with trial-by-trial confidence ratings, which implies that the memory was consciously maintained and accessed. When the search scene was inverted, no more than three nontarget faces were remembered. These findings imply that incidental memory for faces, such as those recalled by eyewitnesses, is more reliable than is usually assumed. © The Author(s) 2016.

  20. Qualitative spatial logic descriptors from 3D indoor scenes to generate explanations in natural language.

    Science.gov (United States)

    Falomir, Zoe; Kluth, Thomas

    2018-05-01

    The challenge of describing real 3D scenes is tackled in this paper using qualitative spatial descriptors. A key point to study is which qualitative descriptors to use and how these qualitative descriptors must be organized to produce a suitable cognitive explanation. In order to find answers, a survey test was carried out with human participants who openly described a scene containing some pieces of furniture. The data obtained in this survey were analysed and, taking them into account, the QSn3D computational approach was developed, which uses an Xbox 360 Kinect to obtain 3D data from a real indoor scene. Object features are computed on these 3D data to identify objects in indoor scenes. The object orientation is computed, and qualitative spatial relations between the objects are extracted. These qualitative spatial relations are the input to a grammar which applies saliency rules obtained from the survey study and generates cognitive natural language descriptions of scenes. Moreover, these qualitative descriptors can be expressed as first-order logical facts in Prolog for further reasoning. Finally, a validation study is carried out to test whether the descriptions provided by the QSn3D approach are human readable. The obtained results show that their acceptability is higher than 82%.

  1. Advanced video coding systems

    CERN Document Server

    Gao, Wen

    2015-01-01

    This comprehensive and accessible text/reference presents an overview of the state of the art in video coding technology. Specifically, the book introduces the tools of the AVS2 standard, describing how AVS2 can help to achieve a significant improvement in coding efficiency for future video networks and applications by incorporating smarter coding tools such as scene video coding. Topics and features: introduces the basic concepts in video coding, and presents a short history of video coding technology and standards; reviews the coding framework, main coding tools, and syntax structure of AV

  2. Color constancy in a scene with bright colors that do not have a fully natural surface appearance.

    Science.gov (United States)

    Fukuda, Kazuho; Uchikawa, Keiji

    2014-04-01

    Theoretical and experimental approaches have proposed that color constancy involves a correction related to some average of stimulation over the scene, and some of the studies showed that the average gives greater weight to surrounding bright colors. However, in a natural scene, high-luminance elements do not necessarily carry information about the scene illuminant when the luminance is too high for it to appear as a natural object color. The question is how a surrounding color's appearance mode influences its contribution to the degree of color constancy. Here the stimuli were simple geometric patterns, and the luminance of surrounding colors was tested over the range beyond the luminosity threshold. Observers performed perceptual achromatic setting on the test patch in order to measure the degree of color constancy and evaluated the surrounding bright colors' appearance mode. Broadly, our results support the assumption that the visual system counts only the colors in the object-color appearance for color constancy. However, detailed analysis indicated that surrounding colors without a fully natural object-color appearance had some sort of influence on color constancy. Consideration of this contribution of unnatural object color might be important for precise modeling of human color constancy.

  3. Electronic evaluation for video commercials by impression index.

    Science.gov (United States)

    Kong, Wanzeng; Zhao, Xinxin; Hu, Sanqing; Vecchiato, Giovanni; Babiloni, Fabio

    2013-12-01

    Evaluating the effect of commercials is important in neuromarketing. In this paper, we propose an electronic way to evaluate the influence of video commercials on consumers by means of an impression index. The impression index combines the memorization and attention indices obtained by tracking EEG activity while consumers watch video commercials. It extracts features from scalp EEG to evaluate the effectiveness of video commercials in the time-frequency-space domain. The general global field power was used as an impression index for evaluating video commercial scenes as time series. Experimental results demonstrate that the proposed approach is able to track variations of the cerebral activity related to cognitive tasks such as observing video commercials, and helps to judge from EEG signals whether a scene in a video commercial is impressive or not.
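
    As a rough illustration of the quantity mentioned above, global field power is commonly defined as the spatial standard deviation of the scalp potential across electrodes at each time sample. The sketch below assumes that textbook definition; the function name and the array layout (channels by samples) are illustrative choices, and the abstract does not say how the authors turned this signal into their impression index.

```python
import numpy as np

def global_field_power(eeg):
    """Global field power per time sample.

    eeg: array of shape (n_channels, n_samples), one row per electrode.
    Returns an array of shape (n_samples,).
    """
    eeg = np.asarray(eeg, dtype=float)
    # Re-reference every sample to the average over electrodes.
    centered = eeg - eeg.mean(axis=0, keepdims=True)
    # Root mean square over electrodes = spatial standard deviation.
    return np.sqrt((centered ** 2).mean(axis=0))
```

    A time series computed this way is large when the potential differs strongly between electrodes and zero when every electrode reads the same value.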

  4. Explaining the Timing of Natural Scene Understanding with a Computational Model of Perceptual Categorization

    Science.gov (United States)

    Sofer, Imri; Crouzet, Sébastien M.; Serre, Thomas

    2015-01-01

    Observers can rapidly perform a variety of visual tasks such as categorizing a scene as open, as outdoor, or as a beach. Although we know that different tasks are typically associated with systematic differences in behavioral responses, to date, little is known about the underlying mechanisms. Here, we implemented a single integrated paradigm that links perceptual processes with categorization processes. Using a large image database of natural scenes, we trained machine-learning classifiers to derive quantitative measures of task-specific perceptual discriminability based on the distance between individual images and different categorization boundaries. We showed that the resulting discriminability measure accurately predicts variations in behavioral responses across categorization tasks and stimulus sets. We further used the model to design an experiment, which challenged previous interpretations of the so-called “superordinate advantage.” Overall, our study suggests that observed differences in behavioral responses across rapid categorization tasks reflect natural variations in perceptual discriminability. PMID:26335683

  5. Visual Alphabets: Video classification by end users

    NARCIS (Netherlands)

    Israël, Menno; van den Broek, Egon; van der Putten, Peter; den Uyl, Marten J.; Petrushin, Valery A.; Khan, Latifur

    2007-01-01

    The work presented here introduces a real-time automatic scene classifier within content-based video retrieval. In our envisioned approach end users like documentalists, not image processing experts, build classifiers interactively, by simply indicating positive examples of a scene. Classification

  6. Video context-dependent recall.

    Science.gov (United States)

    Smith, Steven M; Manzano, Isabel

    2010-02-01

    In two experiments, we used an effective new method for experimentally manipulating local and global contexts to examine context-dependent recall. The method included video-recorded scenes of real environments, with target words superimposed over the scenes. In Experiment 1, we used a within-subjects manipulation of video contexts and compared the effects of reinstatement of a global context (15 words per context) with effects of less overloaded context cues (1 and 3 words per context) on recall. The size of the reinstatement effects in Experiment 1 shows how potently video contexts can cue recall. A strong effect of cue overload was also found; reinstatement effects were smaller, but still quite robust, in the 15 words per context condition. The powerful reinstatement effect was replicated for local contexts in Experiment 2, which included a no-contexts-reinstated group, a control condition used to determine whether reinstatement of half of the cues caused biased output interference for uncued targets. The video context method is a potent way to investigate context-dependent memory.

  7. Perceptual geometry of space and form: visual perception of natural scenes and their virtual representation

    Science.gov (United States)

    Assadi, Amir H.

    2001-11-01

    Perceptual geometry is an emerging field of interdisciplinary research whose objectives focus on study of geometry from the perspective of visual perception, and in turn, apply such geometric findings to the ecological study of vision. Perceptual geometry attempts to answer fundamental questions in perception of form and representation of space through synthesis of cognitive and biological theories of visual perception with geometric theories of the physical world. Perception of form and space are among fundamental problems in vision science. In recent cognitive and computational models of human perception, natural scenes are used systematically as preferred visual stimuli. Among key problems in perception of form and space, we have examined perception of geometry of natural surfaces and curves, e.g. as in the observer's environment. Besides a systematic mathematical foundation for a remarkably general framework, the advantages of the Gestalt theory of natural surfaces include a concrete computational approach to simulate or recreate images whose geometric invariants and quantities might be perceived and estimated by an observer. The latter is at the very foundation of understanding the nature of perception of space and form, and the (computer graphics) problem of rendering scenes to visually invoke virtual presence.

  8. Error Detection, Factorization and Correction for Multi-View Scene Reconstruction from Aerial Imagery

    Energy Technology Data Exchange (ETDEWEB)

    Hess-Flores, Mauricio [Univ. of California, Davis, CA (United States)

    2011-11-10

    Scene reconstruction from video sequences has become a prominent computer vision research area in recent years, due to its large number of applications in fields such as security, robotics and virtual reality. Despite recent progress in this field, there are still a number of issues that manifest as incomplete, incorrect or computationally-expensive reconstructions. The engine behind achieving reconstruction is the matching of features between images, where common conditions such as occlusions, lighting changes and texture-less regions can all affect matching accuracy. Subsequent processes that rely on matching accuracy, such as camera parameter estimation, structure computation and non-linear parameter optimization, are also vulnerable to additional sources of error, such as degeneracies and mathematical instability. Detection and correction of errors, along with robustness in parameter solvers, are a must in order to achieve a very accurate final scene reconstruction. However, error detection is in general difficult due to the lack of ground-truth information about the given scene, such as the absolute position of scene points or GPS/IMU coordinates for the camera(s) viewing the scene. In this dissertation, methods are presented for the detection, factorization and correction of error sources present in all stages of a scene reconstruction pipeline from video, in the absence of ground-truth knowledge. Two main applications are discussed. The first set of algorithms derive total structural error measurements after an initial scene structure computation and factorize errors into those related to the underlying feature matching process and those related to camera parameter estimation. A brute-force local correction of inaccurate feature matches is presented, as well as an improved conditioning scheme for non-linear parameter optimization which applies weights on input parameters in proportion to estimated camera parameter errors. Another application is in

  9. Veterans Crisis Line: Videos About Reaching out for Help

    Medline Plus

    Full Text Available ... v/K5u3sb-Dbkc Watch additional videos about getting help. Behind the Scenes see more videos from Veterans Health Administration Be There: Help Save a Life see more videos from Veterans ...

  10. NATURE VIDEO WATCHING: CONSEQUENCES ON ANGER AND ANXIETY

    Directory of Open Access Journals (Sweden)

    Nicoleta Răban-Motounu

    2017-12-01

    Full Text Available Extensive research has been conducted on the effects of the natural environment on people’s well-being, starting with the short-term restoring effects on the brain, and continuing with the long-term effects on emotional self-regulation processes. In the present research we focused on the latter, trying to connect two of the problems in our world: violent behavior and the preservation of the natural environment. Thus, the objective was to study the effects of watching a wildlife nature video on anger (the feeling and its expression) and state anxiety. The statistical analysis indicated that, while there were no significant differences regarding anxiety (worry, internal tension) or general mechanisms in dealing with fury, watching the video significantly decreased the feeling of anger and the tendency to express it either verbally or physically. As a main conclusion we highlight the link between the accessibility of the natural environment and violent expressions of anger.

  11. Joint Rendering and Segmentation of Free-Viewpoint Video

    Directory of Open Access Journals (Sweden)

    Ishii Masato

    2010-01-01

    Full Text Available This paper presents a method that jointly performs synthesis and object segmentation of free-viewpoint video using multiview video as the input. This method is designed to achieve robust segmentation from online video input without per-frame user interaction and precomputations. This method shares a calculation process between the synthesis and segmentation steps; the matching costs calculated through the synthesis step are adaptively fused with other cues depending on the reliability in the segmentation step. Since the segmentation is performed for arbitrary viewpoints directly, the extracted object can be superimposed onto another 3D scene with geometric consistency. We can observe that the object and new background move naturally along with the viewpoint change as if they existed together in the same space. In the experiments, our method can process online video input captured by a 25-camera array and show the result image at 4.55 fps.

  12. Veterans Crisis Line: Videos About Reaching out for Help

    Medline Plus

    Full Text Available ... out for help. Bittersweet More Videos from Veterans Health Administration Embedded YouTube video: https://www.youtube.com/ ... Behind the Scenes see more videos from Veterans Health Administration Be There: Help Save a Life see ...

  13. Veterans Crisis Line: Videos About Reaching out for Help

    Medline Plus

    Full Text Available ... for help. Bittersweet More Videos from Veterans Health Administration Embedded YouTube video: https://www.youtube.com/v/ ... the Scenes see more videos from Veterans Health Administration Be There: Help Save a Life see more ...

  14. Immersive video

    Science.gov (United States)

    Moezzi, Saied; Katkere, Arun L.; Jain, Ramesh C.

    1996-03-01

    Interactive video and television viewers should have the power to control their viewing position. To make this a reality, we introduce the concept of Immersive Video, which employs computer vision and computer graphics technologies to provide remote users a sense of complete immersion when viewing an event. Immersive Video uses multiple videos of an event, captured from different perspectives, to generate a full 3D digital video of that event. That is accomplished by assimilating important information from each video stream into a comprehensive, dynamic, 3D model of the environment. Using this 3D digital video, interactive viewers can then move around the remote environment and observe the events taking place from any desired perspective. Our Immersive Video System currently provides interactive viewing and `walkthrus' of staged karate demonstrations, basketball games, dance performances, and typical campus scenes. In its full realization, Immersive Video will be a paradigm shift in visual communication which will revolutionize television and video media, and become an integral part of future telepresence and virtual reality systems.

  15. Detecting anomalies in crowded scenes via locality-constrained affine subspace coding

    Science.gov (United States)

    Fan, Yaxiang; Wen, Gongjian; Qiu, Shaohua; Li, Deren

    2017-07-01

    Video anomaly detection is the process of finding abnormal events that deviate from the majority of normal or usual events. The main challenges are the high structural redundancy and the dynamic changes in the scenes in surveillance videos. To address these problems, we present a framework for anomaly detection and localization in videos that is based on locality-constrained affine subspace coding (LASC) and a model updating procedure. In our algorithm, LASC attempts to reconstruct the test sample by its top-k nearest subspaces, which are obtained by segmenting the normal sample space using a clustering method. A sample whose reconstruction cost exceeds a threshold is detected as abnormal. To adapt to scene changes over time, a model updating strategy is proposed. We experiment on two public datasets: the UCSD dataset and the Avenue dataset. The results demonstrate that our method achieves competitive performance at 700 fps on a single desktop PC.
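
    The reconstruction idea described above can be sketched roughly as follows. This is a simplified stand-in, not the LASC formulation itself: it clusters normal samples with a plain k-means loop, fits one affine subspace (mean plus principal directions) per cluster, and scores a test sample by the smallest residual among its k nearest subspaces. All function names, the choice of k-means, and the parameter values are assumptions made for illustration.

```python
import numpy as np

def fit_affine_subspaces(normal, n_clusters=4, dim=2, n_iter=20, seed=0):
    """Cluster normal samples and fit one affine subspace per cluster."""
    rng = np.random.default_rng(seed)
    centers = normal[rng.choice(len(normal), n_clusters, replace=False)]
    for _ in range(n_iter):
        dist = ((normal[:, None, :] - centers[None]) ** 2).sum(-1)
        labels = dist.argmin(1)
        for c in range(n_clusters):
            if np.any(labels == c):
                centers[c] = normal[labels == c].mean(0)
    # Final assignment against the converged centers.
    labels = ((normal[:, None, :] - centers[None]) ** 2).sum(-1).argmin(1)
    subspaces = []
    for c in range(n_clusters):
        pts = normal[labels == c]
        if len(pts) == 0:
            continue
        mu = pts.mean(0)
        # Rows of vt are principal directions of the cluster.
        _, _, vt = np.linalg.svd(pts - mu, full_matrices=False)
        subspaces.append((mu, vt[:dim]))
    return subspaces

def reconstruction_cost(x, subspaces, k=2):
    """Smallest residual of x over its k nearest affine subspaces."""
    nearest = sorted(subspaces, key=lambda s: np.linalg.norm(x - s[0]))[:k]
    costs = []
    for mu, basis in nearest:
        r = x - mu
        residual = r - basis.T @ (basis @ r)  # part outside the subspace
        costs.append(np.linalg.norm(residual))
    return min(costs)

def is_abnormal(x, subspaces, threshold, k=2):
    return reconstruction_cost(x, subspaces, k) > threshold
```

    In this sketch the threshold is left to the caller; the abstract does not state how it is chosen, nor how the model is updated over time.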

  16. Video segmentation for post-production

    Science.gov (United States)

    Wills, Ciaran

    2001-12-01

    Specialist post-production is an industry that has much to gain from the application of content-based video analysis techniques. However, the types of material handled in specialist post-production, such as television commercials, pop music videos and special effects, are quite different in nature from the typical broadcast material which many video analysis techniques are designed to work with; shots are short and highly dynamic, and the transitions are often novel or ambiguous. We address the problem of scene change detection and develop a new algorithm which tackles some of the common aspects of post-production material that cause difficulties for past algorithms, such as illumination changes and jump cuts. Operating in the compressed domain on Motion JPEG compressed video, our algorithm detects cuts and fades by analyzing each JPEG macroblock in the context of its temporal and spatial neighbors. Analyzing the DCT coefficients directly, we can extract the mean color of a block and an approximate detail level. We can also perform an approximated cross-correlation between two blocks. The algorithm is part of a set of tools being developed to work with an automated asset management system designed specifically for use in post-production facilities.
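
    To see why DCT coefficients give direct access to a block's mean and detail, consider the following minimal sketch. It computes an orthonormal 2D DCT of a square block with NumPy, reads the mean from the DC coefficient, and uses the summed squared AC coefficients as a detail measure. The AC-energy measure and the function names are assumptions for illustration; the abstract does not specify how the authors approximate the detail level, and a real Motion JPEG decoder would read quantized coefficients from the bitstream instead of recomputing them.

```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal DCT-II basis as an n x n matrix."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    m[0, :] = np.sqrt(1.0 / n)
    return m

def block_features(block):
    """Return (mean value, AC energy) of a square pixel block."""
    block = np.asarray(block, dtype=float)
    n = block.shape[0]
    c = dct_matrix(n)
    coeffs = c @ block @ c.T
    # For the orthonormal 2D DCT the DC term equals n times the mean.
    mean_value = coeffs[0, 0] / n
    # Energy in all non-DC coefficients: zero for a flat block.
    ac_energy = (coeffs ** 2).sum() - coeffs[0, 0] ** 2
    return mean_value, ac_energy
```

    By Parseval's relation the AC energy equals the summed squared deviation of the pixels from their mean, so a flat block scores zero and a textured block scores high.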

  17. [Perception of objects and scenes in age-related macular degeneration].

    Science.gov (United States)

    Tran, T H C; Boucart, M

    2012-01-01

    Vision related quality of life questionnaires suggest that patients with AMD exhibit difficulties in finding objects and in mobility. In the natural environment, objects seldom appear in isolation. They appear in a spatial context which may obscure them in part or place obstacles in the patient's path. Furthermore, the luminance of a natural scene varies as a function of the hour of the day and the light source, which can alter perception. This study aims to evaluate recognition of objects and natural scenes by patients with AMD, by using photographs of such scenes. Studies demonstrate that AMD patients are able to categorize scenes as nature scenes or urban scenes and to discriminate indoor from outdoor scenes with a high degree of precision. They detect objects better in isolation, in color, or against a white background than in their natural contexts. These patients encounter more difficulties than normally sighted individuals in detecting objects in a low-contrast, black-and-white scene. These results may have implications for rehabilitation, for layout of texts and magazines for the reading-impaired and for the rearrangement of the spatial environment of older AMD patients in order to facilitate mobility, finding objects and reducing the risk of falls. Copyright © 2011 Elsevier Masson SAS. All rights reserved.

  18. The Legal Nature of Video Games – Adapting Copyright Law to Multimedia

    Directory of Open Access Journals (Sweden)

    Julian Simon Stein

    2015-06-01

    Full Text Available In Copyright Law, video games are still a contentious matter. The multimedia nature of games brings up the question on how to define their legal nature. While there are several original underlying works in video games such as computer programs, artistic works, musical works, dramatic works etc., video games enjoy protection as films or audiovisual works respectively in many jurisdictions, making video games an arrangement of a multiplicity of works. However, some have argued to define video games as a single 'multimedia work' rather than a product of many works of copyright. This article analyses the different types of original and derivative works contained in video games before evaluating the necessity and feasibility of a multimedia category of work, arguing in favour of the current system.

  19. Evidence for similar patterns of neural activity elicited by picture- and word-based representations of natural scenes.

    Science.gov (United States)

    Kumar, Manoj; Federmeier, Kara D; Fei-Fei, Li; Beck, Diane M

    2017-07-15

    A long-standing core question in cognitive science is whether different modalities and representation types (pictures, words, sounds, etc.) access a common store of semantic information. Although different input types have been shown to activate a shared network of brain regions, this does not necessitate that there is a common representation, as the neurons in these regions could still differentially process the different modalities. However, multi-voxel pattern analysis can be used to assess whether, e.g., pictures and words evoke a similar pattern of activity, such that the patterns that separate categories in one modality transfer to the other. Prior work using this method has found support for a common code, but has two limitations: studies have either examined only disparate categories (e.g. animals vs. tools) that are known to activate different brain regions, raising the possibility that the pattern separation and inferred similarity reflect only large-scale differences between the categories, or have been limited to individual object representations. By using natural scene categories, we not only extend the current literature on cross-modal representations beyond objects, but also, because natural scene categories activate a common set of brain regions, we identify a more fine-grained (i.e. higher spatial resolution) common representation. Specifically, we studied picture- and word-based representations of natural scene stimuli from four different categories: beaches, cities, highways, and mountains. Participants passively viewed blocks of either phrases (e.g. "sandy beach") describing scenes or photographs from those same scene categories. To determine whether the phrases and pictures evoke a common code, we asked whether a classifier trained on one stimulus type (e.g. phrase stimuli) would transfer (i.e. cross-decode) to the other stimulus type (e.g. picture stimuli). The analysis revealed cross-decoding in the occipitotemporal, posterior parietal and
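
    The cross-decoding logic described above (train on one stimulus type, test on the other) can be sketched as follows. The excerpt does not name the classifier used in the study, so this example substitutes a simple nearest-centroid classifier; the function names and the layout of the pattern arrays (trials by voxels) are likewise assumptions.

```python
import numpy as np

def fit_centroids(patterns, labels):
    """Mean activity pattern per category."""
    classes = np.unique(labels)
    centroids = np.stack([patterns[labels == c].mean(0) for c in classes])
    return classes, centroids

def predict(patterns, classes, centroids):
    """Assign each pattern to the category with the closest centroid."""
    dist = ((patterns[:, None, :] - centroids[None]) ** 2).sum(-1)
    return classes[dist.argmin(1)]

def cross_decode(train_x, train_y, test_x, test_y):
    """Accuracy of a classifier trained on one modality, tested on another."""
    classes, centroids = fit_centroids(train_x, train_y)
    return float((predict(test_x, classes, centroids) == test_y).mean())
```

    Accuracy above chance (0.25 for four categories) when training on phrase patterns and testing on picture patterns, or the reverse, is what would indicate a shared code; a real analysis would also need a significance test against shuffled labels.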

  20. Moving through a multiplex holographic scene

    Science.gov (United States)

    Mrongovius, Martina

    2013-02-01

    This paper explores how movement can be used as a compositional element in installations of multiplex holograms. My holographic images are created from montages of hand-held video and photo-sequences. These spatially dynamic compositions are visually complex but anchored to landmarks and hints of the capturing process - such as the appearance of the photographer's shadow - to establish a sense of connection to the holographic scene. Moving around in front of the hologram, the viewer animates the holographic scene. A perception of motion then results from the viewer's bodily awareness of physical motion and the visual reading of dynamics within the scene or movement of perspective through a virtual suggestion of space. By linking and transforming the physical motion of the viewer with the visual animation, the viewer's bodily awareness - including proprioception, balance and orientation - play into the holographic composition. How multiplex holography can be a tool for exploring coupled, cross-referenced and transformed perceptions of movement is demonstrated with a number of holographic image installations. Through this process I expanded my creative composition practice to consider how dynamic and spatial scenes can be conveyed through the fragmented view of a multiplex hologram. This body of work was developed through an installation art practice and was the basis of my recently completed doctoral thesis: 'The Emergent Holographic Scene — compositions of movement and affect using multiplex holographic images'.

  1. Affective Ranking of Movie Scenes Using Physiological Signals and Content Analysis

    OpenAIRE

    Soleymani, Mohammad; Chanel, Guillaume; Kierkels, Joep Johannes Maria; Pun, Thierry

    2008-01-01

    In this paper, we propose an approach for affective ranking of movie scenes based on the emotions that are actually felt by spectators. Such a ranking can be used for characterizing the affective, or emotional, content of video clips. The ranking can for instance help determine which video clip from a database elicits, for a given user, the most joy. This in turn will permit video indexing and retrieval based on affective criteria corresponding to a personalized user affective profile. A datas...

  2. A Conceptual Characterization of Online Videos Explaining Natural Selection

    Science.gov (United States)

    Bohlin, Gustav; Göransson, Andreas; Höst, Gunnar E.; Tibell, Lena A. E.

    2017-11-01

    Educational videos on the Internet comprise a vast and highly diverse source of information. Online search engines facilitate access to numerous videos claiming to explain natural selection, but little is known about the degree to which the video content matches key evolutionary content identified as important in evolution education research. In this study, we therefore analyzed the content of 60 videos accessed through the Internet, using a criteria catalog with 38 operationalized variables derived from research literature. The variables were sorted into four categories: (a) key concepts (e.g. limited resources and inherited variation), (b) threshold concepts (abstract concepts with a transforming and integrative function), (c) misconceptions (e.g. that evolution is driven by need), and (d) organismal context (e.g. animal or plant). The results indicate that some concepts are frequently communicated, and certain taxa are commonly used to illustrate concepts, while others are seldom included. In addition, evolutionary phenomena at small temporal and spatial scales, such as subcellular processes, are rarely covered. Rather, the focus is on population-level events over time scales spanning years or longer. This is consistent with an observed lack of explanations regarding how randomly occurring mutations provide the basis for variation (and thus natural selection). The findings imply, among other things, that some components of natural selection warrant far more attention in biology teaching and science education research.

  3. A gaze-contingent display to study contrast sensitivity under natural viewing conditions

    Science.gov (United States)

    Dorr, Michael; Bex, Peter J.

    2011-03-01

    Contrast sensitivity has been extensively studied over the last decades and there are well-established models of early vision that were derived by presenting the visual system with synthetic stimuli such as sine-wave gratings near threshold contrasts. Natural scenes, however, contain a much wider distribution of orientations, spatial frequencies, and both luminance and contrast values. Furthermore, humans typically move their eyes two to three times per second under natural viewing conditions, but most laboratory experiments require subjects to maintain central fixation. We here describe a gaze-contingent display capable of performing real-time contrast modulations of video in retinal coordinates, thus allowing us to study contrast sensitivity when dynamically viewing dynamic scenes. Our system is based on a Laplacian pyramid for each frame that efficiently represents individual frequency bands. Each output pixel is then computed as a locally weighted sum of pyramid levels to introduce local contrast changes as a function of gaze. Our GPU implementation achieves real-time performance with more than 100 fps on high-resolution video (1920 by 1080 pixels) and a synthesis latency of only 1.5 ms. Psychophysical data show that contrast sensitivity is greatly decreased in natural videos and under dynamic viewing conditions. Synthetic stimuli therefore only poorly characterize natural vision.
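
    The "locally weighted sum of pyramid levels" can be illustrated with a small CPU sketch. For brevity it uses a non-subsampled band stack built from box blurs instead of a true Laplacian pyramid with downsampling, so it is a simplification of the structure named in the abstract; the function names, the blur kernel and the number of levels are assumptions. Each band isolates one range of spatial frequencies, and per-pixel weight maps (which a gaze-contingent system would derive from the current gaze position) scale each band before summation.

```python
import numpy as np

def box_blur(img, radius):
    """Mean filter with a (2*radius+1)^2 window and edge replication."""
    k = 2 * radius + 1
    padded = np.pad(img, radius, mode="edge")
    out = np.zeros_like(img, dtype=float)
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (k * k)

def band_stack(img, n_levels=3):
    """Split an image into band-pass layers plus a low-pass residual."""
    bands, current = [], np.asarray(img, dtype=float)
    for level in range(n_levels):
        blurred = box_blur(current, 2 ** level)
        bands.append(current - blurred)  # detail lost by this blur
        current = blurred
    bands.append(current)  # low-pass residual
    return bands

def reweight(bands, weights):
    """Per-pixel weighted sum of bands; weights has one map per band."""
    return sum(w * b for w, b in zip(weights, bands))
```

    With all weights equal to one the sum telescopes back to the input frame; lowering the weights of the detail bands in a region reduces local contrast there.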

  4. The time course of natural scene perception with reduced attention

    NARCIS (Netherlands)

    Groen, I.I.A.; Ghebreab, S.; Lamme, V.A.F.; Scholte, H.S.

    Attention is thought to impose an informational bottleneck on vision by selecting particular information from visual scenes for enhanced processing. Behavioral evidence suggests, however, that some scene information is extracted even when attention is directed elsewhere. Here, we investigated the

  5. Detection of Visual Events in Underwater Video Using a Neuromorphic Saliency-based Attention System

    Science.gov (United States)

    Edgington, D. R.; Walther, D.; Cline, D. E.; Sherlock, R.; Salamy, K. A.; Wilson, A.; Koch, C.

    2003-12-01

    The Monterey Bay Aquarium Research Institute (MBARI) uses high-resolution video equipment on remotely operated vehicles (ROV) to obtain quantitative data on the distribution and abundance of oceanic animals. High-quality video data supplants the traditional approach of assessing the kinds and numbers of animals in the oceanic water column through towing collection nets behind ships. Tow nets are limited in spatial resolution, and often destroy abundant gelatinous animals resulting in species undersampling. Video camera-based quantitative video transects (QVT) are taken through the ocean midwater, from 50m to 4000m, and provide high-resolution data at the scale of the individual animals and their natural aggregation patterns. However, the current manual method of analyzing QVT video by trained scientists is labor intensive and poses a serious limitation to the amount of information that can be analyzed from ROV dives. Presented here is an automated system for detecting marine animals (events) visible in the videos. Automated detection is difficult due to the low contrast of many translucent animals and due to debris ("marine snow") cluttering the scene. Video frames are processed with an artificial intelligence attention selection algorithm that has proven a robust means of target detection in a variety of natural terrestrial scenes. The candidate locations identified by the attention selection module are tracked across video frames using linear Kalman filters. Typically, the occurrence of visible animals in the video footage is sparse in space and time. A notion of "boring" video frames is developed by detecting whether or not there is an interesting candidate object for an animal present in a particular sequence of underwater video -- video frames that do not contain any "interesting" events. If objects can be tracked successfully over several frames, they are stored as potentially "interesting" events. Based on low-level properties, interesting events are
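
    The tracking step mentioned in this record, following candidate locations across frames with linear Kalman filters, might look roughly like the sketch below. It tracks a single coordinate with a constant-velocity model; the class name, the noise values, and the motion model are assumptions for illustration and are not taken from the MBARI system.

```python
class ConstantVelocityKalman:
    """Linear Kalman filter for one coordinate with state [position, velocity]."""

    def __init__(self, position, dt=1.0, process_noise=1e-3,
                 measurement_noise=1.0, initial_variance=100.0):
        self.position = position
        self.velocity = 0.0
        self.dt = dt
        self.q = process_noise      # assumed process noise added to both variances
        self.r = measurement_noise  # assumed variance of the detector's position
        # Covariance matrix entries P = [[p00, p01], [p10, p11]].
        self.p00 = initial_variance
        self.p01 = 0.0
        self.p10 = 0.0
        self.p11 = initial_variance

    def predict(self):
        """Advance the state one frame: x = F x, P = F P F^T + Q."""
        dt = self.dt
        self.position += dt * self.velocity
        p00 = self.p00 + dt * (self.p10 + self.p01) + dt * dt * self.p11 + self.q
        p01 = self.p01 + dt * self.p11
        p10 = self.p10 + dt * self.p11
        p11 = self.p11 + self.q
        self.p00, self.p01, self.p10, self.p11 = p00, p01, p10, p11

    def update(self, measured_position):
        """Correct the prediction with a measured position (H = [1, 0])."""
        innovation = measured_position - self.position
        s = self.p00 + self.r
        k0 = self.p00 / s
        k1 = self.p10 / s
        self.position += k0 * innovation
        self.velocity += k1 * innovation
        # P = (I - K H) P, computed from the pre-update entries.
        p00 = (1 - k0) * self.p00
        p01 = (1 - k0) * self.p01
        p10 = self.p10 - k1 * self.p00
        p11 = self.p11 - k1 * self.p01
        self.p00, self.p01, self.p10, self.p11 = p00, p01, p10, p11
```

    In use, each frame calls `predict()` and then `update()` with the detected position; an object that is missed in a frame can be carried forward by calling `predict()` alone.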

  6. Toward 3D-IPTV: design and implementation of a stereoscopic and multiple-perspective video streaming system

    Science.gov (United States)

    Petrovic, Goran; Farin, Dirk; de With, Peter H. N.

    2008-02-01

    3D-Video systems allow a user to perceive depth in the viewed scene and to display the scene from arbitrary viewpoints interactively and on-demand. This paper presents a prototype implementation of a 3D-video streaming system using an IP network. The architecture of our streaming system is layered, where each information layer conveys a single coded video signal or coded scene-description data. We demonstrate the benefits of a layered architecture with two examples: (a) stereoscopic video streaming, (b) monoscopic video streaming with remote multiple-perspective rendering. Our implementation experiments confirm that prototyping 3D-video streaming systems is possible with today's software and hardware. Furthermore, our current operational prototype demonstrates that highly heterogeneous clients can coexist in the system, ranging from auto-stereoscopic 3D displays to resource-constrained mobile devices.

  7. Changing scenes: memory for naturalistic events following change blindness.

    Science.gov (United States)

    Mäntylä, Timo; Sundström, Anna

    2004-11-01

    Research on scene perception indicates that viewers often fail to detect large changes to scene regions when these changes occur during a visual disruption such as a saccade or a movie cut. In two experiments, we examined whether this relative inability to detect changes would produce systematic biases in event memory. In Experiment 1, participants decided whether two successively presented images were the same or different, followed by a memory task, in which they recalled the content of the viewed scene. In Experiment 2, participants viewed a short video, in which an actor carried out a series of daily activities, and central scenes' attributes were changed during a movie cut. A high degree of change blindness was observed in both experiments, and these effects were related to scene complexity (Experiment 1) and level of retrieval support (Experiment 2). Most important, participants reported the changed, rather than the initial, event attributes following a failure in change detection. These findings suggest that attentional limitations during encoding contribute to biases in episodic memory.

  8. Intelligent keyframe extraction for video printing

    Science.gov (United States)

    Zhang, Tong

    2004-10-01

    Nowadays most digital cameras have the functionality of taking short video clips, with the length of video ranging from several seconds to a couple of minutes. The purpose of this research is to develop an algorithm which extracts an optimal set of keyframes from each short video clip so that the user could obtain proper video frames to print out. In current video printing systems, keyframes are normally obtained by evenly sampling the video clip over time. Such an approach, however, may not reflect highlights or regions of interest in the video. Keyframes derived in this way may also be improper for video printing in terms of either content or image quality. In this paper, we present an intelligent keyframe extraction approach to derive an improved keyframe set by performing semantic analysis of the video content. For a video clip, a number of video and audio features are analyzed to first generate a candidate keyframe set. These features include accumulative color histogram and color layout differences, camera motion estimation, moving object tracking, face detection and audio event detection. Then, the candidate keyframes are clustered and evaluated to obtain a final keyframe set. The objective is to automatically generate a limited number of keyframes to show different views of the scene; to show different people and their actions in the scene; and to tell the story in the video shot. Moreover, frame extraction for video printing, which is a rather subjective problem, is considered in this work for the first time, and a semi-automatic approach is proposed.
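
    One of the features listed in this record, color histogram difference, can be sketched as a candidate-keyframe selector. The version below is a simplification under stated assumptions: frames are flat lists of 8-bit grayscale values rather than color images, the bin count and threshold are arbitrary, and each frame is compared with the most recently selected keyframe.

```python
def histogram(frame, bins=4, max_value=256):
    """Normalized intensity histogram of a frame given as a flat list of pixels."""
    counts = [0] * bins
    for value in frame:
        counts[min(value * bins // max_value, bins - 1)] += 1
    total = float(len(frame))
    return [c / total for c in counts]


def histogram_distance(h1, h2):
    """L1 distance between two normalized histograms (0 = identical, 2 = disjoint)."""
    return sum(abs(a - b) for a, b in zip(h1, h2))


def candidate_keyframes(frames, threshold=0.5):
    """Keep frame 0 plus each frame whose histogram departs from the last pick."""
    if not frames:
        return []
    picks = [0]
    reference = histogram(frames[0])
    for index in range(1, len(frames)):
        current = histogram(frames[index])
        if histogram_distance(current, reference) > threshold:
            picks.append(index)
            reference = current
    return picks
```

    The other cues named in the abstract (camera motion, object tracking, face and audio event detection) and the later clustering stage are outside the scope of this sketch.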

  9. A video event trigger for high frame rate, high resolution video technology

    Science.gov (United States)

    Williams, Glenn L.

    1991-12-01

    When video replaces film the digitized video data accumulates very rapidly, leading to a difficult and costly data storage problem. One solution exists for cases when the video images represent continuously repetitive 'static scenes' containing negligible activity, occasionally interrupted by short events of interest. Minutes or hours of redundant video frames can be ignored, and not stored, until activity begins. A new, highly parallel digital state machine generates a digital trigger signal at the onset of a video event. High capacity random access memory storage coupled with newly available fuzzy logic devices permits the monitoring of a video image stream for long term or short term changes caused by spatial translation, dilation, appearance, disappearance, or color change in a video object. Pretrigger and post-trigger storage techniques are then adaptable for archiving the digital stream from only the significant video images.
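
    The pretrigger and post-trigger storage idea in this record can be sketched in software. The paper describes a parallel digital state machine with fuzzy logic devices; the sketch below swaps in a plain frame-difference threshold as the trigger, and the function names and parameters are hypothetical.

```python
from collections import deque


def mean_abs_diff(frame_a, frame_b):
    """Average absolute pixel difference between two equally sized frames."""
    return sum(abs(a - b) for a, b in zip(frame_a, frame_b)) / len(frame_a)


def record_events(frames, threshold, pre=2, post=2):
    """Return the indices of frames worth archiving.

    Quiet frames wait in a ring buffer of size `pre`. When activity is
    detected the buffer is flushed (pretrigger frames), the active frame is
    kept, and the next `post` quiet frames are kept as well.
    """
    kept = []
    history = deque(maxlen=pre)  # most recent quiet frame indices
    remaining_post = 0
    previous = None
    for index, frame in enumerate(frames):
        active = previous is not None and mean_abs_diff(frame, previous) > threshold
        if active:
            kept.extend(history)
            history.clear()
            kept.append(index)
            remaining_post = post
        elif remaining_post > 0:
            kept.append(index)
            remaining_post -= 1
        else:
            history.append(index)
        previous = frame
    return kept
```

    Because the ring buffer has a fixed length, memory use stays constant however long the static scene lasts.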

  10. A view not to be missed: Salient scene content interferes with cognitive restoration

    Science.gov (United States)

    Van der Jagt, Alexander P. N.; Craig, Tony; Brewer, Mark J.; Pearson, David G.

    2017-01-01

    Attention Restoration Theory (ART) states that built scenes place greater load on attentional resources than natural scenes. This is explained in terms of "hard" and "soft" fascination of built and natural scenes. Given a lack of direct empirical evidence for this assumption we propose that perceptual saliency of scene content can function as an empirically derived indicator of fascination. Saliency levels were established by measuring speed of scene category detection using a Go/No-Go detection paradigm. Experiment 1 shows that built scenes are more salient than natural scenes. Experiment 2 replicates these findings using greyscale images, ruling out a colour-based response strategy, and additionally shows that built objects in natural scenes affect saliency to a greater extent than the reverse. Experiment 3 demonstrates that the saliency of scene content is directly linked to cognitive restoration using an established restoration paradigm. Overall, these findings demonstrate an important link between the saliency of scene content and related cognitive restoration. PMID:28723975

  11. A view not to be missed: Salient scene content interferes with cognitive restoration.

    Directory of Open Access Journals (Sweden)

    Alexander P N Van der Jagt

    Full Text Available Attention Restoration Theory (ART) states that built scenes place greater load on attentional resources than natural scenes. This is explained in terms of "hard" and "soft" fascination of built and natural scenes. Given a lack of direct empirical evidence for this assumption we propose that perceptual saliency of scene content can function as an empirically derived indicator of fascination. Saliency levels were established by measuring speed of scene category detection using a Go/No-Go detection paradigm. Experiment 1 shows that built scenes are more salient than natural scenes. Experiment 2 replicates these findings using greyscale images, ruling out a colour-based response strategy, and additionally shows that built objects in natural scenes affect saliency to a greater extent than the reverse. Experiment 3 demonstrates that the saliency of scene content is directly linked to cognitive restoration using an established restoration paradigm. Overall, these findings demonstrate an important link between the saliency of scene content and related cognitive restoration.

  12. Enhancement system of nighttime infrared video image and visible video image

    Science.gov (United States)

    Wang, Yue; Piao, Yan

    2016-11-01

    The visibility of nighttime video images is of great significance in military and medical fields, but nighttime video is often of such poor quality that the target cannot be distinguished from the background. We therefore enhance nighttime video by fusing infrared and visible video images. According to the characteristics of infrared and visible images, we propose an improved SIFT algorithm and an αβ weighted algorithm to fuse heterologous nighttime images. A transfer matrix is derived from the improved SIFT algorithm and is used to rapidly register the heterologous nighttime images, and the αβ weighted algorithm can be applied in any scene. In the video image fusion system, we use the transfer matrix to register every frame and then the αβ weighted method to fuse every frame, which meets the time requirement of video. The fused video image not only retains the clear target information of the infrared video image, but also retains the detail and color information of the visible video image, and the fused video plays back fluently.

  13. Binocular contrast-gain control for natural scenes: Image structure and phase alignment.

    Science.gov (United States)

    Huang, Pi-Chun; Dai, Yu-Ming

    2018-05-01

    In the context of natural scenes, we applied the pattern-masking paradigm to investigate how image structure and phase alignment affect contrast-gain control in binocular vision. We measured the discrimination thresholds of bandpass-filtered natural-scene images (targets) under various types of pedestals. Our first experiment had four pedestal types: bandpass-filtered pedestals, unfiltered pedestals, notch-filtered pedestals (which enabled removal of the spatial frequency), and misaligned pedestals (which involved rotation of unfiltered pedestals). Our second experiment featured six types of pedestals: bandpass-filtered, unfiltered, and notch-filtered pedestals, and the corresponding phase-scrambled pedestals. The thresholds were compared for monocular, binocular, and dichoptic viewing configurations. The bandpass-filtered pedestal and unfiltered pedestals showed classic dipper shapes; the dipper shapes of the notch-filtered, misaligned, and phase-scrambled pedestals were weak. We adopted a two-stage binocular contrast-gain control model to describe our results. We deduced that the phase-alignment information influenced the contrast-gain control mechanism before the binocular summation stage and that the phase-alignment information and structural misalignment information caused relatively strong divisive inhibition in the monocular and interocular suppression stages. When the pedestals were phase-scrambled, the elimination of the interocular suppression processing was the most convincing explanation of the results. Thus, our results indicated that both phase-alignment information and similar image structures cause strong interocular suppression. Copyright © 2018 Elsevier Ltd. All rights reserved.

  14. Modular integrated video system (MIVS) review station

    International Nuclear Information System (INIS)

    Garcia, M.L.

    1988-01-01

    An unattended video surveillance unit, the Modular Integrated Video System (MIVS), has been developed by Sandia National Laboratories for International Safeguards use. An important support element of this system is a semi-automatic Review Station. Four component modules, including an 8 mm video tape recorder, a 4-inch video monitor, a power supply and control electronics utilizing a liquid crystal display (LCD) are mounted in a suitcase for portability. The unit communicates through the interactive, menu-driven LCD and may be operated on facility power throughout the world. During surveillance, the MIVS records video information at specified time intervals, while also inserting consecutive scene numbers and tamper event information. Using either of two available modes of operation, the Review Station reads the inserted information and counts the number of missed scenes and/or tamper events encountered on the tapes, and reports this to the user on the LCD. At the end of a review session, the system will summarize the results of the review, stop the recorder, and advise the user of the completion of the review. In addition, the Review Station will check for any video loss on the tape
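
    Because the recorder inserts consecutive scene numbers, missed scenes show up as gaps in the sequence read back from the tape. The sketch below shows that counting idea only; the function name and the list-of-integers input are assumptions, not the Review Station's actual data format.

```python
def count_missed_scenes(scene_numbers):
    """Count scene numbers skipped between consecutive readings from a tape."""
    missed = 0
    for previous, current in zip(scene_numbers, scene_numbers[1:]):
        if current > previous + 1:
            # Every number strictly between the two readings was never seen.
            missed += current - previous - 1
    return missed
```

    For the readings 1, 2, 3, 6, 7, 9 the function reports three missed scenes (4, 5 and 8).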

  15. SIRSALE: integrated video database management tools

    Science.gov (United States)

    Brunie, Lionel; Favory, Loic; Gelas, J. P.; Lefevre, Laurent; Mostefaoui, Ahmed; Nait-Abdesselam, F.

    2002-07-01

    Video databases became an active field of research during the last decade. The main objective in such systems is to provide users with capabilities to search, access and play back distributed stored video data in a user-friendly way, just as they do for traditional distributed databases. Hence, such systems need to deal with hard issues: (a) video documents generate huge volumes of data and are time sensitive (streams must be delivered at a specific bitrate), (b) contents of video data are very hard to extract automatically and need to be annotated by humans. To cope with these issues, many approaches have been proposed in the literature including data models, query languages, video indexing etc. In this paper, we present SIRSALE: a set of video database management tools that allow users to manipulate video documents and streams stored in large distributed repositories. All the proposed tools are based on generic models that can be customized for specific applications using ad-hoc adaptation modules. More precisely, SIRSALE allows users to: (a) browse video documents by structures (sequences, scenes, shots) and (b) query the video database content by using a graphical tool, adapted to the nature of the target video documents. This paper also presents an annotating interface which allows archivists to describe the content of video documents. All these tools are coupled to a video player integrating remote VCR functionalities and are based on active network technology. So, we present how dedicated active services allow an optimized video transport for video streams (with Tamanoir active nodes). We then describe experiments using SIRSALE on an archive of news video and soccer matches. The system has been demonstrated to professionals with positive feedback. Finally, we discuss open issues and present some perspectives.

  16. Global scene layout modulates contextual learning in change detection

    Directory of Open Access Journals (Sweden)

    Markus Conci

    2014-02-01

    Full Text Available Change in the visual scene often goes unnoticed – a phenomenon referred to as ‘change blindness’. This study examined whether the hierarchical structure, i.e., the global-local layout of a scene can influence performance in a one-shot change detection paradigm. To this end, natural scenes of a laid breakfast table were presented, and observers were asked to locate the onset of a new local object. Importantly, the global structure of the scene was manipulated by varying the relations among objects in the scene layouts. The very same items were either presented as global-congruent (typical) layouts or as global-incongruent (random) arrangements. Change blindness was less severe for congruent than for incongruent displays, and this congruency benefit increased with the duration of the experiment. These findings show that global layouts are learned, supporting detection of local changes with enhanced efficiency. However, performance was not affected by scene congruency in a subsequent control experiment that required observers to localize a static discontinuity (i.e., an object that was missing from the repeated layouts). Our results thus show that learning of the global layout is particularly linked to the local objects. Taken together, our results reveal an effect of global precedence in natural scenes. We suggest that relational properties within the hierarchy of a natural scene are governed, in particular, by global image analysis, reducing change blindness for local objects through scene learning.

  17. Global scene layout modulates contextual learning in change detection.

    Science.gov (United States)

    Conci, Markus; Müller, Hermann J

    2014-01-01

    Change in the visual scene often goes unnoticed - a phenomenon referred to as "change blindness." This study examined whether the hierarchical structure, i.e., the global-local layout of a scene can influence performance in a one-shot change detection paradigm. To this end, natural scenes of a laid breakfast table were presented, and observers were asked to locate the onset of a new local object. Importantly, the global structure of the scene was manipulated by varying the relations among objects in the scene layouts. The very same items were either presented as global-congruent (typical) layouts or as global-incongruent (random) arrangements. Change blindness was less severe for congruent than for incongruent displays, and this congruency benefit increased with the duration of the experiment. These findings show that global layouts are learned, supporting detection of local changes with enhanced efficiency. However, performance was not affected by scene congruency in a subsequent control experiment that required observers to localize a static discontinuity (i.e., an object that was missing from the repeated layouts). Our results thus show that learning of the global layout is particularly linked to the local objects. Taken together, our results reveal an effect of "global precedence" in natural scenes. We suggest that relational properties within the hierarchy of a natural scene are governed, in particular, by global image analysis, reducing change blindness for local objects through scene learning.

  18. Roadside video data analysis deep learning

    CERN Document Server

    Verma, Brijesh; Stockwell, David

    2017-01-01

    This book highlights the methods and applications for roadside video data analysis, with a particular focus on the use of deep learning to solve roadside video data segmentation and classification problems. It describes system architectures and methodologies that are specifically built upon learning concepts for roadside video data processing, and offers a detailed analysis of the segmentation, feature extraction and classification processes. Lastly, it demonstrates the applications of roadside video data analysis including scene labelling, roadside vegetation classification and vegetation biomass estimation in fire risk assessment.

  19. Text Line Detection from Rectangle Traffic Panels of Natural Scene

    Science.gov (United States)

    Wang, Shiyuan; Huang, Linlin; Hu, Jian

    2018-01-01

    Traffic sign detection and recognition is very important for Intelligent Transportation. Among traffic signs, traffic panels contain rich information. However, due to low resolution and blur in rectangular traffic panels, it is difficult to extract the characters and symbols. In this paper, we propose a coarse-to-fine method to detect Chinese characters on traffic panels in natural scenes. First, given a traffic panel, color quantization is applied to extract candidate regions of Chinese characters. Second, a multi-stage filter based on learning is applied to discard the non-character regions. Third, we aggregate the characters into text lines using a distance metric learning method. Experimental results on real traffic images from Baidu Street View demonstrate the effectiveness of the proposed method.

  20. Differential processing of natural scenes in typical and atypical Alzheimer disease measured with a saccade choice task

    Directory of Open Access Journals (Sweden)

    Muriel Boucart

    2014-07-01

    Full Text Available Though atrophy of the medial temporal lobe, including structures (hippocampus and parahippocampal cortex) that support scene perception and the binding of an object to its context, appears early in Alzheimer disease (AD), few studies have investigated scene perception in people with AD. We assessed the ability to find a target object within a natural scene in people with typical AD and in people with atypical AD (posterior cortical atrophy). Pairs of colored photographs were displayed left and right of fixation for one second. Participants were asked to categorize the target (an animal) either by moving their eyes toward the photograph containing the target (saccadic choice task) or by pressing a key corresponding to the location of the target (manual choice task) in separate blocks of trials. For both tasks performance was compared in two conditions: with isolated objects and with objects in scenes. Patients with atypical AD were more impaired in detecting a target within a scene than people with typical AD, who exhibited a pattern of performance more similar to that of age-matched controls in terms of accuracy, saccade latencies and benefit from contextual information. People with atypical AD benefited less from contextual information in both the saccade and the manual choice tasks, suggesting a higher sensitivity to crowding and deficits in figure/ground segregation in people with lesions in posterior areas of the brain.

  1. International video project on natural analogues

    International Nuclear Information System (INIS)

    Guentensperger, Marcel

    1993-01-01

    A natural analogue can be defined as a natural process which has occurred in the past and is studied in order to test predictions about the future evolution of similar processes. In recent years, natural analogues have been used increasingly to test the mathematical models required for repository performance assessment. Analogues are, however, also of considerable use in public relations as they allow many of the principles involved in demonstrating repository safety to be illustrated in a clear manner using natural systems with which man is familiar. The international Natural Analogue Working Group (NAWG), organised under the auspices of the CEC, has recognised that such PR applications are of considerable importance and should be supported from a technical level. At the NAWG meeting in Pitlochry, Scotland (June 1990), it was recommended that the possibilities for making a video film on this topic be investigated and Nagra was requested to take the lead role in setting up such a project

  2. Adherent Raindrop Modeling, Detection and Removal in Video.

    Science.gov (United States)

    You, Shaodi; Tan, Robby T; Kawakami, Rei; Mukaigawa, Yasuhiro; Ikeuchi, Katsushi

    2016-09-01

    Raindrops adhered to a windscreen or window glass can significantly degrade the visibility of a scene. Modeling, detecting and removing raindrops will, therefore, benefit many computer vision applications, particularly outdoor surveillance systems and intelligent vehicle systems. In this paper, a method that automatically detects and removes adherent raindrops is introduced. The core idea is to exploit the local spatio-temporal derivatives of raindrops. To accomplish this, we first model adherent raindrops using the laws of physics, and detect raindrops based on these models in combination with motion and intensity temporal derivatives of the input video. Having detected the raindrops, we remove them and restore the images based on an analysis showing that some areas of raindrops completely occlude the scene, and some other areas occlude it only partially. For partially occluding areas, we restore them by retrieving as much information about the scene as possible, namely, by solving a blending function on the detected partially occluding areas using the temporal intensity derivative. For completely occluding areas, we recover them by using a video completion technique. Experimental results using various real videos show the effectiveness of our method.

  3. Acoustic scanning of natural scenes by echolocation in the big brown bat, Eptesicus fuscus

    DEFF Research Database (Denmark)

    Surlykke, Annemarie; Ghose, Kaushik; Moss, Cynthia F

    2009-01-01

    Echolocation allows bats to orient and localize prey in complete darkness. The sonar beam of the big brown bat, Eptesicus fuscus, is directional but broad enough to provide audible echo information from within a 60-90 deg. cone. This suggests that the big brown bat could interrogate a natural scene...

  4. Albedo estimation for scene segmentation

    Energy Technology Data Exchange (ETDEWEB)

    Lee, C H; Rosenfeld, A

    1983-03-01

    Standard methods of image segmentation do not take into account the three-dimensional nature of the underlying scene. For example, histogram-based segmentation tacitly assumes that the image intensity is piecewise constant, and this is not true when the scene contains curved surfaces. This paper introduces a method of taking 3d information into account in the segmentation process. The image intensities are adjusted to compensate for the effects of estimated surface orientation; the adjusted intensities can be regarded as reflectivity estimates. When histogram-based segmentation is applied to these new values, the image is segmented into parts corresponding to surfaces of constant reflectivity in the scene. 7 references.
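
    The intensity adjustment described in this record can be illustrated under a Lambertian shading assumption, where observed intensity equals reflectivity times the cosine of the angle between the surface normal and the light direction. The abstract does not state the shading model or how orientation is estimated, so the model, the known light direction, and the function names below are all assumptions.

```python
import math


def normalize(vector):
    """Scale a 3-D vector to unit length."""
    length = math.sqrt(sum(c * c for c in vector))
    return tuple(c / length for c in vector)


def estimate_albedo(intensity, normal, light, min_cosine=1e-3):
    """Divide out the assumed Lambertian shading term to estimate reflectivity.

    Returns None when the surface faces away from the light or is lit at a
    grazing angle, because the estimate is unreliable there.
    """
    n = normalize(normal)
    s = normalize(light)
    cosine = sum(a * b for a, b in zip(n, s))
    if cosine < min_cosine:
        return None
    return intensity / cosine
```

    Two pixels on the same curved surface, one facing the light with intensity 0.8 and one tilted by 60 degrees with intensity 0.4, map to the same reflectivity estimate, so a histogram of the adjusted values groups them together.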

  5. Extended image differencing for change detection in UAV video mosaics

    Science.gov (United States)

    Saur, Günter; Krüger, Wolfgang; Schumann, Arne

    2014-03-01

    Change detection is one of the most important tasks when using unmanned aerial vehicles (UAV) for video reconnaissance and surveillance. We address changes of short time scale, i.e. the observations are taken in time distances from several minutes up to a few hours. Each observation is a short video sequence acquired by the UAV in near-nadir view and the relevant changes are, e.g., recently parked or moved vehicles. In this paper we extend our previous approach of image differencing for single video frames to video mosaics. A precise image-to-image registration combined with a robust matching approach is needed to stitch the video frames to a mosaic. Additionally, this matching algorithm is applied to mosaic pairs in order to align them to a common geometry. The resulting registered video mosaic pairs are the input of the change detection procedure based on extended image differencing. A change mask is generated by an adaptive threshold applied to a linear combination of difference images of intensity and gradient magnitude. The change detection algorithm has to distinguish between relevant and non-relevant changes. Examples for non-relevant changes are stereo disparity at 3D structures of the scene, changed size of shadows, and compression or transmission artifacts. The special effects of video mosaicking such as geometric distortions and artifacts at moving objects have to be considered, too. In our experiments we analyze the influence of these effects on the change detection results by considering several scenes. The results show that for video mosaics this task is more difficult than for single video frames. Therefore, we extended the image registration by estimating an elastic transformation using a thin plate spline approach. The results for mosaics are comparable to that of single video frames and are useful for interactive image exploitation due to a larger scene coverage.
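
    The change mask described in this record, an adaptive threshold applied to a linear combination of intensity and gradient-magnitude difference images, can be sketched as follows. The images are assumed to be already registered, and the weights, the finite-difference gradient, and the mean-plus-k-standard-deviations threshold are illustrative assumptions rather than the authors' choices.

```python
import math


def gradient_magnitude(image):
    """Finite-difference gradient magnitude of a 2-D list (indices clamped at borders)."""
    height, width = len(image), len(image[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            gx = image[y][min(x + 1, width - 1)] - image[y][max(x - 1, 0)]
            gy = image[min(y + 1, height - 1)][x] - image[max(y - 1, 0)][x]
            out[y][x] = math.hypot(gx, gy) / 2.0
    return out


def change_mask(before, after, w_intensity=0.5, w_gradient=0.5, k=2.0):
    """Boolean mask of pixels whose combined difference exceeds mean + k * std."""
    g_before = gradient_magnitude(before)
    g_after = gradient_magnitude(after)
    height, width = len(before), len(before[0])
    score = [
        [
            w_intensity * abs(after[y][x] - before[y][x])
            + w_gradient * abs(g_after[y][x] - g_before[y][x])
            for x in range(width)
        ]
        for y in range(height)
    ]
    values = [v for row in score for v in row]
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    threshold = mean + k * std  # adapts to the overall difference level of the pair
    return [[v > threshold for v in row] for row in score]
```

    Tying the threshold to the statistics of the pair means a uniform brightness shift between the two mosaics raises the threshold instead of flagging every pixel.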

  6. A view not to be missed: Salient scene content interferes with cognitive restoration

    NARCIS (Netherlands)

    van der Jagt, A.P.N.; Craig, Tony; Brewer, Mark J.; Pearson, David G.

    2017-01-01

    Attention Restoration Theory (ART) states that built scenes place greater load on attentional resources than natural scenes. This is explained in terms of "hard" and "soft" fascination of built and natural scenes. Given a lack of direct empirical evidence for this assumption we propose that

  7. Photo-acoustic and video-acoustic methods for sensing distant sound sources

    Science.gov (United States)

    Slater, Dan; Kozacik, Stephen; Kelmelis, Eric

    2017-05-01

    Long-range telescopic video imagery of distant terrestrial scenes, aircraft, rockets and other aerospace vehicles can be a powerful observational tool. But what about the associated acoustic activity? A new technology, Remote Acoustic Sensing (RAS), may provide a method to remotely listen to the acoustic activity near these distant objects. Local acoustic activity sometimes weakly modulates the ambient illumination in a way that can be remotely sensed. RAS is a new type of microphone that separates an acoustic transducer into two spatially separated components: 1) a naturally formed in situ acousto-optic modulator (AOM) located within the distant scene and 2) a remote sensing readout device that recovers the distant audio. These two elements are passively coupled over long distances at the speed of light by naturally occurring ambient light energy or other electromagnetic fields. Stereophonic, multichannel and acoustic beamforming techniques are all possible using RAS, and when combined with high-definition video imagery they can help to provide a more cinema-like immersive viewing experience. A practical implementation of a remote acousto-optic readout device can be a challenging engineering problem. The acoustic influence on the optical signal is generally weak and often accompanied by a strong bias term. The optical signal is further degraded by atmospheric seeing turbulence. In this paper, we consider two fundamentally different optical readout approaches: 1) a low-pixel-count photodiode-based RAS photoreceiver and 2) audio extraction directly from a video stream. Most of our RAS experiments to date have used the first method for reasons of performance and simplicity. But there are potential advantages to extracting audio directly from a video stream. These advantages include the straightforward ability to work with multiple AOMs (useful for acoustic beamforming), simpler optical configurations, and a potential ability to use certain preexisting video recordings. However ...

  8. "Comuniquemonos, Ya!": strengthening interpersonal communication and health through video.

    Science.gov (United States)

    1992-01-01

    The Nutrition Communication Project has overseen production of a training video on interpersonal communication for health workers involved in growth monitoring and promotion (GMP) programs in Latin America, entitled Comuniquemonos, Ya! Producers used the following questions as their guidelines: Who is the audience? Why is the training needed? What are the objectives and advantages of using video? Communication specialists, anthropologists, educators, and nutritionists worked together to write the script. Then video camera specialists taped the video in Bolivia and Guatemala. A facilitator's guide, complete with an outline of an entire workshop, comes with the video. The guide encourages trainees to participate in various situations. Trainees are able to compare their interpersonal skills with those of the health workers on the video. Further, they can determine cause and effect. The video has 2 scenes that demonstrate poor and good communication skills using the same health worker in both situations. Other scenes highlight 6 communication skills: developing a warm environment, asking questions, sharing results, listening, observing, and giving demonstrations. All types of health workers, ranging from physicians to community health workers, as well as health workers from various countries (Guatemala, Honduras, Bolivia, and Ecuador), approve of the video. Some trainers have used the video without the guide and comment that it started a debate on communication's role in GMP efforts.

  9. The singular nature of auditory and visual scene analysis in autism.

    Science.gov (United States)

    Lin, I-Fan; Shirama, Aya; Kato, Nobumasa; Kashino, Makio

    2017-02-19

    Individuals with autism spectrum disorder often have difficulty acquiring relevant auditory and visual information in daily environments, despite not being diagnosed as hearing impaired or having low vision. Recent psychophysical and neurophysiological studies have shown that autistic individuals have highly specific individual differences at various levels of information processing, including feature extraction, automatic grouping and top-down modulation in auditory and visual scene analysis. Comparison of the characteristics of scene analysis between auditory and visual modalities reveals some essential commonalities, which could provide clues about the underlying neural mechanisms. Further progress in this line of research may suggest effective methods for diagnosing and supporting autistic individuals. This article is part of the themed issue 'Auditory and visual scene analysis'. © 2017 The Author(s).

  10. Sensory substitution: the spatial updating of auditory scenes ‘mimics’ the spatial updating of visual scenes

    Directory of Open Access Journals (Sweden)

    Achille Pasqualotto

    2016-04-01

    Full Text Available Visual-to-auditory sensory substitution is used to convey visual information through audition, and it was initially created to compensate for blindness; it consists of software converting the visual images captured by a video camera into the equivalent auditory images, or ‘soundscapes’. Here, it was used by blindfolded sighted participants to learn the spatial position of simple shapes depicted in images arranged on the floor. Very few studies have used sensory substitution to investigate spatial representation, while it has been widely used to investigate object recognition. Additionally, with sensory substitution we could study the performance of participants actively exploring the environment through audition, rather than passively localising sound sources. Blindfolded participants egocentrically learnt the position of six images by using sensory substitution, and then a judgement of relative direction (JRD) task was used to determine how this scene was represented. This task consists of imagining being in a given location, oriented in a given direction, and pointing towards the required image. Before performing the JRD task, participants explored a map that provided allocentric information about the scene. Although spatial exploration was egocentric, surprisingly we found that performance in the JRD task was better for allocentric perspectives. This suggests that the egocentric representation of the scene was updated. This result is in line with previous studies using visual and somatosensory scenes, thus supporting the notion that different sensory modalities produce equivalent spatial representation(s). Moreover, our results have practical implications for improving training methods with sensory substitution devices.

  11. Automatic generation of pictorial transcripts of video programs

    Science.gov (United States)

    Shahraray, Behzad; Gibbon, David C.

    1995-03-01

    An automatic authoring system for the generation of pictorial transcripts of video programs which are accompanied by closed caption information is presented. A number of key frames, each of which represents the visual information in a segment of the video (i.e., a scene), are selected automatically by performing a content-based sampling of the video program. The textual information is recovered from the closed caption signal and is initially segmented based on its implied temporal relationship with the video segments. The text segmentation boundaries are then adjusted, based on lexical analysis and/or caption control information, to account for synchronization errors due to possible delays in the detection of scene boundaries or the transmission of the caption information. The closed caption text is further refined through linguistic processing for conversion to lower-case with correct capitalization. The key frames and the related text generate a compact multimedia presentation of the contents of the video program which lends itself to efficient storage and transmission. This compact representation can be viewed on a computer screen, or used to generate the input to a commercial text processing package to generate a printed version of the program.
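
    The case-conversion step mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' linguistic processing: the function name restore_case and the proper_nouns lookup table are hypothetical stand-ins, and only sentence starts, the pronoun "I" and listed proper nouns are capitalized.

```python
import re

def restore_case(caption, proper_nouns=None):
    """Convert upper-case caption text to sentence case.

    proper_nouns maps lower-case words to their preferred spelling
    (a hypothetical lookup table standing in for linguistic processing).
    """
    proper_nouns = proper_nouns or {}
    text = caption.lower()
    # Capitalize the first letter of the text and of each new sentence.
    text = re.sub(r"(^|[.!?]\s+)([a-z])",
                  lambda m: m.group(1) + m.group(2).upper(), text)

    # Restore the listed proper nouns and the pronoun "I".
    def fix_word(match):
        word = match.group(0)
        if word.lower() == "i":
            return "I"
        return proper_nouns.get(word.lower(), word)

    return re.sub(r"[A-Za-z]+", fix_word, text)

example = restore_case("THE SHUTTLE LANDED. NASA CONFIRMED IT.", {"nasa": "NASA"})
```

    A real system would need a far larger vocabulary and handling of abbreviations; the lookup table here only shows where such knowledge would plug in.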

  12. Scene Categorization in Alzheimer's Disease: A Saccadic Choice Task

    Directory of Open Access Journals (Sweden)

    Quentin Lenoble

    2015-01-01

    Full Text Available Aims: We investigated the performance in scene categorization of patients with Alzheimer's disease (AD) using a saccadic choice task. Method: 24 patients with mild AD, 28 age-matched controls and 26 young people participated in the study. The participants were presented with pairs of coloured photographs and were asked to make a saccadic eye movement to the picture corresponding to the target scene (natural vs. urban, indoor vs. outdoor). Results: The patients' performance did not differ from chance for natural scenes. Differences between young and older controls and patients with AD were found in accuracy but not saccadic latency. Conclusions: The results are interpreted in terms of cerebral reorganization in the prefrontal and temporo-occipital cortex of patients with AD, but also in terms of impaired processing of visual global properties of scenes.

  13. Portrayal of smoking in Nigerian online videos: a medium for tobacco advertising and promotion?

    Directory of Open Access Journals (Sweden)

    Adegoke Oloruntoba Adelufosi

    2014-09-01

    Full Text Available The Nigerian home video industry, popularly known as Nollywood, is a booming industry, with increasing numbers of easily accessible online videos. The aim of this study was to analyse the contents of popular Nigerian online videos to determine the prevalence of smoking imageries and their public health implications. Using specific search terms, popular English-language and indigenous Yoruba-language Nigerian home videos uploaded on YouTube in 2013 were identified and sorted based on their view counts. Data on smoking-related scenes such as smoking incidents, context of tobacco use, depiction of cigarette brand, gender of smokers and film rating were collected. Of the 60 online videos whose contents were assessed in this study, 26 (43.3%) had scenes with cigarette smoking imageries. The mean (SD) number of smoking incidents was 2.7 (1.6), giving an average of one smoking incident for every 26 to 27 min of film. More than half (53.8%) of the films with tobacco use had high smoking imageries. An average of 2 characters per film smoked, mostly in association with acts of criminality or prostitution (57.7%) and alcohol use (57.7%). There were scenes of the main protagonists smoking in 73.1% of the films, with scenes of female protagonists smoking (78.9%) more than the male protagonists (21.1%). Smoking imageries are common in popular Nigerian online movies. Given the wide reach of online videos, their potential to be viewed by people from different cultures and to negatively influence youngsters, it is important that smoking portrayals in online movies are controlled.

  14. Photogrammetric Applications of Immersive Video Cameras

    Science.gov (United States)

    Kwiatek, K.; Tokarczyk, R.

    2014-05-01

    The paper investigates immersive videography and its application in close-range photogrammetry. Immersive video involves the capture of a live-action scene that presents a 360° field of view. It is recorded simultaneously by multiple cameras or microlenses, where the principal point of each camera is offset from the rotating axis of the device. This issue causes problems when stitching together individual frames of video separated from particular cameras; however, there are ways to overcome it, and applying immersive cameras in photogrammetry provides new potential. The paper presents two applications of immersive video in photogrammetry. First, the creation of a low-cost mobile mapping system based on Ladybug®3 and a GPS device is discussed. The number of panoramas is much too high for photogrammetric purposes, as the base line between spherical panoramas is around 1 metre. More than 92 000 panoramas were recorded in one Polish region of Czarny Dunajec, and the measurements from panoramas enable the user to measure the area of outdoors (advertising structures) and billboards. A new law is being created in order to limit the number of illegal advertising structures in the Polish landscape, and immersive video recorded in a short period of time is a candidate for economical and flexible measurements off-site. The second approach is the generation of 3D video-based reconstructions of heritage sites based on immersive video (structure from immersive video). A mobile camera mounted on a tripod dolly was used to record the interior scene, and immersive video, separated into thousands of still panoramas, was converted from video into 3D objects using Agisoft Photoscan Professional. The findings from these experiments demonstrated that immersive photogrammetry seems to be a flexible and prompt method of 3D modelling and provides promising features for mobile mapping systems.

  15. Innovative Solution to Video Enhancement

    Science.gov (United States)

    2001-01-01

    Through a licensing agreement, Intergraph Government Solutions adapted a technology originally developed at NASA's Marshall Space Flight Center for enhanced video imaging by developing its Video Analyst(TM) System. Marshall's scientists developed the Video Image Stabilization and Registration (VISAR) technology to help FBI agents analyze video footage of the deadly 1996 Olympic Summer Games bombing in Atlanta, Georgia. VISAR technology enhanced nighttime videotapes made with hand-held camcorders, revealing important details about the explosion. Intergraph's Video Analyst System is a simple, effective, and affordable tool for video enhancement and analysis. The benefits associated with the Video Analyst System include support of full-resolution digital video, frame-by-frame analysis, and the ability to store analog video in digital format. Up to 12 hours of digital video can be stored and maintained for reliable footage analysis. The system also includes state-of-the-art features such as stabilization, image enhancement, and convolution to help improve the visibility of subjects in the video without altering underlying footage. Adaptable to many uses, Intergraph's Video Analyst System meets the stringent demands of the law enforcement industry in the areas of surveillance, crime scene footage, sting operations, and dash-mounted video cameras.

  16. Selection and evaluation of video tape recorders for surveillance applications

    International Nuclear Information System (INIS)

    Martinez, R.L.

    1988-01-01

    Unattended surveillance places unique requirements on video recorders. One such requirement, extended operational reliability, often cannot be determined from the manufacturers' data. Subsequent to market surveys and preliminary testing, the Sony 8mm EVO-210 recorder was selected for use in the Modular Integrated Video System (MIVS), while concurrently undergoing extensive reliability testing. A microprocessor-based controller was developed to life test and evaluate the performance of the video cassette recorders. The controller has the capability to insert a unique binary count in the vertical interval of the recorder video signal for each scene. This feature allows for automatic verification of the recorded data using a MIVS Review Station. Initially, twenty recorders were subjected to the accelerated life test, which involves recording one scene (eight video frames) every 15 seconds. The recorders were operated in the exact manner in which they are utilized in the MIVS. This paper describes the results of the preliminary testing, the accelerated life test and the extensive testing on 130 Sony EVO-210 recorders.

  17. Study on Detection and Localization Algorithm of Traffic Signs from Natural Scenes

    Directory of Open Access Journals (Sweden)

    Xian-Zhong Han

    2014-08-01

    Full Text Available Automatic detection and location of traffic signs is an important part of intelligent transportation, especially for unmanned vehicle technology research. Based on the morphological features of Chinese road traffic signs, we propose a traffic sign detection method based on color segmentation and shape analysis. Firstly, in order to solve the problems of color cast, distortion, and cross-color of traffic signs in natural scenes, the images are processed by white balance, Retinex color enhancement, and affine transformation. Then, the type of traffic sign is discriminated and detected according to the color and shape characteristics of traffic signs. The experimental results show that this method can effectively detect and recognize traffic signs.

  18. Correlated Topic Vector for Scene Classification.

    Science.gov (United States)

    Wei, Pengxu; Qin, Fei; Wan, Fang; Zhu, Yi; Jiao, Jianbin; Ye, Qixiang

    2017-07-01

    Scene images usually involve semantic correlations, particularly when considering large-scale image data sets. This paper proposes a novel generative image representation, correlated topic vector, to model such semantic correlations. Derived from the correlated topic model, correlated topic vector is intended to naturally utilize the correlations among topics, which are seldom considered in conventional feature encoding, e.g., Fisher vector, but do exist in scene images. It is expected that the involvement of correlations can increase the discriminative capability of the learned generative model and consequently improve the recognition accuracy. Incorporated with the Fisher kernel method, correlated topic vector inherits the advantages of Fisher vector. The contributions to the topics of visual words have been further employed by incorporating the Fisher kernel framework to indicate the differences among scenes. Combined with deep convolutional neural network (CNN) features and a Gibbs sampling solution, correlated topic vector shows great potential when processing large-scale and complex scene image data sets. Experiments on two scene image data sets demonstrate that correlated topic vector significantly improves on the deep CNN features, and outperforms existing Fisher kernel-based features.

  19. End User Perceptual Distorted Scenes Enhancement Algorithm Using Partition-Based Local Color Values for QoE-Guaranteed IPTV

    Science.gov (United States)

    Kim, Jinsul

    In this letter, we propose a distorted-scene enhancement algorithm in order to provide an end-user perceptual QoE-guaranteed IPTV service. Block edge detection with a weight factor and a partition-based local color values method can be applied to degraded video frames affected by network transmission errors, such as out-of-order delivery, jitter, and packet loss, to improve QoE efficiently. Based on the quality metric results obtained after using the distorted-scene enhancement algorithm, the distorted scenes have been restored better than with other methods.

  20. Deep video deblurring

    KAUST Repository

    Su, Shuochen

    2016-11-25

    Motion blur from camera shake is a major problem in videos captured by hand-held devices. Unlike single-image deblurring, video-based approaches can take advantage of the abundant information that exists across neighboring frames. As a result, the best performing methods rely on aligning nearby frames. However, aligning images is a computationally expensive and fragile procedure, and methods that aggregate information must therefore be able to identify which regions have been accurately aligned and which have not, a task which requires high-level scene understanding. In this work, we introduce a deep learning solution to video deblurring, where a CNN is trained end-to-end to learn how to accumulate information across frames. To train this network, we collected a dataset of real videos recorded with a high-framerate camera, which we use to generate synthetic motion blur for supervision. We show that the features learned from this dataset extend to deblurring motion blur that arises due to camera shake in a wide range of videos, and compare the quality of results to a number of other baselines.

  1. Hierarchical video summarization based on context clustering

    Science.gov (United States)

    Tseng, Belle L.; Smith, John R.

    2003-11-01

    A personalized video summary is dynamically generated in our video personalization and summarization system based on user preference and usage environment. The three-tier personalization system adopts the server-middleware-client architecture in order to maintain, select, adapt, and deliver rich media content to the user. The server stores the content sources along with their corresponding MPEG-7 metadata descriptions. In this paper, the metadata includes visual semantic annotations and automatic speech transcriptions. Our personalization and summarization engine in the middleware selects the optimal set of desired video segments by matching shot annotations and sentence transcripts with user preferences. Besides finding the desired contents, the objective is to present a coherent summary. There are diverse methods for creating summaries, and we focus on the challenges of generating a hierarchical video summary based on context information. In our summarization algorithm, three inputs are used to generate the hierarchical video summary output. These inputs are (1) MPEG-7 metadata descriptions of the contents in the server, (2) user preference and usage environment declarations from the user client, and (3) context information including MPEG-7 controlled term list and classification scheme. In a video sequence, descriptions and relevance scores are assigned to each shot. Based on these shot descriptions, context clustering is performed to collect consecutively similar shots to correspond to hierarchical scene representations. The context clustering is based on the available context information, and may be derived from domain knowledge or rules engines. Finally, the selection of structured video segments to generate the hierarchical summary efficiently balances between scene representation and shot selection.

  2. Temporal Segmentation of MPEG Video Streams

    Directory of Open Access Journals (Sweden)

    Janko Calic

    2002-06-01

    Full Text Available Many algorithms for temporal video partitioning rely on the analysis of uncompressed video features. Since the information relevant to the partitioning process can be extracted directly from the MPEG compressed stream, higher efficiency can be achieved utilizing information from the MPEG compressed domain. This paper introduces a real-time algorithm for scene change detection that analyses the statistics of the macroblock features extracted directly from the MPEG stream. A method for extraction of the continuous frame difference that transforms the 3D video stream into a 1D curve is presented. This transform is then further employed to extract temporal units within the analysed video sequence. Results of computer simulations are reported.
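
    The idea of reducing a video to a 1D frame-difference curve and extracting temporal units from it can be sketched as follows. This is only an illustration under stated assumptions: it works on uncompressed luminance values rather than on the MPEG macroblock statistics the paper uses, and the function names, the flat-list frame format and the mean-plus-k-standard-deviations threshold are all hypothetical choices.

```python
from statistics import mean, pstdev

def frame_difference_curve(frames):
    """Collapse a sequence of frames into a 1D curve of mean absolute differences.

    Each frame is a flat list of luminance values (hypothetical input format).
    """
    curve = []
    for prev, curr in zip(frames, frames[1:]):
        diff = sum(abs(a - b) for a, b in zip(prev, curr)) / len(curr)
        curve.append(diff)
    return curve

def detect_scene_changes(curve, k=2.0):
    """Return indices i where the jump from frame i to i+1 exceeds mean + k * std."""
    if not curve:
        return []
    threshold = mean(curve) + k * pstdev(curve)
    return [i for i, d in enumerate(curve) if d > threshold]

# Ten dark frames followed by ten bright frames: one cut between frames 9 and 10.
frames = [[10] * 16 for _ in range(10)] + [[200] * 16 for _ in range(10)]
curve = frame_difference_curve(frames)
cuts = detect_scene_changes(curve)
```

    A global threshold like this handles abrupt cuts only; gradual transitions such as fades would need a different criterion on the same curve.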

  3. Photogrammetric Applications of Immersive Video Cameras

    OpenAIRE

    Kwiatek, K.; Tokarczyk, R.

    2014-01-01

    The paper investigates immersive videography and its application in close-range photogrammetry. Immersive video involves the capture of a live-action scene that presents a 360° field of view. It is recorded simultaneously by multiple cameras or microlenses, where the principal point of each camera is offset from the rotating axis of the device. This issue causes problems when stitching together individual frames of video separated from particular cameras, however there are ways to ov...

  4. Rotation-invariant features for multi-oriented text detection in natural images.

    Directory of Open Access Journals (Sweden)

    Cong Yao

    Full Text Available Texts in natural scenes carry rich semantic information, which can be used to assist a wide range of applications, such as object recognition, image/video retrieval, mapping/navigation, and human-computer interaction. However, most existing systems are designed to detect and recognize horizontal (or near-horizontal) texts. Due to the increasing popularity of mobile-computing devices and applications, detecting texts of varying orientations from natural images under less controlled conditions has become an important but challenging task. In this paper, we propose a new algorithm to detect texts of varying orientations. Our algorithm is based on a two-level classification scheme and two sets of features specially designed for capturing the intrinsic characteristics of texts. To better evaluate the proposed method and compare it with the competing algorithms, we generate a comprehensive dataset with various types of texts in diverse real-world scenes. We also propose a new evaluation protocol, which is more suitable for benchmarking algorithms for detecting texts in varying orientations. Experiments on benchmark datasets demonstrate that our system compares favorably with the state-of-the-art algorithms when handling horizontal texts and achieves significantly enhanced performance on variant texts in complex natural scenes.

  5. Learned Compact Local Feature Descriptor for Tls-Based Geodetic Monitoring of Natural Outdoor Scenes

    Science.gov (United States)

    Gojcic, Z.; Zhou, C.; Wieser, A.

    2018-05-01

    The advantages of terrestrial laser scanning (TLS) for geodetic monitoring of man-made and natural objects are not yet fully exploited. Herein we address one of the open challenges by proposing feature-based methods for identification of corresponding points in point clouds of two or more epochs. We propose a learned compact feature descriptor tailored for point clouds of natural outdoor scenes obtained using TLS. We evaluate our method both on a benchmark data set and on a specially acquired outdoor dataset resembling a simplified monitoring scenario, where we successfully estimate 3D displacement vectors of a rock that has been displaced between the scans. We show that the proposed descriptor has the capacity to generalize to unseen data and achieves state-of-the-art performance while being time efficient at the matching step due to the low dimension.

  6. Teen videos on YouTube: Features and digital vulnerabilities

    OpenAIRE

    Montes-Vozmediano, Manuel; García-Jiménez, Antonio; Menor-Sendra, Juan

    2018-01-01

    As a mechanism for social participation and integration and for the purpose of building their identity, teens make and share videos on platforms such as YouTube of which they are also content consumers. The vulnerability conditions that occur and the risks to which adolescents are exposed, both as creators and consumers of videos, are the focus of this study. The methodology used is content analysis, applied to 400 videos. This research has worked with manifest variables (such as the scene) a...

  7. Higher-order scene statistics of breast images

    Science.gov (United States)

    Abbey, Craig K.; Sohl-Dickstein, Jascha N.; Olshausen, Bruno A.; Eckstein, Miguel P.; Boone, John M.

    2009-02-01

    Researchers studying human and computer vision have found description and construction of these systems greatly aided by analysis of the statistical properties of naturally occurring scenes. More specifically, it has been found that receptive fields with directional selectivity and bandwidth properties similar to mammalian visual systems are more closely matched to the statistics of natural scenes. It is argued that this allows for sparse representation of the independent components of natural images [Olshausen and Field, Nature, 1996]. These theories have important implications for medical image perception. For example, will a system that is designed to represent the independent components of natural scenes, where objects occlude one another and illumination is typically reflected, be appropriate for X-ray imaging, where features superimpose on one another and illumination is transmissive? In this research we begin to examine these issues by evaluating higher-order statistical properties of breast images from X-ray projection mammography (PM) and dedicated breast computed tomography (bCT). We evaluate kurtosis in responses of octave bandwidth Gabor filters applied to PM and to coronal slices of bCT scans. We find that kurtosis in PM rises and quickly saturates for filter center frequencies with an average value above 0.95. By contrast, kurtosis in bCT peaks near 0.20 cyc/mm with kurtosis of approximately 2. Our findings suggest that the human visual system may be tuned to represent breast tissue more effectively in bCT over a specific range of spatial frequencies.
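
    The kurtosis measurement described here can be sketched in one dimension. The study applies 2D octave-bandwidth Gabor filters to breast images; the 1D kernel, the function names and the parameter values below are illustrative assumptions, and the sketch reports excess kurtosis (zero for a Gaussian), which may differ from the convention used in the abstract.

```python
import math

def gabor_kernel(freq, sigma, half_width):
    """Even-symmetric 1D Gabor kernel: Gaussian envelope times a cosine carrier."""
    return [
        math.exp(-(x * x) / (2.0 * sigma * sigma)) * math.cos(2.0 * math.pi * freq * x)
        for x in range(-half_width, half_width + 1)
    ]

def filter_valid(signal, kernel):
    """Correlate signal with kernel, keeping only fully overlapping positions."""
    n = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(n))
        for i in range(len(signal) - n + 1)
    ]

def excess_kurtosis(values):
    """Fourth standardized moment minus 3 (zero for a Gaussian)."""
    n = len(values)
    mu = sum(values) / n
    m2 = sum((v - mu) ** 2 for v in values) / n
    m4 = sum((v - mu) ** 4 for v in values) / n
    return m4 / (m2 * m2) - 3.0

# A single isolated feature gives a sparse, heavy-tailed response distribution.
kernel = gabor_kernel(freq=0.2, sigma=2.0, half_width=5)
signal = [0.0] * 50 + [1.0] + [0.0] * 50
responses = filter_valid(signal, kernel)
sparse_kurtosis = excess_kurtosis(responses)
```

    Repeating the measurement over a range of center frequencies would give a kurtosis-versus-frequency curve of the kind the abstract compares between PM and bCT.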

  8. Design considerations for view interpolation in a 3D video coding framework

    NARCIS (Netherlands)

    Morvan, Y.; Farin, D.S.; With, de P.H.N.; Lagendijk, R.L.; Weber, Jos H.; Berg, van den A.F.M.

    2006-01-01

    A 3D video stream typically consists of a set of views capturing simultaneously the same scene. For an efficient transmission of the 3D video, a compression technique is required. In this paper, we describe a coding architecture and appropriate algorithms that enable the compression and

  9. Object Attention Patches for Text Detection and Recognition in Scene Images using SIFT

    NARCIS (Netherlands)

    Sriman, Bowornrat; Schomaker, Lambertus; De Marsico, Maria; Figueiredo, Mário; Fred, Ana

    2015-01-01

    Natural urban scene images contain many problems for character recognition such as luminance noise, varying font styles or cluttered backgrounds. Detecting and recognizing text in a natural scene is a difficult problem. Several techniques have been proposed to overcome these problems. These are,

  10. The development of hand-centred visual representations in the primate brain: a computer modelling study using natural visual scenes.

    Directory of Open Access Journals (Sweden)

    Juan Manuel Galeazzi

    2015-12-01

    Full Text Available Neurons that respond to visual targets in a hand-centred frame of reference have been found within various areas of the primate brain. We investigate how hand-centred visual representations may develop in a neural network model of the primate visual system called VisNet, when the model is trained on images of the hand seen against natural visual scenes. The simulations show how such neurons may develop through a biologically plausible process of unsupervised competitive learning and self-organisation. In an advance on our previous work, the visual scenes consisted of multiple targets presented simultaneously with respect to the hand. Three experiments are presented. First, VisNet was trained with computerized images consisting of a realistic image of a hand and a variety of natural objects, presented in different textured backgrounds during training. The network was then tested with just one textured object near the hand in order to verify if the output cells were capable of building hand-centred representations with a single localised receptive field. We explain the underlying principles of the statistical decoupling that allows the output cells of the network to develop single localised receptive fields even when the network is trained with multiple objects. In a second simulation we examined how some of the cells with hand-centred receptive fields decreased their shape selectivity and started responding to a localised region of hand-centred space as the number of objects presented in overlapping locations during training increases. Lastly, we explored the same learning principles training the network with natural visual scenes collected by volunteers. These results provide an important step in showing how single, localised, hand-centred receptive fields could emerge under more ecologically realistic visual training conditions.

  11. Statistics of high-level scene context.

    Science.gov (United States)

    Greene, Michelle R

    2013-01-01

    Context is critical for recognizing environments and for searching for objects within them: contextual associations have been shown to modulate reaction time and object recognition accuracy, as well as influence the distribution of eye movements and patterns of brain activations. However, we have not yet systematically quantified the relationships between objects and their scene environments. Here I seek to fill this gap by providing descriptive statistics of object-scene relationships. A total of 48,167 objects were hand-labeled in 3499 scenes using the LabelMe tool (Russell et al., 2008). From these data, I computed a variety of descriptive statistics at three different levels of analysis: the ensemble statistics that describe the density and spatial distribution of unnamed "things" in the scene; the bag of words level where scenes are described by the list of objects contained within them; and the structural level where the spatial distribution and relationships between the objects are measured. The utility of each level of description for scene categorization was assessed through the use of linear classifiers, and the plausibility of each level for modeling human scene categorization is discussed. Of the three levels, ensemble statistics were found to be the most informative (per feature), and also best explained human patterns of categorization errors. Although a bag of words classifier had similar performance to human observers, it had a markedly different pattern of errors. However, certain objects are more useful than others, and ceiling classification performance could be achieved using only the 64 most informative objects. As object location tends not to vary as a function of category, structural information provided little additional information. Additionally, these data provide valuable information on natural scene redundancy that can be exploited for machine vision, and can help the visual cognition community to design experiments guided by statistics

  12. The roles of scene gist and spatial dependency among objects in the semantic guidance of attention in real-world scenes.

    Science.gov (United States)

    Wu, Chia-Chien; Wang, Hsueh-Cheng; Pomplun, Marc

    2014-12-01

    A previous study (Vision Research 51 (2011) 1192-1205) found evidence for semantic guidance of visual attention during the inspection of real-world scenes, i.e., an influence of semantic relationships among scene objects on overt shifts of attention. In particular, the results revealed an observer bias toward gaze transitions between semantically similar objects. However, this effect is not necessarily indicative of semantic processing of individual objects but may be mediated by knowledge of the scene gist, which does not require object recognition, or by known spatial dependency among objects. To examine the mechanisms underlying semantic guidance, in the present study, participants were asked to view a series of displays with the scene gist excluded and spatial dependency varied. Our results show that spatial dependency among objects seems to be sufficient to induce semantic guidance. Scene gist, on the other hand, does not seem to affect how observers use semantic information to guide attention while viewing natural scenes. Extracting semantic information mainly based on spatial dependency may be an efficient strategy of the visual system that only adds little cognitive load to the viewing task. Copyright © 2014 Elsevier Ltd. All rights reserved.

  13. Detecting text in natural scenes with multi-level MSER and SWT

    Science.gov (United States)

    Lu, Tongwei; Liu, Renjun

    2018-04-01

    The detection of characters in natural scenes is susceptible to factors such as complex backgrounds, variable viewing angles and diverse forms of language, which leads to poor detection results. To address these problems, a new text detection method was proposed, consisting of two main stages: candidate region extraction and text region detection. In the first stage, the method used multiple scale transformations of the original image and multiple thresholds of maximally stable extremal regions (MSER) so that character regions could be detected comprehensively. In the second stage, SWT maps were obtained by applying the stroke width transform (SWT) algorithm to the candidate regions, and cascaded classifiers were then used to filter out non-text regions. The proposed method was evaluated on the standard ICDAR2011 benchmark datasets and on data sets that we built ourselves. The experimental results showed that the proposed method greatly improved on other text detection methods.

  14. Interaction between scene-based and array-based contextual cueing.

    Science.gov (United States)

    Rosenbaum, Gail M; Jiang, Yuhong V

    2013-07-01

    Contextual cueing refers to the cueing of spatial attention by repeated spatial context. Previous studies have demonstrated distinctive properties of contextual cueing by background scenes and by an array of search items. Whereas scene-based contextual cueing reflects explicit learning of the scene-target association, array-based contextual cueing is supported primarily by implicit learning. In this study, we investigated the interaction between scene-based and array-based contextual cueing. Participants searched for a target that was predicted by both the background scene and the locations of distractor items. We tested three possible patterns of interaction: (1) The scene and the array could be learned independently, in which case cueing should be expressed even when only one cue was preserved; (2) the scene and array could be learned jointly, in which case cueing should occur only when both cues were preserved; (3) overshadowing might occur, in which case learning of the stronger cue should preclude learning of the weaker cue. In several experiments, we manipulated the nature of the contextual cues present during training and testing. We also tested explicit awareness of scenes, scene-target associations, and arrays. The results supported the overshadowing account: Specifically, scene-based contextual cueing precluded array-based contextual cueing when both were predictive of the location of a search target. We suggest that explicit, endogenous cues dominate over implicit cues in guiding spatial attention.

  15. Gamifying Video Object Segmentation.

    Science.gov (United States)

    Spampinato, Concetto; Palazzo, Simone; Giordano, Daniela

    2017-10-01

    Video object segmentation can be considered as one of the most challenging computer vision problems. Indeed, so far, no existing solution is able to effectively deal with the peculiarities of real-world videos, especially in cases of articulated motion and object occlusions; limitations that appear more evident when we compare the performance of automated methods with the human one. However, manually segmenting objects in videos is largely impractical as it requires a lot of time and concentration. To address this problem, in this paper we propose an interactive video object segmentation method, which exploits, on one hand, the capability of humans to identify correctly objects in visual scenes, and on the other hand, the collective human brainpower to solve challenging and large-scale tasks. In particular, our method relies on a game with a purpose to collect human inputs on object locations, followed by an accurate segmentation phase achieved by optimizing an energy function encoding spatial and temporal constraints between object regions as well as human-provided location priors. Performance analysis carried out on complex video benchmarks, and exploiting data provided by over 60 users, demonstrated that our method shows a better trade-off between annotation times and segmentation accuracy than interactive video annotation and automated video object segmentation approaches.

  16. Video segmentation and camera motion characterization using compressed data

    Science.gov (United States)

    Milanese, Ruggero; Deguillaume, Frederic; Jacot-Descombes, Alain

    1997-10-01

    We address the problem of automatically extracting visual indexes from videos, in order to provide sophisticated access methods to the contents of a video server. We focus on two tasks, namely the decomposition of a video clip into uniform segments, and the characterization of each shot by camera motion parameters. For the first task we use a Bayesian classification approach to detecting scene cuts by analyzing motion vectors. For the second task a least-squares fitting procedure determines the pan/tilt/zoom camera parameters. In order to guarantee the highest processing speed, all techniques process and analyze MPEG-1 motion vectors directly, without need for video decompression. Experimental results are reported for a database of news video clips.
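
    The least-squares step for the second task can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a three-parameter motion model in which a block at position (x, y), measured from the image centre, has motion vector (pan + zoom*x, tilt + zoom*y). The abstract does not state the exact model used, and the name `fit_pan_tilt_zoom` and the tuple layout are hypothetical.

```python
from typing import List, Tuple

# One motion vector: block centre (x, y) relative to the image centre and
# its displacement (u, v). This layout is an assumption made for the sketch.
MotionVector = Tuple[float, float, float, float]


def fit_pan_tilt_zoom(vectors: List[MotionVector]) -> Tuple[float, float, float]:
    """Least-squares fit of u = pan + zoom*x and v = tilt + zoom*y.

    Solves the normal equations of the assumed three-parameter model in
    closed form and returns (pan, tilt, zoom).
    """
    n = len(vectors)
    if n < 2:
        raise ValueError("need at least two motion vectors")

    # Sums that appear in the normal equations.
    sx = sum(x for x, _, _, _ in vectors)
    sy = sum(y for _, y, _, _ in vectors)
    su = sum(u for _, _, u, _ in vectors)
    sv = sum(v for _, _, _, v in vectors)
    srr = sum(x * x + y * y for x, y, _, _ in vectors)
    sxu = sum(x * u + y * v for x, y, u, v in vectors)

    # Eliminate pan and tilt, then solve the remaining equation for zoom.
    denom = srr - (sx * sx + sy * sy) / n
    if abs(denom) < 1e-12:
        raise ValueError("block positions are degenerate; zoom is not identifiable")
    zoom = (sxu - (su * sx + sv * sy) / n) / denom
    pan = (su - zoom * sx) / n
    tilt = (sv - zoom * sy) / n
    return pan, tilt, zoom
```

    Feeding the function a grid of block positions with synthetic vectors generated from known pan, tilt and zoom values should recover those values. Real compressed-domain vectors are noisy, so a practical system would probably need outlier rejection before fitting.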

  17. Validity and reliability of naturalistic driving scene categorization Judgments from crowdsourcing.

    Science.gov (United States)

    Cabrall, Christopher D D; Lu, Zhenji; Kyriakidis, Miltos; Manca, Laura; Dijksterhuis, Chris; Happee, Riender; de Winter, Joost

    2018-05-01

    A common challenge with processing naturalistic driving data is that humans may need to categorize great volumes of recorded visual information. By means of the online platform CrowdFlower, we investigated the potential of crowdsourcing to categorize driving scene features (i.e., presence of other road users, straight road segments, etc.) at greater scale than a single person or a small team of researchers would be capable of. In total, 200 workers from 46 different countries participated in 1.5 days. Validity and reliability were examined, both with and without embedding researcher generated control questions via the CrowdFlower mechanism known as Gold Test Questions (GTQs). By employing GTQs, we found significantly more valid (accurate) and reliable (consistent) identification of driving scene items from external workers. Specifically, at a small scale CrowdFlower Job of 48 three-second video segments, an accuracy (i.e., relative to the ratings of a confederate researcher) of 91% on items was found with GTQs compared to 78% without. A difference in bias was found, where without GTQs, external workers returned more false positives than with GTQs. At a larger scale CrowdFlower Job making exclusive use of GTQs, 12,862 three-second video segments were released for annotation. Infeasible (and self-defeating) to check the accuracy of each at this scale, a random subset of 1012 categorizations was validated and returned similar levels of accuracy (95%). In the small scale Job, where full video segments were repeated in triplicate, the percentage of unanimous agreement on the items was found significantly more consistent when using GTQs (90%) than without them (65%). Additionally, in the larger scale Job (where a single second of a video segment was overlapped by ratings of three sequentially neighboring segments), a mean unanimity of 94% was obtained with validated-as-correct ratings and 91% with non-validated ratings. Because the video segments overlapped in full for

  18. Picture models for 2-scene comics creating system

    Directory of Open Access Journals (Sweden)

    Miki UENO

    2015-03-01

    Full Text Available Recently, computer understanding of pictures and stories has become one of the most important research topics in computer science. However, there is little research on human-like understanding by computers, because pictures have no fixed format and contain more lyrical aspects than natural language. For picture understanding, a comic is a suitable target because it consists of a clear and simple plot and separate scenes. In this paper, we propose 2 different types of picture models for a 2-scene comics creating system. We also show how the 2-scene comics creating system can be applied by means of the proposed picture models.

  19. Hybrid Reality Lab Capabilities - Video 2

    Science.gov (United States)

    Delgado, Francisco J.; Noyes, Matthew

    2016-01-01

    Our Hybrid Reality and Advanced Operations Lab is developing incredibly realistic and immersive systems that could be used to provide training, support engineering analysis, and augment data collection for various human performance metrics at NASA. To get a better understanding of what Hybrid Reality is, let's go through the two most commonly known types of immersive realities: Virtual Reality, and Augmented Reality. Virtual Reality creates immersive scenes that are completely made up of digital information. This technology has been used to train astronauts at NASA, used during teleoperation of remote assets (arms, rovers, robots, etc.) and other activities. One challenge with Virtual Reality is that if you are using it for real-time applications (like landing an airplane) then the information used to create the virtual scenes can be old (i.e. visualized long after physical objects moved in the scene) and not accurate enough to land the airplane safely. This is where Augmented Reality comes in. Augmented Reality takes real-time environment information (from a camera or see-through window) and places digitally created information into the scene so that it matches with the video/glass information. Augmented Reality enhances real environment information collected with a live sensor or viewport (e.g. camera, window, etc.) with the information-rich visualization provided by Virtual Reality. Hybrid Reality takes Augmented Reality even further, by creating a higher level of immersion where interactivity can take place. Hybrid Reality takes Virtual Reality objects and a trackable, physical representation of those objects, places them in the same coordinate system, and allows people to interact with both objects' representations (virtual and physical) simultaneously. After a short period of adjustment, the individuals begin to interact with all the objects in the scene as if they were real-life objects. The ability to physically touch and interact with digitally created

  20. Theory and practice of perceptual video processing in broadcast encoders for cable, IPTV, satellite, and internet distribution

    Science.gov (United States)

    McCarthy, S.

    2014-02-01

    This paper describes the theory and application of a perceptually-inspired video processing technology that was recently incorporated into professional video encoders now being used by major cable, IPTV, satellite, and internet video service providers. We will present data that show that this perceptual video processing (PVP) technology can improve video compression efficiency by up to 50% for MPEG-2, H.264, and High Efficiency Video Coding (HEVC). The PVP technology described in this paper works by forming predicted eye-tracking attractor maps that indicate how likely it might be that a free viewing person would look at particular area of an image or video. We will introduce in this paper the novel model and supporting theory used to calculate the eye-tracking attractor maps. We will show how the underlying perceptual model was inspired by electrophysiological studies of the vertebrate retina, and will explain how the model incorporates statistical expectations about natural scenes as well as a novel method for predicting error in signal estimation tasks. Finally, we will describe how the eye-tracking attractor maps are created in real time and used to modify video prior to encoding so that it is more compressible but not noticeably different than the original unmodified video.

  1. Automated UAV-based mapping for airborne reconnaissance and video exploitation

    Science.gov (United States)

    Se, Stephen; Firoozfam, Pezhman; Goldstein, Norman; Wu, Linda; Dutkiewicz, Melanie; Pace, Paul; Naud, J. L. Pierre

    2009-05-01

    Airborne surveillance and reconnaissance are essential for successful military missions. Such capabilities are critical for force protection, situational awareness, mission planning, damage assessment and others. UAVs gather huge amounts of video data, but it is extremely labour-intensive for operators to analyse hours and hours of received data. At MDA, we have developed a suite of tools towards automated video exploitation including calibration, visualization, change detection and 3D reconstruction. The on-going work is to improve the robustness of these tools and automate the process as much as possible. Our calibration tool extracts and matches tie-points in the video frames incrementally to recover the camera calibration and poses, which are then refined by bundle adjustment. Our visualization tool stabilizes the video, expands its field-of-view and creates a geo-referenced mosaic from the video frames. It is important to identify anomalies in a scene, which may include detecting any improvised explosive devices (IED). However, it is tedious and difficult to compare video clips to look for differences manually. Our change detection tool allows the user to load two video clips taken from two passes at different times and flags any changes between them. 3D models are useful for situational awareness, as it is easier to understand the scene by visualizing it in 3D. Our 3D reconstruction tool creates calibrated photo-realistic 3D models from video clips taken from different viewpoints, using both semi-automated and automated approaches. The resulting 3D models also allow distance measurements and line-of-sight analysis.

  2. Improved content aware scene retargeting for retinitis pigmentosa patients

    Directory of Open Access Journals (Sweden)

    Al-Atabany Walid I

    2010-09-01

    Full Text Available Background: In this paper we present a novel scene retargeting technique to reduce the visual scene while maintaining the size of the key features. The algorithm is scalable to implementation onto portable devices, and thus, has potential for augmented reality systems to provide visual support for those with tunnel vision. We therefore test the efficacy of our algorithm on shrinking the visual scene into the remaining field of view for those patients. Methods: Simple spatial compression of visual scenes makes objects appear further away. We have therefore developed an algorithm which removes low importance information, maintaining the size of the significant features. Previous approaches in this field have included seam carving, which removes low importance seams from the scene, and shrinkability, which dynamically shrinks the scene according to a generated importance map. The former method causes significant artifacts and the latter is inefficient. In this work we have developed a new algorithm, combining the best aspects of these two previous methods. In particular, our approach is to generate a shrinkability importance map using a seam-based approach. We then use it to dynamically shrink the scene in similar fashion to the shrinkability method. Importantly, we have implemented it so that it can be used in real time without prior knowledge of future frames. Results: We have evaluated and compared our algorithm to the seam carving and image shrinkability approaches from a content preservation perspective and a compression quality perspective. Our technique has also been evaluated and tested in a trial that included 20 participants with simulated tunnel vision. Results show the robustness of our method at reducing scenes up to 50% with minimal distortion. We also demonstrate efficacy in its use for those with simulated tunnel vision of 22 degrees of field of view or less. Conclusions: Our approach allows us to perform content aware video

  3. Flexible Human Behavior Analysis Framework for Video Surveillance Applications

    Directory of Open Access Journals (Sweden)

    Weilun Lao

    2010-01-01

    Full Text Available We study a flexible framework for semantic analysis of human motion from surveillance video. Successful trajectory estimation and human-body modeling facilitate the semantic analysis of human activities in video sequences. Although human motion is widely investigated, we have extended such research in three aspects. First, by adding a second camera, not only is more reliable behavior analysis possible, but the ongoing scene events can also be mapped onto a 3D setting to facilitate further semantic analysis. The second contribution is the introduction of a 3D reconstruction scheme for scene understanding. Thirdly, we use a fast scheme to detect different body parts and generate a fitting skeleton model, without using the explicit assumption of upright body posture. The extension of multiple-view fusion improves the event-based semantic analysis by 15%–30%. Our proposed framework proves its effectiveness as it achieves near real-time performance (13–15 frames/second and 6–8 frames/second for monocular and two-view video sequences, respectively).

  4. Influences of High-Level Features, Gaze, and Scene Transitions on the Reliability of BOLD Responses to Natural Movie Stimuli

    Science.gov (United States)

    Lu, Kun-Han; Hung, Shao-Chin; Wen, Haiguang; Marussich, Lauren; Liu, Zhongming

    2016-01-01

    Complex, sustained, dynamic, and naturalistic visual stimulation can evoke distributed brain activities that are highly reproducible within and across individuals. However, the precise origins of such reproducible responses remain incompletely understood. Here, we employed concurrent functional magnetic resonance imaging (fMRI) and eye tracking to investigate the experimental and behavioral factors that influence fMRI activity and its intra- and inter-subject reproducibility during repeated movie stimuli. We found that widely distributed and highly reproducible fMRI responses were attributed primarily to the high-level natural content in the movie. In the absence of such natural content, low-level visual features alone in a spatiotemporally scrambled control stimulus evoked significantly reduced degree and extent of reproducible responses, which were mostly confined to the primary visual cortex (V1). We also found that the varying gaze behavior affected the cortical response at the peripheral part of V1 and in the oculomotor network, with minor effects on the response reproducibility over the extrastriate visual areas. Lastly, scene transitions in the movie stimulus due to film editing partly caused the reproducible fMRI responses at widespread cortical areas, especially along the ventral visual pathway. Therefore, the naturalistic nature of a movie stimulus is necessary for driving highly reliable visual activations. In a movie-stimulation paradigm, scene transitions and individuals’ gaze behavior should be taken as potential confounding factors in order to properly interpret cortical activity that supports natural vision. PMID:27564573

  5. Adaptive colour contrast coding in the salamander retina efficiently matches natural scene statistics.

    Directory of Open Access Journals (Sweden)

    Genadiy Vasserman

    Full Text Available The visual system continually adjusts its sensitivity to the statistical properties of the environment through an adaptation process that starts in the retina. Colour perception and processing is commonly thought to occur mainly in high visual areas, and indeed most evidence for chromatic colour contrast adaptation comes from cortical studies. We show that colour contrast adaptation starts in the retina where ganglion cells adjust their responses to the spectral properties of the environment. We demonstrate that the ganglion cells match their responses to red-blue stimulus combinations according to the relative contrast of each of the input channels by rotating their functional response properties in colour space. Using measurements of the chromatic statistics of natural environments, we show that the retina balances inputs from the two (red and blue) stimulated colour channels, as would be expected from theoretical optimal behaviour. Our results suggest that colour is encoded in the retina based on the efficient processing of spectral information that matches spectral combinations in natural scenes on the colour processing level.

  6. Tackling action-based video abstraction of animated movies for video browsing

    Science.gov (United States)

    Ionescu, Bogdan; Ott, Laurent; Lambert, Patrick; Coquin, Didier; Pacureanu, Alexandra; Buzuloiu, Vasile

    2010-07-01

    We address the issue of producing automatic video abstracts in the context of the video indexing of animated movies. For a quick browse of a movie's visual content, we propose a storyboard-like summary, which follows the movie's events by retaining one key frame for each specific scene. To capture the shot's visual activity, we use histograms of cumulative interframe distances, and the key frames are selected according to the distribution of the histogram's modes. For a preview of the movie's exciting action parts, we propose a trailer-like video highlight, whose aim is to show only the most interesting parts of the movie. Our method is based on a relatively standard approach, i.e., highlighting action through the analysis of the movie's rhythm and visual activity information. To suit every type of movie content, including predominantly static movies or movies without exciting parts, the concept of action depends on the movie's average rhythm. The efficiency of our approach is confirmed through several end-user studies.

  7. Short report: the effect of expertise in hiking on recognition memory for mountain scenes.

    Science.gov (United States)

    Kawamura, Satoru; Suzuki, Sae; Morikawa, Kazunori

    2007-10-01

    The nature of an expert memory advantage that does not depend on stimulus structure or chunking was examined, using more ecologically valid stimuli in the context of a more natural activity than previously studied domains. Do expert hikers and novice hikers see and remember mountain scenes differently? In the present experiment, 18 novice hikers and 17 expert hikers were presented with 60 photographs of scenes from hiking trails. These scenes differed in the degree of functional aspects that implied some action possibilities or dangers. The recognition test revealed that the memory performance of experts was significantly superior to that of novices for scenes with highly functional aspects. The memory performance for the scenes with few functional aspects did not differ between novices and experts. These results suggest that experts pay more attention to, and thus remember better, scenes with functional meanings than do novices.

  8. Authentication Approaches for Standoff Video Surveillance

    International Nuclear Information System (INIS)

    Baldwin, G.; Sweatt, W.; Thomas, M.

    2015-01-01

    Video surveillance for international nuclear safeguards applications requires authentication, which confirms to an inspector reviewing the surveillance images that both the source and the integrity of those images can be trusted. To date, all such authentication approaches originate at the camera. Camera authentication would not suffice for a "standoff video" application, where the surveillance camera views an image piped to it from a distant objective lens. Standoff video might be desired in situations where it does not make sense to expose sensitive and costly camera electronics to contamination, radiation, water immersion, or other adverse environments typical of hot cells, reprocessing facilities, and within spent fuel pools, for example. In this paper, we offer optical architectures that introduce a standoff distance of several metres between the scene and camera. Several schemes enable one to authenticate not only that the extended optical path is secure, but also that the scene is being viewed live. They employ optical components with remotely-operated spectral, temporal, directional, and intensity properties that are under the control of the inspector. If permitted by the facility operator, illuminators, reflectors and polarizers placed in the scene offer further possibilities. Any tampering that would insert an alternative image source for the camera, although undetectable with conventional cryptographic authentication of digital camera data, is easily exposed using the approaches we describe. Sandia National Laboratories is a multi-programme laboratory managed and operated by Sandia Corporation, a wholly owned subsidiary of Lockheed Martin Corporation, for the U.S. Department of Energy's National Nuclear Security Administration under contract DE-AC04-94AL85000. Support to Sandia National Laboratories provided by the NNSA Next Generation Safeguards Initiative is gratefully acknowledged. SAND2014-3196 A. (author)

  9. A Method for Counting Moving People in Video Surveillance Videos

    Directory of Open Access Journals (Sweden)

    Mario Vento

    2010-01-01

    Full Text Available People counting is an important problem in video surveillance applications. This problem has been faced either by trying to detect people in the scene and then counting them or by establishing a mapping between some scene feature and the number of people (avoiding the complex detection problem). This paper presents a novel method, following this second approach, that is based on the use of SURF features and of an ϵ-SVR regressor to provide an estimate of this count. The algorithm takes specifically into account problems due to partial occlusions and to perspective. In the experimental evaluation, the proposed method has been compared with the algorithm by Albiol et al., winner of the PETS 2009 contest on people counting, using the same PETS 2009 database. The provided results confirm that the proposed method yields an improved accuracy, while retaining the robustness of Albiol's algorithm.

  10. A Method for Counting Moving People in Video Surveillance Videos

    Directory of Open Access Journals (Sweden)

    Conte Donatello

    2010-01-01

    Full Text Available People counting is an important problem in video surveillance applications. This problem has been faced either by trying to detect people in the scene and then counting them or by establishing a mapping between some scene feature and the number of people (avoiding the complex detection problem). This paper presents a novel method, following this second approach, that is based on the use of SURF features and of an ϵ-SVR regressor to provide an estimate of this count. The algorithm takes specifically into account problems due to partial occlusions and to perspective. In the experimental evaluation, the proposed method has been compared with the algorithm by Albiol et al., winner of the PETS 2009 contest on people counting, using the same PETS 2009 database. The provided results confirm that the proposed method yields an improved accuracy, while retaining the robustness of Albiol's algorithm.

  11. A Method for Counting Moving People in Video Surveillance Videos

    Science.gov (United States)

    Conte, Donatello; Foggia, Pasquale; Percannella, Gennaro; Tufano, Francesco; Vento, Mario

    2010-12-01

    People counting is an important problem in video surveillance applications. This problem has been faced either by trying to detect people in the scene and then counting them or by establishing a mapping between some scene feature and the number of people (avoiding the complex detection problem). This paper presents a novel method, following this second approach, that is based on the use of SURF features and of an ϵ-SVR regressor to provide an estimate of this count. The algorithm takes specifically into account problems due to partial occlusions and to perspective. In the experimental evaluation, the proposed method has been compared with the algorithm by Albiol et al., winner of the PETS 2009 contest on people counting, using the same PETS 2009 database. The provided results confirm that the proposed method yields an improved accuracy, while retaining the robustness of Albiol's algorithm.
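
    The idea of mapping a scene feature to a people count can be illustrated with the epsilon-insensitive loss that defines SVR-style regression. This is a sketch under stated assumptions, not the authors' method: the single input feature (a count of detected feature points), the linear predictor, and the names `predict_count`, `epsilon_insensitive_loss` and `total_loss` are all hypothetical, and no SVR training is performed here.

```python
from typing import Sequence


def predict_count(num_points: float, weight: float, bias: float) -> float:
    """Hypothetical linear map from a feature-point count to a people count."""
    return weight * num_points + bias


def epsilon_insensitive_loss(target: float, prediction: float, epsilon: float) -> float:
    """Errors smaller than epsilon cost nothing; larger errors cost the excess."""
    return max(0.0, abs(target - prediction) - epsilon)


def total_loss(points: Sequence[float], counts: Sequence[float],
               weight: float, bias: float, epsilon: float) -> float:
    """Sum of epsilon-insensitive losses over (feature count, true count) pairs."""
    if len(points) != len(counts):
        raise ValueError("points and counts must have the same length")
    return sum(
        epsilon_insensitive_loss(c, predict_count(p, weight, bias), epsilon)
        for p, c in zip(points, counts)
    )
```

    A regressor of this family would choose the weight and bias that keep this loss low while also penalising large weights. The abstract's corrections for occlusion and perspective are not modelled in this sketch.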

  12. High-quality and small-capacity e-learning video featuring lecturer-superimposing PC screen images

    Science.gov (United States)

    Nomura, Yoshihiko; Murakami, Michinobu; Sakamoto, Ryota; Sugiura, Tokuhiro; Matsui, Hirokazu; Kato, Norihiko

    2006-10-01

    Information processing and communication technology are progressing quickly and spreading throughout various technological fields. The development of such technology should therefore respond to the need to improve quality in e-learning education systems. The authors propose a new video-image compression processing system that exploits the features of the lecturing scene. While the dynamic lecturing scene is shot by a digital video camera, screen images are stored electronically by PC screen-capture software at relatively long intervals during a practical class. Then, the lecturer and the lecture stick are extracted from the digital video images by pattern recognition techniques, and the extracted images are superimposed on the appropriate PC screen images by off-line processing. Thus, we have succeeded in creating high-quality and small-capacity (HQ/SC) video-on-demand educational content with the following advantages: high image sharpness, small electronic file size, and realistic lecturer motion.

  13. Semantic-based surveillance video retrieval.

    Science.gov (United States)

    Hu, Weiming; Xie, Dan; Fu, Zhouyu; Zeng, Wenrong; Maybank, Steve

    2007-04-01

    Visual surveillance produces large amounts of video data. Effective indexing and retrieval from surveillance video databases are very important. Although there are many ways to represent the content of video clips in current video retrieval algorithms, there still exists a semantic gap between users and retrieval systems. Visual surveillance systems supply a platform for investigating semantic-based video retrieval. In this paper, a semantic-based video retrieval framework for visual surveillance is proposed. A cluster-based tracking algorithm is developed to acquire motion trajectories. The trajectories are then clustered hierarchically using the spatial and temporal information, to learn activity models. A hierarchical structure of semantic indexing and retrieval of object activities, where each individual activity automatically inherits all the semantic descriptions of the activity model to which it belongs, is proposed for accessing video clips and individual objects at the semantic level. The proposed retrieval framework supports various queries including queries by keywords, multiple object queries, and queries by sketch. For multiple object queries, succession and simultaneity restrictions, together with depth and breadth first orders, are considered. For sketch-based queries, a method for matching trajectories drawn by users to spatial trajectories is proposed. The effectiveness and efficiency of our framework are tested in a crowded traffic scene.

  14. Impulsive noise removal from color video with morphological filtering

    Science.gov (United States)

    Ruchay, Alexey; Kober, Vitaly

    2017-09-01

    This paper deals with impulse noise removal from color video. The proposed noise removal algorithm employs switching filtering for denoising of color video; that is, detection of corrupted pixels by means of a novel morphological filtering, followed by removal of the detected pixels on the basis of an estimate of uncorrupted pixels in the previous scenes. With the help of computer simulation, we show that the proposed algorithm is able to remove impulse noise in color video effectively. The performance of the proposed algorithm is compared, in terms of image restoration metrics, with that of common successful algorithms.
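
    The switching idea in the abstract above (detect first, then repair only the flagged pixels) can be sketched as follows. This is a minimal illustration, not the authors' algorithm: the detector is a simple neighbourhood min/max test standing in for their morphological filter, the repair step copies the co-located pixel from a previous frame that is assumed to be clean, and the names `detect_impulses` and `switching_filter` and the `threshold` default are hypothetical.

```python
import numpy as np


def neighbour_extrema(channel):
    """Return the min and max over the 8 neighbours of every pixel (centre excluded)."""
    padded = np.pad(channel.astype(np.int32), 1, mode="edge")
    h, w = channel.shape
    shifted = [padded[dy:dy + h, dx:dx + w]
               for dy in range(3) for dx in range(3) if (dy, dx) != (1, 1)]
    stack = np.stack(shifted)
    return stack.min(axis=0), stack.max(axis=0)


def detect_impulses(frame, threshold=40):
    """Flag pixels that stick out of their neighbourhood in any colour channel.

    Stand-in detector (assumption): the abstract does not specify the
    morphological filter that the authors actually use.
    """
    mask = np.zeros(frame.shape[:2], dtype=bool)
    for c in range(frame.shape[2]):
        lo, hi = neighbour_extrema(frame[..., c])
        value = frame[..., c].astype(np.int32)
        mask |= (value > hi + threshold) | (value < lo - threshold)
    return mask


def switching_filter(frame, previous_clean, threshold=40):
    """Replace only the flagged pixels, using the previous (assumed clean) frame."""
    mask = detect_impulses(frame, threshold)
    restored = frame.copy()
    restored[mask] = previous_clean[mask]
    return restored, mask
```

    Because unflagged pixels pass through unchanged, a filter of this kind leaves uncorrupted image detail alone instead of smoothing the whole frame.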

  15. Effect of Viewing Smoking Scenes in Motion Pictures on Subsequent Smoking Desire in Audiences in South Korea.

    Science.gov (United States)

    Sohn, Minsung; Jung, Minsoo

    2017-07-17

    In the modern era of heightened awareness of public health, smoking scenes in movies remain relatively free from public monitoring. The effect of smoking scenes in movies on the promotion of viewers' smoking desire remains unknown. The study aimed to explore whether exposure of adolescent smokers to images of smoking in films could stimulate smoking behavior. Data were derived from a national Web-based sample survey of 748 Korean high-school students. Participants aged 16-18 years were randomly assigned to watch three short video clips with or without smoking scenes. After adjusting covariates using propensity score matching, paired sample t test and logistic regression analyses compared the difference in smoking desire before and after exposure of participants to smoking scenes. For male adolescents, cigarette craving was significantly higher in those who watched movies with smoking scenes than in the control group who did not view smoking scenes (t 307.96 =2.066, P [...] films and assigning a smoking-related screening grade to films is warranted. ©Minsung Sohn, Minsoo Jung. Originally published in JMIR Public Health and Surveillance (http://publichealth.jmir.org), 17.07.2017.

  16. Video repairing under variable illumination using cyclic motions.

    Science.gov (United States)

    Jia, Jiaya; Tai, Yu-Wing; Wu, Tai-Pang; Tang, Chi-Keung

    2006-05-01

    This paper presents a complete system capable of synthesizing a large number of pixels that are missing due to occlusion or damage in an uncalibrated input video. These missing pixels may correspond to the static background or cyclic motions of the captured scene. Our system employs user-assisted video layer segmentation, while the main processing in video repair is fully automatic. The input video is first decomposed into the color and illumination videos. The necessary temporal consistency is maintained by tensor voting in the spatio-temporal domain. Missing colors and illumination of the background are synthesized by applying image repairing. Finally, the occluded motions are inferred by spatio-temporal alignment of collected samples at multiple scales. We experimented on our system with some difficult examples with variable illumination, where the capturing camera can be stationary or in motion.

  17. On the contribution of binocular disparity to the long-term memory for natural scenes.

    Directory of Open Access Journals (Sweden)

    Matteo Valsecchi

    Full Text Available Binocular disparity is a fundamental dimension defining the input we receive from the visual world, along with luminance and chromaticity. In a memory task involving images of natural scenes we investigate whether binocular disparity enhances long-term visual memory. We found that forest images studied in the presence of disparity for relatively long times (7 s) were remembered better than those studied in 2D presentation. This enhancement was not evident for other categories of pictures, such as images containing cars and houses, which are mostly identified by the presence of distinctive artifacts rather than by their spatial layout. Evidence from a further experiment indicates that observers do not retain a trace of stereo presentation in long-term memory.

  18. Integration of an open interface PC scene generator using COTS DVI converter hardware

    Science.gov (United States)

    Nordland, Todd; Lyles, Patrick; Schultz, Bret

    2006-05-01

    Commercial-Off-The-Shelf (COTS) personal computer (PC) hardware is increasingly capable of computing high dynamic range (HDR) scenes for military sensor testing at high frame rates. New electro-optical and infrared (EO/IR) scene projectors feature electrical interfaces that can accept the DVI output of these PC systems. However, military Hardware-in-the-loop (HWIL) facilities such as those at the US Army Aviation and Missile Research Development and Engineering Center (AMRDEC) utilize a sizeable inventory of existing projection systems that were designed to use the Silicon Graphics Incorporated (SGI) digital video port (DVP, also known as DVP2 or DD02) interface. To mate the new DVI-based scene generation systems to these legacy projection systems, CG2 Inc., a Quantum3D Company (CG2), has developed a DVI-to-DVP converter called Delta DVP. This device takes progressive scan DVI input, converts it to digital parallel data, and combines and routes color components to derive a 16-bit wide luminance channel replicated on a DVP output interface. The HWIL Functional Area of AMRDEC has developed a suite of modular software to perform deterministic real-time, wave band-specific rendering of sensor scenes, leveraging the features of commodity graphics hardware and open source software. Together, these technologies enable sensor simulation and test facilities to integrate scene generation and projection components with diverse pedigrees.

  19. Multi-Model Estimation Based Moving Object Detection for Aerial Video

    Directory of Open Access Journals (Sweden)

    Yanning Zhang

    2015-04-01

    Full Text Available With the wide development of UAV (Unmanned Aerial Vehicle) technology, moving target detection for aerial video has become a popular research topic in the computer field. Most of the existing methods are under the registration-detection framework and can only deal with simple background scenes. They tend to go wrong in complex multi-background scenarios, such as viaducts, buildings and trees. In this paper, we break through the single-background constraint and perceive the complex scene accurately by automatic estimation of multiple background models. First, we segment the scene into several color blocks and estimate the dense optical flow. Then, we calculate an affine transformation model for each block with large area and merge the consistent models. Finally, we calculate the subordinate degree to the multi-background models pixel by pixel for all small-area blocks. Moving objects are segmented by means of an energy optimization method solved via Graph Cuts. The extensive experimental results on public aerial videos show that, owing to the estimation of multiple background models and the analysis of each pixel's subordinate relationship to the models by energy minimization, our method can effectively remove buildings, trees and other false alarms and detect moving objects correctly.
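
    The per-block model fitting described in the abstract above can be illustrated with an ordinary least-squares fit of an affine motion model to optical-flow vectors. This is a sketch under stated assumptions, not the authors' implementation: the function names are hypothetical, and the residual norm is used here only as a simple proxy for how well a pixel agrees with a background model (the abstract does not define its "subordinate degree").

```python
import numpy as np


def fit_affine_flow(points, flow):
    """Least-squares affine motion model, so that flow ≈ [x, y, 1] @ params.

    points: (N, 2) array of pixel coordinates (x, y) inside one block.
    flow:   (N, 2) array of optical-flow vectors (u, v) at those pixels.
    Returns a (3, 2) parameter matrix.
    """
    design = np.hstack([points, np.ones((len(points), 1))])
    params, *_ = np.linalg.lstsq(design, flow, rcond=None)
    return params


def model_residuals(points, flow, params):
    """Per-pixel distance between the observed flow and the model prediction."""
    design = np.hstack([points, np.ones((len(points), 1))])
    return np.linalg.norm(design @ params - flow, axis=1)
```

    A pixel whose flow leaves a large residual under every fitted background model is a candidate moving-object pixel; in the paper that decision is made by energy minimization rather than by a fixed threshold.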

  20. Content-Aware Video Adaptation under Low-Bitrate Constraint

    Directory of Open Access Journals (Sweden)

    Hsiao Ming-Ho

    2007-01-01

    Full Text Available With the development of wireless networks and the improvement of mobile device capability, video streaming is more and more widespread in such an environment. Under the condition of limited resources and inherent constraints, appropriate video adaptations have become one of the most important and challenging issues in wireless multimedia applications. In this paper, we propose a novel content-aware video adaptation in order to effectively utilize resources and improve visual perceptual quality. First, the attention model is derived from analyzing the characteristics of brightness, location, motion vector, and energy features in the compressed domain to reduce computation complexity. Then, through the integration of the attention model, the capability of the client device and a correlational statistic model, attractive regions of video scenes are derived. The information object- (IOB-) weighted rate distortion model is used for adjusting the bit allocation. Finally, the video adaptation scheme dynamically adjusts the video bitstream at frame level and object level. Experimental results validate that the proposed scheme achieves better visual quality effectively and efficiently.

  1. Scene incongruity and attention.

    Science.gov (United States)

    Mack, Arien; Clarke, Jason; Erol, Muge; Bert, John

    2017-02-01

    Does scene incongruity, (a mismatch between scene gist and a semantically incongruent object), capture attention and lead to conscious perception? We explored this question using 4 different procedures: Inattention (Experiment 1), Scene description (Experiment 2), Change detection (Experiment 3), and Iconic Memory (Experiment 4). We found no differences between scene incongruity and scene congruity in Experiments 1, 2, and 4, although in Experiment 3 change detection was faster for scenes containing an incongruent object. We offer an explanation for why the change detection results differ from the results of the other three experiments. In all four experiments, participants invariably failed to report the incongruity and routinely mis-described it by normalizing the incongruent object. None of the results supports the claim that semantic incongruity within a scene invariably captures attention and provide strong evidence of the dominant role of scene gist in determining what is perceived. Copyright © 2016 Elsevier Inc. All rights reserved.

  2. Feature diagnosticity and task context shape activity in human scene-selective cortex.

    Science.gov (United States)

    Lowe, Matthew X; Gallivan, Jason P; Ferber, Susanne; Cant, Jonathan S

    2016-01-15

    Scenes are constructed from multiple visual features, yet previous research investigating scene processing has often focused on the contributions of single features in isolation. In the real world, features rarely exist independently of one another and likely converge to inform scene identity in unique ways. Here, we utilize fMRI and pattern classification techniques to examine the interactions between task context (i.e., attend to diagnostic global scene features; texture or layout) and high-level scene attributes (content and spatial boundary) to test the novel hypothesis that scene-selective cortex represents multiple visual features, the importance of which varies according to their diagnostic relevance across scene categories and task demands. Our results show for the first time that scene representations are driven by interactions between multiple visual features and high-level scene attributes. Specifically, univariate analysis of scene-selective cortex revealed that task context and feature diagnosticity shape activity differentially across scene categories. Examination using multivariate decoding methods revealed results consistent with univariate findings, but also evidence for an interaction between high-level scene attributes and diagnostic visual features within scene categories. Critically, these findings suggest visual feature representations are not distributed uniformly across scene categories but are shaped by task context and feature diagnosticity. Thus, we propose that scene-selective cortex constructs a flexible representation of the environment by integrating multiple diagnostically relevant visual features, the nature of which varies according to the particular scene being perceived and the goals of the observer. Copyright © 2015 Elsevier Inc. All rights reserved.

  3. Privacy enabling technology for video surveillance

    Science.gov (United States)

    Dufaux, Frédéric; Ouaret, Mourad; Abdeljaoued, Yousri; Navarro, Alfonso; Vergnenègre, Fabrice; Ebrahimi, Touradj

    2006-05-01

    In this paper, we address the problem of privacy in video surveillance. We propose an efficient solution based on transform-domain scrambling of regions of interest in a video sequence. More specifically, the sign of selected transform coefficients is flipped during encoding. We address more specifically the case of Motion JPEG 2000. Simulation results show that the technique can be successfully applied to conceal information in regions of interest in the scene while providing a good level of security. Furthermore, the scrambling is flexible and allows adjusting the amount of distortion introduced. This is achieved with a small impact on coding performance and a negligible increase in computational complexity. In the proposed video surveillance system, heterogeneous clients can remotely access the system through the Internet or a 2G/3G mobile phone network. Thanks to the inherently scalable Motion JPEG 2000 codestream, the server is able to adapt the resolution and bandwidth of the delivered video depending on the usage environment of the client.
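
    The sign-flipping step described in the abstract above can be illustrated on a plain array of transform coefficients. This is a minimal sketch, not the authors' codec integration: it assumes the coefficients are already available as a NumPy array, that the flips are chosen by a pseudo-random generator seeded with a secret key, and that a boolean mask marks the region of interest. The name `scramble_signs` and its parameters are hypothetical.

```python
import numpy as np


def scramble_signs(coeffs, roi_mask, key):
    """Flip the sign of pseudo-randomly selected coefficients inside the ROI.

    coeffs:   array of transform coefficients (any shape).
    roi_mask: boolean array of the same shape, True inside the region of interest.
    key:      integer seed shared by the scrambler and the authorized descrambler.
    """
    rng = np.random.default_rng(key)
    flip = rng.integers(0, 2, size=coeffs.shape).astype(bool) & roi_mask
    out = coeffs.copy()
    out[flip] = -out[flip]
    return out
```

    Because a sign flip is its own inverse, calling `scramble_signs` a second time with the same key and mask restores the original coefficients, while coefficient magnitudes are never changed.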

  4. Snapshot spectral and polarimetric imaging; target identification with multispectral video

    Science.gov (United States)

    Bartlett, Brent D.; Rodriguez, Mikel D.

    2013-05-01

    As the number of pixels continues to grow in consumer and scientific imaging devices, it has become feasible to collect the incident light field. In this paper, an imaging device developed around light field imaging is used to collect multispectral and polarimetric imagery in a snapshot fashion. The sensor is described and a video data set is shown highlighting the advantage of snapshot spectral imaging. Several novel computer vision approaches are applied to the video cubes to perform scene characterization and target identification. It is shown how the addition of spectral and polarimetric data to the video stream allows for multi-target identification and tracking not possible with traditional RGB video collection.

  5. Two Distinct Scene-Processing Networks Connecting Vision and Memory.

    Science.gov (United States)

    Baldassano, Christopher; Esteva, Andre; Fei-Fei, Li; Beck, Diane M

    2016-01-01

    A number of regions in the human brain are known to be involved in processing natural scenes, but the field has lacked a unifying framework for understanding how these different regions are organized and interact. We provide evidence from functional connectivity and meta-analyses for a new organizational principle, in which scene processing relies upon two distinct networks that split the classically defined parahippocampal place area (PPA). The first network of strongly connected regions consists of the occipital place area/transverse occipital sulcus and posterior PPA, which contain retinotopic maps and are not strongly coupled to the hippocampus at rest. The second network consists of the caudal inferior parietal lobule, retrosplenial complex, and anterior PPA, which connect to the hippocampus (especially anterior hippocampus), and are implicated in both visual and nonvisual tasks, including episodic memory and navigation. We propose that these two distinct networks capture the primary functional division among scene-processing regions, between those that process visual features from the current view of a scene and those that connect information from a current scene view with a much broader temporal and spatial context. This new framework for understanding the neural substrates of scene-processing bridges results from many lines of research, and makes specific functional predictions.

  6. Video Surveillance of Epilepsy Patients using Color Image Processing

    DEFF Research Database (Denmark)

    Bager, Gitte; Vilic, Kenan; Alving, Jørgen

    2007-01-01

    This report introduces a method for tracking of patients under video surveillance based on a marker system. The patients are not restricted in their movements, which requires a tracking system that can overcome non-ideal scenes e.g. occlusions, very fast movements, lighting issues and other movi...

  7. Video surveillance of epilepsy patients using color image processing

    DEFF Research Database (Denmark)

    Bager, Gitte; Vilic, Kenan; Vilic, Adnan

    2014-01-01

    This paper introduces a method for tracking patients under video surveillance based on a color marker system. The patients are not restricted in their movements, which requires a tracking system that can overcome non-ideal scenes e.g. occlusions, very fast movements, lighting issues and other mov...

  8. The effects of multiview depth video compression on multiview rendering

    NARCIS (Netherlands)

    Merkle, P.; Morvan, Y.; Smolic, A.; Farin, D.S.; Mueller, K.; With, de P.H.N.; Wiegand, T.

    2009-01-01

    This article investigates the interaction between different techniques for depth compression and view synthesis rendering with multiview video plus scene depth data. Two different approaches for depth coding are compared, namely H.264/MVC, using temporal and inter-view reference images for efficient

  9. Local spectral anisotropy is a valid cue for figure-ground organization in natural scenes.

    Science.gov (United States)

    Ramenahalli, Sudarshan; Mihalas, Stefan; Niebur, Ernst

    2014-10-01

    An important step in the process of understanding visual scenes is their organization into different perceptual objects, which requires figure-ground segregation. The determination of which side of an occlusion boundary is figure (closer to the observer) and which is ground (further away from the observer) is made through a combination of global cues, like convexity, and local cues, like T-junctions. We here focus on a novel set of local cues in the intensity patterns along occlusion boundaries which we show to differ between figure and ground. Image patches are extracted from natural scenes from two standard image sets along the boundaries of objects and spectral analysis is performed separately on figure and ground. On the figure side, oriented spectral power orthogonal to the occlusion boundary significantly exceeds that parallel to the boundary. This "spectral anisotropy" is present only for higher spatial frequencies, and absent on the ground side. The difference in spectral anisotropy between the two sides of an occlusion border predicts which is the figure and which the background with an accuracy exceeding 60% per patch. Spectral anisotropy of close-by locations along the boundary co-varies but is largely independent over larger distances, which allows results from different image regions to be combined. Given the low cost of this strictly local computation, we propose that spectral anisotropy along occlusion boundaries is a valuable cue for figure-ground segregation. A database of images and extracted patches labeled for figure and ground is made freely available. Copyright © 2014 Elsevier Ltd. All rights reserved.

  10. History of Reading Struggles Linked to Enhanced Learning in Low Spatial Frequency Scenes

    Science.gov (United States)

    Schneps, Matthew H.; Brockmole, James R.; Sonnert, Gerhard; Pomplun, Marc

    2012-01-01

    People with dyslexia, who face lifelong struggles with reading, exhibit numerous associated low-level sensory deficits including deficits in focal attention. Countering this, studies have shown that struggling readers outperform typical readers in some visual tasks that integrate distributed information across an expanse. Though such abilities would be expected to facilitate scene memory, prior investigations using the contextual cueing paradigm failed to find corresponding advantages in dyslexia. We suggest that these studies were confounded by task-dependent effects exaggerating known focal attention deficits in dyslexia, and that, if natural scenes were used as the context, advantages would emerge. Here, we investigate this hypothesis by comparing college students with histories of severe lifelong reading difficulties (SR) and typical readers (TR) in contexts that vary attention load. We find no differences in contextual-cueing when spatial contexts are letter-like objects, or when contexts are natural scenes. However, the SR group significantly outperforms the TR group when contexts are low-pass filtered natural scenes [F(3, 39) = 3.15, p<.05]. These findings suggest that perception or memory for low spatial frequency components in scenes is enhanced in dyslexia. These findings are important because they suggest strengths for spatial learning in a population otherwise impaired, carrying implications for the education and support of students who face challenges in school. PMID:22558210

  11. History of reading struggles linked to enhanced learning in low spatial frequency scenes.

    Directory of Open Access Journals (Sweden)

    Matthew H Schneps

    Full Text Available People with dyslexia, who face lifelong struggles with reading, exhibit numerous associated low-level sensory deficits including deficits in focal attention. Countering this, studies have shown that struggling readers outperform typical readers in some visual tasks that integrate distributed information across an expanse. Though such abilities would be expected to facilitate scene memory, prior investigations using the contextual cueing paradigm failed to find corresponding advantages in dyslexia. We suggest that these studies were confounded by task-dependent effects exaggerating known focal attention deficits in dyslexia, and that, if natural scenes were used as the context, advantages would emerge. Here, we investigate this hypothesis by comparing college students with histories of severe lifelong reading difficulties (SR) and typical readers (TR) in contexts that vary attention load. We find no differences in contextual-cueing when spatial contexts are letter-like objects, or when contexts are natural scenes. However, the SR group significantly outperforms the TR group when contexts are low-pass filtered natural scenes [F(3, 39) = 3.15, p<.05]. These findings suggest that perception or memory for low spatial frequency components in scenes is enhanced in dyslexia. These findings are important because they suggest strengths for spatial learning in a population otherwise impaired, carrying implications for the education and support of students who face challenges in school.

  12. A research proposition for using high definition video in emergency medical services

    OpenAIRE

    Weerakkody, Vishanth; Molnar, Andreea; Irani, Zahir; El-Haddadeh, Ramzi

    2013-01-01

    In emergency situations, communication between the ambulance crew and an emergency department in the hospital can be crucial in determining the best decision for a patient's health. Currently, when an ambulance crew reports at an emergency, paramedics use voice communication from the scene of the emergency to the hospital. In critical life-threatening situations, use of high quality visual images and live video streaming can allow paramedics on the scene of an emergency to take better informed decisi...

  13. Repetition and brain potentials when recognizing natural scenes: task and emotion differences

    Science.gov (United States)

    Bradley, Margaret M.; Codispoti, Maurizio; Karlsson, Marie; Lang, Peter J.

    2013-01-01

    Repetition has long been known to facilitate memory performance, but its effects on event-related potentials (ERPs), measured as an index of recognition memory, are less well characterized. In Experiment 1, effects of both massed and distributed repetition on old–new ERPs were assessed during an immediate recognition test that followed incidental encoding of natural scenes that also varied in emotionality. Distributed repetition at encoding enhanced both memory performance and the amplitude of an old–new ERP difference over centro-parietal sensors. To assess whether these repetition effects reflect encoding or retrieval differences, the recognition task was replaced with passive viewing of old and new pictures in Experiment 2. In the absence of an explicit recognition task, ERPs were completely unaffected by repetition at encoding, and only emotional pictures prompted a modestly enhanced old–new difference. Taken together, the data suggest that repetition facilitates retrieval processes and that, in the absence of an explicit recognition task, differences in old–new ERPs are only apparent for affective cues. PMID:22842817

  14. Perceived Quality of Full HD Video - Subjective Quality Assessment

    Directory of Open Access Journals (Sweden)

    Juraj Bienik

    2016-01-01

    Full Text Available In recent years, interest in multimedia services has become a global trend, and this trend is still rising. Video quality is a very significant part of the bundle of multimedia services, which leads to a requirement for quality assessment in the video domain. The quality of video streamed across IP networks is generally influenced by two factors: transmission link imperfection and the efficiency of compression standards. This paper deals with subjective video quality assessment and the impact of the compression standards H.264, H.265 and VP9 on perceived video quality. The evaluation is done for four full HD sequences that differ in content; the distinction is based on the Spatial (SI) and Temporal (TI) Index of the test sequences. Finally, experimental results show a bitrate reduction of up to 30% for H.265 and VP9 compared with the reference H.264.
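
    The abstract above does not say how the Spatial and Temporal Index were computed. A common definition is the one in ITU-T Recommendation P.910: SI is the maximum over time of the spatial standard deviation of the Sobel-filtered luminance frame, and TI is the maximum over time of the spatial standard deviation of the difference between consecutive luminance frames. The sketch below assumes that definition and assumes frames are given as 2-D luminance arrays; the function names are hypothetical.

```python
import numpy as np


def sobel_magnitude(frame):
    """Sobel gradient magnitude on the interior pixels of a luminance frame."""
    f = frame.astype(np.float64)
    gx = (f[:-2, 2:] + 2 * f[1:-1, 2:] + f[2:, 2:]
          - f[:-2, :-2] - 2 * f[1:-1, :-2] - f[2:, :-2])
    gy = (f[2:, :-2] + 2 * f[2:, 1:-1] + f[2:, 2:]
          - f[:-2, :-2] - 2 * f[:-2, 1:-1] - f[:-2, 2:])
    return np.hypot(gx, gy)


def spatial_information(frames):
    """SI: maximum over frames of the std of the Sobel-filtered frame."""
    return max(sobel_magnitude(f).std() for f in frames)


def temporal_information(frames):
    """TI: maximum over time of the std of successive frame differences."""
    diffs = [b.astype(np.float64) - a.astype(np.float64)
             for a, b in zip(frames, frames[1:])]
    return max(d.std() for d in diffs)
```

    Under this definition a sequence with fine detail scores a high SI and a sequence with strong motion or frequent cuts scores a high TI, which is why the two values are often used to choose test sequences with contrasting content.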

  15. Multicamera High Dynamic Range High-Speed Video of Rocket Engine Tests and Launches

    Data.gov (United States)

    National Aeronautics and Space Administration — High-speed video recording of rocket engine tests has several challenges. The scenes that are imaged have both bright and dark regions associated with plume emission...

  16. Key Issues in Modeling of Complex 3D Structures from Video Sequences

    Directory of Open Access Journals (Sweden)

    Shengyong Chen

    2012-01-01

    Full Text Available Construction of three-dimensional structures from video sequences has wide applications for intelligent video analysis. This paper summarizes the key issues of the theory and surveys the recent advances in the state of the art. Reconstruction of a scene object from video sequences often takes the basic principle of structure from motion with an uncalibrated camera. This paper lists the typical strategies and summarizes the typical solutions or algorithms for modeling of complex three-dimensional structures. Open difficult problems are also suggested for further study.

  17. Anticipatory Scene Representation in Preschool Children's Recall and Recognition Memory

    Science.gov (United States)

    Kreindel, Erica; Intraub, Helene

    2017-01-01

    Behavioral and neuroscience research on boundary extension (false memory beyond the edges of a view of a scene) has provided new insights into the constructive nature of scene representation, and motivates questions about development. Early research with children (as young as 6-7 years) was consistent with boundary extension, but relied on an…

  18. Parietal cortex integrates contextual and saliency signals during the encoding of natural scenes in working memory.

    Science.gov (United States)

    Santangelo, Valerio; Di Francesco, Simona Arianna; Mastroberardino, Serena; Macaluso, Emiliano

    2015-12-01

    The brief presentation of a complex scene entails that only a few objects can be selected, processed in depth, and stored in memory. Both low-level sensory salience and high-level context-related factors (e.g., the conceptual match/mismatch between objects and scene context) contribute to this selection process, but how the interplay between these factors affects memory encoding is largely unexplored. Here, during fMRI we presented participants with pictures of everyday scenes. After a short retention interval, participants judged the position of a target object extracted from the initial scene. The target object could be either congruent or incongruent with the context of the scene, and could be located in a region of the image with maximal or minimal salience. Behaviourally, we found a reduced impact of saliency on visuospatial working memory performance when the target was out-of-context. Encoding-related fMRI results showed that context-congruent targets activated dorsoparietal regions, while context-incongruent targets de-activated the ventroparietal cortex. Saliency modulated activity both in dorsal and ventral regions, with larger context-related effects for salient targets. These findings demonstrate the joint contribution of knowledge-based and saliency-driven attention for memory encoding, highlighting a dissociation between dorsal and ventral parietal regions. © 2015 Wiley Periodicals, Inc.

  19. An evaluation of parent-produced video self-modeling to improve independence in an adolescent with intellectual developmental disorder and an autism spectrum disorder: a controlled case study.

    Science.gov (United States)

    Allen, Keith D; Vatland, Christopher; Bowen, Scott L; Burke, Raymond V

    2015-07-01

    We evaluated a parent-created video self-modeling (VSM) intervention to improve independence in an adolescent diagnosed with Intellectual Developmental Disorder (IDD) and Autism Spectrum Disorder (ASD). In a multiple baseline design across routines, a parent and her 17-year-old daughter created self-modeling videos of three targeted routines needed for independence in the community. The parent used a tablet device with a mobile app called "VideoTote" to produce videos of the daughter performing the targeted routines. The mobile app includes a 30-s tutorial about making modeling videos. The parent and daughter produced and watched a VSM scene prior to performing each of the three routines in an analogue community setting. The adolescent showed marked, immediate, and sustained improvements in performing each routine following the production and implementation of the VSM. Performance was found to generalize to the natural community setting. Results suggest that parents can use available technology to promote community independence for transition age individuals. © The Author(s) 2015.

  20. Video segmentation using keywords

    Science.gov (United States)

    Ton-That, Vinh; Vong, Chi-Tai; Nguyen-Dao, Xuan-Truong; Tran, Minh-Triet

    2018-04-01

    At the DAVIS-2016 Challenge, many state-of-the-art video segmentation methods achieved promising results, but they still depend heavily on annotated frames to distinguish between background and foreground. Creating these frames accurately takes a lot of time and effort. In this paper, we introduce a method to segment objects from video based on keywords given by the user. First, we use a real-time object detection system, YOLOv2, to identify regions in the first frame containing objects whose labels match the given keywords. Then, for each region identified in the previous step, we use the Pyramid Scene Parsing Network to assign each pixel as foreground or background. These frames can be used as input frames for the Object Flow algorithm to perform segmentation on the entire video. We conduct experiments on a subset of the DAVIS-2016 dataset at half its original size, which show that our method can handle many popular classes in the PASCAL VOC 2012 dataset with acceptable accuracy, about 75.03%. We suggest wider testing and combination with other methods to improve this result in the future.

  1. Multimodal Semantics Extraction from User-Generated Videos

    Directory of Open Access Journals (Sweden)

    Francesco Cricri

    2012-01-01

    Full Text Available User-generated video content has grown tremendously fast to the point of outpacing professional content creation. In this work we develop methods that analyze contextual information of multiple user-generated videos in order to obtain semantic information about public happenings (e.g., sport and live music events) being recorded in these videos. One of the key contributions of this work is a joint utilization of different data modalities, including those captured by auxiliary sensors during the video recording performed by each user. In particular, we analyze GPS data, magnetometer data, accelerometer data, and video- and audio-content data. We use these data modalities to infer information about the event being recorded, in terms of layout (e.g., stadium), genre, indoor versus outdoor scene, and the main area of interest of the event. Furthermore, we propose a method that automatically identifies the optimal set of cameras to be used in a multicamera video production. Finally, we detect the camera users who fall within the field of view of other cameras recording at the same public happening. We show that the proposed multimodal analysis methods perform well on various recordings obtained in real sport events and live music performances.

  2. Slow Motion and Zoom in HD Digital Videos Using Fractals

    Directory of Open Access Journals (Sweden)

    Maurizio Murroni

    2009-01-01

    Full Text Available Slow motion replay and spatial zooming are special effects used in digital video rendering. At present, most techniques to perform digital spatial zoom and slow motion are based on interpolation, both for enlarging the size of the original pictures and for generating additional intermediate frames. Interpolation is mainly done either by linear or cubic spline functions or by motion estimation/compensation, both of which can be applied pixel by pixel or by partitioning frames into blocks. The purpose of this paper is to present an alternative technique combining fractal theory and wavelet decomposition to achieve spatial zoom and slow motion replay of HD digital color video sequences. Fast scene change detection, active scene detection, wavelet subband analysis, and color fractal coding based on the Earth Mover's Distance (EMD) measure are used to reduce computational load and to improve visual quality. Experiments show that the proposed scheme achieves better results in terms of overall visual quality compared to the state-of-the-art techniques.
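    For context, the conventional interpolation baseline mentioned in the abstract (generating intermediate frames pixel by pixel with a linear function) can be sketched as below. This illustrates only that baseline, not the fractal/wavelet technique the paper proposes; the function name `interpolate_frames` and the nested-list frame representation are assumptions made for the example.

```python
from typing import List

Frame = List[List[float]]  # grayscale frame as rows of pixel intensities


def interpolate_frames(frame_a: Frame, frame_b: Frame,
                       n_intermediate: int) -> List[Frame]:
    """Linearly blend two frames to create n_intermediate in-between frames."""
    if n_intermediate < 0:
        raise ValueError("n_intermediate must be non-negative")
    frames = []
    for k in range(1, n_intermediate + 1):
        t = k / (n_intermediate + 1)  # blend weight, strictly between 0 and 1
        frames.append([
            [(1 - t) * a + t * b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(frame_a, frame_b)
        ])
    return frames


# Two tiny 1x2 frames; one intermediate frame halves the playback speed.
first = [[0.0, 100.0]]
second = [[100.0, 200.0]]
middle = interpolate_frames(first, second, 1)
quarter_steps = interpolate_frames(first, second, 3)
```

    Plain blending like this tends to produce ghosting on moving objects, which is one reason motion-compensated and other alternative schemes are studied.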

  3. Memory-guided attention during active viewing of edited dynamic scenes.

    Science.gov (United States)

    Valuch, Christian; König, Peter; Ansorge, Ulrich

    2017-01-01

    Films, TV shows, and other edited dynamic scenes contain many cuts, which are abrupt transitions from one video shot to the next. Cuts occur within or between scenes, and often join together visually and semantically related shots. Here, we tested to which degree memory for the visual features of the precut shot facilitates shifting attention to the postcut shot. We manipulated visual similarity across cuts, and measured how this affected covert attention (Experiment 1) and overt attention (Experiments 2 and 3). In Experiments 1 and 2, participants actively viewed a target movie that randomly switched locations with a second, distractor movie at the time of the cuts. In Experiments 1 and 2, participants were able to deploy attention more rapidly and accurately to the target movie's continuation when visual similarity was high than when it was low. Experiment 3 tested whether this could be explained by stimulus-driven (bottom-up) priming by feature similarity, using one clip at screen center that was followed by two alternative continuations to the left and right. Here, even the highest similarity across cuts did not capture attention. We conclude that following cuts of high visual similarity, memory-guided attention facilitates the deployment of attention, but this effect is (top-down) dependent on the viewer's active matching of scene content across cuts.

  4. Visual narratives : free-hand sketch for visual search and navigation of video.

    OpenAIRE

    James, Stuart

    2016-01-01

    Humans have an innate ability to communicate visually; the earliest forms of communication were cave drawings, and children can communicate visual descriptions of scenes through drawings well before they can write. Drawings and sketches offer an intuitive and efficient means for communicating visual concepts. Today, society faces a deluge of digital visual content driven by a surge in the generation of video on social media and the online availability of video archives. Mobile devices are...

  5. Special effects used in creating 3D animated scenes-part 1

    Science.gov (United States)

    Avramescu, A. M.

    2015-11-01

    At present, with the help of computers, we can create special effects that look so real that we almost do not perceive them as being different. These special effects are somewhat hard to differentiate from the real elements on the screen. With the increasingly accessible 3D field, which has more and more areas of application, 3D technology moves easily from architecture to product design. Lifelike 3D animations are used as a means of learning, for multimedia presentations of big global corporations, for special effects and even for virtual actors in movies. Technology, as part of the movie art, is considered a prerequisite, but cinematography is the first art that had to wait for the correct intersection of technological development, innovation and human vision in order to attain full achievement. Increasingly, the majority of industries are using 3D (three-dimensional) sequences. 3D graphics, commercials and special effects from movies are all designed in 3D. The key to attaining realistic visual effects is to successfully combine various distinct elements (characters, objects, images and video scenes) so that all these elements form a whole that works in perfect harmony. This article aims to present a present-day game design. Considering the advanced technology and futuristic vision of designers, nowadays we have different and multifarious game models. Special effects contribute decisively to the creation of a realistic three-dimensional scene. These effects are essential for transmitting the emotional state of the scene. Creating special effects is a work of finesse in order to achieve high quality scenes. Special effects can be used to draw the attention of the onlooker to an object in a scene. According to the conducted study, the best-selling game of the year 2010 was Call of Duty: Modern Warfare 2. Accordingly, the article aims for the presented scene to be similar to many locations from this type of game, more

  6. Utilising E-on Vue and Unity 3D scenes to generate synthetic images and videos for visible signature analysis

    Science.gov (United States)

    Madden, Christopher S.; Richards, Noel J.; Culpepper, Joanne B.

    2016-10-01

    This paper investigates the ability to develop synthetic scenes in an image generation tool, E-on Vue, and a gaming engine, Unity 3D, which can be used to generate synthetic imagery of target objects across a variety of conditions in land environments. Developments within these tools and gaming engines have allowed the computer gaming industry to dramatically enhance the realism of the games they develop; however they utilise short cuts to ensure that the games run smoothly in real-time to create an immersive effect. Whilst these short cuts may have an impact upon the realism of the synthetic imagery, they do promise a much more time efficient method of developing imagery of different environmental conditions and to investigate the dynamic aspect of military operations that is currently not evaluated in signature analysis. The results presented investigate how some of the common image metrics used in target acquisition modelling, namely the Δμ1, Δμ2, Δμ3, RSS, and Doyle metrics, perform on the synthetic scenes generated by E-on Vue and Unity 3D compared to real imagery of similar scenes. An exploration of the time required to develop the various aspects of the scene to enhance its realism is included, along with an overview of the difficulties associated with trying to recreate specific locations as a virtual scene. This work is an important start towards utilising virtual worlds for visible signature evaluation, and evaluating how equivalent synthetic imagery is to real photographs.

  7. Identifying sports videos using replay, text, and camera motion features

    Science.gov (United States)

    Kobla, Vikrant; DeMenthon, Daniel; Doermann, David S.

    1999-12-01

    Automated classification of digital video is emerging as an important piece of the puzzle in the design of content management systems for digital libraries. The ability to classify videos into various classes such as sports, news, movies, or documentaries increases the efficiency of indexing, browsing, and retrieval of video in large databases. In this paper, we discuss the extraction of features that enable identification of sports videos directly from the compressed domain of MPEG video. These features include detecting the presence of action replays, determining the amount of scene text in video, and calculating various statistics on camera and/or object motion. The features are derived from the macroblock, motion, and bit-rate information that is readily accessible from MPEG video with very minimal decoding, leading to substantial gains in processing speeds. Full decoding of selected frames is required only for text analysis. A decision tree classifier built using these features is able to identify sports clips with an accuracy of about 93 percent.

  8. Attaching Hollywood to a Surveillant Assemblage: Normalizing Discourses of Video Surveillance

    Directory of Open Access Journals (Sweden)

    Randy K Lippert

    2015-10-01

    Full Text Available This article examines video surveillance images in Hollywood film. It moves beyond previous accounts of video surveillance in relation to film by theoretically situating the use of these surveillance images in a broader “surveillant assemblage”. To this end, scenes from a sample of thirty-five (35) films of several genres are examined to discern dominant discourses and how they lend themselves to normalization of video surveillance. Four discourses are discovered and elaborated by providing examples from Hollywood films. While the films provide video surveillance with a positive association, it is not without nuance and limitations. Thus, it is found that some forms of resistance to video surveillance are shown while its deterrent effect is not. It is ultimately argued that Hollywood film is becoming attached to a video surveillant assemblage discursively through these normalizing discourses as well as structurally to the extent actual video surveillance technology is used to produce the images.

  9. Adaptation of facial synthesis to parameter analysis in MPEG-4 visual communication

    Science.gov (United States)

    Yu, Lu; Zhang, Jingyu; Liu, Yunhai

    2000-12-01

    In MPEG-4, Facial Definition Parameters (FDPs) and Facial Animation Parameters (FAPs) are defined to animate a facial object. Most previous facial animation reconstruction systems focused on synthesizing animation from manually or automatically generated FAPs, but not from FAPs extracted from natural video scenes. In this paper, an analysis-synthesis MPEG-4 visual communication system is established, in which facial animation is reconstructed from FAPs extracted from natural video scenes.

  10. Exploring inter-frame correlation analysis and wavelet-domain modeling for real-time caption detection in streaming video

    Science.gov (United States)

    Li, Jia; Tian, Yonghong; Gao, Wen

    2008-01-01

    In recent years, the amount of streaming video has grown rapidly on the Web. Retrieving these streaming videos often poses the challenge of indexing and analyzing the media in real time, because the streams must be treated as effectively infinite in length, thus precluding offline processing. Generally speaking, captions are important semantic clues for video indexing and retrieval. However, existing caption detection methods often have difficulty performing real-time detection on streaming video, and few of them address the differentiation of captions from scene texts and scrolling texts. In general, these texts have different roles in streaming video retrieval. To overcome these difficulties, this paper proposes a novel approach which explores inter-frame correlation analysis and wavelet-domain modeling for real-time caption detection in streaming video. In our approach, the inter-frame correlation information is used to distinguish caption texts from scene texts and scrolling texts. Moreover, wavelet-domain Generalized Gaussian Models (GGMs) are utilized to automatically remove non-text regions from each frame and keep only caption regions for further processing. Experimental results show that our approach is able to offer real-time caption detection with high recall and a low false alarm rate, and can also effectively discern caption texts from the other texts even at low resolutions.
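    The intuition behind using inter-frame correlation to separate static captions from scrolling text can be sketched as follows. The abstract does not state which correlation measure or threshold the authors use, so the Pearson correlation, the `is_static_region` helper, and the 0.9 threshold below are assumptions chosen purely for illustration.

```python
import math
from typing import Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation of two equal-length pixel sequences."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("sequences must be non-empty and of equal length")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # flat region: correlation is undefined, treat as no evidence
    return cov / math.sqrt(var_x * var_y)


def is_static_region(region_t: Sequence[float],
                     region_t1: Sequence[float],
                     threshold: float = 0.9) -> bool:
    """Hypothetical test: a caption stays put, so consecutive frames correlate."""
    return pearson(region_t, region_t1) >= threshold


# The same candidate text region sampled in two consecutive frames.
caption_now = [10, 200, 10, 200, 10, 200]
caption_next = [10, 200, 10, 200, 10, 200]    # unchanged: static caption
scrolling_next = [200, 10, 200, 10, 200, 10]  # shifted by one pixel

corr_static = pearson(caption_now, caption_next)
corr_scrolling = pearson(caption_now, scrolling_next)
```

    A real system would compute this over candidate text regions across several frames rather than a single pair.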

  11. Hierarchical vs non-hierarchical audio indexation and classification for video genres

    Science.gov (United States)

    Dammak, Nouha; BenAyed, Yassine

    2018-04-01

    In this paper, Support Vector Machines (SVMs) are used for segmenting and indexing video genres based only on audio features extracted at block level, which has the prominent advantage of capturing local temporal information. The main contribution of our study is to show the wide effect on classification accuracy of using a hierarchical categorization structure based on the Mel Frequency Cepstral Coefficients (MFCC) audio descriptor. The classification covers three common video genres: sports videos, music clips and news scenes. The sub-classification may divide each genre into several multi-speaker and multi-dialect sub-genres. The approach was validated on over 360 minutes of video, yielding a classification accuracy of over 99%.

  12. Scene Integration Without Awareness: No Conclusive Evidence for Processing Scene Congruency During Continuous Flash Suppression.

    Science.gov (United States)

    Moors, Pieter; Boelens, David; van Overwalle, Jaana; Wagemans, Johan

    2016-07-01

    A recent study showed that scenes with an object-background relationship that is semantically incongruent break interocular suppression faster than scenes with a semantically congruent relationship. These results implied that semantic relations between the objects and the background of a scene could be extracted in the absence of visual awareness of the stimulus. In the current study, we assessed the replicability of this finding and tried to rule out an alternative explanation dependent on low-level differences between the stimuli. Furthermore, we used a Bayesian analysis to quantify the evidence in favor of the presence or absence of a scene-congruency effect. Across three experiments, we found no convincing evidence for a scene-congruency effect or a modulation of scene congruency by scene inversion. These findings question the generalizability of previous observations and cast doubt on whether genuine semantic processing of object-background relationships in scenes can manifest during interocular suppression. © The Author(s) 2016.

  13. Markerless client-server augmented reality system with natural features

    Science.gov (United States)

    Ning, Shuangning; Sang, Xinzhu; Chen, Duo

    2017-10-01

    A markerless client-server augmented reality system is presented. In this research, the more extensive and mature virtual reality head-mounted display is adopted to assist the implementation of augmented reality. The viewer is provided an image in front of their eyes with the head-mounted display. The front-facing camera is used to capture video signals into the workstation. The generated virtual scene is merged with the outside world information received from the camera. The integrated video is sent to the helmet display system. The distinguishing feature and novelty is that augmented reality is realized with natural features instead of markers, which addresses the limitations of markers, such as being restricted to black and white, being inapplicable under different environmental conditions, and, in particular, failing when the marker is partially blocked. Further, 3D stereoscopic perception of the virtual animation model is achieved. The high-speed and stable socket native communication method is adopted for transmission of the key video stream data, which can reduce the calculation burden of the system.

  14. An Indoor Scene Recognition-Based 3D Registration Mechanism for Real-Time AR-GIS Visualization in Mobile Applications

    Directory of Open Access Journals (Sweden)

    Wei Ma

    2018-03-01

    Full Text Available Mobile Augmented Reality (MAR) systems are becoming ideal platforms for visualization, permitting users to better comprehend and interact with spatial information. This technological development, in turn, has prompted efforts to enhance mechanisms for registering virtual objects in real world contexts. Most existing AR 3D Registration techniques lack the scene recognition capabilities needed to describe accurately the positioning of virtual objects in scenes representing reality. Moreover, the application of such registration methods in indoor AR-GIS systems is further impeded by the limited capacity of these systems to detect the geometry and semantic information in indoor environments. In this paper, we propose a novel method for fusing virtual objects and indoor scenes, based on indoor scene recognition technology. To accomplish scene fusion in AR-GIS, we first detect key points in reference images. Then, we perform interior layout extraction using a Fully Connected Networks (FCN) algorithm to acquire layout coordinate points for the tracking targets. We detect and recognize the target scene in a video frame image to track targets and estimate the camera pose. In this method, virtual 3D objects are fused precisely to a real scene, according to the camera pose and the previously extracted layout coordinate points. Our results demonstrate that this approach enables accurate fusion of virtual objects with representations of real world indoor environments. Based on this fusion technique, users can better grasp virtual three-dimensional representations on an AR-GIS platform.

  15. Scene construction in schizophrenia.

    Science.gov (United States)

    Raffard, Stéphane; D'Argembeau, Arnaud; Bayard, Sophie; Boulenger, Jean-Philippe; Van der Linden, Martial

    2010-09-01

    Recent research has revealed that schizophrenia patients are impaired in remembering the past and imagining the future. In this study, we examined patients' ability to engage in scene construction (i.e., the process of mentally generating and maintaining a complex and coherent scene), which is a key part of retrieving past experiences and episodic future thinking. 24 participants with schizophrenia and 25 healthy controls were asked to imagine new fictitious experiences and describe their mental representations of the scenes in as much detail as possible. Descriptions were scored according to various dimensions (e.g., sensory details, spatial reference), and participants also provided ratings of their subjective experience when imagining the scenes (e.g., their sense of presence, the perceived similarity of imagined events to past experiences). Imagined scenes contained fewer phenomenological details (d = 1.11) and were more fragmented (d = 2.81) in schizophrenia patients compared to controls. Furthermore, positive symptoms were positively correlated with the sense of presence (r = .43) and the perceived similarity of imagined events to past episodes (r = .47), whereas negative symptoms were negatively related to the overall richness of the imagined scenes (r = -.43). The results suggest that schizophrenic patients' impairments in remembering the past and imagining the future are, at least in part, due to deficits in the process of scene construction. The relationships between the characteristics of imagined scenes and positive and negative symptoms could be related to reality monitoring deficits and difficulties in strategic retrieval processes, respectively. Copyright 2010 APA, all rights reserved.
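    For readers unfamiliar with the effect sizes quoted above: the abstract does not say how d was computed, but assuming it denotes Cohen's d with a pooled standard deviation (the common convention for two independent groups), it would be defined as below, where the group means, standard deviations and sizes are generic symbols and, per the abstract, the group sizes would be 24 patients and 25 controls.

```latex
d = \frac{\bar{x}_1 - \bar{x}_2}{s_p},
\qquad
s_p = \sqrt{\frac{(n_1 - 1)\,s_1^2 + (n_2 - 1)\,s_2^2}{n_1 + n_2 - 2}},
\qquad
n_1 = 24,\; n_2 = 25
```

    Under this reading, d = 1.11 would mean the group means differ by a little more than one pooled standard deviation.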

  16. Neural activation and memory for natural scenes: Explicit and spontaneous retrieval.

    Science.gov (United States)

    Weymar, Mathias; Bradley, Margaret M; Sege, Christopher T; Lang, Peter J

    2018-05-06

    Stimulus repetition elicits either enhancement or suppression in neural activity, and a recent fMRI meta-analysis of repetition effects for visual stimuli (Kim, 2017) reported cross-stimulus repetition enhancement in medial and lateral parietal cortex, as well as regions of prefrontal, temporal, and posterior cingulate cortex. Repetition enhancement was assessed here for repeated and novel scenes presented in the context of either an explicit episodic recognition task or an implicit judgment task, in order to study the role of spontaneous retrieval of episodic memories. Regardless of whether episodic memory was explicitly probed or not, repetition enhancement was found in medial posterior parietal (precuneus/cuneus), lateral parietal cortex (angular gyrus), as well as in medial prefrontal cortex (frontopolar), which did not differ by task. Enhancement effects in the posterior cingulate cortex were significantly larger during explicit compared to implicit task, primarily due to a lack of functional activity for new scenes. Taken together, the data are consistent with an interpretation that medial and (ventral) lateral parietal cortex are associated with spontaneous episodic retrieval, whereas posterior cingulate cortical regions may reflect task or decision processes. © 2018 Society for Psychophysiological Research.

  17. Walk This Way: Improving Pedestrian Agent-Based Models through Scene Activity Analysis

    Directory of Open Access Journals (Sweden)

    Andrew Crooks

    2015-09-01

    Full Text Available Pedestrian movement is woven into the fabric of urban regions. With more people living in cities than ever before, there is an increased need to understand and model how pedestrians utilize and move through space for a variety of applications, ranging from urban planning and architecture to security. Pedestrian modeling has traditionally faced the challenge of collecting data to calibrate and validate such models of pedestrian movement. With the increased availability of mobility datasets from video surveillance and enhanced geolocation capabilities in consumer mobile devices, we are now presented with the opportunity to change the way we build pedestrian models. Within this paper we explore the potential that such information offers for the improvement of agent-based pedestrian models. We introduce a Scene- and Activity-Aware Agent-Based Model (SA2-ABM), a method for harvesting scene activity information in the form of spatiotemporal trajectories, and incorporate this information into our models. In order to assess and evaluate the improvement offered by such information, we carry out a range of experiments using real-world datasets. We demonstrate that the use of real scene information allows us to better inform our model and enhance its predictive capabilities.

  18. Action adaptation during natural unfolding social scenes influences action recognition and inferences made about actor beliefs.

    Science.gov (United States)

    Keefe, Bruce D; Wincenciak, Joanna; Jellema, Tjeerd; Ward, James W; Barraclough, Nick E

    2016-07-01

    When observing another individual's actions, we can both recognize their actions and infer their beliefs concerning the physical and social environment. The extent to which visual adaptation influences action recognition and conceptually later stages of processing involved in deriving the belief state of the actor remains unknown. To explore this we used virtual reality (life-size photorealistic actors presented in stereoscopic three dimensions) to see how visual adaptation influences the perception of individuals in naturally unfolding social scenes at increasingly higher levels of action understanding. We presented scenes in which one actor picked up boxes (of varying number and weight), after which a second actor picked up a single box. Adaptation to the first actor's behavior systematically changed perception of the second actor. Aftereffects increased with the duration of the first actor's behavior, declined exponentially over time, and were independent of view direction. Inferences about the second actor's expectation of box weight were also distorted by adaptation to the first actor. Distortions in action recognition and actor expectations did not, however, extend across different actions, indicating that adaptation is not acting at an action-independent abstract level but rather at an action-dependent level. We conclude that although adaptation influences more complex inferences about belief states of individuals, this is likely to be a result of adaptation at an earlier action recognition stage rather than adaptation operating at a higher, more abstract level in mentalizing or simulation systems.

  19. The singular nature of auditory and visual scene analysis in autism

    OpenAIRE

    Lin, I.-Fan; Shirama, Aya; Kato, Nobumasa; Kashino, Makio

    2017-01-01

    Individuals with autism spectrum disorder often have difficulty acquiring relevant auditory and visual information in daily environments, despite not being diagnosed as hearing impaired or having low vision. Recent psychophysical and neurophysiological studies have shown that autistic individuals have highly specific individual differences at various levels of information processing, including feature extraction, automatic grouping and top-down modulation in auditory and visual scene analysis...

  20. A video for teaching english tenses

    Directory of Open Access Journals (Sweden)

    Frida Unsiah

    2017-04-01

    Students of the English Language Education Program in the Faculty of Cultural Studies, Universitas Brawijaya, should ideally master Grammar before taking the degree of Sarjana Pendidikan. However, in practice they are still weak in Grammar, especially tenses. Therefore, the researchers set out to develop a video as a medium for teaching tenses. The objective is that, by using the video, students gain a better understanding of tenses so that they can communicate in English accurately and contextually. To develop the video, the researchers used the ADDIE model (Analysis, Design, Development, Implementation, Evaluation). First, the researchers analyzed the students’ learning needs to determine the product to be developed, in this case a movie about English tenses. Then, the researchers developed a video as the product. The product was then validated by a media expert, who assessed attractiveness, typography, audio, image, and usefulness, and by a content expert, who assessed the language aspects and tenses of English used by the actors in the video, covering the grammar content, pronunciation, and fluency performed by the actors. The result of the validation shows that the video developed was considered good. Theoretically, it is appropriate to be used in English Grammar classes. However, the media expert suggests that it still needs some improvement in the next development, especially regarding the synchronization between lip movement and sound in the scenes, while the content expert suggests that the Grammar content of the video should focus on one tense only to provide a more detailed concept of the tense.

  1. Perceptual learning during action video game playing.

    Science.gov (United States)

    Green, C Shawn; Li, Renjie; Bavelier, Daphne

    2010-04-01

    Action video games have been shown to enhance behavioral performance on a wide variety of perceptual tasks, from those that require effective allocation of attentional resources across the visual scene, to those that demand the successful identification of fleetingly presented stimuli. Importantly, these effects have not only been shown in expert action video game players, but a causative link has been established between action video game play and enhanced processing through training studies. Although an account based solely on attention fails to capture the variety of enhancements observed after action game playing, a number of models of perceptual learning are consistent with the observed results, with behavioral modeling favoring the hypothesis that avid video game players are better able to form templates for, or extract the relevant statistics of, the task at hand. This may suggest that the neural site of learning is in areas where information is integrated and actions are selected; yet changes in low-level sensory areas cannot be ruled out. Copyright © 2009 Cognitive Science Society, Inc.

  2. Detection of Upscale-Crop and Partial Manipulation in Surveillance Video Based on Sensor Pattern Noise

    Science.gov (United States)

    Hyun, Dai-Kyung; Ryu, Seung-Jin; Lee, Hae-Yeoun; Lee, Heung-Kyu

    2013-01-01

    In many court cases, surveillance videos are used as significant court evidence. As these surveillance videos can easily be forged, it may cause serious social issues, such as convicting an innocent person. Nevertheless, there is little research being done on forgery of surveillance videos. This paper proposes a forensic technique to detect forgeries of surveillance video based on sensor pattern noise (SPN). We exploit the scaling invariance of the minimum average correlation energy Mellin radial harmonic (MACE-MRH) correlation filter to reliably unveil traces of upscaling in videos. By excluding the high-frequency components of the investigated video and adaptively choosing the size of the local search window, the proposed method effectively localizes partially manipulated regions. Empirical evidence from a large database of test videos, including RGB (Red, Green, Blue)/infrared video, dynamic-/static-scene video and compressed video, indicates the superior performance of the proposed method. PMID:24051524

  3. Degraded visual environment image/video quality metrics

    Science.gov (United States)

    Baumgartner, Dustin D.; Brown, Jeremy B.; Jacobs, Eddie L.; Schachter, Bruce J.

    2014-06-01

    A number of image quality metrics (IQMs) and video quality metrics (VQMs) have been proposed in the literature for evaluating techniques and systems for mitigating degraded visual environments. Some require both pristine and corrupted imagery. Others require patterned target boards in the scene. None of these metrics relates well to the task of landing a helicopter in conditions such as a brownout dust cloud. We have developed and used a variety of IQMs and VQMs related to the pilot's ability to detect hazards in the scene and to maintain situational awareness. Some of these metrics can be made agnostic to sensor type. Not only are the metrics suitable for evaluating algorithm and sensor variation, they are also suitable for choosing the most cost effective solution to improve operating conditions in degraded visual environments.

  4. Alcohol and substance use portrayals in Nigerian video tapes: an analysis of 479 films and implications for public drug education.

    Science.gov (United States)

    Aina, Olatunji F; Olorunshola, Derin A

    There is an observed increasing trend of substance use among the adolescents and young adults. One of the important aetiologies is "modeling" especially from popular artists portraying their use to the viewing public over the electronic media. Indigenous films on video tapes acted in English or "Yoruba" (a popular Nigerian language) were randomly selected from various retail outlets in Lagos for viewing. The settings were the Ikorodu and Ipaja suburbs of Lagos. The viewing audience in each center was made up of a researcher and two adolescent secondary school students. They were to make notes on each film with scenes of substance use, type, and nature of use. A total of 479 video tapes were studied over a 6 month period, of which 268 (55.9%) contained scenes portraying the use of one or more substances. Two hundred forty-seven (51.6%, N = 479) depicted the use of only one type of substance and the rest, 21 (4.3%, N = 479), portrayed the use of multiple substances. The commonest substance portrayed to be used was alcohol, 197 (41.1%, N = 479), followed by tobacco, 81 (16.9%, N = 479). Cannabis was shown to be used in only 3 (0.6%, N = 479); Cocaine and Heroin in 8 (1.6%, N = 479) of the films. There was no statistically significant difference on substance use portrayal between the home movies acted in English and Yoruba (χ² = 32.8; df = 7 at p ≥ 0.05). A significant number of films on video tapes in Nigeria portrayed substance use which could act as triggers or reinforcement for substance use among the viewing audience, especially adolescents and young adults. The need to censor video tapes on substance use portrayal was advocated.
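    The proportions quoted above can be re-derived from the reported counts with a few lines of arithmetic. The sketch below uses only the counts given in the abstract; the dictionary keys and the `percentage` helper are names introduced for this example. Rounded to one decimal place, 21/479 and 8/479 come out as 4.4% and 1.7%, whereas the abstract prints 4.3% and 1.6%, which would be consistent with the published figures having been truncated rather than rounded.

```python
TOTAL_FILMS = 479  # video tapes studied, as reported in the abstract

# Counts of films portraying each category, as reported in the abstract.
counts = {
    "any substance": 268,
    "single substance": 247,
    "multiple substances": 21,
    "alcohol": 197,
    "tobacco": 81,
    "cannabis": 3,
    "cocaine and heroin": 8,
}


def percentage(count: int, total: int = TOTAL_FILMS) -> float:
    """Share of the total, in percent, rounded to one decimal place."""
    return round(100 * count / total, 1)


percentages = {name: percentage(count) for name, count in counts.items()}

# Internal consistency: single- plus multiple-substance films equal the total
# number of films portraying any substance use.
counts_add_up = (counts["single substance"] + counts["multiple substances"]
                 == counts["any substance"])

for name, value in percentages.items():
    print(f"{name}: {value}%")
```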

  5. Scene-Based Contextual Cueing in Pigeons

    Science.gov (United States)

    Wasserman, Edward A.; Teng, Yuejia; Brooks, Daniel I.

    2014-01-01

    Repeated pairings of a particular visual context with a specific location of a target stimulus facilitate target search in humans. We explored an animal model of such contextual cueing. Pigeons had to peck a target which could appear in one of four locations on color photographs of real-world scenes. On half of the trials, each of four scenes was consistently paired with one of four possible target locations; on the other half of the trials, each of four different scenes was randomly paired with the same four possible target locations. In Experiments 1 and 2, pigeons exhibited robust contextual cueing when the context preceded the target by 1 s to 8 s, with reaction times to the target being shorter on predictive-scene trials than on random-scene trials. Pigeons also responded more frequently during the delay on predictive-scene trials than on random-scene trials; indeed, during the delay on predictive-scene trials, pigeons predominately pecked toward the location of the upcoming target, suggesting that attentional guidance contributes to contextual cueing. In Experiment 3, involving left-right and top-bottom scene reversals, pigeons exhibited stronger control by global than by local scene cues. These results attest to the robustness and associative basis of contextual cueing in pigeons. PMID:25546098

  6. Panoramic Search: The Interaction of Memory and Vision in Search through a Familiar Scene

    Science.gov (United States)

    Oliva, Aude; Wolfe, Jeremy M.; Arsenio, Helga C.

    2004-01-01

    How do observers search through familiar scenes? A novel panoramic search method is used to study the interaction of memory and vision in natural search behavior. In panoramic search, observers see part of an unchanging scene larger than their current field of view. A target object can be visible, present in the display but hidden from view, or…

  7. Camera Control and Geo-Registration for Video Sensor Networks

    Science.gov (United States)

    Davis, James W.

    With the use of large video networks, there is a need to coordinate and interpret the video imagery for decision support systems with the goal of reducing the cognitive and perceptual overload of human operators. We present computer vision strategies that enable efficient control and management of cameras to effectively monitor wide-coverage areas, and examine the framework within an actual multi-camera outdoor urban video surveillance network. First, we construct a robust and precise camera control model for commercial pan-tilt-zoom (PTZ) video cameras. In addition to providing a complete functional control mapping for PTZ repositioning, the model can be used to generate wide-view spherical panoramic viewspaces for the cameras. Using the individual camera control models, we next individually map the spherical panoramic viewspace of each camera to a large aerial orthophotograph of the scene. The result provides a unified geo-referenced map representation to permit automatic (and manual) video control and exploitation of cameras in a coordinated manner. The combined framework provides new capabilities for video sensor networks that are of significance and benefit to the broad surveillance/security community.

  8. An Efficient Fractal Video Sequences Codec with Multiviews

    Directory of Open Access Journals (Sweden)

    Shiping Zhu

    2013-01-01

    Full Text Available Multiview video consists of multiple views of the same scene. It requires an enormous amount of data to achieve high image quality, which makes compression of multiview video indispensable. Therefore, data compression is a major issue for multiviews. In this paper, we explore an efficient fractal video codec to compress multiviews. The proposed scheme first compresses a view-dependent geometry of the base view using a fractal video encoder with a homogeneous region condition. With the extended fractional pel motion estimation algorithm and fast disparity estimation algorithm, it then generates prediction images of other views. The prediction image uses image-based rendering techniques based on the decoded video. The residual signals are obtained from the prediction image and the original image. Finally, it encodes the residual signals with the fractal video encoder. The idea is also to exploit the statistical dependencies from both temporal and interview reference pictures for motion compensated prediction. Experimental results show that the proposed algorithm is consistently better than JMVC8.5, with a 62.25% bit rate decrease and a 0.37 dB PSNR increase based on the Bjontegaard metric, and the total encoding time (TET) of the proposed algorithm is reduced by 92%.

  9. Slow motion in films and video clips: Music influences perceived duration and emotion, autonomic physiological activation and pupillary responses.

    Science.gov (United States)

    Wöllner, Clemens; Hammerschmidt, David; Albrecht, Henning

    2018-01-01

    Slow motion scenes are ubiquitous in screen-based audiovisual media and are typically accompanied by emotional music. The strong effects of slow motion on observers are hypothetically related to heightened emotional states in which time seems to pass more slowly. These states are simulated in films and video clips, and seem to resemble such experiences in daily life. The current study investigated time perception and emotional response to media clips containing decelerated human motion, with or without music using psychometric and psychophysiological testing methods. Participants were presented with slow-motion scenes taken from commercial films, ballet and sports footage, as well as the same scenes converted to real-time. Results reveal that slow-motion scenes, compared to adapted real-time scenes, led to systematic underestimations of duration, lower perceived arousal but higher valence, lower respiration rates and smaller pupillary diameters. The presence of music compared to visual-only presentations strongly affected results in terms of higher accuracy in duration estimates, higher perceived arousal and valence, higher physiological activation and larger pupillary diameters, indicating higher arousal. Video genre affected responses in addition. These findings suggest that perceiving slow motion is not related to states of high arousal, but rather affects cognitive dimensions of perceived time and valence. Music influences these experiences profoundly, thus strengthening the impact of stretched time in audiovisual media.

  10. Multimodal computational attention for scene understanding and robotics

    CERN Document Server

    Schauerte, Boris

    2016-01-01

    This book presents state-of-the-art computational attention models that have been successfully tested in diverse application areas and can build the foundation for artificial systems to efficiently explore, analyze, and understand natural scenes. It gives a comprehensive overview of the most recent computational attention models for processing visual and acoustic input. It covers the biological background of visual and auditory attention, as well as bottom-up and top-down attentional mechanisms and discusses various applications. In the first part, new approaches for bottom-up visual and acoustic saliency models are presented and applied to the task of audio-visual scene exploration of a robot. In the second part, the influence of top-down cues for attention modeling is investigated.

  11. The Processing Speed of Scene Categorization at Multiple Levels of Description: The Superordinate Advantage Revisited.

    Science.gov (United States)

    Banno, Hayaki; Saiki, Jun

    2015-03-01

    Recent studies have sought to determine which levels of categories are processed first in visual scene categorization and have shown that the natural and man-made superordinate-level categories are understood faster than are basic-level categories. The current study examined the robustness of the superordinate-level advantage in a visual scene categorization task. A go/no-go categorization task was evaluated with response time distribution analysis using an ex-Gaussian template. A visual scene was categorized as either superordinate or basic level, and two basic-level categories forming a superordinate category were judged as either similar or dissimilar to each other. First, outdoor/indoor groups and natural/man-made were used as superordinate categories to investigate whether the advantage could be generalized beyond the natural/man-made boundary. Second, a set of images forming a superordinate category was manipulated. We predicted that decreasing image set similarity within the superordinate-level category would work against the speed advantage. We found that basic-level categorization was faster than outdoor/indoor categorization when the outdoor category comprised dissimilar basic-level categories. Our results indicate that the superordinate-level advantage in visual scene categorization is labile across different categories and category structures. © 2015 SAGE Publications.
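
    The ex-Gaussian template mentioned in this abstract models a response time as the sum of a Gaussian component (mean mu, standard deviation sigma) and an independent exponential component (mean tau). The sketch below is only an illustration of that distribution: the function names, the parameter values, and the method-of-moments fit are assumptions chosen for brevity, not the fitting procedure used in the study.

```python
import math
import random


def sample_ex_gaussian(mu, sigma, tau, n, seed=0):
    """Draw n ex-Gaussian samples: Normal(mu, sigma) + Exponential(mean tau)."""
    rng = random.Random(seed)
    return [rng.gauss(mu, sigma) + rng.expovariate(1.0 / tau) for _ in range(n)]


def fit_by_moments(samples):
    """Estimate (mu, sigma, tau) from the first three sample moments.

    Uses mean = mu + tau, variance = sigma^2 + tau^2 and
    third central moment = 2 * tau^3.
    """
    n = len(samples)
    mean = sum(samples) / n
    var = sum((s - mean) ** 2 for s in samples) / n
    third = sum((s - mean) ** 3 for s in samples) / n
    if third <= 0:
        raise ValueError("non-positive skew: moment fit is undefined")
    tau = (third / 2.0) ** (1.0 / 3.0)
    sigma = math.sqrt(max(var - tau ** 2, 0.0))
    return mean - tau, sigma, tau


if __name__ == "__main__":
    # Hypothetical response times in milliseconds.
    times = sample_ex_gaussian(mu=400.0, sigma=40.0, tau=100.0, n=200000)
    print(fit_by_moments(times))
```

    Moment estimates are noisy for small samples; maximum-likelihood fitting is the more common choice in practice.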

  12. Broadcast court-net sports video analysis using fast 3-D camera modeling

    NARCIS (Netherlands)

    Han, Jungong; Farin, D.S.; With, de P.H.N.

    2008-01-01

    This paper addresses the automatic analysis of court-net sports video content. We extract information about the players, the playing-field in a bottom-up way until we reach scene-level semantic concepts. Each part of our framework is general, so that the system is applicable to several kinds of

  13. Only "efficient" emotional stimuli affect the content of working memory during free-recollection from natural scenes.

    Science.gov (United States)

    Buttafuoco, Arianna; Pedale, Tiziana; Buchanan, Tony W; Santangelo, Valerio

    2018-02-01

    Emotional events are thought to have privileged access to attention and memory, consuming resources needed to encode competing emotionally neutral stimuli. However, it is not clear whether this detrimental effect is automatic or depends on the successful maintenance of the specific emotional object within working memory. Here, participants viewed everyday scenes including an emotional object among other neutral objects followed by a free-recollection task. Results showed that emotional objects-irrespective of their perceptual saliency-were recollected more often than neutral objects. The probability of being recollected increased as a function of the arousal of the emotional objects, specifically for negative objects. Successful recollection of emotional objects (positive or negative) from a scene reduced the overall number of recollected neutral objects from the same scene. This indicates that only emotional stimuli that are efficient in grabbing (and then consuming) available attentional resources play a crucial role during the encoding of competing information, with a subsequent bias in the recollection of neutral representations.

  14. Eye Movement Control in Scene Viewing and Reading: Evidence from the Stimulus Onset Delay Paradigm

    Science.gov (United States)

    Luke, Steven G.; Nuthmann, Antje; Henderson, John M.

    2013-01-01

    The present study used the stimulus onset delay paradigm to investigate eye movement control in reading and in scene viewing in a within-participants design. Short onset delays (0, 25, 50, 200, and 350 ms) were chosen to simulate the type of natural processing difficulty encountered in reading and scene viewing. Fixation duration increased…

  15. Short-term change detection for UAV video

    Science.gov (United States)

    Saur, Günter; Krüger, Wolfgang

    2012-11-01

    In recent years, there has been an increased use of unmanned aerial vehicles (UAV) for video reconnaissance and surveillance. An important application in this context is change detection in UAV video data. Here we address short-term change detection, in which the time between observations ranges from several minutes to a few hours. We distinguish this task from video motion detection (shorter time scale) and from long-term change detection, based on time series of still images taken between several days, weeks, or even years. Examples of relevant changes are recently parked or moved vehicles. As a prerequisite, a precise image-to-image registration is needed. Images are selected on the basis of the geo-coordinates of the sensor's footprint and with respect to a certain minimal overlap. The automatic image-based fine-registration adjusts the image pair to a common geometry by using a robust matching approach to handle outliers. The change detection algorithm has to distinguish between relevant and non-relevant changes. Examples of non-relevant changes are stereo disparity at 3D structures of the scene, changed length of shadows, and compression or transmission artifacts. To detect changes in image pairs we analyzed image differencing, local image correlation, and a transformation-based approach (multivariate alteration detection). As input we used color and gradient magnitude images. To cope with local misalignment of image structures we extended the approaches by a local neighborhood search. The algorithms are applied to several examples covering both urban and rural scenes. The local neighborhood search in combination with intensity and gradient magnitude differencing clearly improved the results. Extended image differencing performed better than both the correlation-based approach and the multivariate alteration detection. The algorithms are adapted to be used in semi-automatic workflows for the ABUL video exploitation system of Fraunhofer
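
    The idea of extending image differencing by a local neighborhood search can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it works on intensity only (the abstract also uses gradient magnitude), images are plain lists of rows, the two images are assumed to be already registered up to a small residual offset, and the symmetric forward/backward combination is a design choice made here so that small new objects are not suppressed. All function names are hypothetical.

```python
def min_abs_diff(src, ref, radius):
    """For each pixel of src, the smallest absolute difference to any ref
    pixel within a (2*radius+1) x (2*radius+1) window around the same spot."""
    height, width = len(src), len(src[0])
    out = []
    for y in range(height):
        row = []
        for x in range(width):
            candidates = [
                abs(src[y][x] - ref[yy][xx])
                for yy in range(max(0, y - radius), min(height, y + radius + 1))
                for xx in range(max(0, x - radius), min(width, x + radius + 1))
            ]
            row.append(min(candidates))
        out.append(row)
    return out


def extended_difference(img_a, img_b, radius=1):
    """Symmetric extended differencing: a pixel counts as changed only if it
    has no close match in the other image, checked in both directions."""
    forward = min_abs_diff(img_a, img_b, radius)
    backward = min_abs_diff(img_b, img_a, radius)
    return [[max(f, b) for f, b in zip(rf, rb)] for rf, rb in zip(forward, backward)]


def changed_pixels(diff, threshold):
    """Coordinates (row, column) whose difference exceeds the threshold."""
    return [(y, x) for y, row in enumerate(diff) for x, v in enumerate(row) if v > threshold]
```

    With radius 0 this reduces to plain image differencing, which flags every edge that is misaligned by a pixel; a radius of 1 absorbs such one-pixel misalignment while keeping isolated new structures.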

  16. The modular integrated video system (MIVS)

    International Nuclear Information System (INIS)

    Schneider, S.L.; Sonnier, C.S.

    1987-01-01

    The Modular Integrated Video System (MIVS) is being developed for the International Atomic Energy Agency (IAEA) for use in facilities where mains power is available and the separation of the Camera and Recording Control Unit is desirable. The system is being developed under the US Program for Technical Assistance to the IAEA Safeguards (POTAS). The MIVS is designed to be a user-friendly system, allowing operation with minimal effort and training. The system software, through the use of a Liquid Crystal Display (LCD) and four soft keys, leads the inspector through the setup procedures to accomplish the intended surveillance or maintenance task. Review of surveillance data is accomplished with the use of a Portable Review Station. This Review Station will aid the inspector in the review process and determine the number of missed video scenes during a surveillance period

  17. Video System for Viewing From a Remote or Windowless Cockpit

    Science.gov (United States)

    Banerjee, Amamath

    2009-01-01

    A system of electronic hardware and software synthesizes, in nearly real time, an image of a portion of a scene surveyed by as many as eight video cameras aimed, in different directions, at portions of the scene. This is a prototype of systems that would enable a pilot to view the scene outside a remote or windowless cockpit. The outputs of the cameras are digitized. Direct memory addressing is used to store the data of a few captured images in sequence, and the sequence is repeated in cycles. Cylindrical warping is used in merging adjacent images at their borders to construct a mosaic image of the scene. The mosaic-image data are written to a memory block from which they can be rendered on a head-mounted display (HMD) device. A subsystem in the HMD device tracks the direction of gaze of the wearer, providing data that are used to select, for display, the portion of the mosaic image corresponding to the direction of gaze. The basic functionality of the system has been demonstrated by mounting the cameras on the roof of a van and steering the van by use of the images presented on the HMD device.
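
    The abstract does not specify how the cylindrical warping is computed. A common textbook formulation, shown here as a hedged sketch, projects each image-plane pixel onto a cylinder whose radius equals the focal length, so that adjacent camera views can be merged by horizontal translation. The focal length in pixels, the principal point (cx, cy), and the function names are all illustrative assumptions.

```python
import math


def to_cylindrical(x, y, focal_px, cx, cy):
    """Map an image-plane pixel (x, y) to cylindrical image coordinates."""
    theta = math.atan2(x - cx, focal_px)            # angle around the cylinder axis
    height = (y - cy) / math.hypot(x - cx, focal_px)  # normalized height on the cylinder
    return focal_px * theta + cx, focal_px * height + cy


def from_cylindrical(xc, yc, focal_px, cx, cy):
    """Inverse mapping, useful for backward warping with interpolation."""
    theta = (xc - cx) / focal_px
    height = (yc - cy) / focal_px
    x = focal_px * math.tan(theta) + cx
    y = height * focal_px / math.cos(theta) + cy
    return x, y
```

    When warping a whole image, the inverse mapping is usually applied per output pixel so that every destination pixel receives a value.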

  18. Advances in top-down and bottom-up approaches to video-based camera tracking

    OpenAIRE

    Marimón Sanjuán, David

    2007-01-01

    Video-based camera tracking consists in trailing the three dimensional pose followed by a mobile camera using video as sole input. In order to estimate the pose of a camera with respect to a real scene, one or more three dimensional references are needed. Examples of such references are landmarks with known geometric shape, or objects for which a model is generated beforehand. By comparing what is seen by a camera with what is geometrically known from reality, it is possible to recover the po...

  19. Advances in top-down and bottom-up approaches to video-based camera tracking

    OpenAIRE

    Marimón Sanjuán, David; Ebrahimi, Touradj

    2008-01-01

    Video-based camera tracking consists in trailing the three dimensional pose followed by a mobile camera using video as sole input. In order to estimate the pose of a camera with respect to a real scene, one or more three dimensional references are needed. Examples of such references are landmarks with known geometric shape, or objects for which a model is generated beforehand. By comparing what is seen by a camera with what is geometrically known from reality, it is possible to recover the po...

  20. Logarithmic r-θ mapping for hybrid optical neural network filter for multiple objects recognition within cluttered scenes

    Science.gov (United States)

    Kypraios, Ioannis; Young, Rupert C. D.; Chatwin, Chris R.; Birch, Phil M.

    2009-04-01

    The window unit in the design of the complex logarithmic r-θ mapping for hybrid optical neural network filter can allow multiple objects of the same class to be detected within the input image. Additionally, the architecture of the neural network unit of the complex logarithmic r-θ mapping for hybrid optical neural network filter becomes attractive for accommodating the recognition of multiple objects of different classes within the input image by modifying the output layer of the unit. We test the overall filter for recognition of multiple objects of the same and of different classes within cluttered input images and video sequences of cluttered scenes. The logarithmic r-θ mapping for hybrid optical neural network filter is shown to exhibit, with a single pass over the input data, simultaneous in-plane rotation, out-of-plane rotation, scale, log r-θ map translation and shift invariance, and good clutter tolerance, correctly recognizing the different objects within the cluttered scenes. We record in our results additional extracted information from the cluttered scenes about the objects' relative position, scale and in-plane rotation.
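
    The reason a logarithmic r-θ mapping helps with scale and in-plane rotation can be shown with a few lines of coordinate arithmetic: scaling a point about the mapping center adds a constant to log r, and rotating it adds a constant to θ, so both become translations in the mapped domain. The sketch below illustrates only this coordinate property; it is not the filter described in the abstract, and the function name and default center are assumptions.

```python
import math


def log_polar(x, y, cx=0.0, cy=0.0):
    """Return (log r, theta) of the point (x, y) relative to the center (cx, cy)."""
    dx, dy = x - cx, y - cy
    r = math.hypot(dx, dy)
    if r == 0.0:
        raise ValueError("log r is undefined at the mapping center")
    return math.log(r), math.atan2(dy, dx)
```

    For example, doubling the coordinates of a point shifts its log r value by log 2 and leaves θ unchanged, while a 90 degree rotation shifts θ by π/2 (modulo 2π) and leaves log r unchanged.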

  1. Automatically assessing properties of dynamic cameras for camera selection and rapid deployment of video-content-analysis tasks in large-scale ad-hoc networks

    NARCIS (Netherlands)

    Hollander R.J.M. den; Bouma, H.; Rest, J.H.C. van; Hove, J.M. ten; Haar, F.B. ter; Burghouts, G.J.

    2017-01-01

    Video analytics is essential for managing large quantities of raw data that are produced by video surveillance systems (VSS) for the prevention, repression and investigation of crime and terrorism. Analytics is highly sensitive to changes in the scene, and for changes in the optical chain so a VSS

  2. Developmental Changes in Attention to Faces and Bodies in Static and Dynamic Scenes

    Directory of Open Access Journals (Sweden)

    Brenda M Stoesz

    2014-03-01

    Full Text Available Typically developing individuals show a strong visual preference for faces and face-like stimuli; however, this may come at the expense of attending to bodies or to other aspects of a scene. The primary goal of the present study was to provide additional insight into the development of attentional mechanisms that underlie perception of real people in naturalistic scenes. We examined the looking behaviours of typical children, adolescents, and young adults as they viewed static and dynamic scenes depicting one or more people. Overall, participants showed a bias to attend to faces more than to other parts of the scenes. Adding motion cues led to a reduction in the number, but an increase in the average duration, of face fixations in single-character scenes. When multiple characters appeared in a scene, motion-related effects were attenuated and participants shifted their gaze from faces to bodies, or made off-screen glances. Children showed the largest effects related to the introduction of motion cues or additional characters, suggesting that they find dynamic faces difficult to process, and are especially prone to look away from faces when viewing complex social scenes – a strategy that could reduce the cognitive and the affective load imposed by having to divide one’s attention between multiple faces. Our findings provide new insights into the typical development of social attention during natural scene viewing, and lay the foundation for future work examining gaze behaviours in typical and atypical development.

  3. Development of an emergency medical video multiplexing transport system. Aiming at the nation wide prehospital care on ambulance.

    Science.gov (United States)

    Nagatuma, Hideaki

    2003-04-01

    The Emergency Medical Video Multiplexing Transport System (EMTS) is designed to support prehospital care by delivering high quality live video streams of patients in an ambulance to emergency doctors in a remote hospital via satellite communications. The important feature is that EMTS divides a patient's live video scene into four pieces and transports the four video streams on four separate network channels. By multiplexing four video streams, EMTS is able to transport high quality videos through low data transmission rate networks such as satellite communications and cellular phone networks. In order to transport live video streams constantly, EMTS adopts Real-time Transport Protocol/Real-time Control Protocol as a network protocol, and video stream data are compressed in the Moving Picture Experts Group 4 (MPEG-4) format. Because EMTS recombines the four video streams by checking video frame numbers, it uses a refresh packet that initializes the server's frame numbers to synchronize the four video streams.
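
    The split-and-recombine idea in this abstract can be sketched as below. This is an illustration under assumptions, not the EMTS implementation: the abstract does not say how the scene is divided, so a quadrant split is assumed; frames are plain lists of pixel rows; network transport, MPEG-4 coding, and the refresh packet are left out; and all names are hypothetical.

```python
def split_into_quadrants(frame):
    """Split a frame into four pieces: top-left, top-right, bottom-left, bottom-right."""
    mid_row = len(frame) // 2
    mid_col = len(frame[0]) // 2
    top, bottom = frame[:mid_row], frame[mid_row:]
    return [
        [row[:mid_col] for row in top],
        [row[mid_col:] for row in top],
        [row[:mid_col] for row in bottom],
        [row[mid_col:] for row in bottom],
    ]


def merge_quadrants(pieces):
    """Reassemble the four pieces produced by split_into_quadrants."""
    top_left, top_right, bottom_left, bottom_right = pieces
    top = [left + right for left, right in zip(top_left, top_right)]
    bottom = [left + right for left, right in zip(bottom_left, bottom_right)]
    return top + bottom


class FrameRecombiner:
    """Collects pieces arriving on four channels and releases a frame only
    when all four pieces carrying the same frame number have arrived."""

    def __init__(self):
        self.pending = {}  # frame number -> {channel index: piece}

    def receive(self, channel, frame_number, piece):
        pieces = self.pending.setdefault(frame_number, {})
        pieces[channel] = piece
        if len(pieces) < 4:
            return None  # still waiting for other channels
        del self.pending[frame_number]
        return merge_quadrants([pieces[c] for c in range(4)])
```

    Keying the buffer by frame number is what keeps the four channels in step even when their pieces arrive out of order.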

  4. An effective method of collecting practical knowledge by presentation of videos and related words

    Directory of Open Access Journals (Sweden)

    Satoshi Shimada

    2017-12-01

    Full Text Available The concentration of practical knowledge and experiential knowledge in the form of collective intelligence (the wisdom of the crowd) is of interest in the area of skill transfer. Previous studies have confirmed that collective intelligence can be formed through the utilization of video annotation systems where knowledge that is recalled while watching videos of work tasks can be assigned in the form of a comment. The knowledge that can be collected is limited, however, to the content that can be depicted in videos, meaning that it is necessary to prepare many videos when collecting knowledge. This paper proposes a method for expanding the scope of recall from the same video through the automatic generation and simultaneous display of related words and video scenes. Further, the validity of the proposed method is empirically illustrated through the example of a field experiment related to mountaineering skills.

  5. Words Matter: Scene Text for Image Classification and Retrieval

    NARCIS (Netherlands)

    Karaoglu, S.; Tao, R.; Gevers, T.; Smeulders, A.W.M.

    Text in natural images typically adds meaning to an object or scene. In particular, text specifies which business places serve drinks (e.g., cafe, teahouse) or food (e.g., restaurant, pizzeria), and what kind of service is provided (e.g., massage, repair). The mere presence of text, its words, and

  6. Water surface modeling from a single viewpoint video.

    Science.gov (United States)

    Li, Chuan; Pickup, David; Saunders, Thomas; Cosker, Darren; Marshall, David; Hall, Peter; Willis, Philip

    2013-07-01

    We introduce a video-based approach for producing water surface models. Recent advances in this field output high-quality results but require dedicated capturing devices and only work in limited conditions. In contrast, our method achieves a good tradeoff between the visual quality and the production cost: It automatically produces a visually plausible animation using a single viewpoint video as the input. Our approach is based on two discoveries: first, shape from shading (SFS) is adequate to capture the appearance and dynamic behavior of the example water; second, shallow water model can be used to estimate a velocity field that produces complex surface dynamics. We will provide qualitative evaluation of our method and demonstrate its good performance across a wide range of scenes.

  7. Security training with interactive laser-video-disk technology

    International Nuclear Information System (INIS)

    Wilson, D.

    1988-01-01

    DOE, through its contractor EG and G Energy Measurements, Inc., has developed a state-of-the-art interactive-video system for use at the Department of Energy's Central Training Academy. Called the Security Training and Evaluation Shooting System (STRESS), the computer-driven decision shooting system employs the latest in laser-video-disk technology. STRESS is designed to provide realistic and stressful training for security inspectors employed by the DOE and its contractors. The system uses wide-screen video projection, sophisticated scenario-branching technology, and customized video scenarios especially designed for the DOE. Firing a weapon that has been modified to shoot "laser bullets" and wearing a special vest that detects "hits", the security inspector encounters adversaries on the wide screen who can shoot or be shot by the inspector in scenarios that demand fast decisions. Based on those decisions, the computer provides instantaneous branching to different scenes, giving the inspector confrontational training with the realism and variability of real life.

  8. Multimodal Feature Learning for Video Captioning

    Directory of Open Access Journals (Sweden)

    Sujin Lee

    2018-01-01

    Full Text Available Video captioning refers to the task of generating a natural language sentence that explains the content of the input video clips. This study proposes a deep neural network model for effective video captioning. Apart from visual features, the proposed model additionally learns semantic features that describe the video content effectively. In our model, visual features of the input video are extracted using convolutional neural networks such as C3D and ResNet, while semantic features are obtained using recurrent neural networks such as LSTM. In addition, our model includes an attention-based caption generation network to generate the correct natural language captions based on the multimodal video feature sequences. Various experiments, conducted with the two large benchmark datasets, Microsoft Video Description (MSVD) and Microsoft Research Video-to-Text (MSR-VTT), demonstrate the performance of the proposed model.

  9. Progress in the development of a video-based wind farm simulation technique

    OpenAIRE

    Robotham, AJ

    1992-01-01

    The progress in the development of a video-based wind farm simulation technique is reviewed. While improvements have been achieved in the quality of the composite picture created by combining computer generated animation sequences of wind turbines with background scenes of the wind farm site, extending the technique to include camera movements has proved troublesome.

  10. Detection and localization of copy-paste forgeries in digital videos.

    Science.gov (United States)

    Singh, Raahat Devender; Aggarwal, Naveen

    2017-12-01

    Amidst the continual march of technology, we find ourselves relying on digital videos to proffer visual evidence in several highly sensitive areas such as journalism, politics, civil and criminal litigation, and military and intelligence operations. However, despite being an indispensable source of information with high evidentiary value, digital videos are also extremely vulnerable to conscious manipulations. Therefore, in a situation where dependence on video evidence is unavoidable, it becomes crucial to authenticate the contents of this evidence before accepting them as an accurate depiction of reality. Digital videos can suffer from several kinds of manipulations, but perhaps one of the most consequential forgeries is copy-paste forgery, which involves insertion/removal of objects into/from video frames. Copy-paste forgeries alter the information presented by the video scene, which has a direct effect on our basic understanding of what that scene represents, and so, from a forensic standpoint, the challenge of detecting such forgeries is especially significant. In this paper, we propose a sensor pattern noise based copy-paste detection scheme, which is an improved and forensically stronger version of an existing noise-residue based technique. We also study a demosaicing artifact based image forensic scheme to estimate the extent of its viability in the domain of video forensics. Furthermore, we suggest a simplistic clustering technique for the detection of copy-paste forgeries, and determine whether it possesses the capabilities desired of a viable and efficacious video forensic scheme. Finally, we validate these schemes on a set of realistically tampered MJPEG, MPEG-2, MPEG-4, and H.264/AVC encoded videos in a diverse experimental set-up by varying the strength of post-production re-compressions and transcodings, bitrates, and sizes of the tampered regions. Such an experimental set-up is representative of a neutral testing platform and simulates a real

  11. The capture and recreation of 3D auditory scenes

    Science.gov (United States)

    Li, Zhiyun

    The main goal of this research is to develop the theory and implement practical tools (in both software and hardware) for the capture and recreation of 3D auditory scenes. Our research is expected to have applications in virtual reality, telepresence, film, music, video games, auditory user interfaces, and sound-based surveillance. The first part of our research is concerned with sound capture via a spherical microphone array. The advantage of this array is that it can be steered digitally in any 3D direction with the same beampattern. We develop design methodologies to achieve flexible microphone layouts, optimal beampattern approximation and robustness constraints. We also design novel hemispherical and circular microphone array layouts for more spatially constrained auditory scenes. Using the captured audio, we then propose a unified and simple approach for recreating the scenes by exploiting the reciprocity principle that is satisfied between the two processes. Our approach makes the system easy to build, and practical. Using this approach, we can capture the 3D sound field by a spherical microphone array and recreate it using a spherical loudspeaker array, and ensure that the recreated sound field matches the recorded field up to a high order of spherical harmonics. For some regular or semi-regular microphone layouts, we design an efficient parallel implementation of the multi-directional spherical beamformer by using the rotational symmetries of the beampattern and of the spherical microphone array. This can be implemented in either software or hardware and easily adapted for other regular or semi-regular layouts of microphones. In addition, we extend this approach to headphone-based systems. Design examples and simulation results are presented to verify our algorithms. Prototypes are built and tested in real-world auditory scenes.

  12. Associative Processing Is Inherent in Scene Perception

    Science.gov (United States)

    Aminoff, Elissa M.; Tarr, Michael J.

    2015-01-01

    How are complex visual entities such as scenes represented in the human brain? More concretely, along what visual and semantic dimensions are scenes encoded in memory? One hypothesis is that global spatial properties provide a basis for categorizing the neural response patterns arising from scenes. In contrast, non-spatial properties, such as single objects, also account for variance in neural responses. The list of critical scene dimensions has continued to grow, sometimes in a contradictory manner, coming to encompass properties such as geometric layout, big/small, crowded/sparse, and three-dimensionality. We demonstrate that these dimensions may be better understood within the more general framework of associative properties. That is, across both the perceptual and semantic domains, features of scene representations are related to one another through learned associations. Critically, the components of such associations are consistent with the dimensions that are typically invoked to account for scene understanding and its neural bases. Using fMRI, we show that non-scene stimuli displaying novel associations across identities or locations recruit putatively scene-selective regions of the human brain (the parahippocampal/lingual region, the retrosplenial complex, and the transverse occipital sulcus/occipital place area). Moreover, we find that the voxel-wise neural patterns arising from these associations are significantly correlated with the neural patterns arising from everyday scenes, providing critical evidence that the same encoding principles underlie both types of processing. These neuroimaging results provide evidence for the hypothesis that the neural representation of scenes is better understood within the broader theoretical framework of associative processing. In addition, the results demonstrate a division of labor that arises across scene-selective regions when processing associations and scenes, providing better understanding of the functional

  13. Beyond scene gist: Objects guide search more than scene background.

    Science.gov (United States)

    Koehler, Kathryn; Eckstein, Miguel P

    2017-06-01

    Although the facilitation of visual search by contextual information is well established, there is little understanding of the independent contributions of different types of contextual cues in scenes. Here we manipulated 3 types of contextual information: object co-occurrence, multiple object configurations, and background category. We isolated the benefits of each contextual cue to target detectability, its impact on decision bias, confidence, and the guidance of eye movements. We find that object-based information guides eye movements and facilitates perceptual judgments more than scene background. The degree of guidance and facilitation of each contextual cue can be related to its inherent informativeness about the target spatial location as measured by human explicit judgments about likely target locations. Our results improve the understanding of the contributions of distinct contextual scene components to search and suggest that the brain's utilization of cues to guide eye movements is linked to the cue's informativeness about the target's location.

  14. Self-Occlusions and Disocclusions in Causal Video Object Segmentation

    KAUST Repository

    Yang, Yanchao

    2016-02-19

    We propose a method to detect disocclusion in video sequences of three-dimensional scenes and to partition the disoccluded regions into objects, defined by coherent deformation corresponding to surfaces in the scene. Our method infers deformation fields that are piecewise smooth by construction without the need for an explicit regularizer and the associated choice of weight. It then partitions the disoccluded region and groups its components with objects by leveraging the complementarity of motion and appearance cues: where appearance changes within an object, motion can usually be reliably inferred and used for grouping; where appearance is close to constant, it can be used for grouping directly. We integrate both cues in an energy minimization framework, incorporate prior assumptions explicitly into the energy, and propose a numerical scheme.

  15. Interactive Procedural Modelling of Coherent Waterfall Scenes

    OpenAIRE

    Emilien, Arnaud; Poulin, Pierre; Cani, Marie-Paule; Vimont, Ulysse

    2015-01-01

    Combining procedural generation and user control is a fundamental challenge for the interactive design of natural scenery. This is particularly true for modelling complex waterfall scenes where, in addition to taking charge of geometric details, an ideal tool should also provide a user with the freedom to shape the running streams and falls, while automatically maintaining physical plausibility in terms of flow network, embedding into the terrain, and visual aspects of...

  16. Habitat diversity in the Northeastern Gulf of Mexico: Selected video clips from the Gulfstream Natural Gas Pipeline digital archive

    Science.gov (United States)

    Raabe, Ellen A.; D'Anjou, Robert; Pope, Domonique K.; Robbins, Lisa L.

    2011-01-01

    This project combines underwater video with maps and descriptions to illustrate diverse seafloor habitats from Tampa Bay, Florida, to Mobile Bay, Alabama. A swath of seafloor was surveyed with underwater video to 100 meters (m) water depth in 1999 and 2000 as part of the Gulfstream Natural Gas System Survey. The U.S. Geological Survey (USGS) in St. Petersburg, Florida, in cooperation with Eckerd College and the Florida Department of Environmental Protection (FDEP), produced an archive of analog-to-digital underwater movies. Representative clips of seafloor habitats were selected from hundreds of hours of underwater footage. The locations of video clips were mapped to show the distribution of habitat and habitat transitions. The numerous benthic habitats in the northeastern Gulf of Mexico play a vital role in the region's economy, providing essential resources for tourism, natural gas, recreational water sports (fishing, boating, scuba diving), materials, fresh food, energy, a source of sand for beach renourishment, and more. These submerged natural resources are important to the economy but are often invisible to the general public. This product provides a glimpse of the seafloor with sample underwater video, maps, and habitat descriptions. It was developed to depict the range and location of seafloor habitats in the region but is limited by depth and by the survey track. It should not be viewed as comprehensive, but rather as a point of departure for inquiries and appreciation of marine resources and seafloor habitats. Further information is provided in the Resources section.

  17. Stages As Models of Scene Geometry

    NARCIS (Netherlands)

    Nedović, V.; Smeulders, A.W.M.; Redert, A.; Geusebroek, J.M.

    2010-01-01

    Reconstruction of 3D scene geometry is an important element for scene understanding, autonomous vehicle and robot navigation, image retrieval, and 3D television. We propose accounting for the inherent structure of the visual world when trying to solve the scene reconstruction problem. Consequently,

  18. Video games.

    Science.gov (United States)

    Funk, Jeanne B

    2005-06-01

    The video game industry insists that it is doing everything possible to provide information about the content of games so that parents can make informed choices; however, surveys indicate that ratings may not reflect consumer views of the nature of the content. This article describes some of the currently popular video games, as well as developments that are on the horizon, and discusses the status of research on the positive and negative impacts of playing video games. Recommendations are made to help parents ensure that children play games that are consistent with their values.

  19. Integration of Trace Images in Three-dimensional Crime Scene Reconstruction

    Directory of Open Access Journals (Sweden)

    Quentin Milliet

    2016-01-01

    Full Text Available Forensic image analysis has greatly developed with the proliferation of photography and video recording devices. Trace images of serious incidents are increasingly captured by first responders, witnesses, bystanders, or surveillance systems. Image perception is exposed with a special emphasis on the influence of the field of view on observation. In response to the pitfalls of the mental eye, a way to systematize the integration of images as traces in three-dimensional crime scene reconstruction is proposed. The systematic approach is based on the application of photogrammetric principles to slightly modify the usual photographic documentation as well as on the early collection and review of available trace images. The integration of images as traces provides valuable contributions to contextualize what happened at a crime scene based on the information that can be obtained from images. In a wider perspective, the systematic analysis of images fosters the use and interpretation of forensic evidence to complement witness statements in the criminal justice system. This article outlines the benefits of integrating trace images into a coherent reconstruction framework in order to improve interpretation of their content. A solution is proposed to integrate perception differences between the field of view of cameras and the human eye.

  20. When Does Repeated Search in Scenes Involve Memory? Looking at versus Looking for Objects in Scenes

    Science.gov (United States)

    Vo, Melissa L. -H.; Wolfe, Jeremy M.

    2012-01-01

    One might assume that familiarity with a scene or previous encounters with objects embedded in a scene would benefit subsequent search for those items. However, in a series of experiments we show that this is not the case: When participants were asked to subsequently search for multiple objects in the same scene, search performance remained…

  1. A hardware architecture for real-time shadow removal in high-contrast video

    Science.gov (United States)

    Verdugo, Pablo; Pezoa, Jorge E.; Figueroa, Miguel

    2017-09-01

    Broadcasting an outdoor sports event at daytime is a challenging task due to the high contrast that exists between areas in the shadow and light conditions within the same scene. Commercial cameras typically do not handle the high dynamic range of such scenes in a proper manner, resulting in broadcast streams with very little shadow detail. We propose a hardware architecture for real-time shadow removal in high-resolution video, which reduces the shadow effect and simultaneously improves shadow details. The algorithm operates only on the shadow portions of each video frame, thus improving the results and producing more realistic images than algorithms that operate on the entire frame, such as simplified Retinex and histogram shifting. The architecture receives an input in the RGB color space, transforms it into the YIQ space, and uses color information from both spaces to produce a mask of the shadow areas present in the image. The mask is then filtered using a connected components algorithm to eliminate false positives and negatives. The hardware uses pixel information at the edges of the mask to estimate the illumination ratio between light and shadow in the image, which is then used to correct the shadow area. Our prototype implementation simultaneously processes up to 7 video streams of 1920×1080 pixels at 60 frames per second on a Xilinx Kintex-7 XC7K325T FPGA.
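
    The pipeline described above (colour-space conversion, shadow mask, illumination ratio, correction) can be illustrated with a minimal software sketch. It is not the authors' hardware design: the luma threshold used as a mask, the function name `correct_shadows`, and the use of global means instead of pixels at the mask edges are all simplifying assumptions made for illustration. The YIQ conversion comes from Python's standard `colorsys` module.

```python
import colorsys


def correct_shadows(pixels, luma_threshold=0.3):
    """Brighten shadow pixels using an estimated light/shadow illumination ratio.

    pixels: list of (r, g, b) tuples with components in [0, 1].
    luma_threshold: hypothetical cut-off; pixels with lower luma count as shadow.
    Returns (corrected_pixels, ratio).
    """
    # Y (luma) is the first component of the YIQ representation.
    lumas = [colorsys.rgb_to_yiq(r, g, b)[0] for r, g, b in pixels]

    # Assumed mask: a plain luma threshold stands in for the paper's RGB+YIQ mask.
    mask = [y < luma_threshold for y in lumas]
    shadow = [y for y, m in zip(lumas, mask) if m]
    lit = [y for y, m in zip(lumas, mask) if not m]

    # Nothing to correct if either region is empty or the shadow is fully black.
    if not shadow or not lit or sum(shadow) == 0:
        return [tuple(p) for p in pixels], 1.0

    # Simplification: ratio of global mean lumas, not of mask-edge pixels.
    ratio = (sum(lit) / len(lit)) / (sum(shadow) / len(shadow))

    corrected = [
        tuple(min(1.0, c * ratio) for c in p) if m else tuple(p)
        for p, m in zip(pixels, mask)
    ]
    return corrected, ratio


if __name__ == "__main__":
    frame = [(0.8, 0.8, 0.8), (0.2, 0.2, 0.2)]
    print(correct_shadows(frame))
```

    Only the masked pixels are scaled, which mirrors the abstract's point that the algorithm operates on the shadow portions of the frame rather than on the entire image.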

  2. The Role of Binocular Disparity in Rapid Scene and Pattern Recognition

    Directory of Open Access Journals (Sweden)

    Matteo Valsecchi

    2013-04-01

    Full Text Available We investigated the contribution of binocular disparity to the rapid recognition of scenes and simpler spatial patterns using a paradigm combining backward masked stimulus presentation and short-term match-to-sample recognition. First, we showed that binocular disparity did not contribute significantly to the recognition of briefly presented natural and artificial scenes, even when the availability of monocular cues was reduced. Subsequently, using dense random dot stereograms as stimuli, we showed that observers were in principle able to extract spatial patterns defined only by disparity under brief, masked presentations. Comparing our results with the predictions from a cue-summation model, we showed that combining disparity with luminance did not per se disrupt the processing of disparity. Our results suggest that the rapid recognition of scenes is mediated mostly by a monocular comparison of the images, although we can rely on stereo in fast pattern recognition.

  3. AUTOMATIC FAST VIDEO OBJECT DETECTION AND TRACKING ON VIDEO SURVEILLANCE SYSTEM

    Directory of Open Access Journals (Sweden)

    V. Arunachalam

    2012-08-01

    Full Text Available This paper describes advanced techniques for object detection and tracking in video. Most visual surveillance systems start with motion detection. Motion detection methods attempt to locate connected regions of pixels that represent the moving objects within the scene; different approaches include frame-to-frame difference, background subtraction and motion analysis. Motion detection can be achieved by Principal Component Analysis (PCA), after which objects are separated from the background using background subtraction. The detected object can then be segmented. Segmentation consists of two schemes: one for spatial segmentation and the other for temporal segmentation. Tracking can be performed on the detected object in each frame. The pixel labeling problem can be alleviated by the MAP (Maximum a Posteriori) technique.
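
    Background subtraction, one of the approaches named above, can be sketched in a few lines. This is a generic running-average background model, not the PCA-based method of the paper; the function names, the learning rate `alpha`, and the threshold value are assumptions chosen for illustration. Frames are plain nested lists of grey levels.

```python
def update_background(background, frame, alpha=0.05):
    """Running-average background model: B <- (1 - alpha) * B + alpha * F."""
    return [
        [(1 - alpha) * b + alpha * f for b, f in zip(b_row, f_row)]
        for b_row, f_row in zip(background, frame)
    ]


def foreground_mask(background, frame, threshold=25):
    """Mark a pixel as moving when it differs from the background by more than threshold."""
    return [
        [abs(f - b) > threshold for b, f in zip(b_row, f_row)]
        for b_row, f_row in zip(background, frame)
    ]


if __name__ == "__main__":
    background = [[10, 10], [10, 10]]
    frame = [[10, 200], [10, 10]]  # one bright pixel has appeared
    print(foreground_mask(background, frame))
    print(update_background(background, frame, alpha=0.5))
```

    Passing the previous frame as `background` turns the same mask function into frame-to-frame differencing, the other approach mentioned in the abstract.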

  4. Forensic 3D Scene Reconstruction

    International Nuclear Information System (INIS)

    LITTLE, CHARLES Q.; PETERS, RALPH R.; RIGDON, J. BRIAN; SMALL, DANIEL E.

    1999-01-01

    Traditionally, law enforcement agencies have relied on basic measurement and imaging tools, such as tape measures and cameras, in recording a crime scene. A disadvantage of these methods is that they are slow and cumbersome. The development of a portable system that can rapidly record a crime scene with current camera imaging and 3D geometric surface maps, and contribute quantitative measurements such as accurate relative positioning of crime scene objects, would be an asset to law enforcement agents in collecting and recording significant forensic data. The purpose of this project is to develop a feasible prototype of a fast, accurate, 3D measurement and imaging system that would support law enforcement agents in quickly documenting and accurately recording a crime scene.

  5. Real-Time FPGA-Based Object Tracker with Automatic Pan-Tilt Features for Smart Video Surveillance Systems

    Directory of Open Access Journals (Sweden)

    Sanjay Singh

    2017-05-01

    Full Text Available The design of smart video surveillance systems is an active research field among the computer vision community because of their ability to perform automatic scene analysis by selecting and tracking the objects of interest. In this paper, we present the design and implementation of an FPGA-based standalone working prototype system for real-time tracking of an object of interest in live video streams for such systems. In addition to real-time tracking of the object of interest, the implemented system is also capable of providing purposive automatic camera movement (pan-tilt) in the direction determined by movement of the tracked object. The complete system, including camera interface, DDR2 external memory interface controller, designed object tracking VLSI architecture, camera movement controller and display interface, has been implemented on the Xilinx ML510 (Virtex-5 FX130T) FPGA Board. Our proposed, designed and implemented system robustly tracks the target object present in the scene in real time for standard PAL (720 × 576) resolution color video and automatically controls camera movement in the direction determined by the movement of the tracked object.
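
    The idea of moving the camera in the direction of the tracked object can be illustrated with a small sketch. It is not the paper's VLSI camera movement controller: the function `pan_tilt_command`, the dead zone, and the string commands are hypothetical, and only the PAL frame size (720 × 576) is taken from the abstract.

```python
def pan_tilt_command(centroid, frame_size=(720, 576), dead_zone=0.1):
    """Return (pan, tilt) directions that would re-centre the tracked object.

    centroid: (x, y) pixel position of the tracked object.
    dead_zone: hypothetical tolerance, as a fraction of the half-frame, within
               which the camera is left where it is.
    """
    cx, cy = centroid
    width, height = frame_size

    # Offset from the frame centre, normalised to [-1, 1] on each axis.
    dx = (cx - width / 2) / (width / 2)
    dy = (cy - height / 2) / (height / 2)

    pan = "right" if dx > dead_zone else "left" if dx < -dead_zone else "hold"
    # Image rows grow downward, so a positive dy means the object is low in the frame.
    tilt = "down" if dy > dead_zone else "up" if dy < -dead_zone else "hold"
    return pan, tilt


if __name__ == "__main__":
    print(pan_tilt_command((700, 288)))
    print(pan_tilt_command((100, 50)))
```

    The dead zone keeps the camera from jittering when the object is already close to the centre of the frame.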

  6. Interaction between High-Level and Low-Level Image Analysis for Semantic Video Object Extraction

    Directory of Open Access Journals (Sweden)

    Andrea Cavallaro

    2004-06-01

    Full Text Available The task of extracting a semantic video object is split into two subproblems, namely, object segmentation and region segmentation. Object segmentation relies on a priori assumptions, whereas region segmentation is data-driven and can be solved in an automatic manner. These two subproblems are not mutually independent, and they can benefit from interactions with each other. In this paper, a framework for such interaction is formulated. This representation scheme based on region segmentation and semantic segmentation is compatible with the view that image analysis and scene understanding problems can be decomposed into low-level and high-level tasks. Low-level tasks pertain to region-oriented processing, whereas the high-level tasks are closely related to object-level processing. This approach emulates the human visual system: what one “sees” in a scene depends on the scene itself (region segmentation) as well as on the cognitive task (semantic segmentation) at hand. The higher-level segmentation results in a partition corresponding to semantic video objects. Semantic video objects do not usually have invariant physical properties and the definition depends on the application. Hence, the definition incorporates complex domain-specific knowledge and is not easy to generalize. For the specific implementation used in this paper, motion is used as a clue to semantic information. In this framework, an automatic algorithm is presented for computing the semantic partition based on color change detection. The change detection strategy is designed to be immune to the sensor noise and local illumination variations. The lower-level segmentation identifies the partition corresponding to perceptually uniform regions. These regions are derived by clustering in an N-dimensional feature space, composed of static as well as dynamic image attributes. We propose an interaction mechanism between the semantic and the region partitions which allows to

  7. Hierarchical Context Modeling for Video Event Recognition.

    Science.gov (United States)

    Wang, Xiaoyang; Ji, Qiang

    2016-10-11

    Current video event recognition research remains largely target-centered. For real-world surveillance videos, target-centered event recognition faces great challenges due to large intra-class target variation, limited image resolution, and poor detection and tracking results. To mitigate these challenges, we introduced a context-augmented video event recognition approach. Specifically, we explicitly capture different types of contexts from three levels including image level, semantic level, and prior level. At the image level, we introduce two types of contextual features including the appearance context features and interaction context features to capture the appearance of context objects and their interactions with the target objects. At the semantic level, we propose a deep model based on deep Boltzmann machine to learn event object representations and their interactions. At the prior level, we utilize two types of prior-level contexts including scene priming and dynamic cueing. Finally, we introduce a hierarchical context model that systematically integrates the contextual information at different levels. Through the hierarchical context model, contexts at different levels jointly contribute to the event recognition. We evaluate the hierarchical context model for event recognition on benchmark surveillance video datasets. Results show that incorporating contexts in each level can improve event recognition performance, and jointly integrating three levels of contexts through our hierarchical model achieves the best performance.

  8. A Survey of the Predictors of Amount of Aggression in the Adolescent Users of Violent Video Games in Qom City, 2012, Iran

    OpenAIRE

    Sarallah Shojaei; Tahereh Dehdari; Keramat Noori Jelyani; Behnaz Dowran

    2013-01-01

    Background and Objectives: Adolescents are the main audiences of video games. Attractive technologies of these games make virtual faces seem real characters to their audiences. There is a high tendency to show violent and deadly scenes. The present study was done with the purpose of determining the predictors of the amount of aggression in the adolescent users of violent video games in Qom city. Methods: In this descriptive cross-sectional study, 100 adolescent users of violent video game refe...

  9. Video content analysis on body-worn cameras for retrospective investigation

    Science.gov (United States)

    Bouma, Henri; Baan, Jan; ter Haar, Frank B.; Eendebak, Pieter T.; den Hollander, Richard J. M.; Burghouts, Gertjan J.; Wijn, Remco; van den Broek, Sebastiaan P.; van Rest, Jeroen H. C.

    2015-10-01

    In the security domain, cameras are important to assess critical situations. Apart from fixed surveillance cameras we observe an increasing number of sensors on mobile platforms, such as drones, vehicles and persons. Mobile cameras allow rapid and local deployment, enabling many novel applications and effects, such as the reduction of violence between police and citizens. However, the increased use of bodycams also creates potential challenges. For example: how can end-users extract information from the abundance of video, how can the information be presented, and how can an officer retrieve information efficiently? Nevertheless, such video gives the opportunity to stimulate the professionals' memory, and support complete and accurate reporting. In this paper, we show how video content analysis (VCA) can address these challenges and seize these opportunities. To this end, we focus on methods for creating a complete summary of the video, which allows quick retrieval of relevant fragments. The content analysis for summarization consists of several components, such as stabilization, scene selection, motion estimation, localization, pedestrian tracking and action recognition in the video from a bodycam. The different components and visual representations of summaries are presented for retrospective investigation.

  10. Scenes of the self, and trance

    Directory of Open Access Journals (Sweden)

    Jan M. Broekman

    2014-02-01

    Full Text Available Trance shows the Self as a process involved in all sorts and forms of life. A Western perspective on a self and its reifying tendencies is only one (or one series) of those variations. The process character of the self does not allow any coherent theory but shows, in particular when confronted with trance, its variability in all regards. What is more: the Self is always first on the scene of itself, a situation in which it becomes a sign for itself. That particular semiotic feature is again not a unified one but leads, as the Self in view of itself does, to series of scenes with changing colors, circumstances and environments. Our first scene “Beyond Monotheism” shows semiotic importance in that a self as determining component of a trance-phenomenon must abolish its own referent and seems not able to answer the question, what makes trance a trance. The Pizzica is an example here. Other social features of trance appear in the second scene, US post-traumatic psychological treatments included. Our third scene underlines structures of an unfolding self: beginning with ‘split-ego’ conclusions, a self’s engenderment appears dependent on linguistic events and on spoken words in the first place. A fourth scene explores that theme and explains modern forms of an ego, in particular those inherent to ‘citizenship’ or a ‘corporation’. The legal consequences are concentrated in the fifth scene, which considers a legal subject by revealing its ‘standing’. Our sixth and final scene pertains to the relation between trance and commerce. All scenes tie together and show parallels between Pizzica, rights-based behavior, RAVE music versus disco, commerce and trance; they demonstrate the meaning of trance as a multifaceted social phenomenon.

  11. Video-documentation: 'The Pannonic ozon project'

    International Nuclear Information System (INIS)

    Loibl, W.; Cabela, E.; Mayer, H. F.; Schmidt, M.

    1998-07-01

    The goal of the project was the production of a video film documenting the Pannonian Ozone Project (POP). The main part of the video describes the POP model, consisting of the modules meteorology, emissions and chemistry, developed during the POP project. The model considers the European emission patterns of ozone precursors and the actual wind fields. It calculates ozone build-up and depletion within air parcels due to emissions and the weather situation along trajectory routes. Actual ozone concentrations are calculated during model runs simulating the photochemical processes within air parcels moving along 4-day trajectories before reaching the Vienna region. The model computations were validated during extensive ground and aircraft-based measurements of ozone precursors and ozone concentration within the POP study area. Scenario computations were used to determine how much ozone can be reduced in north-eastern Austria by emissions control measures. The video lasts 12:20 minutes and consists of computer animations and live video scenes, presenting the ozone problem in general, the POP model and the model results. The video was produced in co-operation by the Austrian Research Center Seibersdorf - Department of Environmental Planning (ARCS) and Joanneum Research - Institute of Information Systems (JR). ARCS was responsible for idea, concept, storyboard and text, while JR was responsible for computer animation and general video production. The speaker text was written with scientific advice from the POP project partners: Institute of Meteorology and Physics, University of Agricultural Sciences - Vienna; Environment Agency Austria - Air Quality Department; Austrian Research Center Seibersdorf - Environmental Planning Department/System Research Division. The film was produced in German and English versions. (author)

  12. Automatically assessing properties of dynamic cameras for camera selection and rapid deployment of video content analysis tasks in large-scale ad-hoc networks

    Science.gov (United States)

    den Hollander, Richard J. M.; Bouma, Henri; van Rest, Jeroen H. C.; ten Hove, Johan-Martijn; ter Haar, Frank B.; Burghouts, Gertjan J.

    2017-10-01

    Video analytics is essential for managing large quantities of raw data that are produced by video surveillance systems (VSS) for the prevention, repression and investigation of crime and terrorism. Analytics is highly sensitive to changes in the scene and to changes in the optical chain, so a VSS with analytics needs careful configuration and prompt maintenance to avoid false alarms. However, there is a trend from static VSS consisting of fixed CCTV cameras towards more dynamic VSS deployments over public/private multi-organization networks, consisting of a wider variety of visual sensors, including pan-tilt-zoom (PTZ) cameras, body-worn cameras and cameras on moving platforms. This trend will lead to more dynamic scenes and more frequent changes in the optical chain, creating structural problems for analytics. If these problems are not adequately addressed, analytics will not be able to continue to meet end users' developing needs. In this paper, we present a three-part solution for managing the performance of complex analytics deployments. The first part is a register containing metadata describing relevant properties of the optical chain, such as intrinsic and extrinsic calibration, and parameters of the scene such as lighting conditions or measures for scene complexity (e.g. number of people). A second part frequently assesses these parameters in the deployed VSS, stores changes in the register, and signals relevant changes in the setup to the VSS administrator. A third part uses the information in the register to dynamically configure analytics tasks based on VSS operator input. In order to support the feasibility of this solution, we give an overview of related state-of-the-art technologies for autocalibration (self-calibration), scene recognition and lighting estimation in relation to person detection. The presented solution allows for rapid and robust deployment of Video Content Analysis (VCA) tasks in large-scale ad-hoc networks.
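
    The three parts described above (a register, a periodic assessment that signals relevant changes, and register-driven configuration) can be sketched as a small data structure. Everything in this example is hypothetical: the class `CameraRegister`, the relative-change tolerance, the `lux` parameter name, and the low-light rule are illustrative assumptions and do not come from the paper.

```python
class CameraRegister:
    """Toy register of per-camera scene and optical-chain parameters."""

    def __init__(self, tolerance=0.1):
        # Relative change above which a new value counts as "relevant".
        self.tolerance = tolerance
        self._values = {}

    def assess(self, camera_id, parameter, value):
        """Store the latest value; return True if it changed enough to signal."""
        key = (camera_id, parameter)
        previous = self._values.get(key)
        self._values[key] = value
        if previous is None:
            return False  # first measurement, nothing to compare against
        return abs(value - previous) > self.tolerance * max(abs(previous), 1e-9)

    def detector_settings(self, camera_id, low_light_lux=50.0):
        """Derive a (hypothetical) analytics setting from the stored lighting level."""
        lux = self._values.get((camera_id, "lux"))
        return {"low_light_mode": lux is not None and lux < low_light_lux}


if __name__ == "__main__":
    register = CameraRegister()
    register.assess("cam1", "lux", 400.0)
    if register.assess("cam1", "lux", 30.0):
        print("notify administrator:", register.detector_settings("cam1"))
```

    Keeping assessment and configuration as separate methods reflects the split in the abstract between detecting changes in the setup and using the stored information to configure analytics tasks.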

  13. The nature-disorder paradox: A perceptual study on how nature is disorderly yet aesthetically preferred.

    Science.gov (United States)

    Kotabe, Hiroki P; Kardan, Omid; Berman, Marc G

    2017-08-01

    Natural environments have powerful aesthetic appeal linked to their capacity for psychological restoration. In contrast, disorderly environments are aesthetically aversive, and have various detrimental psychological effects. But in our research, we have repeatedly found that natural environments are perceptually disorderly. What could explain this paradox? We present 3 competing hypotheses: the aesthetic preference for naturalness is more powerful than the aesthetic aversion to disorder (the nature-trumps-disorder hypothesis); disorder is trivial to aesthetic preference in natural contexts (the harmless-disorder hypothesis); and disorder is aesthetically preferred in natural contexts (the beneficial-disorder hypothesis). Utilizing novel methods of perceptual study and diverse stimuli, we rule in the nature-trumps-disorder hypothesis and rule out the harmless-disorder and beneficial-disorder hypotheses. In examining perceptual mechanisms, we find evidence that high-level scene semantics are both necessary and sufficient for the nature-trumps-disorder effect. Necessity is evidenced by the effect disappearing in experiments utilizing only low-level visual stimuli (i.e., where scene semantics have been removed) and experiments utilizing a rapid-scene-presentation procedure that obscures scene semantics. Sufficiency is evidenced by the effect reappearing in experiments utilizing noun stimuli which remove low-level visual features. Furthermore, we present evidence that the interaction of scene semantics with low-level visual features amplifies the nature-trumps-disorder effect: the effect is weaker both when statistically adjusting for quantified low-level visual features and when using noun stimuli which remove low-level visual features. These results have implications for psychological theories bearing on the joint influence of low- and high-level perceptual inputs on affect and cognition, as well as for aesthetic design.

  14. Temporal properties of natural scenes

    NARCIS (Netherlands)

    Hateren, J.H. van; Schaaf, A. van der; Rogowitz, BE; Allebach, JP

    1996-01-01

    A major problem a visual system faces is how to fit the large intensity variation of natural image streams into the limited dynamic range of its neurons. One of the means to accomplish this is through the use of fast light adaptation of the photoreceptors. In order to investigate this, we measured

  15. 3D terrestrial lidar data classification of complex natural scenes using a multi-scale dimensionality criterion: Applications in geomorphology

    Science.gov (United States)

    Brodu, N.; Lague, D.

    2012-03-01

    3D point clouds of natural environments relevant to problems in geomorphology (rivers, coastal environments, cliffs, …) often require classification of the data into elementary relevant classes. A typical example is the separation of riparian vegetation from ground in fluvial environments, the distinction between fresh surfaces and rockfall in cliff environments, or more generally the classification of surfaces according to their morphology (e.g. the presence of bedforms or by grain size). Natural surfaces are heterogeneous and their distinctive properties are seldom defined at a unique scale, prompting the use of multi-scale criteria to achieve a high degree of classification success. We have thus defined a multi-scale measure of the point cloud dimensionality around each point. The dimensionality characterizes the local 3D organization of the point cloud within spheres centered on the measured points and varies from being 1D (points set along a line), 2D (points forming a plane) to the full 3D volume. By varying the diameter of the sphere, we can thus monitor how the local cloud geometry behaves across scales. We present the technique and illustrate its efficiency in separating riparian vegetation from ground and classifying a mountain stream as vegetation, rock, gravel or water surface. In these two cases, separating the vegetation from ground or other classes achieves an accuracy above 98%. Comparison with a single-scale approach shows the superiority of the multi-scale analysis in enhancing class separability and spatial resolution of the classification. Scenes of between 10 and 100 million points can be classified on a common laptop in a reasonable time. The technique is robust to missing data, shadow zones and changes in point density within the scene. The classification is fast and accurate and can account for some degree of intra-class morphological variability such as different vegetation types. A probabilistic confidence in the classification
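
    A local dimensionality measure of the kind described above can be sketched from the eigenvalues of the covariance matrix of the points inside a sphere. The abstract does not give the paper's exact formula, so the three scores below (linear, planar, volumetric, computed from the square roots of the eigenvalues) are an assumed, commonly used heuristic. The function names are hypothetical, and the eigenvalues are obtained with the standard closed-form solution for symmetric 3 × 3 matrices.

```python
import math


def covariance_eigenvalues(points):
    """Eigenvalues (descending) of the 3x3 covariance matrix of a list of 3D points."""
    n = len(points)
    mean = [sum(p[k] for p in points) / n for k in range(3)]
    c = [[sum((p[i] - mean[i]) * (p[j] - mean[j]) for p in points) / n
          for j in range(3)] for i in range(3)]
    off = c[0][1] ** 2 + c[0][2] ** 2 + c[1][2] ** 2
    if off == 0.0:  # already diagonal
        return sorted((c[0][0], c[1][1], c[2][2]), reverse=True)
    # Trigonometric closed form for a symmetric 3x3 matrix.
    q = (c[0][0] + c[1][1] + c[2][2]) / 3
    p = math.sqrt((sum((c[i][i] - q) ** 2 for i in range(3)) + 2 * off) / 6)
    b = [[(c[i][j] - (q if i == j else 0.0)) / p for j in range(3)] for i in range(3)]
    det = (b[0][0] * (b[1][1] * b[2][2] - b[1][2] * b[2][1])
           - b[0][1] * (b[1][0] * b[2][2] - b[1][2] * b[2][0])
           + b[0][2] * (b[1][0] * b[2][1] - b[1][1] * b[2][0]))
    phi = math.acos(max(-1.0, min(1.0, det / 2))) / 3
    e1 = q + 2 * p * math.cos(phi)
    e3 = q + 2 * p * math.cos(phi + 2 * math.pi / 3)
    return [e1, 3 * q - e1 - e3, e3]


def dimensionality(points, center, radius):
    """Assumed heuristic: (1D, 2D, 3D) scores for the neighbourhood of `center`."""
    neighbours = [p for p in points if math.dist(p, center) <= radius]
    if len(neighbours) < 3:
        return None  # too few points to estimate a shape
    s1, s2, s3 = (math.sqrt(max(e, 0.0)) for e in covariance_eigenvalues(neighbours))
    if s1 == 0.0:
        return None  # all neighbours coincide
    return ((s1 - s2) / s1, (s2 - s3) / s1, s3 / s1)


def multiscale_signature(points, center, radii):
    """Dimensionality scores of one point at several sphere radii."""
    return [dimensionality(points, center, r) for r in radii]


if __name__ == "__main__":
    plane = [(x, y, 0.0) for x in range(-3, 4) for y in range(-3, 4)]
    print(multiscale_signature(plane, (0.0, 0.0, 0.0), [2.0, 10.0]))
```

    The list returned by `multiscale_signature` is the kind of per-point feature vector a classifier could be trained on; how the paper combines the scales into classes is not reproduced here.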

  16. Satellite markers: a simple method for ground truth car pose on stereo video

    Science.gov (United States)

    Gil, Gustavo; Savino, Giovanni; Piantini, Simone; Pierini, Marco

    2018-04-01

    Artificial prediction of the future location of other cars is a must in the context of advanced safety systems. The remote estimation of car pose, and particularly of its heading angle, is key to predicting its future location. Stereo vision systems make it possible to obtain 3D information about a scene. Ground truth in this specific context is associated with referential information about the depth, shape and orientation of the objects present in the traffic scene. Creating 3D ground truth is a measurement and data fusion task associated with the combination of different kinds of sensors. The novelty of this paper is the method to generate ground truth car pose only from video data. When the method is applied to stereo video, it also provides the extrinsic camera parameters for each camera at frame level, which are key to quantifying the performance of a stereo vision system when it is moving, because the system is subjected to undesired vibrations and/or leaning. We developed a video post-processing technique which employs a common camera calibration tool for the 3D ground truth generation. In our case study, we focus on accurate estimation of the heading angle of a moving car under realistic imagery. As outcomes, our satellite marker method provides accurate car pose at frame level, and the instantaneous spatial orientation for each camera at frame level.

  17. Lateralized discrimination of emotional scenes in peripheral vision.

    Science.gov (United States)

    Calvo, Manuel G; Rodríguez-Chinea, Sandra; Fernández-Martín, Andrés

    2015-03-01

    This study investigates whether there is lateralized processing of emotional scenes in the visual periphery, in the absence of eye fixations; and whether this varies with emotional valence (pleasant vs. unpleasant), specific emotional scene content (babies, erotica, human attack, mutilation, etc.), and sex of the viewer. Pairs of emotional (positive or negative) and neutral photographs were presented for 150 ms peripherally (≥6.5° away from fixation). Observers judged on which side the emotional picture was located. Low-level image properties, scene visual saliency, and eye movements were controlled. Results showed that (a) correct identification of the emotional scene exceeded the chance level; (b) performance was more accurate and faster when the emotional scene appeared in the left than in the right visual field; (c) lateralization was equivalent for females and males for pleasant scenes, but was greater for females and unpleasant scenes; and (d) lateralization occurred similarly for different emotional scene categories. These findings reveal discrimination between emotional and neutral scenes, and right brain hemisphere dominance for emotional processing, which is modulated by sex of the viewer and scene valence, and suggest that coarse affective significance can be extracted in peripheral vision.

  18. Self Occlusion and Disocclusion in Causal Video Object Segmentation

    Science.gov (United States)

    2015-12-18

    22, 37, 13, 17], since an explicit 3D reconstruction of the scene produces as a side effect a partition of the video into regions. However, it...83.4 79.3 82.8 84.4 34.7 Soldier 84.0 81.1 83.8 66.6 66.5 Monkey 85.1 86.0 84.8 79.0 61.9 Bird of Paradise 96.1 93.0 94.0 92.2 86.8 BMXPerson 92.8 88.9

  19. Falling out of time: enhanced memory for scenes presented at behaviorally irrelevant points in time in posttraumatic stress disorder (PTSD).

    Science.gov (United States)

    Levy-Gigi, Einat; Kéri, Szabolcs

    2012-01-01

    Spontaneous encoding of the visual environment depends on the behavioral relevance of the task performed simultaneously. If participants identify target letters or auditory tones while viewing a series of briefly presented natural and urban scenes, they demonstrate effective scene recognition only when a target, but not a behaviorally irrelevant distractor, appears together with the scene. Here, we show that individuals with posttraumatic stress disorder (PTSD), who witnessed the red sludge disaster in Hungary, show the opposite pattern of performance: enhanced recognition of scenes presented together with distractors and deficient recognition of scenes presented with targets. The recognition of trauma-related and neutral scenes was not different in individuals with PTSD. We found a positive correlation between memory for scenes presented with auditory distractors and re-experiencing symptoms (memory intrusions and flashbacks). These results suggest that abnormal encoding of visual scenes at behaviorally irrelevant events might be associated with intrusive experiences by disrupting the flow of time.

  20. Falling out of time: enhanced memory for scenes presented at behaviorally irrelevant points in time in posttraumatic stress disorder (PTSD).

    Directory of Open Access Journals (Sweden)

    Einat Levy-Gigi

    Full Text Available Spontaneous encoding of the visual environment depends on the behavioral relevance of the task performed simultaneously. If participants identify target letters or auditory tones while viewing a series of briefly presented natural and urban scenes, they demonstrate effective scene recognition only when a target, but not a behaviorally irrelevant distractor, appears together with the scene. Here, we show that individuals with posttraumatic stress disorder (PTSD), who witnessed the red sludge disaster in Hungary, show the opposite pattern of performance: enhanced recognition of scenes presented together with distractors and deficient recognition of scenes presented with targets. The recognition of trauma-related and neutral scenes was not different in individuals with PTSD. We found a positive correlation between memory for scenes presented with auditory distractors and re-experiencing symptoms (memory intrusions and flashbacks). These results suggest that abnormal encoding of visual scenes at behaviorally irrelevant events might be associated with intrusive experiences by disrupting the flow of time.

  1. Pedestrian detection in video surveillance using fully convolutional YOLO neural network

    Science.gov (United States)

    Molchanov, V. V.; Vishnyakov, B. V.; Vizilter, Y. V.; Vishnyakova, O. V.; Knyaz, V. A.

    2017-06-01

    More than 80% of video surveillance systems are used for monitoring people. Older human detection algorithms, based on background and foreground modelling, could not even deal with a group of people, to say nothing of a crowd. Recent robust and highly effective pedestrian detection algorithms are a new milestone for video surveillance systems. Based on modern deep learning approaches, these algorithms produce very discriminative features that can be used for robust inference in real visual scenes. They handle tasks such as distinguishing different persons in a group, coping with substantial occlusion of human bodies by foreground objects, and detecting people in various poses. In our work we use a new approach which makes it possible to combine the detection and classification tasks into one challenge using convolutional neural networks. As a starting point we choose the YOLO CNN, whose authors propose a very efficient way of combining the above-mentioned tasks by learning a single neural network. This approach showed results competitive with state-of-the-art models such as Fast R-CNN while significantly outperforming them in speed, which allows us to apply it in real-time video surveillance and other video monitoring systems. Despite all its advantages, it suffers from some known drawbacks related to the fully connected layers, which prevent applying the CNN to images of different resolutions. It also limits the ability to distinguish small, closely spaced human figures in groups, which is crucial for our tasks since we work with rather low-quality images that often include dense small groups of people. In this work we gradually change the network architecture to overcome the above-mentioned problems, train it on a complex pedestrian dataset, and finally obtain a CNN that detects small pedestrians in real scenes.

  2. Designer's approach for scene selection in tests of preference and restoration along a continuum of natural to manmade environments

    Science.gov (United States)

    Hunter, MaryCarol R.; Askarinejad, Ali

    2015-01-01

    It is well-established that the experience of nature produces an array of positive benefits to mental well-being. Much less is known about the specific attributes of green space which produce these effects. In the absence of translational research that links theory with application, it is challenging to design urban green space for its greatest restorative potential. This translational research provides a method for identifying which specific physical attributes of an environmental setting are most likely to influence preference and restoration responses. Attribute identification was based on a triangulation process invoking environmental psychology and aesthetics theories, principles of design founded in mathematics and aesthetics, and empirical research on the role of specific physical attributes of the environment in preference or restoration responses. From this integration emerged a list of physical attributes defining aspects of spatial structure and environmental content found to be most relevant to the perceptions involved with preference and restoration. The physical attribute list offers a starting point for deciphering which scene stimuli dominate or collaborate in preference and restoration responses. To support this, functional definitions and metrics (efficient methods for attribute quantification) are presented. Use of these research products and the process for defining place-based metrics can provide (a) greater control in the selection and interpretation of the scenes/images used in tests of preference and restoration and (b) an expanded evidence base for well-being designers of the built environment. PMID:26347691

  3. Improving Remote Sensing Scene Classification by Integrating Global-Context and Local-Object Features

    Directory of Open Access Journals (Sweden)

    Dan Zeng

    2018-05-01

    Full Text Available Recently, many researchers have been dedicated to using convolutional neural networks (CNNs) to extract global-context features (GCFs) for remote-sensing scene classification. Commonly, accurate classification of scenes requires knowledge about both the global context and local objects. However, unlike the natural images in which the objects cover most of the image, objects in remote-sensing images are generally small and decentralized. Thus, it is hard for vanilla CNNs to focus on both global context and small local objects. To address this issue, this paper proposes a novel end-to-end CNN by integrating the GCFs and local-object-level features (LOFs). The proposed network includes two branches, the local object branch (LOB) and global semantic branch (GSB), which are used to generate the LOFs and GCFs, respectively. Then, the concatenation of features extracted from the two branches allows our method to be more discriminative in scene classification. Three challenging benchmark remote-sensing datasets were extensively experimented on; the proposed approach outperformed the existing scene classification methods and achieved state-of-the-art results for all three datasets.

  4. Anticipatory scene representation in preschool children's recall and recognition memory.

    Science.gov (United States)

    Kreindel, Erica; Intraub, Helene

    2017-09-01

    Behavioral and neuroscience research on boundary extension (false memory beyond the edges of a view of a scene) has provided new insights into the constructive nature of scene representation, and motivates questions about development. Early research with children (as young as 6-7 years) was consistent with boundary extension, but relied on an analysis of spatial errors in drawings which are open to alternative explanations (e.g. drawing ability). Experiment 1 replicated and extended prior drawing results with 4-5-year-olds and adults. In Experiment 2, a new, forced-choice immediate recognition memory test was implemented with the same children. On each trial, a card (photograph of a simple scene) was immediately replaced by a test card (identical view and either a closer or more wide-angle view) and participants indicated which one matched the original view. Error patterns supported boundary extension; identical photographs were more frequently rejected when the closer view was the original view, than vice versa. This asymmetry was not attributable to a selection bias (guessing tasks; Experiments 3-5). In Experiment 4, working memory load was increased by presenting more expansive views of more complex scenes. Again, children exhibited boundary extension, but now adults did not, unless stimulus duration was reduced to 5 s (limiting time to implement strategies; Experiment 5). We propose that like adults, children interpret photographs as views of places in the world; they extrapolate the anticipated continuation of the scene beyond the view and misattribute it to having been seen. Developmental differences in source attribution decision processes provide an explanation for the age-related differences observed. © 2016 John Wiley & Sons Ltd.

  5. Smoking scenes in popular Japanese serial television dramas: descriptive analysis during the same 3-month period in two consecutive years.

    Science.gov (United States)

    Kanda, Hideyuki; Okamura, Tomonori; Turin, Tanvir Chowdhury; Hayakawa, Takehito; Kadowaki, Takashi; Ueshima, Hirotsugu

    2006-06-01

    Japanese serial television dramas are becoming very popular overseas, particularly in other Asian countries. Exposure to smoking scenes in movies and television dramas has been known to trigger initiation of habitual smoking in young people. Smoking scenes in Japanese dramas may affect the smoking behavior of many young Asians. We examined smoking scenes and smoking-related items in serial television dramas targeting young audiences in Japan during the same season in two consecutive years. Fourteen television dramas targeting the young audience broadcast between July and September in 2001 and 2002 were analyzed. A total of 136 h 42 min of television programs were divided into unit scenes of 3 min (a total of 2734 unit scenes). All the unit scenes were reviewed for smoking scenes and smoking-related items. Of the 2734 3-min unit scenes, 205 (7.5%) were actual smoking scenes and 387 (14.2%) depicted smoking environments with the presence of smoking-related items, such as ash trays. In 185 unit scenes (90.2% of total smoking scenes), actors were shown smoking. Actresses were less frequently shown smoking (9.8% of total smoking scenes). Smoking characters in dramas were in the 20-49 age group in 193 unit scenes (94.1% of total smoking scenes). In 96 unit scenes (46.8% of total smoking scenes), at least one non-smoker was present in the smoking scenes. The smoking locations were mainly indoors, including offices, restaurants and homes (122 unit scenes, 59.6%). The most common smoking-related items shown were ash trays (in 45.5% of smoking-item-related scenes) and cigarettes (in 30.2% of smoking-item-related scenes). Only 3 unit scenes (0.1% of all scenes) promoted smoking prohibition. This was a descriptive study to examine the nature of smoking scenes observed in Japanese television dramas from a public health perspective.

  6. Evaluating color descriptors for object and scene recognition.

    Science.gov (United States)

    van de Sande, Koen E A; Gevers, Theo; Snoek, Cees G M

    2010-09-01

    Image category recognition is important to access visual information on the level of objects and scene types. So far, intensity-based descriptors have been widely used for feature extraction at salient points. To increase illumination invariance and discriminative power, color descriptors have been proposed. Because many different descriptors exist, a structured overview is required of color invariant descriptors in the context of image category recognition. Therefore, this paper studies the invariance properties and the distinctiveness of color descriptors (software to compute the color descriptors from this paper is available from http://www.colordescriptors.com) in a structured way. The analytical invariance properties of color descriptors are explored, using a taxonomy based on invariance properties with respect to photometric transformations, and tested experimentally using a data set with known illumination conditions. In addition, the distinctiveness of color descriptors is assessed experimentally using two benchmarks, one from the image domain and one from the video domain. From the theoretical and experimental results, it can be derived that invariance to light intensity changes and light color changes affects category recognition. The results further reveal that, for light intensity shifts, the usefulness of invariance is category-specific. Overall, when choosing a single descriptor and no prior knowledge about the data set and object and scene categories is available, the OpponentSIFT is recommended. Furthermore, a combined set of color descriptors outperforms intensity-based SIFT and improves category recognition by 8 percent on the PASCAL VOC 2007 and by 7 percent on the Mediamill Challenge.

  7. Semantic Reasoning for Scene Interpretation

    DEFF Research Database (Denmark)

    Jensen, Lars Baunegaard With; Baseski, Emre; Pugeault, Nicolas

    2008-01-01

    In this paper, we propose a hierarchical architecture for representing scenes, covering 2D and 3D aspects of visual scenes as well as the semantic relations between the different aspects. We argue that labeled graphs are a suitable representational framework for this representation and demonstrat...

  8. Providing views of the driving scene to drivers' conversation partners mitigates cell-phone-related distraction.

    Science.gov (United States)

    Gaspar, John G; Street, Whitney N; Windsor, Matthew B; Carbonari, Ronald; Kaczmarski, Henry; Kramer, Arthur F; Mathewson, Kyle E

    2014-12-01

    Cell-phone use impairs driving safety and performance. This impairment may stem from the remote partner's lack of awareness about the driving situation. In this study, pairs of participants completed a driving simulator task while conversing naturally in the car and while talking on a hands-free cell phone. In a third condition, the driver drove while the remote conversation partner could see video of both the road ahead and the driver's face. We tested the extent to which this additional visual information diminished the negative effects of cell-phone distraction and increased situational awareness. Collision rates for unexpected merging events were high when participants drove in a cell-phone condition but were reduced when they were in a videophone condition, reaching a level equal to that observed when they drove with an in-car passenger or drove alone. Drivers and their partners made shorter utterances and made longer, more frequent traffic references when they spoke in the videophone rather than the cell-phone condition. Providing a view of the driving scene allows remote partners to help drivers by modulating their conversation and referring to traffic more often. © The Author(s) 2014.

  9. Video surveillance using distance maps

    Science.gov (United States)

    Schouten, Theo E.; Kuppens, Harco C.; van den Broek, Egon L.

    2006-02-01

    Human vigilance is limited; hence, automatic motion and distance detection is one of the central issues in video surveillance. Many aspects are important here; this paper specifically addresses efficiency, real-time performance, accuracy, and robustness against various noise factors. To obtain fully controlled test environments, an artificial development center for robot navigation is introduced in which several parameters can be set (e.g., number of objects, trajectories and type and amount of noise). In the videos, for each following frame, movement of stationary objects is detected and pixels of moving objects are located from which moving objects are identified in a robust way. An Exact Euclidean Distance Map (E2DM) is utilized to determine accurately the distances between moving and stationary objects. Together with the determined distances between moving objects and the detected movement of stationary objects, this provides the input for detecting unwanted situations in the scene. Further, each intelligent object (e.g., a robot) is provided with its E2DM, allowing the object to plan its course of action. Timing results are specified for each program block of the processing chain for 20 different setups. The current paper thus presents extensive, experimentally controlled research on real-time, accurate, and robust motion detection for video surveillance, using E2DMs, which makes it a unique approach.
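
    To make the notion of an exact Euclidean distance map concrete, the following minimal sketch computes one by brute force on a small binary grid. Brute force is a swapped-in technique chosen for clarity; the abstract does not describe the authors' algorithm, and a real-time system would need a much faster method. The function name and the grid representation are hypothetical.

```python
import math


def exact_euclidean_distance_map(grid):
    """Return, for every cell, the Euclidean distance to the nearest object cell.

    `grid` is a list of rows; truthy cells mark object pixels.
    Brute force: every cell is compared against every object pixel.
    """
    objects = [(r, c)
               for r, row in enumerate(grid)
               for c, value in enumerate(row) if value]
    if not objects:
        raise ValueError("grid contains no object pixels")
    return [[min(math.hypot(r - orow, c - ocol) for orow, ocol in objects)
             for c in range(len(row))]
            for r, row in enumerate(grid)]
```

    In such a map, the distance from a moving object to the nearest stationary object can be read directly at the moving object's pixel positions.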

  10. Pooling Objects for Recognizing Scenes without Examples

    NARCIS (Netherlands)

    Kordumova, S.; Mensink, T.; Snoek, C.G.M.

    2016-01-01

    In this paper we aim to recognize scenes in images without using any scene images as training data. Different from attribute based approaches, we do not carefully select the training classes to match the unseen scene classes. Instead, we propose a pooling over ten thousand of off-the-shelf object

  11. A System based on Adaptive Background Subtraction Approach for Moving Object Detection and Tracking in Videos

    Directory of Open Access Journals (Sweden)

    Bahadır KARASULU

    2013-04-01

    Full Text Available Video surveillance systems are based on the video and image processing research areas within computer science. Video processing covers various methods used to track changes in the scene of a given video. Nowadays, video processing is one of the important areas of computer science. Two-dimensional videos are used to apply various segmentation, object detection, and tracking processes, which are found in multimedia content-based indexing, information retrieval, visual and distributed cross-camera surveillance systems, people tracking, traffic tracking, and similar applications. The background subtraction (BS) approach is a frequently used method for moving object detection and tracking. Similar methods for this problem exist in the literature. This study proposes a more efficient method that adds to the existing ones. Based on a model produced using adaptive background subtraction (ABS), software for an object detection and tracking system is implemented. The performance of the developed system is tested experimentally with related video datasets. The experimental results and discussion are given in the study.
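
    The general idea of adaptive background subtraction can be sketched as follows. The abstract does not specify the adaptive model used in the study, so this sketch assumes a simple running-average background with a fixed difference threshold; the function names, `alpha`, and `threshold` are illustrative assumptions, not the study's parameters.

```python
def update_background(background, frame, alpha=0.05):
    """Blend the new frame into the background model (running average)."""
    return [[(1.0 - alpha) * b + alpha * f for b, f in zip(brow, frow)]
            for brow, frow in zip(background, frame)]


def foreground_mask(background, frame, threshold=25.0):
    """Mark pixels whose intensity differs from the background by more than threshold."""
    return [[abs(f - b) > threshold for b, f in zip(brow, frow)]
            for brow, frow in zip(background, frame)]


def detect_moving_pixels(frames, alpha=0.05, threshold=25.0):
    """Return one foreground mask per frame after the first.

    The first frame initialises the background; each later frame is
    compared with the current model and then used to update it.
    """
    background = [[float(v) for v in row] for row in frames[0]]
    masks = []
    for frame in frames[1:]:
        masks.append(foreground_mask(background, frame, threshold))
        background = update_background(background, frame, alpha)
    return masks
```

    Frames are plain lists of rows of grey-level values here. A small `alpha` makes the model adapt slowly, so a briefly passing object barely alters the background estimate.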

  12. Multi-modal highlight generation for sports videos using an information-theoretic excitability measure

    Science.gov (United States)

    Hasan, Taufiq; Bořil, Hynek; Sangwan, Abhijeet; Hansen, John H. L.

    2013-12-01

    The ability to detect and organize `hot spots' representing areas of excitement within video streams is a challenging research problem when techniques rely exclusively on video content. A generic method for sports video highlight selection is presented in this study which leverages both video/image structure as well as audio/speech properties. Processing begins where the video is partitioned into small segments and several multi-modal features are extracted from each segment. Excitability is computed based on the likelihood of the segmental features residing in certain regions of their joint probability density function space which are considered both exciting and rare. The proposed measure is used to rank order the partitioned segments to compress the overall video sequence and produce a contiguous set of highlights. Experiments are performed on baseball videos based on signal processing advancements for excitement assessment in the commentators' speech, audio energy, slow motion replay, scene cut density, and motion activity as features. Detailed analysis on correlation between user excitability and various speech production parameters is conducted and an effective scheme is designed to estimate the excitement level of commentator's speech from the sports videos. Subjective evaluation of excitability and ranking of video segments demonstrate a higher correlation with the proposed measure compared to well-established techniques indicating the effectiveness of the overall approach.

  13. Initial progress in the recording of crime scene simulations using 3D laser structured light imagery techniques for law enforcement and forensic applications

    Science.gov (United States)

    Altschuler, Bruce R.; Monson, Keith L.

    1998-03-01

    Representation of crime scenes as virtual reality 3D computer displays promises to become a useful and important tool for law enforcement evaluation and analysis, forensic identification and pathological study and archival presentation during court proceedings. Use of these methods for assessment of evidentiary materials demands complete accuracy of reproduction of the original scene, both in data collection and in its eventual virtual reality representation. The recording of spatially accurate information as soon as possible after first arrival of law enforcement personnel is advantageous for unstable or hazardous crime scenes and reduces the possibility that either inadvertent measurement error or deliberate falsification may occur or be alleged concerning processing of a scene. Detailed measurements and multimedia archiving of critical surface topographical details in a calibrated, uniform, consistent and standardized quantitative 3D coordinate method are needed. These methods would afford professional personnel in initial contact with a crime scene the means for remote, non-contacting, immediate, thorough and unequivocal documentation of the contents of the scene. Measurements of the relative and absolute global positions of objects and victims, and their dispositions within the scene before their relocation and detailed examination, could be made. Resolution must be sufficient to map both small and large objects. Equipment must be able to map regions at varied resolution as collected from different perspectives. Progress is presented in devising methods for collecting and archiving 3D spatial numerical data from crime scenes, sufficient for law enforcement needs, by remote laser structured light and video imagery. Two types of simulation studies were done. One study evaluated the potential of 3D topographic mapping and 3D telepresence using a robotic platform for explosive ordnance disassembly. The second study involved using the laser mapping system on a

  14. Preservice Teachers' Video Simulations and Subsequent Noticing: A Practice-Based Method to Prepare Mathematics Teachers

    Science.gov (United States)

    Amador, Julie M.

    2017-01-01

    The purpose of this study was to implement a Video Simulation Task in a mathematics methods teacher education course to engage preservice teachers in considering both the teaching and learning aspects of mathematics lesson delivery. Participants anticipated student and teacher thinking and created simulations, in which they acted out scenes on a…

  15. Video quality of 3G videophones for telephone cardiopulmonary resuscitation.

    Science.gov (United States)

    Tränkler, Uwe; Hagen, Oddvar; Horsch, Alexander

    2008-01-01

    We simulated a cardiopulmonary resuscitation (CPR) scene with a manikin and used two 3G videophones on the caller's side to transmit video to a laptop PC. Five observers (two doctors with experience in emergency medicine and three paramedics) evaluated the video. They judged whether the manikin was breathing and whether they would give advice for CPR; they also graded the confidence of their decision-making. Breathing was only visible from certain orientations of the videophones, at distances below 150 cm with good illumination and a still background. Since the phones produced a degradation in colours and shadows, detection of breathing mainly depended on moving contours. Low camera positioning produced better results than having the camera high up. Darkness, shaking of the camera and a moving background made detection of breathing almost impossible. The video from the two 3G videophones that were tested was of sufficient quality for telephone CPR provided that camera orientation, distance, illumination and background were carefully chosen. Thus it seems possible to use 3G videophones for emergency calls involving CPR. However, further studies on the required video quality in different scenarios are necessary.

  16. Brain activity and desire for Internet video game play.

    Science.gov (United States)

    Han, Doug Hyun; Bolo, Nicolas; Daniels, Melissa A; Arenella, Lynn; Lyoo, In Kyoon; Renshaw, Perry F

    2011-01-01

    Recent studies have suggested that the brain circuitry mediating cue-induced desire for video games is similar to that elicited by cues related to drugs and alcohol. We hypothesized that desire for Internet video games during cue presentation would activate similar brain regions to those that have been linked with craving for drugs or pathologic gambling. This study involved the acquisition of diagnostic magnetic resonance imaging and functional magnetic resonance imaging data from 19 healthy male adults (age, 18-23 years) following training and a standardized 10-day period of game play with a specified novel Internet video game, "War Rock" (K2 Network, Irvine, CA). Using segments of videotape consisting of 5 contiguous 90-second segments of alternating resting, matched control, and video game-related scenes, desire to play the game was assessed using a 7-point visual analogue scale before and after presentation of the videotape. In responding to Internet video game stimuli, compared with neutral control stimuli, significantly greater activity was identified in left inferior frontal gyrus, left parahippocampal gyrus, right and left parietal lobe, right and left thalamus, and right cerebellum (false discovery rate Internet video game showed significantly greater activity in right medial frontal lobe, right and left frontal precentral gyrus, right parietal postcentral gyrus, right parahippocampal gyrus, and left parietal precuneus gyrus. Controlling for total game time, reported desire for the Internet video game in the subjects who played more Internet video game was positively correlated with activation in right medial frontal lobe and right parahippocampal gyrus. The present findings suggest that cue-induced activation to Internet video game stimuli may be similar to that observed during cue presentation in persons with substance dependence or pathologic gambling. In particular, cues appear to commonly elicit activity in the dorsolateral prefrontal, orbitofrontal

  17. Neural Scene Segmentation by Oscillatory Correlation

    National Research Council Canada - National Science Library

    Wang, DeLiang

    2000-01-01

    The segmentation of a visual scene into a set of coherent patterns (objects) is a fundamental aspect of perception, which underlies a variety of important tasks such as figure/ground segregation, and scene analysis...

  18. A hierarchical inferential method for indoor scene classification

    Directory of Open Access Journals (Sweden)

    Jiang Jingzhe

    2017-12-01

    Full Text Available Indoor scene classification forms a basis for scene interaction for service robots. The task is challenging because the layout and decoration of a scene vary considerably. Previous studies on knowledge-based methods commonly ignore the importance of visual attributes when constructing the knowledge base. These shortcomings restrict the performance of classification. The structure of a semantic hierarchy was proposed to describe similarities of different parts of scenes in a fine-grained way. Besides the commonly used semantic features, visual attributes were also introduced to construct the knowledge base. Inspired by the processes of human cognition and the characteristics of indoor scenes, we proposed an inferential framework based on the Markov logic network. The framework is evaluated on a popular indoor scene dataset, and the experimental results demonstrate its effectiveness.

  19. Eye movements, visual search and scene memory, in an immersive virtual environment.

    Directory of Open Access Journals (Sweden)

    Dmitry Kit

    Full Text Available Visual memory has been demonstrated to play a role in both visual search and attentional prioritization in natural scenes. However, it has been studied predominantly in experimental paradigms using multiple two-dimensional images. Natural experience, however, entails prolonged immersion in a limited number of three-dimensional environments. The goal of the present experiment was to recreate circumstances comparable to natural visual experience in order to evaluate the role of scene memory in guiding eye movements in a natural environment. Subjects performed a continuous visual-search task within an immersive virtual-reality environment over three days. We found that, similar to two-dimensional contexts, viewers rapidly learn the location of objects in the environment over time, and use spatial memory to guide search. Incidental fixations did not provide obvious benefit to subsequent search, suggesting that semantic contextual cues may often be just as efficient, or that many incidentally fixated items are not held in memory in the absence of a specific task. On the third day of the experience in the environment, previous search items changed in color. These items were fixated upon with increased probability relative to control objects, suggesting that memory-guided prioritization (or Surprise) may be a robust mechanism for attracting gaze to novel features of natural environments, in addition to task factors and simple spatial saliency.

  20. Eye movements, visual search and scene memory, in an immersive virtual environment.

    Science.gov (United States)

    Kit, Dmitry; Katz, Leor; Sullivan, Brian; Snyder, Kat; Ballard, Dana; Hayhoe, Mary

    2014-01-01

    Visual memory has been demonstrated to play a role in both visual search and attentional prioritization in natural scenes. However, it has been studied predominantly in experimental paradigms using multiple two-dimensional images. Natural experience, however, entails prolonged immersion in a limited number of three-dimensional environments. The goal of the present experiment was to recreate circumstances comparable to natural visual experience in order to evaluate the role of scene memory in guiding eye movements in a natural environment. Subjects performed a continuous visual-search task within an immersive virtual-reality environment over three days. We found that, similar to two-dimensional contexts, viewers rapidly learn the location of objects in the environment over time, and use spatial memory to guide search. Incidental fixations did not provide obvious benefit to subsequent search, suggesting that semantic contextual cues may often be just as efficient, or that many incidentally fixated items are not held in memory in the absence of a specific task. On the third day of the experience in the environment, previous search items changed in color. These items were fixated upon with increased probability relative to control objects, suggesting that memory-guided prioritization (or Surprise) may be a robust mechanism for attracting gaze to novel features of natural environments, in addition to task factors and simple spatial saliency.

  1. Camera Motion and Surrounding Scene Appearance as Context for Action Recognition

    KAUST Repository

    Heilbron, Fabian Caba; Thabet, Ali Kassem; Niebles, Juan Carlos; Ghanem, Bernard

    2015-01-01

    This paper describes a framework for recognizing human actions in videos by incorporating a new set of visual cues that represent the context of the action. We develop a weak foreground-background segmentation approach in order to robustly extract not only foreground features that are focused on the actors, but also global camera motion and contextual scene information. Using dense point trajectories, our approach separates and describes the foreground motion from the background, represents the appearance of the extracted static background, and encodes the global camera motion that interestingly is shown to be discriminative for certain action classes. Our experiments on four challenging benchmarks (HMDB51, Hollywood2, Olympic Sports, and UCF50) show that our contextual features enable a significant performance improvement over state-of-the-art algorithms.

  2. Camera Motion and Surrounding Scene Appearance as Context for Action Recognition

    KAUST Repository

    Heilbron, Fabian Caba

    2015-04-17

    This paper describes a framework for recognizing human actions in videos by incorporating a new set of visual cues that represent the context of the action. We develop a weak foreground-background segmentation approach in order to robustly extract not only foreground features that are focused on the actors, but also global camera motion and contextual scene information. Using dense point trajectories, our approach separates and describes the foreground motion from the background, represents the appearance of the extracted static background, and encodes the global camera motion that interestingly is shown to be discriminative for certain action classes. Our experiments on four challenging benchmarks (HMDB51, Hollywood2, Olympic Sports, and UCF50) show that our contextual features enable a significant performance improvement over state-of-the-art algorithms.

  3. Saliency predicts change detection in pictures of natural scenes.

    Science.gov (United States)

    Wright, Michael J

    2005-01-01

    It has been proposed that the visual system encodes the salience of objects in the visual field in an explicit two-dimensional map that guides visual selective attention. Experiments were conducted to determine whether salience measurements applied to regions of pictures of outdoor scenes could predict the detection of changes in those regions. To obtain a quantitative measure of change detection, observers located changes in pairs of colour pictures presented across an interstimulus interval (ISI). Salience measurements were then obtained from different observers for image change regions using three independent methods, and all were positively correlated with change detection. Factor analysis extracted a single saliency factor that accounted for 62% of the variance contained in the four measures. Finally, estimates of the magnitude of the image change in each picture pair were obtained, using nine separate visual filters representing low-level vision features (luminance, colour, spatial frequency, orientation, edge density). None of the feature outputs was significantly associated with change detection or saliency. On the other hand it was shown that high-level (structural) properties of the changed region were related to saliency and to change detection: objects were more salient than shadows and more detectable when changed.

  4. Group tele-immersion: enabling natural interactions between groups at distant sites.

    Energy Technology Data Exchange (ETDEWEB)

    Yang, Christine L. (Sandia National Laboratories, Livermore, CA); Stewart, Corbin (Sandia National Laboratories, Livermore, CA); Nashel, Andrew (University of North Carolina at Chapel Hill, Chapel Hill, NC)

    2005-08-01

    We present techniques and a system for synthesizing views for video teleconferencing between small groups. In place of replicating one-to-one systems for each pair of users, we create a single unified display of the remote group. Instead of performing dense 3D scene computation, we use more cameras and trade off storage and hardware for computation. While it is expensive to directly capture a scene from all possible viewpoints, we have observed that the participants' viewpoints usually remain at a constant height (eye level) during video teleconferencing. Therefore, we can restrict the possible viewpoints to lie within a virtual plane without sacrificing much of the realism, and in doing so we significantly reduce the number of required cameras. Based on this observation, we have developed a technique that uses light-field style rendering to guarantee the quality of the synthesized views, using a linear array of cameras with a life-sized, projected display. Our full-duplex prototype system between Sandia National Laboratories, California and the University of North Carolina at Chapel Hill has been able to synthesize photo-realistic views at interactive rates, and has been used to video conference during regular meetings between the sites.

  5. Evaluating Perceived Naturalness of Facial Expression After Fillers to the Nasolabial Folds and Lower Face With Standardized Video and Photography.

    Science.gov (United States)

    Philipp-Dormston, Wolfgang G; Wong, Cindy; Schuster, Bernd; Larsson, Markus K; Podda, Maurizio

    2018-06-01

    Hyaluronic acid (HA) fillers are commonly used in treating facial wrinkles and folds but have not been studied with standardized methodology to include assessment of standard facial expressions. To assess perceived naturalness of facial expression after treatment with 2 HA fillers manufactured with XpresHAn Technology (also known as Optimal Balance Technology). Treatment was directed to the nasolabial folds (NLFs) and at least 1 additional lower face wrinkle or fold. Maintenance of naturalness, attractiveness, and age at 1 month after optimal treatment were assessed using video recordings and photographs capturing different facial animations. Global aesthetic improvement, subjects' satisfaction, and safety were also evaluated. The treatment was well tolerated. Naturalness of facial expression in motion was determined to be at least maintained in 95% of subjects. Attractiveness was enhanced in 89% of subjects and 79% of subjects were considered to look younger. Most subjects assessed their aesthetic appearance as improved and were satisfied with their treatment. Naturalness and attractiveness can be assessed using video recordings and photographs capturing different facial animations. XpresHAn Technology HA filler treatments create natural-looking results with high subject satisfaction.

  6. Visual search for arbitrary objects in real scenes

    Science.gov (United States)

    Alvarez, George A.; Rosenholtz, Ruth; Kuzmova, Yoana I.; Sherman, Ashley M.

    2011-01-01

    How efficient is visual search in real scenes? In searches for targets among arrays of randomly placed distractors, efficiency is often indexed by the slope of the reaction time (RT) × Set Size function. However, it may be impossible to define set size for real scenes. As an approximation, we hand-labeled 100 indoor scenes and used the number of labeled regions as a surrogate for set size. In Experiment 1, observers searched for named objects (a chair, bowl, etc.). With set size defined as the number of labeled regions, search was very efficient (~5 ms/item). When we controlled for a possible guessing strategy in Experiment 2, slopes increased somewhat (~15 ms/item), but they were much shallower than search for a random object among other distinctive objects outside of a scene setting (Exp. 3: ~40 ms/item). In Experiments 4–6, observers searched repeatedly through the same scene for different objects. Increased familiarity with scenes had modest effects on RTs, while repetition of target items had large effects (>500 ms). We propose that visual search in scenes is efficient because scene-specific forms of attentional guidance can eliminate most regions from the “functional set size” of items that could possibly be the target. PMID:21671156

  7. Visual search for arbitrary objects in real scenes.

    Science.gov (United States)

    Wolfe, Jeremy M; Alvarez, George A; Rosenholtz, Ruth; Kuzmova, Yoana I; Sherman, Ashley M

    2011-08-01

    How efficient is visual search in real scenes? In searches for targets among arrays of randomly placed distractors, efficiency is often indexed by the slope of the reaction time (RT) × Set Size function. However, it may be impossible to define set size for real scenes. As an approximation, we hand-labeled 100 indoor scenes and used the number of labeled regions as a surrogate for set size. In Experiment 1, observers searched for named objects (a chair, bowl, etc.). With set size defined as the number of labeled regions, search was very efficient (~5 ms/item). When we controlled for a possible guessing strategy in Experiment 2, slopes increased somewhat (~15 ms/item), but they were much shallower than search for a random object among other distinctive objects outside of a scene setting (Exp. 3: ~40 ms/item). In Experiments 4-6, observers searched repeatedly through the same scene for different objects. Increased familiarity with scenes had modest effects on RTs, while repetition of target items had large effects (>500 ms). We propose that visual search in scenes is efficient because scene-specific forms of attentional guidance can eliminate most regions from the "functional set size" of items that could possibly be the target.
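
    The efficiency index used in this abstract, the slope of the reaction time (RT) × Set Size function, can be illustrated with an ordinary least-squares fit. The following is a minimal sketch, not the authors' analysis code: the function name `search_slope` and the sample data are hypothetical, and the fit assumes one mean RT per set size.

```python
def search_slope(set_sizes, rts_ms):
    """Fit RT = intercept + slope * set_size by ordinary least squares.

    Returns (slope in ms/item, intercept in ms). Hypothetical helper for
    illustration; the study's actual fitting procedure is not given.
    """
    n = len(set_sizes)
    if n != len(rts_ms) or n < 2:
        raise ValueError("need at least two (set size, RT) pairs of equal length")
    mean_x = sum(set_sizes) / n
    mean_y = sum(rts_ms) / n
    # Sum of squared deviations of set size, and cross-deviations with RT.
    sxx = sum((x - mean_x) ** 2 for x in set_sizes)
    if sxx == 0:
        raise ValueError("set sizes must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(set_sizes, rts_ms))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up data: number of labeled regions per scene and mean RT in ms.
slope, intercept = search_slope([10, 20, 30, 40], [650, 700, 750, 800])
print(slope, intercept)  # 5.0 600.0
```

    With these invented numbers the slope is 5 ms/item, the order of magnitude the abstract reports for Experiment 1; a steeper slope would indicate less efficient search.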

  8. Architecture and Protocol of a Semantic System Designed for Video Tagging with Sensor Data in Mobile Devices

    Science.gov (United States)

    Macias, Elsa; Lloret, Jaime; Suarez, Alvaro; Garcia, Miguel

    2012-01-01

    Current mobile phones come with several sensors and powerful video cameras. These video cameras can be used to capture good quality scenes, which can be complemented with the information gathered by the sensors also embedded in the phones. For example, the surroundings of a beach recorded by the camera of the mobile phone, jointly with the temperature of the site can let users know via the Internet if the weather is nice enough to swim. In this paper, we present a system that tags the video frames of the video recorded from mobile phones with the data collected by the embedded sensors. The tagged video is uploaded to a video server, which is placed on the Internet and is accessible by any user. The proposed system uses a semantic approach with the stored information in order to make easy and efficient video searches. Our experimental results show that it is possible to tag video frames in real time and send the tagged video to the server with very low packet delay variations. As far as we know there is not any other application developed as the one presented in this paper. PMID:22438753

  9. Architecture and Protocol of a Semantic System Designed for Video Tagging with Sensor Data in Mobile Devices

    Directory of Open Access Journals (Sweden)

    Alvaro Suarez

    2012-02-01

    Full Text Available Current mobile phones come with several sensors and powerful video cameras. These video cameras can be used to capture good quality scenes, which can be complemented with the information gathered by the sensors also embedded in the phones. For example, the surroundings of a beach recorded by the camera of the mobile phone, jointly with the temperature of the site can let users know via the Internet if the weather is nice enough to swim. In this paper, we present a system that tags the video frames of the video recorded from mobile phones with the data collected by the embedded sensors. The tagged video is uploaded to a video server, which is placed on the Internet and is accessible by any user. The proposed system uses a semantic approach with the stored information in order to make easy and efficient video searches. Our experimental results show that it is possible to tag video frames in real time and send the tagged video to the server with very low packet delay variations. As far as we know there is not any other application developed as the one presented in this paper.

  10. Architecture and protocol of a semantic system designed for video tagging with sensor data in mobile devices.

    Science.gov (United States)

    Macias, Elsa; Lloret, Jaime; Suarez, Alvaro; Garcia, Miguel

    2012-01-01

    Current mobile phones come with several sensors and powerful video cameras. These video cameras can be used to capture good quality scenes, which can be complemented with the information gathered by the sensors also embedded in the phones. For example, the surroundings of a beach recorded by the camera of the mobile phone, jointly with the temperature of the site can let users know via the Internet if the weather is nice enough to swim. In this paper, we present a system that tags the video frames of the video recorded from mobile phones with the data collected by the embedded sensors. The tagged video is uploaded to a video server, which is placed on the Internet and is accessible by any user. The proposed system uses a semantic approach with the stored information in order to make easy and efficient video searches. Our experimental results show that it is possible to tag video frames in real time and send the tagged video to the server with very low packet delay variations. As far as we know there is not any other application developed as the one presented in this paper.

  11. Fish assemblages associated with natural and anthropogenically-modified habitats in a marine embayment: comparison of baited videos and opera-house traps.

    Directory of Open Access Journals (Sweden)

    Corey B Wakefield

    Full Text Available Marine embayments and estuaries play an important role in the ecology and life history of many fish species. Cockburn Sound is one of a relative paucity of marine embayments on the west coast of Australia. Its sheltered waters and close proximity to a capital city have resulted in anthropogenic intrusion and extensive seascape modification. This study aimed to compare the sampling efficiencies of baited videos and fish traps in determining the relative abundance and diversity of temperate demersal fish species associated with naturally occurring (seagrass, limestone outcrops and soft sediment) and modified (rockwall and dredge channel) habitats in Cockburn Sound. Baited videos sampled a greater range of species in higher total and mean abundances than fish traps. This larger amount of data collected by baited videos allowed for greater discrimination of fish assemblages between habitats. The markedly higher diversity and abundances of fish associated with seagrass and limestone outcrops, and the fact that these habitats are very limited within Cockburn Sound, suggests they play an important role in the fish ecology of this embayment. Fish assemblages associated with modified habitats comprised a subset of species in lower abundances when compared to natural habitats with similar physical characteristics. This suggests modified habitats may not have provided the necessary resource requirements (e.g. shelter and/or diet) for some species, resulting in alterations to the natural trophic structure and interspecific interactions. Baited videos provided a more efficient and non-extractive method for comparing fish assemblages and habitat associations of smaller bodied species and juveniles in a turbid environment.

  12. Effects of varying presentation time on long-term recognition memory for scenes: Verbatim and gist representations.

    Science.gov (United States)

    Ahmad, Fahad N; Moscovitch, Morris; Hockley, William E

    2017-04-01

    Konkle, Brady, Alvarez and Oliva (Psychological Science, 21, 1551-1556, 2010) showed that participants have an exceptional long-term memory (LTM) for photographs of scenes. We examined to what extent participants' exceptional LTM for scenes is determined by presentation time during encoding. In addition, at retrieval, we varied the nature of the lures in a forced-choice recognition task so that they resembled the target in gist (i.e., global or categorical) information, but were distinct in verbatim information (e.g., an "old" beach scene and a similar "new" beach scene; exemplar condition) or vice versa (e.g., a beach scene and a new scene from a novel category; novel condition). In Experiment 1, half of the list of scenes was presented for 1 s, whereas the other half was presented for 4 s. We found lower performance for shorter study presentation time in the exemplar test condition and similar performance for both study presentation times in the novel test condition. In Experiment 2, participants showed similar performance in an exemplar test for which the lure was of a different category but a category that was used at study. In Experiment 3, when presentation time was lowered to 500 ms, recognition accuracy was reduced in both novel and exemplar test conditions. A less detailed memorial representation of the studied scene containing more gist (i.e., meaning) than verbatim (i.e., surface or perceptual details) information is retrieved from LTM after a short compared to a long study presentation time. We conclude that our findings support fuzzy-trace theory.

  13. Combating bad weather part I rain removal from video

    CERN Document Server

    Mukhopadhyay, Sudipta

    2015-01-01

    Current vision systems are designed to perform in normal weather conditions. However, no one can escape from severe weather conditions. Bad weather reduces scene contrast and visibility, which results in degradation in the performance of various computer vision algorithms such as object tracking, segmentation and recognition. Thus, current vision systems must include some mechanisms that enable them to perform up to the mark in bad weather conditions such as rain and fog. Rain causes spatial and temporal intensity variations in images or video frames. These intensity changes are due to the

  14. Semantic guidance of eye movements in real-world scenes.

    Science.gov (United States)

    Hwang, Alex D; Wang, Hsueh-Cheng; Pomplun, Marc

    2011-05-25

    The perception of objects in our visual world is influenced by not only their low-level visual features such as shape and color, but also their high-level features such as meaning and semantic relations among them. While it has been shown that low-level features in real-world scenes guide eye movements during scene inspection and search, the influence of semantic similarity among scene objects on eye movements in such situations has not been investigated. Here we study guidance of eye movements by semantic similarity among objects during real-world scene inspection and search. By selecting scenes from the LabelMe object-annotated image database and applying latent semantic analysis (LSA) to the object labels, we generated semantic saliency maps of real-world scenes based on the semantic similarity of scene objects to the currently fixated object or the search target. An ROC analysis of these maps as predictors of subjects' gaze transitions between objects during scene inspection revealed a preference for transitions to objects that were semantically similar to the currently inspected one. Furthermore, during the course of a scene search, subjects' eye movements were progressively guided toward objects that were semantically similar to the search target. These findings demonstrate substantial semantic guidance of eye movements in real-world scenes and show its importance for understanding real-world attentional control. Copyright © 2011 Elsevier Ltd. All rights reserved.
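
    The idea of scoring scene objects by their semantic similarity to the currently fixated object can be sketched with cosine similarity between label vectors. This is an illustration under stated assumptions, not the authors' pipeline: the vectors below are invented stand-ins for LSA embeddings of LabelMe labels, and the helper names `cosine` and `semantic_saliency` are hypothetical. The published method additionally renders such scores into image-space saliency maps, which this sketch omits.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def semantic_saliency(object_vectors, reference_label):
    """Score every other object by similarity to the reference object.

    object_vectors: dict mapping object label -> semantic vector
    (assumed to come from some LSA-like model; hypothetical here).
    """
    ref = object_vectors[reference_label]
    return {
        label: cosine(vec, ref)
        for label, vec in object_vectors.items()
        if label != reference_label
    }


# Invented 2-D "semantic" vectors, for illustration only.
vectors = {"fork": [1.0, 0.0], "knife": [0.8, 0.6], "tree": [0.0, 1.0]}
scores = semantic_saliency(vectors, "fork")
# Objects ranked from most to least semantically similar to the fixated one.
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

    Under the abstract's finding, gaze transitions would be expected to favor the higher-scoring objects.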

  15. Brain activity and desire for internet video game play

    Science.gov (United States)

    Han, Doug Hyun; Bolo, Nicolas; Daniels, Melissa A.; Arenella, Lynn; Lyoo, In Kyoon; Renshaw, Perry F.

    2010-01-01

    Objective Recent studies have suggested that the brain circuitry mediating cue induced desire for video games is similar to that elicited by cues related to drugs and alcohol. We hypothesized that desire for internet video games during cue presentation would activate similar brain regions to those which have been linked with craving for drugs or pathological gambling. Methods This study involved the acquisition of diagnostic MRI and fMRI data from 19 healthy male adults (ages 18–23 years) following training and a standardized 10-day period of game play with a specified novel internet video game, “War Rock” (K-network®). Using segments of videotape consisting of five contiguous 90-second segments of alternating resting, matched control and video game-related scenes, desire to play the game was assessed using a seven point visual analogue scale before and after presentation of the videotape. Results In responding to internet video game stimuli, compared to neutral control stimuli, significantly greater activity was identified in left inferior frontal gyrus, left parahippocampal gyrus, right and left parietal lobe, right and left thalamus, and right cerebellum (FDR video game (MIGP) cohort showed significantly greater activity in right medial frontal lobe, right and left frontal pre-central gyrus, right parietal post-central gyrus, right parahippocampal gyrus, and left parietal precuneus gyrus. Controlling for total game time, reported desire for the internet video game in the MIGP cohort was positively correlated with activation in right medial frontal lobe and right parahippocampal gyrus. Discussion The present findings suggest that cue-induced activation to internet video game stimuli may be similar to that observed during cue presentation in persons with substance dependence or pathological gambling. In particular, cues appear to commonly elicit activity in the dorsolateral prefrontal, orbitofrontal cortex, parahippocampal gyrus, and thalamus. PMID:21220070

  16. Superpixel-Based Feature for Aerial Image Scene Recognition

    Directory of Open Access Journals (Sweden)

    Hongguang Li

    2018-01-01

    Full Text Available Image scene recognition is a core technology for many aerial remote sensing applications. Different landforms are inputted as different scenes in aerial imaging, and all landform information is regarded as valuable for aerial image scene recognition. However, the conventional features of the Bag-of-Words model are designed using local points or other related information and thus are unable to fully describe landform areas. This limitation cannot be ignored when the aim is to ensure accurate aerial scene recognition. A novel superpixel-based feature is proposed in this study to characterize aerial image scenes. Then, based on the proposed feature, a scene recognition method of the Bag-of-Words model for aerial imaging is designed. The proposed superpixel-based feature that utilizes landform information establishes top-task superpixel extraction of landforms to bottom-task expression of feature vectors. This characterization technique comprises the following steps: simple linear iterative clustering based superpixel segmentation, adaptive filter bank construction, Lie group-based feature quantification, and visual saliency model-based feature weighting. Experiments of image scene recognition are carried out using real image data captured by an unmanned aerial vehicle (UAV). The recognition accuracy of the proposed superpixel-based feature is 95.1%, which is higher than those of scene recognition algorithms based on other local features.

  17. Statistics of natural binaural sounds.

    Directory of Open Access Journals (Sweden)

    Wiktor Młynarski

    Full Text Available Binaural sound localization is usually considered a discrimination task, where interaural phase (IPD) and level (ILD) disparities at narrowly tuned frequency channels are utilized to identify a position of a sound source. In natural conditions however, binaural circuits are exposed to a stimulation by sound waves originating from multiple, often moving and overlapping sources. Therefore statistics of binaural cues depend on acoustic properties and the spatial configuration of the environment. Distribution of cues encountered naturally and their dependence on physical properties of an auditory scene have not been studied before. In the present work we analyzed statistics of naturally encountered binaural sounds. We performed binaural recordings of three auditory scenes with varying spatial configuration and analyzed empirical cue distributions from each scene. We have found that certain properties such as the spread of IPD distributions as well as an overall shape of ILD distributions do not vary strongly between different auditory scenes. Moreover, we found that ILD distributions vary much weaker across frequency channels and IPDs often attain much higher values, than can be predicted from head filtering properties. In order to understand the complexity of the binaural hearing task in the natural environment, sound waveforms were analyzed by performing Independent Component Analysis (ICA). Properties of learned basis functions indicate that in natural conditions soundwaves in each ear are predominantly generated by independent sources. This implies that the real-world sound localization must rely on mechanisms more complex than a mere cue extraction.

  18. Statistics of natural binaural sounds.

    Science.gov (United States)

    Młynarski, Wiktor; Jost, Jürgen

    2014-01-01

    Binaural sound localization is usually considered a discrimination task, where interaural phase (IPD) and level (ILD) disparities at narrowly tuned frequency channels are utilized to identify a position of a sound source. In natural conditions however, binaural circuits are exposed to a stimulation by sound waves originating from multiple, often moving and overlapping sources. Therefore statistics of binaural cues depend on acoustic properties and the spatial configuration of the environment. Distribution of cues encountered naturally and their dependence on physical properties of an auditory scene have not been studied before. In the present work we analyzed statistics of naturally encountered binaural sounds. We performed binaural recordings of three auditory scenes with varying spatial configuration and analyzed empirical cue distributions from each scene. We have found that certain properties such as the spread of IPD distributions as well as an overall shape of ILD distributions do not vary strongly between different auditory scenes. Moreover, we found that ILD distributions vary much weaker across frequency channels and IPDs often attain much higher values, than can be predicted from head filtering properties. In order to understand the complexity of the binaural hearing task in the natural environment, sound waveforms were analyzed by performing Independent Component Analysis (ICA). Properties of learned basis functions indicate that in natural conditions soundwaves in each ear are predominantly generated by independent sources. This implies that the real-world sound localization must rely on mechanisms more complex than a mere cue extraction.

  19. Online Video as a Marketing Tool : A quantitative survey on video marketing habits

    OpenAIRE

    Boman, Kalle; Raijonkari, Kalle

    2017-01-01

    The rapid development of high-speed mobile networks and mobile device technology has led to an immense growth of online video content. As consumers spend more and more time with online video, marketing of goods and services has naturally caught up with the medium. The aim of the research was to examine the online video marketing habits and attitudes of small and medium-sized enterprises in Jyväskylä for RecOn Productions Oy, a local audiovisual production company. The findings of the res...

  20. Spherical rotation orientation indication for HEVC and JEM coding of 360 degree video

    Science.gov (United States)

    Boyce, Jill; Xu, Qian

    2017-09-01

    Omnidirectional (or "360 degree") video, representing a panoramic view of a spherical 360° × 180° scene, can be encoded using conventional video compression standards, once it has been projection mapped to a 2D rectangular format. Equirectangular projection format is currently used for mapping 360 degree video to a rectangular representation for coding using HEVC/JEM. However, video in the top and bottom regions of the image, corresponding to the "north pole" and "south pole" of the spherical representation, is significantly warped. We propose to perform spherical rotation of the input video prior to HEVC/JEM encoding in order to improve the coding efficiency, and to signal parameters in a supplemental enhancement information (SEI) message that describe the inverse rotation process recommended to be applied following HEVC/JEM decoding, prior to display. Experimental results show that up to 17.8% bitrate gain (using the WS-PSNR end-to-end metric) can be achieved for the Chairlift sequence using HM16.15 and 11.9% gain using JEM6.0, and an average gain of 2.9% for HM16.15 and 2.2% for JEM6.0.
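
    The spherical rotation described here can be sketched as a coordinate mapping on the equirectangular grid: a pixel is converted to longitude and latitude, lifted to a unit vector, rotated, and mapped back. The sketch below is an assumption-laden illustration rather than the authors' implementation. The yaw/pitch/roll order (Rz·Ry·Rx), the pixel-centre convention, and all function names are hypothetical, and the SEI message syntax is not modelled.

```python
import math


def rotation_matrix(yaw, pitch, roll):
    """3x3 matrix R = Rz(yaw) * Ry(pitch) * Rx(roll); angles in radians.

    The composition order is an assumption made for this sketch.
    """
    cy, sy = math.cos(yaw), math.sin(yaw)
    cp, sp = math.cos(pitch), math.sin(pitch)
    cr, sr = math.cos(roll), math.sin(roll)
    return [
        [cy * cp, cy * sp * sr - sy * cr, cy * sp * cr + sy * sr],
        [sy * cp, sy * sp * sr + cy * cr, sy * sp * cr - cy * sr],
        [-sp, cp * sr, cp * cr],
    ]


def pixel_to_lonlat(u, v, width, height):
    """Equirectangular pixel centre -> (longitude, latitude) in radians."""
    lon = (u + 0.5) / width * 2.0 * math.pi - math.pi
    lat = math.pi / 2.0 - (v + 0.5) / height * math.pi
    return lon, lat


def lonlat_to_pixel(lon, lat, width, height):
    """Inverse of pixel_to_lonlat; returns fractional pixel coordinates."""
    u = (lon + math.pi) / (2.0 * math.pi) * width - 0.5
    v = (math.pi / 2.0 - lat) / math.pi * height - 0.5
    return u, v


def rotate_lonlat(lon, lat, rot):
    """Rotate a point on the unit sphere given as (lon, lat)."""
    x = math.cos(lat) * math.cos(lon)
    y = math.cos(lat) * math.sin(lon)
    z = math.sin(lat)
    xr = rot[0][0] * x + rot[0][1] * y + rot[0][2] * z
    yr = rot[1][0] * x + rot[1][1] * y + rot[1][2] * z
    zr = rot[2][0] * x + rot[2][1] * y + rot[2][2] * z
    # Clamp against rounding error before asin.
    zr = max(-1.0, min(1.0, zr))
    return math.atan2(yr, xr), math.asin(zr)


# A point on the equator is moved to the north pole by a -90 degree pitch,
# showing how rotation changes which content lands in the warped polar rows.
lon, lat = rotate_lonlat(0.0, 0.0, rotation_matrix(0.0, -math.pi / 2.0, 0.0))
print(lat)
```

    To resample a whole frame, an encoder-side tool would typically loop over output pixels, apply the inverse rotation to find the source position, and interpolate; a decoder would apply the signalled inverse rotation the same way before display.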

  1. Videos for Science Communication and Nature Interpretation: The TIB|AV-Portal as Resource.

    Science.gov (United States)

    Marín Arraiza, Paloma; Plank, Margret; Löwe, Peter

    2016-04-01

    Scientific audiovisual media such as videos of research, interactive displays or computer animations have become an important part of scientific communication and education. Dynamic phenomena can be described better by audiovisual media than by words and pictures. For this reason, scientific videos help us to understand and discuss environmental phenomena more efficiently. Moreover, the creation of scientific videos is easier than ever, thanks to mobile devices and open source editing software. Video clips, webinars or even the interactive part of a PICO are formats of scientific audiovisual media used in the Geosciences. This type of media translates location-referenced Science Communication, such as environmental interpretation, into computer-based Science Communication. A new way of Science Communication is video abstracting. A video abstract is a three- to five-minute video statement that provides background information about a research paper. It also gives authors the opportunity to present their research activities to a wider audience. Since media of this kind have become an important part of scientific communication, there is a need for reliable infrastructures capable of managing the digital assets researchers generate. Taking video abstracts as a reference use case, this paper gives an overview of the activities of the German National Library of Science and Technology (TIB) regarding publishing and linking audiovisual media in a scientifically sound way. TIB, in cooperation with the Hasso Plattner Institute (HPI), developed a web-based portal (av.tib.eu) that optimises access to scientific videos in the fields of science and technology. Videos from the realms of science and technology can easily be uploaded onto the TIB|AV Portal. Within a short period of time the videos are assigned a digital object identifier (DOI). This enables them to be referenced, cited, and linked (e.g. to the

  2. Multi- and hyperspectral scene modeling

    Science.gov (United States)

    Borel, Christoph C.; Tuttle, Ronald F.

    2011-06-01

    This paper shows how to use a public domain raytracer, POV-Ray (Persistence Of Vision Raytracer), to render multi- and hyper-spectral scenes. The scripting environment allows automatic changing of the reflectance and transmittance parameters. The radiosity rendering mode allows accurate simulation of multiple reflections between surfaces and also allows semi-transparent surfaces such as plant leaves. We show that POV-Ray computes occlusion accurately using a test scene with two blocks under a uniform sky. A complex scene representing a plant canopy is generated using a few lines of script. With appropriate rendering settings, shadows cast by leaves are rendered in many bands. Comparing single and multiple reflection renderings, the effect of multiple reflections is clearly visible and accounts for 25% of the overall apparent canopy reflectance in the near infrared.

  3. A generic flexible and robust approach for intelligent real-time video-surveillance systems

    Science.gov (United States)

    Desurmont, Xavier; Delaigle, Jean-Francois; Bastide, Arnaud; Macq, Benoit

    2004-05-01

    In this article we present a generic, flexible and robust approach for an intelligent real-time video-surveillance system. A previous version of the system was presented in [1]. The goal of these advanced tools is to help operators by detecting events of interest in visual scenes, highlighting alarms and computing statistics. The proposed system is a multi-camera platform able to handle different standards of video inputs (composite, IP, IEEE1394) and which can compress (MPEG4), store and display them. This platform also integrates advanced video analysis tools, such as motion detection, segmentation, tracking and interpretation. The design of the architecture is optimised to play back, display, and process video flows efficiently for video-surveillance applications. The implementation is distributed on a scalable computer cluster based on Linux and an IP network. It relies on POSIX threads for multitasking scheduling. Data flows are transmitted between the different modules using multicast technology and under control of a TCP-based command network (e.g. for bandwidth occupation control). We report some results and show the potential use of such a flexible system in third generation video surveillance systems. We illustrate the interest of the system in a real case study, namely indoor surveillance.

  4. Ambient visual information confers a context-specific, long-term benefit on memory for haptic scenes.

    Science.gov (United States)

    Pasqualotto, Achille; Finucane, Ciara M; Newell, Fiona N

    2013-09-01

    We investigated the effects of indirect, ambient visual information on haptic spatial memory. Using touch only, participants first learned an array of objects arranged in a scene and were subsequently tested on their recognition of that scene which was always hidden from view. During haptic scene exploration, participants could either see the surrounding room or were blindfolded. We found a benefit in haptic memory performance only when ambient visual information was available in the early stages of the task but not when participants were initially blindfolded. Specifically, when ambient visual information was available a benefit on performance was found in a subsequent block of trials during which the participant was blindfolded (Experiment 1), and persisted over a delay of one week (Experiment 2). However, we found that the benefit for ambient visual information did not transfer to a novel environment (Experiment 3). In Experiment 4 we further investigated the nature of the visual information that improved haptic memory and found that geometric information about a surrounding (virtual) room rather than isolated object landmarks, facilitated haptic scene memory. Our results suggest that vision improves haptic memory for scenes by providing an environment-centred, allocentric reference frame for representing object location through touch. Copyright © 2013 Elsevier B.V. All rights reserved.

  5. Research on 3D virtual campus scene modeling based on 3ds Max and VRML

    Science.gov (United States)

    Kang, Chuanli; Zhou, Yanliu; Liang, Xianyue

    2015-12-01

    With the rapid development of modern technology, digital information management and virtual reality simulation technology have become research hotspots. A 3D virtual campus model can not only represent real-world objects naturally, realistically and vividly, but can also extend the campus in time and space and combine the school environment with information. This paper mainly uses 3ds Max to create three-dimensional models of campus buildings, special land areas, etc. The dynamic interactive functions are then realized by programming the object models built in 3ds Max with VRML. The research focuses on virtual campus scene modeling technology and VRML scene design, and on optimization strategies for the various real-time processing techniques used in the scene design process. The approach preserves the image quality of texture maps while improving the running speed of texture mapping. According to the features and architecture of Guilin University of Technology, 3ds Max, AutoCAD and VRML were used to model the different objects of the virtual campus. Finally, the resulting virtual campus scene is summarized.

  6. Primal scene derivatives in the work of Yukio Mishima: the primal scene fantasy.

    Science.gov (United States)

    Turco, Ronald N

    2002-01-01

    This article discusses the preoccupation with fire, revenge, crucifixion, and other fantasies as they relate to the primal scene. The manifestations of these fantasies are demonstrated in a work of fiction by Yukio Mishima, The Temple of the Golden Pavilion. As is the case in other writings of Mishima, there is a fusion of aggressive and libidinal drives and a preoccupation with death. The primal scene is directly connected with pyromania and destructive "acting out" of fantasies. This article is timely with regard to understanding contemporary events of cultural and national destruction.

  7. Emotional and neutral scenes in competition: orienting, efficiency, and identification.

    Science.gov (United States)

    Calvo, Manuel G; Nummenmaa, Lauri; Hyönä, Jukka

    2007-12-01

    To investigate preferential processing of emotional scenes competing for limited attentional resources with neutral scenes, prime pictures were presented briefly (450 ms), peripherally (5.2 degrees away from fixation), and simultaneously (one emotional and one neutral scene) versus singly. Primes were followed by a mask and a probe for recognition. Hit rate was higher for emotional than for neutral scenes in the dual- but not in the single-prime condition, and A' sensitivity decreased for neutral but not for emotional scenes in the dual-prime condition. This preferential processing involved both selective orienting and efficient encoding, as revealed, respectively, by a higher probability of first fixation on--and shorter saccade latencies to--emotional scenes and by shorter fixation time needed to accurately identify emotional scenes, in comparison with neutral scenes.
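
    The A' sensitivity index reported here is a nonparametric alternative to d' computed from the hit rate and the false-alarm rate. The abstract does not say which variant was used, so the sketch below assumes the commonly cited Pollack and Norman / Grier formulation; the function name and the example rates are illustrative only.

```python
def a_prime(hit_rate, false_alarm_rate):
    """Nonparametric sensitivity A' (assumed Grier-style formula); 0.5 is chance."""
    h, f = hit_rate, false_alarm_rate
    if h == f:
        return 0.5
    if h > f:
        return 0.5 + ((h - f) * (1.0 + h - f)) / (4.0 * h * (1.0 - f))
    # Below-chance performance: mirror the formula around 0.5.
    return 0.5 - ((f - h) * (1.0 + f - h)) / (4.0 * f * (1.0 - h))


# Hypothetical rates, not data from the study.
example = a_prime(0.8, 0.2)
chance = a_prime(0.5, 0.5)
```

    A value of 1.0 indicates perfect discrimination and 0.5 indicates chance, so a drop in A' for neutral scenes in the dual-prime condition means they were discriminated less well, independent of response bias.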

  8. Employing Inquiry-Based Computer Simulations and Embedded Scientist Videos To Teach Challenging Climate Change and Nature of Science Concepts

    Science.gov (United States)

    Cohen, E.

    2013-12-01

    Design based research was utilized to investigate how students use a greenhouse effect simulation in order to derive best learning practices. During this process, students recognized the authentic scientific process involving computer simulations. The simulation used is embedded within an inquiry-based technology-mediated science curriculum known as Web-based Inquiry Science Environment (WISE). For this research, students from a suburban, diverse, middle school setting use the simulations as part of a two week-long class unit on climate change. A pilot study was conducted during phase one of the research that informed phase two, which encompasses the dissertation. During the pilot study, as students worked through the simulation, evidence of shifts in student motivation, understanding of science content, and ideas about the nature of science became present using a combination of student interviews, focus groups, and students' conversations. Outcomes of the pilot study included improvements to the pedagogical approach. Allowing students to do 'Extreme Testing' (e.g., making the world as hot or cold as possible) and increasing the time for free exploration of the simulation are improvements made as a result of the findings of the pilot study. In the dissertation (phase two of the research design) these findings were implemented in a new curriculum scaled for 85 new students from the same school during the next school year. The modifications included new components implementing simulations as an assessment tool for all students and embedded modeling tools. All students were asked to build pre and post models, however due to technological constraints these were not an effective tool. 
A non-video group of 44 students was established and another group of 41 video students had a WISE curriculum which included twelve minutes of scientists' conversational videos referencing explicit aspects on the nature of science, specifically the use of models and simulations in science

  9. Suspiciousness perception in dynamic scenes: a comparison of CCTV operators and novices.

    Directory of Open Access Journals (Sweden)

    Christina Jayne Howard

    2013-08-01

    Full Text Available Perception of scenes has typically been investigated by using static or simplified visual displays. How attention is used to perceive and evaluate dynamic, realistic scenes is more poorly understood, in part due to the problem of comparing eye fixations to moving stimuli across observers. When the task and stimulus are common across observers, consistent fixation location can indicate that that region has high goal-based relevance. Here we investigated these issues when an observer has a specific, and naturalistic, task: closed-circuit television (CCTV) monitoring. We concurrently recorded eye movements and ratings of perceived suspiciousness as different observers watched the same set of clips from real CCTV footage. Trained CCTV operators showed a greater consistency in fixation location and greater consistency in suspiciousness judgements than untrained observers. Training appears to increase between-operators consistency by learning 'knowing what to look for' in these scenes. We used a novel ‘Dynamic Area of Focus’ (DAF) analysis to show that in CCTV monitoring there is a temporal relationship between eye movements and subsequent manual responses, as we have previously found for a sports video watching task. For trained CCTV operators and for untrained observers, manual responses were most highly related to between-observer eye position spread when a temporal lag was introduced between the fixation and response data. Shortly after between-observer eye positions became most similar, observers tended to push the joystick to indicate perceived suspiciousness. Conversely, shortly after between-observer eye positions became dissimilar, observers tended to rate suspiciousness as low. These data provide further support for this DAF method as an important tool for examining goal-directed fixation behaviour when the stimulus is a real moving image.

  10. Fourier power, subjective distance, and object categories all provide plausible models of BOLD responses in scene-selective visual areas

    Science.gov (United States)

    Lescroart, Mark D.; Stansbury, Dustin E.; Gallant, Jack L.

    2015-01-01

    Perception of natural visual scenes activates several functional areas in the human brain, including the Parahippocampal Place Area (PPA), Retrosplenial Complex (RSC), and the Occipital Place Area (OPA). It is currently unclear what specific scene-related features are represented in these areas. Previous studies have suggested that PPA, RSC, and/or OPA might represent at least three qualitatively different classes of features: (1) 2D features related to Fourier power; (2) 3D spatial features such as the distance to objects in a scene; or (3) abstract features such as the categories of objects in a scene. To determine which of these hypotheses best describes the visual representation in scene-selective areas, we applied voxel-wise modeling (VM) to BOLD fMRI responses elicited by a set of 1386 images of natural scenes. VM provides an efficient method for testing competing hypotheses by comparing predictions of brain activity based on encoding models that instantiate each hypothesis. Here we evaluated three different encoding models that instantiate each of the three hypotheses listed above. We used linear regression to fit each encoding model to the fMRI data recorded from each voxel, and we evaluated each fit model by estimating the amount of variance it predicted in a withheld portion of the data set. We found that voxel-wise models based on Fourier power or the subjective distance to objects in each scene predicted much of the variance predicted by a model based on object categories. Furthermore, the response variance explained by these three models is largely shared, and the individual models explain little unique variance in responses. Based on an evaluation of previous studies and the data we present here, we conclude that there is currently no good basis to favor any one of the three alternative hypotheses about visual representation in scene-selective areas. We offer suggestions for further studies that may help resolve this issue. PMID:26594164
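
    The evaluation step described above (fit a linear encoding model on training data, then measure the variance it predicts in withheld data) can be sketched for a single voxel and a single feature. Real encoding models use many features per stimulus and typically regularized regression; the one-feature least-squares fit and the toy numbers below are simplifying assumptions for illustration, not the authors' procedure.

```python
def fit_line(x, y):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def variance_explained(y_true, y_pred):
    """Coefficient of determination (R^2) on held-out responses."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((yt - yp) ** 2 for yt, yp in zip(y_true, y_pred))
    ss_tot = sum((yt - mean_y) ** 2 for yt in y_true)
    return 1.0 - ss_res / ss_tot


# Toy data: one stimulus feature and one voxel's response (hypothetical values).
train_feature = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
train_response = [1.1, 2.9, 5.1, 6.9, 9.1, 10.9]  # roughly 2 * x + 1 plus noise
test_feature = [6.0, 7.0, 8.0]
test_response = [13.0, 15.0, 17.0]

slope, intercept = fit_line(train_feature, train_response)
predicted = [slope * x + intercept for x in test_feature]
r2 = variance_explained(test_response, predicted)
```

    Comparing this held-out score across competing feature spaces, voxel by voxel, is what lets the three hypotheses be tested against each other; shared variance shows up when the models score similarly and add little when combined.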

  11. Balancing Attended and Global Stimuli in Perceived Video Quality Assessment

    DEFF Research Database (Denmark)

    You, Junyong; Korhonen, Jari; Perkis, Andrew

    2011-01-01

    The visual attention mechanism plays a key role in the human perception system and it has a significant impact on our assessment of perceived video quality. In spite of receiving less attention from the viewers, unattended stimuli can still contribute to the understanding of the visual content. This paper proposes a quality model based on the late attention selection theory, assuming that the video quality is perceived via two mechanisms: global and local quality assessment. First we model several visual features influencing the visual attention in quality assessment scenarios to derive an attention map using appropriate fusion techniques. The global quality assessment, based on the assumption that viewers allocate their attention equally to the entire visual scene, is modeled by four carefully designed quality features. By employing these same quality features, the local quality model...

  12. Automatic detection of artifacts in converted S3D video

    Science.gov (United States)

    Bokov, Alexander; Vatolin, Dmitriy; Zachesov, Anton; Belous, Alexander; Erofeev, Mikhail

    2014-03-01

    In this paper we present algorithms for automatically detecting issues specific to converted S3D content. When a depth-image-based rendering approach produces a stereoscopic image, the quality of the result depends on both the depth maps and the warping algorithms. The most common problem with converted S3D video is edge-sharpness mismatch. This artifact may appear owing to depth-map blurriness at semitransparent edges: after warping, the object boundary becomes sharper in one view and blurrier in the other, yielding binocular rivalry. To detect this problem we estimate the disparity map, extract boundaries with noticeable differences, and analyze edge-sharpness correspondence between views. We pay additional attention to cases involving a complex background and large occlusions. Another problem is detection of scenes that lack depth volume: we present algorithms for detecting flat scenes and scenes with flat foreground objects. To identify these problems we analyze the features of the RGB image as well as uniform areas in the depth map. Testing of our algorithms involved examining 10 Blu-ray 3D releases with converted S3D content, including Clash of the Titans, The Avengers, and The Chronicles of Narnia: The Voyage of the Dawn Treader. The algorithms we present enable improved automatic quality assessment during the production stage.

  13. 3D scene reconstruction based on multi-view distributed video coding in the Zernike domain for mobile applications

    Science.gov (United States)

    Palma, V.; Carli, M.; Neri, A.

    2011-02-01

    In this paper a Multi-view Distributed Video Coding scheme for mobile applications is presented. Specifically, a new fusion technique between temporal and spatial side information in the Zernike moments domain is proposed. Distributed video coding introduces a flexible architecture that enables the design of video encoders of very low complexity compared to their traditional counterparts. The main goal of our work is to generate at the decoder the side information that optimally blends temporal and inter-view data. Multi-view distributed coding performance strongly depends on the quality of the side information built at the decoder. To improve this quality, a spatial view compensation/prediction in the Zernike moments domain is applied. Spatial and temporal motion activity are fused together to obtain the overall side information. The proposed method has been evaluated in terms of rate-distortion performance for different inter-view and temporal estimation quality conditions.

  14. Scene Integration for Online VR Advertising Clouds

    Directory of Open Access Journals (Sweden)

    Michael Kalochristianakis

    2014-12-01

    Full Text Available This paper presents a scene composition approach that allows the combinational use of standard three dimensional objects, called models, in order to create X3D scenes. The module is an integral part of a broader design aiming to construct large scale online advertising infrastructures that rely on virtual reality technologies. The architecture addresses a number of problems regarding remote rendering for low end devices and last but not least, the provision of scene composition and integration. Since viewers do not keep information regarding individual input models or scenes, composition requires the consideration of mechanisms that add state to viewing technologies. In terms of this work we extended a well-known, open source X3D authoring tool.

  15. Advanced radiometric and interferometric milimeter-wave scene simulations

    Science.gov (United States)

    Hauss, B. I.; Moffa, P. J.; Steele, W. G.; Agravante, H.; Davidheiser, R.; Samec, T.; Young, S. K.

    1993-01-01

    Smart munitions and weapons utilize various imaging sensors (including passive IR, active and passive millimeter-wave, and visible wavebands) to detect/identify targets at short standoff ranges and in varied terrain backgrounds. In order to design and evaluate these sensors under a variety of conditions, a high-fidelity scene simulation capability is necessary. Such a capability for passive millimeter-wave scene simulation exists at TRW. TRW's Advanced Radiometric Millimeter-Wave Scene Simulation (ARMSS) code is a rigorous, benchmarked, end-to-end passive millimeter-wave scene simulation code for interpreting millimeter-wave data, establishing scene signatures and evaluating sensor performance. In passive millimeter-wave imaging, resolution is limited due to wavelength and aperture size. Where high resolution is required, the utility of passive millimeter-wave imaging is confined to short ranges. Recent developments in interferometry have made possible high resolution applications on military platforms. Interferometry or synthetic aperture radiometry allows the creation of a high resolution image with a sparsely filled aperture. Borrowing from research work in radio astronomy, we have developed and tested at TRW scene reconstruction algorithms that allow the recovery of the scene from a relatively small number of spatial frequency components. In this paper, the TRW modeling capability is described and numerical results are presented.

  16. Setting the scene

    International Nuclear Information System (INIS)

    Curran, S.

    1977-01-01

    The reasons for the special meeting on the breeder reactor are outlined with some reference to the special Scottish interest in the topic. Approximately 30% of the electrical energy generated in Scotland is nuclear and the special developments at Dounreay make policy decisions on the future of the commercial breeder reactor urgent. The participants review the major questions arising in arriving at such decisions. In effect an attempt is made to respond to the wish of the Secretary of State for Energy to have informed debate. To set the scene the importance of energy availability as regards to the strength of the national economy is stressed and the reasons for an increasing energy demand put forward. Examination of alternative sources of energy shows that none is definitely capable of filling the foreseen energy gap. This implies an integrated thermal/breeder reactor programme as the way to close the anticipated gap. The problems of disposal of radioactive waste and the safeguards in the handling of plutonium are outlined. Longer-term benefits, including the consumption of plutonium and naturally occurring radioactive materials, are examined. (author)

  17. Video steganography based on bit-plane decomposition of wavelet-transformed video

    Science.gov (United States)

    Noda, Hideki; Furuta, Tomofumi; Niimi, Michiharu; Kawaguchi, Eiji

    2004-06-01

    This paper presents a steganography method using lossy compressed video which provides a natural way to send a large amount of secret data. The proposed method is based on wavelet compression for video data and bit-plane complexity segmentation (BPCS) steganography. BPCS steganography makes use of bit-plane decomposition and the characteristics of the human vision system, where noise-like regions in bit-planes of a dummy image are replaced with secret data without deteriorating image quality. In wavelet-based video compression methods such as 3-D set partitioning in hierarchical trees (SPIHT) algorithm and Motion-JPEG2000, wavelet coefficients in discrete wavelet transformed video are quantized into a bit-plane structure and therefore BPCS steganography can be applied in the wavelet domain. 3-D SPIHT-BPCS steganography and Motion-JPEG2000-BPCS steganography are presented and tested, which are the integration of 3-D SPIHT video coding and BPCS steganography, and that of Motion-JPEG2000 and BPCS, respectively. Experimental results show that 3-D SPIHT-BPCS is superior to Motion-JPEG2000-BPCS with regard to embedding performance. In 3-D SPIHT-BPCS steganography, embedding rates of around 28% of the compressed video size are achieved for twelve bit representation of wavelet coefficients with no noticeable degradation in video quality.
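
    Bit-plane decomposition, the building block mentioned above, can be shown on a handful of quantized coefficients. The sketch below only splits values into bit planes, overwrites one plane with secret bits, and reassembles the values; it deliberately omits the complexity-based selection of noise-like regions that defines BPCS, and the function names and sample values are assumptions for illustration.

```python
def to_bit_planes(values, bits=8):
    """Split non-negative integers into `bits` planes; plane 0 is the least significant."""
    return [[(v >> b) & 1 for v in values] for b in range(bits)]


def from_bit_planes(planes):
    """Reassemble integers from their bit planes."""
    count = len(planes[0])
    return [sum(planes[b][i] << b for b in range(len(planes))) for i in range(count)]


def embed_in_plane(values, secret_bits, plane=0, bits=8):
    """Overwrite the start of one bit plane with secret bits (plain substitution)."""
    planes = to_bit_planes(values, bits)
    for i, bit in enumerate(secret_bits):
        planes[plane][i] = bit
    return from_bit_planes(planes)


def extract_from_plane(values, count, plane=0):
    """Read back the first `count` bits of one plane."""
    return [(v >> plane) & 1 for v in values[:count]]


# Hypothetical quantized coefficient magnitudes and payload.
coefficients = [200, 13, 77, 146]
secret = [1, 0, 0, 1]
stego = embed_in_plane(coefficients, secret)
recovered = extract_from_plane(stego, len(secret))
```

    Writing into the lowest plane changes each value by at most 1, which is why low planes are the usual carriers; a BPCS-style scheme would additionally restrict embedding to blocks whose bit patterns already look noise-like.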

  18. Laser based imaging of time depending microscopic scenes with strong light emission

    Science.gov (United States)

    Hahlweg, Cornelius; Wilhelm, Eugen; Rothe, Hendrik

    2011-10-01

    While investigating volume scatterometry methods based on short-range LIDAR devices for non-static objects, we achieved interesting results besides the intended micro-LIDAR: the high-speed camera recording of the illuminated scene of an exploding wire (intended for Doppler LIDAR tests) delivered a very effective method of observing details of objects with extremely strong light emission. As a side effect, a schlieren movie is gathered without any special effort. The fact that microscopic features of short-time processes with high emission and material flow might be imaged without endangering valuable equipment makes this technique at least as interesting as the intended one. So we decided to present our results, including the latest video and photo material, instead of a more theoretical paper on our progress concerning the primary goal.

  19. Three-dimensional measurement system for crime scene documentation

    Science.gov (United States)

    Adamczyk, Marcin; Hołowko, Elwira; Lech, Krzysztof; Michoński, Jakub; Mączkowski, Grzegorz; Bolewicki, Paweł; Januszkiewicz, Kamil; Sitnik, Robert

    2017-10-01

    Three-dimensional measurements (such as photogrammetry, Time of Flight, Structure from Motion or Structured Light techniques) are becoming a standard in the crime scene documentation process. The use of 3D measurement techniques provides an opportunity to prepare a more insightful investigation and helps to show every trace in the context of the entire crime scene. In this paper we present a hierarchical, three-dimensional measurement system designed for the crime scene documentation process. Our system reflects the current standards in crime scene documentation: it is designed to perform measurement in two stages. The first stage of documentation, the most general, is carried out with a scanner with relatively low spatial resolution but a large measuring volume; it is used to document the whole scene. The second stage is much more detailed: high resolution but a smaller measuring volume, for areas that require a more detailed approach. The documentation process is supervised by a specialised application, CrimeView3D, a software platform for measurement management (connecting with scanners and carrying out measurements, automatic or semi-automatic data registration in real time) and data visualisation (3D visualisation of documented scenes). It also provides a series of useful tools for forensic technicians: a virtual measuring tape, a search for sources of blood spatter, a virtual walk through the crime scene and many others. In this paper we present our measuring system and the developed software. We also provide results from research on the metrological validation of the scanners, performed according to the VDI/VDE standard, and results from measurement sessions conducted on real crime scenes in cooperation with technicians from the Central Forensic Laboratory of Police.

  20. Videotrees: Improving video surrogate presentation using hierarchy

    NARCIS (Netherlands)

    Jansen, Michel; Heeren, W.F.L.; van Dijk, Elisabeth M.A.G.

    As the amount of available video content increases, so does the need for better ways of browsing all this material. Because the nature of video makes it hard to process, the need arises for adequate surrogates for video that can readily be skimmed and browsed. In this paper, the effects of the use

  1. Maximizing Resource Utilization in Video Streaming Systems

    Science.gov (United States)

    Alsmirat, Mohammad Abdullah

    2013-01-01

    Video streaming has recently grown dramatically in popularity over the Internet, Cable TV, and wireless networks. Because of the resource-demanding nature of video streaming applications, maximizing resource utilization in any video streaming system is a key factor to increase the scalability and decrease the cost of the system. Resources to…

  2. Local spectral anisotropy is a valid cue for figure–ground organization in natural scenes

    OpenAIRE

    Ramenahalli, Sudarshan; Mihalas, Stefan; Niebur, Ernst

    2014-01-01

    An important step in the process of understanding visual scenes is its organization in different perceptual objects which requires figure-ground segregation. The determination which side of an occlusion boundary is figure (closer to the observer) and which is ground (further away from the observer) is made through a combination of global cues, like convexity, and local cues, like T-junctions. We here focus on a novel set of local cues in the intensity patterns along occlusion boundaries which...

  3. Automated intelligent video surveillance system for ships

    Science.gov (United States)

    Wei, Hai; Nguyen, Hieu; Ramu, Prakash; Raju, Chaitanya; Liu, Xiaoqing; Yadegar, Jacob

    2009-05-01

    To protect naval and commercial ships from attack by terrorists and pirates, it is important to have automatic surveillance systems able to detect, identify and track small watercraft that might pursue malicious intentions and to alert the crew to them, while ruling out non-threat entities. Radar systems have limitations on the minimum detectable range and lack high-level classification power. In this paper, we present an innovative Automated Intelligent Video Surveillance System for Ships (AIVS3) as a vision-based solution for ship security. Capitalizing on advanced computer vision algorithms and practical machine learning methodologies, the developed AIVS3 is not only capable of efficiently and robustly detecting, classifying, and tracking various maritime targets, but also able to fuse heterogeneous target information to interpret scene activities, associate targets with levels of threat, and issue the corresponding alerts/recommendations to the man-in-the-loop (MITL). AIVS3 has been tested in various maritime scenarios and has shown accurate and effective threat detection performance. By reducing the reliance on human eyes to monitor cluttered scenes, AIVS3 will save manpower while increasing the accuracy in detection and identification of asymmetric attacks for ship protection.

  4. The motive for sensory pleasure: enjoyment of nature and its representation in painting, music, and literature.

    Science.gov (United States)

    Eisenberger, Robert; Sucharski, Ivan L; Yalowitz, Steven; Kent, Robert J; Loomis, Ross J; Jones, Jason R; Paylor, Sarah; Aselage, Justin; Mueller, Meta Steiger; McLaughlin, John P

    2010-04-01

    Eight studies assessed the motive for sensory pleasure (MSP) involving a general disposition to enjoy and pursue pleasant nature-related experiences and avoid unpleasant nature-related experiences. The stated enjoyment of pleasant sights, smells, sounds, and tactile sensations formed a unitary construct that was distinct from sensation seeking, novelty preference, and need for cognition. MSP was found to be related to (a) enjoyment of pleasant nature scenes and music of high but not low clarity; (b) enjoyment of writings that portrayed highly detailed nature scenes; (c) enjoyment of pleasantly themed paintings and dislike of unpleasant paintings, as distinct from findings with Openness to Experience; (d) choice of pleasant nature scenes over exciting or intellectually stimulating scenes; (e) view duration and memory of artistically rendered quilts; (f) interest in detailed information about nature scenes; and (g) frequency of sensory-type suggestions for improvement of a museum exhibit.

  5. IR characteristic simulation of city scenes based on radiosity model

    Science.gov (United States)

    Xiong, Xixian; Zhou, Fugen; Bai, Xiangzhi; Yu, Xiyu

    2013-09-01

    Reliable modeling for thermal infrared (IR) signatures of real-world city scenes is required for signature management of civil and military platforms. Traditional modeling methods generally assume that scene objects are individual entities during the physical processes occurring in the infrared range. However, in reality, the physical scene involves convective and conductive interactions between objects as well as radiative interactions between objects. A method based on the radiosity model, which describes these complex effects, has been developed to enable an accurate simulation of the radiance distribution of city scenes. Firstly, the physical processes affecting the IR characteristic of city scenes were described. Secondly, heat balance equations were formed on the basis of combining the atmospheric conditions, shadow maps and the geometry of the scene. Finally, the finite difference method was used to calculate the kinetic temperature of object surfaces. A radiosity model was introduced to describe the scattering effect of radiation between surface elements in the scene. By synthesizing the objects' radiance distribution in the infrared range, we could obtain the IR characteristic of the scene. Real infrared images and model predictions were shown and compared. The results demonstrate that this method can realistically simulate the IR characteristic of city scenes. It effectively displays the infrared shadow effects and the radiation interactions between objects in city scenes.

  6. The primal scene and symbol formation.

    Science.gov (United States)

    Niedecken, Dietmut

    2016-06-01

    This article discusses the meaning of the primal scene for symbol formation by exploring its way of processing in a child's play. The author questions the notion that a sadomasochistic way of processing is the only possible one. A model of an alternative mode of processing is being presented. It is suggested that both ways of processing intertwine in the "fabric of life" (D. Laub). Two clinical vignettes, one from an analytic child psychotherapy and the other from the analysis of a 30 year-old female patient, illustrate how the primal scene is being played out in the form of a terzet. The author explores whether the sadomasochistic way of processing actually precedes the "primal scene as a terzet". She discusses if it could even be regarded as a precondition for the formation of the latter or, alternatively, if the "combined parent-figure" gives rise to ways of processing. The question is being left open. Finally, it is shown how both modes of experiencing the primal scene underlie the discoursive and presentative symbol formation, respectively. Copyright © 2015 Institute of Psychoanalysis.

  7. Unattended real-time re-establishment of visibility in high dynamic range video and stills

    Science.gov (United States)

    Abidi, B.

    2014-05-01

    We describe a portable unattended persistent surveillance system that corrects for harsh illumination conditions, where bright sunlight creates mixed contrast effects, i.e., heavy shadows and washouts. These effects result in high dynamic range scenes, where illuminance can vary from a few lux to a six-figure value. When using regular monitors and cameras, such a wide span of illuminations can only be visualized if the actual range of values is compressed, leading to the creation of saturated and/or dark noisy areas and a loss of information in these areas. Images containing extreme mixed contrast cannot be fully enhanced from a single exposure, simply because all information is not present in the original data. Active intervention in the acquisition process is required. A software package, capable of integrating multiple types of COTS and custom cameras, ranging from Unmanned Aerial Systems (UAS) data links to digital single-lens reflex cameras (DSLR), is described. Hardware and software are integrated via a novel smart data acquisition algorithm, which communicates to the camera the parameters that would maximize information content in the final processed scene. A fusion mechanism is then applied to the smartly acquired data, resulting in an enhanced scene where information in both dark and bright areas is revealed. Multi-threading and parallel processing are exploited to produce automatic real time full motion corrected video. A novel enhancement algorithm was also devised to process data from legacy and non-controllable cameras. The software accepts and processes pre-recorded sequences and stills, enhances visible, night vision, and Infrared data, and successfully applies to night time and dark scenes. Various user options are available, integrating custom functionalities of the application into intuitive and easy to use graphical interfaces. The ensuing increase in visibility in surveillance video and intelligence imagery will expand the performance and

  8. Modeling global scene factors in attention

    Science.gov (United States)

    Torralba, Antonio

    2003-07-01

    Models of visual attention have focused predominantly on bottom-up approaches that ignored structured contextual and scene information. I propose a model of contextual cueing for attention guidance based on the global scene configuration. It is shown that the statistics of low-level features across the whole image can be used to prime the presence or absence of objects in the scene and to predict their location, scale, and appearance before exploring the image. In this scheme, visual context information can become available early in the visual processing chain, which allows modulation of the saliency of image regions and provides an efficient shortcut for object detection and recognition. 2003 Optical Society of America

  9. Spectral feature characterization methods for blood stain detection in crime scene backgrounds

    Science.gov (United States)

    Yang, Jie; Mathew, Jobin J.; Dube, Roger R.; Messinger, David W.

    2016-05-01

    Blood stains are one of the most important types of evidence for forensic investigation. They contain valuable DNA information, and the pattern of the stains can suggest specifics about the nature of the violence that transpired at the scene. Blood spectral signatures containing unique reflectance or absorption features are important both for forensic on-site investigation and laboratory testing. They can be used for target detection and identification applied to crime scene hyperspectral imagery, and also be utilized to analyze the spectral variation of blood on various backgrounds. Non-blood stains often mislead the detection and can generate false alarms at a real crime scene, especially for dark and red backgrounds. This paper measured the reflectance of liquid blood and 9 kinds of non-blood samples in the range of 350 nm - 2500 nm in various crime scene backgrounds, such as pure samples contained in petri dish with various thicknesses, mixed samples with different colors and materials of fabrics, and mixed samples with wood, all of which are examined to provide sub-visual evidence for detecting and recognizing blood from non-blood samples in a realistic crime scene. The spectral differences between blood and non-blood samples are examined and spectral features such as "peaks" and "depths" of reflectance are selected. Two blood stain detection methods are proposed in this paper. The first method uses an index that denotes the ratio of "depth" minus "peak" over "depth" plus "peak" within a wavelength range of the reflectance spectrum. The second method uses relative band depth of the selected wavelength ranges of the reflectance spectrum. Results show that the index method is able to discriminate blood from non-blood samples in most tested crime scene backgrounds, but is not able to detect it on black felt, whereas the relative band depth method is able to discriminate blood from non-blood samples on all of the tested background material types and colors.
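    The index described in the abstract above has the form (depth - peak) / (depth + peak). The following is a minimal sketch of that ratio only. The abstract does not define how "depth" and "peak" are extracted, so this sketch assumes they are the minimum and maximum reflectance inside a chosen wavelength window; the function names and the window bounds are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def depth_peak_index(depth: float, peak: float) -> float:
    """Return (depth - peak) / (depth + peak), the ratio named in the abstract."""
    total = depth + peak
    if total == 0:
        raise ValueError("depth + peak must be non-zero")
    return (depth - peak) / total


def index_from_spectrum(
    wavelengths_nm: Sequence[float],
    reflectance: Sequence[float],
    lo_nm: float,
    hi_nm: float,
) -> float:
    """Compute the index inside the window [lo_nm, hi_nm].

    Assumption (not from the paper): 'depth' is the minimum reflectance and
    'peak' is the maximum reflectance found inside the window.
    """
    window = [r for w, r in zip(wavelengths_nm, reflectance) if lo_nm <= w <= hi_nm]
    if not window:
        raise ValueError("no samples inside the wavelength window")
    return depth_peak_index(min(window), max(window))
```

    Under these assumptions the index is bounded between -1 and 0 for non-negative reflectance, and a value further from zero indicates a stronger contrast between the trough and the peak in the window.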

  10. Affective video retrieval: violence detection in Hollywood movies by large-scale segmental feature extraction.

    Science.gov (United States)

    Eyben, Florian; Weninger, Felix; Lehment, Nicolas; Schuller, Björn; Rigoll, Gerhard

    2013-01-01

    Without doubt general video and sound, as found in large multimedia archives, carry emotional information. Thus, audio and video retrieval by certain emotional categories or dimensions could play a central role for tomorrow's intelligent systems, enabling search for movies with a particular mood, computer aided scene and sound design in order to elicit certain emotions in the audience, etc. Yet, the lion's share of research in affective computing is exclusively focusing on signals conveyed by humans, such as affective speech. Uniting the fields of multimedia retrieval and affective computing is believed to lend to a multiplicity of interesting retrieval applications, and at the same time to benefit affective computing research, by moving its methodology "out of the lab" to real-world, diverse data. In this contribution, we address the problem of finding "disturbing" scenes in movies, a scenario that is highly relevant for computer-aided parental guidance. We apply large-scale segmental feature extraction combined with audio-visual classification to the particular task of detecting violence. Our system performs fully data-driven analysis including automatic segmentation. We evaluate the system in terms of mean average precision (MAP) on the official data set of the MediaEval 2012 evaluation campaign's Affect Task, which consists of 18 original Hollywood movies, achieving up to .398 MAP on unseen test data in full realism. An in-depth analysis of the worth of individual features with respect to the target class and the system errors is carried out and reveals the importance of peak-related audio feature extraction and low-level histogram-based video analysis.
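    The evaluation measure named in the abstract above, mean average precision (MAP), can be illustrated with a short sketch. This is the textbook definition applied to ranked lists of relevance flags, under the assumption that every relevant item appears somewhere in each ranking; it is not the MediaEval 2012 scoring tool, and the function names are hypothetical.

```python
from typing import Sequence


def average_precision(ranked_relevance: Sequence[bool]) -> float:
    """Average of precision@k over the ranks k at which a relevant item appears."""
    hits = 0
    precision_sum = 0.0
    for rank, is_relevant in enumerate(ranked_relevance, start=1):
        if is_relevant:
            hits += 1
            precision_sum += hits / rank
    # Assumption: all relevant items are present in the ranking, so we divide by hits.
    return precision_sum / hits if hits else 0.0


def mean_average_precision(runs: Sequence[Sequence[bool]]) -> float:
    """Mean of the per-ranking average precision values."""
    if not runs:
        raise ValueError("at least one ranking is required")
    return sum(average_precision(run) for run in runs) / len(runs)
```

    For example, a ranking with relevant items at positions 1 and 3 has precision 1 at rank 1 and 2/3 at rank 3, giving an average precision of 5/6.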

  11. Affective video retrieval: violence detection in Hollywood movies by large-scale segmental feature extraction.

    Directory of Open Access Journals (Sweden)

    Florian Eyben

    Full Text Available Without doubt general video and sound, as found in large multimedia archives, carry emotional information. Thus, audio and video retrieval by certain emotional categories or dimensions could play a central role for tomorrow's intelligent systems, enabling search for movies with a particular mood, computer aided scene and sound design in order to elicit certain emotions in the audience, etc. Yet, the lion's share of research in affective computing is exclusively focusing on signals conveyed by humans, such as affective speech. Uniting the fields of multimedia retrieval and affective computing is believed to lend to a multiplicity of interesting retrieval applications, and at the same time to benefit affective computing research, by moving its methodology "out of the lab" to real-world, diverse data. In this contribution, we address the problem of finding "disturbing" scenes in movies, a scenario that is highly relevant for computer-aided parental guidance. We apply large-scale segmental feature extraction combined with audio-visual classification to the particular task of detecting violence. Our system performs fully data-driven analysis including automatic segmentation. We evaluate the system in terms of mean average precision (MAP) on the official data set of the MediaEval 2012 evaluation campaign's Affect Task, which consists of 18 original Hollywood movies, achieving up to .398 MAP on unseen test data in full realism. An in-depth analysis of the worth of individual features with respect to the target class and the system errors is carried out and reveals the importance of peak-related audio feature extraction and low-level histogram-based video analysis.

  12. Human matching performance of genuine crime scene latent fingerprints.

    Science.gov (United States)

    Thompson, Matthew B; Tangen, Jason M; McCarthy, Duncan J

    2014-02-01

    There has been very little research into the nature and development of fingerprint matching expertise. Here we present the results of an experiment testing the claimed matching expertise of fingerprint examiners. Expert (n = 37), intermediate trainee (n = 8), new trainee (n = 9), and novice (n = 37) participants performed a fingerprint discrimination task involving genuine crime scene latent fingerprints, their matches, and highly similar distractors, in a signal detection paradigm. Results show that qualified, court-practicing fingerprint experts were exceedingly accurate compared with novices. Experts showed a conservative response bias, tending to err on the side of caution by making more errors of the sort that could allow a guilty person to escape detection than errors of the sort that could falsely incriminate an innocent person. The superior performance of experts was not simply a function of their ability to match prints, per se, but a result of their ability to identify the highly similar, but nonmatching fingerprints as such. Comparing these results with previous experiments, experts were even more conservative in their decision making when dealing with these genuine crime scene prints than when dealing with simulated crime scene prints, and this conservatism made them relatively less accurate overall. Intermediate trainees, despite their lack of qualification and average 3.5 years' experience, performed about as accurately as qualified experts who had an average 17.5 years' experience. New trainees, despite their 5-week, full-time training course or their 6 months' experience, were not any better than novices at discriminating matching and similar nonmatching prints; they were just more conservative. Further research is required to determine the precise nature of fingerprint matching expertise and the factors that influence performance. The findings of this representative, lab-based experiment may have implications for the way fingerprint examiners testify in
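    The abstract above reports accuracy and response bias within a signal detection paradigm. As an illustration only, the sketch below computes the two standard equal-variance Gaussian measures, sensitivity d' = z(H) - z(F) and criterion c = -(z(H) + z(F)) / 2, from a hit rate H and a false-alarm rate F. The abstract does not say which measures the authors used, so treat the choice of d' and c, and the function name, as assumptions.

```python
from statistics import NormalDist


def sensitivity_and_bias(hit_rate: float, false_alarm_rate: float) -> tuple[float, float]:
    """Return (d_prime, criterion) for the equal-variance Gaussian model.

    Both rates must lie strictly between 0 and 1, otherwise inv_cdf raises
    statistics.StatisticsError. A positive criterion means a conservative
    observer, i.e. one who says "match" less often.
    """
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    z_hit = z(hit_rate)
    z_fa = z(false_alarm_rate)
    d_prime = z_hit - z_fa
    criterion = -(z_hit + z_fa) / 2
    return d_prime, criterion
```

    In this framing, an examiner who rarely declares a match on nonmatching prints, at the cost of missing some true matches, shows a positive criterion, which corresponds to the conservative bias described in the abstract.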

  13. 47 CFR 80.1127 - On-scene communications.

    Science.gov (United States)

    2010-10-01

    ....1127 Telecommunication FEDERAL COMMUNICATIONS COMMISSION (CONTINUED) SAFETY AND SPECIAL RADIO SERVICES STATIONS IN THE MARITIME SERVICES Global Maritime Distress and Safety System (GMDSS) Operating Procedures for Distress and Safety Communications § 80.1127 On-scene communications. (a) On-scene communications...

  14. Recent advances in multiview distributed video coding

    Science.gov (United States)

    Dufaux, Frederic; Ouaret, Mourad; Ebrahimi, Touradj

    2007-04-01

    We consider dense networks of surveillance cameras capturing overlapped images of the same scene from different viewing directions, such a scenario being referred to as multi-view. Data compression is paramount in such a system due to the large amount of captured data. In this paper, we propose a Multi-view Distributed Video Coding approach. It allows for low complexity / low power consumption at the encoder side, and the exploitation of inter-view correlation without communications among the cameras. We introduce a combination of temporal intra-view side information and homography inter-view side information. Simulation results show both the improvement of the side information, as well as a significant gain in terms of coding efficiency.

  15. Strategy Video Games: Some Principles for Teaching

    Directory of Open Access Journals (Sweden)

    José Miguel Garrido Miranda

    2013-04-01

    Full Text Available In order to investigate the reasons that motivate students to play with strategy video games, an analysis of the observed discourse and practices of fifteen Chilean high school students during collective gaming sessions was conducted. By means of an ethno-methodological analysis, we proceeded to identify and saturate emerging categories to determine the interests that impel these students to play. The findings, seen from a pedagogical perspective, suggest that the feeling of being part of a scene, solving increasingly complex situations and positively assessing the uncertainty produced by interaction with this type of environment, can become guiding elements for improving the design of teaching situations supported by the use of digital technologies in the classroom.

  16. IndigoVision IP video keeps watch over remote gas facilities in Amazon rainforest

    Energy Technology Data Exchange (ETDEWEB)

    Anon.

    2010-07-15

    In Brazil, IndigoVision's complete IP video security technology is being used to remotely monitor automated gas facilities in the Amazon rainforest. Twelve compounds containing millions of dollars of process automation, telemetry, and telecom equipment are spread across many thousands of miles of forest and centrally monitored in Rio de Janeiro using Control Center, the company's Security Management software. The security surveillance project uses a hybrid IP network comprising satellite, fibre optic, and wireless links. In addition to advanced compression technology and bandwidth tuning tools, the IP video system uses Activity Controlled Framerate (ACF), which controls the frame rate of the camera video stream based on the amount of motion in a scene. In the absence of activity, the video is streamed at a minimum framerate, but the moment activity is detected the framerate jumps to the configured maximum. This significantly reduces the amount of bandwidth needed. At each remote facility, fixed analog cameras are connected to transmitter modules that convert the feed to high-quality digital video for transmission over the IP network. The system also integrates alarms with video surveillance. PIR intruder detectors are connected to the system via digital inputs on the transmitters. Advanced alarm-handling features in the Control Center software process the PIR detector alarms and alert operators to potential intrusions. This improves operator efficiency and incident response. 1 fig.
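    The Activity Controlled Framerate behaviour described above (minimum frame rate when the scene is idle, configured maximum as soon as activity is detected) can be sketched as a simple threshold rule. This is only an illustration of the idea as stated in the record: the motion score, the threshold, and all function names are hypothetical and do not reflect IndigoVision's actual implementation.

```python
from typing import Iterable


def select_framerate(motion_level: float, threshold: float,
                     min_fps: float, max_fps: float) -> float:
    """Stream at max_fps when motion reaches the threshold, else at min_fps."""
    return max_fps if motion_level >= threshold else min_fps


def frames_streamed(motion_per_second: Iterable[float], threshold: float,
                    min_fps: float, max_fps: float) -> float:
    """Total frames sent, assuming one motion score per second of video."""
    return sum(
        select_framerate(level, threshold, min_fps, max_fps)
        for level in motion_per_second
    )
```

    With four seconds of video, motion in only one of them, a 1 fps minimum and a 25 fps maximum, this rule streams 28 frames instead of the 100 a constant 25 fps stream would need, which is the kind of bandwidth saving the record refers to.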

  17. NATURE-RURAL SETTLEMENT INTERACTIONS

    Directory of Open Access Journals (Sweden)

    Zehra Eminağaoğlu

    2006-04-01

    Full Text Available Conservation and management of natural environments are generally brought up upon adverse developments against nature in human-nature interactions. Although individual actions are often considered to be more immediate in nature-related issues, ecologic problems tend to spread in time and lead to regional or even global problems. For this reason, it stands imperative that economic, ecologic and aesthetic values of the environment we live in be protected and used sustainably. Being the scene of nature and the environment, landscape signifies the whole, with living and nonliving entities, where we live. Damaged and destroyed landscape scenes, particularly in urban areas, necessitate the reconsideration of human-nature relations and a nature-friendly lifestyle. This study investigates the rural settlements that show harmony with nature and reflect qualities of natural environments on the dwellings. Particularly, with the examples of drawings and pictures, it examines the association of rural settlements with nature as well as the use of the green as an occasional or spatial element.

  18. The occipital place area represents the local elements of scenes.

    Science.gov (United States)

    Kamps, Frederik S; Julian, Joshua B; Kubilius, Jonas; Kanwisher, Nancy; Dilks, Daniel D

    2016-05-15

    Neuroimaging studies have identified three scene-selective regions in human cortex: parahippocampal place area (PPA), retrosplenial complex (RSC), and occipital place area (OPA). However, precisely what scene information each region represents is not clear, especially for the least studied, more posterior OPA. Here we hypothesized that OPA represents local elements of scenes within two independent, yet complementary scene descriptors: spatial boundary (i.e., the layout of external surfaces) and scene content (e.g., internal objects). If OPA processes the local elements of spatial boundary information, then it should respond to these local elements (e.g., walls) themselves, regardless of their spatial arrangement. Indeed, we found that OPA, but not PPA or RSC, responded similarly to images of intact rooms and these same rooms in which the surfaces were fractured and rearranged, disrupting the spatial boundary. Next, if OPA represents the local elements of scene content information, then it should respond more when more such local elements (e.g., furniture) are present. Indeed, we found that OPA, but not PPA or RSC, responded more to multiple than single pieces of furniture. Taken together, these findings reveal that OPA analyzes local scene elements - both in spatial boundary and scene content representation - while PPA and RSC represent global scene properties. Copyright © 2016 Elsevier Inc. All rights reserved.

  19. Integration of prior knowledge into dense image matching for video surveillance

    Science.gov (United States)

    Menze, M.; Heipke, C.

    2014-08-01

    Three-dimensional information from dense image matching is a valuable input for a broad range of vision applications. While reliable approaches exist for dedicated stereo setups they do not easily generalize to more challenging camera configurations. In the context of video surveillance the typically large spatial extent of the region of interest and repetitive structures in the scene render the application of dense image matching a challenging task. In this paper we present an approach that derives strong prior knowledge from a planar approximation of the scene. This information is integrated into a graph-cut based image matching framework that treats the assignment of optimal disparity values as a labelling task. Introducing the planar prior heavily reduces ambiguities together with the search space and increases computational efficiency. The results provide a proof of concept of the proposed approach. It allows the reconstruction of dense point clouds in more general surveillance camera setups with wider stereo baselines.

  20. A Macro-Observation Scheme for Abnormal Event Detection in Daily-Life Video Sequences

    Directory of Open Access Journals (Sweden)

    Chiu Wei-Yao

    2010-01-01

    Full Text Available Abstract We propose a macro-observation scheme for abnormal event detection in daily life. The proposed macro-observation representation records the time-space energy of motions of all moving objects in a scene without segmenting individual object parts. The energy history of each pixel in the scene is instantly updated with exponential weights without explicitly specifying the duration of each activity. Since possible activities in daily life are numerous and distinct from each other and not all abnormal events can be foreseen, images from a video sequence that spans sufficient repetition of normal day-to-day activities are first randomly sampled. A constrained clustering model is proposed to partition the sampled images into groups. The new observed event that has distinct distance from any of the cluster centroids is then classified as an anomaly. The proposed method has been evaluated in daily work of a laboratory and BEHAVE benchmark dataset. The experimental results reveal that it can well detect abnormal events such as burglary and fighting as long as they last for a sufficient duration of time. The proposed method can be used as a support system for the scene that requires full time monitoring personnel.
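    Two steps in the abstract above lend themselves to a short sketch: updating a per-pixel energy history with exponential weights, and flagging an observation as an anomaly when it is far from every cluster centroid. The exact update rule, the distance measure, and the threshold are not given in the abstract, so the exponential moving average, the Euclidean distance, and every name below are assumptions made for illustration.

```python
import math
from typing import Sequence


def update_energy(energy: Sequence[float], motion: Sequence[float],
                  alpha: float = 0.5) -> list[float]:
    """Exponentially weighted update of a flattened per-pixel energy map.

    Assumed form: E_t = alpha * m_t + (1 - alpha) * E_{t-1}, so older motion
    decays geometrically without specifying any activity duration.
    """
    if len(energy) != len(motion):
        raise ValueError("energy and motion must have the same length")
    return [alpha * m + (1 - alpha) * e for e, m in zip(energy, motion)]


def is_anomaly(observation: Sequence[float],
               centroids: Sequence[Sequence[float]],
               threshold: float) -> bool:
    """Flag the observation if its nearest centroid is farther than threshold."""
    nearest = min(math.dist(observation, c) for c in centroids)
    return nearest > threshold
```

    In use, the energy map would be updated once per frame, and the resulting map compared against centroids learned from randomly sampled frames of normal activity.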

  1. Selective scene perception deficits in a case of topographical disorientation.

    Science.gov (United States)

    Robin, Jessica; Lowe, Matthew X; Pishdadian, Sara; Rivest, Josée; Cant, Jonathan S; Moscovitch, Morris

    2017-07-01

    Topographical disorientation (TD) is a neuropsychological condition characterized by an inability to find one's way, even in familiar environments. One common contributing cause of TD is landmark agnosia, a visual recognition impairment specific to scenes and landmarks. Although many cases of TD with landmark agnosia have been documented, little is known about the perceptual mechanisms which lead to selective deficits in recognizing scenes. In the present study, we test LH, a man who exhibits TD and landmark agnosia, on measures of scene perception that require selectively attending to either the configural or surface properties of a scene. Compared to healthy controls, LH demonstrates perceptual impairments when attending to the configuration of a scene, but not when attending to its surface properties, such as the pattern of the walls or whether the ground is sand or grass. In contrast, when focusing on objects instead of scenes, LH demonstrates intact perception of both geometric and surface properties. This study demonstrates that in a case of TD and landmark agnosia, the perceptual impairments are selective to the layout of scenes, providing insight into the mechanism of landmark agnosia and scene-selective perceptual processes. Copyright © 2017 Elsevier Ltd. All rights reserved.

  2. Feature-aware natural texture synthesis

    KAUST Repository

    Wu, Fuzhang; Dong, Weiming; Kong, Yan; Mei, Xing; Yan, Dongming; Zhang, Xiaopeng; Paul, Jean Claude

    2014-01-01

    This article presents a framework for natural texture synthesis and processing. This framework is motivated by the observation that given examples captured in natural scene, texture synthesis addresses a critical problem, namely, that synthesis

  3. Optimizing color reproduction of natural images

    NARCIS (Netherlands)

    Yendrikhovskij, S.N.; Blommaert, F.J.J.; Ridder, de H.

    1998-01-01

    The paper elaborates on understanding, measuring and optimizing perceived color quality of natural images. We introduce a model for optimal color reproduction of natural scenes which is based on the assumption that color quality of natural images is constrained by perceived naturalness and

  4. Crime Scenes as Augmented Reality

    DEFF Research Database (Denmark)

    Sandvik, Kjetil

    2010-01-01

    Using the concept of augmented reality, this article will investigate how places in various ways have become augmented by means of different mediatization strategies. Augmentation of reality implies an enhancement of the places' emotional character: a certain mood, atmosphere or narrative surplus......, physical damage: they are all readable and interpretable signs. As augmented reality the crime scene carries a narrative which at first is hidden and must be revealed. Due to the process of investigation and the detective's ability to reason and deduce, the crime scene as place is reconstructed as virtual...

  5. Semi-Supervised Multitask Learning for Scene Recognition.

    Science.gov (United States)

    Lu, Xiaoqiang; Li, Xuelong; Mou, Lichao

    2015-09-01

    Scene recognition has been widely studied to understand visual information from the level of objects and their relationships. Toward scene recognition, many methods have been proposed. They, however, encounter difficulty to improve the accuracy, mainly due to two limitations: 1) lack of analysis of intrinsic relationships across different scales, say, the initial input and its down-sampled versions and 2) existence of redundant features. This paper develops a semi-supervised learning mechanism to reduce the above two limitations. To address the first limitation, we propose a multitask model to integrate scene images of different resolutions. For the second limitation, we build a model of sparse feature selection-based manifold regularization (SFSMR) to select the optimal information and preserve the underlying manifold structure of data. SFSMR coordinates the advantages of sparse feature selection and manifold regulation. Finally, we link the multitask model and SFSMR, and propose the semi-supervised learning method to reduce the two limitations. Experimental results report the improvements of the accuracy in scene recognition.

  6. Examination of the Suicide Characteristics Based on the Scene Investigation in Capital Budapest (2009-2011).

    Science.gov (United States)

    Kristóf, István; Vörös, Krisztina; Marcsa, Boglárka; Váradi-T, Aletta; Kosztya, Sándor; Törő, Klára

    2015-09-01

    Medicolegal evaluation of postmortem findings at the death scene represents an important part of forensic medicine. The aim of this study was to investigate the occurrence and characteristics of suicide events. Data collection was performed from the police scene investigation reports in capital Budapest between 2009 and 2011. In this study, epidemiological parameters such as age, gender, time and place of death, postmortem changes, suicidal method, seasonal and daily distribution, natural diseases, earlier psychiatric treatment, socioeconomic risks, supposed cause of death, final notes, earlier suicide attempts, and suicide ideations were analyzed. There were 892 suicide cases (619 males, 273 females) detected in the investigated period. Hanging, overdose of prescription medications, jumping, use of firearms, drowning, and electrotrauma showed statistical differences among genders (p<0.05). The most common methods of suicide among men and women were hanging (57.4%) and overdose of prescription medications (33%), respectively. Death scene characteristics represent the important factors for forensic medicine. © 2015 American Academy of Forensic Sciences.

  7. Representation of Gravity-Aligned Scene Structure in Ventral Pathway Visual Cortex.

    Science.gov (United States)

    Vaziri, Siavash; Connor, Charles E

    2016-03-21

    The ventral visual pathway in humans and non-human primates is known to represent object information, including shape and identity [1]. Here, we show the ventral pathway also represents scene structure aligned with the gravitational reference frame in which objects move and interact. We analyzed shape tuning of recently described macaque monkey ventral pathway neurons that prefer scene-like stimuli to objects [2]. Individual neurons did not respond to a single shape class, but to a variety of scene elements that are typically aligned with gravity: large planes in the orientation range of ground surfaces under natural viewing conditions, planes in the orientation range of ceilings, and extended convex and concave edges in the orientation range of wall/floor/ceiling junctions. For a given neuron, these elements tended to share a common alignment in eye-centered coordinates. Thus, each neuron integrated information about multiple gravity-aligned structures as they would be seen from a specific eye and head orientation. This eclectic coding strategy provides only ambiguous information about individual structures but explicit information about the environmental reference frame and the orientation of gravity in egocentric coordinates. In the ventral pathway, this could support perceiving and/or predicting physical events involving objects subject to gravity, recognizing object attributes like animacy based on movement not caused by gravity, and/or stabilizing perception of the world against changes in head orientation [3-5]. Our results, like the recent discovery of object weight representation [6], imply that the ventral pathway is involved not just in recognition, but also in physical understanding of objects and scenes. Copyright © 2016 Elsevier Ltd. All rights reserved.

  8. Illusory control, gambling, and video gaming: an investigation of regular gamblers and video game players.

    Science.gov (United States)

    King, Daniel L; Ejova, Anastasia; Delfabbro, Paul H

    2012-09-01

    There is a paucity of empirical research examining the possible association between gambling and video game play. In two studies, we examined the association between video game playing, erroneous gambling cognitions, and risky gambling behaviour. One hundred and fifteen participants, including 65 electronic gambling machine (EGM) players and 50 regular video game players, were administered a questionnaire that examined video game play, gambling involvement, problem gambling, and beliefs about gambling. We then assessed each group's performance on a computerised gambling task that involved real money. A post-game survey examined perceptions of the skill and chance involved in the gambling task. The results showed that video game playing itself was not significantly associated with gambling involvement or problem gambling status. However, among those persons who both gambled and played video games, video game playing was uniquely and significantly positively associated with the perception of direct control over chance-based gambling events. Further research is needed to better understand the nature of this association, as it may assist in understanding the impact of emerging digital gambling technologies.

  9. Video stereolization: combining motion analysis with user interaction.

    Science.gov (United States)

    Liao, Miao; Gao, Jizhou; Yang, Ruigang; Gong, Minglun

    2012-07-01

    We present a semiautomatic system that converts conventional videos into stereoscopic videos by combining motion analysis with user interaction, aiming to transfer as much labeling work as possible from the user to the computer. In addition to the widely used structure from motion (SFM) techniques, we develop two new methods that analyze the optical flow to provide additional qualitative depth constraints. They remove the camera movement restriction imposed by SFM so that general motions can be used in scene depth estimation, the central problem in mono-to-stereo conversion. With these algorithms, the user's labeling task is significantly simplified. We further developed a quadratic programming approach to incorporate both quantitative depth and qualitative depth (such as that from user scribbling) to recover dense depth maps for all frames, from which stereoscopic views can be synthesized. In addition to visual results, we present user study results showing that our approach is more intuitive and less labor intensive, while producing a 3D effect comparable to that from current state-of-the-art interactive algorithms.

  10. On-scene crisis intervention: psychological guidelines and communication strategies for first responders.

    Science.gov (United States)

    Miller, Laurence

    2010-01-01

    Effective emergency mental health intervention for victims of crime, natural disaster or terrorism begins the moment the first responders arrive. This article describes a range of on-scene crisis intervention options, including verbal communication, body language, behavioral strategies, and interpersonal style. The correct intervention in the first few moments and hours of a crisis can profoundly influence the recovery course of victims and survivors of catastrophic events.

  11. Characterizing popularity dynamics of online videos

    Science.gov (United States)

    Ren, Zhuo-Ming; Shi, Yu-Qiang; Liao, Hao

    2016-07-01

    Online popularity has a major impact on videos, music, news and other contexts in online systems. Characterizing online popularity dynamics is a natural way to explain the observed properties in terms of the already acquired popularity of each individual. In this paper, we provide a quantitative, large-scale, temporal analysis of the popularity dynamics in two online video-providing websites, namely MovieLens and Netflix. The two collected data sets contain over 100 million records and span a decade. We show that the popularity dynamics of online videos evolve over time, and find that the dynamics of online video popularity can be characterized by burst behaviors, typically occurring early in the life span of a video, later settling into the classic preferential popularity increase mechanism.

  12. Political conservatism predicts asymmetries in emotional scene memory.

    Science.gov (United States)

    Mills, Mark; Gonzalez, Frank J; Giuseffi, Karl; Sievert, Benjamin; Smith, Kevin B; Hibbing, John R; Dodd, Michael D

    2016-06-01

    Variation in political ideology has been linked to differences in attention to and processing of emotional stimuli, with stronger responses to negative versus positive stimuli (negativity bias) the more politically conservative one is. As memory is enhanced by attention, such findings predict that memory for negative versus positive stimuli should similarly be enhanced the more conservative one is. The present study tests this prediction by having participants study 120 positive, negative, and neutral scenes in preparation for a subsequent memory test. On the memory test, the same 120 scenes were presented along with 120 new scenes and participants were to respond whether a scene was old or new. Results on the memory test showed that negative scenes were more likely to be remembered than positive scenes, though, this was true only for political conservatives. That is, a larger negativity bias was found the more conservative one was. The effect was sizeable, explaining 45% of the variance across subjects in the effect of emotion. These findings demonstrate that the relationship between political ideology and asymmetries in emotion processing extend to memory and, furthermore, suggest that exploring the extent to which subject variation in interactions among emotion, attention, and memory is predicted by conservatism may provide new insights into theories of political ideology. Published by Elsevier B.V.

  13. Affective salience can reverse the effects of stimulus-driven salience on eye movements in complex scenes

    Directory of Open Access Journals (Sweden)

    Yaqing Niu

    2012-09-01

    Full Text Available In natural vision both stimulus features and cognitive/affective factors influence an observer's attention. However, the relationship between stimulus-driven (bottom-up) and cognitive/affective (top-down) factors remains controversial: Can affective salience counteract strong visual stimulus signals and shift attention allocation irrespective of bottom-up features? Is there any difference between negative and positive scenes in terms of their influence on attention deployment? Here we examined the impact of affective factors on eye movement behavior, to understand the competition between visual stimulus-driven salience and affective salience and how they affect gaze allocation in complex scene viewing. Building on our previous research, we compared predictions generated by a visual salience model with measures indexing participant-identified emotionally meaningful regions of each image. To examine how eye movement behaviour differs for negative, positive, and neutral scenes, we examined the influence of affective salience in capturing attention according to emotional valence. Taken together, our results show that affective salience can override stimulus-driven salience and overall emotional valence can determine attention allocation in complex scenes. These findings are consistent with the hypothesis that cognitive/affective factors play a dominant role in active gaze control.

  14. Being There: (Re)Making the Assessment Scene

    Science.gov (United States)

    Gallagher, Chris W.

    2011-01-01

    I use Burkean analysis to show how neoliberalism undermines faculty assessment expertise and underwrites testing industry expertise in the current assessment scene. Contending that we cannot extricate ourselves from our limited agency in this scene until we abandon the familiar "stakeholder" theory of power, I propose a rewriting of the…

  15. HEP visualization and video technology

    International Nuclear Information System (INIS)

    Lebrun, P.; Swoboda, D.

    1994-01-01

    The use of scientific visualization for HEP analysis is briefly reviewed. The applications are highly interactive and very dynamic in nature. At Fermilab, E687, in collaboration with Visual Media Services, has produced a 1/2 hour video tape demonstrating the capability of SGI-EXPLORER applied to a Dalitz Analysis of Charm decay. This short contribution describes the authors' experience with visualization and video technologies.

  16. Does excessive play of violent first-person-shooter-video-games dampen brain activity in response to emotional stimuli?

    Science.gov (United States)

    Montag, Christian; Weber, Bernd; Trautner, Peter; Newport, Beate; Markett, Sebastian; Walter, Nora T; Felten, Andrea; Reuter, Martin

    2012-01-01

    The present case-control study investigated the processing of emotional pictures in excessive first-person-shooter-video-players and control persons. All participants of the fMRI experiment were confronted with pictures from four categories including pleasant, unpleasant, neutral content and pictures from the first-person-shooter-video-game 'Counterstrike'. Compared to controls, gamers showed a significantly lower activation of the left lateral medial frontal lobe while processing negative emotions. Another interesting finding of the study is the higher activation of frontal and temporal brain areas in gamers when processing screen-shots from the first-person-shooter-video-game 'Counterstrike'. Higher brain activity in the lateral prefrontal cortex could represent a protection mechanism against experiencing negative emotions by down-regulating limbic brain activity. Due to frequent confrontation with violent scenes, the first-person-shooter-video-gamers might have habituated to the effects of unpleasant stimuli, resulting in lower brain activation. Individual differences in brain activations of the contrast Counterstrike>neutral pictures potentially resemble the activation of action-scripts related to the video-game. Copyright © 2011 Elsevier B.V. All rights reserved.

  17. Focal-plane change triggered video compression for low-power vision sensor systems.

    Directory of Open Access Journals (Sweden)

    Yu M Chi

    Full Text Available Video sensors with embedded compression offer significant energy savings in transmission but incur energy losses in the complexity of the encoder. Energy efficient video compression architectures for CMOS image sensors with focal-plane change detection are presented and analyzed. The compression architectures use pixel-level computational circuits to minimize energy usage by selectively processing only pixels which generate significant temporal intensity changes. Using the temporal intensity change detection to gate the operation of a differential DCT based encoder achieves nearly identical image quality to traditional systems (4 dB decrease in PSNR) while reducing the amount of data that is processed by 67% and reducing overall power consumption by 51%. These typical energy savings, resulting from the sparsity of motion activity in the visual scene, demonstrate the utility of focal-plane change triggered compression to surveillance vision systems.
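
    The gating idea in this abstract can be sketched in software as follows. This is a minimal illustration under stated assumptions, not the pixel-level circuit or the differential DCT encoder from the paper: the function names `changed_blocks` and `gated_update`, the block size, and the threshold are hypothetical, and frames are assumed to be flat lists of grayscale intensities.

```python
# Illustrative sketch only: change-gated block processing on flat grayscale
# frames. All names and default values here are hypothetical.

def changed_blocks(prev, curr, block_size=4, threshold=10):
    """Return indices of blocks containing at least one pixel whose
    intensity changed by more than `threshold` since the previous frame."""
    active = []
    for start in range(0, len(curr), block_size):
        block_prev = prev[start:start + block_size]
        block_curr = curr[start:start + block_size]
        if any(abs(c - p) > threshold for p, c in zip(block_prev, block_curr)):
            active.append(start // block_size)
    return active


def gated_update(reference, curr, active, block_size=4):
    """Copy only the active blocks of `curr` into a copy of `reference`.
    Inactive blocks are left untouched, i.e. they are never processed."""
    out = list(reference)
    for idx in active:
        start = idx * block_size
        out[start:start + block_size] = curr[start:start + block_size]
    return out


prev_frame = [0] * 8
curr_frame = [0, 0, 0, 0, 0, 50, 0, 0]
active = changed_blocks(prev_frame, curr_frame)
processed_fraction = len(active) / (len(curr_frame) // 4)
```

    In this toy case only one of the two blocks is marked active, so half of the frame is skipped; a real encoder would run its transform stage on the active blocks instead of copying them.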

  18. Extracting 3d Semantic Information from Video Surveillance System Using Deep Learning

    Science.gov (United States)

    Zhang, J. S.; Cao, J.; Mao, B.; Shen, D. Q.

    2018-04-01

    At present, intelligent video analysis technology has been widely used in various fields. Object tracking is one of the important parts of intelligent video surveillance, but traditional target tracking technology based on the pixel coordinate system in images still has some unavoidable problems. Pixel-based target tracking cannot reflect the real position information of targets, and it is difficult to track objects across scenes. Based on an analysis of Zhengyou Zhang's camera calibration method, this paper presents a method of target tracking based on the target's space coordinate system after converting the 2-D coordinates of the target into 3-D coordinates. The experimental results show that our method can restore the real position change information of targets well, and can also accurately obtain the trajectory of the target in space.
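
    One common way to turn a pixel position into a world position after calibration is sketched below. The abstract does not give the authors' exact conversion, so this rests on two assumptions: the target stands on a ground plane (Z = 0), and calibration has already produced a 3x3 homography `H` that maps ground-plane coordinates to pixels. The names `invert_3x3` and `pixel_to_ground` are hypothetical.

```python
# Illustrative sketch only: back-projecting a pixel onto the ground plane
# with a known plane-to-image homography H (assumed given by calibration).

def invert_3x3(m):
    """Invert a 3x3 matrix using the adjugate formula."""
    (a, b, c), (d, e, f), (g, h, i) = m
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    if det == 0:
        raise ValueError("homography is singular")
    return [
        [(e * i - f * h) / det, (c * h - b * i) / det, (b * f - c * e) / det],
        [(f * g - d * i) / det, (a * i - c * g) / det, (c * d - a * f) / det],
        [(d * h - e * g) / det, (b * g - a * h) / det, (a * e - b * d) / det],
    ]


def pixel_to_ground(H, u, v):
    """Map pixel (u, v) to world coordinates (X, Y, 0) on the ground plane."""
    inv = invert_3x3(H)
    x = inv[0][0] * u + inv[0][1] * v + inv[0][2]
    y = inv[1][0] * u + inv[1][1] * v + inv[1][2]
    w = inv[2][0] * u + inv[2][1] * v + inv[2][2]
    return (x / w, y / w, 0.0)


# Toy homography: u = 2X + 10, v = 2Y + 20.
H = [[2, 0, 10], [0, 2, 20], [0, 0, 1]]
world = pixel_to_ground(H, 14, 26)
```

    Tracking the returned (X, Y) values over time gives a trajectory in world units rather than pixels, which is what makes tracks comparable across cameras.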

  19. TACKLING EVENT DETECTION IN THE CONTEXT OF VIDEO SURVEILLANCE

    Directory of Open Access Journals (Sweden)

    Raducu DUMITRESCU

    2011-11-01

    Full Text Available In this paper we address the problem of event detection in the context of video surveillance systems. First we deal with background extraction. Three methods are tested, namely frame differencing, running average, and an approximation of the median filtering technique. This provides information about changing contents. Further, we use this information to address human presence detection in the scene. This is carried out through a contour-based approach. Contours are extracted from moving regions and parameterized. Human silhouettes show particular signatures of these parameters. Experimental results prove the potential of this approach to event detection. However, these are our first preliminary results for this application.
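
    The three background extraction methods named in this abstract can be sketched as below. This is a generic textbook-style illustration, not the authors' implementation: frames are assumed to be flat lists of grayscale intensities, and the function names, the threshold, the learning rate `alpha`, and the step size are hypothetical.

```python
# Illustrative sketch only: three classic background estimators on flat
# grayscale frames. Names and default parameters are hypothetical.

def frame_difference(prev, curr, threshold=25):
    """Mark pixels whose intensity changed by more than `threshold`
    between two consecutive frames."""
    return [abs(c - p) > threshold for p, c in zip(prev, curr)]


def running_average(background, frame, alpha=0.05):
    """Blend the new frame into the background model with weight `alpha`."""
    return [(1 - alpha) * b + alpha * f for b, f in zip(background, frame)]


def approximate_median(background, frame, step=1):
    """Move each background pixel one `step` toward the current frame.
    Repeated over many frames, this tends toward the temporal median."""
    updated = []
    for b, f in zip(background, frame):
        if f > b:
            updated.append(b + step)
        elif f < b:
            updated.append(b - step)
        else:
            updated.append(b)
    return updated


def foreground_mask(background, frame, threshold=25):
    """Pixels that differ strongly from the background model."""
    return [abs(f - b) > threshold for b, f in zip(background, frame)]
```

    The moving regions in `foreground_mask` are the kind of input a contour-based silhouette analysis would then work on.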

  20. Saliency-Guided Detection of Unknown Objects in RGB-D Indoor Scenes.

    Science.gov (United States)

    Bao, Jiatong; Jia, Yunyi; Cheng, Yu; Xi, Ning

    2015-08-27

    This paper studies the problem of detecting unknown objects within indoor environments in an active and natural manner. The visual saliency scheme utilizing both color and depth cues is proposed to arouse the interests of the machine system for detecting unknown objects at salient positions in a 3D scene. The 3D points at the salient positions are selected as seed points for generating object hypotheses using the 3D shape. We perform multi-class labeling on a Markov random field (MRF) over the voxels of the 3D scene, combining cues from object hypotheses and 3D shape. The results from MRF are further refined by merging the labeled objects, which are spatially connected and have high correlation between color histograms. Quantitative and qualitative evaluations on two benchmark RGB-D datasets illustrate the advantages of the proposed method. The experiments of object detection and manipulation performed on a mobile manipulator validate its effectiveness and practicability in robotic applications.

  1. Scene text detection by leveraging multi-channel information and local context

    Science.gov (United States)

    Wang, Runmin; Qian, Shengyou; Yang, Jianfeng; Gao, Changxin

    2018-03-01

    As an important information carrier, texts play significant roles in many applications. However, text detection in unconstrained scenes is a challenging problem due to cluttered backgrounds, various appearances, uneven illumination, etc. In this paper, an approach based on multi-channel information and local context is proposed to detect texts in natural scenes. Because character candidate detection plays a vital role in a text detection system, Maximally Stable Extremal Regions (MSERs) and a graph-cut based method are integrated to obtain the character candidates by leveraging the multi-channel image information. A cascaded false positive elimination mechanism is constructed from the perspective of the character and the text line, respectively. Since the local context information is very valuable, this information is utilized to retrieve the missing characters and boost the text detection performance. Experimental results on two benchmark datasets, i.e., the ICDAR 2011 dataset and the ICDAR 2013 dataset, demonstrate that the proposed method has achieved state-of-the-art performance.

  2. Construction and Optimization of Three-Dimensional Disaster Scenes within Mobile Virtual Reality

    Directory of Open Access Journals (Sweden)

    Ya Hu

    2018-06-01

    Full Text Available Because mobile virtual reality (VR) is both mobile and immersive, three-dimensional (3D) visualizations of disaster scenes based in mobile VR enable users to perceive and recognize disaster environments faster and better than is possible with other methods. To achieve immersion and prevent users from feeling dizzy, such visualizations require a high scene-rendering frame rate. However, the existing related visualization work cannot provide a sufficient solution for this purpose. This study focuses on the construction and optimization of a 3D disaster scene in order to satisfy the high frame-rate requirements for the rendering of 3D disaster scenes in mobile VR. First, the design of a plugin-free browser/server (B/S) architecture for 3D disaster scene construction and visualization based in mobile VR is presented. Second, certain key technologies for scene optimization are discussed, including diverse modes of scene data representation, representation optimization of mobile scenes, and adaptive scheduling of mobile scenes. By means of these technologies, smartphones with various performance levels can achieve higher scene-rendering frame rates and improved visual quality. Finally, using a flood disaster as an example, a plugin-free prototype system was developed, and experiments were conducted. The experimental results demonstrate that a 3D disaster scene constructed via the methods addressed in this study has a sufficiently high scene-rendering frame rate to satisfy the requirements for rendering a 3D disaster scene in mobile VR.

  3. Sex differences in the brain response to affective scenes with or without humans.

    Science.gov (United States)

    Proverbio, Alice Mado; Adorni, Roberta; Zani, Alberto; Trestianu, Laura

    2009-10-01

    Recent findings have demonstrated that women might be more reactive than men to viewing painful stimuli (vicarious response to pain), and therefore more empathic [Han, S., Fan, Y., & Mao, L. (2008). Gender difference in empathy for pain: An electrophysiological investigation. Brain Research, 1196, 85-93]. We investigated whether the two sexes differed in their cerebral responses to affective pictures portraying humans in different positive or negative contexts compared to natural or urban scenarios. 440 IAPS slides were presented to 24 Italian students (12 women and 12 men). Half the pictures displayed humans while the remaining scenes lacked visible persons. ERPs were recorded from 128 electrodes and swLORETA (standardized weighted Low-Resolution Electromagnetic Tomography) source reconstruction was performed. Occipital P115 was greater in response to persons than to scenes and was affected by the emotional valence of the human pictures. This suggests that processing of biologically relevant stimuli is prioritized. Orbitofrontal N2 was greater in response to positive than negative human pictures in women but not in men, and not to scenes. A late positivity (LP) to suffering humans far exceeded the response to negative scenes in women but not in men. In both sexes, the contrast suffering-minus-happy humans revealed a difference in the activation of the occipito/temporal, right occipital (BA19), bilateral parahippocampal, left dorsal prefrontal cortex (DPFC) and left amygdala. However, increased right amygdala and right frontal area activities were observed only in women. The humans-minus-scenes contrast revealed a difference in the activation of the middle occipital gyrus (MOG) in men, and of the left inferior parietal (BA40), left superior temporal gyrus (STG, BA38) and right cingulate (BA31) in women (270-290 ms). These data indicate a sex-related difference in the brain response to humans, possibly supporting human empathy.

  4. A statistical model for radar images of agricultural scenes

    Science.gov (United States)

    Frost, V. S.; Shanmugan, K. S.; Holtzman, J. C.; Stiles, J. A.

    1982-01-01

    The presently derived and validated statistical model for radar images containing many different homogeneous fields predicts the probability density functions of radar images of entire agricultural scenes, thereby allowing histograms of large scenes composed of a variety of crops to be described. Seasat-A SAR images of agricultural scenes are accurately predicted by the model on the basis of three assumptions: each field has the same SNR, all target classes cover approximately the same area, and the true reflectivity characterizing each individual target class is a uniformly distributed random variable. The model is expected to be useful in the design of data processing algorithms and for scene analysis using radar images.

  5. The neural bases of spatial frequency processing during scene perception

    Science.gov (United States)

    Kauffmann, Louise; Ramanoël, Stephen; Peyrin, Carole

    2014-01-01

    Theories on visual perception agree that scenes are processed in terms of spatial frequencies. Low spatial frequencies (LSF) carry coarse information whereas high spatial frequencies (HSF) carry fine details of the scene. However, how and where spatial frequencies are processed within the brain remain unresolved questions. The present review addresses these issues and aims to identify the cerebral regions differentially involved in low and high spatial frequency processing, and to clarify their attributes during scene perception. Results from a number of behavioral and neuroimaging studies suggest that spatial frequency processing is lateralized in both hemispheres, with the right and left hemispheres predominantly involved in the categorization of LSF and HSF scenes, respectively. There is also evidence that spatial frequency processing is retinotopically mapped in the visual cortex. HSF scenes (as opposed to LSF) activate occipital areas in relation to foveal representations, while categorization of LSF scenes (as opposed to HSF) activates occipital areas in relation to more peripheral representations. Concomitantly, a number of studies have demonstrated that LSF information may reach high-order areas rapidly, allowing an initial coarse parsing of the visual scene, which could then be sent back through feedback into the occipito-temporal cortex to guide finer HSF-based analysis. Finally, the review addresses spatial frequency processing within scene-selective regions of the occipito-temporal cortex. PMID:24847226
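
    The LSF/HSF distinction can be made concrete with a small filtering sketch. It is only an illustration of the coarse-versus-fine split: the studies reviewed typically filter images in the Fourier domain with specified cutoffs, whereas this example assumes a one-dimensional luminance profile and a simple moving-average low-pass filter. The function names and the `radius` parameter are hypothetical.

```python
# Illustrative sketch only: splitting a 1-D luminance profile into a coarse
# (low spatial frequency) part and a fine-detail (high frequency) residual.

def low_pass(signal, radius=2):
    """Moving average: keeps slow variations, smooths away fine detail."""
    n = len(signal)
    smoothed = []
    for i in range(n):
        window = signal[max(0, i - radius):min(n, i + radius + 1)]
        smoothed.append(sum(window) / len(window))
    return smoothed


def high_pass(signal, radius=2):
    """Residual after removing the low-pass component: edges and detail."""
    return [s - l for s, l in zip(signal, low_pass(signal, radius))]


profile = [10, 10, 10, 10, 90, 90, 90, 90]  # a single luminance edge
lsf = low_pass(profile)
hsf = high_pass(profile)
```

    Adding `lsf` and `hsf` element by element recovers the original profile, which is why the two versions can be treated as complementary views of the same scene.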

  6. Sustained change blindness to incremental scene rotation: a dissociation between explicit change detection and visual memory.

    Science.gov (United States)

    Hollingworth, Andrew; Henderson, John M

    2004-07-01

    In a change detection paradigm, the global orientation of a natural scene was incrementally changed in 1 degree intervals. In Experiments 1 and 2, participants demonstrated sustained change blindness to incremental rotation, often coming to consider a significantly different scene viewpoint as an unchanged continuation of the original view. Experiment 3 showed that participants who failed to detect the incremental rotation nevertheless reliably detected a single-step rotation back to the initial view. Together, these results demonstrate an important dissociation between explicit change detection and visual memory. Following a change, visual memory is updated to reflect the changed state of the environment, even if the change was not detected.

  7. Semi-automatic scene generation using the Digital Anatomist Foundational Model.

    Science.gov (United States)

    Wong, B A; Rosse, C; Brinkley, J F

    1999-01-01

    A recent survey shows that a major impediment to more widespread use of computers in anatomy education is the inability to directly manipulate 3-D models, and to relate these to corresponding textual information. In the University of Washington Digital Anatomist Project we have developed a prototype Web-based scene generation program that combines the symbolic Foundational Model of Anatomy with 3-D models. A Web user can browse the Foundational Model (FM), then click to request that a 3-D scene be created of an object and its parts or branches. The scene is rendered by a graphics server, and a snapshot is sent to the Web client. The user can then manipulate the scene, adding new structures, deleting structures, rotating the scene, zooming, and saving the scene as a VRML file. Applications such as this, when fully realized with fast rendering and more anatomical content, have the potential to significantly change the way computers are used in anatomy education.

  8. Age-related changes in perception of movement in driving scenes.

    Science.gov (United States)

    Lacherez, Philippe; Turner, Laura; Lester, Robert; Burns, Zoe; Wood, Joanne M

    2014-07-01

    Age-related changes in motion sensitivity have been found to relate to reductions in various indices of driving performance and safety. The aim of this study was to investigate the basis of this relationship in terms of determining which aspects of motion perception are most relevant to driving. Participants included 61 regular drivers (age range 22-87 years). Visual performance was measured binocularly. Measures included visual acuity, contrast sensitivity and motion sensitivity assessed using four different approaches: (1) threshold minimum drift rate for a drifting Gabor patch, (2) Dmin from a random dot display, (3) threshold coherence from a random dot display, and (4) threshold drift rate for a second-order (contrast modulated) sinusoidal grating. Participants then completed the Hazard Perception Test (HPT) in which they were required to identify moving hazards in videos of real driving scenes, and also a Direction of Heading task (DOH) in which they identified deviations from normal lane keeping in brief videos of driving filmed from the interior of a vehicle. In bivariate correlation analyses, all motion sensitivity measures significantly declined with age. Motion coherence thresholds, and minimum drift rate threshold for the first-order stimulus (Gabor patch) both significantly predicted HPT performance even after controlling for age, visual acuity and contrast sensitivity. Bootstrap mediation analysis showed that individual differences in DOH accuracy partly explained these relationships, where those individuals with poorer motion sensitivity on the coherence and Gabor tests showed decreased ability to perceive deviations in motion in the driving videos, which related in turn to their ability to detect the moving hazards. The ability to detect subtle movements in the driving environment (as determined by the DOH task) may be an important contributor to effective hazard perception, and is associated with age, and an individual's performance on tests of

  9. Visual search for changes in scenes creates long-term, incidental memory traces.

    Science.gov (United States)

    Utochkin, Igor S; Wolfe, Jeremy M

    2018-05-01

    Humans are very good at remembering large numbers of scenes over substantial periods of time. But how good are they at remembering changes to scenes? In this study, we tested scene memory and change detection two weeks after initial scene learning. In Experiments 1-3, scenes were learned incidentally during visual search for change. In Experiment 4, observers explicitly memorized scenes. At test, after two weeks observers were asked to discriminate old from new scenes, to recall a change that they had detected in the study phase, or to detect a newly introduced change in the memorization experiment. Next, they performed a change detection task, usually looking for the same change as in the study period. Scene recognition memory was found to be similar in all experiments, regardless of the study task. In Experiment 1, more difficult change detection produced better scene memory. Experiments 2 and 3 supported a "depth-of-processing" account for the effects of initial search and change detection on incidental memory for scenes. Of most interest, change detection was faster during the test phase than during the study phase, even when the observer had no explicit memory of having found that change previously. This result was replicated in two of our three change detection experiments. We conclude that scenes can be encoded incidentally as well as explicitly and that changes in those scenes can leave measurable traces even if they are not explicitly recalled.

  10. Scene reassembly after multimodal digitization and pipeline evaluation using photorealistic rendering

    DEFF Research Database (Denmark)

    Stets, Jonathan Dyssel; Dal Corso, Alessandro; Nielsen, Jannik Boll

    2017-01-01

    Transparent objects require acquisition modalities that are very different from the ones used for objects with more diffuse reflectance properties. Digitizing a scene where objects must be acquired with different modalities requires scene reassembly after reconstruction of the object surfaces. This reassembly of a scene that was picked apart for scanning seems unexplored. We contribute with a multimodal digitization pipeline for scenes that require this step of reassembly. Our pipeline includes measurement of bidirectional reflectance distribution functions and high dynamic range imaging ... of the lighting environment. This enables pixelwise comparison of photographs of the real scene with renderings of the digital version of the scene. Such quantitative evaluation is useful for verifying acquired material appearance and reconstructed surface geometry, which is an important aspect of digital content ...

  11. Dynamic Frames Based Generation of 3D Scenes and Applications

    Directory of Open Access Journals (Sweden)

    Danijel Radošević

    2015-05-01

    Full Text Available Modern graphic/programming tools like Unity enable the creation of 3D scenes as well as 3D scene based program applications, including a full physical model, motion, sounds, lighting effects etc. This paper deals with the use of a dynamic frames based generator in the automatic generation of a 3D scene and related source code. The suggested model makes it possible to specify features of the 3D scene in the form of a textual specification, as well as to export such features from a 3D tool. This approach enables a higher level of code generation flexibility and the reusability of the main code and scene artifacts in the form of textual templates. An example of the generated application is presented and discussed.

  12. Visual search in scenes involves selective and non-selective pathways

    Science.gov (United States)

    Wolfe, Jeremy M; Vo, Melissa L-H; Evans, Karla K; Greene, Michelle R

    2010-01-01

    How do we find objects in scenes? For decades, visual search models have been built on experiments in which observers search for targets, presented among distractor items, isolated and randomly arranged on blank backgrounds. Are these models relevant to search in continuous scenes? This paper argues that the mechanisms that govern artificial, laboratory search tasks do play a role in visual search in scenes. However, scene-based information is used to guide search in ways that had no place in earlier models. Search in scenes may be best explained by a dual-path model: A “selective” path in which candidate objects must be individually selected for recognition and a “non-selective” path in which information can be extracted from global / statistical information. PMID:21227734

  13. Distribution of light in the human retina under natural viewing conditions

    Science.gov (United States)

    Gibert, Jorge C.

    Age-related macular degeneration (AMD) is the leading cause of blindness in America. The fact that AMD wreaks most of the damage in the center of the retina raises the question of whether light, integrated over long periods, is more concentrated in the macula. A method, based on eye-tracking, was developed to measure the distribution of light in the retina under natural viewing conditions. The hypothesis was that integrated over time, retinal illumination peaked in the macula. Additionally a possible relationship between age and retinal illumination was investigated. The eye tracker superimposed the subject's gaze position on a video recorded by a scene camera. Five informed subjects were employed in feasibility tests, and 58 naive subjects participated in 5 phases. In phase 1 the subjects viewed a gray-scale image. In phase 2, they observed a sequence of photographic images. In phase 3 they viewed a video. In phase 4, they worked on a computer; in phase 5, the subjects walked around freely. The informed subjects were instructed to gaze at bright objects in the field of view and then at dark objects. Naive subjects were allowed to gaze freely for all phases. Using the subject's gaze coordinates, and the video provided by the scene camera, the cumulative light distribution on the retina was calculated for ~15° around the fovea. As expected for control subjects, cumulative retinal light distributions peaked and dipped in the fovea when they gazed at bright or dark objects respectively. The light distribution maps obtained from the naive subjects presented a tendency to peak in the macula for phases 1, 2, and 3, a consistent tendency in phase 4 and a variable tendency in phase 5. The feasibility of using an eye-tracker system to measure the distribution of light in the retina was demonstrated, thus helping to understand the role played by light exposure in the etiology of AMD. Results showed that a tendency for light to peak in the macula is a characteristic of some
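
    The accumulation step described in this abstract (gaze coordinates plus scene-camera video yielding a cumulative light map around the fovea) might look roughly like the sketch below. It is a simplified illustration under several assumptions not taken from the source: frames are 2-D lists of luminance values, gaze positions are integer pixel coordinates, the window is measured in pixels rather than degrees, and optical effects such as image inversion on the retina are ignored. The function name `accumulate_retinal_light` is hypothetical.

```python
# Illustrative sketch only: summing scene luminance in a gaze-centered
# window across frames. The center cell of the result stands for the fovea.

def accumulate_retinal_light(frames, gazes, radius):
    """Return a (2*radius+1) square map of luminance summed over all
    frames, where each frame is sampled relative to its gaze point."""
    size = 2 * radius + 1
    acc = [[0.0] * size for _ in range(size)]
    for frame, (gx, gy) in zip(frames, gazes):
        height, width = len(frame), len(frame[0])
        for dy in range(-radius, radius + 1):
            for dx in range(-radius, radius + 1):
                x, y = gx + dx, gy + dy
                # Samples falling outside the camera image contribute nothing.
                if 0 <= x < width and 0 <= y < height:
                    acc[dy + radius][dx + radius] += frame[y][x]
    return acc


frame_a = [[0, 0, 0], [0, 9, 0], [0, 0, 0]]  # bright spot at (1, 1)
frame_b = [[9, 0, 0], [0, 0, 0], [0, 0, 0]]  # bright spot at (0, 0)
# The observer looks straight at the bright spot in both frames.
light_map = accumulate_retinal_light([frame_a, frame_b], [(1, 1), (0, 0)], 1)
```

    Because the gaze follows the bright spot in both toy frames, all of the accumulated light ends up in the center cell, mirroring the foveal peak reported for subjects instructed to look at bright objects.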

  14. Bring It to the Pitch: Combining Video and Movement Data to Enhance Team Sport Analysis.

    Science.gov (United States)

    Stein, Manuel; Janetzko, Halldor; Lamprecht, Andreas; Breitkreutz, Thorsten; Zimmermann, Philipp; Goldlucke, Bastian; Schreck, Tobias; Andrienko, Gennady; Grossniklaus, Michael; Keim, Daniel A

    2018-01-01

    Analysts in professional team sport regularly perform analysis to gain strategic and tactical insights into player and team behavior. Goals of team sport analysis regularly include identification of weaknesses of opposing teams, or assessing performance and improvement potential of a coached team. Current analysis workflows are typically based on the analysis of team videos. Also, analysts can rely on techniques from Information Visualization, to depict e.g., player or ball trajectories. However, video analysis is typically a time-consuming process, where the analyst needs to memorize and annotate scenes. In contrast, visualization typically relies on an abstract data model, often using abstract visual mappings, and is not directly linked to the observed movement context anymore. We propose a visual analytics system that tightly integrates team sport video recordings with abstract visualization of underlying trajectory data. We apply appropriate computer vision techniques to extract trajectory data from video input. Furthermore, we apply advanced trajectory and movement analysis techniques to derive relevant team sport analytic measures for region, event and player analysis in the case of soccer analysis. Our system seamlessly integrates video and visualization modalities, enabling analysts to draw on the advantages of both analysis forms. Several expert studies conducted with team sport analysts indicate the effectiveness of our integrated approach.

  15. Radiative transfer model for heterogeneous 3-D scenes

    Science.gov (United States)

    Kimes, D. S.; Kirchner, J. A.

    1982-01-01

    A general mathematical framework for simulating processes in heterogeneous 3-D scenes is presented. Specifically, a model was designed and coded for application to radiative transfers in vegetative scenes. The model is unique in that it predicts (1) the directional spectral reflectance factors as a function of the sensor's azimuth and zenith angles and the sensor's position above the canopy, (2) the spectral absorption as a function of location within the scene, and (3) the directional spectral radiance as a function of the sensor's location within the scene. The model was shown to follow known physical principles of radiative transfer. Initial verification of the model as applied to a soybean row crop showed that the simulated directional reflectance data corresponded relatively well in gross trends to the measured data. However, the model can be greatly improved by incorporating more sophisticated and realistic anisotropic scattering algorithms.

  16. Modified-hybrid optical neural network filter for multiple object recognition within cluttered scenes

    Science.gov (United States)

    Kypraios, Ioannis; Young, Rupert C. D.; Chatwin, Chris R.

    2009-08-01

    Motivated by the non-linear interpolation and generalization abilities of the hybrid optical neural network filter between the reference and non-reference images of the true-class object, we designed the modified-hybrid optical neural network filter. We applied an optical mask to the hybrid optical neural network's filter input. The mask was built with the constant weight connections of a randomly chosen image included in the training set. The resulting design of the modified-hybrid optical neural network filter is optimized for performing best in cluttered scenes of the true-class object. Due to the shift invariance properties inherited from its correlator unit, the filter can accommodate multiple objects of the same class to be detected within an input cluttered image. Additionally, the architecture of the neural network unit of the general hybrid optical neural network filter allows the recognition of multiple objects of different classes within the input cluttered image by modifying the output layer of the unit. We test the modified-hybrid optical neural network filter for recognition of multiple objects of the same and of different classes within cluttered input images and video sequences of cluttered scenes. The filter is shown to exhibit, with a single pass over the input data, simultaneous out-of-plane rotation invariance, shift invariance and good clutter tolerance. It is able to successfully detect and classify correctly the true-class objects within background clutter for which there has been no previous training.

  17. Hierarchy-associated semantic-rule inference framework for classifying indoor scenes

    Science.gov (United States)

    Yu, Dan; Liu, Peng; Ye, Zhipeng; Tang, Xianglong; Zhao, Wei

    2016-03-01

    Typically, the initial task of classifying indoor scenes is challenging, because the spatial layout and decoration of a scene can vary considerably. Recent efforts at classifying object relationships commonly depend on the results of scene annotation and predefined rules, making classification inflexible. Furthermore, annotation results are easily affected by external factors. Inspired by human cognition, a scene-classification framework was proposed using the empirically based annotation (EBA) and a match-over rule-based (MRB) inference system. The semantic hierarchy of images is exploited by EBA to construct rules empirically for MRB classification. The problem of scene classification is divided into low-level annotation and high-level inference from a macro perspective. Low-level annotation involves detecting the semantic hierarchy and annotating the scene with a deformable-parts model and a bag-of-visual-words model. In high-level inference, hierarchical rules are extracted to train the decision tree for classification. The categories of testing samples are generated from the parts to the whole. Compared with traditional classification strategies, the proposed semantic hierarchy and corresponding rules reduce the effect of a variable background and improve the classification performance. The proposed framework was evaluated on a popular indoor scene dataset, and the experimental results demonstrate its effectiveness.

  18. Cognitive organization of roadway scenes : an empirical study.

    NARCIS (Netherlands)

    Gundy, C.M.

    1995-01-01

    This report describes six studies investigating the cognitive organization of roadway scenes. These scenes were represented by still photographs taken on a number of roads outside of built-up areas. Seventy-eight drivers, stratified by age and sex to simulate the Dutch driving population,

  19. A Flexible Object-of-Interest Annotation Framework for Online Video Portals

    Directory of Open Access Journals (Sweden)

    Robert Sorschag

    2012-02-01

    Full Text Available In this work, we address the use of object recognition techniques to annotate what is shown where in online video collections. These annotations are suitable for retrieving specific video scenes for object-related text queries, which is not possible with the manually generated metadata that is used by current portals. We are not the first to present object annotations that are generated with content-based analysis methods. However, the proposed framework possesses some outstanding features that offer good prospects for its application in real video portals. Firstly, it can be easily used as a background module in any video environment. Secondly, it is not based on a fixed analysis chain but on an extensive recognition infrastructure that can be used with all kinds of visual features, matching and machine learning techniques. New recognition approaches can be integrated into this infrastructure with low development costs, and the recognition approaches in use can be reconfigured even on a running system. Thus, this framework might also benefit from future advances in computer vision. Thirdly, we present an automatic selection approach to support the use of different recognition strategies for different objects. Last but not least, visual analysis can be performed efficiently on distributed, multi-processor environments, and a database schema is presented to store the resulting video annotations as well as the off-line generated low-level features in a compact form. We achieve promising results in an annotation case study and the instance search task of the TRECVID 2011 challenge.

  20. Performance Benefits with Scene-Linked HUD Symbology: An Attentional Phenomenon?

    Science.gov (United States)

    Levy, Jonathan L.; Foyle, David C.; McCann, Robert S.; Null, Cynthia H. (Technical Monitor)

    1999-01-01

    Previous research has shown that in a simulated flight task, navigating a path defined by ground markers while maintaining a target altitude is more accurate when an altitude indicator appears in a virtual "scene-linked" format (projected symbology moving as if it were part of the out-the-window environment) compared to the fixed-location, superimposed format found on present-day HUDs (Foyle, McCann & Shelden, 1995). One explanation of the scene-linked performance advantage is that attention can be divided between scene-linked symbology and the outside world more efficiently than between standard (fixed-position) HUD symbology and the outside world. The present study tested two alternative explanations by manipulating the location of the scene-linked HUD symbology relative to the ground path markers. Scene-linked symbology yielded better ground path-following performance than standard fixed-location superimposed symbology regardless of whether the scene-linked symbology appeared directly along the ground path or at various distances off the path. The results support the explanation that the performance benefits found with scene-linked symbology are attentional.

  1. Scene complexity: influence on perception, memory, and development in the medial temporal lobe

    Directory of Open Access Journals (Sweden)

    Xiaoqian J Chai

    2010-03-01

    Full Text Available Regions in the medial temporal lobe (MTL) and prefrontal cortex (PFC) are involved in memory formation for scenes in both children and adults. The development in children and adolescents of successful memory encoding for scenes has been associated with increased activation in PFC, but not MTL, regions. However, evidence suggests that a functional subregion of the MTL that supports scene perception, located in the parahippocampal gyrus (PHG), goes through a prolonged maturation process. Here we tested the hypothesis that maturation of scene perception supports the development of memory for complex scenes. Scenes were characterized by their levels of complexity defined by the number of unique object categories depicted in the scene. Recognition memory improved with age, in participants ages 8-24, for high, but not low, complexity scenes. High-complexity compared to low-complexity scenes activated a network of regions including the posterior PHG. The difference in activations for high- versus low-complexity scenes increased with age in the right posterior PHG. Finally, activations in right posterior PHG were associated with age-related increases in successful memory formation for high-, but not low-, complexity scenes. These results suggest that functional maturation of the right posterior PHG plays a critical role in the development of enduring long-term recollection for high-complexity scenes.

  2. Crime Scene Investigation.

    Science.gov (United States)

    Harris, Barbara; Kohlmeier, Kris; Kiel, Robert D.

    Casting students in grades 5 through 12 in the roles of reporters, lawyers, and detectives at the scene of a crime, this interdisciplinary activity involves participants in the intrigue and drama of crime investigation. Using a hands-on, step-by-step approach, students work in teams to investigate a crime and solve a mystery. Through role-playing…

  3. SCEGRAM: An image database for semantic and syntactic inconsistencies in scenes.

    Science.gov (United States)

    Öhlschläger, Sabine; Võ, Melissa Le-Hoa

    2017-10-01

    Our visual environment is not random, but follows compositional rules according to what objects are usually found where. Despite the growing interest in how such semantic and syntactic rules - a scene grammar - enable effective attentional guidance and object perception, no common image database containing highly-controlled object-scene modifications has been publicly available. Such a database is essential in minimizing the risk that low-level features drive high-level effects of interest, which is being discussed as a possible source of controversial study results. To generate the first database of this kind - SCEGRAM - we took photographs of 62 real-world indoor scenes in six consistency conditions that contain semantic and syntactic (both mild and extreme) violations as well as their combinations. Importantly, scenes were always paired, so that an object was semantically consistent in one scene (e.g., ketchup in kitchen) and inconsistent in the other (e.g., ketchup in bathroom). Low-level salience did not differ between object-scene conditions and was generally moderate. Additionally, SCEGRAM contains consistency ratings for every object-scene condition, as well as object-absent scenes and object-only images. Finally, a cross-validation using eye-movements replicated previous results of longer dwell times for both semantic and syntactic inconsistencies compared to consistent controls. In sum, the SCEGRAM image database is the first to contain well-controlled semantic and syntactic object-scene inconsistencies that can be used in a broad range of cognitive paradigms (e.g., verbal and pictorial priming, change detection, object identification, etc.) including paradigms addressing developmental aspects of scene grammar. SCEGRAM can be retrieved for research purposes from http://www.scenegrammarlab.com/research/scegram-database/ .

  4. Learning from Narrated Instruction Videos.

    Science.gov (United States)

    Alayrac, Jean-Baptiste; Bojanowski, Piotr; Agrawal, Nishant; Sivic, Josef; Laptev, Ivan; Lacoste-Julien, Simon

    2017-09-05

    Automatic assistants could guide a person or a robot in performing new tasks, such as changing a car tire or repotting a plant. Creating such assistants, however, is non-trivial and requires understanding of the visual and verbal content of a video. Towards this goal, we here address the problem of automatically learning the main steps of a task from a set of narrated instruction videos. We develop a new unsupervised learning approach that takes advantage of the complementary nature of the input video and the associated narration. The method sequentially clusters textual and visual representations of a task, where the two clustering problems are linked by joint constraints to obtain a single coherent sequence of steps in both modalities. To evaluate our method, we collect and annotate a new challenging dataset of real-world instruction videos from the Internet. The dataset contains videos for five different tasks with complex interactions between people and objects, captured in a variety of indoor and outdoor settings. We experimentally demonstrate that the proposed method can automatically discover, learn and localize the main steps of a task in input videos.

  5. Mental Layout Extrapolations Prime Spatial Processing of Scenes

    Science.gov (United States)

    Gottesman, Carmela V.

    2011-01-01

    Four experiments examined whether scene processing is facilitated by layout representation, including layout that was not perceived but could be predicted based on a previous partial view (boundary extension). In a priming paradigm (after Sanocki, 2003), participants judged objects' distances in photographs. In Experiment 1, full scenes (target),…

  6. Adaptive and Selective Time Averaging of Auditory Scenes

    DEFF Research Database (Denmark)

    McWalter, Richard Ian; McDermott, Josh H.

    2018-01-01

    To overcome variability, estimate scene characteristics, and compress sensory input, perceptual systems pool data into statistical summaries. Despite growing evidence for statistical representations in perception, the underlying mechanisms remain poorly understood. One example... longer than previously reported integration times in the auditory system. Integration also showed signs of being restricted to sound elements attributed to a common source. The results suggest an integration process that depends on stimulus characteristics, integrating over longer extents when it benefits statistical estimation of variable signals and selectively integrating stimulus components likely to have a common cause in the world. Our methodology could be naturally extended to examine statistical representations of other types of sensory signals. Sound texture perception is thought...

  7. Image/video understanding systems based on network-symbolic models

    Science.gov (United States)

    Kuvich, Gary

    2004-03-01

    Vision is a part of a larger information system that converts visual information into knowledge structures. These structures drive the vision process, resolve ambiguity and uncertainty via feedback projections, and provide image understanding that is an interpretation of visual information in terms of such knowledge models. Computer simulation models are built on the basis of graphs/networks. The ability of the human brain to emulate similar graph/network models is found. Symbols, predicates and grammars naturally emerge in such networks, and logic is simply a way of restructuring such models. The brain analyzes an image as a graph-type relational structure created via multilevel hierarchical compression of visual information. Primary areas provide active fusion of image features on a spatial grid-like structure, where nodes are cortical columns. Spatial logic and topology are naturally present in such structures. Mid-level vision processes, like perceptual grouping and separation of figure from ground, are special kinds of network transformations. They convert the primary image structure into a set of more abstract ones, which represent objects and the visual scene, making them easy to analyze by higher-level knowledge structures. Higher-level vision phenomena are results of such analysis. Composition of network-symbolic models combines learning, classification, and analogy together with higher-level model-based reasoning into a single framework, and it works similarly to frames and agents. Computational intelligence methods transform images into model-based knowledge representation. Based on such principles, an Image/Video Understanding system can convert images into knowledge models, and resolve uncertainty and ambiguity. This allows creating intelligent computer vision systems for design and manufacturing.

  8. Automatic Video-based Analysis of Human Motion

    DEFF Research Database (Denmark)

    Fihl, Preben

    Human motion contains valuable information in many situations and people frequently perform an unconscious analysis of the motion of other people to understand their actions, intentions, and state of mind. An automatic analysis of human motion will facilitate many applications and thus has received great interest from both industry and research communities. The focus of this thesis is on video-based analysis of human motion and the thesis presents work within three overall topics, namely foreground segmentation, action recognition, and human pose estimation. Foreground segmentation is often the first important step in the analysis of human motion. By separating foreground from background the subsequent analysis can be focused and efficient. This thesis presents a robust background subtraction method that can be initialized with foreground objects in the scene and is capable of handling...

  9. Graphics processing unit (GPU) real-time infrared scene generation

    Science.gov (United States)

    Christie, Chad L.; Gouthas, Efthimios (Themie); Williams, Owen M.

    2007-04-01

    VIRSuite, the GPU-based suite of software tools developed at DSTO for real-time infrared scene generation, is described. The tools include the painting of scene objects with radiometrically-associated colours, translucent object generation, polar plot validation and versatile scene generation. Special features include radiometric scaling within the GPU and the presence of zoom anti-aliasing at the core of VIRSuite. Extension of the zoom anti-aliasing construct to cover target embedding and the treatment of translucent objects is described.

  10. Semantic guidance of eye movements in real-world scenes

    OpenAIRE

    Hwang, Alex D.; Wang, Hsueh-Cheng; Pomplun, Marc

    2011-01-01

    The perception of objects in our visual world is influenced by not only their low-level visual features such as shape and color, but also their high-level features such as meaning and semantic relations among them. While it has been shown that low-level features in real-world scenes guide eye movements during scene inspection and search, the influence of semantic similarity among scene objects on eye movements in such situations has not been investigated. Here we study guidance of eye movemen...

  11. Naturalness and image quality : saturation and lightness variation in color images of natural scenes

    NARCIS (Netherlands)

    Ridder, de H.

    1996-01-01

    The relation between perceived image quality and naturalness was investigated by varying the colorfulness of natural images at various lightness levels. At each lightness level, subjects assessed perceived colorfulness, naturalness, and quality as a function of average saturation by means of direct

  12. Defining spatial relations in a specific ontology for automated scene creation

    Directory of Open Access Journals (Sweden)

    D. Contraş

    2013-06-01

    Full Text Available This paper presents an approach to building an ontology for automatic scene generation. Every scene contains various elements (backgrounds, characters, objects) which are spatially interrelated. The article focuses on these spatial and temporal relationships of the elements constituting a scene.

  13. Scene perception in posterior cortical atrophy: categorization, description and fixation patterns.

    Science.gov (United States)

    Shakespeare, Timothy J; Yong, Keir X X; Frost, Chris; Kim, Lois G; Warrington, Elizabeth K; Crutch, Sebastian J

    2013-01-01

    Partial or complete Balint's syndrome is a core feature of the clinico-radiological syndrome of posterior cortical atrophy (PCA), in which individuals experience a progressive deterioration of cortical vision. Although multi-object arrays are frequently used to detect simultanagnosia in the clinical assessment and diagnosis of PCA, to date there have been no group studies of scene perception in patients with the syndrome. The current study involved three linked experiments conducted in PCA patients and healthy controls. Experiment 1 evaluated the accuracy and latency of complex scene perception relative to individual faces and objects (color and grayscale) using a categorization paradigm. PCA patients were both less accurate (faces < scenes < objects) and slower (scenes < objects < faces) than controls on all categories, with performance strongly associated with their level of basic visual processing impairment; patients also showed a small advantage for color over grayscale stimuli. Experiment 2 involved free description of real world scenes. PCA patients generated fewer features and more misperceptions than controls, though perceptual errors were always consistent with the patient's global understanding of the scene (whether correct or not). Experiment 3 used eye tracking measures to compare patient and control eye movements over initial and subsequent fixations of scenes. Patients' fixation patterns were significantly different to those of young and age-matched controls, with comparable group differences for both initial and subsequent fixations. Overall, these findings describe the variability in everyday scene perception exhibited by individuals with PCA, and indicate the importance of exposure duration in the perception of complex scenes.

  14. Scene perception in Posterior Cortical Atrophy: categorisation, description and fixation patterns

    Directory of Open Access Journals (Sweden)

    Timothy J Shakespeare

    2013-10-01

    Full Text Available Partial or complete Balint’s syndrome is a core feature of the clinico-radiological syndrome of posterior cortical atrophy (PCA), in which individuals experience a progressive deterioration of cortical vision. Although multi-object arrays are frequently used to detect simultanagnosia in the clinical assessment and diagnosis of PCA, to date there have been no group studies of scene perception in patients with the syndrome. The current study involved three linked experiments conducted in PCA patients and healthy controls. Experiment 1 evaluated the accuracy and latency of complex scene perception relative to individual faces and objects (colour and greyscale) using a categorisation paradigm. PCA patients were both less accurate (faces < scenes < objects) and slower (scenes < objects < faces) than controls on all categories, with performance strongly associated with their level of basic visual processing impairment; patients also showed a small advantage for colour over greyscale stimuli. Experiment 2 involved free description of real world scenes. PCA patients generated fewer features and more misperceptions than controls, though perceptual errors were always consistent with the patient’s global understanding of the scene (whether correct or not). Experiment 3 used eye tracking measures to compare patient and control eye movements over initial and subsequent fixations of scenes. Patients’ fixation patterns were significantly different to those of young and age-matched controls, with comparable group differences for both initial and subsequent fixations. Overall, these findings describe the variability in everyday scene perception exhibited by individuals with PCA, and indicate the importance of exposure duration in the perception of complex scenes.

  15. Object tracking using multiple camera video streams

    Science.gov (United States)

    Mehrubeoglu, Mehrube; Rojas, Diego; McLauchlan, Lifford

    2010-05-01

    Two synchronized cameras are utilized to obtain independent video streams to detect moving objects from two different viewing angles. The video frames are directly correlated in time. Moving objects in image frames from the two cameras are identified and tagged for tracking. One advantage of such a system involves overcoming effects of occlusions that could result in an object in partial or full view in one camera, when the same object is fully visible in another camera. Object registration is achieved by determining the location of common features in the moving object across simultaneous frames. Perspective differences are adjusted. Combining information from images from multiple cameras increases robustness of the tracking process. Motion tracking is achieved by determining anomalies caused by the objects' movement across frames in time, both in each video stream and in the combined video information. The path of each object is determined heuristically. Accuracy of detection is dependent on the speed of the object as well as variations in direction of motion. Fast cameras increase accuracy but limit the speed and complexity of the algorithm. Such an imaging system has applications in traffic analysis, surveillance and security, as well as object modeling from multi-view images. The system can easily be expanded by increasing the number of cameras such that there is an overlap between the scenes from at least two cameras in proximity. An object can then be tracked over long distances or across multiple cameras continuously, applicable, for example, in wireless sensor networks for surveillance or navigation.
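
    As a rough illustration of detecting motion as a change between consecutive frames, the sketch below uses plain frame differencing on a single stream and reports the centroid of the changed pixels. This is a stand-in technique, not the method of the record above: the frame representation (nested lists of grayscale values), the threshold, and all function names are assumptions made for the example, and the two-camera registration step is not covered.

```python
from typing import List, Optional, Tuple

Frame = List[List[int]]  # assumed format: rows of grayscale intensities (0-255)


def moving_pixels(prev: Frame, curr: Frame, threshold: int = 30) -> List[Tuple[int, int]]:
    """Return (row, col) positions whose intensity changed by more than threshold."""
    changed = []
    for r, (row_prev, row_curr) in enumerate(zip(prev, curr)):
        for c, (a, b) in enumerate(zip(row_prev, row_curr)):
            if abs(a - b) > threshold:
                changed.append((r, c))
    return changed


def centroid(pixels: List[Tuple[int, int]]) -> Optional[Tuple[float, float]]:
    """Mean (row, col) of the changed pixels, or None when nothing moved."""
    if not pixels:
        return None
    n = len(pixels)
    return (sum(r for r, _ in pixels) / n, sum(c for _, c in pixels) / n)


def track(frames: List[Frame], threshold: int = 30) -> List[Optional[Tuple[float, float]]]:
    """One centroid per consecutive frame pair.

    Frame differencing flags both the old and the new position of an object,
    so each centroid lies between the two positions.
    """
    return [centroid(moving_pixels(a, b, threshold)) for a, b in zip(frames, frames[1:])]
```

    Running track over each camera's stream separately would give one path per view; relating the two paths would still require the feature registration and perspective adjustment mentioned in the abstract.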

  16. System and method for extracting dominant orientations from a scene

    Science.gov (United States)

    Straub, Julian; Rosman, Guy; Freifeld, Oren; Leonard, John J.; Fisher, III, John W.

    2017-05-30

    In one embodiment, a method of identifying the dominant orientations of a scene comprises representing a scene as a plurality of directional vectors. The scene may comprise a three-dimensional representation of a scene, and the plurality of directional vectors may comprise a plurality of surface normals. The method further comprises determining, based on the plurality of directional vectors, a plurality of orientations describing the scene. The determined plurality of orientations explains the directionality of the plurality of directional vectors. In certain embodiments, the plurality of orientations may have independent axes of rotation. The plurality of orientations may be determined by representing the plurality of directional vectors as lying on a mathematical representation of a sphere, and inferring the parameters of a statistical model to adapt the plurality of orientations to explain the positioning of the plurality of directional vectors lying on the mathematical representation of the sphere.
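
    To make the idea of directional vectors lying on a sphere concrete, the following sketch groups unit surface normals with a simple spherical k-means and returns one mean direction per group. It is only an illustrative substitute for the statistical model inference described in the record above; the function names, the farthest-point initialization, and the fixed iteration count are assumptions made here.

```python
import math
from typing import List, Sequence, Tuple

Vec = Tuple[float, float, float]


def normalize(v: Sequence[float]) -> Vec:
    """Scale a 3-vector to unit length so that it lies on the unit sphere."""
    n = math.sqrt(sum(x * x for x in v))
    if n == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return (v[0] / n, v[1] / n, v[2] / n)


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def dominant_orientations(normals: List[Sequence[float]], k: int, iterations: int = 20) -> List[Vec]:
    """Cluster surface normals on the unit sphere (spherical k-means)."""
    unit = [normalize(v) for v in normals]
    # Deterministic farthest-point initialization: start from the first normal,
    # then repeatedly add the normal least aligned with the centers chosen so far.
    centers = [unit[0]]
    while len(centers) < k:
        centers.append(min(unit, key=lambda u: max(dot(u, c) for c in centers)))
    for _ in range(iterations):
        sums = [[0.0, 0.0, 0.0] for _ in range(k)]
        for u in unit:
            # Assign each normal to the center with the largest cosine similarity.
            j = max(range(k), key=lambda i: dot(u, centers[i]))
            for axis in range(3):
                sums[j][axis] += u[axis]
        # New center = normalized sum of members; keep the old center if the sum is zero.
        centers = [normalize(s) if any(s) else c for s, c in zip(sums, centers)]
    return centers
```

    This sketch treats a normal and its negation as different directions and does not constrain the recovered directions to form rotations, both of which a fuller treatment would need to address.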

  17. Oculomotor capture during real-world scene viewing depends on cognitive load.

    Science.gov (United States)

    Matsukura, Michi; Brockmole, James R; Boot, Walter R; Henderson, John M

    2011-03-25

    It has been claimed that gaze control during scene viewing is largely governed by stimulus-driven, bottom-up selection mechanisms. Recent research, however, has strongly suggested that observers' top-down control plays a dominant role in attentional prioritization in scenes. A notable exception to this strong top-down control is oculomotor capture, where visual transients in a scene draw the eyes. One way to test whether oculomotor capture during scene viewing is independent of an observer's top-down goal setting is to reduce observers' cognitive resource availability. In the present study, we examined whether increasing observers' cognitive load influences the frequency and speed of oculomotor capture during scene viewing. In Experiment 1, we tested whether increasing observers' cognitive load modulates the degree of oculomotor capture by a new object that suddenly appeared in a scene. Similarly, in Experiment 2, we tested whether increasing observers' cognitive load modulates the degree of oculomotor capture by an object's color change. In both experiments, the degree of oculomotor capture decreased as observers' cognitive resources were reduced. These results suggest that oculomotor capture during scene viewing is dependent on observers' top-down selection mechanisms. Copyright © 2011 Elsevier Ltd. All rights reserved.

  18. Gay and Lesbian Scene in Metelkova

    Directory of Open Access Journals (Sweden)

    Nataša Velikonja

    2013-09-01

    Full Text Available The article deals with the development of the gay and lesbian scene in ACC Metelkova, while specifying the preliminary aspects of establishing and building gay and lesbian activism associated with spatial issues. The struggle for space or occupying public space is vital for the gay and lesbian scene, as it provides not only the necessary socializing opportunities for gays and lesbians, but also does away with the historical hiding of homosexuality in the closet, in seclusion and silence. Because of their autonomy and long-term, continuous existence, homo-clubs at Metelkova contributed to the consolidation of the gay and lesbian scene in Slovenia and significantly improved the opportunities for cultural, social and political expression of gays and lesbians. Such a synthesis of the cultural, social and political, further intensified in Metelkova, and characterizes the gay and lesbian community in Slovenia from the very outset of gay and lesbian activism in 1984. It is this long-term synthesis that keeps this community in Slovenia so vital and politically resilient.

  19. Reidentification of Persons Using Clothing Features in Real-Life Video

    Directory of Open Access Journals (Sweden)

    Guodong Zhang

    2017-01-01

    Full Text Available Person reidentification, which aims to track people across nonoverlapping cameras, is a fundamental task in automated video processing. Moving people often appear differently when viewed from different nonoverlapping cameras because of differences in illumination, pose, and camera properties. The color histogram is a global feature of an object that can be used for identification. This histogram describes the distribution of all colors on the object. However, the use of color histograms has two disadvantages. First, colors change differently under different lighting and at different angles. Second, traditional color histograms lack spatial information. We used a perception-based color space to solve the illumination problem of traditional histograms. We also used the spatial pyramid matching (SPM) model to improve the image spatial information in color histograms. Finally, we used the Gaussian mixture model (GMM) to show features for person reidentification, because the main color feature of GMM is more adaptable to scene changes, and it improves the stability of the retrieved results for different color spaces in various scenes. Through a series of experiments, we found the relationships of different features that impact person reidentification.
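
    The idea of adding spatial information to a color histogram can be illustrated with a minimal sketch. It is not the authors' implementation: the hue channel of HSV (via Python's `colorsys`) stands in for the unspecified perception-based color space, the bin count and pyramid depth are arbitrary, and the function names `hue_histogram`, `pyramid_descriptor` and `intersection_similarity` are hypothetical. The GMM step is not shown.

```python
import colorsys

def hue_histogram(pixels, bins=8):
    """Normalised hue histogram of a list of (r, g, b) pixels in 0..255."""
    hist = [0.0] * bins
    for r, g, b in pixels:
        h, _, _ = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        hist[min(int(h * bins), bins - 1)] += 1.0
    total = sum(hist) or 1.0
    return [v / total for v in hist]

def pyramid_descriptor(image, levels=2, bins=8):
    """Concatenate per-cell histograms over a 1x1, 2x2, ... grid pyramid.

    `image` is a list of rows, each row a list of (r, g, b) tuples.
    """
    height, width = len(image), len(image[0])
    descriptor = []
    for level in range(levels):
        cells = 2 ** level
        for cy in range(cells):
            for cx in range(cells):
                y0, y1 = cy * height // cells, (cy + 1) * height // cells
                x0, x1 = cx * width // cells, (cx + 1) * width // cells
                pixels = [image[y][x] for y in range(y0, y1) for x in range(x0, x1)]
                descriptor.extend(hue_histogram(pixels, bins))
    return descriptor

def intersection_similarity(a, b):
    """Histogram intersection, scaled so that identical descriptors give 1.0."""
    return sum(min(x, y) for x, y in zip(a, b)) / sum(a)
```

    Two images with the same global color distribution but a different layout (for example, red above blue versus blue above red) get identical level-0 histograms, yet their finer-level cells differ, so the pyramid descriptor separates them where a plain histogram cannot.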

  20. Real-time construction and visualisation of drift-free video mosaics from unconstrained camera motion

    Directory of Open Access Journals (Sweden)

    Mateusz Brzeszcz

    2015-08-01

    Full Text Available This work proposes a novel approach for real-time video mosaicking facilitating drift-free mosaic construction and visualisation, with integrated frame blending and redundancy management, that is shown to be flexible to a range of varying mosaic scenarios. The approach supports unconstrained camera motion with in-sequence loop closing, variation in camera focal distance (zoom) and recovery from video sequence breaks. Real-time performance, over extended duration sequences, is realised via novel aspects of frame management within the mosaic representation, thus avoiding the high data redundancy associated with temporally dense, spatially overlapping video frame inputs. This managed set of image frames is visualised in real time using a dynamic mosaic representation of overlapping textured graphics primitives in place of the traditional globally constructed, and hence frequently reconstructed, mosaic image. Within this formulation, subsequent optimisation occurring during online construction can thus efficiently adjust relative frame positions via simple primitive position transforms. Effective visualisation is similarly facilitated by online inter-frame blending to overcome the illumination and colour variance associated with modern camera hardware. The evaluation illustrates overall robustness in video mosaic construction under a diverse range of conditions including indoor and outdoor environments, varying illumination and presence of in-scene motion on varying computational platforms.

  1. The Video Collaborative Localization of a Miner's Lamp Based on Wireless Multimedia Sensor Networks for Underground Coal Mines.

    Science.gov (United States)

    You, Kaiming; Yang, Wei; Han, Ruisong

    2015-09-29

    Based on wireless multimedia sensor networks (WMSNs) deployed in an underground coal mine, a miner's lamp video collaborative localization algorithm was proposed to locate miners in scenes with insufficient illumination and bifurcated underground tunnel structures. In a bifurcation area, several camera nodes are deployed along the longitudinal direction of the tunnels, forming a wireless collaborative cluster to monitor and locate miners in underground tunnels. Cap-lamps are regarded as the distinguishing feature of miners under insufficient illumination, which means that miners can be identified by detecting their cap-lamps. A miner's lamp projects mapping points onto the imaging planes of the collaborative cameras, and the coordinates of these mapping points are calculated by the cameras. Then, multiple straight lines between the positions of the collaborative cameras and their corresponding mapping points are established. To find the three-dimensional (3D) location of the miner's lamp, a least-squares method is proposed to obtain the optimal intersection of the multiple straight lines. Tests were carried out both in a corridor and in a realistic underground tunnel scenario, and they show that the proposed algorithm has good effectiveness, robustness and localization accuracy in real-world underground tunnel conditions.
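
    The least-squares intersection step described above can be sketched as follows. This is a generic formulation under stated assumptions, not the paper's exact method: each line is given by a camera position and a direction toward the mapping point, and the estimate is the point minimising the summed squared perpendicular distance to all lines. The function names `closest_point_to_lines` and `solve3` are hypothetical.

```python
import math

def solve3(A, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    n = 3
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("lines are parallel; the intersection is not unique")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = s / M[r][r]
    return x

def closest_point_to_lines(origins, directions):
    """Point p minimising sum_i ||(I - u_i u_i^T)(p - o_i)||^2 over 3D lines.

    origins[i] is a point on line i (e.g. a camera position) and
    directions[i] is its direction (need not be unit length).
    """
    A = [[0.0] * 3 for _ in range(3)]
    b = [0.0] * 3
    for o, d in zip(origins, directions):
        norm = math.sqrt(sum(c * c for c in d))
        u = [c / norm for c in d]
        for r in range(3):
            for c in range(3):
                # Entry of the projector onto the plane perpendicular to the line.
                m = (1.0 if r == c else 0.0) - u[r] * u[c]
                A[r][c] += m
                b[r] += m * o[c]
    return solve3(A, b)
```

    With noise-free lines that truly meet in one point, the function returns that point; with noisy image measurements the lines no longer meet and the result is the compromise point closest to all of them. At least two non-parallel lines are needed.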

  2. Integration of heterogeneous features for remote sensing scene classification

    Science.gov (United States)

    Wang, Xin; Xiong, Xingnan; Ning, Chen; Shi, Aiye; Lv, Guofang

    2018-01-01

    Scene classification is one of the most important issues in remote sensing (RS) image processing. We find that features from different channels (shape, spectral, texture, etc.), levels (low-level and middle-level), or perspectives (local and global) could provide various properties for RS images, and then propose a heterogeneous feature framework to extract and integrate heterogeneous features of different types for RS scene classification. The proposed method is composed of three modules: (1) heterogeneous feature extraction, where three heterogeneous feature types, called DS-SURF-LLC, mean-Std-LLC, and MS-CLBP, are calculated, (2) heterogeneous feature fusion, where multiple kernel learning (MKL) is utilized to integrate the heterogeneous features, and (3) an MKL support vector machine classifier for RS scene classification. The proposed method is extensively evaluated on three challenging benchmark datasets (a 6-class dataset, a 12-class dataset, and a 21-class dataset), and the experimental results show that the proposed method leads to good classification performance. It produces informative features to describe the RS image scenes. Moreover, the integration of heterogeneous features outperforms some state-of-the-art features on RS scene classification tasks.

  3. Lateralized eye use towards video stimuli in bearded dragons (Pogona vitticeps)

    Directory of Open Access Journals (Sweden)

    Anna Frohnwieser

    2017-08-01

    Full Text Available Lateralized eye use is thought to increase brain efficiency, as the two hemispheres process different information perceived by the eyes. It has been observed in a wide variety of vertebrate species and, in general, information about conspecifics appears to elicit a left eye preference whilst information about prey elicits the opposite. In reptiles, this phenomenon has only been investigated using live conspecifics in agonistic contexts, and so it is not clear whether it can be found when using video stimuli. Here, bearded dragons (Pogona vitticeps) were presented with videos of female conspecifics and prey that either moved or were stationary, along with a control video of an empty background. Females exhibited a left eye bias towards conspecifics but males did not; however, both sexes looked at conspecifics significantly longer than prey. Further, animals used their left eye significantly longer when viewing moving stimuli of both categories. These results suggest that, in lizards, lateralized eye use when viewing conspecifics may be controlled by sex, and strongly influenced by stimulus movement. This study, therefore, provides important insights into the role of lateralized processing in lizard perception, and sets the scene for future work investigating the role of sex on perception of conspecifics and the role of motion in lateralized eye use.

  4. High-speed three-frame image recording system using colored flash units and low-cost video equipment

    Science.gov (United States)

    Racca, Roberto G.; Scotten, Larry N.

    1995-05-01

    This article describes a method that allows the digital recording of sequences of three black and white images at rates of several thousand frames per second using a system consisting of an ordinary CCD camcorder, three flash units with color filters, a PC-based frame grabber board and some additional electronics. The maximum framing rate is determined by the duration of the flashtube emission, and for common photographic flash units lasting about 20 microseconds it can exceed 10,000 frames per second in actual use. The subject under study is strobe-illuminated using a red, a green and a blue flash unit controlled by a special sequencer, and the three images are captured by a color CCD camera on a single video field. Color is used as the distinguishing parameter that allows the overlaid exposures to be resolved. The video output for that particular field will contain three individual scenes, one for each primary color component, which potentially can be resolved with no crosstalk between them. The output is electronically decoded into the primary color channels, frame grabbed and stored into digital memory, yielding three time-resolved images of the subject. A synchronization pulse provided by the flash sequencer triggers the frame grabbing so that the correct video field is acquired. A scheme involving the use of videotape as intermediate storage allows the frame grabbing to be performed using a monochrome video digitizer. Ideally each flash-illuminated scene would be confined to one color channel, but in practice various factors, both optical and electronic, affect color separation. Correction equations have been derived that counteract these effects in the digitized images and minimize 'ghosting' between frames. Once the appropriate coefficients have been established through a calibration procedure that needs to be performed only once for a given configuration of the equipment, the correction process is carried out transparently in software every time a

  5. Simulator scene display evaluation device

    Science.gov (United States)

    Haines, R. F. (Inventor)

    1986-01-01

    An apparatus for aligning and calibrating scene displays in an aircraft simulator has a base on which all of the instruments for the aligning and calibrating are mounted. A laser directs a beam at a double right prism, which is attached to a pivoting support on the base. The pivot point of the prism is located at the design eye point (DEP) of the simulator during the aligning and calibrating. The objective lens in the base is movable on a track to follow the laser beam at different angles within the field of vision at the DEP. An eyepiece and a precision diopter are movable into a position behind the prism during the scene evaluation. A photometer or illuminometer is pivotable about the pivot into and out of position behind the eyepiece.

  6. Synchronous contextual irregularities affect early scene processing: replication and extension.

    Science.gov (United States)

    Mudrik, Liad; Shalgi, Shani; Lamy, Dominique; Deouell, Leon Y

    2014-04-01

    Whether contextual regularities facilitate perceptual stages of scene processing is widely debated, and empirical evidence is still inconclusive. Specifically, it was recently suggested that contextual violations affect early processing of a scene only when the incongruent object and the scene are presented asynchronously, creating expectations. We compared event-related potentials (ERPs) evoked by scenes that depicted a person performing an action using either a congruent or an incongruent object (e.g., a man shaving with a razor or with a fork) when scene and object were presented simultaneously. We also explored the role of attention in contextual processing by using a pre-cue to direct subjects' attention towards or away from the congruent/incongruent object. Subjects' task was to determine how many hands the person in the picture used in order to perform the action. We replicated our previous findings of frontocentral negativity for incongruent scenes that started ~210 ms post stimulus presentation, even earlier than previously found. Surprisingly, this incongruency ERP effect was negatively correlated with the reaction times cost on incongruent scenes. The results did not allow us to draw conclusions about the role of attention in detecting the regularity, due to a weak attention manipulation. By replicating the 200-300 ms incongruity effect with a new group of subjects at even earlier latencies than previously reported, the results strengthen the evidence for contextual processing during this time window even when simultaneous presentation of the scene and object prevents the formation of prior expectations. We discuss possible methodological limitations that may account for previous failures to find this effect, and conclude that contextual information affects object model selection processes prior to full object identification, with semantic knowledge activation stages unfolding only later on. Copyright © 2014 Elsevier Ltd. All rights reserved.

  7. Perceptual load in different regions of the visual scene and its relevance for driving.

    Science.gov (United States)

    Marciano, Hadas; Yeshurun, Yaffa

    2015-06-01

    The aim of this study was to better understand the role played by perceptual load, at both central and peripheral regions of the visual scene, in driving safety. Attention is a crucial factor in driving safety, and previous laboratory studies suggest that perceptual load is an important factor determining the efficiency of attentional selectivity. Yet, the effects of perceptual load on driving were never studied systematically. Using a driving simulator, we orthogonally manipulated the load levels at the road (central load) and its sides (peripheral load), while occasionally introducing critical events at one of these regions. Perceptual load affected driving performance at both regions of the visual scene. Critically, the effect was different for central versus peripheral load: Whereas load levels on the road mainly affected driving speed, load levels on its sides mainly affected the ability to detect critical events initiating from the roadsides. Moreover, higher levels of peripheral load impaired performance but mainly with low levels of central load, replicating findings with simple letter stimuli. Perceptual load has a considerable effect on driving, but the nature of this effect depends on the region of the visual scene at which the load is introduced. Given the observed importance of perceptual load, authors of future studies of driving safety should take it into account. Specifically, these findings suggest that our understanding of factors that may be relevant for driving safety would benefit from studying these factors under different levels of load at different regions of the visual scene. © 2014, Human Factors and Ergonomics Society.

  8. Cortical Representations of Speech in a Multitalker Auditory Scene.

    Science.gov (United States)

    Puvvada, Krishna C; Simon, Jonathan Z

    2017-09-20

    The ability to parse a complex auditory scene into perceptual objects is facilitated by a hierarchical auditory system. Successive stages in the hierarchy transform an auditory scene of multiple overlapping sources, from peripheral tonotopically based representations in the auditory nerve, into perceptually distinct auditory-object-based representations in the auditory cortex. Here, using magnetoencephalography recordings from men and women, we investigate how a complex acoustic scene consisting of multiple speech sources is represented in distinct hierarchical stages of the auditory cortex. Using systems-theoretic methods of stimulus reconstruction, we show that the primary-like areas in the auditory cortex contain dominantly spectrotemporal-based representations of the entire auditory scene. Here, both attended and ignored speech streams are represented with almost equal fidelity, and a global representation of the full auditory scene with all its streams is a better candidate neural representation than that of individual streams being represented separately. We also show that higher-order auditory cortical areas, by contrast, represent the attended stream separately and with significantly higher fidelity than unattended streams. Furthermore, the unattended background streams are more faithfully represented as a single unsegregated background object rather than as separated objects. Together, these findings demonstrate the progression of the representations and processing of a complex acoustic scene up through the hierarchy of the human auditory cortex. SIGNIFICANCE STATEMENT Using magnetoencephalography recordings from human listeners in a simulated cocktail party environment, we investigate how a complex acoustic scene consisting of multiple speech sources is represented in separate hierarchical stages of the auditory cortex. We show that the primary-like areas in the auditory cortex use a dominantly spectrotemporal-based representation of the entire auditory

  9. Separate and simultaneous adjustment of light qualities in a real scene

    NARCIS (Netherlands)

    Xia, L.; Pont, S.C.; Heynderickx, I.E.J.R.

    2017-01-01

    Humans are able to estimate light field properties in a scene in that they have expectations of the objects' appearance inside it. Previously, we probed such expectations in a real scene by asking whether a "probe object" fitted a real scene with regard to its lighting. But how well are observers

  10. Multi-view 3D scene reconstruction using ant colony optimization techniques

    International Nuclear Information System (INIS)

    Chrysostomou, Dimitrios; Gasteratos, Antonios; Nalpantidis, Lazaros; Sirakoulis, Georgios C

    2012-01-01

    This paper presents a new method performing high-quality 3D object reconstruction of complex shapes derived from multiple, calibrated photographs of the same scene. The novelty of this research is found in two basic elements, namely: (i) a novel voxel dissimilarity measure, which accommodates the elimination of the lighting variations of the models and (ii) the use of an ant colony approach for further refinement of the final 3D models. The proposed reconstruction procedure employs a volumetric method based on a novel projection test for the production of a visual hull. While the presented algorithm shares certain aspects with the space carving algorithm, it is, nevertheless, first enhanced with the lightness compensating image comparison method, and then refined using ant colony optimization. The algorithm is fast, computationally simple and results in accurate representations of the input scenes. In addition, compared to previous publications, the particular nature of the proposed algorithm allows accurate 3D volumetric measurements under demanding lighting environmental conditions, due to the fact that it can cope with uneven light scenes, resulting from the characteristics of the voxel dissimilarity measure applied. Besides, the intelligent behavior of the ant colony framework provides the opportunity to formulate the process as a combinatorial optimization problem, which can then be solved by means of a colony of cooperating artificial ants, resulting in very promising results. The method is validated with several real datasets, along with qualitative comparisons with other state-of-the-art 3D reconstruction techniques, following the Middlebury benchmark. (paper)

  11. Recognizing the Stranger: Recognition Scenes in the Gospel of John

    DEFF Research Database (Denmark)

    Larsen, Kasper Bro

    Recognizing the Stranger is the first monographic study of recognition scenes and motifs in the Gospel of John. The recognition type-scene (anagnōrisis) was a common feature in ancient drama and narrative, highly valued by Aristotle as a touching moment of truth, e.g., in Oedipus’ tragic self...... structures of the type-scene in order to show how Jesus’ true identity can be recognized behind the half-mask of his human appearance....

  12. Brief Report: Diminished Gaze Preference for Dynamic Social Interaction Scenes in Youth with Autism Spectrum Disorders.

    Science.gov (United States)

    Shaffer, Rebecca C; Pedapati, Ernest V; Shic, Frederick; Gaietto, Kristina; Bowers, Katherine; Wink, Logan K; Erickson, Craig A

    2017-02-01

    In this study, we present an eye-tracking paradigm, adapted from previous work with toddlers, for assessing social-interaction looking preferences in youth ages 5-17 with ASD and typically-developing controls (TDC). Videos of children playing together (Social Scenes, SS) were presented side-by-side with animated geometric shapes (GS). Participants with ASD demonstrated reduced SS preferences compared to TDC, results also represented continuously by associations between higher SS preferences and fewer social difficulties across the combined sample. Exploratory analyses identified associations between increased SS preferences and higher Vineland Daily Living Skills in ASD and suggested SS preferences in TDC females might drive ASD versus TDC between-group differences. These findings describe potentially sex-linked couplings between preferences for social information and social functioning in school-aged children.

  13. SAR Raw Data Generation for Complex Airport Scenes

    Directory of Open Access Journals (Sweden)

    Jia Li

    2014-10-01

    Full Text Available The method of generating the SAR raw data of complex airport scenes is studied in this paper. A formulation of the SAR raw signal model of airport scenes is given. By generating the echoes from the background, aircraft and buildings separately, the SAR raw data of the unified SAR imaging geometry is obtained from their vector sum. The multipath scattering and the shadowing between the background and different ground covers of standing airplanes and buildings are analyzed. Based on the scattering characteristics, coupling scattering models and SAR raw data models of the different targets are given. A procedure is given to generate the SAR raw data of airport scenes. The SAR images from the simulated raw data demonstrate the validity of the proposed method.

  14. The Video Collaborative Localization of a Miner’s Lamp Based on Wireless Multimedia Sensor Networks for Underground Coal Mines

    Directory of Open Access Journals (Sweden)

    Kaiming You

    2015-09-01

    Full Text Available Based on wireless multimedia sensor networks (WMSNs) deployed in an underground coal mine, a miner’s lamp video collaborative localization algorithm was proposed to locate miners in scenes with insufficient illumination and bifurcated underground tunnel structures. In a bifurcation area, several camera nodes are deployed along the longitudinal direction of the tunnels, forming a wireless collaborative cluster to monitor and locate miners in underground tunnels. Cap-lamps are regarded as the distinguishing feature of miners under insufficient illumination, which means that miners can be identified by detecting their cap-lamps. A miner’s lamp projects mapping points onto the imaging planes of the collaborative cameras, and the coordinates of these mapping points are calculated by the cameras. Then, multiple straight lines between the positions of the collaborative cameras and their corresponding mapping points are established. To find the three-dimensional (3D) location of the miner’s lamp, a least-squares method is proposed to obtain the optimal intersection of the multiple straight lines. Tests were carried out both in a corridor and in a realistic underground tunnel scenario, and they show that the proposed algorithm has good effectiveness, robustness and localization accuracy in real-world underground tunnel conditions.

  15. The Video Collaborative Localization of a Miner’s Lamp Based on Wireless Multimedia Sensor Networks for Underground Coal Mines

    Science.gov (United States)

    You, Kaiming; Yang, Wei; Han, Ruisong

    2015-01-01

    Based on wireless multimedia sensor networks (WMSNs) deployed in an underground coal mine, a miner’s lamp video collaborative localization algorithm was proposed to locate miners in scenes with insufficient illumination and bifurcated underground tunnel structures. In a bifurcation area, several camera nodes are deployed along the longitudinal direction of the tunnels, forming a wireless collaborative cluster to monitor and locate miners in underground tunnels. Cap-lamps are regarded as the distinguishing feature of miners under insufficient illumination, which means that miners can be identified by detecting their cap-lamps. A miner’s lamp projects mapping points onto the imaging planes of the collaborative cameras, and the coordinates of these mapping points are calculated by the cameras. Then, multiple straight lines between the positions of the collaborative cameras and their corresponding mapping points are established. To find the three-dimensional (3D) location of the miner’s lamp, a least-squares method is proposed to obtain the optimal intersection of the multiple straight lines. Tests were carried out both in a corridor and in a realistic underground tunnel scenario, and they show that the proposed algorithm has good effectiveness, robustness and localization accuracy in real-world underground tunnel conditions. PMID:26426023

  16. Effects of aging on neural connectivity underlying selective memory for emotional scenes.

    Science.gov (United States)

    Waring, Jill D; Addis, Donna Rose; Kensinger, Elizabeth A

    2013-02-01

    Older adults show age-related reductions in memory for neutral items within complex visual scenes, but just like young adults, older adults exhibit a memory advantage for emotional items within scenes compared with the background scene information. The present study examined young and older adults' encoding-stage effective connectivity for selective memory of emotional items versus memory for both the emotional item and its background. In a functional magnetic resonance imaging (fMRI) study, participants viewed scenes containing either positive or negative items within neutral backgrounds. Outside the scanner, participants completed a memory test for items and backgrounds. Irrespective of scene content being emotionally positive or negative, older adults had stronger positive connections among frontal regions and from frontal regions to medial temporal lobe structures than did young adults, especially when items and backgrounds were subsequently remembered. These results suggest there are differences between young and older adults' connectivity accompanying the encoding of emotional scenes. Older adults may require more frontal connectivity to encode all elements of a scene rather than just encoding the emotional item. Published by Elsevier Inc.

  17. Radio Wave Propagation Scene Partitioning for High-Speed Rails

    Directory of Open Access Journals (Sweden)

    Bo Ai

    2012-01-01

    Full Text Available Radio wave propagation scene partitioning is necessary for wireless channel modeling. As far as we know, there are no standards of scene partitioning for high-speed rail (HSR) scenarios, and therefore we propose a radio wave propagation scene partitioning scheme for HSR scenarios in this paper. Based on our measurements along the Wuhan-Guangzhou HSR, Zhengzhou-Xian passenger-dedicated line, Shijiazhuang-Taiyuan passenger-dedicated line, and Beijing-Tianjin intercity line in China, whose operation speeds are above 300 km/h, and based on the investigations on Beijing South Railway Station, Zhengzhou Railway Station, Wuhan Railway Station, Changsha Railway Station, Xian North Railway Station, Shijiazhuang North Railway Station, Taiyuan Railway Station, and Tianjin Railway Station, we obtain an overview of HSR propagation channels and record a large amount of valuable measurement data for HSR scenarios. On the basis of these measurements and investigations, we partitioned the HSR scene into twelve scenarios. Further work on theoretical analysis based on radio wave propagation mechanisms, such as reflection and diffraction, may lead us to develop the standard of radio wave propagation scene partitioning for HSR. Our work can also be used as a basis for the wireless channel modeling and the selection of some key techniques for HSR systems.

  18. Unconscious analyses of visual scenes based on feature conjunctions.

    Science.gov (United States)

    Tachibana, Ryosuke; Noguchi, Yasuki

    2015-06-01

    To efficiently process a cluttered scene, the visual system analyzes statistical properties or regularities of visual elements embedded in the scene. It is controversial, however, whether those scene analyses could also work for stimuli unconsciously perceived. Here we show that our brain performs the unconscious scene analyses not only using a single featural cue (e.g., orientation) but also based on conjunctions of multiple visual features (e.g., combinations of color and orientation information). Subjects foveally viewed a stimulus array (duration: 50 ms) where 4 types of bars (red-horizontal, red-vertical, green-horizontal, and green-vertical) were intermixed. Although a conscious perception of those bars was inhibited by a subsequent mask stimulus, the brain correctly analyzed the information about color, orientation, and color-orientation conjunctions of those invisible bars. The information of those features was then used for the unconscious configuration analysis (statistical processing) of the central bars, which induced a perceptual bias and illusory feature binding in visible stimuli at peripheral locations. While statistical analyses and feature binding are normally 2 key functions of the visual system to construct coherent percepts of visual scenes, our results show that a high-level analysis combining those 2 functions is correctly performed by unconscious computations in the brain. (c) 2015 APA, all rights reserved.

  19. Impact of the motion and visual complexity of the background on players' performance in video game-like displays.

    Science.gov (United States)

    Caroux, Loïc; Le Bigot, Ludovic; Vibert, Nicolas

    2013-01-01

    The visual interfaces of virtual environments such as video games often show scenes where objects are superimposed on a moving background. Three experiments were designed to better understand the impact of the complexity and/or overall motion of two types of visual backgrounds often used in video games on the detection and use of superimposed, stationary items. The impact of background complexity and motion was assessed during two typical video game tasks: a relatively complex visual search task and a classic, less demanding shooting task. Background motion impaired participants' performance only when they performed the shooting game task, and only when the simplest of the two backgrounds was used. In contrast, and independently of background motion, performance on both tasks was impaired when the complexity of the background increased. Eye movement recordings demonstrated that most of the findings reflected the impact of low-level features of the two backgrounds on gaze control.

  20. The nature of impulsivity: visual exposure to natural environments decreases impulsive decision-making in a delay discounting task.

    Directory of Open Access Journals (Sweden)

    Meredith S Berry

    Full Text Available The benefits of visual exposure to natural environments for human well-being in areas of stress reduction, mood improvement, and attention restoration are well documented, but the effects of natural environments on impulsive decision-making remain unknown. Impulsive decision-making in delay discounting offers generality, predictive validity, and insight into decision-making related to unhealthy behaviors. The present experiment evaluated differences in such decision-making in humans experiencing visual exposure to one of the following conditions: natural (e.g., mountains), built (e.g., buildings), or control (e.g., triangles), using a delay discounting task that required participants to choose between immediate and delayed hypothetical monetary outcomes. Participants viewed the images before and during the delay discounting task. Participants were less impulsive in the condition providing visual exposure to natural scenes compared to built and geometric scenes. Results suggest that exposure to natural environments results in decreased impulsive decision-making relative to built environments.
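
    For readers unfamiliar with delay discounting, the choice between an immediate and a delayed outcome can be illustrated with the widely used hyperbolic model, V = A / (1 + kD), where A is the delayed amount, D the delay and k the discounting rate (a larger k means more impulsive choices). The abstract does not state which model was fitted, so the model, the amounts and the k values below are assumptions for illustration only, and the function names are hypothetical.

```python
def discounted_value(amount, delay, k):
    """Subjective present value under hyperbolic discounting: A / (1 + k * D)."""
    return amount / (1.0 + k * delay)

def chooses_delayed(immediate, delayed, delay, k):
    """True if the discounted delayed reward is worth more than the immediate one."""
    return discounted_value(delayed, delay, k) > immediate

def indifference_delay(immediate, delayed, k):
    """Delay at which both options have equal subjective value."""
    return (delayed / immediate - 1.0) / k
```

    Under these assumptions, a participant with k = 0.01 per day would wait 30 days for 100 units rather than take 50 now, whereas one with k = 0.1 would take the immediate 50; a lower fitted k in one viewing condition would correspond to the "less impulsive" pattern the abstract reports.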

  1. Emotional Scene Content Drives the Saccade Generation System Reflexively

    Science.gov (United States)

    Nummenmaa, Lauri; Hyona, Jukka; Calvo, Manuel G.

    2009-01-01

    The authors assessed whether parafoveal perception of emotional content influences saccade programming. In Experiment 1, paired emotional and neutral scenes were presented to parafoveal vision. Participants performed voluntary saccades toward either of the scenes according to an imperative signal (color cue). Saccadic reaction times were faster…

  2. Review of On-Scene Management of Mass-Casualty Attacks

    Directory of Open Access Journals (Sweden)

    Annelie Holgersson

    2016-02-01

    Full Text Available Background: The scene of a mass-casualty attack (MCA) entails a crime scene, a hazardous space, and a great number of people needing medical assistance. Public transportation has been the target of such attacks and involves a high probability of generating mass casualties. The review aimed to investigate challenges for on-scene responses to MCAs and suggestions made to counter these challenges, with special attention given to attacks on public transportation and associated terminals. Methods: Articles were found through PubMed and Scopus, “relevant articles” as defined by the databases, and a manual search of references. Inclusion criteria were that the article referred to attack(s) and/or a public transportation-related incident and issues concerning formal on-scene response. An appraisal of the articles’ scientific quality was conducted based on an evidence hierarchy model developed for the study. Results: One hundred and five articles were reviewed. Challenges for command and coordination on scene included establishing leadership, inter-agency collaboration, multiple incident sites, and logistics. Safety issues entailed knowledge and use of personal protective equipment, risk awareness and expectations, cordons, dynamic risk assessment, defensive versus offensive approaches, and joining forces. Communication concerns were equipment shortfalls, dialoguing, and providing information. Assessment problems were scene layout and interpreting environmental indicators as well as understanding setting-driven needs for specialist skills and resources. Triage and treatment difficulties included differing triage systems, directing casualties, uncommon injuries, field hospitals, level of care, providing psychological and pediatric care. Transportation hardships included scene access, distance to hospitals, and distribution of casualties. Conclusion: Commonly encountered challenges during unintentional incidents were added to during MCAs

  3. Top-down control of visual perception: attention in natural vision.

    Science.gov (United States)

    Rolls, Edmund T

    2008-01-01

    Top-down perceptual influences can bias (or pre-empt) perception. In natural scenes, the receptive fields of neurons in the inferior temporal visual cortex (IT) shrink to become close to the size of objects. This facilitates the read-out of information from the ventral visual system, because the information is primarily about the object at the fovea. Top-down attentional influences are much less evident in natural scenes than when objects are shown against blank backgrounds, though are still present. It is suggested that the reduced receptive-field size in natural scenes, and the effects of top-down attention contribute to change blindness. The receptive fields of IT neurons in complex scenes, though including the fovea, are frequently asymmetric around the fovea, and it is proposed that this is the solution the IT uses to represent multiple objects and their relative spatial positions in a scene. Networks that implement probabilistic decision-making are described, and it is suggested that, when in perceptual systems they take decisions (or 'test hypotheses'), they influence lower-level networks to bias visual perception. Finally, it is shown that similar processes extend to systems involved in the processing of emotion-provoking sensory stimuli, in that word-level cognitive states provide top-down biasing that reaches as far down as the orbitofrontal cortex, where, at the first stage of affective representations, olfactory, taste, flavour, and touch processing is biased (or pre-empted) in humans.

  4. The Nature and Predictive Value of Mothers’ Beliefs Regarding Infants’ and Toddlers’ TV/Video Viewing: Applying the Integrative Model of Behavioral Prediction

    Science.gov (United States)

    Vaala, Sarah E.

    2014-01-01

    Viewing television and video programming has become a normative behavior among US infants and toddlers. Little is understood about parents’ decision-making about the extent of their young children’s viewing, though numerous organizations are interested in reducing time spent viewing among infants and toddlers. Prior research has examined parents’ belief in the educational value of TV/videos for young children and the predictive value of this belief for understanding infant/toddler viewing rates, though other possible salient beliefs remain largely unexplored. This study employs the integrative model of behavioral prediction (Fishbein & Ajzen, 2010) to examine 30 maternal beliefs about infants’ and toddlers’ TV/video viewing which were elicited from a prior sample of mothers. Results indicate that mothers tend to hold more positive than negative beliefs about the outcomes associated with young children’s TV/video viewing, and that the nature of the aggregate set of beliefs is predictive of their general attitudes and intentions to allow their children to view, as well as children’s estimated viewing rates. Analyses also uncover multiple dimensions within the full set of beliefs, which explain more variance in mothers’ attitudes and intentions and children’s viewing than the uni-dimensional index. The theoretical and practical implications of the findings are discussed. PMID:25431537

  5. Clandestine laboratory scene investigation and processing using portable GC/MS

    Science.gov (United States)

    Matejczyk, Raymond J.

    1997-02-01

    This presentation describes the use of portable gas chromatography/mass spectrometry for on-scene investigation and processing of clandestine laboratories. Clandestine laboratory investigations present special problems to forensic investigators. These crime scenes contain many chemical hazards that must be detected, identified and collected as evidence. Gas chromatography/mass spectrometry performed on-scene with a rugged, portable unit is capable of analyzing a variety of matrices for drugs and chemicals used in the manufacture of illicit drugs, such as methamphetamine. Technologies used to detect various materials at a scene have particular applications but do not address the wide range of samples, chemicals, matrices and mixtures that exist in clan labs. Typical analyses performed by GC/MS are for the purpose of positively establishing the identity of starting materials, chemicals and end-product collected from clandestine laboratories. Concerns for the public and investigator safety and the environment are also important factors for rapid on-scene data generation. Here is described the implementation of a portable multiple-inlet GC/MS system designed for rapid deployment to a scene to perform forensic investigations of clandestine drug manufacturing laboratories. GC/MS has long been held as the 'gold standard' in performing forensic chemical analyses. With the capability of GC/MS to separate and produce a 'chemical fingerprint' of compounds, it is utilized as an essential technique for detecting and positively identifying chemical evidence. Rapid and conclusive on-scene analysis of evidence will assist the forensic investigators in collecting only pertinent evidence thereby reducing the amount of evidence to be transported, reducing chain of custody concerns, reducing costs and hazards, maintaining sample integrity and speeding the completion of the investigative process.

  6. Characterizing popularity dynamics of online videos

    OpenAIRE

    Ren, Zhuo-Ming; Shi, Yu-Qiang; Liao, Hao

    2016-01-01

    Online popularity has a major impact on videos, music, news and other contexts in online systems. It is natural to characterize online popularity dynamics by explaining the observed properties in terms of the popularity each individual has already acquired. In this paper, we provide a quantitative, large-scale, temporal analysis of the popularity dynamics in two online video websites, namely MovieLens and Netflix. The two collected data sets contain over 100 million records and even span...

  7. Fast-track video-assisted thoracoscopic surgery

    DEFF Research Database (Denmark)

    Holbek, Bo Laksafoss; Petersen, René Horsleben; Kehlet, Henrik

    2016-01-01

    Objectives: To provide a short overview of fast-track video-assisted thoracoscopic surgery (VATS) and to identify areas requiring further research. Design: A literature search was made using key words including: fast-track, enhanced recovery, video-assisted thoracoscopic surgery, robot-assisted thoracoscopic surgery (RATS), robotic, thoracotomy, single-incision, uniportal, natural orifice transluminal endoscopic surgery (NOTES), chest tube, air-leak, digital drainage, pain management, analgesia, perioperative management, anaesthesia and non-intubated. References from articles were screened for further...

  8. Study on a High Compression Processing for Video-on-Demand e-learning System

    Science.gov (United States)

    Nomura, Yoshihiko; Matsuda, Ryutaro; Sakamoto, Ryota; Sugiura, Tokuhiro; Matsui, Hirokazu; Kato, Norihiko

    The authors propose a system for creating high-quality, small lecture-video files for distance e-learning. Based on the features of a lecturing scene, the authors employ two kinds of image-capturing equipment with complementary characteristics: one is a digital video camera with a low resolution and a high frame rate, and the other is a digital still camera with a high resolution and a very low frame rate. By managing the two kinds of image-capturing equipment and integrating their output through image processing, course materials can be produced with a greatly reduced file size. The course materials satisfy the requirements both for temporal resolution, to follow the lecturer's pointing actions, and for high spatial resolution, to read small written letters. In a comparative experiment, the e-lecture using the proposed system was confirmed to be more effective than an ordinary lecture in terms of educational effect.

  9. Automatic Association of Chats and Video Tracks for Activity Learning and Recognition in Aerial Video Surveillance

    Directory of Open Access Journals (Sweden)

    Riad I. Hammoud

    2014-10-01

    Full Text Available We describe two advanced video analysis techniques, including video-indexed by voice annotations (VIVA) and multi-media indexing and explorer (MINER). VIVA utilizes analyst call-outs (ACOs) in the form of chat messages (voice-to-text) to associate labels with video target tracks, to designate spatial-temporal activity boundaries and to augment video tracking in challenging scenarios. Challenging scenarios include low-resolution sensors, moving targets and target trajectories obscured by natural and man-made clutter. MINER includes: (1) a fusion of graphical track and text data using probabilistic methods; (2) an activity pattern learning framework to support querying an index of activities of interest (AOIs) and targets of interest (TOIs) by movement type and geolocation; and (3) a user interface to support streaming multi-intelligence data processing. We also present an activity pattern learning framework that uses the multi-source associated data as training to index a large archive of full-motion videos (FMV). VIVA and MINER examples are demonstrated for wide aerial/overhead imagery over common data sets affording an improvement in tracking from video data alone, leading to 84% detection with modest misdetection/false alarm results due to the complexity of the scenario. The novel use of ACOs and chat messages in video tracking paves the way for user interaction, correction and preparation of situation awareness reports. (Sensors 2014, 14 19844)

  10. Automatic association of chats and video tracks for activity learning and recognition in aerial video surveillance.

    Science.gov (United States)

    Hammoud, Riad I; Sahin, Cem S; Blasch, Erik P; Rhodes, Bradley J; Wang, Tao

    2014-10-22

    We describe two advanced video analysis techniques, including video-indexed by voice annotations (VIVA) and multi-media indexing and explorer (MINER). VIVA utilizes analyst call-outs (ACOs) in the form of chat messages (voice-to-text) to associate labels with video target tracks, to designate spatial-temporal activity boundaries and to augment video tracking in challenging scenarios. Challenging scenarios include low-resolution sensors, moving targets and target trajectories obscured by natural and man-made clutter. MINER includes: (1) a fusion of graphical track and text data using probabilistic methods; (2) an activity pattern learning framework to support querying an index of activities of interest (AOIs) and targets of interest (TOIs) by movement type and geolocation; and (3) a user interface to support streaming multi-intelligence data processing. We also present an activity pattern learning framework that uses the multi-source associated data as training to index a large archive of full-motion videos (FMV). VIVA and MINER examples are demonstrated for wide aerial/overhead imagery over common data sets affording an improvement in tracking from video data alone, leading to 84% detection with modest misdetection/false alarm results due to the complexity of the scenario. The novel use of ACOs and chat messages in video tracking paves the way for user interaction, correction and preparation of situation awareness reports.

  11. Real-time maritime scene simulation for ladar sensors

    Science.gov (United States)

    Christie, Chad L.; Gouthas, Efthimios; Swierkowski, Leszek; Williams, Owen M.

    2011-06-01

    Continuing interest exists in the development of cost-effective synthetic environments for testing Laser Detection and Ranging (ladar) sensors. In this paper we describe a PC-based system for real-time ladar scene simulation of ships and small boats in a dynamic maritime environment. In particular, we describe the techniques employed to generate range imagery accompanied by passive radiance imagery. Our ladar scene generation system is an evolutionary extension of the VIRSuite infrared scene simulation program and includes all previous features such as ocean wave simulation, the physically-realistic representation of boat and ship dynamics, wake generation and simulation of whitecaps, spray, wake trails and foam. A terrain simulation extension is also under development. In this paper we outline the development, capabilities and limitations of the VIRSuite extensions.

  12. Making Time for Nature: Visual Exposure to Natural Environments Lengthens Subjective Time Perception and Reduces Impulsivity.

    Directory of Open Access Journals (Sweden)

    Meredith S Berry

    Full Text Available Impulsivity in delay discounting is associated with maladaptive behaviors such as overeating and drug and alcohol abuse. Researchers have recently noted that delay discounting, even when measured by a brief laboratory task, may be the best predictor of human health related behaviors (e.g., exercise) currently available. Identifying techniques to decrease impulsivity in delay discounting, therefore, could help improve decision-making on a global scale. Visual exposure to natural environments is one recent approach shown to decrease impulsive decision-making in a delay discounting task, although the mechanism driving this result is currently unknown. The present experiment was thus designed to evaluate not only whether visual exposure to natural (mountains, lakes) relative to built (buildings, cities) environments resulted in less impulsivity, but also whether this exposure influenced time perception. Participants were randomly assigned to either a natural environment condition or a built environment condition. Participants viewed photographs of either natural scenes or built scenes before and during a delay discounting task in which they made choices about receiving immediate or delayed hypothetical monetary outcomes. Participants also completed an interval bisection task in which natural or built stimuli were judged as relatively longer or shorter presentation durations. Following the delay discounting and interval bisection tasks, additional measures of time perception were administered, including how many minutes participants thought had passed during the session and a scale measurement of whether time "flew" or "dragged" during the session. Participants exposed to natural as opposed to built scenes were less impulsive and also reported longer subjective session times, although no differences across groups were revealed with the interval bisection task. These results are the first to suggest that decreased impulsivity from exposure to natural as

  13. Cybersickness in the presence of scene rotational movements along different axes.

    Science.gov (United States)

    Lo, W T; So, R H

    2001-02-01

    Compelling scene movements in a virtual reality (VR) system can cause symptoms of motion sickness (i.e., cybersickness). A within-subject experiment has been conducted to investigate the effects of scene oscillations along different axes on the level of cybersickness. Sixteen male participants were exposed to four 20-min VR simulation sessions. The four sessions used the same virtual environment but with scene oscillations along different axes, i.e., pitch, yaw, roll, or no oscillation (speed: 30 degrees/s, range: +/- 60 degrees). Verbal ratings of the level of nausea were taken at 5-min intervals during the sessions and sickness symptoms were also measured before and after the sessions using the Simulator Sickness Questionnaire (SSQ). In the presence of scene oscillation, both nausea ratings and SSQ scores increased at significantly higher rates than with no oscillation. While individual participants exhibited different susceptibilities to nausea associated with VR simulation containing scene oscillations along different rotational axes, the overall effects of axis among our group of 16 randomly selected participants were not significant. The main effects of, and interactions among, scene oscillation, duration, and participants are discussed in the paper.

  14. Framework of passive millimeter-wave scene simulation based on material classification

    Science.gov (United States)

    Park, Hyuk; Kim, Sung-Hyun; Lee, Ho-Jin; Kim, Yong-Hoon; Ki, Jae-Sug; Yoon, In-Bok; Lee, Jung-Min; Park, Soon-Jun

    2006-05-01

    Over the past few decades, passive millimeter-wave (PMMW) sensors have emerged as useful tools in transportation and military applications such as autonomous flight-landing systems, smart weapons, and night and all-weather vision systems. To predict the performance of a PMMW sensor efficiently and apply it to a system, testing in a SoftWare-In-the-Loop (SWIL) simulator is required. PMMW scene simulation is a key component of such a simulator. However, no commercial off-the-shelf product is available for constructing a PMMW scene simulation, and there have been only a few studies on this technology. We have studied a PMMW scene simulation method in order to develop a PMMW sensor SWIL simulator. This paper describes the framework of the PMMW scene simulation and tentative results. The purpose of the PMMW scene simulation is to generate sensor outputs (or images) from a visible image and environmental conditions. We organize it into four parts: material classification mapping, PMMW environmental setting, PMMW scene forming, and millimeter-wave (MMW) sensorworks. The background and the objects in the scene are classified based on properties related to MMW radiation and reflectivity. The environmental setting part calculates the following PMMW phenomenology: atmospheric propagation and emission, including sky temperature, weather conditions, and physical temperature. Then, PMMW raw images are formed with surface geometry. Finally, PMMW sensor outputs are generated from the PMMW raw images by applying sensor characteristics such as aperture size and noise level. Through the simulation process, PMMW phenomenology and sensor characteristics are simulated on the output scene. We have finished the design of the simulator framework and are working on the detailed implementation. As a tentative result, a flight observation was simulated under specific conditions. After implementation details, we plan to increase the reliability of the simulation by data collecting

  15. Predicting the usefulness and naturalness of color reproductions

    NARCIS (Netherlands)

    Janssen, T.J.W.M.; Blommaert, F.J.J.

    2000-01-01

    We present algorithms for predicting the usefulness and naturalness of color reproductions of natural scenes. The algorithms are based on a computational model of the stages that lead to an observer's impression of the usefulness and naturalness of an image. These stages are (1) the perception, or

  16. Hydrological AnthropoScenes

    Science.gov (United States)

    Cudennec, Christophe

    2016-04-01

    The Anthropocene concept encapsulates the planetary-scale changes resulting from accelerating socio-ecological transformations, beyond the stratigraphic definition actually in debate. The emergence of multi-scale and proteiform complexity requires inter-discipline and system approaches. Yet, to reduce the cognitive challenge of tackling this complexity, the global Anthropocene syndrome must now be studied from various topical points of view, and grounded at regional and local levels. A system approach should allow to identify AnthropoScenes, i.e. settings where a socio-ecological transformation subsystem is clearly coherent within boundaries and displays explicit relationships with neighbouring/remote scenes and within a nesting architecture. Hydrology is a key topical point of view to be explored, as it is important in many aspects of the Anthropocene, either with water itself being a resource, hazard or transport force; or through the network, connectivity, interface, teleconnection, emergence and scaling issues it determines. We will schematically exemplify these aspects with three contrasted hydrological AnthropoScenes in Tunisia, France and Iceland; and reframe therein concepts of the hydrological change debate. Bai X., van der Leeuw S., O'Brien K., Berkhout F., Biermann F., Brondizio E., Cudennec C., Dearing J., Duraiappah A., Glaser M., Revkin A., Steffen W., Syvitski J., 2016. Plausible and desirable futures in the Anthropocene: A new research agenda. Global Environmental Change, in press, http://dx.doi.org/10.1016/j.gloenvcha.2015.09.017 Brondizio E., O'Brien K., Bai X., Biermann F., Steffen W., Berkhout F., Cudennec C., Lemos M.C., Wolfe A., Palma-Oliveira J., Chen A. C-T. Re-conceptualizing the Anthropocene: A call for collaboration. Global Environmental Change, in review. Montanari A., Young G., Savenije H., Hughes D., Wagener T., Ren L., Koutsoyiannis D., Cudennec C., Grimaldi S., Blöschl G., Sivapalan M., Beven K., Gupta H., Arheimer B., Huang Y

  17. Hierarchical Model for the Similarity Measurement of a Complex Holed-Region Entity Scene

    Directory of Open Access Journals (Sweden)

    Zhanlong Chen

    2017-11-01

    Full Text Available Complex multi-holed-region entity scenes (i.e., sets of random regions with holes) are common in spatial database systems, spatial query languages, and the Geographic Information System (GIS). A multi-holed-region (region with an arbitrary number of holes) is an abstraction of the real world that primarily represents geographic objects that have more than one interior boundary, such as areas that contain several lakes or lakes that contain islands. When the similarity of the two complex holed-region entity scenes is measured, the number of regions in the scenes and the number of holes in the regions are usually different between the two scenes, which complicates the matching relationships of holed-regions and holes. The aim of this research is to develop several holed-region similarity metrics and propose a hierarchical model to measure comprehensively the similarity between two complex holed-region entity scenes. The procedure first divides a complex entity scene into three layers: a complex scene, a micro-spatial-scene, and a simple entity (hole). The relationships between the adjacent layers are considered to be sets of relationships, and each level of similarity measurements is nested with the adjacent one. Next, entity matching is performed from top to bottom, while the similarity results are calculated from local to global. In addition, we utilize position graphs to describe the distribution of the holed-regions and subsequently describe the directions between the holes using a feature matrix. A case study that uses the Great Lakes in North America in 1986 and 2015 as experimental data illustrates the entire similarity measurement process between two complex holed-region entity scenes. The experimental results show that the hierarchical model accounts for the relationships of the different layers in the entire complex holed-region entity scene. The model can effectively calculate the similarity of complex holed-region entity scenes, even if the

  18. Learning object-to-class kernels for scene classification.

    Science.gov (United States)

    Zhang, Lei; Zhen, Xiantong; Shao, Ling

    2014-08-01

    High-level image representations have drawn increasing attention in visual recognition, e.g., scene classification, since the invention of the object bank. The object bank represents an image as a response map of a large number of pretrained object detectors and has achieved superior performance for visual recognition. In this paper, based on the object bank representation, we propose the object-to-class (O2C) distances to model scene images. In particular, four variants of O2C distances are presented, and with the O2C distances, we can represent the images using the object bank by lower-dimensional but more discriminative spaces, called distance spaces, which are spanned by the O2C distances. Due to the explicit computation of O2C distances based on the object bank, the obtained representations can possess more semantic meanings. To combine the discriminant ability of the O2C distances to all scene classes, we further propose to kernalize the distance representation for the final classification. We have conducted extensive experiments on four benchmark data sets, UIUC-Sports, Scene-15, MIT Indoor, and Caltech-101, which demonstrate that the proposed approaches can significantly improve the original object bank approach and achieve the state-of-the-art performance.

  19. A clinical pilot study of a modular video-CT augmentation system for image-guided skull base surgery

    Science.gov (United States)

    Liu, Wen P.; Mirota, Daniel J.; Uneri, Ali; Otake, Yoshito; Hager, Gregory; Reh, Douglas D.; Ishii, Masaru; Gallia, Gary L.; Siewerdsen, Jeffrey H.

    2012-02-01

    Augmentation of endoscopic video with preoperative or intraoperative image data [e.g., planning data and/or anatomical segmentations defined in computed tomography (CT) and magnetic resonance (MR)], can improve navigation, spatial orientation, confidence, and tissue resection in skull base surgery, especially with respect to critical neurovascular structures that may be difficult to visualize in the video scene. This paper presents the engineering and evaluation of a video augmentation system for endoscopic skull base surgery translated to use in a clinical study. Extension of previous research yielded a practical system with a modular design that can be applied to other endoscopic surgeries, including orthopedic, abdominal, and thoracic procedures. A clinical pilot study is underway to assess feasibility and benefit to surgical performance by overlaying CT or MR planning data in realtime, high-definition endoscopic video. Preoperative planning included segmentation of the carotid arteries, optic nerves, and surgical target volume (e.g., tumor). An automated camera calibration process was developed that demonstrates mean re-projection accuracy (0.7+/-0.3) pixels and mean target registration error of (2.3+/-1.5) mm. An IRB-approved clinical study involving fifteen patients undergoing skull base tumor surgery is underway in which each surgery includes the experimental video-CT system deployed in parallel to the standard-of-care (unaugmented) video display. Questionnaires distributed to one neurosurgeon and two otolaryngologists are used to assess primary outcome measures regarding the benefit to surgical confidence in localizing critical structures and targets by means of video overlay during surgical approach, resection, and reconstruction.
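    The mean re-projection accuracy quoted above is commonly computed as the average pixel distance between detected image points and the corresponding 3D points projected through the calibrated camera model. The sketch below is illustrative only: it assumes an ideal pinhole camera without lens distortion, and the intrinsic matrix, pose and points are made-up values, not data from the study.

```python
import numpy as np


def project(points_3d, K, R, t):
    """Project Nx3 world points to Nx2 pixel coordinates (pinhole model, no distortion)."""
    cam = points_3d @ R.T + t        # world frame -> camera frame
    uvw = cam @ K.T                  # camera frame -> homogeneous pixel coordinates
    return uvw[:, :2] / uvw[:, 2:3]  # perspective divide


def mean_reprojection_error(points_3d, observed_px, K, R, t):
    """Mean Euclidean distance (in pixels) between observed and projected points."""
    residuals = project(points_3d, K, R, t) - observed_px
    return float(np.linalg.norm(residuals, axis=1).mean())


# Hypothetical calibration: focal length 800 px, principal point (320, 240).
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
R = np.eye(3)    # camera aligned with the world axes
t = np.zeros(3)  # camera at the world origin

points = np.array([[0.0, 0.0, 2.0], [0.1, 0.0, 2.0]])
# Simulated detections: the first point is off by (0.3, 0.4) px, the second is exact.
observed = project(points, K, R, t) + np.array([[0.3, 0.4], [0.0, 0.0]])
error_px = mean_reprojection_error(points, observed, K, R, t)
```

    A real endoscope calibration would also model lens distortion, which this sketch leaves out.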

  20. The role of memory for visual search in scenes.

    Science.gov (United States)

    Le-Hoa Võ, Melissa; Wolfe, Jeremy M

    2015-03-01

    Many daily activities involve looking for something. The ease with which these searches are performed often allows one to forget that searching represents complex interactions between visual attention and memory. Although a clear understanding exists of how search efficiency will be influenced by visual features of targets and their surrounding distractors or by the number of items in the display, the role of memory in search is less well understood. Contextual cueing studies have shown that implicit memory for repeated item configurations can facilitate search in artificial displays. When searching more naturalistic environments, other forms of memory come into play. For instance, semantic memory provides useful information about which objects are typically found where within a scene, and episodic scene memory provides information about where a particular object was seen the last time a particular scene was viewed. In this paper, we will review work on these topics, with special emphasis on the role of memory in guiding search in organized, real-world scenes. © 2015 New York Academy of Sciences.

  1. Characteristics of color memory for natural scenes

    Science.gov (United States)

    Amano, Kinjiro; Uchikawa, Keiji; Kuriki, Ichiro

    2002-08-01

    To study the characteristics of color memory for natural images, a memory-identification task was performed with differing color contrasts; three of the contrasts were defined by chromatic and luminance components of the image, and the others were defined with respect to the categorical colors. After observing a series of pictures successively, subjects identified the pictures using a confidence rating. Detection of increased contrasts tended to be harder than detection of decreased contrasts, suggesting that the chromaticness of pictures is enhanced in memory. Detecting changes within each color category was more difficult than across the categories. A multiple mechanism that processes color differences and categorical colors is briefly considered. © 2002 Optical Society of America

  2. A Two-Stream Deep Fusion Framework for High-Resolution Aerial Scene Classification

    Directory of Open Access Journals (Sweden)

    Yunlong Yu

    2018-01-01

    Full Text Available One of the challenging problems in understanding high-resolution remote sensing images is aerial scene classification. A well-designed feature representation method and classifier can improve classification accuracy. In this paper, we construct a new two-stream deep architecture for aerial scene classification. First, we use two pretrained convolutional neural networks (CNNs) as feature extractors to learn deep features from the original aerial image and the processed aerial image through saliency detection, respectively. Second, two feature fusion strategies are adopted to fuse the two different types of deep convolutional features extracted by the original RGB stream and the saliency stream. Finally, we use the extreme learning machine (ELM) classifier for final classification with the fused features. The effectiveness of the proposed architecture is tested on four challenging datasets: UC-Merced dataset with 21 scene categories, WHU-RS dataset with 19 scene categories, AID dataset with 30 scene categories, and NWPU-RESISC45 dataset with 45 challenging scene categories. The experimental results demonstrate that our architecture gets a significant classification accuracy improvement over all state-of-the-art references.
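    The fuse-then-classify step described above can be sketched as follows. The abstract does not say which two fusion strategies were used, so this sketch assumes simple concatenation, and it uses a basic textbook ELM (a fixed random hidden layer with output weights solved by regularized least squares). Random arrays stand in for the CNN features; the class name, function name and all parameters are hypothetical.

```python
import numpy as np


def fuse_concat(rgb_feats, saliency_feats):
    """Assumed fusion strategy: concatenate the two feature streams per sample."""
    return np.concatenate([rgb_feats, saliency_feats], axis=1)


class ELMClassifier:
    """Basic extreme learning machine: random hidden layer, least-squares readout."""

    def __init__(self, n_hidden=64, reg=1e-3, seed=0):
        self.n_hidden = n_hidden
        self.reg = reg
        self.rng = np.random.default_rng(seed)

    def _hidden(self, X):
        return np.tanh(X @ self.W + self.b)

    def fit(self, X, y):
        n_classes = int(y.max()) + 1
        # Hidden weights are drawn once at random and never trained.
        self.W = self.rng.normal(size=(X.shape[1], self.n_hidden))
        self.b = self.rng.normal(size=self.n_hidden)
        H = self._hidden(X)
        T = np.eye(n_classes)[y]  # one-hot targets
        # Ridge-regularized least squares for the output weights.
        A = H.T @ H + self.reg * np.eye(self.n_hidden)
        self.beta = np.linalg.solve(A, H.T @ T)
        return self

    def predict(self, X):
        return (self._hidden(X) @ self.beta).argmax(axis=1)


# Synthetic stand-ins for deep features from the two streams (two classes).
data_rng = np.random.default_rng(1)
labels = np.repeat([0, 1], 20)
rgb = data_rng.normal(size=(40, 4)) + 3 * labels[:, None]
sal = data_rng.normal(size=(40, 3)) - 3 * labels[:, None]

fused = fuse_concat(rgb, sal)
model = ELMClassifier().fit(fused, labels)
train_accuracy = float((model.predict(fused) == labels).mean())
```

    Because only the output weights are solved for, training reduces to one linear system, which is the usual argument for using an ELM on top of fixed deep features.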

  3. Research on hyperspectral dynamic scene and image sequence simulation

    Science.gov (United States)

    Sun, Dandan; Liu, Fang; Gao, Jiaobo; Sun, Kefeng; Hu, Yu; Li, Yu; Xie, Junhu; Zhang, Lei

    2016-10-01

    This paper presents a method for simulating hyperspectral dynamic scenes and image sequences, for use in evaluating hyperspectral equipment and target detection algorithms. Because of its high spectral resolution, strong band continuity, resistance to interference and other advantages, hyperspectral imaging technology has developed rapidly in recent years and is widely used in areas such as optoelectronic target detection, military defense and remote sensing systems. Digital imaging simulation, as a crucial part of hardware-in-the-loop simulation, can be applied to testing and evaluating hyperspectral imaging equipment with lower development cost and a shorter development period. Meanwhile, visual simulation can produce a large amount of original image data under various conditions for hyperspectral image feature extraction and classification algorithms. Based on a radiation physics model and material characteristic parameters, this paper proposes a method for generating digital scenes. By building multiple sensor models for different bands and bandwidths, hyperspectral scenes in the visible, MWIR and LWIR bands, with spectral resolutions of 0.01 μm, 0.05 μm and 0.1 μm, have been simulated. The final dynamic scenes are realistic and run in real time, with update frequencies up to 100 Hz. An image sequence is obtained by saving all the scene gray-level data from the same viewpoint. The analysis results show that, in both the infrared and the visible bands, the grayscale variations of the simulated hyperspectral images are consistent with the theoretical analysis results.

  4. Dynamics of scene representations in the human brain revealed by magnetoencephalography and deep neural networks

    Science.gov (United States)

    Cichy, Radoslaw Martin; Khosla, Aditya; Pantazis, Dimitrios; Oliva, Aude

    2017-01-01

    Human scene recognition is a rapid multistep process evolving over time from single scene image to spatial layout processing. We used multivariate pattern analyses on magnetoencephalography (MEG) data to unravel the time course of this cortical process. Following an early signal for lower-level visual analysis of single scenes at ~100 ms, we found a marker of real-world scene size, i.e. spatial layout processing, at ~250 ms indexing neural representations robust to changes in unrelated scene properties and viewing conditions. For a quantitative model of how scene size representations may arise in the brain, we compared MEG data to a deep neural network model trained on scene classification. Representations of scene size emerged intrinsically in the model, and resolved emerging neural scene size representation. Together our data provide a first description of an electrophysiological signal for layout processing in humans, and suggest that deep neural networks are a promising framework to investigate how spatial layout representations emerge in the human brain. PMID:27039703

  5. Recognition and attention guidance during contextual cueing in real-world scenes: evidence from eye movements.

    Science.gov (United States)

    Brockmole, James R; Henderson, John M

    2006-07-01

    When confronted with a previously encountered scene, what information is used to guide search to a known target? We contrasted the role of a scene's basic-level category membership with its specific arrangement of visual properties. Observers were repeatedly shown photographs of scenes that contained consistently but arbitrarily located targets, allowing target positions to be associated with scene content. Learned scenes were then unexpectedly mirror reversed, spatially translating visual features as well as the target across the display while preserving the scene's identity and concept. Mirror reversals produced a cost as the eyes initially moved toward the position in the display in which the target had previously appeared. The cost was not complete, however; when initial search failed, the eyes were quickly directed to the target's new position. These results suggest that in real-world scenes, shifts of attention are initially based on scene identity, and subsequent shifts are guided by more detailed information regarding scene and object layout.

  6. Towards intelligent video understanding applied to plasma facing component monitoring

    International Nuclear Information System (INIS)

    Martin, V.; Travere, J.M.; Moncada, V.; Bremond, F.

    2011-01-01

    In this paper, we promote intelligent plasma facing component video monitoring for both real-time purposes (machine protection issues) and post event analysis purposes (plasma-wall interaction understanding). We propose a vision-based system able to automatically detect and classify into different pre-defined categories thermal phenomena such as localized hot spots or transient thermal events (e.g. electrical arcing) from infrared imaging data of PFCs. This original computer vision system is made intelligent by endowing it with high level reasoning (i.e. integration of a priori knowledge of thermal event spatio-temporal properties to guide the recognition), self-adaptability to varying conditions (e.g. different thermal scenes and plasma scenarios), and learning capabilities (e.g. statistical modelling of event behaviour based on training samples). (authors)

  7. An Attempt at Assessing Preferences for Natural Landscapes

    Science.gov (United States)

    Calvin, James S.; And Others

    1972-01-01

    Investigation of ways in which man makes a psychological assessment of his environment. Concerned with variables in the environment itself, fifteen photographs of natural landscape scenes were rated on each of twenty-one semantic differential scales by college students. Two major dimensions emerged: natural scenic beauty and a natural force…

  8. Review of infrared scene projector technology-1993

    Science.gov (United States)

    Driggers, Ronald G.; Barnard, Kenneth J.; Burroughs, E. E.; Deep, Raymond G.; Williams, Owen M.

    1994-07-01

    The importance of testing IR imagers and missile seekers with realistic IR scenes warrants a review of the current technologies used in dynamic infrared scene projection. These technologies include resistive arrays, deformable mirror arrays, mirror membrane devices, liquid crystal light valves, laser writers, laser diode arrays, and CRTs. Other methods include frustrated total internal reflection, thermoelectric devices, galvanic cells, Bly cells, and vanadium dioxide. A description of each technology is presented along with a discussion of their relative benefits and disadvantages. The current state of each methodology is also summarized. Finally, the methods are compared and contrasted in terms of their performance parameters.

  9. Video Surveillance in Mental Health Facilities: Is it Ethical?

    Science.gov (United States)

    Stolovy, Tali; Melamed, Yuval; Afek, Arnon

    2015-05-01

    Video surveillance is a tool for managing safety and security within public spaces. In mental health facilities, the major benefit of video surveillance is that it enables 24 hour monitoring of patients, which has the potential to reduce violent and aggressive behavior. The major disadvantage is that such observation is by nature intrusive. It diminishes privacy, a factor of huge importance for psychiatric inpatients. Thus, an ongoing debate has developed following the increasing use of cameras in this setting. This article presents the experience of a medium-large academic state hospital that uses video surveillance, and explores the various ethical and administrative aspects of video surveillance in mental health facilities.

  10. Use of AFIS for linking scenes of crime.

    Science.gov (United States)

    Hefetz, Ido; Liptz, Yakir; Vaturi, Shaul; Attias, David

    2016-05-01

    Forensic intelligence can provide critical information in criminal investigations - the linkage of crime scenes. The Automatic Fingerprint Identification System (AFIS) is an example of a technological improvement that has advanced the entire forensic identification field to strive for new goals and achievements. In one example using AFIS, a series of burglaries into private apartments enabled a fingerprint examiner to search latent prints from different burglary scenes against an unsolved latent print database. Latent finger and palm prints coming from the same source were associated with more than 20 cases. Then, by forensic intelligence and profile analysis the offender's behavior could be anticipated. He was caught, identified, and arrested. It is recommended to perform an AFIS search of LT/UL prints against current crimes automatically as part of laboratory protocol and not by an examiner's discretion. This approach may link different crime scenes. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.

  11. VideoSET: Video Summary Evaluation through Text

    OpenAIRE

    Yeung, Serena; Fathi, Alireza; Fei-Fei, Li

    2014-01-01

    In this paper we present VideoSET, a method for Video Summary Evaluation through Text that can evaluate how well a video summary is able to retain the semantic information contained in its original video. We observe that semantics is most easily expressed in words, and develop a text-based approach for the evaluation. Given a video summary, a text representation of the video summary is first generated, and an NLP-based metric is then used to measure its semantic distance to ground-truth text ...

  12. Designing a Fitness Guide Video as a Learning Medium (Perancangan Video Panduan Fitnes sebagai Media Pembelajaran)

    Directory of Open Access Journals (Sweden)

    Rizkysari Meimaharani

    2013-06-01

    Full Text Available ABSTRACT A beginner-level fitness exercise tutorial video was designed as a learning and promotional medium for Life Gym. It provides guidance on correct movement in fitness training sessions for beginners, in particular for the gym's members, because it will be distributed free of charge to new members when they sign up. Editing the video tutorial requires adequate software and hardware for smooth production. The results also depend on the makers' general knowledge, their specific skills in directing and editing, their creativity, and the capability of the hardware, software and computer technology. The advantage of the video guide is that it allows members to understand good and correct movement and so avoid unwanted injury. The video presents not only movement guidance but also guidance on diet and proper eating patterns, so that training targets can be achieved more easily. Video editing technology makes it convenient for an organization to educate the public through video learning, and the video can also serve as promotional media for a service or for the organization related to the video's theme.

  13. Data Partitioning Technique for Improved Video Prioritization

    Directory of Open Access Journals (Sweden)

    Ismail Amin Ali

    2017-07-01

    Full Text Available A compressed video bitstream can be partitioned according to the coding priority of the data, allowing prioritized wireless communication or selective dropping in a congested channel. This technique is known as data partitioning in the H.264/Advanced Video Coding (AVC) codec, and this paper introduces a further sub-partition of one of the H.264/AVC codec’s three data partitions. Results show a 5 dB improvement in Peak Signal-to-Noise Ratio (PSNR) through this innovation. In particular, the data partition containing intra-coded residuals is sub-divided into data from those macroblocks (MBs) that are naturally intra-coded and those MBs forcibly inserted for non-periodic intra-refresh. Interactive user-to-user video streaming can benefit, because for that application HTTP adaptive streaming is inappropriate and the High Efficiency Video Coding (HEVC) codec is too energy demanding.

  14. Gordon Craig's Scene Project: a history open to revision

    Directory of Open Access Journals (Sweden)

    Luiz Fernando

    2014-09-01

    Full Text Available The article proposes a review of Gordon Craig’s Scene project, an invention patented in 1910 and developed until 1922. Craig himself remained ambiguous about whether or not it was an unfulfilled project. His son and biographer Edward Craig maintained that Craig’s original aims were never achieved because of technical limitations, and most of the scholars who examined the matter followed this position. Starting from the actual screen models preserved in the Bibliothèque Nationale de France, Craig’s original notebooks, and a short film from 1963, I argue that the patented project and the essay published in 1923 do indeed represent the materialisation of the dreamed device of the thousand scenes in one scene.

  15. Places in the Brain: Bridging Layout and Object Geometry in Scene-Selective Cortex.

    Science.gov (United States)

    Dillon, Moira R; Persichetti, Andrew S; Spelke, Elizabeth S; Dilks, Daniel D

    2017-06-13

    Diverse animal species primarily rely on sense (left-right) and egocentric distance (proximal-distal) when navigating the environment. Recent neuroimaging studies with human adults show that this information is represented in 2 scene-selective cortical regions-the occipital place area (OPA) and retrosplenial complex (RSC)-but not in a third scene-selective region-the parahippocampal place area (PPA). What geometric properties, then, does the PPA represent, and what is its role in scene processing? Here we hypothesize that the PPA represents relative length and angle, the geometric properties classically associated with object recognition, but only in the context of large extended surfaces that compose the layout of a scene. Using functional magnetic resonance imaging adaptation, we found that the PPA is indeed sensitive to relative length and angle changes in pictures of scenes, but not pictures of objects that reliably elicited responses to the same geometric changes in object-selective cortical regions. Moreover, we found that the OPA is also sensitive to such changes, while the RSC is tolerant to such changes. Thus, the geometric information typically associated with object recognition is also used during some aspects of scene processing. These findings provide evidence that scene-selective cortex differentially represents the geometric properties guiding navigation versus scene categorization. © The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  16. Estimating cotton canopy ground cover from remotely sensed scene reflectance

    International Nuclear Information System (INIS)

    Maas, S.J.

    1998-01-01

    Many agricultural applications require spatially distributed information on growth-related crop characteristics that could be supplied through aircraft or satellite remote sensing. A study was conducted to develop and test a methodology for estimating plant canopy ground cover for cotton (Gossypium hirsutum L.) from scene reflectance. Previous studies indicated that a relatively simple relationship between ground cover and scene reflectance could be developed based on linear mixture modeling. Theoretical analysis indicated that the effects of shadows in the scene could be compensated for by averaging the results obtained using scene reflectance in the red and near-infrared wavelengths. The methodology was tested using field data collected over several years from cotton test plots in Texas and California. Results of the study appear to verify the utility of this approach. Since the methodology relies on information that can be obtained solely through remote sensing, it would be particularly useful in applications where other field information, such as plant size, row spacing, and row orientation, is unavailable
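    The abstract above does not give its equations, so the following is only a minimal sketch of how a two-component linear mixture could be inverted for ground cover and then averaged over the red and near-infrared bands. The function names and the soil and canopy reflectance values are hypothetical, chosen for illustration; they are not taken from the study.

```python
def ground_cover_from_band(scene, soil, canopy):
    """Invert scene = gc * canopy + (1 - gc) * soil for the ground cover gc."""
    if canopy == soil:
        raise ValueError("canopy and soil reflectance must differ")
    return (scene - soil) / (canopy - soil)


def estimate_ground_cover(red, nir, soil_red, soil_nir, canopy_red, canopy_nir):
    """Average the red and near-infrared estimates, then clamp to [0, 1]."""
    gc_red = ground_cover_from_band(red, soil_red, canopy_red)
    gc_nir = ground_cover_from_band(nir, soil_nir, canopy_nir)
    gc = 0.5 * (gc_red + gc_nir)
    return min(1.0, max(0.0, gc))


# Hypothetical endmember reflectances (assumed, not measured values).
SOIL_RED, SOIL_NIR = 0.20, 0.25
CANOPY_RED, CANOPY_NIR = 0.04, 0.50

# A scene synthesized with 40 % cover should be recovered as about 0.4.
cover = estimate_ground_cover(0.136, 0.35, SOIL_RED, SOIL_NIR, CANOPY_RED, CANOPY_NIR)
print(round(cover, 3))
```

    In this sketch each band yields its own estimate; averaging the two is the step the abstract credits with compensating for shadow effects, although the exact form used in the study may differ.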

  17. Significance of perceptually relevant image decolorization for scene classification

    Science.gov (United States)

    Viswanathan, Sowmya; Divakaran, Govind; Soman, Kutti Padanyl

    2017-11-01

    Color images contain luminance and chrominance components representing the intensity and color information, respectively. The objective of this paper is to show the significance of incorporating chrominance information to the task of scene classification. An improved color-to-grayscale image conversion algorithm that effectively incorporates chrominance information is proposed using the color-to-gray structure similarity index and singular value decomposition to improve the perceptual quality of the converted grayscale images. The experimental results based on an image quality assessment for image decolorization and its success rate (using the Cadik and COLOR250 datasets) show that the proposed image decolorization technique performs better than eight existing benchmark algorithms for image decolorization. In the second part of the paper, the effectiveness of incorporating the chrominance component for scene classification tasks is demonstrated using a deep belief network-based image classification system developed using dense scale-invariant feature transforms. The amount of chrominance information incorporated into the proposed image decolorization technique is confirmed with the improvement to the overall scene classification accuracy. Moreover, the overall scene classification performance improved by combining the models obtained using the proposed method and conventional decolorization methods.

  18. Medial Temporal Lobe Contributions to Episodic Future Thinking: Scene Construction or Future Projection?

    Science.gov (United States)

    Palombo, D J; Hayes, S M; Peterson, K M; Keane, M M; Verfaellie, M

    2018-02-01

    Previous research has shown that the medial temporal lobes (MTL) are more strongly engaged when individuals think about the future than about the present, leading to the suggestion that future projection drives MTL engagement. However, future thinking tasks often involve scene processing, leaving open the alternative possibility that scene-construction demands, rather than future projection, are responsible for the MTL differences observed in prior work. This study explores this alternative account. Using functional magnetic resonance imaging, we directly contrasted MTL activity in 1) high scene-construction and low scene-construction imagination conditions matched in future thinking demands and 2) future-oriented and present-oriented imagination conditions matched in scene-construction demands. Consistent with the alternative account, the MTL was more active for the high versus low scene-construction condition. By contrast, MTL differences were not observed when comparing the future versus present conditions. Moreover, the magnitude of MTL activation was associated with the extent to which participants imagined a scene but was not associated with the extent to which participants thought about the future. These findings help disambiguate which component processes of imagination specifically involve the MTL. Published by Oxford University Press 2016.

  19. SeeCoast: persistent surveillance and automated scene understanding for ports and coastal areas

    Science.gov (United States)

    Rhodes, Bradley J.; Bomberger, Neil A.; Freyman, Todd M.; Kreamer, William; Kirschner, Linda; L'Italien, Adam C.; Mungovan, Wendy; Stauffer, Chris; Stolzar, Lauren; Waxman, Allen M.; Seibert, Michael

    2007-04-01

    SeeCoast is a prototype US Coast Guard port and coastal area surveillance system that aims to reduce operator workload while maintaining optimal domain awareness by shifting their focus from having to detect events to being able to analyze and act upon the knowledge derived from automatically detected anomalous activities. The automated scene understanding capability provided by the baseline SeeCoast system (as currently installed at the Joint Harbor Operations Center at Hampton Roads, VA) results from the integration of several components. Machine vision technology processes the real-time video streams provided by USCG cameras to generate vessel track and classification (based on vessel length) information. A multi-INT fusion component generates a single, coherent track picture by combining information available from the video processor with that from surface surveillance radars and AIS reports. Based on this track picture, vessel activity is analyzed by SeeCoast to detect user-defined unsafe, illegal, and threatening vessel activities using a rule-based pattern recognizer and to detect anomalous vessel activities on the basis of automatically learned behavior normalcy models. Operators can optionally guide the learning system in the form of examples and counter-examples of activities of interest, and refine the performance of the learning system by confirming alerts or indicating examples of false alarms. The fused track picture also provides a basis for automated control and tasking of cameras to detect vessels in motion. Real-time visualization combining the products of all SeeCoast components in a common operating picture is provided by a thin web-based client.

  20. Eye movements and attention in reading, scene perception, and visual search.

    Science.gov (United States)

    Rayner, Keith

    2009-08-01

    Eye movements are now widely used to investigate cognitive processes during reading, scene perception, and visual search. In this article, research on the following topics is reviewed with respect to reading: (a) the perceptual span (or span of effective vision), (b) preview benefit, (c) eye movement control, and (d) models of eye movements. Related issues with respect to eye movements during scene perception and visual search are also reviewed. It is argued that research on eye movements during reading has been somewhat advanced over research on eye movements in scene perception and visual search and that some of the paradigms developed to study reading should be more widely adopted in the study of scene perception and visual search. Research dealing with "real-world" tasks and research utilizing the visual-world paradigm are also briefly discussed.

  1. Tracing Sequential Video Production

    DEFF Research Database (Denmark)

    Otrel-Cass, Kathrin; Khalid, Md. Saifuddin

    2015-01-01

    , for one week in 2014, and collected and analyzed visual data to learn about scientists’ practices. The visual material that was collected represented the agreed on material artifacts that should aid the students' reflective process to make sense of science technology practices. It was up to the student...... video, nature of the interactional space, and material and spatial semiotics....

  2. Moving object detection in top-view aerial videos improved by image stacking

    Science.gov (United States)

    Teutsch, Michael; Krüger, Wolfgang; Beyerer, Jürgen

    2017-08-01

    Image stacking is a well-known method that is used to improve the quality of images in video data. A set of consecutive images is aligned by applying image registration and warping. In the resulting image stack, each pixel has redundant information about its intensity value. This redundant information can be used to suppress image noise, resharpen blurry images, or even enhance the spatial image resolution as done in super-resolution. Small moving objects in the videos usually get blurred or distorted by image stacking and thus need to be handled explicitly. We use image stacking in an innovative way: image registration is applied to small moving objects only, and image warping blurs the stationary background that surrounds the moving objects. Our video data are coming from a small fixed-wing unmanned aerial vehicle (UAV) that acquires top-view gray-value images of urban scenes. Moving objects are mainly cars but also other vehicles such as motorcycles. The resulting images, after applying our proposed image stacking approach, are used to improve baseline algorithms for vehicle detection and segmentation. We improve precision and recall by up to 0.011, which corresponds to a reduction of the number of false positive and false negative detections by more than 3 per second. Furthermore, we show how our proposed image stacking approach can be implemented efficiently.
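    As a rough illustration of the conventional stacking idea described above (not the authors' variant, which registers only the moving objects), the sketch below averages frames that are assumed to be already registered to a common reference. The helper name stack_mean and the tiny sample frames are hypothetical.

```python
def stack_mean(frames):
    """Per-pixel mean of equally sized, already registered gray-value frames.

    Each frame is a list of rows; averaging the redundant observations of a
    pixel suppresses zero-mean noise.
    """
    if not frames:
        raise ValueError("need at least one frame")
    height, width = len(frames[0]), len(frames[0][0])
    n = len(frames)
    return [
        [sum(frame[y][x] for frame in frames) / n for x in range(width)]
        for y in range(height)
    ]


# Three noisy observations of the same 2 x 2 patch (made-up values).
frames = [
    [[10, 20], [30, 40]],
    [[12, 18], [30, 44]],
    [[8, 22], [30, 36]],
]
print(stack_mean(frames))
```

    Registration and warping, which the abstract treats as the essential first step, are deliberately left out here; in practice they would come from an image registration routine applied before the averaging.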

  3. Film grain noise modeling in advanced video coding

    Science.gov (United States)

    Oh, Byung Tae; Kuo, C.-C. Jay; Sun, Shijun; Lei, Shawmin

    2007-01-01

    A new technique for film grain noise extraction, modeling and synthesis is proposed and applied to the coding of high definition video in this work. The film grain noise is viewed as a part of artistic presentation by people in the movie industry. On one hand, since the film grain noise can boost the natural appearance of pictures in high definition video, it should be preserved in high-fidelity video processing systems. On the other hand, video coding with film grain noise is expensive. It is desirable to extract film grain noise from the input video as a pre-processing step at the encoder and re-synthesize the film grain noise and add it back to the decoded video as a post-processing step at the decoder. Under this framework, the coding gain of the denoised video is higher while the quality of the final reconstructed video can still be well preserved. Following this idea, we present a method to remove film grain noise from image/video without distorting its original content. Besides, we describe a parametric model containing a small set of parameters to represent the extracted film grain noise. The proposed model generates the film grain noise that is close to the real one in terms of power spectral density and cross-channel spectral correlation. Experimental results are shown to demonstrate the efficiency of the proposed scheme.

  4. Robotic Discovery of the Auditory Scene

    National Research Council Canada - National Science Library

    Martinson, E; Schultz, A

    2007-01-01

    .... Motivated by the large negative effect of ambient noise sources on robot audition, the long-term goal is to provide awareness of the auditory scene to a robot, so that it may more effectively act...

  5. Scene Recognition for Indoor Localization Using a Multi-Sensor Fusion Approach

    Directory of Open Access Journals (Sweden)

    Mengyun Liu

    2017-12-01

    Full Text Available After decades of research, there is still no solution for indoor localization like the GNSS (Global Navigation Satellite System) solution for outdoor environments. The major reasons for this phenomenon are the complex spatial topology and RF transmission environment. To deal with these problems, an indoor scene constrained method for localization is proposed in this paper, which is inspired by the visual cognition ability of the human brain and the progress in the computer vision field regarding high-level image understanding. Furthermore, a multi-sensor fusion method is implemented on a commercial smartphone including cameras, WiFi and inertial sensors. Compared to former research, the camera on a smartphone is used to “see” which scene the user is in. With this information, a particle filter algorithm constrained by scene information is adopted to determine the final location. For indoor scene recognition, we take advantage of deep learning that has been proven to be highly effective in the computer vision community. For the particle filter, both WiFi and magnetic field signals are used to update the weights of particles. Similar to other fingerprinting localization methods, there are two stages in the proposed system, offline training and online localization. In the offline stage, an indoor scene model is trained by Caffe (one of the most popular open source frameworks for deep learning), and a fingerprint database is constructed by user trajectories in different scenes. To reduce the volume requirement of training data for deep learning, a fine-tuning method is adopted for model training. In the online stage, a camera in a smartphone is used to recognize the initial scene. Then a particle filter algorithm is used to fuse the sensor data and determine the final location. To prove the effectiveness of the proposed method, an Android client and a web server are implemented. The Android client is used to collect data and locate a user. The web

  6. Adaptive attunement of selective covert attention to evolutionary-relevant emotional visual scenes.

    Science.gov (United States)

    Fernández-Martín, Andrés; Gutiérrez-García, Aída; Capafons, Juan; Calvo, Manuel G

    2017-05-01

    We investigated selective attention to emotional scenes in peripheral vision, as a function of adaptive relevance of scene affective content for male and female observers. Pairs of emotional-neutral images appeared peripherally-with perceptual stimulus differences controlled-while viewers were fixating on a different stimulus in central vision. Early selective orienting was assessed by the probability of directing the first fixation towards either scene, and the time until first fixation. Emotional scenes selectively captured covert attention even when they were task-irrelevant, thus revealing involuntary, automatic processing. Sex of observers and specific emotional scene content (e.g., male-to-female-aggression, families and babies, etc.) interactively modulated covert attention, depending on adaptive priorities and goals for each sex, both for pleasant and unpleasant content. The attentional system exhibits domain-specific and sex-specific biases and attunements, probably rooted in evolutionary pressures to enhance reproductive and protective success. Emotional cues selectively capture covert attention based on their bio-social significance. Copyright © 2017 Elsevier Inc. All rights reserved.

  7. High dynamic range adaptive real-time smart camera: an overview of the HDR-ARTiST project

    Science.gov (United States)

    Lapray, Pierre-Jean; Heyrman, Barthélémy; Ginhac, Dominique

    2015-04-01

    Standard cameras capture only a fraction of the information that is visible to the human visual system. This is especially true for natural scenes that include areas of low and high illumination due to transitions between sunlit and shaded areas. When capturing such a scene, many cameras are unable to store the full Dynamic Range (DR), resulting in low quality video where details are concealed in shadows or washed out by sunlight. The imaging technique that can overcome this problem is called HDR (High Dynamic Range) imaging. This paper describes a complete smart camera built around a standard off-the-shelf LDR (Low Dynamic Range) sensor and a Virtex-6 FPGA board. This smart camera, called HDR-ARtiSt (High Dynamic Range Adaptive Real-time Smart camera), is able to produce a real-time HDR live video color stream by recording and combining multiple acquisitions of the same scene while varying the exposure time. This technique appears to be one of the most appropriate and cheapest solutions for enhancing the dynamic range of real-life environments. HDR-ARtiSt embeds real-time multiple captures, HDR processing, data display and transfer of an HDR color video at full sensor resolution (1280 × 1024 pixels) at 60 frames per second. The main contributions of this work are: (1) Multiple Exposure Control (MEC) dedicated to smart image capture with three alternating exposure times that are dynamically evaluated from frame to frame, (2) a Multi-streaming Memory Management Unit (MMMU) dedicated to the memory read/write operations of the three parallel video streams, corresponding to the different exposure times, (3) HDR creation by combining the video streams using a specific hardware version of Debevec's technique, and (4) Global Tone Mapping (GTM) of the HDR scene for display on a standard LCD monitor.
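    To make the multi-exposure merging step concrete, here is a minimal software sketch in the spirit of a Debevec-style weighted merge for a single pixel. It assumes a linear sensor response and 8-bit values, which is a simplification (the original technique also recovers the camera response curve), and it says nothing about the FPGA implementation in the paper. The names hat_weight and merge_pixel are hypothetical.

```python
import math


def hat_weight(z, z_min=0, z_max=255):
    """Triangle weight: trust mid-range pixel values most, extremes least."""
    mid = 0.5 * (z_min + z_max)
    return z - z_min if z <= mid else z_max - z


def merge_pixel(values, exposure_times):
    """Estimate relative radiance from one pixel seen at several exposures.

    Assumes a linear response, so each sample estimates radiance as z / t;
    the estimates are averaged in the log domain using hat weights.
    """
    num = 0.0
    den = 0.0
    for z, t in zip(values, exposure_times):
        w = hat_weight(z)
        if w > 0:  # skips fully dark (0) and saturated (255) samples
            num += w * (math.log(z) - math.log(t))
            den += w
    if den == 0.0:
        # Every sample is under- or over-exposed: fall back to the
        # shortest exposure, which is the least likely to be saturated.
        t, z = min(zip(exposure_times, values))
        return z / t
    return math.exp(num / den)


# One pixel captured at three exposure times (seconds); values are made up.
print(merge_pixel([10, 50, 200], [0.01, 0.05, 0.2]))
```

    Applying merge_pixel to every pixel of three aligned frames gives a radiance map that still needs tone mapping before display on a standard monitor, which corresponds to step (4) in the abstract.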

  8. Adaptive attunement of selective covert attention to evolutionary-relevant emotional visual scenes

    OpenAIRE

    Fernández-Martín, Andrés (UNIR); Gutiérrez-García, Aida; Capafons, Juan; Calvo, Manuel G

    2017-01-01

    We investigated selective attention to emotional scenes in peripheral vision, as a function of adaptive relevance of scene affective content for male and female observers. Pairs of emotional-neutral images appeared peripherally, with perceptual stimulus differences controlled, while viewers were fixating on a different stimulus in central vision. Early selective orienting was assessed by the probability of directing the first fixation towards either scene, and the time until first fixation. Emo...

  9. Knowledge Guided Disambiguation for Large-Scale Scene Classification With Multi-Resolution CNNs

    Science.gov (United States)

    Wang, Limin; Guo, Sheng; Huang, Weilin; Xiong, Yuanjun; Qiao, Yu

    2017-04-01

    Convolutional Neural Networks (CNNs) have made remarkable progress on scene recognition, partly due to recent large-scale scene datasets such as Places and Places2. Scene categories are often defined by multi-level information, including local objects, global layout, and background environment, thus leading to large intra-class variations. In addition, with the increasing number of scene categories, label ambiguity has become another crucial issue in large-scale classification. This paper focuses on large-scale scene recognition and makes two major contributions to tackle these issues. First, we propose a multi-resolution CNN architecture that captures visual content and structure at multiple levels. The multi-resolution CNNs are composed of coarse resolution CNNs and fine resolution CNNs, which are complementary to each other. Second, we design two knowledge guided disambiguation techniques to deal with the problem of label ambiguity. (i) We exploit the knowledge from the confusion matrix computed on validation data to merge ambiguous classes into a super category. (ii) We utilize the knowledge of extra networks to produce a soft label for each image. Then the super categories or soft labels are employed to guide CNN training on Places2. We conduct extensive experiments on three large-scale image datasets (ImageNet, Places, and Places2), demonstrating the effectiveness of our approach. Furthermore, our method took part in two major scene recognition challenges, achieving second place at the Places2 challenge in ILSVRC 2015 and first place at the LSUN challenge in CVPR 2016. Finally, we directly test the learned representations on other scene benchmarks, and obtain new state-of-the-art results on MIT Indoor67 (86.7%) and SUN397 (72.0%). We release the code and models at https://github.com/wanglimin/MRCNN-Scene-Recognition.
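
    Technique (i), merging ambiguous classes into super categories from a validation confusion matrix, might be sketched as follows. The abstract does not give the merging rule, so the symmetric confusion rate, the threshold, and the function name below are assumptions made only for illustration; this is not the authors' released code.

```python
def merge_confused_classes(confusion, threshold):
    """Group classes whose mutual confusion rate exceeds `threshold`.

    confusion[i][j] is the number of validation images of true class i that
    were predicted as class j. Returns a list that maps each class index to a
    super-category id (ids are numbered in order of first appearance).
    """
    n = len(confusion)
    parent = list(range(n))  # union-find forest over class indices

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    totals = [sum(row) for row in confusion]
    for i in range(n):
        for j in range(i + 1, n):
            if totals[i] == 0 or totals[j] == 0:
                continue  # no validation samples, nothing to compare
            # Average of "i mistaken for j" and "j mistaken for i" (assumed rule).
            rate = 0.5 * (confusion[i][j] / totals[i] + confusion[j][i] / totals[j])
            if rate > threshold:
                parent[find(j)] = find(i)

    ids = {}
    return [ids.setdefault(find(c), len(ids)) for c in range(n)]


# Classes 0 and 1 are frequently confused with each other; 2 and 3 are not.
confusion = [
    [80, 18, 1, 1],
    [20, 75, 3, 2],
    [1, 2, 95, 2],
    [0, 1, 4, 95],
]
super_category = merge_confused_classes(confusion, threshold=0.1)
```

    The resulting mapping can then be used to relabel training images with super-category ids before training a classifier on the coarser label set.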

  10. Contextual Guidance of Eye Movements and Attention in Real-World Scenes: The Role of Global Features in Object Search

    Science.gov (United States)

    Torralba, Antonio; Oliva, Aude; Castelhano, Monica S.; Henderson, John M.

    2006-01-01

    Many experiments have shown that the human visual system makes extensive use of contextual information for facilitating object search in natural scenes. However, the question of how to formally model contextual influences is still open. On the basis of a Bayesian framework, the authors present an original approach of attentional guidance by global…

  11. Analysis of body fluids for forensic purposes: from laboratory testing to non-destructive rapid confirmatory identification at a crime scene.

    Science.gov (United States)

    Virkler, Kelly; Lednev, Igor K

    2009-07-01

    Body fluid traces recovered at crime scenes are among the most important types of evidence to forensic investigators. They contain valuable DNA evidence which can identify a suspect or victim as well as exonerate an innocent individual. The first step of identifying a particular body fluid is highly important since the nature of the fluid is itself very informative to the investigation, and the destructive nature of a screening test must be considered when only a small amount of material is available. The ability to characterize an unknown stain at the scene of the crime without having to wait for results from a laboratory is another very critical step in the development of forensic body fluid analysis. Driven by the importance for forensic applications, body fluid identification methods have been extensively developed in recent years. The systematic analysis of these new developments is vital for forensic investigators to be continuously educated on possible superior techniques. Significant advances in laser technology and the development of novel light detectors have dramatically improved spectroscopic methods for molecular characterization over the last decade. The application of this novel biospectroscopy for forensic purposes opens new and exciting opportunities for the development of on-field, non-destructive, confirmatory methods for body fluid identification at a crime scene. In addition, the biospectroscopy methods are universally applicable to all body fluids unlike the majority of current techniques which are valid for individual fluids only. This article analyzes the current methods being used to identify body fluid stains including blood, semen, saliva, vaginal fluid, urine, and sweat, and also focuses on new techniques that have been developed in the last 5-6 years. In addition, the potential of new biospectroscopic techniques based on Raman and fluorescence spectroscopy is evaluated for rapid, confirmatory, non-destructive identification of a body

  12. Occlusion Handling in Videos Object Tracking: A Survey

    International Nuclear Information System (INIS)

    Lee, B Y; Liew, L H; Cheah, W S; Wang, Y C

    2014-01-01

    Object tracking in video has been an active research area for decades. This interest is motivated by numerous applications, such as surveillance, human-computer interaction, and sports event monitoring. Many challenges in tracking objects remain. They arise from abrupt object motion, changing appearance patterns of objects and the scene, non-rigid object structures and, most significantly, occlusion of the tracked object, whether object-to-object or object-to-scene. Generally, occlusion in object tracking occurs in three situations: self-occlusion, inter-object occlusion, and occlusion by background scene structure. Self-occlusion occurs most frequently while tracking articulated objects, when one part of the object occludes another. Inter-object occlusion occurs when two objects being tracked occlude each other, whereas occlusion by the background occurs when a structure in the background occludes the tracked objects. Typically, tracking methods handle occlusion by modelling the object motion using linear and non-linear dynamic models. The derived models are used to continuously predict the object location while a tracked object is occluded, until the object reappears. Examples of these methods are Kalman filtering and particle filtering trackers. Researchers have also utilised other features to resolve occlusion, for example silhouette projections, colour histograms and optical flow. In this paper, we present some results from a previously conducted experiment on tracking a single object using Kalman filter, particle filter and Mean Shift trackers under various occlusion situations. We also review various other occlusion handling methods that involve using multiple cameras. In a nutshell, the goal of this paper is to discuss in detail the problem of occlusion in object tracking and review the state-of-the-art occlusion handling methods, classify them into different categories, and identify new trends. Moreover, we discuss the important
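
    The idea of predicting an object's location from a motion model while it is occluded can be shown with a minimal one-dimensional Kalman filter. This is a sketch under stated assumptions, not the survey's experimental setup: it assumes a constant-velocity model, position-only measurements, and illustrative noise variances, and it marks occluded frames with None. The class and function names are hypothetical.

```python
class ConstantVelocityKalman1D:
    """Kalman filter with state [position, velocity] and position measurements."""

    def __init__(self, position, velocity=0.0, process_var=1e-2, meas_var=1.0):
        self.x = [position, velocity]
        self.P = [[1.0, 0.0], [0.0, 1.0]]  # state covariance
        self.q = process_var               # process noise added to the diagonal
        self.r = meas_var                  # measurement noise variance

    def predict(self, dt=1.0):
        p, v = self.x
        self.x = [p + v * dt, v]
        (a, b), (c, d) = self.P
        # P <- F P F^T + Q with F = [[1, dt], [0, 1]]
        self.P = [
            [a + dt * (b + c) + dt * dt * d + self.q, b + dt * d],
            [c + dt * d, d + self.q],
        ]

    def update(self, z):
        (a, b), (c, d) = self.P
        s = a + self.r                     # innovation variance, H = [1, 0]
        k0, k1 = a / s, c / s              # Kalman gain
        y = z - self.x[0]                  # innovation
        self.x = [self.x[0] + k0 * y, self.x[1] + k1 * y]
        # P <- (I - K H) P
        self.P = [[(1 - k0) * a, (1 - k0) * b], [c - k1 * a, d - k1 * b]]


def track(measurements, dt=1.0):
    """Return position estimates; a None measurement means the object is occluded."""
    first = next(m for m in measurements if m is not None)
    kf = ConstantVelocityKalman1D(first)
    estimates = []
    for z in measurements:
        kf.predict(dt)
        if z is not None:
            kf.update(z)   # visible: correct the prediction
        estimates.append(kf.x[0])  # occluded: keep the prediction as-is
    return estimates


# Object moves one unit per frame and is hidden during frames 6 to 8.
observed = [0, 1, 2, 3, 4, 5, None, None, None, 9, 10]
estimates = track(observed)
```

    During the three occluded frames the filter only runs its prediction step, so the estimate keeps advancing at the last estimated velocity until measurements resume.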

  13. Occlusion Handling in Videos Object Tracking: A Survey

    Science.gov (United States)

    Lee, B. Y.; Liew, L. H.; Cheah, W. S.; Wang, Y. C.

    2014-02-01

    Object tracking in video has been an active research area for decades. This interest is motivated by numerous applications, such as surveillance, human-computer interaction, and sports event monitoring. Many challenges in tracking objects remain. They arise from abrupt object motion, changing appearance patterns of objects and the scene, non-rigid object structures and, most significantly, occlusion of the tracked object, whether object-to-object or object-to-scene. Generally, occlusion in object tracking occurs in three situations: self-occlusion, inter-object occlusion, and occlusion by background scene structure. Self-occlusion occurs most frequently while tracking articulated objects, when one part of the object occludes another. Inter-object occlusion occurs when two objects being tracked occlude each other, whereas occlusion by the background occurs when a structure in the background occludes the tracked objects. Typically, tracking methods handle occlusion by modelling the object motion using linear and non-linear dynamic models. The derived models are used to continuously predict the object location while a tracked object is occluded, until the object reappears. Examples of these methods are Kalman filtering and particle filtering trackers. Researchers have also utilised other features to resolve occlusion, for example silhouette projections, colour histograms and optical flow. In this paper, we present some results from a previously conducted experiment on tracking a single object using Kalman filter, particle filter and Mean Shift trackers under various occlusion situations. We also review various other occlusion handling methods that involve using multiple cameras. In a nutshell, the goal of this paper is to discuss in detail the problem of occlusion in object tracking and review the state-of-the-art occlusion handling methods, classify them into different categories, and identify new trends. Moreover, we discuss the important

  14. Representations and Techniques for 3D Object Recognition and Scene Interpretation

    CERN Document Server

    Hoiem, Derek

    2011-01-01

    One of the grand challenges of artificial intelligence is to enable computers to interpret 3D scenes and objects from imagery. This book organizes and introduces major concepts in 3D scene and object representation and inference from still images, with a focus on recent efforts to fuse models of geometry and perspective with statistical machine learning. The book is organized into three sections: (1) Interpretation of Physical Space; (2) Recognition of 3D Objects; and (3) Integrated 3D Scene Interpretation. The first discusses representations of spatial layout and techniques to interpret physi

  15. Characteristics of nontrauma scene flights for air medical transport.

    Science.gov (United States)

    Krebs, Margaret G; Fletcher, Erica N; Werman, Howard; McKenzie, Lara B

    2014-01-01

    Little is known about the use of air medical transport for patients with medical, rather than traumatic, emergencies. This study describes the practices of air transport programs, with respect to nontrauma scene responses, in several areas throughout the United States and Canada. A descriptive, retrospective study was conducted of all nontrauma scene flights from 2008 and 2009. Flight information and patient demographic data were collected from 5 air transport programs. Descriptive statistics were used to examine indications for transport, Glasgow Coma Scale scores, and loaded miles traveled. A total of 1,785 nontrauma scene flights were evaluated. The percentage of scene flights contributed by nontraumatic emergencies varied between programs, ranging from 0% to 44.3%. The most common indication for transport was cardiac, non-ST-segment elevation myocardial infarction (22.9%). Cardiac arrest was the indication for transport in 2.5% of flights. One air transport program reported a high percentage (49.4%) of neurologic (stroke) flights. The use of air transport for nontraumatic emergencies varied considerably between air transport programs and regions. More research is needed to evaluate which nontraumatic emergencies benefit from air transport. National guidelines regarding the use of air transport for nontraumatic emergencies are needed. Copyright © 2014 Air Medical Journal Associates. Published by Elsevier Inc. All rights reserved.

  16. Narrative Collage of Image Collections by Scene Graph Recombination.

    Science.gov (United States)

    Fang, Fei; Yi, Miao; Feng, Hui; Hu, Shenghong; Xiao, Chunxia

    2017-10-04

    Narrative collage is an image editing art that summarizes the main theme or storyline behind an image collection. We present a novel method to generate narrative images with plausible semantic scene structures. To achieve this goal, we introduce a layer graph and a scene graph to represent the relative depth order and the semantic relationships between image objects, respectively. We first cluster the input image collection to select representative images, and then extract a group of semantically salient objects from each representative image. Both layer graphs and scene graphs are constructed and combined according to our specific rules for reorganizing the extracted objects in every image. We design an energy model to appropriately locate every object on the final canvas. Experimental results show that our method can produce competitive narrative collage results and works well on a wide range of image collections.
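
    One way to picture how a layer graph encoding relative depth order could drive compositing is a topological sort that yields a back-to-front drawing order. The abstract does not specify the graph's construction or combination rules, so the edge convention (back, front), the function name, and the example objects below are purely illustrative assumptions.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def paint_order(behind_edges):
    """Return objects in back-to-front order.

    behind_edges: iterable of (back, front) pairs, meaning `back` lies behind
    `front`. Raises graphlib.CycleError if the depth relations are contradictory.
    """
    sorter = TopologicalSorter()
    for back, front in behind_edges:
        sorter.add(front, back)  # `front` must come after its predecessor `back`
    return list(sorter.static_order())


# Hypothetical layer graph for a collage with four extracted objects.
edges = [("sky", "mountain"), ("mountain", "tree"), ("tree", "person")]
order = paint_order(edges)
```

    Drawing the objects in this order places nearer objects over farther ones, and a cycle error signals that two merged layer graphs disagree about depth.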

  17. Virtual environments for scene of crime reconstruction and analysis

    Science.gov (United States)

    Howard, Toby L. J.; Murta, Alan D.; Gibson, Simon

    2000-02-01

    This paper describes research conducted in collaboration with Greater Manchester Police (UK) to evaluate the utility of Virtual Environments for scene of crime analysis, forensic investigation, and law enforcement briefing and training. We present an illustrated case study of the construction of a high-fidelity virtual environment, intended to match a particular real-life crime scene as closely as possible. We describe and evaluate the combination of several approaches including: the use of the Manchester Scene Description Language for constructing complex geometrical models; the application of a radiosity rendering algorithm with several novel features based on human perceptual considerations; texture extraction from forensic photography; and experiments with interactive walkthroughs and large-screen stereoscopic display of the virtual environment implemented using the MAVERIK system. We also discuss the potential applications of Virtual Environment techniques in the Law Enforcement and Forensic communities.

  18. The benefits of playing video games.

    Science.gov (United States)

    Granic, Isabela; Lobel, Adam; Engels, Rutger C M E

    2014-01-01

    Video games are a ubiquitous part of almost all children's and adolescents' lives, with 97% playing for at least one hour per day in the United States. The vast majority of research by psychologists on the effects of "gaming" has been on its negative impact: the potential harm related to violence, addiction, and depression. We recognize the value of that research; however, we argue that a more balanced perspective is needed, one that considers not only the possible negative effects but also the benefits of playing these games. Considering these potential benefits is important, in part, because the nature of these games has changed dramatically in the last decade, becoming increasingly complex, diverse, realistic, and social in nature. A small but significant body of research has begun to emerge, mostly in the last five years, documenting these benefits. In this article, we summarize the research on the positive effects of playing video games, focusing on four main domains: cognitive, motivational, emotional, and social. By integrating insights from developmental, positive, and social psychology, as well as media psychology, we propose some candidate mechanisms by which playing video games may foster real-world psychosocial benefits. Our aim is to provide strong enough evidence and a theoretical rationale to inspire new programs of research on the largely unexplored mental health benefits of gaming. Finally, we end with a call to intervention researchers and practitioners to test the positive uses of video games, and we suggest several promising directions for doing so. PsycINFO Database Record (c) 2014 APA, all rights reserved.

  19. A low-cost, high-resolution, video-rate imaging optical radar

    Energy Technology Data Exchange (ETDEWEB)

    Sackos, J.T.; Nellums, R.O.; Lebien, S.M.; Diegert, C.F. [Sandia National Labs., Albuquerque, NM (United States); Grantham, J.W.; Monson, T. [Air Force Research Lab., Eglin AFB, FL (United States)

    1998-04-01

    Sandia National Laboratories has developed a unique type of portable low-cost range imaging optical radar (laser radar or LADAR). This innovative sensor comprises an active floodlight scene illuminator and an image-intensified CCD camera receiver. It is a solid-state device (no moving parts) that offers significant size, performance, reliability, and simplicity advantages over other types of 3-D imaging sensors. This unique flash LADAR is based on low-cost, commercially available hardware, and is well suited for many government and commercial uses. This paper presents an update of Sandia's development of the Scannerless Range Imager technology and applications, and discusses the progress that has been made in evolving the sensor into a compact, low-cost, high-resolution, video-rate Laser Dynamic Range Imager.

  20. STREAM PROCESSING ALGORITHMS FOR DYNAMIC 3D SCENE ANALYSIS

    Science.gov (United States)

    2018-02-15

    PROCESSING ALGORITHMS FOR DYNAMIC 3D SCENE ANALYSIS. Contract number: FA8750-14-2-0072. Grant number: N/A. Program element number: 62788F ... of Figures: (1) The 3D processing pipeline flowchart showing key modules; (2) Overall view (data flow) of the proposed ... pipeline flowchart showing key modules. ... from motion and bundle adjustment algorithm. By fusion of depth masks of the scene obtained from 3D