Ge, Jing; Zhang, Guoping; Yang, Zongkai
Multimedia technology and networks protocol are the basic technology of the video surveillance system. A network remote video surveillance system based on MPEG-4 video coding standards is designed and implemented in this paper. The advantages of the MPEG-4 are analyzed in detail in the surveillance field, and then the real-time protocol and real-time control protocol (RTP/RTCP) are chosen as the networks transmission protocol. The whole system includes video coding control module, playing back module, network transmission module and network receiver module The scheme of management, control and storage about video data are discussed. The DirectShow technology is used to playback video data. The transmission scheme of digital video processing in networks, RTP packaging of MPEG-4 video stream is discussed. The receiver scheme of video date and mechanism of buffer are discussed. The most of the functions are archived by software, except that the video coding control module is achieved by hardware. The experiment results show that it provides good video quality and has the real-time performance. This system can be applied into wide fields.
Zhang, Zhengbing; Deng, Huiping; Xia, Zhenhua
Video systems have been widely used in many fields such as conferences, public security, military affairs and medical treatment. With the rapid development of FPGA, SOPC has been paid great attentions in the area of image and video processing in recent years. A network video transmission system based on SOPC is proposed in this paper for the purpose of video acquisition, video encoding and network transmission. The hardware platform utilized to design the system is an SOPC board of model Altera's DE2, which includes an FPGA chip of model EP2C35F672C6, an Ethernet controller and a video I/O interface. An IP core, known as Nios II embedded processor, is used as the CPU of the system. In addition, a hardware module for format conversion of video data, and another module to realize Motion-JPEG have been designed with Verilog HDL. These two modules are attached to the Nios II processor as peripheral equipments through the Avalon bus. Simulation results show that these two modules work as expected. Uclinux including TCP/IP protocol as well as the driver of Ethernet controller is chosen as the embedded operating system and an application program scheme is proposed.
Petkovic, M.; Jonker, Willem
An increasing number of large publicly available video libraries results in a demand for techniques that can manipulate the video data based on content. In this paper, we present a content-based video retrieval system called Cobra. The system supports automatic extraction and retrieval of high-level
Xia, Jiali; Jin, Jesse S.
Video-On-Demand is a new development on the Internet. In order to manage the rich multimedia information and the large number of users, we present an Internet Video-On-Demand system with some E- Commerce flavors. This paper presents the system architecture and technologies required in the implementation. It provides interactive Video-On-Demand services in which the user has a complete control over the session presentation. It allows the user to select and receive specific video information by retrieving the database. For improving the performance of video information retrieval and management, the video information is represented by hierarchical video metadata in XML format. Video metadatabase stored the video information in this hierarchical structure and allows user to search the video shots at different semantic levels in the database. To browse the searched video, the user not only has full-function VCR capabilities as the traditional Video-On-Demand, but also can browse the video in a hierarchical method to view different shots. In order to perform management of large number of users over the Internet, a membership database designed and managed in an E-Commerce environment, which allows the user to access the video database based on different access levels.
Full Text Available In order to support high-definition video transmission, an implementation of video transmission system based on Long Term Evolution is designed. This system is developed on Xilinx Virtex-6 FPGA ML605 Evaluation Board. The paper elaborates the features of baseband link designed in Xilinx ISE and protocol stack designed in Xilinx SDK, and introduces the process of setting up hardware and software platform in Xilinx XPS. According to test, this system consumes less hardware resource and is able to transmit bidirectional video clearly and stably.
Zhao, Heng; Wang, Xiang-jun
This paper presents a FPGA based video interface conversion system that enables the inter-conversion between digital and analog video. Cyclone IV series EP4CE22F17C chip from Altera Corporation is used as the main video processing chip, and single-chip is used as the information interaction control unit between FPGA and PC. The system is able to encode/decode messages from the PC. Technologies including video decoding/encoding circuits, bus communication protocol, data stream de-interleaving and de-interlacing, color space conversion and the Camera Link timing generator module of FPGA are introduced. The system converts Composite Video Broadcast Signal (CVBS) from the CCD camera into Low Voltage Differential Signaling (LVDS), which will be collected by the video processing unit with Camera Link interface. The processed video signals will then be inputted to system output board and displayed on the monitor.The current experiment shows that it can achieve high-quality video conversion with minimum board size.
Full Text Available An object-based video authentication system, which combines watermarking, error correction coding (ECC, and digital signature techniques, is presented for protecting the authenticity between video objects and their associated backgrounds. In this system, a set of angular radial transformation (ART coefficients is selected as the feature to represent the video object and the background, respectively. ECC and cryptographic hashing are applied to those selected coefficients to generate the robust authentication watermark. This content-based, semifragile watermark is then embedded into the objects frame by frame before MPEG4 coding. In watermark embedding and extraction, groups of discrete Fourier transform (DFT coefficients are randomly selected, and their energy relationships are employed to hide and extract the watermark. The experimental results demonstrate that our system is robust to MPEG4 compression, object segmentation errors, and some common object-based video processing such as object translation, rotation, and scaling while securely preventing malicious object modifications. The proposed solution can be further incorporated into public key infrastructure (PKI.
Chow, John W.; Carlton, Les G.; Ekkekakis, Panteleimon; Hay, James G.
Discusses advantages of a video-based, digitized image system for the study and analysis of projectile motion in the physics laboratory. Describes the implementation of a web-based digitized video system. (WRM)
Yang, Fan; Ma, Chunting; Li, Haoyi
The design of a wireless video transmission system based on STM32, the system uses the STM32F103VET6 microprocessor as the core, through the video acquisition module collects video data, video data will be sent to the receiver through the wireless transmitting module, receiving data will be displayed on the LCD screen. The software design process of receiver and transmitter is introduced. The experiment proves that the system realizes wireless video transmission function.
A HTTP based video transmission system has been built upon the p2p(peer to peer) network structure utilizing the Java technologies. This makes the video monitoring available to any host which has been connected to the World Wide Web in any method, including those hosts behind firewalls or in isolated sub-networking. In order to achieve this, a video source peer has been developed, together with the client video playback peer. The video source peer can respond to the video stream request in HTTP protocol. HTTP based pipe communication model is developed to speeding the transmission of video stream data, which has been encoded into fragments using the JPEG codec. To make the system feasible in conveying video streams between arbitrary peers on the web, a HTTP protocol based relay peer is implemented as well. This video monitoring system has been applied in a tele-robotic system as a visual feedback to the operator.
Al-Hamad, A.; Moussa, A.; El-Sheimy, N.
The last two decades have witnessed a huge growth in the demand for geo-spatial data. This demand has encouraged researchers around the world to develop new algorithms and design new mapping systems in order to obtain reliable sources for geo-spatial data. Mobile Mapping Systems (MMS) are one of the main sources for mapping and Geographic Information Systems (GIS) data. MMS integrate various remote sensing sensors, such as cameras and LiDAR, along with navigation sensors to provide the 3D coordinates of points of interest from moving platform (e.g. cars, air planes, etc.). Although MMS can provide accurate mapping solution for different GIS applications, the cost of these systems is not affordable for many users and only large scale companies and institutions can benefits from MMS systems. The main objective of this paper is to propose a new low cost MMS with reasonable accuracy using the available sensors in smartphones and its video camera. Using the smartphone video camera, instead of capturing individual images, makes the system easier to be used by non-professional users since the system will automatically extract the highly overlapping frames out of the video without the user intervention. Results of the proposed system are presented which demonstrate the effect of the number of the used images in mapping solution. In addition, the accuracy of the mapping results obtained from capturing a video is compared to the same results obtained from using separate captured images instead of video.
Bescos, Jesus; Martinez, Jose M.; Cabrera, Julian M.; Cisneros, Guillermo
This paper describes the first stages of a research project that is currently being developed in the Image Processing Group of the UPM. The aim of this effort is to add video capabilities to the Storage and Retrieval Information System already working at our premises. Here we will focus on the early design steps of a Video Information System. For this purpose, we present a review of most of the reported techniques for video temporal segmentation and semantic segmentation, previous steps to afford the content extraction task, and we discuss them to select the more suitable ones. We then outline a block design of a temporal segmentation module, and present guidelines to the design of the semantic segmentation one. All these operations trend to facilitate automation in the extraction of low level features and semantic features that will finally take part of the video descriptors.
Li, Yucheng; Han, Dantao; Yan, Juanli
A wireless video surveillance system based on ARM was designed and implemented in this article. The newest ARM11 S3C6410 was used as the main monitoring terminal chip with the embedded Linux operating system. The video input was obtained by the analog CCD and transferred from analog to digital by the video chip TVP5150. The video was packed by RTP and transmitted by the wireless USB TL-WN322G+ after being compressed by H.264 encoders in S3C6410. Further more, the video images were preprocessed. It can detect the abnormities of the specified scene and the abnormal alarms. The video transmission definition is the standard definition 480P. The video stream can be real-time monitored. The system has been used in the real-time intelligent video surveillance of the specified scene.
Schlecht, Leslie E.; Kutler, Paul (Technical Monitor)
This is a proposal for a general use system based, on the SGI IRIS workstation platform, for recording computer animation to videotape. In addition, this system would provide features for simple editing and enhancement. Described here are a list of requirements for the system, and a proposed configuration including the SGI VideoLab Integrator, VideoMedia VLAN animation controller and the Pioneer rewritable laserdisc recorder.
Ramezani, Mohsen; Yaghmaee, Farzin
In recent years, fast growth of online video sharing eventuated new issues such as helping users to find their requirements in an efficient way. Hence, Recommender Systems (RSs) are used to find the users' most favorite items. Finding these items relies on items or users similarities. Though, many factors like sparsity and cold start user impress the recommendation quality. In some systems, attached tags are used for searching items (e.g. videos) as personalized recommendation. Different views, incomplete and inaccurate tags etc. can weaken the performance of these systems. Considering the advancement of computer vision techniques can help improving RSs. To this end, content based search can be used for finding items (here, videos are considered). In such systems, a video is taken from the user to find and recommend a list of most similar videos to the query one. Due to relating most videos to humans, we present a novel low complex scalable method to recommend videos based on the model of included action. This method has recourse to human action retrieval approaches. For modeling human actions, some interest points are extracted from each action and their motion information are used to compute the action representation. Moreover, a fuzzy dissimilarity measure is presented to compare videos for ranking them. The experimental results on HMDB, UCFYT, UCF sport and KTH datasets illustrated that, in most cases, the proposed method can reach better results than most used methods.
Full Text Available Design of automated video surveillance systems is one of the exigent missions in computer vision community because of their ability to automatically select frames of interest in incoming video streams based on motion detection. This research paper focuses on the real-time hardware implementation of a motion detection algorithm for such vision based automated surveillance systems. A dedicated VLSI architecture has been proposed and designed for clustering-based motion detection scheme. The working prototype of a complete standalone automated video surveillance system, including input camera interface, designed motion detection VLSI architecture, and output display interface, with real-time relevant motion detection capabilities, has been implemented on Xilinx ML510 (Virtex-5 FX130T FPGA platform. The prototyped system robustly detects the relevant motion in real-time in live PAL (720 × 576 resolution video streams directly coming from the camera.
Full Text Available Video surveillance systems are based on video and image processing research areas in the scope of computer science. Video processing covers various methods which are used to browse the changes in existing scene for specific video. Nowadays, video processing is one of the important areas of computer science. Two-dimensional videos are used to apply various segmentation and object detection and tracking processes which exists in multimedia content-based indexing, information retrieval, visual and distributed cross-camera surveillance systems, people tracking, traffic tracking and similar applications. Background subtraction (BS approach is a frequently used method for moving object detection and tracking. In the literature, there exist similar methods for this issue. In this research study, it is proposed to provide a more efficient method which is an addition to existing methods. According to model which is produced by using adaptive background subtraction (ABS, an object detection and tracking system’s software is implemented in computer environment. The performance of developed system is tested via experimental works with related video datasets. The experimental results and discussion are given in the study
Xia, Zhen-Hua; Wang, Xiao-Shuang
With the rapid development of the electronic technology, multimedia technology and mobile communication technology, video monitoring system is going to the embedded, digital and wireless direction. In this paper, a solution of wireless video monitoring system based on WCDMA is proposed. This solution makes full use of the advantages of 3G, which have Extensive coverage network and wide bandwidth. It can capture the video streaming from the chip's video port, real-time encode the image data by the high speed DSP, and have enough bandwidth to transmit the monitoring image through WCDMA wireless network. The experiments demonstrate that the system has the advantages of high stability, good image quality, good transmission performance, and in addition, the system has been widely used, not be restricted by geographical position since it adopts wireless transmission. So, it is suitable used in sparsely populated, harsh environment scenario.
classic films Ii- into separate FM signals for video dual soundtrack or stereo sound censed from nearlk every major stu- and audio. Another...though never disruptive. While my enthusiasm for the subject was distinctly lim- i’ed. I felt almost as if Iwere in the presence of a histori - cally
Full Text Available To make people at different places participate in the same conference, speak and discuss freely, the interactive remote video conferencing system is designed and realized based on multi-Agent collaboration. FEC (forward error correction and tree P2P technology are firstly used to build a live conference structure to transfer audio and video data; then the branch conference port can participate to speak and discuss through the application of becoming a interactive focus; the introduction of multi-Agent collaboration technology improve the system robustness. The experiments showed that, under normal network conditions, the system can support 350 branch conference node simultaneously to make live broadcasting. The audio and video quality is smooth. It can carry out large-scale remote video conference.
Madiedo, J. M.; Trigo-Rodriguez, J. M.; Lyytinen, E.
The SPanish Meteor Network (SPMN) is performing a continuous monitoring of meteor activity over Spain and neighbouring countries. The huge amount of data obtained by the 25 video observing stations that this network is currently operating made it necessary to develop new software packages to accomplish some tasks, such as data reduction and remote operation of autonomous systems based on high-sensitivity CCD video devices. The main characteristics of this software are described here.
M. van Persie
Full Text Available During a fire incident live airborne video offers the fire brigade an additional means of information. Essential for the effective usage of the daylight and infra red video data from the UAS is that the information is fully integrated into the crisis management system of the fire brigade. This is a GIS based system in which all relevant geospatial information is brought together and automatically distributed to all levels of the organisation. In the context of the Dutch Fire-Fly project a geospatial video server was integrated with a UAS and the fire brigades crisis management system, so that real-time geospatial airborne video and derived products can be made available at all levels during a fire incident. The most important elements of the system are the Delftdynamics Robot Helicopter, the Video Multiplexing System, the Keystone geospatial video server/editor and the Eagle and CCS-M crisis management systems. In discussion with the Security Region North East Gelderland user requirements and a concept of operation were defined, demonstrated and evaluated. This article describes the technical and operational approach and results.
Chen, Jin; Wang, Yifan; Wang, Xuelei; Wang, Yuehong; Hu, Rui
Combine harvester usually works in sparsely populated areas with harsh environment. In order to achieve the remote real-time video monitoring of the working state of combine harvester. A remote video monitoring system based on ARM11 and embedded Linux is developed. The system uses USB camera for capturing working state video data of the main parts of combine harvester, including the granary, threshing drum, cab and cut table. Using JPEG image compression standard to compress video data then transferring monitoring screen to remote monitoring center over the network for long-range monitoring and management. At the beginning of this paper it describes the necessity of the design of the system. Then it introduces realization methods of hardware and software briefly. And then it describes detailedly the configuration and compilation of embedded Linux operating system and the compiling and transplanting of video server program are elaborated. At the end of the paper, we carried out equipment installation and commissioning on combine harvester and then tested the system and showed the test results. In the experiment testing, the remote video monitoring system for combine harvester can achieve 30fps with the resolution of 800x600, and the response delay in the public network is about 40ms.
Gustafson, Peter C.
For many years, photogrammetry has been in use at TRW. During that time, needs have arisen for highly repetitive measurements. In an effort to satisfy these needs in a timely manner, a specialized Robotic Video Photogrammetry System (RVPS) was developed by TRW in conjunction with outside vendors. The primary application for the RVPS has strict accuracy requirements that demand significantly more images than the previously used film-based system. The time involved in taking these images was prohibitive but by automating the data acquisition process, video techniques became a practical alternative to the more traditional film- based approach. In fact, by applying video techniques, measurement productivity was enhanced significantly. Analysis involved was also brought `on-board' to the RVPS, allowing shop floor acquisition and delivery of results. The RVPS has also been applied in other tasks and was found to make a critical improvement in productivity, allowing many more tests to be run in a shorter time cycle. This paper will discuss the creation of the system and TRW's experiences with the RVPS. Highlighted will be the lessons learned during these efforts and significant attributes of the process not common to the standard application of photogrammetry for industrial measurement. As productivity and ease of use continue to drive the application of photogrammetry in today's manufacturing climate, TRW expects several systems, with technological improvements applied, to be in use in the near future.
Darabi, K; G. Ghinea
In this paper an expert-based model for generation of personalized video summaries is suggested. The video frames are initially scored and annotated by multiple video experts. Thereafter, the scores for the video segments that have been assigned the higher priorities by end users will be upgraded. Considering the required summary length, the highest scored video frames will be inserted into a personalized final summary. For evaluation purposes, the video summaries generated by our system have...
Bulan, Orhan; Loce, Robert P.; Wu, Wencheng; Wang, YaoRong; Bernal, Edgar A.; Fan, Zhigang
Urban parking management is receiving significant attention due to its potential to reduce traffic congestion, fuel consumption, and emissions. Real-time parking occupancy detection is a critical component of on-street parking management systems, where occupancy information is relayed to drivers via smart phone apps, radio, Internet, on-road signs, or global positioning system auxiliary signals. Video-based parking occupancy detection systems can provide a cost-effective solution to the sensing task while providing additional functionality for traffic law enforcement and surveillance. We present a video-based on-street parking occupancy detection system that can operate in real time. Our system accounts for the inherent challenges that exist in on-street parking settings, including illumination changes, rain, shadows, occlusions, and camera motion. Our method utilizes several components from video processing and computer vision for motion detection, background subtraction, and vehicle detection. We also present three traffic law enforcement applications: parking angle violation detection, parking boundary violation detection, and exclusion zone violation detection, which can be integrated into the parking occupancy cameras as a value-added option. Our experimental results show that the proposed parking occupancy detection method performs in real-time at 5 frames/s and achieves better than 90% detection accuracy across several days of videos captured in a busy street block under various weather conditions such as sunny, cloudy, and rainy, among others.
AKINCI, Gökay; Polat, Ediz; Koçak, Orhan Murat
Eye pupil detection systems have become increasingly popular in image processing and computer vision applications in medical systems. In this study, a video-based eye pupil detection system is developed for diagnosing bipolar disorder. Bipolar disorder is a condition in which people experience changes in cognitive processes and abilities, including reduced attentional and executive capabilities and impaired memory. In order to detect these abnormal behaviors, a number of neuropsychologi...
Yang, Jian; Xie, Xiaofang; Wang, Yan
Based on the AHRS (Attitude and Heading Reference System) and PTZ (Pan/Tilt/Zoom) camera, we designed a video monitoring and tracking system. The overall structure of the system and the software design are given. The key technologies such as serial port communication and head attitude tracking are introduced, and the codes of the key part are given.
Lee, Kang Oh; Nakaji, Kei; 中司, 敬
A web-based video direct e-commerce system was developed to solve the problems in the internet shopping and to increase trust in safety and quality of agricultural products from consumers. We found that the newly developed e-commerce system could overcome demerits of the internet shopping and give consumers same effects as purchasing products offline. Producers could have opportunities to explain products and to talk to customers and get increased income because of maintaining a certain numbe...
Han, Xiaoguang; Fu, Hongbo; Zheng, Hanlin; Liu, Ligang; Wang, Jue
Stop-motion is a well-established animation technique but is often laborious and requires craft skills. A new video-based system can animate the vast majority of everyday objects in stop-motion style, more flexibly and intuitively. Animators can perform and capture motions continuously instead of breaking them into increments and shooting one still picture per increment. More important, the system permits direct hand manipulation without resorting to rigs, achieving more natural object control for beginners. The system's key component is two-phase keyframe-based capturing and processing, assisted by computer vision techniques. With this system, even amateurs can generate high-quality stop-motion animations.
Xia, Xue; Qiu, Yun; Hu, Lin; Fan, Jingchao; Guo, Xiuming; Zhou, Guomin
International audience; As the proposition of the ‘Internet plus’ concept and speedy progress of new media technology, traditional business have been increasingly shared in the development fruits of the informatization and the networking. Proceeding from the real plant protection demands, the construction of a cloud-based video monitoring system that surveillances diseases and pests in apple orchards has been discussed, aiming to solve the lack of timeliness and comprehensiveness in the contr...
Li, Xiangzhen; Xie, Xiaodan; Yin, Xiaoqiang
In the information age, the rapid development in the direction of intelligent video processing, complex algorithm proposed the powerful challenge on the performance of the processor. In this article, through the FPGA + TMS320C6678 frame structure, the image to fog, merge into an organic whole, to stabilize the image enhancement, its good real-time, superior performance, break through the traditional function of video processing system is simple, the product defects such as single, solved the video application in security monitoring, video, etc. Can give full play to the video monitoring effectiveness, improve enterprise economic benefits.
Wen, Ming; Hu, Haibo
To meet the demands of high definition of video and transmission at real-time during the surgery of endoscope, this paper designs an HD mobile video transmission system. This system uses H.264/AVC to encode the original video data and transports it in the network by RTP/RTCP protocol. Meanwhile, the system implements a stable video transmission in portable terminals (such as tablet PCs, mobile phones) under the 3G mobile network. The test result verifies the strong repair ability and stability under the conditions of low bandwidth, high packet loss rate, and high delay and shows a high practical value.
Anishchenko, S.; Beylin, D.; Stepanov, P.; Stepanov, A.; Weinberg, I. N.; Schaeffer, S.; Zavarzin, V.; Shaposhnikov, D.; Smith, M. F.
Unintentional head motion during Positron Emission Tomography (PET) data acquisition can degrade PET image quality and lead to artifacts. Poor patient compliance, head tremor, and coughing are examples of movement sources. Head motion due to patient non-compliance can be an issue with the rise of amyloid brain PET in dementia patients. To preserve PET image resolution and quantitative accuracy, head motion can be tracked and corrected in the image reconstruction algorithm. While fiducial markers can be used, a contactless approach is preferable. A video-based head motion tracking system for a dedicated portable brain PET scanner was developed. Four wide-angle cameras organized in two stereo pairs are used for capturing video of the patient's head during the PET data acquisition. Facial points are automatically tracked and used to determine the six degree of freedom head pose as a function of time. The presented work evaluated the newly designed tracking system using a head phantom and a moving American College of Radiology (ACR) phantom. The mean video-tracking error was 0.99±0.90 mm relative to the magnetic tracking device used as ground truth. Qualitative evaluation with the ACR phantom shows the advantage of the motion tracking application. The developed system is able to perform tracking with accuracy close to millimeter and can help to preserve resolution of brain PET images in presence of movements.
Magnor, Marcus A
Driven by consumer-market applications that enjoy steadily increasing economic importance, graphics hardware and rendering algorithms are a central focus of computer graphics research. Video-based rendering is an approach that aims to overcome the current bottleneck in the time-consuming modeling process and has applications in areas such as computer games, special effects, and interactive TV. This book offers an in-depth introduction to video-based rendering, a rapidly developing new interdisciplinary topic employing techniques from computer graphics, computer vision, and telecommunication en
Full Text Available This paper presents a parallel TBB-CUDA implementation for the acceleration of single-Gaussian distribution model, which is effective for background removal in the video-based fire detection system. In this framework, TBB mainly deals with initializing work of the estimated Gaussian model running on CPU, and CUDA performs background removal and adaption of the model running on GPU. This implementation can exploit the combined computation power of TBB-CUDA, which can be applied to the real-time environment. Over 220 video sequences are utilized in the experiments. The experimental results illustrate that TBB+CUDA can achieve a higher speedup than both TBB and CUDA. The proposed framework can effectively overcome the disadvantages of limited memory bandwidth and few execution units of CPU, and it reduces data transfer latency and memory latency between CPU and GPU.
Full Text Available Video surveillance system senses and trails out all the threatening issues in the real time environment. It prevents from security threats with the help of visual devices which gather the information related to videos like CCTV’S and IP (Internet Protocol cameras. Video surveillance system has become a key for addressing problems in the public security. They are mostly deployed on the IP based network. So, all the possible security threats exist in the IP based application might also be the threats available for the reliable application which is available for video surveillance. In result, it may increase cybercrime, illegal video access, mishandling videos and so on. Hence, in this paper an intelligent model is used to propose security for video surveillance system which ensures safety and it provides secured access on video.
Schneider, Jeffrey C; Ozsecen, Muzaffer Y; Muraoka, Nicholas K; Mancinelli, Chiara; Della Croce, Ugo; Ryan, Colleen M; Bonato, Paolo
Burn contractures are common and difficult to treat. Measuring continuous joint motion would inform the assessment of contracture interventions; however, it is not standard clinical practice. This study examines use of an interactive gaming system to measure continuous joint motion data. To assess the usability of an exoskeleton-based interactive gaming system in the rehabilitation of upper extremity burn contractures. Feasibility study. Eight subjects with a history of burn injury and upper extremity contractures were recruited from the outpatient clinic of a regional inpatient rehabilitation facility. Subjects used an exoskeleton-based interactive gaming system to play 4 different video games. Continuous joint motion data were collected at the shoulder and elbow during game play. Visual analog scale for engagement, difficulty and comfort. Angular range of motion by subject, joint, and game. The study population had an age of 43 ± 16 (mean ± standard deviation) years and total body surface area burned range of 10%-90%. Subjects reported satisfactory levels of enjoyment, comfort, and difficulty. Continuous joint motion data demonstrated variable characteristics by subject, plane of motion, and game. This study demonstrates the feasibility of use of an exoskeleton-based interactive gaming system in the burn population. Future studies are needed that examine the efficacy of tailoring interactive video games to the specific joint impairments of burn survivors. Copyright © 2016 American Academy of Physical Medicine and Rehabilitation. Published by Elsevier Inc. All rights reserved.
Yamada, Takaaki; Echizen, Isao; Tezuka, Satoru; Yoshiura, Hiroshi
Emerging broadband networks and high performance of PCs provide new business opportunities of the live video streaming services for the Internet users in sport events or in music concerts. Digital watermarking for video helps to protect the copyright of the video content and the real-time processing is an essential requirement. For the small start of new business, it should be achieved by flexible software without special equipments. This paper describes a novel real-time watermarking system implemented on a commodity PC. We propose the system architecture and methods to shorten watermarking time by reusing the estimated watermark imperceptibility among neighboring frames. A prototype system enables real time processing in a series of capturing NTSC signals, watermarking the video, encoding it to MPEG4 in QGVA, 1Mbps, 30fps style and storing the video for 12 hours in maximum
Offering ready access to the security industry's cutting-edge digital future, Intelligent Network Video provides the first complete reference for all those involved with developing, implementing, and maintaining the latest surveillance systems. Pioneering expert Fredrik Nilsson explains how IP-based video surveillance systems provide better image quality, and a more scalable and flexible system at lower cost. A complete and practical reference for all those in the field, this volume:Describes all components relevant to modern IP video surveillance systemsProvides in-depth information about ima
Krüger, Andreas; Edelmann-Nusser, Jürgen
This study aims at determining the accuracy of a full body inertial measurement system in a real skiing environment in comparison with an optical video based system. Recent studies have shown the use of inertial measurement systems for the determination of kinematical parameters in alpine skiing. However, a quantitative validation of a full body inertial measurement system for the application in alpine skiing is so far not available. For the purpose of this study, a skier performed a test-run equipped with a full body inertial measurement system in combination with a DGPS. In addition, one turn of the test-run was analyzed by an optical video based system. With respect to the analyzed angles, a maximum mean difference of 4.9° was measured. No differences in the measured angles between the inertial measurement system and the combined usage with a DGPS were found. Concerning the determination of the skier's trajectory, an additional system (e.g., DGPS) must be used. As opposed to optical methods, the main advantages of the inertial measurement system are the determination of kinematical parameters without the limitation of restricted capture volume, and small time costs for the measurement preparation and data analysis.
Full Text Available An approach has been proposed for automatic adaptive subtitle coloring using fuzzy logic-based algorithm. This system changes the color of the video subtitle/caption to “pleasant” color according to color harmony and the visual perception of the image background colors. In the fuzzy analyzer unit, using RGB histograms of background image, the R, G, and B values for the color of the subtitle/caption are computed using fixed fuzzy IF-THEN rules fully driven from the color harmony theories to satisfy complementary color and subtitle-background color harmony conditions. A real-time hardware structure has been proposed for implementation of the front-end processing unit as well as the fuzzy analyzer unit.
Smith, Jemma; Hand, Linda; Dowrick, Peter W.
This study examined the efficacy of video self modeling (VSM) using feedforward, to teach various goals of a picture exchange communication system (PECS). The participants were two boys with autism and one man with Down syndrome. All three participants were non-verbal with no current functional system of communication; the two children had long…
Javier I. Portillo
Full Text Available Automatic surveillance of airport surface is one of the core components of advanced surface movement, guidance, and control systems (A-SMGCS. This function is in charge of the automatic detection, identification, and tracking of all interesting targets (aircraft and relevant ground vehicles in the airport movement area. This paper presents a novel approach for object tracking based on sequences of video images. A fuzzy system has been developed to ponder update decisions both for the trajectories and shapes estimated for targets from the image regions extracted in the images. The advantages of this approach are robustness, flexibility in the design to adapt to different situations, and efficiency for operation in real time, avoiding combinatorial enumeration. Results obtained in representative ground operations show the system capabilities to solve complex scenarios and improve tracking accuracy. Finally, an automatic procedure, based on neuro-fuzzy techniques, has been applied in order to obtain a set of rules from representative examples. Validation of learned system shows the capability to learn the suitable tracker decisions.
Valli, D.; Ganesan, K.
Chaos based cryptosystems are an efficient method to deal with improved speed and highly secured multimedia encryption because of its elegant features, such as randomness, mixing, ergodicity, sensitivity to initial conditions and control parameters. In this paper, two chaos based cryptosystems are proposed: one is the higher-dimensional 12D chaotic map and the other is based on the Ikeda delay differential equation (DDE) suitable for designing a real-time secure symmetric video encryption scheme. These encryption schemes employ a substitution box (S-box) to diffuse the relationship between pixels of plain video and cipher video along with the diffusion of current input pixel with the previous cipher pixel, called cipher block chaining (CBC). The proposed method enhances the robustness against statistical, differential and chosen/known plain text attacks. Detailed analysis is carried out in this paper to demonstrate the security and uniqueness of the proposed scheme.
Full Text Available About the video image processing's vehicle detection and counting system research, which has video vehicle detection, vehicle targets' image processing, and vehicle counting function. Vehicle detection is the use of inter-frame difference method and vehicle shadow segmentation techniques for vehicle testing. Image processing functions is the use of color image gray processing, image segmentation, mathematical morphology analysis and image fills, etc. on target detection to be processed, and then the target vehicle extraction. Counting function is to count the detected vehicle. The system is the use of inter-frame video difference method to detect vehicle and the use of the method of adding frame to vehicle and boundary comparison method to complete the counting function, with high recognition rate, fast, and easy operation. The purpose of this paper is to enhance traffic management modernization and automation levels. According to this study, it can provide a reference for the future development of related applications.
This comprehensive and accessible text/reference presents an overview of the state of the art in video coding technology. Specifically, the book introduces the tools of the AVS2 standard, describing how AVS2 can help to achieve a significant improvement in coding efficiency for future video networks and applications by incorporating smarter coding tools such as scene video coding. Topics and features: introduces the basic concepts in video coding, and presents a short history of video coding technology and standards; reviews the coding framework, main coding tools, and syntax structure of AV
Belonging to the wider academic field of computer vision, video analytics has aroused a phenomenal surge of interest since the current millennium. Video analytics is intended to solve the problem of the incapability of exploiting video streams in real time for the purpose of detection or anticipation. It involves analyzing the videos using algorithms that detect and track objects of interest over time and that indicate the presence of events or suspect behavior involving these objects.The aims of this book are to highlight the operational attempts of video analytics, to identify possi
Liu, Wen P.; Mirota, Daniel J.; Uneri, Ali; Otake, Yoshito; Hager, Gregory; Reh, Douglas D.; Ishii, Masaru; Gallia, Gary L.; Siewerdsen, Jeffrey H.
Augmentation of endoscopic video with preoperative or intraoperative image data [e.g., planning data and/or anatomical segmentations defined in computed tomography (CT) and magnetic resonance (MR)], can improve navigation, spatial orientation, confidence, and tissue resection in skull base surgery, especially with respect to critical neurovascular structures that may be difficult to visualize in the video scene. This paper presents the engineering and evaluation of a video augmentation system for endoscopic skull base surgery translated to use in a clinical study. Extension of previous research yielded a practical system with a modular design that can be applied to other endoscopic surgeries, including orthopedic, abdominal, and thoracic procedures. A clinical pilot study is underway to assess feasibility and benefit to surgical performance by overlaying CT or MR planning data in realtime, high-definition endoscopic video. Preoperative planning included segmentation of the carotid arteries, optic nerves, and surgical target volume (e.g., tumor). An automated camera calibration process was developed that demonstrates mean re-projection accuracy (0.7+/-0.3) pixels and mean target registration error of (2.3+/-1.5) mm. An IRB-approved clinical study involving fifteen patients undergoing skull base tumor surgery is underway in which each surgery includes the experimental video-CT system deployed in parallel to the standard-of-care (unaugmented) video display. Questionnaires distributed to one neurosurgeon and two otolaryngologists are used to assess primary outcome measures regarding the benefit to surgical confidence in localizing critical structures and targets by means of video overlay during surgical approach, resection, and reconstruction.
Takahata, Minoru; Uemori, Akira; Nakano, Hirotaka
This video-on-demand service is constructed of distributed servers, including video servers that supply real-time MPEG-1 video & audio, real-time MPEG-1 encoders, and an application server that supplies additional text information and agents for retrieval. This system has three distinctive features that enable it to provide multi viewpoint access to real-time visual information: (1) The terminal application uses an agent-oriented approach that allows the system to be easily extended. The agents are implemented using a commercial authoring tool plus additional objects that communicate with the video servers by using TCP/IP protocols. (2) The application server manages the agents, automatically processes text information and is able to handle unexpected alterations of the contents. (3) The distributed system has an economical, flexible architecture to store long video streams. The real-time MPEG-1 encoder system is based on multi channel phase-shifting processing. We also describe a practical application of this system, a prototype TV-on-demand service called TVOD. This provides access to broadcast television programs for the previous week.
Full Text Available The aim of this paper is to present video quality prediction models for objective non-intrusive, prediction of H.264 encoded video for all content types combining parameters both in the physical and application layer over Universal Mobile Telecommunication Systems (UMTS networks. In order to characterize the Quality of Service (QoS level, a learning model based on Adaptive Neural Fuzzy Inference System (ANFIS and a second model based on non-linear regression analysis is proposed to predict the video quality in terms of the Mean Opinion Score (MOS. The objective of the paper is two-fold. First, to find the impact of QoS parameters on end-to-end video quality for H.264 encoded video. Second, to develop learning models based on ANFIS and non-linear regression analysis to predict video quality over UMTS networks by considering the impact of radio link loss models. The loss models considered are 2-state Markov models. Both the models are trained with a combination of physical and application layer parameters and validated with unseen dataset. Preliminary results show that good prediction accuracy was obtained from both the models. The work should help in the development of a reference-free video prediction model and QoS control methods for video over UMTS networks.
Full Text Available With the rapid development of wireless networks and image acquisition technology, wireless video transmission technology has been widely applied in various communication systems. The traditional video monitoring technology is restricted by some conditions such as layout, environmental, the relatively large volume, cost, and so on. In view of this problem, this paper proposes a method that the mobile car can be equipped with wireless video monitoring system. The mobile car which has some functions such as detection, video acquisition and wireless data transmission is developed based on STC89C52 Micro Control Unit (MCU and WiFi router. Firstly, information such as image, temperature and humidity is processed by the MCU and communicated with the router, and then returned by the WiFi router to the host computer phone. Secondly, control information issued by the host computer phone is received by WiFi router and sent to the MCU, and then the MCU sends relevant instructions. Lastly, the wireless transmission of video images and the remote control of the car are realized. The results prove that the system has some features such as simple operation, high stability, fast response, low cost, strong flexibility, widely application, and so on. The system has certain practical value and popularization value.
Roh, Mootaek; McHugh, Thomas J; Lee, Kyungmin
To investigate the relationship between neural function and behavior it is necessary to record neuronal activity in the brains of freely behaving animals, a technique that typically involves tethering to a data acquisition system. Optimally this approach allows animals to behave without any interference of movement or task performance. Currently many laboratories in the cognitive and behavioral neuroscience fields employ commercial motorized commutator systems using torque sensors to detect tether movement induced by the trajectory behaviors of animals. In this study we describe a novel motorized commutator system which is automatically controlled by video tracking. To obtain accurate head direction data two light emitting diodes were used and video image noise was minimized by physical light source manipulation. The system calculates the rotation of the animal across a single trial by processing head direction data and the software, which calibrates the motor rotation angle, subsequently generates voltage pulses to actively untwist the tether. This system successfully provides a tether twist-free environment for animals performing behavioral tasks and simultaneous neural activity recording. To the best of our knowledge, it is the first to utilize video tracking generated head direction to detect tether twisting and compensate with a motorized commutator system. Our automatic commutator control system promises an affordable and accessible method to improve behavioral neurophysiology experiments, particularly in mice.
M.Sc. (Computer Science) A video conference is an interactive meeting between two or more locations, facilitated by simultaneous two-way video and audio transmissions. People in a video conference, also known as participants, join these video conferences for business and recreational purposes. In a typical video conference, we should properly identify and authenticate every participant in the video conference, if information discussed during the video conference is confidential. This preve...
Greenwoll, D.A.; Matter, J.C. (Sandia National Labs., Albuquerque, NM (United States)); Ebel, P.E. (BE, Inc., Barnwell, SC (United States))
The purpose of this NUREG is to present technical information that should be useful to NRC licensees in designing closed-circuit television systems for video alarm assessment. There is a section on each of the major components in a video system: camera, lens, lighting, transmission, synchronization, switcher, monitor, and recorder. Each section includes information on component selection, procurement, installation, test, and maintenance. Considerations for system integration of the components are contained in each section. System emphasis is focused on perimeter intrusion detection and assessment systems. A glossary of video terms is included. 13 figs., 9 tabs.
Xie, Ruobing; Li, Li; Jin, Weiqi; Guo, Hong
It is prevalent for the low-light night-vision helmet to equip the binocular viewer with image intensifiers. Such equipment can not only acquire night vision ability, but also obtain the sense of stereo vision to achieve better perception and understanding of the visual field. However, since the image intensifier is for direct-observation, it is difficult to apply the modern image processing technology. As a result, developing digital video technology in night vision is of great significance. In this paper, we design a low-light night-vision helmet with digital imaging device. It consists of three parts: a set of two low-illumination CMOS cameras, a binocular OLED micro display and an image processing PCB. Stereopsis is achieved through the binocular OLED micro display. We choose Speed-Up Robust Feature (SURF) algorithm for image registration. Based on the image matching information and the cameras' calibration parameters, disparity can be calculated in real-time. We then elaborately derive the constraints of binocular stereo display. The sense of stereo vision can be obtained by dynamically adjusting the content of the binocular OLED micro display. There is sufficient space for function extensions in our system. The performance of this low-light night-vision helmet can be further enhanced in combination with The HDR technology and image fusion technology, etc.
Partha Sindu I Gede
Full Text Available The purpose of this study was to determine the effect of the use of the instructional media based on lecture video and slide synchronization system on Statistics learning achievement of the students of PTI department . The benefit of this research is to help lecturers in the instructional process i to improve student's learning achievements that lead to better students’ learning outcomes. Students can use instructional media which is created from the lecture video and slide synchronization system to support more interactive self-learning activities. Students can conduct learning activities more efficiently and conductively because synchronized lecture video and slide can assist students in the learning process. The population of this research was all students of semester VI (six majoring in Informatics Engineering Education. The sample of the research was the students of class VI B and VI D of the academic year 2016/2017. The type of research used in this study was quasi-experiment. The research design used was post test only with non equivalent control group design. The result of this research concluded that there was a significant influence in the application of learning media based on lectures video and slide synchronization system on statistics learning result on PTI department.
Full Text Available Video streaming over the Internet has gained significant popularity during the last years, and the academy and industry have realized a great research effort in this direction. In this scenario, scalable video coding (SVC has emerged as an important video standard to provide more functionality to video transmission and storage applications. This paper proposes and evaluates two strategies based on scalable video coding for P2P video streaming services. In the first strategy, SVC is used to offer differentiated quality video to peers with heterogeneous capacities. The second strategy uses SVC to reach a homogeneous video quality between different videos from different sources. The obtained results show that our proposed strategies enable a system to improve its performance and introduce benefits such as differentiated quality of video for clients with heterogeneous capacities and variable network conditions.
Jalal, Ahmad; Kamal, Shaharyar; Kim, Daijin
Recent advancements in depth video sensors technologies have made human activity recognition (HAR) realizable for elderly monitoring applications. Although conventional HAR utilizes RGB video sensors, HAR could be greatly improved with depth video sensors which produce depth or distance information. In this paper, a depth-based life logging HAR system is designed to recognize the daily activities of elderly people and turn these environments into an intelligent living space. Initially, a depth imaging sensor is used to capture depth silhouettes. Based on these silhouettes, human skeletons with joint information are produced which are further used for activity recognition and generating their life logs. The life-logging system is divided into two processes. Firstly, the training system includes data collection using a depth camera, feature extraction and training for each activity via Hidden Markov Models. Secondly, after training, the recognition engine starts to recognize the learned activities and produces life logs. The system was evaluated using life logging features against principal component and independent component features and achieved satisfactory recognition rates against the conventional approaches. Experiments conducted on the smart indoor activity datasets and the MSRDailyActivity3D dataset show promising results. The proposed system is directly applicable to any elderly monitoring system, such as monitoring healthcare problems for elderly people, or examining the indoor activities of people at home, office or hospital.
Agnisarman, Sruthy; Narasimha, Shraddhaa; Chalil Madathil, Kapil; Welch, Brandon; Brinda, Fnu; Ashok, Aparna; McElligott, James
Telemedicine is the use of technology to provide and support health care when distance separates the clinical service and the patient. Home-based telemedicine systems involve the use of such technology for medical support and care connecting the patient from the comfort of their homes with the clinician. In order for such a system to be used extensively, it is necessary to understand not only the issues faced by the patients in using them but also the clinician. The aim of this study was to conduct a heuristic evaluation of 4 telemedicine software platforms-Doxy.me, Polycom, Vidyo, and VSee-to assess possible problems and limitations that could affect the usability of the system from the clinician's perspective. It was found that 5 experts individually evaluated all four systems using Nielsen's list of heuristics, classifying the issues based on a severity rating scale. A total of 46 unique problems were identified by the experts. The heuristics most frequently violated were visibility of system status and Error prevention amounting to 24% (11/46 issues) each. Esthetic and minimalist design was second contributing to 13% (6/46 issues) of the total errors. Heuristic evaluation coupled with a severity rating scale was found to be an effective method for identifying problems with the systems. Prioritization of these problems based on the rating provides a good starting point for resolving the issues affecting these platforms. There is a need for better transparency and a more streamlined approach for how physicians use telemedicine systems. Visibility of the system status and speaking the users' language are keys for achieving this.
Full Text Available The design of smart video surveillance systems is an active research field among the computer vision community because of their ability to perform automatic scene analysis by selecting and tracking the objects of interest. In this paper, we present the design and implementation of an FPGA-based standalone working prototype system for real-time tracking of an object of interest in live video streams for such systems. In addition to real-time tracking of the object of interest, the implemented system is also capable of providing purposive automatic camera movement (pan-tilt in the direction determined by movement of the tracked object. The complete system, including camera interface, DDR2 external memory interface controller, designed object tracking VLSI architecture, camera movement controller and display interface, has been implemented on the Xilinx ML510 (Virtex-5 FX130T FPGA Board. Our proposed, designed and implemented system robustly tracks the target object present in the scene in real time for standard PAL (720 × 576 resolution color video and automatically controls camera movement in the direction determined by the movement of the tracked object.
Jia, Changxin; Sang, Xinzhu; Liu, Jing; Cheng, Mingsheng
As people's life quality have been improved significantly, the traditional 2D video technology can not meet people's urgent desire for a better video quality, which leads to the rapid development of 3D video technology. Simultaneously people want to watch 3D video in portable devices,. For achieving the above purpose, we set up a remote stereoscopic video play platform. The platform consists of a server and clients. The server is used for transmission of different formats of video and the client is responsible for receiving remote video for the next decoding and pixel restructuring. We utilize and improve Live555 as video transmission server. Live555 is a cross-platform open source project which provides solutions for streaming media such as RTSP protocol and supports transmission of multiple video formats. At the receiving end, we use our laboratory own player. The player for Android, which is with all the basic functions as the ordinary players do and able to play normal 2D video, is the basic structure for redevelopment. Also RTSP is implemented into this structure for telecommunication. In order to achieve stereoscopic display, we need to make pixel rearrangement in this player's decoding part. The decoding part is the local code which JNI interface calls so that we can extract video frames more effectively. The video formats that we process are left and right, up and down and nine grids. In the design and development, a large number of key technologies from Android application development have been employed, including a variety of wireless transmission, pixel restructuring and JNI call. By employing these key technologies, the design plan has been finally completed. After some updates and optimizations, the video player can play remote 3D video well anytime and anywhere and meet people's requirement.
Full Text Available We present a user-based method that detects regions of interest within a video in order to provide video skims and video summaries. Previous research in video retrieval has focused on content-based techniques, such as pattern recognition algorithms that attempt to understand the low-level features of a video. We are proposing a pulse modeling method, which makes sense of a web video by analyzing users' Replay interactions with the video player. In particular, we have modeled the user information seeking behavior as a time series and the semantic regions as a discrete pulse of fixed width. Then, we have calculated the correlation coefficient between the dynamically detected pulses at the local maximums of the user activity signal and the pulse of reference. We have found that users' Replay activity significantly matches the important segments in information-rich and visually complex videos, such as lecture, how-to, and documentary. The proposed signal processing of user activity is complementary to previous work in content-based video retrieval and provides an additional user-based dimension for modeling the semantics of a social video on the web.
Chen, Ming; He, Jing; Deng, Rui; Chen, Qinghui; Zhang, Jinlong; Chen, Lin
To further investigate the feasibility of the digital signal processing (DSP) algorithms (e.g., symbol timing synchronization, channel estimation and equalization, and sampling clock frequency offset (SCFO) estimation and compensation) for real-time optical orthogonal frequency-division multiplexing (OFDM) system, 2.97-Gb/s real-time high-definition video signal parallel transmission is experimentally demonstrated in OFDM-based short-reach intensity-modulated direct-detection (IM-DD) systems. The experimental results show that, in the presence of ∼12 ppm SCFO between transmitter and receiver, the adaptively modulated OFDM signal transmission over 20 km standard single-mode fiber with an error bit rate less than 1 × 10-9 can be achieved by using only DSP-based small SCFO estimation and compensation method without utilizing forward error correction technique. To the best of our knowledge, for the first time, we successfully demonstrate that the video signal at a bit rate in excess of 1-Gb/s transmission in a simple real-valued inverse fast Fourier transform and fast Fourier transform based IM-DD optical OFDM system employing a directly modulated laser.
Full Text Available The task of human hand trajectory tracking and gesture trajectory recognition based on synchronized color and depth video is considered. Toward this end, in the facet of hand tracking, a joint observation model with the hand cues of skin saliency, motion and depth is integrated into particle filter in order to move particles to local peak in the likelihood. The proposed hand tracking method, namely, salient skin, motion, and depth based particle filter (SSMD-PF, is capable of improving the tracking accuracy considerably, in the context of the signer performing the gesture toward the camera device and in front of moving, cluttered backgrounds. In the facet of gesture recognition, a shape-order context descriptor on the basis of shape context is introduced, which can describe the gesture in spatiotemporal domain. The efficient shape-order context descriptor can reveal the shape relationship and embed gesture sequence order information into descriptor. Moreover, the shape-order context leads to a robust score for gesture invariant. Our approach is complemented with experimental results on the settings of the challenging hand-signed digits datasets and American sign language dataset, which corroborate the performance of the novel techniques.
Yu, Yiqing; Liu, Huayong; Wang, Hongbin; Zhou, Dongru
In this paper, we propose content-based video retrieval, which is a kind of retrieval by its semantical contents. Because video data is composed of multimodal information streams such as video, auditory and textual streams, we describe a strategy of using multimodal analysis for automatic parsing sports video. The paper first defines the basic structure of sports video database system, and then introduces a new approach that integrates visual stream analysis, speech recognition, speech signal processing and text extraction to realize video retrieval. The experimental results for TV sports video of football games indicate that the multimodal analysis is effective for video retrieval by quickly browsing tree-like video clips or inputting keywords within predefined domain.
Full Text Available Automated video object recognition is a topic of emerging importance in both defense and civilian applications. This work describes an accurate and low-power neuromorphic architecture and system for real-time automated video object recognition. Our system, Neuormorphic Visual Understanding of Scenes (NEOVUS, is inspired by recent findings in computational neuroscience on feed-forward object detection and classification pipelines for processing and extracting relevant information from visual data. The NEOVUS architecture is inspired by the ventral (what and dorsal (where streams of the mammalian visual pathway and combines retinal processing, form-based and motion-based object detection, and convolutional neural nets based object classification. Our system was evaluated by the Defense Advanced Research Projects Agency (DARPA under the NEOVISION2 program on a variety of urban area video datasets collected from both stationary and moving platforms. The datasets are challenging as they include a large number of targets in cluttered scenes with varying illumination and occlusion conditions. The NEOVUS system was also mapped to commercially available off-the-shelf hardware. The dynamic power requirement for the system that includes a 5.6Mpixel retinal camera processed by object detection and classification algorithms at 30 frames per second was measured at 21.7 Watts (W, for an effective energy consumption of 5.4 nanoJoules (nJ per bit of incoming video. In a systematic evaluation of five different teams by DARPA on three aerial datasets, the NEOVUS demonstrated the best performance with the highest recognition accuracy and at least three orders of magnitude lower energy consumption than two independent state of the art computer vision systems. These unprecedented results show that the NEOVUS has the potential to revolutionize automated video object recognition towards enabling practical low-power and mobile video processing applications.
Rachit Mohan Garg; Yamini Sood; Neha Tyagi
With the increase in the bandwidth & the transmission speed over the internet, transmission of multimedia objects like video, audio, images has become an easier work. In this paper we provide an approach that can be useful for transmission of video objects over the internet without much fuzz. The approach provides a ontology based framework that is used to establish an automatic deployment of video transmission system. Further the video is compressed using the structural flow mechanism tha...
Full Text Available Vision-based monitoring systems using visible spectrum (regular video cameras can complement or substitute conventional sensors and provide rich positional and classification data. Although new camera technologies, including thermal video sensors, may improve the performance of digital video-based sensors, their performance under various conditions has rarely been evaluated at multimodal facilities. The purpose of this research is to integrate existing computer vision methods for automated data collection and evaluate the detection, classification, and speed measurement performance of thermal video sensors under varying lighting and temperature conditions. Thermal and regular video data was collected simultaneously under different conditions across multiple sites. Although the regular video sensor narrowly outperformed the thermal sensor during daytime, the performance of the thermal sensor is significantly better for low visibility and shadow conditions, particularly for pedestrians and cyclists. Retraining the algorithm on thermal data yielded an improvement in the global accuracy of 48%. Thermal speed measurements were consistently more accurate than for the regular video at daytime and nighttime. Thermal video is insensitive to lighting interference and pavement temperature, solves issues associated with visible light cameras for traffic data collection, and offers other benefits such as privacy, insensitivity to glare, storage space, and lower processing requirements.
Full Text Available The paper presents an NP-video rendering system based on natural phenomena. It provides a simple nonphotorealistic video synthesis system in which user can obtain a flow-like stylization painting and infinite video scene. Firstly, based on anisotropic Kuwahara filtering in conjunction with line integral convolution, the phenomena video scene can be rendered to flow-like stylization painting. Secondly, the methods of frame division, patches synthesis, will be used to synthesize infinite playing video. According to selection examples from different natural video texture, our system can generate stylized of flow-like and infinite video scenes. The visual discontinuities between neighbor frames are decreased, and we also preserve feature and details of frames. This rendering system is easy and simple to implement.
Hada, Yoshiaki; Ogata, Hiroaki; Yano, Yoneo
This paper focuses on an online video based correction system for language learning. The prototype system using the proposed model supports learning between a native English teacher and a non-native learner using a videoconference system. It extends the videoconference system so that it can record the conversation of a learning scene. If a teacher…
Hsu, Charles; Szu, Harold
An intelligent video surveillance system is able to detect and identify abnormal and alarming situations by analyzing object movement. The Smart Sensing Surveillance Video (S3V) System is proposed to minimize video processing and transmission, thus allowing a fixed number of cameras to be connected on the system, and making it suitable for its applications in remote battlefield, tactical, and civilian applications including border surveillance, special force operations, airfield protection, perimeter and building protection, and etc. The S3V System would be more effective if equipped with visual understanding capabilities to detect, analyze, and recognize objects, track motions, and predict intentions. In addition, alarm detection is performed on the basis of parameters of the moving objects and their trajectories, and is performed using semantic reasoning and ontologies. The S3V System capabilities and technologies have great potential for both military and civilian applications, enabling highly effective security support tools for improving surveillance activities in densely crowded environments. It would be directly applicable to solutions for emergency response personnel, law enforcement, and other homeland security missions, as well as in applications requiring the interoperation of sensor networks with handheld or body-worn interface devices.
Bolle, R. M.; Yeo, B.-L.; Yeung, M.
Digital video databases are becoming more and more pervasive and finding video of interest in large databases is rapidly becoming a problem. Intelligent means of quick content-based video retrieval and content-based rapid video viewing is, therefore, an important topic of research. Video is a rich source of data, it contains visual and audio information, and in many cases, there is text associated with the video. Content-based video retrieval should use all this information in an efficient and effective way. From a human perspective, a video query can be viewed as an iterated sequence of navigating, searching, browsing, and viewing. This paper addresses video search in terms of these phases.
Alsmirat, Mohammad Abdullah
Video streaming has recently grown dramatically in popularity over the Internet, Cable TV, and wire-less networks. Because of the resource demanding nature of video streaming applications, maximizing resource utilization in any video streaming system is a key factor to increase the scalability and decrease the cost of the system. Resources to…
Petre Vlah-Horea BOŢIANU
Full Text Available During the last years, video files showing different surgical procedures have become extremely available and popular on the internet. They are available on both free and unrestricted sites, as well as on dedicated sites which control the medical quality of the information. Honest presentation and a minimal video-editing to include information about the procedure are mandatory to achieve a product with a true educational value. The integration of the web-based video educational resources in the continuing medical information system seems to be limited and the true educational impact very difficult to assess. A review of the available literature dedicated on this subject shows that the main challenge is related to the human factor and not to the available technology.
The study enlightens the effectiveness of Video Game Based Learning in English Grammar at standard VI. A Video Game package was prepared and it consisted of self-learning activities in play way manner which attracted the minds of the young learners. Chief objective: Find out the effectiveness of Video-Game based learning in English grammar.…
Full Text Available In developing nations, many expanding cities are facing challenges that result from the overwhelming numbers of people and vehicles. Collecting real-time, reliable and precise traffic flow information is crucial for urban traffic management. The main purpose of this paper is to develop an adaptive model that can assess the real-time vehicle counts on urban roads using computer vision technologies. This paper proposes an automatic real-time background update algorithm for vehicle detection and an adaptive pattern for vehicle counting based on the virtual loop and detection line methods. In addition, a new robust detection method is introduced to monitor the real-time traffic congestion state of road section. A prototype system has been developed and installed on an urban road for testing. The results show that the system is robust, with a real-time counting accuracy exceeding 99% in most field scenarios.
Donnelly, Mark P; Nugent, Chris D; Craig, David; Passmore, Peter; Mulvenna, Maurice
The current paper presents details regarding the early developments of a memory prompt solution for persons with early dementia. Using everyday technology, in the form of a cell-phone, video reminders are delivered to assist with daily activities. The proposed CPVS system will permit carers to record and schedule video reminders remotely using a standard personal computer and web cam. It is the aim of the three year project that through the frequent delivery of helpful video reminders that a 'virtual carer' will be present with the person with dementia at all times. The first prototype of the system has been fully implemented with the first field trial scheduled to take place in May 2008. Initially, only three patient carer dyads will be involved, however, the second field trial aims to involve 30 dyads in the study. Details of the first prototype and the methods of evaluation are presented herein.
Bouma, Henri; van der Mark, Wannes; Eendebak, Pieter T.; Landsmeer, Sander H.; van Eekeren, Adam W. M.; ter Haar, Frank B.; Wieringa, F. Pieter; van Basten, Jean-Paul
Compared to open surgery, minimal invasive surgery offers reduced trauma and faster recovery. However, lack of direct view limits space perception. Stereo-endoscopy improves depth perception, but is still restricted to the direct endoscopic field-of-view. We describe a novel technology that reconstructs 3D-panoramas from endoscopic video streams providing a much wider cumulative overview. The method is compatible with any endoscope. We demonstrate that it is possible to generate photorealistic 3D-environments from mono- and stereoscopic endoscopy. The resulting 3D-reconstructions can be directly applied in simulators and e-learning. Extended to real-time processing, the method looks promising for telesurgery or other remote vision-guided tasks.
Seo, Jungdong; Kim, Donghyun; Ryu, Seungchul; Sohn, Kwanghoon
Multi-view video coding (MVC) is a video coding standard developed by MPEG and VCEG for multi-view video. It showed average PSNR gain of 1.5dB compared with view-independent coding by H.264/AVC. However, because resolutions of multi-view video are getting higher for more realistic 3D effect, high performance video codec is needed. MVC adopted hierarchical B-picture structure and inter-view prediction as core techniques. The hierarchical B-picture structure removes the temporal redundancy, and the inter-view prediction reduces the inter-view redundancy by compensated prediction from the reconstructed neighboring views. Nevertheless, MVC has inherent limitation in coding efficiency, because it is based on H.264/AVC. To overcome the limit, an enhanced video codec for multi-view video based on Key Technology Area (KTA) is proposed. KTA is a high efficiency video codec by Video Coding Expert Group (VCEG), and it was carried out for coding efficiency beyond H.264/AVC. The KTA software showed better coding gain than H.264/AVC by using additional coding techniques. The techniques and the inter-view prediction are implemented into the proposed codec, which showed high coding gain compared with the view-independent coding result by KTA. The results presents that the inter-view prediction can achieve higher efficiency in a multi-view video codec based on a high performance video codec such as HEVC.
Yamamoto, Daisuke; Nagao, Katashi
In this paper, we developed a Web-based video annotation system, named iVAS (intelligent Video Annotation Server). Audiences can associate any video content on the Internet with annotations. The system analyzes video content in order to acquire cut/shot information and color histograms. And it also automatically generates a Web page for editing annotations. Then, audiences can create annotation data by two methods. The first one helps the users to create text data such as person/object names, scene descriptions, and comments interactively. The second method facilitates the users associating any video fragments with their subjective impression by just clicking a mouse button. The generated annotation data are accumulated and managed by an XML database connected with iVAS. We also developed some application systems based on annotations such as video retrieval, video simplification, and video-content-based community support. One of the major advantages of our approach is easy integration of hand-coded and automatically-generated (such as color histograms and cut/shot information) annotations. Additionally, since our annotation system is open for public, we must consider some reliability or correctness of annotation data. We also developed an automatic evaluation method of annotation reliability using the users' feedback. In the future, these fundamental technologies will contribute to the formation of new communities centered around video content.
Li, Kang; Li, Sheng; Oh, Sangmin; Fu, Yun
Video analysis and understanding play a central role in visual intelligence. In this paper, we aim to analyze unconstrained videos, by designing features and approaches to represent and analyze videography styles in the videos. Videography denotes the process of making videos. The unconstrained videos are defined as the long duration consumer videos that usually have diverse editing artifacts and significant complexity of contents. We propose to construct a videography dictionary, which can be utilized to represent every video clip as a sequence of videography words. In addition to semantic features, such as foreground object motion and camera motion, we also incorporate two novel interpretable features to characterize videography, including the scale information and the motion correlations. We then demonstrate that, by using statistical analysis methods, the unique videography signatures extracted from different events can be automatically identified. For real-world applications, we explore the use of videography analysis for three types of applications, including content-based video retrieval, video summarization (both visual and textual), and videography-based feature pooling. In the experiments, we evaluate the performance of our approach and other methods on a large-scale unconstrained video dataset, and show that the proposed approach significantly benefits video analysis in various ways.
Ebe, Kazuyu, E-mail: email@example.com; Tokuyama, Katsuichi; Baba, Ryuta; Ogihara, Yoshisada; Ichikawa, Kosuke; Toyama, Joji [Joetsu General Hospital, 616 Daido-Fukuda, Joetsu-shi, Niigata 943-8507 (Japan); Sugimoto, Satoru [Juntendo University Graduate School of Medicine, Bunkyo-ku, Tokyo 113-8421 (Japan); Utsunomiya, Satoru; Kagamu, Hiroshi; Aoyama, Hidefumi [Graduate School of Medical and Dental Sciences, Niigata University, Niigata 951-8510 (Japan); Court, Laurence [The University of Texas MD Anderson Cancer Center, Houston, Texas 77030-4009 (United States)
Purpose: To develop and evaluate a new video image-based QA system, including in-house software, that can display a tracking state visually and quantify the positional accuracy of dynamic tumor tracking irradiation in the Vero4DRT system. Methods: Sixteen trajectories in six patients with pulmonary cancer were obtained with the ExacTrac in the Vero4DRT system. Motion data in the cranio–caudal direction (Y direction) were used as the input for a programmable motion table (Quasar). A target phantom was placed on the motion table, which was placed on the 2D ionization chamber array (MatriXX). Then, the 4D modeling procedure was performed on the target phantom during a reproduction of the patient’s tumor motion. A substitute target with the patient’s tumor motion was irradiated with 6-MV x-rays under the surrogate infrared system. The 2D dose images obtained from the MatriXX (33 frames/s; 40 s) were exported to in-house video-image analyzing software. The absolute differences in the Y direction between the center of the exposed target and the center of the exposed field were calculated. Positional errors were observed. The authors’ QA results were compared to 4D modeling function errors and gimbal motion errors obtained from log analyses in the ExacTrac to verify the accuracy of their QA system. The patients’ tumor motions were evaluated in the wave forms, and the peak-to-peak distances were also measured to verify their reproducibility. Results: Thirteen of sixteen trajectories (81.3%) were successfully reproduced with Quasar. The peak-to-peak distances ranged from 2.7 to 29.0 mm. Three trajectories (18.7%) were not successfully reproduced due to the limited motions of the Quasar. Thus, 13 of 16 trajectories were summarized. The mean number of video images used for analysis was 1156. The positional errors (absolute mean difference + 2 standard deviation) ranged from 0.54 to 1.55 mm. The error values differed by less than 1 mm from 4D modeling function errors
Jensen, Karsten; Juhl, Jens
There is an increasing demand for assessing foot mal positions and an interest in monitoring the effect of treatment. In the last decades several different motion capture systems has been used. This abstract describes a new low cost motion capture system.......There is an increasing demand for assessing foot mal positions and an interest in monitoring the effect of treatment. In the last decades several different motion capture systems has been used. This abstract describes a new low cost motion capture system....
From the streets of London to subway stations in New York City, hundreds of thousands of surveillance cameras ubiquitously collect hundreds of thousands of videos, often running 24/7. How can such vast volumes of video data be stored, analyzed, indexed, and searched? How can advanced video analysis and systems autonomously recognize people and detect targeted activities real-time? Collating and presenting the latest information Intelligent Video Surveillance: Systems and Technology explores these issues, from fundamentals principle to algorithmic design and system implementation.An Integrated
Mendi, Engin; Bayrak, Coskun; Cecen, Songul; Ermisoglu, Emre
Development of health information technology has had a dramatic impact to improve the efficiency and quality of medical care. Developing interoperable health information systems for healthcare providers has the potential to improve the quality and equitability of patient-centered healthcare. In this article, we describe an automated content-based medical video analysis and management service that provides convenience and ease in accessing the relevant medical video content without sequential scanning. The system facilitates effective temporal video segmentation and content-based visual information retrieval that enable a more reliable understanding of medical video content. The system is implemented as a Web- and mobile-based service and has the potential to offer a knowledge-sharing platform for the purpose of efficient medical video content access.
Gregorio, Massimo De
In this paper we present an intelligent active video surveillance system currently adopted in two different application domains: railway tunnels and outdoor storage areas. The system takes advantages of the integration of Artificial Neural Networks (ANN) and symbolic Artificial Intelligence (AI). This hybrid system is formed by virtual neural sensors (implemented as WiSARD-like systems) and BDI agents. The coupling of virtual neural sensors with symbolic reasoning for interpreting their outputs, makes this approach both very light from a computational and hardware point of view, and rather robust in performances. The system works on different scenarios and in difficult light conditions.
Wang, Shuangbao; Kelly, William
In this paper, we present a novel system, inVideo, for video data analytics, and its use in transforming linear videos into interactive learning objects. InVideo is able to analyze video content automatically without the need for initial viewing by a human. Using a highly efficient video indexing engine we developed, the system is able to analyze…
... From the Federal Register Online via the Government Publishing Office FEDERAL COMMUNICATIONS COMMISSION 47 CFR Part 76 Open Video Systems AGENCY: Federal Communications Commission. ACTION: Final rule... Open Video Systems. DATES: The amendments to 47 CFR 76.1505(d) and 76.1506(d), (l)(3), and (m)(2...
Full Text Available Shooting free throws plays an important role in basketball. The major problem in performing a correct free throw seems to be inappropriate training. Training is performed offline and it is often not that persistent. The aim of this paper is to consciously modify and control the free throw using biofeedback. Elbow and shoulder dynamics are calculated by an image processing technique equipped with a video image acquisition system. The proposed setup in this paper, named learning control system, is able to quantify and provide feedback of the above parameters in real time as audio signals. Therefore, it yielded to performing a correct learning and conscious control of shooting. Experimental results showed improvements in the free throw shooting style including shot pocket and locked position. The mean values of elbow and shoulder angles were controlled approximately on 89o and 26o, for shot pocket and also these angles were tuned approximately on 180o and 47o respectively for the locked position (closed to the desired pattern of the free throw based on valid FIBA references. Not only the mean values enhanced but also the standard deviations of these angles decreased meaningfully, which shows shooting style convergence and uniformity. Also, in training conditions, the average percentage of making successful free throws increased from about 64% to even 87% after using this setup and in competition conditions the average percentage of successful free throws enhanced about 20%, although using the learning control system may not be the only reason for these outcomes. The proposed system is easy to use, inexpensive, portable and real time applicable.
Vision is a part of information system that converts visual information into knowledge structures. These structures drive the vision process, resolving ambiguity and uncertainty via feedback, and provide image understanding, which is an interpretation of visual information in terms of these knowledge models. It is hard to split the entire system apart, and vision mechanisms cannot be completely understood separately from informational processes related to knowledge and intelligence. Brain reduces informational and computational complexities, using implicit symbolic coding of features, hierarchical compression, and selective processing of visual information. Vision is a component of situation awareness, motion and planning systems. Foveal vision provides semantic analysis, recognizing objects in the scene. Peripheral vision guides fovea to salient objects and provides scene context. Biologically inspired Network-Symbolic representation, in which both systematic structural/logical methods and neural/statistical methods are parts of a single mechanism, converts visual information into relational Network-Symbolic structures, avoiding precise artificial computations of 3-D models. Network-Symbolic transformations derive more abstract structures that allows for invariant recognition of an object as exemplar of a class and for a reliable identification even if the object is occluded. Systems with such smart vision will be able to navigate in real environment and understand real-world situations.
Vision evolved as a sensory system for reaching, grasping and other motion activities. In advanced creatures, it has become a vital component of situation awareness, navigation and planning systems. Vision is part of a larger information system that converts visual information into knowledge structures. These structures drive the vision process, resolving ambiguity and uncertainty via feedback, and provide image understanding, that is an interpretation of visual information in terms of such knowledge models. It is hard to split such a system apart. Biologically inspired Network-Symbolic representation, where both systematic structural/logical methods and neural/statistical methods are parts of a single mechanism, is the most feasible for natural processing of visual information. It converts visual information into relational Network-Symbolic models, avoiding artificial precise computations of 3-dimensional models. Logic of visual scenes can be captured in such models and used for disambiguation of visual information. Network-Symbolic transformations derive abstract structures, which allows for invariant recognition of an object as exemplar of a class. Active vision helps create unambiguous network-symbolic models. This approach is consistent with NIST RCS. The UGV, equipped with such smart vision, will be able to plan path and navigate in a real environment, perceive and understand complex real-world situations and act accordingly.
Robert C. Lorenz
Full Text Available Video games contain elaborate reinforcement and reward schedules that have the potential to maximize motivation. Neuroimaging studies suggest that video games might have an influence on the reward system. However, it is not clear whether reward-related properties represent a precondition, which biases an individual towards playing video games, or if these changes are the result of playing video games. Therefore, we conducted a longitudinal study to explore reward-related functional predictors in relation to video gaming experience as well as functional changes in the brain in response to video game training.Fifty healthy participants were randomly assigned to a video game training (TG or control group (CG. Before and after training/control period, functional magnetic resonance imaging (fMRI was conducted using a non-video game related reward task.At pretest, both groups showed strongest activation in ventral striatum (VS during reward anticipation. At posttest, the TG showed very similar VS activity compared to pretest. In the CG, the VS activity was significantly attenuated.This longitudinal study revealed that video game training may preserve reward responsiveness in the ventral striatum in a retest situation over time. We suggest that video games are able to keep striatal responses to reward flexible, a mechanism which might be of critical value for applications such as therapeutic cognitive training.
Lorenz, Robert C; Gleich, Tobias; Gallinat, Jürgen; Kühn, Simone
Video games contain elaborate reinforcement and reward schedules that have the potential to maximize motivation. Neuroimaging studies suggest that video games might have an influence on the reward system. However, it is not clear whether reward-related properties represent a precondition, which biases an individual toward playing video games, or if these changes are the result of playing video games. Therefore, we conducted a longitudinal study to explore reward-related functional predictors in relation to video gaming experience as well as functional changes in the brain in response to video game training. Fifty healthy participants were randomly assigned to a video game training (TG) or control group (CG). Before and after training/control period, functional magnetic resonance imaging (fMRI) was conducted using a non-video game related reward task. At pretest, both groups showed strongest activation in ventral striatum (VS) during reward anticipation. At posttest, the TG showed very similar VS activity compared to pretest. In the CG, the VS activity was significantly attenuated. This longitudinal study revealed that video game training may preserve reward responsiveness in the VS in a retest situation over time. We suggest that video games are able to keep striatal responses to reward flexible, a mechanism which might be of critical value for applications such as therapeutic cognitive training.
Byrnes, Patrick D.; Higgins, William E.
Image-guided bronchoscopy is a critical component in the treatment of lung cancer and other pulmonary disorders. During bronchoscopy, a high-resolution endobronchial video stream facilitates guidance through the lungs and allows for visual inspection of a patient's airway mucosal surfaces. Despite the detailed information it contains, little effort has been made to incorporate recorded video into the clinical workflow. Follow-up procedures often required in cancer assessment or asthma treatment could significantly benefit from effectively parsed and summarized video. Tracking diagnostic regions of interest (ROIs) could potentially better equip physicians to detect early airway-wall cancer or improve asthma treatments, such as bronchial thermoplasty. To address this need, we have developed a system for the postoperative analysis of recorded endobronchial video. The system first parses an input video stream into endoscopic shots, derives motion information, and selects salient representative key frames. Next, a semi-automatic method for CT-video registration creates data linkages between a CT-derived airway-tree model and the input video. These data linkages then enable the construction of a CT-video chest model comprised of a bronchoscopy path history (BPH) - defining all airway locations visited during a procedure - and texture-mapping information for rendering registered video frames onto the airwaytree model. A suite of analysis tools is included to visualize and manipulate the extracted data. Video browsing and retrieval is facilitated through a video table of contents (TOC) and a search query interface. The system provides a variety of operational modes and additional functionality, including the ability to define regions of interest. We demonstrate the potential of our system using two human case study examples.
Yan, Ling; Hicks, Matt; Winslow, Korey; Comella, Cynthia; Ludlow, Christy; Jinnah, H. A; Rosen, Ami R; Wright, Laura; Galpern, Wendy R; Perlmutter, Joel S
Background We developed a novel secured web-based dystonia video repository for the Dystonia Coalition, part of the Rare Disease Clinical Research network funded by the Office of Rare Diseases Research and the National Institute of Neurological Disorders and Stroke. A critical component of phenotypic data collection for all projects of the Dystonia Coalition includes a standardized video of each participant. We now describe our method for collecting, serving and securing these videos that is widely applicable to other studies. Methods Each recruiting site uploads standardized videos to a centralized secured server for processing to permit website posting. The streaming technology used to view the videos from the website does not allow downloading of video files. With appropriate institutional review board approval and agreement with the hosting institution, users can search and view selected videos on the website using customizable, permissions-based access that maintains security yet facilitates research and quality control. Results This approach provides a convenient platform for researchers across institutions to evaluate and analyze shared video data. We have applied this methodology for quality control, confirmation of diagnoses, validation of rating scales, and implementation of new research projects. Conclusions We believe our system can be a model for similar projects that require access to common video resources. PMID:25630890
Yan, Ling; Hicks, Matt; Winslow, Korey; Comella, Cynthia; Ludlow, Christy; Jinnah, H A; Rosen, Ami R; Wright, Laura; Galpern, Wendy R; Perlmutter, Joel S
We developed a novel secured web-based dystonia video repository for the Dystonia Coalition, part of the Rare Disease Clinical Research network funded by the Office of Rare Diseases Research and the National Institute of Neurological Disorders and Stroke. A critical component of phenotypic data collection for all projects of the Dystonia Coalition includes a standardized video of each participant. We now describe our method for collecting, serving and securing these videos that is widely applicable to other studies. Each recruiting site uploads standardized videos to a centralized secured server for processing to permit website posting. The streaming technology used to view the videos from the website does not allow downloading of video files. With appropriate institutional review board approval and agreement with the hosting institution, users can search and view selected videos on the website using customizable, permissions-based access that maintains security yet facilitates research and quality control. This approach provides a convenient platform for researchers across institutions to evaluate and analyze shared video data. We have applied this methodology for quality control, confirmation of diagnoses, validation of rating scales, and implementation of new research projects. We believe our system can be a model for similar projects that require access to common video resources. Copyright © 2015 Elsevier Ltd. All rights reserved.
Botella, Guillermo; García, Carlos; Meyer-Bäse, Uwe
This contribution focuses on different topics covered by the special issue titled `Hardware Implementation of Machine vision Systems' including FPGAs, GPUS, embedded systems, multicore implementations for image analysis such as edge detection, segmentation, pattern recognition and object recognition/interpretation, image enhancement/restoration, image/video compression, image similarity and retrieval, satellite image processing, medical image processing, motion estimation, neuromorphic and bioinspired vision systems, video processing, image formation and physics based vision, 3D processing/coding, scene understanding, and multimedia.
Laptenok, V. D.; Seregin, Y. N.; Bocharov, A. N.; Murygin, A. V.; Tynchenko, V. S.
Equipment of video observation system for electron beam welding process was developed. Construction of video observation system allows to reduce negative effects on video camera during the process of electron beam welding and get qualitative images of this process.
van der Schaar-Mitrea, Mihaela; de With, Peter H. N.
The diversity in TV images has augmented with the increased application of computer graphics. In this paper we study z coding system that supports both the lossless coding of such graphics data and regular lossy video compression. The lossless coding techniques are based on runlength and arithmetical coding. For video compression, we introduce a simple block predictive coding technique featuring individual pixel access, so that it enables a gradual shift from lossless coding of graphics to the lossy coding of video. An overall bit rate control completes the system. Computer simulations show a very high quality with a compression factor between 2-3.
Schmale, Sebastian; Seidel, Pascal; Thiermann, Steffen; Paul, Steffen
Inpainting-based compression and reconstruction methodology can be applied to systems with limited resources to enable continuously monitor neurological activity. In this work, an approach based on sparse representations and K-SVD is augmented to a video processing in order to improve the recovery quality. That was mainly achieved by using another direction of spatial correlation and the extraction of cuboids across frames. The implementation of overlapping frames between the recorded data blocks avoids rising errors at the boundaries during the inpainting-based recovery. Controlling the electrode states per frame plays a key role for high data compression and precise recovery. The proposed 3D inpainting approach can compete with common methods like JPEG, JPEG2000 or MPEG-4 in terms of the degree of compression and reconstruction accuracy, which was applied on real measured local field potentials of a human patient.
Full Text Available Leveraging network virtualization technologies, the community-based video systems rely on the measurement of common interests to define and steady relationship between community members, which promotes video sharing performance and improves scalability community structure. In this paper, we propose a novel mobile Video Community discovery scheme using ontology-based semantical interest capture (VCOSI. An ontology-based semantical extension approach is proposed, which describes video content and measures video similarity according to video key word selection methods. In order to reduce the calculation load of video similarity, VCOSI designs a prefix-filtering-based estimation algorithm to decrease energy consumption of mobile nodes. VCOSI further proposes a member relationship estimate method to construct scalable and resilient node communities, which promotes video sharing capacity of video systems with the flexible and economic community maintenance. Extensive tests show how VCOSI obtains better performance results in comparison with other state-of-the-art solutions.
Liu, Shuo; Piao, Yan
As the effect of atmospheric particles scattering, the video image captured by outdoor surveillance system has low contrast and brightness, which directly affects the application value of the system. The traditional defogging technology is mostly studied by software for the defogging algorithms of the single frame image. Moreover, the algorithms have large computation and high time complexity. Then, the defogging technology of video image based on Digital Signal Processing (DSP) has the problem of complex peripheral circuit. It can't be realized in real-time processing, and it's hard to debug and upgrade. In this paper, with the improved dark channel prior algorithm, we propose a kind of defogging technology of video image based on Field Programmable Gate Array (FPGA). Compared to the traditional defogging methods, the video image with high resolution can be processed in real-time. Furthermore, the function modules of the system have been designed by hardware description language. At last, the results show that the defogging system based on FPGA can process the video image with minimum resolution of 640×480 in real-time. After defogging, the brightness and contrast of video image are improved effectively. Therefore, the defogging technology proposed in the paper has a great variety of applications including aviation, forest fire prevention, national security and other important surveillance.
imperceptibility, they used Watson's DCT-based visual model for video watermarking. Ge et al. (2003) have presented a ... accordance with the human visual system and signal characteristics. Hsu & Wu (1998) have proposed a ...... Shannon C E 1949 Communication theory of secrecy system. Bell System Technical Journal ...
Kasprowicz, Grzegorz; Pastuszak, Grzegorz; Poźniak, Krzysztof; Trochimiuk, Maciej; Abramowski, Andrzej; Gaska, Michal; Bukowiecka, Danuta; Tyburska, Agata; Struniawski, Jarosław; Jastrzebski, Pawel; Jewartowski, Blazej; Frasunek, Przemysław; Nalbach-Moszynska, Małgorzata; Brawata, Sebastian; Bubak, Iwona; Gloza, Małgorzata
The purpose of the project is development of a platform which integrates video signals from many sources. The signals can be sourced by existing analogue CCTV surveillance installations, recent internet-protocol (IP) cameras or single cameras of any type. The system will consist of portable devices that provide conversion, encoding, transmission and archiving. The sharing subsystem will use distributed file system and also user console which provides simultaneous access to any of video streams in real time. The system is fully modular so its extension is possible, both from hardware and software side. Due to standard modular technology used, partial technology modernization is also possible during a long exploitation period.
Vision is only a part of a system that converts visual information into knowledge structures. These structures drive the vision process, resolving ambiguity and uncertainty via feedback, and provide image understanding, which is an interpretation of visual information in terms of these knowledge models. These mechanisms provide a reliable recognition if the object is occluded or cannot be recognized as a whole. It is hard to split the entire system apart, and reliable solutions to the target recognition problems are possible only within the solution of a more generic Image Understanding Problem. Brain reduces informational and computational complexities, using implicit symbolic coding of features, hierarchical compression, and selective processing of visual information. Biologically inspired Network-Symbolic representation, where both systematic structural/logical methods and neural/statistical methods are parts of a single mechanism, is the most feasible for such models. It converts visual information into relational Network-Symbolic structures, avoiding artificial precise computations of 3-dimensional models. Network-Symbolic Transformations derive abstract structures, which allows for invariant recognition of an object as exemplar of a class. Active vision helps creating consistent models. Attention, separation of figure from ground and perceptual grouping are special kinds of network-symbolic transformations. Such Image/Video Understanding Systems will be reliably recognizing targets.
Vision is only a part of a larger system that converts visual information into knowledge structures. These structures drive the vision process, resolving ambiguity and uncertainty via feedback, and provide image understanding, which is an interpretation of visual information in terms of these knowledge models. This mechanism provides a reliable recognition if the target is occluded or cannot be recognized. It is hard to split the entire system apart, and reliable solutions to the target recognition problems are possible only within the solution of a more generic Image Understanding Problem. Brain reduces informational and computational complexities, using implicit symbolic coding of features, hierarchical compression, and selective processing of visual information. Biologically inspired Network-Symbolic representation, where both systematic structural/logical methods and neural/statistical methods are parts of a single mechanism, converts visual information into relational Network-Symbolic structures, avoiding artificial precise computations of 3-dimensional models. Logic of visual scenes can be captured in Network-Symbolic models and used for disambiguation of visual information. Network-Symbolic Transformations derive abstract structures, which allow for invariant recognition of an object as exemplar of a class. Active vision helps build consistent, unambiguous models. Such Image/Video Understanding Systems will be able reliably recognizing targets in real-world conditions.
Full Text Available This paper reports on the development of an automated embedded video surveillance system using two customized embedded RISC processors. The application is partitioned into object tracking and video stream encoding subsystems. The real-time object tracker is able to detect and track moving objects by video images of scenes taken by stationary cameras. It is based on the block-matching algorithm. The video stream encoding involves the optimization of an international telecommunications union (ITU-T H.263 baseline video encoder for quarter common intermediate format (QCIF and common intermediate format (CIF resolution images. The two subsystems running on two processor cores were integrated and a simple protocol was added to realize the automated video surveillance system. The experimental results show that the system is capable of detecting, tracking, and encoding QCIF and CIF resolution images with object movements in them in real-time. With low cycle-count, low-transistor count, and low-power consumption requirements, the system is ideal for deployment in remote locations.
Vaish, Pallavi; Bharath, R; Rajalakshmi, P
Telesonography involves transmission of ultrasound video from remote areas to the doctors for getting diagnosis. Due to the lack of trained sonographers in remote areas, the ultrasound videos scanned by these untrained persons do not contain the proper information that is required by a physician. As compared to standard methods for video transmission, mHealth driven systems need to be developed for transmitting valid medical videos. To overcome this problem, we are proposing an organ validation algorithm to evaluate the ultrasound video based on the content present. This will guide the semi skilled person to acquire the representative data from patient. Advancement in smartphone technology allows us to perform high medical image processing on smartphone. In this paper we have developed an Application (APP) for a smartphone which can automatically detect the valid frames (which consist of clear organ visibility) in an ultrasound video and ignores the invalid frames (which consist of no-organ visibility), and produces a compressed sized video. This is done by extracting the GIST features from the Region of Interest (ROI) of the frame and then classifying the frame using SVM classifier with quadratic kernel. The developed application resulted with the accuracy of 94.93% in classifying valid and invalid images.
Kokaram, Anil; Doyle, Erika; Lennon, Daire; Joyeux, Laurent; Fuller, Ray
In Psychology it is common to conduct studies involving the observation of humans undertaking some task. The sessions are typically recorded on video and used for subjective visual analysis. The subjective analysis is tedious and time consuming, not only because much useless video material is recorded but also because subjective measures of human behaviour are not necessarily repeatable. This paper presents tools using content based video analysis that allow automated parsing of video from one such study involving Dyslexia. The tools rely on implicit measures of human motion that can be generalised to other applications in the domain of human observation. Results comparing quantitative assessment of human motion with subjective assessment are also presented, illustrating that the system is a useful scientific tool.
Du, Xun; Li, Honglin; Ahalt, Stanley C.
The term Content-Based appears often in applications for which MPEG-7 is expected to play a significant role. MPEG-7 standardizes descriptors of multimedia content, and while compression is not the primary focus of MPEG-7, the descriptors defined by MPEG-7 can be used to reconstruct a rough representation of an original multimedia source. In contrast, current image and video compression standards such as JPEG and MPEG are not designed to encode at the very low bit-rates that could be accomplished with MPEG-7 using descriptors. In this paper we show that content-based mechanisms can be introduced into compression algorithms to improve the scalability and functionality of current compression methods such as JPEG and MPEG. This is the fundamental idea behind Content-Based Compression (CBC). Our definition of CBC is a compression method that effectively encodes a sufficient description of the content of an image or a video in order to ensure that the recipient is able to reconstruct the image or video to some degree of accuracy. The degree of accuracy can be, for example, the classification error rate of the encoded objects, since in MPEG-7 the classification error rate measures the performance of the content descriptors. We argue that the major difference between a content-based compression algorithm and conventional block-based or object-based compression algorithms is that content-based compression replaces the quantizer with a more sophisticated classifier, or with a quantizer which minimizes classification error. Compared to conventional image and video compression methods such as JPEG and MPEG, our results show that content-based compression is able to achieve more efficient image and video coding by suppressing the background while leaving the objects of interest nearly intact.
Tao, Junjie; Jia, Lili; You, Ying
Advances in digital video compression and IP communication technologies raised new issues and challenges concerning the integrity and authenticity of surveillance videos. It is so important that the system should ensure that once recorded, the video cannot be altered; ensuring the audit trail is intact for evidential purposes. This paper gives an overview of passive techniques of Digital Video Forensics which are based on intrinsic fingerprints inherent in digital surveillance videos. In this paper, we performed a thorough research of literatures relevant to video manipulation detection methods which accomplish blind authentications without referring to any auxiliary information. We presents review of various existing methods in literature, and much more work is needed to be done in this field of video forensics based on video data analysis and observation of the surveillance systems.
DeMuth, David M., Jr.; Schwalm, M.
A video capture and Macromedia Flash-based video analysis system is used in an implementation of the problem solving and collaborative methodologies (UMinn, Heller/Heller) at the University of Minnesota, Crookston, where an open source polling system has been developed to verify preparation. Its features include a platform independent web interface, assignable and automatic polling intervals, and graphical reporting. Question types: Multiple Choice (optionally randomly presented), random variable, ranking, and survey. An HTML editor with image/video/file uploading allows for high quality presentation of questions. An optional collaborative feature forces agreement among students in small groups. An overview of the system and its impact on preparation will be presented. http://ray.crk.umn.edu/aapt/ Funded by NSF-CCLI 102280781
Wadwekar, Vaibhav; Nair, Pradeep Pankajakshan; Murgai, Aditya; Thirunavukkarasu, Sibi; Thazhath, Harichandrakumar Kottyen
Different studies have described useful signs to diagnose psychogenic non-epileptic seizure (PNES). A few authors have tried to describe the semiologic groups among PNES patients; each group consisting of combination of features. But there is no uniformity of nomenclature among these studies. Our aim was to find out whether the objective classification system proposed by Hubsch et al. was useful and adequate to classify PNES patient population from South India. We retrospectively analyzed medical records and video EEG monitoring data of patients, recorded during 3 year period from June 2010 to July 2013. We observed the semiologic features of each PNES episode and tried to group them strictly adhering to Hubsch et al. classification. Minor modifications were made to include patients who were left unclassified. A total of 65 patients were diagnosed to have PNES during this period, out of which 11 patients were excluded due to inadequate data. We could classify 42(77.77%) patients without modifying the defining criteria of the Hubsch et al. groups. With minor modification we could classify 94.96% patients. The modified groups with patient distribution are as follows: Class 1--dystonic attacks with primitive gestural activities [3(5.6%)]. Class 2 – paucikinetic attacks with or without preserved responsiveness [5(9.3%)]. Class 3--pseudosyncope with or without hyperventilation [21(38.9%)]. Class 4--hyperkinetic prolonged attacks with hyperventilation, involvement of limbs and/or trunk [14(25.9%)]. Class 5--axial dystonic attacks [8(14.8%)]. Class 6--unclassified type [3(5.6%)]. This study demonstrates that the Hubsch's classification with minor modifications is useful and adequate to classify PNES patients from South India. Copyright © 2013 British Epilepsy Association. Published by Elsevier Ltd. All rights reserved.
Full Text Available Future wireless video transmission systems will consider orthogonal frequency division multiplexing (OFDM as the basic modulation technique due to its robustness and low complexity implementation in the presence of frequency-selective channels. Recently, adaptive bit loading techniques have been applied to OFDM showing good performance gains in cable transmission systems. In this paper a multilayer bit loading technique, based on the so called Ã‚Â“ordered subcarrier selection algorithm,Ã‚Â” is proposed and applied to a Hiperlan2-like wireless system at 5 GHz for efficient layered multimedia transmission. Different schemes realizing unequal error protection both at coding and modulation levels are compared. The strong impact of this technique in terms of video quality is evaluated for MPEG-4 video transmission.
Huang, Xin; Søgaard, Jacob; Forchhammer, Søren
This paper proposes a No-Reference (NR) Video Quality Assessment (VQA) method for videos subject to the distortion given by the High Efficiency Video Coding (HEVC) scheme. The assessment is performed without access to the bitstream. The proposed analysis is based on the transform coefficients...
Mirota, Daniel J.; Uneri, Ali; Schafer, Sebastian; Nithiananthan, Sajendra; Reh, Douglas D.; Ishii, Masaru; Gallia, Gary L.; Taylor, Russell H.; Hager, Gregory D.; Siewerdsen, Jeffrey H.
The safety of endoscopic skull base surgery can be enhanced by accurate navigation in preoperative computed tomography (CT) or, more recently, intraoperative cone-beam CT (CBCT). The ability to register real-time endoscopic video with CBCT offers an additional advantage by rendering information directly within the visual scene to account for intraoperative anatomical change. However, tracker localization error (~ 1–2 mm) limits the accuracy with which video and tomographic images can be registered. This paper reports the first implementation of image-based video-CBCT registration, conducts a detailed quantitation of the dependence of registration accuracy on system parameters, and demonstrates improvement in registration accuracy achieved by the image-based approach. Performance was evaluated as a function of parameters intrinsic to the image-based approach, including system geometry, CBCT image quality, and computational runtime. Overall system performance was evaluated in a cadaver study simulating transsphenoidal skull base tumor excision. Results demonstrated significant improvement (p < 0.001)in registration accuracy with a mean reprojection distance error of 1.28 mm for the image-based approach versus 1.82 mm for the conventional tracker-based method. Image-based registration was highly robust against CBCT image quality factors of noise and resolution, permitting integration with low-dose intraoperative CBCT. PMID:23372078
Aghito, Shankar Manuel; Forchhammer, Søren
A novel algorithm for coding gray level alpha planes in object-based video is presented. The scheme is based on segmentation in multiple layers. Different coders are specifically designed for each layer. In order to reduce the bit rate, cross-layer redundancies as well as temporal correlation...... are exploited. Coding results show the superior efficiency of the proposed scheme compared with MPEG-4...
Spector, B.; Eilbert, L.; Finando, S.; Fukuda, F.
A Video Integrated Measurement (VIM) System is described which incorporates the use of various noninvasive diagnostic procedures (moire contourography, electromyography, posturometry, infrared thermography, etc.), used individually or in combination, for the evaluation of neuromusculoskeletal and other disorders and their management with biofeedback and other therapeutic procedures. The system provides for measuring individual diagnostic and therapeutic modes, or multiple modes by split screen superimposition, of real time (actual) images of the patient and idealized (ideal-normal) models on a video monitor, along with analog and digital data, graphics, color, and other transduced symbolic information. It is concluded that this system provides an innovative and efficient method by which the therapist and patient can interact in biofeedback training/learning processes and holds considerable promise for more effective measurement and treatment of a wide variety of physical and behavioral disorders.
Gershkoff, I.; Haspert, J. K.; Morgenstern, B.
A cost model that can be used to systematically identify the costs of procuring and operating satellite linked communications systems is described. The user defines a network configuration by specifying the location of each participating site, the interconnection requirements, and the transmission paths available for the uplink (studio to satellite), downlink (satellite to audience), and voice talkback (between audience and studio) segments of the network. The model uses this information to calculate the least expensive signal distribution path for each participating site. Cost estimates are broken downy by capital, installation, lease, operations and maintenance. The design of the model permits flexibility in specifying network and cost structure.
Mahadevan, Swaminatha V; Gisondi, Michael A; Sovndal, Shannon S; Gilbert, Gregory H
To assure a smooth transition to their new work environment, rotating students and housestaff require detailed orientations to the physical layout and operations of the emergency department. Although such orientations are useful for new staff members, they represent a significant time commitment for the faculty members charged with this task. To address this issue, the authors developed a series of short instructional videos that provide a comprehensive and consistent method of emergency department orientation. The videos are viewed through Web-based streaming technology that allows learners to complete the orientation process from any computer with Internet access before their first shift. This report describes the stepwise process used to produce these videos and discusses the potential benefits of converting to an Internet-based orientation system.
Belyaev, Evgeny; Turlikov, Andrey; Ukhanova, Ann
This paper presents latency-constrained video transmission over high-speed wireless personal area networks (WPANs). Low-power video compression is proposed as an alternative to uncompressed video transmission. A video source rate control based on MINMAX quality criteria is introduced. Practical...... results for video encoder based on H.264/AVC standard are also given....
Belyaev, Evgeny; Turlikov, Andrey; Ukhanova, Ann
This paper presents latency-constrained video transmission over high-speed wireless personal area networks (WPANs). Low-power video compression is proposed as an alternative to uncompressed video transmission. A video source rate control based on MINMAX quality criteria is introduced. Practical results for video encoder based on H.264/AVC standard are also given.
Wouters, Pieter; Tabbers, Huib; Paas, Fred
textabstractIn this review we argue that interactivity can be effective in video-based models to engage learners in relevant cognitive processes. We do not treat modeling as an isolated instructional method but adopted the social cognitive model of sequential skill acquisition in which learners start with observation and finish with independent, self-regulated performance. Moreover, we concur with the notion that interactivity should emphasize the cognitive processes that learners engage in w...
Ibrahim, Mohamad A.
This thesis reports original work on hardware realization of symmetric video encryption using chaos-based continuous systems as pseudo-random number generators. The thesis also presents some of the serious degradations caused by digitally implementing chaotic systems. Subsequently, some techniques to eliminate such defects, including the ultimately adopted scheme are listed and explained in detail. Moreover, the thesis describes original work on the design of an encryption system to encrypt MPEG-2 video streams. Information about the MPEG-2 standard that fits this design context is presented. Then, the security of the proposed system is exhaustively analyzed and the performance is compared with other reported systems, showing superiority in performance and security. The thesis focuses more on the hardware and the circuit aspect of the system’s design. The system is realized on Xilinx Vetrix-4 FPGA with hardware parameters and throughput performance surpassing conventional encryption systems.
Aghdasi, Hadi S.; Abbaspour, Maghsoud; Moghadam, Mohsen Ebrahimi; Samei, Yasaman
Technological progress in the fields of Micro Electro-Mechanical Systems (MEMS) and wireless communications and also the availability of CMOS cameras, microphones and small-scale array sensors, which may ubiquitously capture multimedia content from the field, have fostered the development of low-cost limited resources Wireless Video-based Sensor Networks (WVSN). With regards to the constraints of video-based sensor nodes and wireless sensor networks, a supporting video stream is not easy to implement with the present sensor network protocols. In this paper, a thorough architecture is presented for video transmission over WVSN called Energy-efficient and high-Quality Video transmission Architecture (EQV-Architecture). This architecture influences three layers of communication protocol stack and considers wireless video sensor nodes constraints like limited process and energy resources while video quality is preserved in the receiver side. Application, transport, and network layers are the layers in which the compression protocol, transport protocol, and routing protocol are proposed respectively, also a dropping scheme is presented in network layer. Simulation results over various environments with dissimilar conditions revealed the effectiveness of the architecture in improving the lifetime of the network as well as preserving the video quality. PMID:27873772
Full Text Available Increase in number of elderly people who are living independently needs especial care in the form of healthcare monitoring systems. Recent advancements in depth video technologies have made human activity recognition (HAR realizable for elderly healthcare applications. In this paper, a depth video-based novel method for HAR is presented using robust multi-features and embedded Hidden Markov Models (HMMs to recognize daily life activities of elderly people living alone in indoor environment such as smart homes. In the proposed HAR framework, initially, depth maps are analyzed by temporal motion identification method to segment human silhouettes from noisy background and compute depth silhouette area for each activity to track human movements in a scene. Several representative features, including invariant, multi-view differentiation and spatiotemporal body joints features were fused together to explore gradient orientation change, intensity differentiation, temporal variation and local motion of specific body parts. Then, these features are processed by the dynamics of their respective class and learned, modeled, trained and recognized with specific embedded HMM having active feature values. Furthermore, we construct a new online human activity dataset by a depth sensor to evaluate the proposed features. Our experiments on three depth datasets demonstrated that the proposed multi-features are efficient and robust over the state of the art features for human action and activity recognition.
Full Text Available We investigate the video assignment problem of a hierarchical Video-on-Demand (VOD system in heterogeneous environments where different quality levels of videos can be encoded using either replication or layering. In such systems, videos are delivered to clients either through a proxy server or video broadcast/unicast channels. The objective of our work is to determine the appropriate coding strategy as well as the suitable delivery mechanism for a specific quality level of a video such that the overall system blocking probability is minimized. In order to find a near-optimal solution for such a complex video assignment problem, an evolutionary approach based on genetic algorithm (GA is proposed. From the results, it is shown that the system performance can be significantly enhanced by efficiently coupling the various techniques.
Aghito, Shankar Manuel; Stegmann, Mikkel Bille; Forchhammer, Søren
Aspects of video communication based on gaze interaction are considered. The overall idea is to use gaze interaction to control video, e.g. for video conferencing. Towards this goal, animation of a facial mask is demonstrated. The animation is based on images using Active Appearance Models (AAM...
Lee, Gil-Beom; Lee, Myeong-Jin; Lee, Woo-Kyung; Park, Joo-Heon; Kim, Tae-Hwan
Intelligent video surveillance systems detect pre-configured surveillance events through background modeling, foreground and object extraction, object tracking, and event detection. Shadow regions inside video frames sometimes appear as foreground objects, interfere with ensuing processes, and finally degrade the event detection performance of the systems. Conventional studies have mostly used intensity, color, texture, and geometric information to perform shadow detection in daytime video, but these methods lack the capability of removing shadows in nighttime video. In this paper, a novel shadow detection algorithm for nighttime video is proposed; this algorithm partitions each foreground object based on the object's vertical histogram and screens out shadow objects by validating their orientations heading toward regions of light sources. From the experimental results, it can be seen that the proposed algorithm shows more than 93.8% shadow removal and 89.9% object extraction rates for nighttime video sequences, and the algorithm outperforms conventional shadow removal algorithms designed for daytime videos.
Smithwick, Erica; Baxter, Emily; Kim, Kyung; Edel-Malizia, Stephanie; Rocco, Stevie; Blackstock, Dean
Two forms of interactive video were assessed in an online course focused on conservation. The hypothesis was that interactive video enhances student perceptions about learning and improves mental models of social-ecological systems. Results showed that students reported greater learning and attitudes toward the subject following interactive video.…
Cofield, Jay L.
Streaming video used as an augmentation in Web-based instruction was investigated to: (1) determine if demographic characteristics would lead to significantly different beliefs about the use and perceived effectiveness of streaming video, and (2) whether or not there are characteristics of streaming video that would lead to beliefs about the…
Corl, Frank M; Johnson, Pamela T; Rowell, Melissa R; Fishman, Elliot K
Video "podcasting" is an Internet-based publication and syndication technology that is defined as the process of capturing, editing, distributing, and downloading audio, video, and general multimedia productions. The expanded capacity for visual components allows radiologists to view still and animated media. These image-viewing characteristics and the ease of widespread delivery are well suited for radiologic education. This article presents detailed information about how to generate and distribute a video podcast using a Macintosh platform.
Yang, Jie Chi; Huang, Yi Ting; Tsai, Chi Cheng; Chung, Ching I.; Wu, Yu Chieh
In recent years, using video as a learning resource has received a lot of attention and has been successfully applied to many learning activities. In comparison with text-based learning, video learning integrates more multimedia resources, which usually motivate learners more than texts. However, one of the major limitations of video learning is…
Qu, Zhiguo; Chen, Siyi; Ji, Sai
As one of important multimedia forms in quantum network, quantum video attracts more and more attention of experts and scholars in the world. A secure quantum video steganography protocol with large payload based on the video strip encoding method called as MCQI (Multi-Channel Quantum Images) is proposed in this paper. The new protocol randomly embeds the secret information with the form of quantum video into quantum carrier video on the basis of unique features of video frames. It exploits to embed quantum video as secret information for covert communication. As a result, its capacity are greatly expanded compared with the previous quantum steganography achievements. Meanwhile, the new protocol also achieves good security and imperceptibility by virtue of the randomization of embedding positions and efficient use of redundant frames. Furthermore, the receiver enables to extract secret information from stego video without retaining the original carrier video, and restore the original quantum video as a follow. The simulation and experiment results prove that the algorithm not only has good imperceptibility, high security, but also has large payload.
Khosla, Deepak; Huber, David J.; Bhattacharyya, Rajan
In this paper, we describe an algorithm and system for optimizing search and detection performance for "items of interest" (IOI) in large-sized images and videos that employ the Rapid Serial Visual Presentation (RSVP) based EEG paradigm and surprise algorithms that incorporate motion processing to determine whether static or video RSVP is used. The system works by first computing a motion surprise map on image sub-regions (chips) of incoming sensor video data and then uses those surprise maps to label the chips as either "static" or "moving". This information tells the system whether to use a static or video RSVP presentation and decoding algorithm in order to optimize EEG based detection of IOI in each chip. Using this method, we are able to demonstrate classification of a series of image regions from video with an azimuth value of 1, indicating perfect classification, over a range of display frequencies and video speeds.
The video coding and distribution approach presented in this paper has two key characteristics that make it ideal for integration of video communication services over common broadband digital networks. The modular multi-resolution nature of the coding scheme provides the necessary flexibility to accommodate future advances in video technology as well as robust distribution over various network environments. This paper will present an efficient and scalable coding scheme for video communications. The scheme is capable of encoding and decoding video signals in a hierarchical, multilayer fashion to provide video at differing quality grades. Subsequently, the utilization of this approach to enable efficient bandwidth sharing and robust distribution of video signals in multipoint communications is presented. Coding and distribution architectures are discussed which include multi-party communications in a multi-window fashion within ATM environments. Furthermore, under the limited capabilities typical of wideband/broadband access networks, this architecture accommodates important video-based service applications such as Interactive Distance Learning.
Spyrou, Evaggelos; Iakovidis, Dimitris K.
The wireless capsule endoscope is a swallowable medical device equipped with a miniature camera enabling the visual examination of the gastrointestinal (GI) tract. It wirelessly transmits thousands of images to an external video recording system, while its location and orientation are being tracked approximately by external sensor arrays. In this paper we investigate a video-based approach to tracking the capsule endoscope without requiring any external equipment. The proposed method involves extraction of speeded up robust features from video frames, registration of consecutive frames based on the random sample consensus algorithm, and estimation of the displacement and rotation of interest points within these frames. The results obtained by the application of this method on wireless capsule endoscopy videos indicate its effectiveness and improved performance over the state of the art. The findings of this research pave the way for a cost-effective localization and travel distance measurement of capsule endoscopes in the GI tract, which could contribute in the planning of more accurate surgical interventions.
Gaydos, Matthew J.
This paper presents a series of studies detailing the research and development of the educational science video game "Citizen Science." It documents the design process, beginning with the initial grant and ending with a case study of two teachers who used the game in their classrooms. Following a design-based research approach, this…
Full Text Available This paper presents an H.323 standard compliant virtual video conferencing system. The proposed system not only serves as a multipoint control unit (MCU for multipoint connection but also provides a gateway function between the H.323 LAN (local-area network and the H.324 WAN (wide-area network users. The proposed virtual video conferencing system provides user-friendly object compositing and manipulation features including 2D video object scaling, repositioning, rotation, and dynamic bit-allocation in a 3D virtual environment. A reliable, and accurate scheme based on background image mosaics is proposed for real-time extracting and tracking foreground video objects from the video captured with an active camera. Chroma-key insertion is used to facilitate video objects extraction and manipulation. We have implemented a prototype of the virtual conference system with an integrated graphical user interface to demonstrate the feasibility of the proposed methods.
Allin, Thomas Højgaard; Neubert, Torsten; Laursen, Steen
at the Pic du Midi Observatory in Southern France, the system was operational during the period from July 18 to September 15, 2003. The video system, based two low-light, non-intensified CCD video cameras, was mounted on top of a motorized pan/tilt unit. The cameras and the pan/tilt unit were controlled over...
Mihajlovic, V.; Petkovic, M.
Content-based video retrieval is emerging as an important part in the process of utilization of various multimedia documents. In this report we present a novel system for the automatic indexing and content-based retrieval of multimedia documents. We chose the domain of Formula 1 sport videos because
Gudumasu, Srinivas; Hamza, Ahmed; Asbun, Eduardo; He, Yong; Ye, Yan
This paper proposes a layer based buffer aware rate adaptation design which is able to avoid abrupt video quality fluctuation, reduce re-buffering latency and improve bandwidth utilization when compared to a conventional simulcast based adaptive streaming system. The proposed adaptation design schedules DASH segment requests based on the estimated bandwidth, dependencies among video layers and layer buffer fullness. Scalable HEVC video coding is the latest state-of-art video coding technique that can alleviate various issues caused by simulcast based adaptive video streaming. With scalable coded video streams, the video is encoded once into a number of layers representing different qualities and/or resolutions: a base layer (BL) and one or more enhancement layers (EL), each incrementally enhancing the quality of the lower layers. Such layer based coding structure allows fine granularity rate adaptation for the video streaming applications. Two video streaming use cases are presented in this paper. The first use case is to stream HD SHVC video over a wireless network where available bandwidth varies, and the performance comparison between proposed layer-based streaming approach and conventional simulcast streaming approach is provided. The second use case is to stream 4K/UHD SHVC video over a hybrid access network that consists of a 5G millimeter wave high-speed wireless link and a conventional wired or WiFi network. The simulation results verify that the proposed layer based rate adaptation approach is able to utilize the bandwidth more efficiently. As a result, a more consistent viewing experience with higher quality video content and minimal video quality fluctuations can be presented to the user.
Full Text Available Technological progress in the fields of Micro Electro-Mechanical Systems (MEMS and wireless communications and also the availability of CMOS cameras, microphones and small-scale array sensors, which may ubiquitously capture multimedia content from the field, have fostered the development of low-cost limited resources Wireless Video-based Sensor Networks (WVSN. With regards to the constraints of videobased sensor nodes and wireless sensor networks, a supporting video stream is not easy to implement with the present sensor network protocols. In this paper, a thorough architecture is presented for video transmission over WVSN called Energy-efficient and high-Quality Video transmission Architecture (EQV-Architecture. This architecture influences three layers of communication protocol stack and considers wireless video sensor nodes constraints like limited process and energy resources while video quality is preserved in the receiver side. Application, transport, and network layers are the layers in which the compression protocol, transport protocol, and routing protocol are proposed respectively, also a dropping scheme is presented in network layer. Simulation results over various environments with dissimilar conditions revealed the effectiveness of the architecture in improving the lifetime of the network as well as preserving the video quality.
Aghdasi, Hadi S; Abbaspour, Maghsoud; Moghadam, Mohsen Ebrahimi; Samei, Yasaman
Technological progress in the fields of Micro Electro-Mechanical Systems (MEMS) and wireless communications and also the availability of CMOS cameras, microphones and small-scale array sensors, which may ubiquitously capture multimedia content from the field, have fostered the development of low-cost limited resources Wireless Video-based Sensor Networks (WVSN). With regards to the constraints of videobased sensor nodes and wireless sensor networks, a supporting video stream is not easy to implement with the present sensor network protocols. In this paper, a thorough architecture is presented for video transmission over WVSN called Energy-efficient and high-Quality Video transmission Architecture (EQV-Architecture). This architecture influences three layers of communication protocol stack and considers wireless video sensor nodes constraints like limited process and energy resources while video quality is preserved in the receiver side. Application, transport, and network layers are the layers in which the compression protocol, transport protocol, and routing protocol are proposed respectively, also a dropping scheme is presented in network layer. Simulation results over various environments with dissimilar conditions revealed the effectiveness of the architecture in improving the lifetime of the network as well as preserving the video quality.
Mikulec, Martin; Voznak, Miroslav; Safarik, Jakub; Partila, Pavol; Rozhon, Jan; Mehic, Miralem
The paper deals with presentation of the IVAS system within the 7FP EU INDECT project. The INDECT project aims at developing the tools for enhancing the security of citizens and protecting the confidentiality of recorded and stored information. It is a part of the Seventh Framework Programme of European Union. We participate in INDECT portal and the Interactive Video Audio System (IVAS). This IVAS system provides a communication gateway between police officers working in dispatching centre and police officers in terrain. The officers in dispatching centre have capabilities to obtain information about all online police officers in terrain, they can command officers in terrain via text messages, voice or video calls and they are able to manage multimedia files from CCTV cameras or other sources, which can be interesting for officers in terrain. The police officers in terrain are equipped by smartphones or tablets. Besides common communication, they can reach pictures or videos sent by commander in office and they can respond to the command via text or multimedia messages taken by their devices. Our IVAS system is unique because we are developing it according to the special requirements from the Police of the Czech Republic. The IVAS communication system is designed to use modern Voice over Internet Protocol (VoIP) services. The whole solution is based on open source software including linux and android operating systems. The technical details of our solution are presented in the paper.
DomŻał, Mariusz; Jedrasiak, Karol; Sobel, Dawid; Ryt, Artur; Nawrat, Aleksander
Video motion magnification (VMM) allows people see otherwise not visible subtle changes in surrounding world. VMM is also capable of hiding them with a modified version of the algorithm. It is possible to magnify motion related to breathing of patients in hospital to observe it or extinguish it and extract other information from stabilized image sequence for example blood flow. In both cases we would like to perform calculations in real time. Unfortunately, the VMM algorithm requires a great amount of computing power. In the article we suggest that VMM algorithm can be parallelized (each thread processes one pixel) and in order to prove that we implemented the algorithm on GPU using CUDA technology. CPU is used only to grab, write, display frame and schedule work for GPU. Each GPU kernel performs spatial decomposition, reconstruction and motion amplification. In this work we presented approach that achieves a significant speedup over existing methods and allow to VMM process video in real-time. This solution can be used as preprocessing for other algorithms in more complex systems or can find application wherever real time motion magnification would be useful. It is worth to mention that the implementation runs on most modern desktops and laptops compatible with CUDA technology.
In recent years, the proliferation of available video content and the popularity of the Internet have encouraged service providers to develop new ways of distributing content to clients. Increasing video scaling ratios and advanced digital signal processing techniques have led to Internet Video-on-Demand applications, but these currently lack efficiency and quality. Scalable Video on Demand: Adaptive Internet-based Distribution examines how current video compression and streaming can be used to deliver high-quality applications over the Internet. In addition to analysing the problems
Chang, Chia-Hu; Wu, Ja-Ling
With the development of content-based multimedia analysis, virtual content insertion has been widely used and studied for video enrichment and multimedia advertising. However, how to automatically insert a user-selected virtual content into personal videos in a less-intrusive manner, with an attractive representation, is a challenging problem. In this chapter, we present an evolution-based virtual content insertion system which can insert virtual contents into videos with evolved animations according to predefined behaviors emulating the characteristics of evolutionary biology. The videos are considered not only as carriers of message conveyed by the virtual content but also as the environment in which the lifelike virtual contents live. Thus, the inserted virtual content will be affected by the videos to trigger a series of artificial evolutions and evolve its appearances and behaviors while interacting with video contents. By inserting virtual contents into videos through the system, users can easily create entertaining storylines and turn their personal videos into visually appealing ones. In addition, it would bring a new opportunity to increase the advertising revenue for video assets of the media industry and online video-sharing websites.
Shahid, M.; Søgaard, Jacob; Pokhrel, J.
humans are considered to be the most valid method of the as- sessment of QoE. Besides lab-based subjective experiments, crowdsourcing based subjective assessment of video quality is gaining popularity as an alternative method. This paper presents insights into a study that investigates perceptual pref......- erences of various adaptive video streaming scenarios through crowdsourcing based subjective quality assessment....
Kuo, Yung-Lung; Lee, Jiann-Shu; Hsieh, Min-Chai
Eye and head movements evoked in response to obvious visual attention shifts. However, there has been little progress on the causes of absent-mindedness so far. The paper proposes an attention awareness system that captures the conditions regarding the interaction of eye gaze and head pose under various attentional switching in computer classroom.…
Bardram, Jakob Eyvind; Bossen, Claus; Lykke-Olesen, Andreas
Virtual studio technology enables the mixing of physical and digital 3D objects and thus expands the way of representing design ideas in terms of virtual video prototypes, which offers new possibilities for designers by combining elements of prototypes, mock-ups, scenarios, and conventional video....... In this article we report our initial experience in the domain of pervasive healthcare with producing virtual video prototypes and using them in a design workshop. Our experience has been predominantly favourable. The production of a virtual video prototype forces the designers to decide very concrete design...
Bardram, Jakob; Bossen, Claus; Lykke-Olesen, Andreas
Virtual studio technology enables the mixing of physical and digital 3D objects and thus expands the way of representing design ideas in terms of virtual video prototypes, which offers new possibilities for designers by combining elements of prototypes, mock-ups, scenarios, and conventional video....... In this article we report our initial experience in the domain of pervasive healthcare with producing virtual video prototypes and using them in a design workshop. Our experience has been predominantly favourable. The production of a virtual video prototype forces the designers to decide very concrete design...
Moreno, J; Ramos-Castro, J; Movellan, J; Parrado, E; Rodas, G; Capdevila, L
Our aim is to demonstrate the usefulness of photoplethysmography (PPG) for analyzing heart rate variability (HRV) using a standard 5-min test at rest with paced breathing, comparing the results with real RR intervals and testing supine and sitting positions. Simultaneous recordings of R-R intervals were conducted with a Polar system and a non-contact PPG, based on facial video recording on 20 individuals. Data analysis and editing were performed with individually designated software for each instrument. Agreement on HRV parameters was assessed with concordance correlations, effect size from ANOVA and Bland and Altman plots. For supine position, differences between video and Polar systems showed a small effect size in most HRV parameters. For sitting position, these differences showed a moderate effect size in most HRV parameters. A new procedure, based on the pixels that contained more heart beat information, is proposed for improving the signal-to-noise ratio in the PPG video signal. Results were acceptable in both positions but better in the supine position. Our approach could be relevant for applications that require monitoring of stress or cardio-respiratory health, such as effort/recuperation states in sports. © Georg Thieme Verlag KG Stuttgart · New York.
Lemma, Aweke; van der Veen, Michiel; Celik, Mehmet
Successful watermarking algorithms have already been developed for various applications ranging from meta-data tagging to forensic tracking. Nevertheless, it is commendable to develop alternative watermarking techniques that provide a broader basis for meeting emerging services, usage models and security threats. To this end, we propose a new multiplicative watermarking technique for video, which is based on the principles of our successful MASK audio watermark. Audio-MASK has embedded the watermark by modulating the short-time envelope of the audio signal and performed detection using a simple envelope detector followed by a SPOMF (symmetrical phase-only matched filter). Video-MASK takes a similar approach and modulates the image luminance envelope. In addition, it incorporates a simple model to account for the luminance sensitivity of the HVS (human visual system). Preliminary tests show algorithms transparency and robustness to lossy compression.
Video Streaming is nowadays the Internet’s biggest source of consumer traffic. Traditional content providers rely on centralised client-server model for distributing their video streaming content. The current generation is moving from being passive viewers, or content consumers, to active content
Qin, Jinlei; Li, Zheng; Niu, Yuguang
The article put forward a method which had been used for video image acquisition and processing, and a system based on Java media framework (JMF) had been implemented by it. The method could be achieved not only by B/S mode but also by C/S mode taking advantage of the predominance of the Java language. Some key issues such as locating video data source, playing video, video image acquisition and processing and so on had been expatiated in detail. The operation results of the system show that this method is fully compatible with common video capture device. At the same time the system possesses many excellences as lower cost, more powerful, easier to develop and cross-platform etc. Finally the application prospect of the method which is based on java and JMF is pointed out.
Video-based teaching material is a rich and powerful medium being used in computer assisted learning. This paper aimed to assess the learning outcomes and student nurses' acceptance and satisfaction with the video-based lectures versus the traditional method of teaching human anatomy and physiology courses.
Bourennane, Salah; Fossati, Caroline; Ketchantang, William
Among existing biometrics, iris recognition systems are among the most accurate personal biometric identification systems. However, the acquisition of a workable iris image requires strict cooperation of the user; otherwise, the image will be rejected by a verification module because of its poor quality, inducing a high false reject rate (FRR). The FRR may also increase when iris localization fails or when the pupil is too dilated. To improve the existing methods, we propose to use video sequences acquired in real time by a camera. In order to keep the same computational load to identify the iris, we propose a new method to estimate the iris characteristics. First, we propose a new iris texture characterization based on Fourier-Mellin transform, which is less sensitive to pupil dilatations than previous methods. Then, we develop a new iris localization algorithm that is robust to variations of quality (partial occlusions due to eyelids and eyelashes, light reflects, etc.), and finally, we introduce a fast and new criterion of suitable image selection from an iris video sequence for an accurate recognition. The accuracy of each step of the algorithm in the whole proposed recognition process is tested and evaluated using our own iris video database and several public image databases, such as CASIA, UBIRIS, and BATH.
Hanjalic, Alan; Ceccarelli, Marco; Lagendijk, Reginald L.; Biemond, Jan
In the European project SMASH mass-market storage systems for domestic use are under study. Besides the storage technology that is developed in this project, the related objective of user-friendly browsing/query of video data is studied as well. Key issues in developing a user-friendly system are (1) minimizing the user-intervention in preparatory steps (extraction and storage of representative information needed for browsing/query), (2) providing an acceptable representation of the stored video content in view of a higher automation level, (3) the possibility for performing these steps directly on the incoming stream at storage time, and (4) parameter-robustness of algorithms used for these steps. This paper proposes and validate novel approaches for automation of mentioned preparatory phases. A detection method for abrupt shot changes is proposed, using locally computed threshold based on a statistical model for frame-to-frame differences. For the extraction of representative frames (key frames) an approach is presented which distributes a given number of key frames over the sequence depending on content changes in a temporal segment of the sequence. A multimedia database is introduced, able to automatically store all bibliographic information about a recorded video as well as a visual representation of the content without any manual intervention from the user.
Ryoo, M. S.; Matthies, Larry
In this evaluation paper, we discuss convolutional neural network (CNN)-based approaches for human activity recognition. In particular, we investigate CNN architectures designed to capture temporal information in videos and their applications to the human activity recognition problem. There have been multiple previous works to use CNN-features for videos. These include CNNs using 3-D XYT convolutional filters, CNNs using pooling operations on top of per-frame image-based CNN descriptors, and recurrent neural networks to learn temporal changes in per-frame CNN descriptors. We experimentally compare some of these different representatives CNNs while using first-person human activity videos. We especially focus on videos from a robots viewpoint, captured during its operations and human-robot interactions.
Hulsman, Robert L; van der Vloodt, Jane
Self-evaluation and peer-feedback are important strategies within the reflective practice paradigm for the development and maintenance of professional competencies like medical communication. Characteristics of the self-evaluation and peer-feedback annotations of medical students' video recorded communication skills were analyzed. Twenty-five year 4 medical students recorded history-taking consultations with a simulated patient, uploaded the video to a web-based platform, marked and annotated positive and negative events. Peers reviewed the video and self-evaluations and provided feedback. Analyzed were the number of marked positive and negative annotations and the amount of text entered. Topics and specificity of the annotations were coded and analyzed qualitatively. Students annotated on average more negative than positive events. Additional peer-feedback was more often positive. Topics most often related to structuring the consultation. Students were most critical about their biomedical topics. Negative annotations were more specific than positive annotations. Self-evaluations were more specific than peer-feedback and both show a significant correlation. Four response patterns were detected that negatively bias specificity assessment ratings. Teaching students to be more specific in their self-evaluations may be effective for receiving more specific peer-feedback. Videofragmentrating is a convenient tool to implement reflective practice activities like self-evaluation and peer-feedback to the classroom in the teaching of clinical skills. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
Bao, Tianlong; Ding, Chunhui; Karmoshi, Saleem; Zhu, Ming
Face recognition has been widely studied recently while video-based face recognition still remains a challenging task because of the low quality and large intra-class variation of video captured face images. In this paper, we focus on two scenarios of video-based face recognition: 1)Still-to-Video(S2V) face recognition, i.e., querying a still face image against a gallery of video sequences; 2)Video-to-Still(V2S) face recognition, in contrast to S2V scenario. A novel method was proposed in this paper to transfer still and video face images to an Euclidean space by a carefully designed convolutional neural network, then Euclidean metrics are used to measure the distance between still and video images. Identities of still and video images that group as pairs are used as supervision. In the training stage, a joint loss function that measures the Euclidean distance between the predicted features of training pairs and expanding vectors of still images is optimized to minimize the intra-class variation while the inter-class variation is guaranteed due to the large margin of still images. Transferred features are finally learned via the designed convolutional neural network. Experiments are performed on COX face dataset. Experimental results show that our method achieves reliable performance compared with other state-of-the-art methods.
Abdollahian, Golnaz; Delp, Edward J.
Although considerable work has been done in management of "structured" video such as movies, sports, and television programs that has known scene structures, "unstructured" video analysis is still a challenging problem due to its unrestricted nature. The purpose of this paper is to address issues in the analysis of unstructured video and in particular video shot by a typical unprofessional user (i.e home video). We describe how one can make use of camera motion information for unstructured video analysis. A new concept, "camera viewing direction," is introduced as the building block of home video analysis. Motion displacement vectors are employed to temporally segment the video based on this concept. We then find the correspondence between the camera behavior with respect to the subjective importance of the information in each segment and describe how different patterns in the camera motion can indicate levels of interest in a particular object or scene. By extracting these patterns, the most representative frames, keyframes, for the scenes are determined and aggregated to summarize the video sequence.
Full Text Available Due to the development of action cameras, the use of video technology for collecting geo-spatial data becomes an important trend. The objective of this study is to compare the image-mode and video-mode of multiple action cameras for 3D point clouds generation. Frame images are acquired from discrete camera stations while videos are taken from continuous trajectories. The proposed method includes five major parts: (1 camera calibration, (2 video conversion and alignment, (3 orientation modelling, (4 dense matching, and (5 evaluation. As the action cameras usually have large FOV in wide viewing mode, camera calibration plays an important role to calibrate the effect of lens distortion before image matching. Once the camera has been calibrated, the author use these action cameras to take video in an indoor environment. The videos are further converted into multiple frame images based on the frame rates. In order to overcome the time synchronous issues in between videos from different viewpoints, an additional timer APP is used to determine the time shift factor between cameras in time alignment. A structure form motion (SfM technique is utilized to obtain the image orientations. Then, semi-global matching (SGM algorithm is adopted to obtain dense 3D point clouds. The preliminary results indicated that the 3D points from 4K video are similar to 12MP images, but the data acquisition performance of 4K video is more efficient than 12MP digital images.
Full Text Available We consider multiuser video transmission for users that are non-equidistantly positioned from base station. We propose a greedy algorithm for video streaming in a wireless system with capacity achieving channel coding, that implements the cross-layer principle by partially separating the physical and the application layer. In such a system the parameters at the physical layer are dependent on the packet length and the conditions in the wireless channel and the parameters at the application layer are dependent on the reduction of the expected distortion assuming no packet errors in the system. We also address the fairness in the multiuser video system with non-equidistantly positioned users. Our fairness algorithm is based on modified opportunistic round robin scheduling. We evaluate the performance of the proposed algorithms by simulating the transmission of H.264/AVC video signals in a TDMA wireless system.
Ömer Faruk VURAL
Full Text Available Video-based learning has been extensively incorporated to enhance instruction. The advanced communication technology has greatly increased the possibilities and relative value of delivering instructional video content in onlineeducation applications. Simple watching instructional video often results in poor learning outcomes. Therefore, current video-based learning resources are used in combination with other teaching methods. Concept mapping, one of teaching methods, can provide another form of this type of interactivity and may enhance the active learning capacity. The new learning tool, which consisted of video viewer, supporting text, and interactive concept map, was developed to investigate the effect of time spent interacting with the learning tool by creating concept maps relate to student achievement. The study results showed that there was no relationship found between student achievement and time spent interacting with the learning tool
Miller, Scott; Redman, S.
Students in online courses continue to lag their peers in comparable face-to-face (F2F) courses (Ury 2004, Slater & Jones 2004). A meta-study of web-based vs. classroom instruction by Sitzmann et al (2006) discovered that the degree of learner control positively influences the effectiveness of instruction: students do better when they are in control of their own learning. In particular, web-based courses are more effective when they incorporate a larger variety of instructional methods. To address this need, we developed a series of online videos to demonstrate various astronomical concepts and provided them to students enrolled in an online introductory astronomy course at Penn State University. We found that the online students performed worse than the F2F students on questions unrelated to the videos (t = -2.84), but that the online students who watched the videos performed better than the F2F students on related examination questions (t = 2.11). We also found that the online students who watched the videos performed significantly better than those who did not (t = 3.43). While the videos in general proved helpful, some videos were more helpful than others. We will discuss our thoughts on why this might be, and future plans to improve upon this study. These videos are freely available on iTunesU, YouTube, and Google Video.
Full Text Available Abstract Today's video surveillance systems are increasingly equipped with video content analysis for a great variety of applications. However, reliability and robustness of video content analysis algorithms remain an issue. They have to be measured against ground truth data in order to quantify the performance and advancements of new algorithms. Therefore, a variety of measures have been proposed in the literature, but there has neither been a systematic overview nor an evaluation of measures for specific video analysis tasks yet. This paper provides a systematic review of measures and compares their effectiveness for specific aspects, such as segmentation, tracking, and event detection. Focus is drawn on details like normalization issues, robustness, and representativeness. A software framework is introduced for continuously evaluating and documenting the performance of video surveillance systems. Based on many years of experience, a new set of representative measures is proposed as a fundamental part of an evaluation framework.
Full Text Available Today's video surveillance systems are increasingly equipped with video content analysis for a great variety of applications. However, reliability and robustness of video content analysis algorithms remain an issue. They have to be measured against ground truth data in order to quantify the performance and advancements of new algorithms. Therefore, a variety of measures have been proposed in the literature, but there has neither been a systematic overview nor an evaluation of measures for specific video analysis tasks yet. This paper provides a systematic review of measures and compares their effectiveness for specific aspects, such as segmentation, tracking, and event detection. Focus is drawn on details like normalization issues, robustness, and representativeness. A software framework is introduced for continuously evaluating and documenting the performance of video surveillance systems. Based on many years of experience, a new set of representative measures is proposed as a fundamental part of an evaluation framework.
de Jong, Franciska M.G.; Gauvain, Jean-Luc; den Hartog, Jurgen; den Hartog, Jeremy; Netter, Klaus
This paper describes the Olive project which aims to support automated indexing of video material by use of human language technologies. Olive is making use of speech recognition to automatically derive transcriptions of the sound tracks, generating time-coded linguistic elements which serve as the
Shi, Qisen; Song, Jianxin
With the rapid development of mobile live business, transcoding HD video is often a challenge for mobile devices due to their limited processing capability and bandwidth-constrained network connection. For live service providers, it's wasteful for resources to delay lots of transcoding server because some of them are free to work sometimes. To deal with this issue, this paper proposed an Openstack-based flexible transcoding framework to achieve real-time video adaption for mobile device and make computing resources used efficiently. To this end, we introduced a special method of video stream splitting and VMs resource scheduling based on access pressure prediction,which is forecasted by an AR model.
This paper researches LSB matching VSA based on H265 protocol with the research background of 26 original Video sequences, it firstly extracts classification features out from training samples as input of SVM, and trains in SVM to obtain high-quality category classification model, and then tests whether there is suspicious information in the video sample. The experimental results show that VSA algorithm based on LSB matching can be more practical to obtain all frame embedded secret information and carrier and video of local frame embedded. In addition, VSA adopts the method of frame by frame with a strong robustness in resisting attack in the corresponding time domain.
Angehrn, Albert; Maxwell, Katrina
Angehrn, A. A., & Maxwell, K. (2008). TENTube: A video-based connection tool supporting competence development. In H. W. Sligte & R. Koper (Eds.), Proceedings of the 4th TENCompetence Open Workshop. Empowering Learners for Lifelong Competence Development: pedagogical, organisational and
Bae, Kyung-Hoon; Kim, Seung-Cheol; Hwang, Yong Seok; Kim, Eun-Soo
In this paper, variable disparity-motion estimation (VDME) based 3-view video coding is proposed. In the encoding, key-frame coding (KFC) based motion estimation and variable disparity estimation (VDE) for effectively fast three-view video encoding are processed. These proposed algorithms enhance the performance of 3-D video encoding/decoding system in terms of accuracy of disparity estimation and computational overhead. From some experiments, stereo sequences of 'Pot Plant' and 'IVO', it is shown that the proposed algorithm's PSNRs is 37.66 and 40.55 dB, and the processing time is 0.139 and 0.124 sec/frame, respectively.
Full Text Available Most universities are already implementing wired and wireless network that is used to access integrated information systems and the Internet. At present it is important to do research on the influence of the broadcasting system through the access point for video transmitter learning in the university area. At every university computer network through the access point must also use the cable in its implementation. These networks require cables that will connect and transmit data from one computer to another computer. While wireless networks of computers connected through radio waves. This research will be a test or assessment of how the influence of the network using the WLAN access point for video broadcasting means learning from the server to the client. Instructional video broadcasting from the server to the client via the access point will be used for video broadcasting means of learning. This study aims to understand how to build a wireless network by using an access point. It also builds a computer server as instructional videos supporting software that can be used for video server that will be emitted by broadcasting via the access point and establish a system of transmitting video from the server to the client via the access point.
Ozil, Ipek; Plawecki, Martin H; Doerschuk, Peter C; O'Connor, Sean J
The influence of family history and genetics on the risk for the development of abuse or dependence is a major theme in alcoholism research. Recent research have used endophenotypes and behavioral paradigms to help detect further genetic contributions to this disease. Electronic tasks, essentially video games, which provide alcohol as a reward in controlled environments and with specified exposures have been developed to explore some of the behavioral and subjective characteristics of individuals with or at risk for alcohol substance use disorders. A generative model (containing parameters with unknown values) of a simple game involving a progressive work paradigm is described along with the associated point process signal processing that allows system identification of the model. The system is demonstrated on human subject data. The same human subject completing the task under different circumstances, e.g., with larger and smaller alcohol reward values, is assigned different parameter values. Potential meanings of the different parameter values are described.
Pappas, Ilias O.; Giannakos, Michail N.; Mikalef, Patrick
The use of video-based open educational resources is widespread, and includes multiple approaches to implementation. In this paper, the term "with-video assignments" is introduced to portray video learning resources enhanced with assignments. The goal of this study is to examine the factors that influence students' intention to adopt…
Full Text Available Abstract Interest in 3D video applications and systems is growing rapidly and technology is maturating. It is expected that multiview autostereoscopic displays will play an important role in home user environments, since they support multiuser 3D sensation and motion parallax impression. The tremendous data rate cannot be handled efficiently by representation and coding formats such as MVC or MPEG-C Part 3. Multiview video plus depth (MVD is a new format that efficiently supports such advanced 3DV systems, but this requires high-quality intermediate view synthesis. For this, a new approach is presented that separates unreliable image regions along depth discontinuities from reliable image regions, which are treated separately and fused to the final interpolated view. In contrast to previous layered approaches, our algorithm uses two boundary layers and one reliable layer, performs image-based 3D warping only, and was generically implemented, that is, does not necessarily rely on 3D graphics support. Furthermore, different hole-filling and filtering methods are added to provide high-quality intermediate views. As a result, high-quality intermediate views for an existing 9-view auto-stereoscopic display as well as other stereo- and multiscopic displays are presented, which prove the suitability of our approach for advanced 3DV systems.
Mohamed M. Fouad
Full Text Available In this paper, we present a modified inter-view prediction Multiview Video Coding (MVC scheme from the perspective of viewer's interactivity. When a viewer requests some view(s, our scheme leads to lower transmission bit-rate. We develop an interactive multiview video streaming system exploiting that modified MVC scheme. Conventional interactive multiview video systems require high bandwidth due to redundant data being transferred. With real data test sequences, clear improvements are shown using the proposed interactive multiview video system compared to competing ones in terms of the average transmission bit-rate and storage size of the decoded (i.e., transferred data with comparable rate-distortion.
B. D. Reljin
Full Text Available Extracting video shots is an essential preprocessing step to almost all video analysis, indexing, and other content-based operations. This process is equivalent to detecting the shot boundaries in a video. In this paper we presents video Shot Boundary Detection (SBD based on Multifractal Analysis (MA. Low-level features (color and texture features are extracted from each frame in video sequence. Features are concatenated in feature vectors (FVs and stored in feature matrix. Matrix rows correspond to FVs of frames from video sequence, while columns are time series of particular FV component. Multifractal analysis is applied to FV component time series, and shot boundaries are detected as high singularities of time series above pre defined treshold. Proposed SBD method is tested on real video sequence with 64 shots, with manually labeled shot boundaries. Detection accuracy depends on number FV components used. For only one FV component detection accuracy lies in the range 76-92% (depending on selected threshold, while by combining two FV components all shots are detected completely (accuracy of 100%.
Shopovska, Ivana; Jovanov, Ljubomir; Goossens, Bart; Philips, Wilfried
High dynamic range (HDR) image generation from a number of differently exposed low dynamic range (LDR) images has been extensively explored in the past few decades, and as a result of these efforts a large number of HDR synthesis methods have been proposed. Since HDR images are synthesized by combining well-exposed regions of the input images, one of the main challenges is dealing with camera or object motion. In this paper we propose a method for the synthesis of HDR video from a single camera using multiple, differently exposed video frames, with circularly alternating exposure times. One of the potential applications of the system is in driver assistance systems and autonomous vehicles, involving significant camera and object movement, non- uniform and temporally varying illumination, and the requirement of real-time performance. To achieve these goals simultaneously, we propose a HDR synthesis approach based on weighted averaging of aligned radiance maps. The computational complexity of high-quality optical flow methods for motion compensation is still pro- hibitively high for real-time applications. Instead, we rely on more efficient global projective transformations to solve camera movement, while moving objects are detected by thresholding the differences between the trans- formed and brightness adapted images in the set. To attain temporal consistency of the camera motion in the consecutive HDR frames, the parameters of the perspective transformation are stabilized over time by means of computationally efficient temporal filtering. We evaluated our results on several reference HDR videos, on synthetic scenes, and using 14-bit raw images taken with a standard camera.
Zhao, Baoquan; Xu, Songhua; Lin, Shujin; Luo, Xiaonan; Duan, Lian
... shall be “Open Video System Notice of Intent” and “Attention: Media Bureau.” This wording shall be... Notice of Intent with the Office of the Secretary and the Bureau Chief, Media Bureau. The Notice of... capacity through a fair, open and non-discriminatory process; the process must be insulated from any bias...
Rusanovskyy, Dmytro; Hannuksela, Miska M.; Su, Wenyi
This paper describes a novel approach of using depth information for advanced coding of associated video data in Multiview Video plus Depth (MVD)-based 3D video systems. As a possible implementation of this conception, we describe two coding tools that have been developed for H.264/AVC based 3D Video Codec as response to Moving Picture Experts Group (MPEG) Call for Proposals (CfP). These tools are Depth-based Motion Vector Prediction (DMVP) and Backward View Synthesis Prediction (BVSP). Simulation results conducted under JCT-3V/MPEG 3DV Common Test Conditions show, that proposed in this paper tools reduce bit rate of coded video data by 15% of average delta bit rate reduction, which results in 13% of bit rate savings on total for the MVD data over the state-of-the-art MVC+D coding. Moreover, presented in this paper conception of depth-based coding of video has been further developed by MPEG 3DV and JCT-3V and this work resulted in even higher compression efficiency, bringing about 20% of delta bit rate reduction on total for coded MVD data over the reference MVC+D coding. Considering significant gains, proposed in this paper coding approach can be beneficial for development of new 3D video coding standards. [Figure not available: see fulltext.
Full Text Available A novel video conference system is developed. Suppose that three people A, B, and C attend the video conference, the proposed system enables eye contact among every pair. Furthermore, when B and C chat, A feels as if B and C were facing each other (eye contact seems to be kept among B and C. In the case of a triangle video conference, the respective video system is composed of a half mirror, two video cameras, and two monitors. Each participant watches other participants' images that are reflected by the half mirror. Cameras are set behind the half mirror. Since participants' image (face and the camera position are adjusted to be the same direction, eye contact is kept and conversation becomes very natural compared with conventional video conference systems where participants' eyes do not point to the other participant. When 3 participants sit at the vertex of an equilateral triangle, eyes can be kept even for the situation mentioned above (eye contact between B and C from the aspect of A. Eye contact can be kept not only for 2 or 3 participants but also any number of participants as far as they sit at the vertex of a regular polygon.
Dagnaes-Hansen, Julia; Mahmood, Oria; Bube, Sarah
OBJECTIVE: Direct observation in assessment of clinical skills is prone to bias, demands the observer to be present at a certain location at a specific time, and is time-consuming. Video-based assessment could remove the risk of bias, increase flexibility, and reduce the time spent on assessment....... This study investigated if video-based assessment was a reliable tool for cystoscopy and if direct observers were prone to bias compared with video-raters. DESIGN: This study was a blinded observational trial. Twenty medical students and 9 urologists were recorded during 2 cystoscopies and rated by a direct...... observer and subsequently by 2 blinded video-raters on a global rating scale (GRS) for cystoscopy. Both intrarater and interrater reliability were explored. Furthermore, direct observer bias was explored by a paired samples t-test. RESULTS: Intrarater reliability calculated by Pearson's r was 0...
Tarman, Lisa; Ferris, Pamella; Bruder, Dan; Gilligan, Nick; Morgan, James; Delooper, John
The Princeton Plasma Physics Laboratory in collaboration with teachers and students has developed a variety of educational outreach activities to demonstrate basic physical science principles using a variety of instructive and entertaining science demonstrations at elementary, middle and high school levels. A new effort has been initiated to demonstrate this material using web based videos. This can be a resource for the teacher, parent or student. In addition, the video is coupled to web-based lesson plans for the educator. The goal of these videos and lesson plans is to create enthusiasm and stimulate curiosity among students and teachers, showing them that science is interesting, fun, and within their grasp. The first video demonstrates a number of activities one can accomplish using a half-coated fluorescent tube. Additional demonstrations will be added over time.
Albert A Angehrn
Full Text Available The vast majority of knowledge management initiatives fail because they do not take sufficiently into account the emotional, psychological and social needs of individuals. Only if users see real value for themselves will they actively use and contribute their own knowledge to the system, and engage with other users. Connection dynamics can make this easier, and even enjoyable, by connecting people and bringing them closer through shared experiences such as playing a game together. A higher connectedness of people to other people, and to relevant knowledge assets, will motivate them to participate more actively and increase system usage. In this paper, we describe the design of TENTube, a video-based connection tool we are developing to support competence development. TENTube integrates rich profiling and network visualization and navigation with agent-enhanced game-like connection dynamics.
Full Text Available A software platform is exposed, which was developed to enable demonstration and capacity testing. The platform simulates a joint optimized wireless video transmission. The development succeeded within the frame of the IST-PHOENIX project and is based on the system optimization model of the project. One of the constitutive parts of the model, the wireless network segment, is changed to a detailed, standard UTRA network simulation module. This paper consists of (1 a brief description of the projects simulation chain, (2 brief description of the UTRAN system, and (3 the integration of the two segments. The role of the UTRAN part in the joint optimization is described, with the configuration and control of this element. Finally, some simulation results are shown. In the conclusion, we show how our simulation results translate into real-world performance gains.
Giroire, Frédéric; Huin, Nicolas
International audience; —We study distributed systems for live video streaming. These systems can be of two types: structured and un-structured. In an unstructured system, the diffusion is done opportunistically. The advantage is that it handles churn, that is the arrival and departure of users, which is very high in live streaming systems, in a smooth way. On the opposite, in a structured system, the diffusion of the video is done using explicit diffusion trees. The advantage is that the dif...
Shi, Xu-li; Jin, Zhi-cheng; Teng, Guo-wei; Zhang, Zhao-yang; An, Ping; Xiao, Guang
It is a hot focus of current researches in video standards that how to transmit video streams over Internet and wireless networks. One of the key methods is FGS(Fine-Granular-Scalability), which can always adapt to the network bandwidth varying but with some sacrifice of coding efficiency, is supported by MPEG-4. Object-based video coding algorithm has been firstly included in MPEG-4 standard that can be applied in interactive video. However, the real time segmentation of VOP(video object plan) is difficult that limit the application of MPEG-4 standard in interactive video. H.264/AVC is the up-to-date video-coding standard, which enhance compression performance and provision a network-friendly video representation. In this paper, we proposed a new Object Based FGS(OBFGS) coding algorithm embedded in H.264/AVC that is different from that in mpeg-4. After the algorithms optimization for the H.264 encoder, the FGS first finish the base-layer coding. Then extract moving VOP using the base-layer information of motion vectors and DCT coefficients. Sparse motion vector field of p-frame composed of 4*4 blocks, 4*8 blocks and 8*4 blocks in base-layer is interpolated. The DCT coefficient of I-frame is calculated by using information of spatial intra-prediction. After forward projecting each p-frame vector to the immediate adjacent I-frame, the method extracts moving VOPs (video object plan) using a recursion 4*4 block classification process. Only the blocks that belong to the moving VOP in 4*4 block-level accuracy is coded to produce enhancement-layer stream. Experimental results show that our proposed system can obtain high interested VOP quality at the cost of fewer coding efficiency.
Al Hadhrami, Tawfik; Nightingale, James M.; Wang, Qi; Grecos, Christos
In emergency situations, the ability to remotely monitor unfolding events using high-quality video feeds will significantly improve the incident commander's understanding of the situation and thereby aids effective decision making. This paper presents a novel, adaptive video monitoring system for emergency situations where the normal communications network infrastructure has been severely impaired or is no longer operational. The proposed scheme, operating over a rapidly deployable wireless mesh network, supports real-time video feeds between first responders, forward operating bases and primary command and control centers. Video feeds captured on portable devices carried by first responders and by static visual sensors are encoded in H.264/SVC, the scalable extension to H.264/AVC, allowing efficient, standard-based temporal, spatial, and quality scalability of the video. A three-tier video delivery system is proposed, which balances the need to avoid overuse of mesh nodes with the operational requirements of the emergency management team. In the first tier, the video feeds are delivered at a low spatial and temporal resolution employing only the base layer of the H.264/SVC video stream. Routing in this mode is designed to employ all nodes across the entire mesh network. In the second tier, whenever operational considerations require that commanders or operators focus on a particular video feed, a `fidelity control' mechanism at the monitoring station sends control messages to the routing and scheduling agents in the mesh network, which increase the quality of the received picture using SNR scalability while conserving bandwidth by maintaining a low frame rate. In this mode, routing decisions are based on reliable packet delivery with the most reliable routes being used to deliver the base and lower enhancement layers; as fidelity is increased and more scalable layers are transmitted they will be assigned to routes in descending order of reliability. The third tier
Craciunescu, Razvan; Mihovska, Albena Dimitrova; Kyriazakos, Sofoklis
Gesture recognition is a topic in computer science and language technology with the goal of interpreting human gestures via mathematical algorithms. Gestures can originate from any bodily motion or state but commonly originate from the face or hand. Current research focus includes on the emotion...... recognition from the face and hand gesture recognition. Gesture recognition enables humans to communicate with the machine and interact naturally without any mechanical devices. This paper investigates the possibility to use non-audio/video sensors in order to design a low-cost gesture recognition device...
Full Text Available Turning a city into a smart city has attracted considerable attention. A smart city can be seen as a city that uses digital technology not only to improve the quality of people’s life, but also, to have a positive impact in the environment and, at the same time, offer efficient and easy-to-use services. A fundamental aspect to be considered in a smart city is people’s safety and welfare, therefore, having a good security system becomes a necessity, because it allows us to detect and identify potential risk situations, and then take appropriate decisions to help people or even prevent criminal acts. In this paper we present an architecture for automated video surveillance based on the cloud computing schema capable of acquiring a video stream from a set of cameras connected to the network, process that information, detect, label and highlight security-relevant events automatically, store the information and provide situational awareness in order to minimize response time to take the appropriate action.
Valentín, L.; Serrano, S. A.; Oves García, R.; Andrade, A.; Palacios-Alonso, M. A.; Sucar, L. Enrique
Turning a city into a smart city has attracted considerable attention. A smart city can be seen as a city that uses digital technology not only to improve the quality of people's life, but also, to have a positive impact in the environment and, at the same time, offer efficient and easy-to-use services. A fundamental aspect to be considered in a smart city is people's safety and welfare, therefore, having a good security system becomes a necessity, because it allows us to detect and identify potential risk situations, and then take appropriate decisions to help people or even prevent criminal acts. In this paper we present an architecture for automated video surveillance based on the cloud computing schema capable of acquiring a video stream from a set of cameras connected to the network, process that information, detect, label and highlight security-relevant events automatically, store the information and provide situational awareness in order to minimize response time to take the appropriate action.
Su, Ang; Zhang, Yueqiang; Dong, Jing; Xu, Yuhua; Zhu, Xianwei; Zhang, Xiaohu
The high portability of small Unmanned Aircraft Vehicles (UAVs) makes them play an important role in surveillance and reconnaissance tasks, so the military and civilian desires for UAVs are constantly growing. Recently, we have developed a real-time video exploitation system for our small UAV which is mainly used in forest patrol tasks. Our system consists of six key models, including image contrast enhancement, video stabilization, mosaicing, salient target indication, moving target indication, and display of the footprint and flight path on map. Extensive testing on the system has been implemented and the result shows our system performed well.
Full Text Available Wireless local area networks (WLANs such as IEEE 802.11a/g utilise numerous transmission modes, each providing different throughputs and reliability levels. Most link adaptation algorithms proposed in the literature (i maximise the error-free data throughput, (ii do not take into account the content of the data stream, and (iii rely strongly on the use of ARQ. Low-latency applications, such as real-time video transmission, do not permit large numbers of retransmission. In this paper, a novel link adaptation scheme is presented that improves the quality of service (QoS for video transmission. Rather than maximising the error-free throughput, our scheme minimises the video distortion of the received sequence. With the use of simple and local rate distortion measures and end-to-end distortion models at the video encoder, the proposed scheme estimates the received video distortion at the current transmission rate, as well as on the adjacent lower and higher rates. This allows the system to select the link-speed which offers the lowest distortion and to adapt to the channel conditions. Simulation results are presented using the MPEG-4/AVC H.264 video compression standard over IEEE 802.11g. The results show that the proposed system closely follows the optimum theoretic solution.
Kapustin, A. A.; Razumovskii, V. N.; Iatsevich, G. B.
A spatial-spectral analysis method is considered for a laser scanning video system with the phase processing of a received signal, on a modulation frequency. Distortions caused by the system are analyzed, and a general problem is reduced for the case of a cylindrical surface. The approach suggested can also be used for scanning microwave systems.
Cheng, Hui; Butler, Darren
Aerial surveillance has long been used by the military to locate, monitor and track the enemy. Recently, its scope has expanded to include law enforcement activities, disaster management and commercial applications. With the ever-growing amount of aerial surveillance video acquired daily, there is an urgent need for extracting actionable intelligence in a timely manner. Furthermore, to support high-level video understanding, this analysis needs to go beyond current approaches and consider the relationships, motivations and intentions of the objects in the scene. In this paper we propose a system for interpreting aerial surveillance videos that automatically generates a succinct but meaningful description of the observed regions, objects and events. For a given video, the semantics of important regions and objects, and the relationships between them, are summarised into a semantic concept graph. From this, a textual description is derived that provides new search and indexing options for aerial video and enables the fusion of aerial video with other information modalities, such as human intelligence, reports and signal intelligence. Using a Mixture-of-Experts video segmentation algorithm an aerial video is first decomposed into regions and objects with predefined semantic meanings. The objects are then tracked and coerced into a semantic concept graph and the graph is summarized spatially, temporally and semantically using ontology guided sub-graph matching and re-writing. The system exploits domain specific knowledge and uses a reasoning engine to verify and correct the classes, identities and semantic relationships between the objects. This approach is advantageous because misclassifications lead to knowledge contradictions and hence they can be easily detected and intelligently corrected. In addition, the graph representation highlights events and anomalies that a low-level analysis would overlook.
Hu, Yue-Yung; Mazer, Laura M; Yule, Steven J; Arriaga, Alexander F; Greenberg, Caprice C; Lipsitz, Stuart R; Gawande, Atul A; Smink, Douglas S
Surgical expertise demands technical and nontechnical skills. Traditionally, surgical trainees acquired these skills in the operating room; however, operative time for residents has decreased with duty hour restrictions. As in other professions, video analysis may help maximize the learning experience. To develop and evaluate a postoperative video-based coaching intervention for residents. In this mixed methods analysis, 10 senior (postgraduate year 4 and 5) residents were videorecorded operating with an attending surgeon at an academic tertiary care hospital. Each video formed the basis of a 1-hour one-on-one coaching session conducted by the operative attending; although a coaching framework was provided, participants determined the specific content collaboratively. Teaching points were identified in the operating room and the video-based coaching sessions; iterative inductive coding, followed by thematic analysis, was performed. Teaching points made in the operating room were compared with those in the video-based coaching sessions with respect to initiator, content, and teaching technique, adjusting for time. Among 10 cases, surgeons made more teaching points per unit time (63.0 vs 102.7 per hour) while coaching. Teaching in the video-based coaching sessions was more resident centered; attendings were more inquisitive about residents' learning needs (3.30 vs 0.28, P = .04), and residents took more initiative to direct their education (27% [198 of 729 teaching points] vs 17% [331 of 1977 teaching points], P coaching is a novel and feasible modality for supplementing intraoperative learning. Objective evaluation demonstrates that video-based coaching may be particularly useful for teaching higher-level concepts, such as decision making, and for individualizing instruction and feedback to each resident.
Full Text Available The recent development of three dimensional (3D display technologies has resulted in a proliferation of 3D video production and broadcasting, attracting a lot of research into capture, compression and delivery of stereoscopic content. However, the predominant design practice of interactions with 3D video content has failed to address its differences and possibilities in comparison to the existing 2D video interactions. This paper presents a study of user requirements related to interaction with the stereoscopic 3D video. The study suggests that the change of view, zoom in/out, dynamic video browsing, and textual information are the most relevant interactions with stereoscopic 3D video. In addition, we identified a strong demand for object selection that resulted in a follow-up study of user preferences in 3D selection using virtual-hand and ray-casting metaphors. These results indicate that interaction modality affects users’ decision of object selection in terms of chosen location in 3D, while user attitudes do not have significant impact. Furthermore, the ray-casting-based interaction modality using Wiimote can outperform the volume-based interaction modality using mouse and keyboard for object positioning accuracy.
... system operator may charge different rates to different classes of video programming providers, provided... open video systems. 76.1504 Section 76.1504 Telecommunication FEDERAL COMMUNICATIONS COMMISSION (CONTINUED) BROADCAST RADIO SERVICES MULTICHANNEL VIDEO AND CABLE TELEVISION SERVICE Open Video Systems § 76...
Full Text Available Intelligent video surveillance systems detect pre-configured surveillance events through background modeling, foreground and object extraction, object tracking, and event detection. Shadow regions inside video frames sometimes appear as foreground objects, interfere with ensuing processes, and finally degrade the event detection performance of the systems. Conventional studies have mostly used intensity, color, texture, and geometric information to perform shadow detection in daytime video, but these methods lack the capability of removing shadows in nighttime video. In this paper, a novel shadow detection algorithm for nighttime video is proposed; this algorithm partitions each foreground object based on the object’s vertical histogram and screens out shadow objects by validating their orientations heading toward regions of light sources. From the experimental results, it can be seen that the proposed algorithm shows more than 93.8% shadow removal and 89.9% object extraction rates for nighttime video sequences, and the algorithm outperforms conventional shadow removal algorithms designed for daytime videos.
Song, Joongseok; Kim, Changseob; Park, Hanhoon; Park, Jong-Il
We propose a practical system that can effectively mix the depth data of real and virtual objects by using a Z buffer and can quickly generate digital mixed reality video holograms by using multiple graphic processing units (GPUs). In an experiment, we verify that real objects and virtual objects can be merged naturally in free viewing angles, and the occlusion problem is well handled. Furthermore, we demonstrate that the proposed system can generate mixed reality video holograms at 7.6 frames per second. Finally, the system performance is objectively verified by users' subjective evaluations.
Fog, Benedikte; Ulfkjær, Jacob Kanneworff Stigsen; Schlichter, Bjarne Rerup
not sufficiently reflect the theoretical recommendations of using video optimally in a management education. It did not comply with the video learning sequence as introduced by Marx and Frost (1998). However, it questions if the level of cognitive orientation activities can become too extensive. It finds......The study of business information systems has become increasingly important in the Digital Economy. However, it has been found that students have difficulties understanding the practical implications thereof and this leads to a motivational decreases. This study aims to investigate how to optimize...... the use of video to increase comprehension of the practical implications of studying business information systems. This qualitative study is based on observations and focus group interviews with first semester business students. The findings suggest that the video examined in the case study did...
Galiano, Vicente; López-Granado, Otoniel; Malumbres, Manuel P.; Migallón, Hector
Three-dimensional wavelet transform (3D-DWT) encoders are good candidates for applications like professional video editing, video surveillance, multi-spectral satellite imaging, etc. where a frame must be reconstructed as quickly as possible. In this paper, we present a new 3D-DWT video encoder based on a fast run-length coding engine. Furthermore, we present several multicore optimizations to speed-up the 3D-DWT computation. An exhaustive evaluation of the proposed encoder (3D-GOP-RL) has been performed, and we have compared the evaluation results with other video encoders in terms of rate/distortion (R/D), coding/decoding delay, and memory consumption. Results show that the proposed encoder obtains good R/D results for high-resolution video sequences with nearly in-place computation using only the memory needed to store a group of pictures. After applying the multicore optimization strategies over the 3D DWT, the proposed encoder is able to compress a full high-definition video sequence in real-time.
Chan, Lap Ki; Patil, Nivritti G; Chen, Julie Y; Lam, Jamie C M; Lau, Chak S; Ip, Mary S M
Traditionally, paper cases are used as 'triggers' to stimulate learning in problem-based learning (PBL). However, video may be a better medium because it preserves the original language, encourages the active extraction of information, avoids depersonalization of patients and allows direct observation of clinical consultations. In short, it exposes the students to the complexity of actual clinical problems. The study aims to find out whether students and facilitators who are accustomed to paper cases would prefer video triggers or paper cases and the reasons for their preference. After students and facilitators had completed a video PBL tutorial, their responses were measured by a structured questionnaire using a modified Likert scale. A total of 257 students (92%) and 26 facilitators (100%) responded. The majority of students and facilitators considered that using video triggers could enhance the students' observational powers and clinical reasoning, help them to integrate different information and better understand the cases and motivate them to learn. They found PBL using video triggers more interesting and preferred it to PBL using paper cases. Video triggers are preferred by both students and facilitators over paper cases in PBL.
Chen, Chien-Hsu; Chou, Yin-Ju
This study focuses on development of augmented video system on traditional picture postcards. The system will provide users to print out the augmented reality marker on the sticker to stick on the picture postcard, and it also allows users to record their real time image and video to augment on that stick marker. According dynamic image, users can share travel moods, greeting, and travel experience to their friends. Without changing in the traditional picture postcards, we develop augmented video system on them by augmented reality (AR) technology. It not only keeps the functions of traditional picture postcards, but also enhances user's experience to keep the user's memories and emotional expression by augmented digital media information on them.
Rothkrantz, L.; Lefter, I.
The paper describes a surveillance system of cameras installed at lamppost of a military area. The surveillance system has been designed to detect unwanted visitors or suspicious behaviors. The area is composed of streets, building blocks and surrounded by gates and water. The video recordings are
Full Text Available With the growing popularity of personal digital assistant devices and smart phones, more and more consumers are becoming quite enthusiastic to appreciate videos via mobile devices. However, limited display size of the mobile devices has been imposing significant barriers for users to enjoy browsing high-resolution videos. In this paper, we present an attention-information-based spatial adaptation framework to address this problem. The whole framework includes two major parts: video content generation and video adaptation system. During video compression, the attention information in video sequences will be detected using an attention model and embedded into bitstreams with proposed supplement-enhanced information (SEI structure. Furthermore, we also develop an innovative scheme to adaptively adjust quantization parameters in order to simultaneously improve the quality of overall encoding and the quality of transcoding the attention areas. When the high-resolution bitstream is transmitted to mobile users, a fast transcoding algorithm we developed earlier will be applied to generate a new bitstream for attention areas in frames. The new low-resolution bitstream containing mostly attention information, instead of the high-resolution one, will be sent to users for display on the mobile devices. Experimental results show that the proposed spatial adaptation scheme is able to improve both subjective and objective video qualities.
Ecliptic Enterprises Corporation, headquartered in Pasadena, California, provided onboard video systems for rocket and space shuttle launches before it was tasked by Ames Research Center to craft the Data Handling Unit that would control sensor instruments onboard the Lunar Crater Observation and Sensing Satellite (LCROSS) spacecraft. The technological capabilities the company acquired on this project, as well as those gained developing a high-speed video system for monitoring the parachute deployments for the Orion Pad Abort Test Program at Dryden Flight Research Center, have enabled the company to offer high-speed and high-definition video for geosynchronous satellites and commercial space missions, providing remarkable footage that both informs engineers and inspires the imagination of the general public.
Full Text Available Vision is a complex process that integrates multiple aspects of an image: spatial frequencies, topology and colour. Unfortunately, so far, all these elements were independently took into consideration for the development of image and video quality metrics, therefore we propose an approach that blends together all of them. Our approach allows for the analysis of the complexity of colour images in the RGB colour space, based on the probabilistic algorithm for calculating the fractal dimension and lacunarity. Given that all the existing fractal approaches are defined only for gray-scale images, we extend them to the colour domain. We show how these two colour fractal features capture the multiple aspects that characterize the degradation of the video signal, based on the hypothesis that the quality degradation perceived by the user is directly proportional to the modification of the fractal complexity. We claim that the two colour fractal measures can objectively assess the quality of the video signal and they can be used as metrics for the user-perceived video quality degradation and we validated them through experimental results obtained for an MPEG-4 video streaming application; finally, the results are compared against the ones given by unanimously-accepted metrics and subjective tests.
Dow, Ximeng Y; Sullivan, Shane Z; Muir, Ryan D; Simpson, Garth J
A fast (up to video rate) two-photon excited fluorescence lifetime imaging system based on interleaved digitization is demonstrated. The system is compatible with existing beam-scanning microscopes with minor electronics and software modification. Proof-of-concept demonstrations were performed using laser dyes and biological tissue.
Jones, D. P.; Shirey, D. L.; Amai, W. A.
This paper presents a high bandwidth fiber-optic communication system intended for post accident recovery of weapons. The system provides bi-directional multichannel, and multi-media communications. Two smaller systems that were developed as direct spin-offs of the larger system are also briefly discussed.
Jones, D.P.; Shirey, D.L.; Amai, W.A.
This paper presents a high bandwidth fiber-optic communication system intended for post accident recovery of weapons. The system provides bi-directional multichannel, and multi-media communications. Two smaller systems that were developed as direct spin-offs of the larger system are also briefly discussed.
Worring, M.; Snoek, C.G.M.; de Rooij, O.; Nguyen, G.P.; Koelma, D.C.
In this paper we present the methods and visualizations used in the MediaMill video search engine. The basis for the engine is a semantic indexing process which derives a lexicon of 101 concepts. To support the user in navigating the collection, the system defines a visual similarity space, a
Ørngreen, Rikke; Levinsen, Karin Ellen Tweddell; Jelsbak, Vibe Alopaeus
The Bachelor Programme in Biomedical Laboratory Analysis at VIA's healthcare university college in Aarhus has established a blended class which combines traditional and live broadcast teaching (via an innovative choice of video conferencing system). On the so-called net-days, students have....... From here a number of general principles and perspective were derived for the specific program which can be useful to contemplate in general for similar educations. It is concluded that the blended class model using live video stream represents a viable pedagogical solution for the Bachelor Programme...... sheds light on the pedagogical challenges, the educational designs possible, the opportunities and constrains associated with video conferencing as a pedagogical practice, as well as the technological, structural and organisational conditions involved. In this paper a participatory action research...
Kottmann, R.; Ratmeyer, V.; Pop Ristov, A.; Boetius, A.
More and more seagoing scientific expeditions use video-controlled research platforms such as Remote Operating Vehicles (ROV), Autonomous Underwater Vehicles (AUV), and towed camera systems. These produce many hours of video material which contains detailed and scientifically highly valuable footage of the biological, chemical, geological, and physical aspects of the oceans. Many of the videos contain unique observations of unknown life-forms which are rare, and which cannot be sampled and studied otherwise. To make such video material online accessible and to create a collaborative annotation environment the "Video Annotation and processing platform" (V-App) was developed. A first solely web-based installation for ROV videos is setup at the German Center for Marine Environmental Sciences (available at http://videolib.marum.de). It allows users to search and watch videos with a standard web browser based on the HTML5 standard. Moreover, V-App implements social web technologies allowing a distributed world-wide scientific community to collaboratively annotate videos anywhere at any time. It has several features fully implemented among which are: • User login system for fine grained permission and access control • Video watching • Video search using keywords, geographic position, depth and time range and any combination thereof • Video annotation organised in themes (tracks) such as biology and geology among others in standard or full screen mode • Annotation keyword management: Administrative users can add, delete, and update single keywords for annotation or upload sets of keywords from Excel-sheets • Download of products for scientific use This unique web application system helps making costly ROV videos online available (estimated cost range between 5.000 - 10.000 Euros per hour depending on the combination of ship and ROV). Moreover, with this system each expert annotation adds instantaneous available and valuable knowledge to otherwise uncharted
... COMMISSION In the Matter of Certain Video Analytics Software, Systems, Components Thereof, and Products... analytics software, systems, components thereof, and products containing same by reason of infringement of... after importation of certain video analytics software, systems, components thereof, and products...
... COMMISSION Certain Video Analytics Software, Systems, Components Thereof, and Products Containing Same... analytics software, systems, components thereof, and products containing same by reason of infringement of... after importation of certain video analytics software, systems, components thereof, and products...
... COMMISSION Certain Video Analytics Software, Systems, Components Thereof, and Products Containing Same... Trade Commission has received a complaint entitled Certain Video Analytics Software, Systems, Components... analytics software, systems, components thereof, and products containing same. The complaint names as...
Galiano, Vicente; López-Granado, Otoniel; Malumbres, Manuel P.; Drummond, Leroy Anthony; Migallón, Hector
The 3D-DWT is a mathematical tool of increasing importance in those applications that require an efficient processing of huge amounts of volumetric info. Other applications like professional video editing, video surveillance applications, multi-spectral satellite imaging, HQ video delivery, etc, would rather use 3D-DWT encoders to reconstruct a frame as fast as possible. In this article, we introduce a fast GPU-based encoder which uses 3D-DWT transform and lower trees. Also, we present an exhaustive analysis of the use of GPU memory. Our proposal shows good trade off between R/D, coding delay (as fast as MPEG-2 for High definition) and memory requirements (up to 6 times less memory than x264).
Zhao, Fan; Yao, Zao; Song, XiaoFang; Yao, Yi
Image and video dehazing is a popular topic in the field of computer vision and digital image processing. A fast, optimized dehazing algorithm was recently proposed that enhances contrast and reduces flickering artifacts in a dehazed video sequence by minimizing a cost function that makes transmission values spatially and temporally coherent. However, its fixed-size block partitioning leads to block effects. Further, the weak edges in a hazy image are not addressed. Hence, a video dehazing algorithm based on customized spectral clustering is proposed. To avoid block artifacts, the spectral clustering is customized to segment static scenes to ensure the same target has the same transmission value. Assuming that dehazed edge images have richer detail than before restoration, an edge cost function is added to the ransmission model. The experimental results demonstrate that the proposed method provides higher dehazing quality and lower time complexity than the previous technique.
Gramss, Denise; Struve, Doreen
The study reported in this paper investigated the usefulness of different instructions for guiding inexperienced older adults through interactive systems. It was designed to compare different media in relation to their social as well as their motivational impact on the elderly during the learning process. Precisely, the video was compared with…
Glazkov, V. D.; Goretov, Iu. M.; Rozhavskii, E. I.; Shcherbakov, V. V.
The self-correcting video section of the satellite-borne Fragment multispectral scanning system is described. This section scheme makes possible a sufficiently efficient equalization of the transformation coefficients of all the measuring sections in the presence of a reference-radiation source and a single reference time interval for all the sections.
White, Mark G.
A video-based format was used during a graduate seminar course designed to educate students on the nature of catalysis, to help transfer information among students working on similar problems, and to improve communication skills. The mechanics of and student reaction to this seminar course are discussed. (JN)
Winston, Karin; Grendarova, Petra; Rabi, Doreen
This study reviews the published literature on the use of video-based decision aids (DA) for patients. The authors describe the areas of medicine in which video-based patient DA have been evaluated, the medical decisions targeted, their reported impact, in which countries studies are being conducted, and publication trends. The literature review was conducted systematically using Medline, Embase, CINAHL, PsychInfo, and Pubmed databases from inception to 2016. References of identified studies were reviewed, and hand-searches of relevant journals were conducted. 488 studies were included and organized based on predefined study characteristics. The most common decisions addressed were cancer screening, risk reduction, advance care planning, and adherence to provider recommendations. Most studies had sample sizes of fewer than 300, and most were performed in the United States. Outcomes were generally reported as positive. This field of study was relatively unknown before 1990s but the number of studies published annually continues to increase. Videos are largely positive interventions but there are significant remaining knowledge gaps including generalizability across populations. Clinicians should consider incorporating video-based DA in their patient interactions. Future research should focus on less studied areas and the mechanisms underlying effective patient decision aids. Copyright © 2017 Elsevier B.V. All rights reserved.
Adhikari, Sam; Stark, David E
This paper presents a video-based eye-tracking method, ideally deployed via a mobile device or laptop-based webcam, as a tool for measuring brain function. Eye movements and pupillary motility are tightly regulated by brain circuits, are subtly perturbed by many disease states, and are measurable using video-based methods. Quantitative measurement of eye movement by readily available webcams may enable early detection and diagnosis, as well as remote/serial monitoring, of neurological and neuropsychiatric disorders. We successfully extracted computational and semantic features for 14 testing sessions, comprising 42 individual video blocks and approximately 17,000 image frames generated across several days of testing. Here, we demonstrate the feasibility of collecting video-based eye-tracking data from a standard webcam in order to assess psychomotor function. Furthermore, we were able to demonstrate through systematic analysis of this data set that eye-tracking features (in particular, radial and tangential variance on a circular visual-tracking paradigm) predict performance on well-validated psychomotor tests. © 2017 New York Academy of Sciences.
Full Text Available Aiming at improving the video visual resolution quality and details clarity, a novel learning-based video superresolution reconstruction algorithm using spatiotemporal nonlocal similarity is proposed in this paper. Objective high-resolution (HR estimations of low-resolution (LR video frames can be obtained by learning LR-HR correlation mapping and fusing spatiotemporal nonlocal similarities between video frames. With the objective of improving algorithm efficiency while guaranteeing superresolution quality, a novel visual saliency-based LR-HR correlation mapping strategy between LR and HR patches is proposed based on semicoupled dictionary learning. Moreover, aiming at improving performance and efficiency of spatiotemporal similarity matching and fusion, an improved spatiotemporal nonlocal fuzzy registration scheme is established using the similarity weighting strategy based on pseudo-Zernike moment feature similarity and structural similarity, and the self-adaptive regional correlation evaluation strategy. The proposed spatiotemporal fuzzy registration scheme does not rely on accurate estimation of subpixel motion, and therefore it can be adapted to complex motion patterns and is robust to noise and rotation. Experimental results demonstrate that the proposed algorithm achieves competitive superresolution quality compared to other state-of-the-art algorithms in terms of both subjective and objective evaluations.
Full Text Available In this work we introduce a simple client-server system architecture and algorithms for ubiquitous live video and VOD service support. The main features of the system are: efficient usage of network resources, emphasis on user personalization, and ease of implementation. The system supports many continuous service requirements such as QoS provision, user mobility between networks and between different communication devices, and simultaneous usage of a device by a number of users.
Bodo, Yann; Laurent, Nathalie; Laurent, Christophe; Dugelay, Jean-Luc
With the popularity of high-bandwidth modems and peer-to-peer networks, the contents of videos must be highly protected from piracy. Traditionally, the models utilized to protect this kind of content are scrambling and watermarking. While the former protects the content against eavesdropping (a priori protection), the latter aims at providing a protection against illegal mass distribution (a posteriori protection). Today, researchers agree that both models must be used conjointly to reach a sufficient level of security. However, scrambling works generally by encryption resulting in an unintelligible content for the end-user. At the moment, some applications (such as e-commerce) may require a slight degradation of content so that the user has an idea of the content before buying it. In this paper, we propose a new video protection model, called waterscrambling, whose aim is to give such a quality degradation-based security model. This model works in the compressed domain and disturbs the motion vectors, degrading the video quality. It also allows embedding of a classical invisible watermark enabling protection against mass distribution. In fact, our model can be seen as an intermediary solution to scrambling and watermarking.
Full Text Available With the popularity of high-bandwidth modems and peer-to-peer networks, the contents of videos must be highly protected from piracy. Traditionally, the models utilized to protect this kind of content are scrambling and watermarking. While the former protects the content against eavesdropping (a priori protection, the latter aims at providing a protection against illegal mass distribution (a posteriori protection. Today, researchers agree that both models must be used conjointly to reach a sufficient level of security. However, scrambling works generally by encryption resulting in an unintelligible content for the end-user. At the moment, some applications (such as e-commerce may require a slight degradation of content so that the user has an idea of the content before buying it. In this paper, we propose a new video protection model, called waterscrambling, whose aim is to give such a quality degradation-based security model. This model works in the compressed domain and disturbs the motion vectors, degrading the video quality. It also allows embedding of a classical invisible watermark enabling protection against mass distribution. In fact, our model can be seen as an intermediary solution to scrambling and watermarking.
Full Text Available This work presents a fall detection system that is based on image processing technology. The system can detect falling by various humans via analysis of video frame. First, the system utilizes the method of mixture and Gaussian background model to generate information about the background, and the noise and shadow of background are eliminated to extract the possible positions of moving objects. The extraction of a foreground image generates more noise and damage. Therefore, morphological and size filters are utilized to eliminate this noise and repair the damage to the image. Extraction of the foreground image yields the locations of human heads in the image. The median point, height, and aspect ratio of the people in the image are calculated. These characteristics are utilized to trace objects. The change of the characteristics of objects among various consecutive images can be used to evaluate those persons enter or leave the scene. The method of fall detection uses the height and aspect ratio of the human body, analyzes the image in which one person overlaps with another, and detects whether a human has fallen or not. Experimental results demonstrate that the proposed method can efficiently detect falls by multiple persons.
Hulsman, Robert L.; van der Vloodt, Jane
Objective: Self-evaluation and peer-feedback are important strategies within the reflective practice paradigm for the development and maintenance of professional competencies like medical communication. Characteristics of the self-evaluation and peer-feedback annotations of medical students' video
Du, Bangshi; Qi, Feng; Shao, Sujie; Wang, Ying; Li, Weijian
Video conference system has become an important support platform for smart grid operation and management, its operation quality is gradually concerning grid enterprise. First, the evaluation indicator system covering network, business and operation maintenance aspects was established on basis of video conference system's operation statistics. Then, the operation quality assessment model combining genetic algorithm with regularized BP neural network was proposed, which outputs operation quality level of the system within a time period and provides company manager with some optimization advice. The simulation results show that the proposed evaluation model offers the advantages of fast convergence and high prediction accuracy in contrast with regularized BP neural network, and its generalization ability is superior to LM-BP neural network and Bayesian BP neural network.
Zhou, Peipei; Ding, Qinghai; Luo, Haibo; Hou, Xinglin
Violent interaction detection is of vital importance in some video surveillance scenarios like railway stations, prisons or psychiatric centres. Existing vision-based methods are mainly based on hand-crafted features such as statistic features between motion regions, leading to a poor adaptability to another dataset. En lightened by the development of convolutional networks on common activity recognition, we construct a FightNet to represent the complicated visual violence interaction. In this paper, a new input modality, image acceleration field is proposed to better extract the motion attributes. Firstly, each video is framed as RGB images. Secondly, optical flow field is computed using the consecutive frames and acceleration field is obtained according to the optical flow field. Thirdly, the FightNet is trained with three kinds of input modalities, i.e., RGB images for spatial networks, optical flow images and acceleration images for temporal networks. By fusing results from different inputs, we conclude whether a video tells a violent event or not. To provide researchers a common ground for comparison, we have collected a violent interaction dataset (VID), containing 2314 videos with 1077 fight ones and 1237 no-fight ones. By comparison with other algorithms, experimental results demonstrate that the proposed model for violent interaction detection shows higher accuracy and better robustness.
In the latest Joint Video Exploration Team development, the quadtree plus binary tree (QTBT) block partitioning structure has been proposed for future video coding. Compared to the traditional quadtree structure of High Efficiency Video Coding (HEVC) standard, QTBT provides more flexible patterns for splitting the blocks, which results in dramatically increased combinations of block partitions and high computational complexity. In view of this, a confidence interval based early termination (CIET) scheme is proposed for QTBT to identify the unnecessary partition modes in the sense of rate-distortion (RD) optimization. In particular, a RD model is established to predict the RD cost of each partition pattern without the full encoding process. Subsequently, the mode decision problem is casted into a probabilistic framework to select the final partition based on the confidence interval decision strategy. Experimental results show that the proposed CIET algorithm can speed up QTBT block partitioning structure by reducing 54.7% encoding time with only 1.12% increase in terms of bit rate. Moreover, the proposed scheme performs consistently well for the high resolution sequences, of which the video coding efficiency is crucial in real applications.
Kong, Hyoun-Joong; Seo, Jong Mo; Hwang, Jeong Min; Kim, Hee Chan
Binocular indirect ophthalmoscope (BIO) provides a wider view of fundus with stereopsis contrary to the direct one. Proposed system is composed of portable BIO and 3D viewing unit. The illumination unit of BIO utilized high flux LED as a light source, LED condensing lens cap for beam focusing, color filters and small lithium ion battery. In optics unit of BIO, beam splitter was used to distribute an examinee's fundus image both to examiner's eye and to CMOS camera module attached to device. Captured retinal video stream data from stereo camera modules were sent to PC through USB 2.0 connectivity. For 3D viewing, two video streams having parallax between them were aligned vertically and horizontally and made into side-by-side video stream for cross-eyed stereoscopy. And the data were converted into autostereoscopic video stream using vertical interlacing for stereoscopic LCD which has glass 3D filter attached to the front side of it. Our newly devised system presented the real-time 3-D view of fundus to assistants with less dizziness than cross-eyed stereoscopy. And the BIO showed good performance compared to conventional portable BIO (Spectra Plus, Keeler Limited, Windsor, UK).
Ahmed, Mohamed; Karmouch, Ahmed
Video information, image processing, and computer vision techniques are developing rapidly because of the availability of acquisition, processing, and editing tools that use current hardware and software systems. However, problems still remain in conveying this video data to the end users. Limiting factors are the resource capabilities in distributed architectures and the features of the users' terminals. The efficient use of image processing, video indexing, and analysis techniques can provide users with solutions or alternatives. We see the video stream as a sequence of correlated images containing in its structure temporal events such as camera editing effects and presents a new algorithm for achieving video segmentation, indexing, and key framing tasks. The algorithm is based on color histograms and uses a binary penetration technique. Although much has been done in this area, most work does not adequately consider the optimization of timing performance and processing storage. This is especially the case if the techniques are designed for use in run-time distributed environments. Our main contribution is to blend high performance and storage criteria with the need to achieve effective results. The algorithm exploits the temporal heuristic characteristic of the visual information within a video stream. It takes into consideration the issues of detecting false cuts and missing true cuts due to the movement of the camera, the optical flow of large objects, or both. We provide a discussion, together with results from experiments and from the implementation of our application, to show the merits of the new algorithm as compared to the existing one.
... From the Federal Register Online via the Government Publishing Office INTERNATIONAL TRADE COMMISSION Certain Video Analytics Software, Systems, Components Thereof, and Products Containing Same... the United States after importation of certain video analytics software systems, components thereof...
... From the Federal Register Online via the Government Publishing Office INTERNATIONAL TRADE COMMISSION Investigations: Terminations, Modifications and Rulings: Certain Video Game Systems and... United States after importation of certain video game systems and controllers by reason of infringement...
Allin, T.; Neubert, T.; Laursen, S.; Rasmussen, I. L.; Soula, S.
In support for global ELF/VLF observations, HF measurements in France, and conjugate photometry/VLF observations in South Africa, we developed and operated a semi-automatic, remotely controlled video system for the observation of middle-atmospheric transient luminous events (TLEs). Installed at the Pic du Midi Observatory in Southern France, the system was operational during the period from July 18 to September 15, 2003. The video system, based two low-light, non-intensified CCD video cameras, was mounted on top of a motorized pan/tilt unit. The cameras and the pan/tilt unit were controlled over serial links from a local computer, and the video outputs were distributed to a pair of PCI frame grabbers in the computer. This setup allowed remote users to log in and operate the system over the internet. Event detection software provided means of recording and time-stamping single TLE video fields and thus eliminated the need for continuous human monitoring of TLE activity. The computer recorded and analyzed two parallel video streams at the full 50 Hz field rate, while uploading status images, TLE images, and system logs to a remote web server. The system detected more than 130 TLEs - mostly sprites - distributed over 9 active evenings. We have thus demonstrated the feasibility of remote agents for TLE observations, which are likely to find use in future ground-based TLE observation campaigns, or to be installed at remote sites in support for space-borne or other global TLE observation efforts.
Shi, Yu; Yan, Guozheng; Zhu, Bingquan; Liu, Gang
Wireless power transmission (WPT) technology can solve the energy shortage problem of the video capsule endoscope (VCE) powered by button batteries, but the fixed platform limited its clinical application. This paper presents a portable WPT system for VCE. Besides portability, power transfer efficiency and stability are considered as the main indexes of optimization design of the system, which consists of the transmitting coil structure, portable control box, operating frequency, magnetic core and winding of receiving coil. Upon the above principles, the correlation parameters are measured, compared and chosen. Finally, through experiments on the platform, the methods are tested and evaluated. In the gastrointestinal tract of small pig, the VCE is supplied with sufficient energy by the WPT system, and the energy conversion efficiency is 2.8%. The video obtained is clear with a resolution of 320×240 and a frame rate of 30 frames per second. The experiments verify the feasibility of design scheme, and further improvement direction is discussed.
© 2013 IEEE. The existing efforts in computer assisted semen analysis have been focused on high speed imaging and automated image analysis of sperm motility. This results in a large amount of data, and it is extremely challenging for both clinical scientists and researchers to interpret, compare and correlate the multidimensional and time-varying measurements captured from video data. In this work, we use glyphs to encode a collection of numerical measurements taken at a regular interval and to summarize spatio-temporal motion characteristics using static visual representations. The design of the glyphs addresses the needs for (a) encoding some 20 variables using separable visual channels, (b) supporting scientific observation of the interrelationships between different measurements and comparison between different sperm cells and their flagella, and (c) facilitating the learning of the encoding scheme by making use of appropriate visual abstractions and metaphors. As a case study, we focus this work on video visualization for computer-aided semen analysis, which has a broad impact on both biological sciences and medical healthcare. We demonstrate that glyph-based visualization can serve as a means of external memorization of video data as well as an overview of a large set of spatiotemporal measurements. It enables domain scientists to make scientific observation in a cost-effective manner by reducing the burden of viewing videos repeatedly, while providing them with a new visual representation for conveying semen statistics.
Full Text Available In this paper, an efficient, highly imperceptible, robust, and secure digital video watermarking technique for content authentication based on Hilbert transform in the Integer Wavelet Transform (IWT domain has been introduced. The Hilbert coefficients of gray watermark image are embedded into the cover video frames Hilbert coefficients on the 2-level IWT decomposed selected block on sub-bands using Principal Component Analysis (PCA technique. The authentication is achieved by using the digital signature mechanism. This mechanism is used to generate and embed a digital signature after embedding the watermarks. Since, the embedding process is done in Hilbert transform domain, the imperceptibility and the robustness of the watermark is greatly improved. At the receiver end, prior to the extraction of watermark, the originality of the content is verified through the authentication test. If the generated and received signature matches, it proves that the received content is original and performs the extraction process, otherwise deny the extraction process due to unauthenticated received content. The proposed method avoids typical degradations in the imperceptibility level of watermarked video in terms of Average Peak Signal – to – Noise Ratio (PSNR value of about 48db, while it is still providing better robustness against common video distortions such as frame dropping, averaging, and various image processing attacks such as noise addition, median filtering, contrast adjustment, and geometrical attacks such as, rotation and cropping in terms of Normalized Correlation Coefficient (NCC value of about nearly 1.
Full Text Available In this paper, we proposed a secure video codec based on the discrete wavelet transformation (DWT and the Advanced Encryption Standard (AES processor. Either, use of video coding with DWT or encryption using AES is well known. However, linking these two designs to achieve secure video coding is leading. The contributions of our work are as follows. First, a new method for image and video compression is proposed. This codec is a synthesis of JPEG and JPEG2000,which is implemented using Huffman coding to the JPEG and DWT to the JPEG2000. Furthermore, an improved motion estimation algorithm is proposed. Second, the encryptiondecryption effects are achieved by the AES processor. AES is aim to encrypt group of LL bands. The prominent feature of this method is an encryption of LL bands by AES-128 (128-bit keys, or AES-192 (192-bit keys, or AES-256 (256-bit keys.Third, we focus on a method that implements partial encryption of LL bands. Our approach provides considerable levels of security (key size, partial encryption, mode encryption, and has very limited adverse impact on the compression efficiency. The proposed codec can provide up to 9 cipher schemes within a reasonable software cost. Latency, correlation, PSNR and compression rate results are analyzed and shown.
Duffy, Brian; Thiyagalingam, Jeyarajan; Walton, Simon; Smith, David J; Trefethen, Anne; Kirkman-Brown, Jackson C; Gaffney, Eamonn A; Min Chen
The existing efforts in computer assisted semen analysis have been focused on high speed imaging and automated image analysis of sperm motility. This results in a large amount of data, and it is extremely challenging for both clinical scientists and researchers to interpret, compare and correlate the multidimensional and time-varying measurements captured from video data. In this work, we use glyphs to encode a collection of numerical measurements taken at a regular interval and to summarize spatio-temporal motion characteristics using static visual representations. The design of the glyphs addresses the needs for (a) encoding some 20 variables using separable visual channels, (b) supporting scientific observation of the interrelationships between different measurements and comparison between different sperm cells and their flagella, and (c) facilitating the learning of the encoding scheme by making use of appropriate visual abstractions and metaphors. As a case study, we focus this work on video visualization for computer-aided semen analysis, which has a broad impact on both biological sciences and medical healthcare. We demonstrate that glyph-based visualization can serve as a means of external memorization of video data as well as an overview of a large set of spatiotemporal measurements. It enables domain scientists to make scientific observation in a cost-effective manner by reducing the burden of viewing videos repeatedly, while providing them with a new visual representation for conveying semen statistics.
Krüger, Volker; Zhou, Shaohua; Chellappa, Rama
-temporal relations: This allows the system to use dynamics as well as to generate warnings when 'implausible' situations occur or to circumvent these altogether. We have studied the effectiveness of temporal integration for recognition purposes by using the face recognition as an example problem. Face recognition...... is a prominent problem and has been studied more extensively than almost any other recognition problem. An observation is that face recognition works well in ideal conditions. If those conditions, however, are not met, then all present algorithms break down disgracefully. This probelm appears to be general...... will use the face recognition problem as a study example. Probabilistic methods are attractive in this context as they allow a systematic handling of uncertainty and an elegant way for fusing temporal information....
Ukhanova, Ann; Forchhammer, Søren
This paper discusses the question of video codec enhancement for wireless video transmission of high definition video data taking into account constraints on memory and complexity. Starting from parameter adjustment for JPEG2000 compression algorithm used for wireless transmission and achieving...
Smalley, Daniel E.; Smithwick, Quinn Y. J.; Bove, V. Michael, Jr.
We introduce a new holo-video display architecture ("Mark III") developed at the MIT Media Laboratory. The goal of the Mark III project is to reduce the cost and size of a holo-video display, making it into an inexpensive peripheral to a standard desktop PC or game machine which can be driven by standard graphics chips. Our new system is based on lithium niobate guided-wave acousto-optic devices, which give twenty or more times the bandwidth of the tellurium dioxide bulk-wave acousto-optic modulators of our previous displays. The novel display architecture is particularly designed to eliminate the high-speed horizontal scanning mechanism that has traditionally limited the scalability of Scophony- style video displays. We describe the system architecture and the guided-wave device, explain how it is driven by a graphics chip, and present some early results.
... COMMISSION In the Matter of: Certain Video Game Systems and Controllers; Notice of Investigation AGENCY: U.S... importation, and the sale within the United States after importation of certain video game systems and... after importation of certain video game systems and controllers that infringe one or more of claims 16...
Full Text Available This essay examines how tensions between work and play for video game developers shape the worlds they create. The worlds of game developers, whose daily activity is linked to larger systems of experimentation and technoscientific practice, provide insights that transcend video game development work. The essay draws on ethnographic material from over 3 years of fieldwork with video game developers in the United States and India. It develops the notion of creative collaborative practice based on work in the fields of science and technology studies, game studies, and media studies. The importance of, the desire for, or the drive to understand underlying systems and structures has become fundamental to creative collaborative practice. I argue that the daily activity of game development embodies skills fundamental to creative collaborative practice and that these capabilities represent fundamental aspects of critical thought. Simultaneously, numerous interests have begun to intervene in ways that endanger these foundations of creative collaborative practice.
Cormier, Etienne; Cao, Frédéric; Guichard, Frédéric; Viard, Clément
This article presents a system and a protocol to characterize image stabilization systems both for still images and videos. It uses a six axes platform, three being used for camera rotation and three for camera positioning. The platform is programmable and can reproduce complex motions that have been typically recorded by a gyroscope mounted on different types of cameras in different use cases. The measurement uses a single chart for still image and videos, the texture dead leaves chart. Although the proposed implementation of the protocol uses a motion platform, the measurement itself does not rely on any specific hardware. For still images, a modulation transfer function is measured in different directions and is weighted by a contrast sensitivity function (simulating the human visual system accuracy) to obtain an acutance. The sharpness improvement due to the image stabilization system is a good measurement of performance as recommended by a CIPA standard draft. For video, four markers on the chart are detected with sub-pixel accuracy to determine a homographic deformation between the current frame and a reference position. This model describes well the apparent global motion as translations, but also rotations along the optical axis and distortion due to the electronic rolling shutter equipping most CMOS sensors. The protocol is applied to all types of cameras such as DSC, DSLR and smartphones.
Chen, Jiawen; Paris, Sylvain; Wang, Jue; Matusik, Wojciech; Cohen, Michael; Durand, Fredo
This paper introduces the video mesh, a data structure for representing video as 2.5D “paper cutouts.” The video mesh allows interactive editing of moving objects and modeling of depth, which enables 3D effects and post-exposure camera control. The video mesh sparsely encodes optical flow as well as depth, and handles occlusion using local layering and alpha mattes. Motion is described by a sparse set of points tracked over time. Each point also stores a depth value. The video mesh is a trian...
Aghito, Shankar Manuel; Forchhammer, Søren
In object based video, each frame is a composition of objects that are coded separately. The composition is performed through the alpha plane that represents the transparency of the object. We present an alternative to MPEG-4 for coding of alpha planes that considers their specific properties. Co....... Comparisons in terms of rate and distortion are provided, showing that the proposed coding scheme for still alpha planes is better than the algorithms for I-frames used in MPEG-4.......In object based video, each frame is a composition of objects that are coded separately. The composition is performed through the alpha plane that represents the transparency of the object. We present an alternative to MPEG-4 for coding of alpha planes that considers their specific properties...
This paper presents an architecture and algorithms for model based video object segmentation and its applications to vision augmented interactive game. We are especially interested in real time low cost vision based applications that can be implemented in software in a PC. We use different models for background and a player object. The object segmentation algorithm is performed in two different levels: pixel level and object level. At pixel level, the segmentation algorithm is formulated as a maximizing a posteriori probability (MAP) problem. The statistical likelihood of each pixel is calculated and used in the MAP problem. Object level segmentation is used to improve segmentation quality by utilizing the information about the spatial and temporal extent of the object. The concept of an active region, which is defined based on motion histogram and trajectory prediction, is introduced to indicate the possibility of a video object region for both background and foreground modeling. It also reduces the overall computation complexity. In contrast with other applications, the proposed video object segmentation system is able to create background and foreground models on the fly even without introductory background frames. Furthermore, we apply different rate of self-tuning on the scene model so that the system can adapt to the environment when there is a scene change. We applied the proposed video object segmentation algorithms to several prototype virtual interactive games. In our prototype vision augmented interactive games, a player can immerse himself/herself inside a game and can virtually interact with other animated characters in a real time manner without being constrained by helmets, gloves, special sensing devices, or background environment. The potential applications of the proposed algorithms including human computer gesture interface and object based video coding such as MPEG-4 video coding.
Sun, Jun; Liang, Mingxing; Chen, Weijun; Zhang, Bin
In order to reinforce the measure of vegetable shed's safety, the S3C44B0X is taken as the main processor chip. The embedded hardware platform is built with a few outer-ring chips, and the network server is structured under the Linux embedded environment, and MPEG4 compression and real time transmission are carried on. The experiment indicates that the video monitoring system can guarantee good effect, which can be applied to the safety of vegetable sheds.
Sotnik, A. V.; Yarishev, S. N.; Korotaev, V. V.
possible reducing of the digital stream. Discrete cosine transformation is most widely used among possible orthogonal transformation. Errors of television measuring systems and data compression protocols analyzed In this paper. The main characteristics of measuring systems and detected sources of their error detected. The most effective methods of video compression are determined. The influence of video compression error on television measuring systems was researched. Obtained results will increase the accuracy of the measuring systems. In television image quality measuring system reduces distortion identical distortion in analog systems and specific distortions resulting from the process of coding / decoding digital video signal and errors in the transmission channel. By the distortions associated with encoding / decoding signal include quantization noise, reducing resolution, mosaic effect, "mosquito" effect edging on sharp drops brightness, blur colors, false patterns, the effect of "dirty window" and other defects. The size of video compression algorithms used in television measuring systems based on the image encoding with intra- and inter prediction individual fragments. The process of encoding / decoding image is non-linear in space and in time, because the quality of the playback of a movie at the reception depends on the pre- and post-history of a random, from the preceding and succeeding tracks, which can lead to distortion of the inadequacy of the sub-picture and a corresponding measuring signal.
Langbehn, Hendrickson Reiter; Ricci, Saulo M. R.; Gonçalves, Marcos A.; Almeida, Jussara Marques; Pappa, Gisele Lobo; Benevenuto, Fabrício
Most online video sharing systems (OVSSs), such as YouTube and Yahoo! Video, have several mechanisms for supporting interactions among users. One such mechanism is the video response feature in YouTube, which allows a user to post a video in response to another video. While increasingly popular, the video response feature opens the opportunity for non-cooperative users to introduce ``content pollution'' into the system, thus causing loss of service effectiveness and credibility as w...
Oropesa, Ignacio; Sánchez-González, Patricia; Chmarra, Magdalena K; Lamata, Pablo; Fernández, Alvaro; Sánchez-Margallo, Juan A; Jansen, Frank Willem; Dankelman, Jenny; Sánchez-Margallo, Francisco M; Gómez, Enrique J
The EVA (Endoscopic Video Analysis) tracking system is a new system for extracting motions of laparoscopic instruments based on nonobtrusive video tracking. The feasibility of using EVA in laparoscopic settings has been tested in a box trainer setup. EVA makes use of an algorithm that employs information of the laparoscopic instrument's shaft edges in the image, the instrument's insertion point, and the camera's optical center to track the three-dimensional position of the instrument tip. A validation study of EVA comprised a comparison of the measurements achieved with EVA and the TrEndo tracking system. To this end, 42 participants (16 novices, 22 residents, and 4 experts) were asked to perform a peg transfer task in a box trainer. Ten motion-based metrics were used to assess their performance. Construct validation of the EVA has been obtained for seven motion-based metrics. Concurrent validation revealed that there is a strong correlation between the results obtained by EVA and the TrEndo for metrics, such as path length (ρ = 0.97), average speed (ρ = 0.94), or economy of volume (ρ = 0.85), proving the viability of EVA. EVA has been successfully validated in a box trainer setup, showing the potential of endoscopic video analysis to assess laparoscopic psychomotor skills. The results encourage further implementation of video tracking in training setups and image-guided surgery.
A. L. Oleinik
Full Text Available Subject of Research. The paper deals with the problem of multiple face tracking in a video stream. The primary application of the implemented tracking system is the automatic video surveillance. The particular operating conditions of surveillance cameras are taken into account in order to increase the efficiency of the system in comparison to existing general-purpose analogs. Method. The developed system is comprised of two subsystems: detector and tracker. The tracking subsystem does not depend on the detector, and thus various face detection methods can be used. Furthermore, only a small portion of frames is processed by the detector in this structure, substantially improving the operation rate. The tracking algorithm is based on BRIEF binary descriptors that are computed very efficiently on modern processor architectures. Main Results. The system is implemented in C++ and the experiments on the processing rate and quality evaluation are carried out. MOTA and MOTP metrics are used for tracking quality measurement. The experiments demonstrated the four-fold processing rate gain in comparison to the baseline implementation that processes every video frame with the detector. The tracking quality is on the adequate level when compared to the baseline. Practical Relevance. The developed system can be used with various face detectors (including slow ones to create a fully functional high-speed multiple face tracking solution. The algorithm is easy to implement and optimize, so it may be applied not only in full-scale video surveillance systems, but also in embedded solutions integrated directly into cameras.
The objective of this activity was to record video that could be used for controlled : evaluation of video image vehicle detection system (VIVDS) products and software upgrades to : existing products based on a list of conditions that might be diffic...
Ørngreen, Rikke; Mouritzen, Per
shows that the students experiment with various pedagogical situations, and that during the process of design, teaching, and reflection they acquire experiences at both a concrete specific and a general abstract level. The desktop video conference system creates challenges, with technical issues......This paper presents experiences from teaching video conferencing for learning and collaboration, and discusses the challenges and potentials of applying a collaborative and problem‐based learning (PBL) pedagogy. The research is an action research study, and we as researchers, educational planners...... conferences. We studied 3 subsequent years of a master program module on video conferencing, and the changes it has undergone. The participants work in groups and each group has the task of designing a short one hour (45min) educational design of their own choice. The students have to try out and evaluate...
Hyun, Dai-Kyung; Ryu, Seung-Jin; Lee, Hae-Yeoun; Lee, Heung-Kyu
.... Nevertheless, there is little research being done on forgery of surveillance videos. This paper proposes a forensic technique to detect forgeries of surveillance video based on sensor pattern noise (SPN...
Hürst, W.O.; Ip Vai Ching, Algernon; Hudelist, Marco A.; Primus, Manfred J.; Schoeffmann, Klaus; Beecks, Chrisitan
We present a new approach for collaborative video search and video browsing relying on a combination of traditional, indexbased video retrieval complemented with large-scale human-based visual inspection. In particular, a traditional PC interface is used for query-based search using advanced
Pace, Barbara G.; Jones, Linda Cronin
Today, the use of web-based videos in science classrooms is becoming more and more commonplace. However, these videos are often fast-paced and information rich--science concepts can be fragmented and embedded within larger cultural issues. This article addresses the cognitive difficulties posed by many web-based science videos. Drawing on concepts…
Legg, Philip A; Chung, David H S; Parry, Matthew L; Bown, Rhodri; Jones, Mark W; Griffiths, Iwan W; Chen, Min
Traditional sketch-based image or video search systems rely on machine learning concepts as their core technology. However, in many applications, machine learning alone is impractical since videos may not be semantically annotated sufficiently, there may be a lack of suitable training data, and the search requirements of the user may frequently change for different tasks. In this work, we develop a visual analytics systems that overcomes the shortcomings of the traditional approach. We make use of a sketch-based interface to enable users to specify search requirement in a flexible manner without depending on semantic annotation. We employ active machine learning to train different analytical models for different types of search requirements. We use visualization to facilitate knowledge discovery at the different stages of visual analytics. This includes visualizing the parameter space of the trained model, visualizing the search space to support interactive browsing, visualizing candidature search results to support rapid interaction for active learning while minimizing watching videos, and visualizing aggregated information of the search results. We demonstrate the system for searching spatiotemporal attributes from sports video to identify key instances of the team and player performance.
In this dissertation, several approaches are proposed to improve the quality of video transmission over wired and wireless networks. To improve the robustness of video transmission over error-prone mobile networks, a proxy-based reference picture selection scheme is proposed. In the second part of the dissertation, rate-distortion optimized rate adaptation algorithms are proposed for video applications over congested network nodes. A segment-based proxy caching algorithm for video-on-demand a...
Yu, Fang; An, Ping; Yang, Chao; You, Zhixiang; Shen, Liquan
As an extension of High Efficiency Video Coding ( HEVC), 3D-HEVC has been widely researched under the impetus of the new generation coding standard in recent years. Compared with H.264/AVC, its compression efficiency is doubled while keeping the same video quality. However, its higher encoding complexity and longer encoding time are not negligible. To reduce the computational complexity and guarantee the subjective quality of virtual views, this paper presents a novel video coding method for 3D-HEVC based on the saliency informat ion which is an important part of Human Visual System (HVS). First of all, the relationship between the current coding unit and its adjacent units is used to adjust the maximum depth of each largest coding unit (LCU) and determine the SKIP mode reasonably. Then, according to the saliency informat ion of each frame image, the texture and its corresponding depth map will be divided into three regions, that is, salient area, middle area and non-salient area. Afterwards, d ifferent quantization parameters will be assigned to different regions to conduct low complexity coding. Finally, the compressed video will generate new view point videos through the renderer tool. As shown in our experiments, the proposed method saves more bit rate than other approaches and achieves up to highest 38% encoding time reduction without subjective quality loss in compression or rendering.
Ishikawa, Tomoya; Yamazawa, Kazumasa; Sato, Tomokazu; Ikeda, Sei; Nakamura, Yutaka; Fujikawa, Kazutoshi; Sunahara, Hideki; Yokoya, Naokazu
In this paper, we describe a new telepresence system which enables a user to look around a virtualized real world easily in network environments. The proposed system includes omni-directional video viewers on web browsers and allows the user to look around the omni-directional video contents on the web browsers. The omni-directional video viewer is implemented as an Active-X program so that the user can install the viewer automatically only by opening the web site which contains the omni-directional video contents. The system allows many users at different sites to look around the scene just like an interactive TV using a multi-cast protocol without increasing the network traffic. This paper describes the implemented system and the experiments using live and stored video streams. In the experiment with stored video streams, the system uses an omni-directional multi-camera system for video capturing. We can look around high resolution and high quality video contents. In the experiment with live video streams, a car-mounted omni-directional camera acquires omni-directional video streams surrounding the car, running in an outdoor environment. The acquired video streams are transferred to the remote site through the wireless and wired network using multi-cast protocol. We can see the live video contents freely in arbitrary direction. In the both experiments, we have implemented a view-dependent presentation with a head-mounted display (HMD) and a gyro sensor for realizing more rich presence.
Full Text Available This paper envisions coding with side information to design a highly scalable video codec. To achieve fine-grained scalability in terms of resolution, quality, and spatial access as well as temporal access to individual frames, the JPEG 2000 coding algorithm has been considered as the reference algorithm to encode INTRA information, and coding with side information has been envisioned to refresh the blocks that change between two consecutive images of a video sequence. One advantage of coding with side information compared to conventional closed-loop hybrid video coding schemes lies in the fact that parity bits are designed to correct stochastic errors and not to encode deterministic prediction errors. This enables the codec to support some desynchronization between the encoder and the decoder, which is particularly helpful to adapt on the fly pre-encoded content to fluctuating network resources and/or user preferences in terms of regions of interest. Regarding the coding scheme itself, to preserve both quality scalability and compliance to the JPEG 2000 wavelet representation, a particular attention has been devoted to the definition of a practical coding framework able to exploit not only the temporal but also spatial correlation among wavelet subbands coefficients, while computing the parity bits on subsets of wavelet bit-planes. Simulations have shown that compared to pure INTRA-based conditional replenishment solutions, the addition of the parity bits option decreases the transmission cost in terms of bandwidth, while preserving access flexibility.
Enayati, Moein; Banerjee, Tanvi; Popescu, Mihail; Skubic, Marjorie; Rantz, Marilyn
The purpose of this study was to implement a web based application to provide the ability to rewind and review depth videos captured in hospital rooms to investigate the event chains that led to patient's fall at a specific time. In this research, Kinect depth images are being used to capture shadow-like images of the patient and their room to resolve concerns about patients' privacy. As a result of our previous research, a fall detection system has been developed and installed in hospital rooms, and fall alarms are generated if any falls are detected by the system. Then nurses will go through the stored depth videos to investigate for possible injury as well as the reasons and events that may have caused the patient's fall to prevent future occurrences. This paper proposes a novel web application to ease the process of search and reviewing the videos by means of new visualization techniques to highlight video frames that contain potential risk of fall based on our previous research.
Lindsay, Jan A; Kauth, Michael R; Hudson, Sonora; Martin, Lindsey A; Ramsey, David J; Daily, Lawrence; Rader, John
Increasing access to psychotherapy for posttraumatic stress disorder (PTSD) is a primary focus of the Department of Veterans Affairs (VA) healthcare system. Delivery of treatment via video telehealth can expand availability of treatment and be equally effective as in-person treatment. Despite VA efforts, barriers to establishing telehealth services remain, including both provider acceptance and organizational obstacles. Thus, development of specific strategies is needed to implement video telehealth services in complex healthcare systems, like the VA. This project was guided by the Promoting Action on Research Implementation in Health Services framework and used external facilitation to increase access to psychotherapy via video telehealth. The project was conducted at five VA Medical Centers and their associated community clinics across six states in the South Central United States. Over a 21-month period, 27 video telehealth clinics were established to provide greater access to evidence-based psychotherapies for PTSD. Examination of change scores showed that participating sites averaged a 3.2-fold increase in unique patients and a 6.5-fold increase in psychotherapy sessions via video telehealth for PTSD. Differences between participating and nonparticipating sites in both unique patients and encounters were significant (p=0.041 and p=0.009, respectively). Two groups emerged, separated by degree of engagement in the facilitation intervention. Facilitation was perceived as useful by providers. To our knowledge, this is the first prospective study of external facilitation as an implementation strategy for telehealth. Our findings suggest that external facilitation is an effective and acceptable strategy to support providers as they establish clinics and make complex practice changes, such as implementing video telehealth to deliver psychotherapy.
Bower, Matt; Cavanagh, Michael; Moloney, Robyn; Dao, MingMing
This paper reports on how the cognitive, behavioural and affective communication competencies of undergraduate students were developed using an online Video Reflection system. Pre-service teachers were provided with communication scenarios and asked to record short videos of one another making presentations. Students then uploaded their videos to…
Martinez-Hernandez, Kermin Joel
The chemistry-based computer video game is a multidisciplinary collaboration between chemistry and computer graphics and technology fields developed to explore the use of video games as a possible learning tool. This innovative approach aims to integrate elements of commercial video game and authentic chemistry context environments into a learning…
Chan, Yi-Tung; Wang, Shuenn-Jyi; Tsai, Chung-Hsien
Public safety is a matter of national security and people's livelihoods. In recent years, intelligent video-surveillance systems have become important active-protection systems. A surveillance system that provides early detection and threat assessment could protect people from crowd-related disasters and ensure public safety. Image processing is commonly used to extract features, e.g., people, from a surveillance video. However, little research has been conducted on the relationship between foreground detection and feature extraction. Most current video-surveillance research has been developed for restricted environments, in which the extracted features are limited by having information from a single foreground; they do not effectively represent the diversity of crowd behavior. This paper presents a general framework based on extracting ensemble features from the foreground of a surveillance video to analyze a crowd. The proposed method can flexibly integrate different foreground-detection technologies to adapt to various monitored environments. Furthermore, the extractable representative features depend on the heterogeneous foreground data. Finally, a classification algorithm is applied to these features to automatically model crowd behavior and distinguish an abnormal event from normal patterns. The experimental results demonstrate that the proposed method's performance is both comparable to that of state-of-the-art methods and satisfies the requirements of real-time applications.
Investigating how visuals affect test takers' performance on video-based L2 listening tests has been the focus of many recent studies. While most existing research has been based on test scores and self-reported verbal data, few studies have examined test takers' viewing behavior (Ockey, 2007; Wagner, 2007, 2010a). To address this gap, in the…
Chen, Xia; Cai, Canhui
The goal of this paper is to design a TCP friendly real-time video transport protocol that will not only utilize network resource efficiently, but also prevent network congestion from the real-time video transmitting effectively. To this end, we proposed a source based congestion control scheme to adapt video coding rate to the channel capacity of the IP network, including three stages: rate control, rate-adaptive video encoding, and rate shaping.
Gonzalez, Ray; Martinez, Jose M; Lo Menzo, Emanuele; Iglesias, Alberto R; Ro, Charles Y; Madan, Atul K
The Global Operative Assessment of Laparoscopic Skill (GOALS) is one validated metric utilized to grade laparoscopic skills and has been utilized to score recorded operative videos. To facilitate easier viewing of these recorded videos, we are developing novel techniques to enable surgeons to view these videos. The objective of this study is to determine the feasibility of utilizing widespread current consumer-based technology to assist in distributing appropriate videos for objective evaluation. Videos from residents were recorded via a direct connection from the camera processor via an S-video output via a cable into a hub to connect to a standard laptop computer via a universal serial bus (USB) port. A standard consumer-based video editing program was utilized to capture the video and record in appropriate format. We utilized mp4 format, and depending on the size of the file, the videos were scaled down (compressed), their format changed (using a standard video editing program), or sliced into multiple videos. Standard available consumer-based programs were utilized to convert the video into a more appropriate format for handheld personal digital assistants. In addition, the videos were uploaded to a social networking website and video sharing websites. Recorded cases of laparoscopic cholecystectomy in a porcine model were utilized. Compression was required for all formats. All formats were accessed from home computers, work computers, and iPhones without difficulty. Qualitative analyses by four surgeons demonstrated appropriate quality to grade for these formats. Our preliminary results show promise that, utilizing consumer-based technology, videos can be easily distributed to surgeons to grade via GOALS via various methods. Easy accessibility may help make evaluation of resident videos less complicated and cumbersome.
AO-AIO 790 BOM CORP MCLEAN VA F/A 17/8 VIDEO AUTOMATIC TARGE T TRACKING SYSTEM (VATTS) OPERATING PROCEO -ETC(U) AUG Go C STAMM J P ORRESTER, J...Tape Transport Number Two TKI Tektronics I/0 Terminal DS1 Removable Disk Storage Unit DSO Fixed Disk Storage Unit CRT Cathode Ray Tube 1-3 THE BDM...file (mark on Mag Tape) AZEL Quick look at Trial Information Program DUPTAPE Allows for duplication of magnetic tapes CA Cancel ( terminates program on
Archetti, Renata; Vacchi, Matteo; Carniel, Sandro; Benetazzo, Alvise
Measuring the location of the shoreline and monitoring foreshore changes through time represent a fundamental task for correct coastal management at many sites around the world. Several authors demonstrated video systems to be an essential tool for increasing the amount of data available for coastline management. These systems typically sample at least once per hour and can provide long-term datasets showing variations over days, events, months, seasons and years. In the past few years, due to the wide diffusion of video cameras at relatively low price, the use of video cameras and of video images analysis for environmental control has increased significantly. Even if video monitoring systems were often used in the research field they are most often applied with practical purposes including: i) identification and quantification of shoreline erosion, ii) assessment of coastal protection structure and/or beach nourishment performance, and iii) basic input to engineering design in the coastal zone iv) support for integrated numerical model validation Here we present the guidelines for the creation of a new video monitoring network in the proximity of the Jesolo beach (NW of the Adriatic Sea, Italy), Within this 10 km-long tourist district several engineering structures have been built in recent years, with the aim of solving urgent local erosion problems; as a result, almost all types of protection structures are present at this site: groynes, detached breakwaters.The area investigated experienced severe problems of coastal erosion in the past decades, inclusding a major one in the last November 2012. The activity is planned within the framework of the RITMARE project, that is also including other monitoring and scientific activities (bathymetry survey, waves and currents measurements, hydrodynamics and morphodynamic modeling). This contribution focuses on best practices to be adopted in the creation of the video monitoring system, and briefly describes the
White, Preston, III
Kennedy Space Center has the need for economical transmission of two multiplexed video signals along multimode fiberoptic systems. These systems must span unusual distances and must meet RS-250B short-haul standards after reception. Bandwidth is a major problem and studies of the installed fibers, available LEDs and PINFETs led to the choice of 100 MHz as the upper limit for the system bandwidth. Optical multiplexing and digital transmission were deemed inappropriate. Three electrical multiplexing schemes were chosen for further study. Each of the multiplexing schemes included an FM stage to help meet the stringent S/N specification. Both FM and AM frequency division multiplexing methods were investigated theoretically and these results were validated with laboratory tests. The novel application of quadrature amplitude multiplexing was also considered. Frequency division multiplexing of two wideband FM video signal appears the most promising scheme although this application requires high power highly linear LED transmitters. Futher studies are necessary to determine if LEDs of appropriate quality exist and to better quantify performance of QAM in this application.
Valdez, Connie A; Paulsen, Susan
To (1) describe the development of a Video-based Clinical Examination (VCE) as a formal testing format to evaluate student ability to make an accurate pharmaceutical assessment and recommendation, and (2) determine student perception of the VCE testing format. Descriptive study of first-year pharmacy students. One hundred and twenty-nine students were included in the study. Students perceived that the VCE testing format provided a real life/interactive environment but felt rushed as the video segments of the patient/pharmacist interaction occurred quickly. Based on the findings of this project, we will continue to pursue further research related to validity, reliability and application of VCEs. However, the University of Colorado will continue to incorporate VCEs in the performance based evaluations in the Professional Skills Development 1 course, as it appears to be an effective stepping-stone for first-year students to begin developing their active listening, higher level learning and problem-solving skills. Results of this project will be shared with the faculty and curriculum committee at the University of Colorado School of Pharmacy to encourage further use and research of VCEs in other courses.
Full Text Available Objective The purpose of this paper is to describe the use of video-based observation research methods in primary care environment and highlight important methodological considerations and provide practical guidance for primary care and human factors researchers conducting video studies to understand patient–clinician interaction in primary care settings.Methods We reviewed studies in the literature which used video methods in health care research, and we also used our own experience based on the video studies we conducted in primary care settings.Results This paper highlighted the benefits of using video techniques, such as multi-channel recording and video coding, and compared “unmanned” video recording with the traditional observation method in primary care research. We proposed a list that can be followed step by step to conduct an effective video study in a primary care setting for a given problem. This paper also described obstacles, researchers should anticipate when using video recording methods in future studies.Conclusion With the new technological improvements, video-based observation research is becoming a promising method in primary care and HFE research. Video recording has been under-utilised as a data collection tool because of confidentiality and privacy issues. However, it has many benefits as opposed to traditional observations, and recent studies using video recording methods have introduced new research areas and approaches.
Asan, Onur; Montague, Enid
The purpose of this paper is to describe the use of video-based observation research methods in primary care environment and highlight important methodological considerations and provide practical guidance for primary care and human factors researchers conducting video studies to understand patient-clinician interaction in primary care settings. We reviewed studies in the literature which used video methods in health care research, and we also used our own experience based on the video studies we conducted in primary care settings. This paper highlighted the benefits of using video techniques, such as multi-channel recording and video coding, and compared "unmanned" video recording with the traditional observation method in primary care research. We proposed a list that can be followed step by step to conduct an effective video study in a primary care setting for a given problem. This paper also described obstacles, researchers should anticipate when using video recording methods in future studies. With the new technological improvements, video-based observation research is becoming a promising method in primary care and HFE research. Video recording has been under-utilised as a data collection tool because of confidentiality and privacy issues. However, it has many benefits as opposed to traditional observations, and recent studies using video recording methods have introduced new research areas and approaches.
Barton, Robert; Lemley, Corey; French, Paul
Recent advancements in computing technology have drastically improved the interface between computers and video equipment, thus allowing for the improvement of video-based motion analysis. However, analysis of video data remains susceptible to errors caused by lens distortion, angular distortion, descaling, and discretization. In previous work, methods were developed to correct for some of these errors. This paper presents several improvements to these corrections, as well as additional methods to increase accuracy, including: 1) improvement of the measurement of lens distortion, 2) automation of the lens distortion correction technique, 3) a simulation method to test the correction of lens distortion error, 4) creation of an algorithm to correct for angular distortion, and 5) refinement of a two camera system to correct for descaling error. The methods were tested in the context of the measurement of the air resistance force on a high-speed projectile. This allowed for the determination of the onset of turbulent flow. These significant improvements in accuracy have made video-based analysis an even more powerful tool for the study of motion.
Full Text Available This paper presents an efficient video filtering scheme and its implementation in a field-programmable logic device (FPLD. Since the proposed nonlinear, spatiotemporal filtering scheme is based on order statistics, its efficient implementation benefits from a bit-serial realization. The utilization of both the spatial and temporal correlation characteristics of the processed video significantly increases the computational demands on this solution, and thus, implementation becomes a significant challenge. Simulation studies reported in this paper indicate that the proposed pipelined bit-serial FPLD filtering solution can achieve speeds of up to 97.6 Mpixels/s and consumes 1700 to 2700 logic cells for the speed-optimized and area-optimized versions, respectively. Thus, the filter area represents only 6.6 to 10.5% of the Altera STRATIX EP1S25 device available on the Altera Stratix DSP evaluation board, which has been used to implement a prototype of the entire real-time vision system. As such, the proposed adaptive video filtering scheme is both practical and attractive for real-time machine vision and surveillance systems as well as conventional video and multimedia applications.
Chan, Fai; Moon, Yiu-Sang; Chen, Jiansheng; Ma, Yiu-Kwan; Tsang, Wai-Hung; Fu, Kah-Kuen
Low resolution and un-sharp facial images are always captured from surveillance videos because of long human-camera distance and human movements. Previous works addressed this problem by using an active camera to capture close-up facial images without considering human movements and mechanical delays of the active camera. In this paper, we proposed a unified framework to capture facial images in video surveillance systems by using one static and active camera in a cooperative manner. Human faces are first located by a skin-color based real-time face detection algorithm. A stereo camera model is also employed to approximate human face location and his/her velocity with respect to the active camera. Given the mechanical delays of the active camera, the position of a target face with a given delay can be estimated using a Human-Camera Synchronization Model. By controlling the active camera with corresponding amount of pan, tilt, and zoom, a clear close-up facial image of a moving human can be captured then. We built the proposed system in an 8.4-meter indoor corridor. Results show that the proposed stereo camera configuration can locate faces with average error of 3%. In addition, it is capable of capturing facial images of a walking human clearly in first instance in 90% of the test cases.
Sandy, C. L. M.; Meiyanti, R.
A measurement of height is comparing the value of the magnitude of an object with a standard measuring tool. The problems that exist in the measurement are still the use of a simple apparatus in which one of them is by using a meter. This method requires a relatively long time. To overcome these problems, this research aims to create software with image processing that is used for the measurement of height. And subsequent that image is tested, where the object captured by the video camera can be known so that the height of the object can be measured using the learning method of Otsu. The system was built using Delphi 7 of Vision Lab VCL 4.5 component. To increase the quality of work of the system in future research, the developed system can be combined with other methods.
Na, Sang‐il; Oh, Weon‐Geun; Jeong, Dong‐Seok
.... The proposed two‐stage matching method is fast and works very well in finding locations. In addition, the proposed matching structure and strategy can distinguish a case in which a part of the query video matches a part...
Giaccone, Agnese; Solli, Piergiorgio; Bertolaccini, Luca
The magnetic anchoring guidance system (MAGS) is one of the most promising technological innovations in minimally invasive surgery and consists in two magnetic elements matched through the abdominal or thoracic wall. The internal magnet can be inserted into the abdominal or chest cavity through a small single incision and then moved into position by manipulating the external component. In addition to a video camera system, the inner magnetic platform can house remotely controlled surgical tools thus reducing instruments fencing, a serious inconvenience of the uniportal access. The latest prototypes are equipped with self-light-emitting diode (LED) illumination and a wireless antenna for signal transmission and device controlling, which allows bypassing the obstacle of wires crossing the field of view (FOV). Despite being originally designed for laparoscopic surgery, the MAGS seems to suit optimally the characteristics of the chest wall and might meet the specific demands of video-assisted thoracic surgery (VATS) surgery in terms of ergonomics, visualization and surgical performance; moreover, it involves less risks for the patients and an improved aesthetic outcome.
Full Text Available Gait is a unique perceptible biometric feature at larger distances, and the gait representation approach plays a key role in a video sensor-based gait recognition system. Class Energy Image is one of the most important gait representation methods based on appearance, which has received lots of attentions. In this paper, we reviewed the expressions and meanings of various Class Energy Image approaches, and analyzed the information in the Class Energy Images. Furthermore, the effectiveness and robustness of these approaches were compared on the benchmark gait databases. We outlined the research challenges and provided promising future directions for the field. To the best of our knowledge, this is the first review that focuses on Class Energy Image. It can provide a useful reference in the literature of video sensor-based gait representation approach.
Lv, Zhuowen; Xing, Xianglei; Wang, Kejun; Guan, Donghai
Gait is a unique perceptible biometric feature at larger distances, and the gait representation approach plays a key role in a video sensor-based gait recognition system. Class Energy Image is one of the most important gait representation methods based on appearance, which has received lots of attentions. In this paper, we reviewed the expressions and meanings of various Class Energy Image approaches, and analyzed the information in the Class Energy Images. Furthermore, the effectiveness and robustness of these approaches were compared on the benchmark gait databases. We outlined the research challenges and provided promising future directions for the field. To the best of our knowledge, this is the first review that focuses on Class Energy Image. It can provide a useful reference in the literature of video sensor-based gait representation approach. PMID:25574935
Terakawa, Yuzo; Ishibashi, Kenichi; Goto, Takeo; Ohata, Kenji
Three-dimensional (3-D) video recording of microsurgery is a more promising tool for presentation and education of microsurgery than conventional two-dimensional video systems, but has not been widely adopted partly because 3-D image processing of previous 3-D video systems is complicated and observers without optical devices cannot visualize the 3-D image. A new technical development for 3-D video presentation of microsurgery is described. Microsurgery is recorded with a microscope equipped with a single high-definition (HD) video camera. This 3-D video system records the right- and left-eye views of the microscope simultaneously as single HD data with the use of a 3-D camera adapter: the right- and left-eye views of the microscope are displayed separately on the right and left sides, respectively. The operation video is then edited with video editing software so that the right-eye view is displayed on the left side and left-eye view is displayed on the right side. Consequently, a 3-D video of microsurgery can be created by viewing the edited video by the cross-eyed stereogram viewing method without optical devices. The 3-D microsurgical video provides a more accurate view, especially with regard to depth, and a better understanding of microsurgical anatomy. Although several issues are yet to be addressed, this 3-D video system is a useful method of recording and presenting microsurgery for 3-D viewing with currently available equipment, without optical devices.
Bhattacharya, Sharmila; Inan, Omer; Kovacs, Gregory; Etemadi, Mozziyar; Sanchez, Max; Marcu, Oana
populations in terrestrial experiments, and could be especially useful in field experiments in remote locations. Two practical limitations of the system should be noted: first, only walking flies can be observed - not flying - and second, although it enables population studies, tracking individual flies within the population is not currently possible. The system used video recording and an analog circuit to extract the average light changes as a function of time. Flies were held in a 5-cm diameter Petri dish and illuminated from below by a uniform light source. A miniature, monochrome CMOS (complementary metal-oxide semiconductor) video camera imaged the flies. This camera had automatic gain control, and this did not affect system performance. The camera was positioned 5-7 cm above the Petri dish such that the imaging area was 2.25 sq cm. With this basic setup, still images and continuous video of 15 flies at one time were obtained. To reduce the required data bandwidth by several orders of magnitude, a band-pass filter (0.3-10 Hz) circuit compressed the video signal and extracted changes in image luminance over time. The raw activity signal output of this circuit was recorded on a computer and digitally processed to extract the fly movement "events" from the waveform. These events corresponded to flies entering and leaving the image and were used for extracting activity parameters such as inter-event duration. The efficacy of the system in quantifying locomotor activity was evaluated by varying environmental temperature, then measuring the activity level of the flies.
Pasch, H. L.
An overview of video coding is presented. The aim is not to give a technical summary of possible coding techniques, but to address subjects related to video compression in general and to the transmission of compressed video in more detail. Bit rate reduction is in general possible by removing redundant information; removing information the eye does not use anyway; and reducing the quality of the video. The codecs which are used for reducing the bit rate, can be divided into two groups: Constant Bit rate Codecs (CBC's), which keep the bit rate constant, but vary the video quality; and Variable Bit rate Codecs (VBC's), which keep the video quality constant by varying the bit rate. VBC's can be in general reach a higher video quality than CBC's using less bandwidth, but need a transmission system that allows the bandwidth of a connection to fluctuate in time. The current and the next generation of the PSTN does not allow this; ATM might. There are several factors which influence the quality of video: the bit error rate of the transmission channel, slip rate, packet loss rate/packet insertion rate, end-to-end delay, phase shift between voice and video, and bit rate. Based on the bit rate of the coded video, the following classification of coded video can be made: High Definition Television (HDTV); Broadcast Quality Television (BQTV); video conferencing; and video telephony. The properties of these classes are given. The video conferencing and video telephony equipment available now and in the next few years can be divided into three categories: conforming to 1984 CCITT standard for video conferencing; conforming to 1988 CCITT standard; and conforming to no standard.
Xu, Fang; Zhou, Qin-Wu; Wu, Peng; Chen, Xing; Yang, Xiaofeng; Yan, Hong-jian
This paper proposes a new non-contact heart rate measurement method based on photoplethysmography (PPG) theory. With this method we can measure heart rate remotely with a camera and ambient light. We collected video sequences of subjects, and detected remote PPG signals through video sequences. Remote PPG signals were analyzed with two methods, Blind Source Separation Technology (BSST) and Cross Spectral Power Technology (CSPT). BSST is a commonly used method, and CSPT is used for the first time in the study of remote PPG signals in this paper. Both of the methods can acquire heart rate, but compared with BSST, CSPT has clearer physical meaning, and the computational complexity of CSPT is lower than that of BSST. Our work shows that heart rates detected by CSPT method have good consistency with the heart rates measured by a finger clip oximeter. With good accuracy and low computational complexity, the CSPT method has a good prospect for the application in the field of home medical devices and mobile health devices.
Cai, Lin; Deng, Nianchun; Xiao, Zexin
The cables in anchorage zone of cable-stayed bridge are hidden within the embedded pipe, which leads to the difficulty for detecting the damage of the cables with visual inspection. We have built a detection device based on high-resolution video capture, realized the distance observing of invisible segment of stay cable and damage detection of outer surface of cable in the small volume. The system mainly consists of optical stents and precision mechanical support device, optical imaging system, lighting source, drived motor control and IP camera video capture system. The principal innovations of the device are ⑴A set of telescope objectives with three different focal lengths are designed and used in different distances of the monitors by means of converter. ⑵Lens system is far separated with lighting system, so that the imaging optical path could effectively avoid the harsh environment which would be in the invisible part of cables. The practice shows that the device not only can collect the clear surveillance video images of outer surface of cable effectively, but also has a broad application prospect in security warning of prestressed structures.
Li, Sheng-Zhen; Xie, Quan-Long; Zang, Xiao-Dong; Tang, Guo-Jun
Since that the pedestrian crossing facilities at present is not perfect, pedestrian crossing is in chaos and pedestrians from opposite direction conflict and congest with each other, which severely affects the pedestrian traffic efficiency, obstructs the vehicle and bringing about some potential security problems. To solve these problems, based on video identification, a pedestrian crossing guidance system was researched and designed. It uses the camera to monitor the pedestrians in real time and sums up the number of pedestrians through video detection program, and a group of pedestrian's induction lamp array is installed at the interval of crosswalk, which adjusts color display according to the proportion of pedestrians from both sides to guide pedestrians from both opposite directions processing separately. The emulation analysis result from cellular automaton shows that the system reduces the pedestrian crossing conflict, shortens the time of pedestrian crossing and improves the safety of pedestrians crossing.
Full Text Available This paper presents a new approach for text-based video content retrieval system. The proposed scheme consists of three main processes that are key frame extraction, text localization and keyword matching. For the key-frame extraction, we proposed a Maximally Stable Extremal Region (MSER based feature which is oriented to segment shots of the video with different text contents. In text localization process, in order to form the text lines, the MSERs in each key frame are clustered based on their similarity in position, size, color, and stroke width. Then, Tesseract OCR engine is used for recognizing the text regions. In this work, to improve the recognition results, we input four images obtained from different pre-processing methods to Tesseract engine. Finally, the target keyword for querying is matched with OCR results based on an approximate string search scheme. The experiment shows that, by using the MSER feature, the videos can be segmented by using efficient number of shots and provide the better precision and recall in comparison with a sum of absolute difference and edge based method.
Full Text Available A security has become very important along with the increasing number of crime cases. If some security system fails, there is a need for a mechanism that capable in recording the criminal act. Therefore, it can be used for investigation purpose of the authorities. The objective of this research is to develop a security system using video streaming that able to monitor in real-time manner, display movies in a browser, and record a video as triggered by a sensor. This monitoring system comprises of two security level camera as a video recorder of special events based on infrared sensor that is connected to a microcontroller via serial communication and camera as a real-time room monitor. The hardware system consists of infrared sensor circuit to detect special events that is serially communicated to an AT89S51 microcontroller that controls the system to perform recording process, and the software system consists of a server that displaying video streaming in a webpage and a video recorder. The software for video recording and server camera uses Visual Basic 6.0 and for video streaming uses PHP 5.1.6. As the result, the system can be used to record special events that it is wanted, and can displayed video streaming in a webpage using LAN infrastructure.
Kowalczyk, Marcin; Maksymiuk, Lukasz; Siuzdak, Jerzy
The article was focused on presentation the achievements in a scope of experimental research on transmission of digital video streams in the frame of specially realized for this purpose ROF (Radio over Fiber) network. Its construction was based on the merge of wireless IEEE 802.11 network, popularly referred as Wi-Fi, with a passive optical network PON based on multimode fibers MMF. The proposed approach can constitute interesting proposal in area of solutions in the scope of the systems monitoring extensive, within which is required covering of a large area with ensuring of a relatively high degree of immunity on the interferences transmitted signals from video IP cameras to the monitoring center and a high configuration flexibility (easily change the deployment of cameras) of such network.
Okano, Fumio; Kawakita, Masahiro; Arai, Jun; Sasaki, Hisayuki; Yamashita, Takayuki; Sato, Masahito; Suehiro, Koya; Haino, Yasuyuki
The integral method enables observers to see 3D images like real objects. It requires extremely high resolution for both capture and display stages. We present an experimental 3D television system based on the integral method using an extremely high-resolution video system. The video system has 4,000 scanning lines using the diagonal offset method for two green channels. The number of elemental lenses in the lens array is 140 (vertical) × 182 (horizontal). The viewing zone angle is wider than 20 degrees in practice. This television system can capture 3D objects and provides full color and full parallax 3D images in real time.
Fu, Rongguo; Feng, Shu; Shen, Tianyu; Luo, Hao; Wei, Yifang; Yang, Qi
Low light level video is an important method of observation under low illumination condition, but the SNR of low light level video is low, the effect of observation is poor, so the noise reduction processing must be carried out. Low light level video noise mainly includes Gauss noise, Poisson noise, impulse noise, fixed pattern noise and dark current noise. In order to remove the noise in low-light-level video effectively, improve the quality of low-light-level video. This paper presents an improved time domain recursive filtering algorithm with three dimensional filtering coefficients. This algorithm makes use of the correlation between the temporal domain of the video sequence. In the video sequences, the proposed algorithm adaptively adjusts the local window filtering coefficients in space and time by motion estimation techniques, for the different pixel points of the same frame of the image, the different weighted coefficients are used. It can reduce the image tail, and ensure the noise reduction effect well. Before the noise reduction, a pretreatment based on boxfilter is used to reduce the complexity of the algorithm and improve the speed of the it. In order to enhance the visual effect of low-light-level video, an image enhancement algorithm based on guided image filter is used to enhance the edge of the video details. The results of experiment show that the hybrid algorithm can remove the noise of the low-light-level video effectively, enhance the edge feature and heighten the visual effects of video.
Bayyapu, Karunakar Reddy; Fischer, Paul
We propose an architecture for a storage system of surveillance videos. Such systems have to handle massive amounts of incoming video streams and relatively few requests for replay. In such a system load (i.e., Write requests) scheduling is essential to guarantee performance. Large-scale data-sto...
Potel, Michael J.; MacKay, Steven A.; Sayre, Richard E.
Extracting quantitative information from movie film and video recordings has always been a difficult process. The Galatea motion analysis system represents an application of some powerful interactive computer graphics capabilities to this problem. A minicomputer is interfaced to a stop-motion projector, a data tablet, and real-time display equipment. An analyst views a film and uses the data tablet to track a moving position of interest. Simultaneously, a moving point is displayed in an animated computer graphics image that is synchronized with the film as it runs. Using a projection CRT and a series of mirrors, this image is superimposed on the film image on a large front screen. Thus, the graphics point lies on top of the point of interest in the film and moves with it at cine rates. All previously entered points can be displayed simultaneously in this way, which is extremely useful in checking the accuracy of the entries and in avoiding omission and duplication of points. Furthermore, the moving points can be connected into moving stick figures, so that such representations can be transcribed directly from film. There are many other tools in the system for entering outlines, measuring time intervals, and the like. The system is equivalent to "dynamic tracing paper" because it is used as though it were tracing paper that can keep up with running movie film. We have applied this system to a variety of problems in cell biology, cardiology, biomechanics, and anatomy. We have also extended the system using photogrammetric techniques to support entry of three-dimensional moving points from two (or more) films taken simultaneously from different perspective views. We are also presently constructing a second, lower-cost, microcomputer-based system for motion analysis in video, using digital graphics and video mixing to achieve the graphics overlay for any composite video source image.
Full Text Available Transcoding is an effective method to provide video adaptation for heterogeneous internetwork video access and communication environments, which require the tailoring (i.e., repurposing of coded video properties to channel conditions, terminal capabilities, and user preferences. This paper presents a video transcoding system that is capable of applying a suite of error resilience tools on the input compressed video streams while controlling the output rates to provide robust communications over error-prone and bandwidth-limited 3G wireless networks. The transcoder is also designed to employ a new adaptive intra-refresh algorithm, which is responsive to the detected scene activity inherently embedded into the video content and the reported time-varying channel error conditions of the wireless network. Comprehensive computer simulations demonstrate significant improvements in the received video quality performances using the new transcoding architecture without an extra computational cost.
Eminsoy, Sertac; Dogan, Safak; Kondoz, Ahmet M.
Transcoding is an effective method to provide video adaptation for heterogeneous internetwork video access and communication environments, which require the tailoring (i.e., repurposing) of coded video properties to channel conditions, terminal capabilities, and user preferences. This paper presents a video transcoding system that is capable of applying a suite of error resilience tools on the input compressed video streams while controlling the output rates to provide robust communications over error-prone and bandwidth-limited 3G wireless networks. The transcoder is also designed to employ a new adaptive intra-refresh algorithm, which is responsive to the detected scene activity inherently embedded into the video content and the reported time-varying channel error conditions of the wireless network. Comprehensive computer simulations demonstrate significant improvements in the received video quality performances using the new transcoding architecture without an extra computational cost.
Д В Сенашенко
Full Text Available The article describes distant learning systems used in world practice. The author gives classification of video communication systems. Aspects of using Skype software in Russian Federation are discussed. In conclusion the author provides the review of modern production video conference systems used as tools for distant learning.
... COMMISSION Certain Video Analytics Software, Systems, Components Thereof, and Products Containing Same... certain video analytics software, systems, components thereof, and products containing same by reason of..., Inc. The remaining respondents are Bosch Security Systems, Inc.; Robert Bosch GmbH; Bosch...
... COMMISSION Certain Video Analytics Software, Systems, Components Thereof, and Products Containing Same... States after importation of certain video analytics software, systems, components thereof, and products...; Bosch Security Systems, Inc. of Fairpoint, New York; Samsung Techwin Co., Ltd. of Seoul, Korea; Samsung...
Seeling, P.; Reisslein, M.; Fitzek, Frank
Video traces contain information about encoded video frames, such as frame sizes and qualities, and provide a convenient method to conduct multimedia networking research. Although wiedely used in networking research, these traces do not allow to determine the video qaulityin an accurate manner...... after networking transport that includes losses and delays. In this work, we provide (i) an overview of frame dependencies that have to be taken into consideration when working with video traces, (ii) an algorithmic approach to combine traditional video traces and offset distortion traces to determine...... the video quality or distortion after lossy network transport, (iii) offset distortion and quality characteristics and (iv) the offset distortion trace format and tools to create offset distortion traces....
Full Text Available This work presents a novel indoor video surveillance system, capable of detecting the falls of humans. The proposed system can detect and evaluate human posture as well. To evaluate human movements, the background model is developed using the codebook method, and the possible position of moving objects is extracted using the background and shadow eliminations method. Extracting a foreground image produces more noise and damage in this image. Additionally, the noise is eliminated using morphological and size filters and this damaged image is repaired. When the image object of a human is extracted, whether or not the posture has changed is evaluated using the aspect ratio and height of a human body. Meanwhile, the proposed system detects a change of the posture and extracts the histogram of the object projection to represent the appearance. The histogram becomes the input vector of K-Nearest Neighbor (K-NN algorithm and is to evaluate the posture of the object. Capable of accurately detecting different postures of a human, the proposed system increases the fall detection accuracy. Importantly, the proposed method detects the posture using the frame ratio and the displacement of height in an image. Experimental results demonstrate that the proposed system can further improve the system performance and the fall down identification accuracy.
Brieva, Jorge; Moya-Albor, Ernesto
In this paper we present a new Eulerian phase-based motion magnification technique using the Hermite Transform (HT) decomposition that is inspired in the Human Vision System (HVS). We test our method in one sequence of the breathing of a newborn baby and on a video sequence that shows the heartbeat on the wrist. We detect and magnify the heart pulse applying our technique. Our motion magnification approach is compared to the Laplacian phase based approach by means of quantitative metrics (based on the RMS error and the Fourier transform) to measure the quality of both reconstruction and magnification. In addition a noise robustness analysis is performed for the two methods.
Ngo, Hau T.; Rakvic, Ryan N.; Broussard, Randy P.; Ives, Robert W.
FPGA devices with embedded DSP and memory blocks, and high-speed interfaces are ideal for real-time video processing applications. In this work, a hardware-software co-design approach is proposed to effectively utilize FPGA features for a prototype of an automated video surveillance system. Time-critical steps of the video surveillance algorithm are designed and implemented in the FPGAs logic elements to maximize parallel processing. Other non timecritical tasks are achieved by executing a high level language program on an embedded Nios-II processor. Pre-tested and verified video and interface functions from a standard video framework are utilized to significantly reduce development and verification time. Custom and parallel processing modules are integrated into the video processing chain by Altera's Avalon Streaming video protocol. Other data control interfaces are achieved by connecting hardware controllers to a Nios-II processor using Altera's Avalon Memory Mapped protocol.
Full Text Available Identifying inter-frame forgery is a hot topic in video forensics. In this paper, we propose a method based on the assumption that the optical flows are consistent in an original video, while in forgeries the consistency will be destroyed. We first extract optical flow from frames of videos and then calculate the optical flow consistency after normalization and quantization as distinguishing feature to identify inter-frame forgeries. We train the Support Vector Machine to classify original videos and video forgeries with optical flow consistency feature of some sample videos and test the classification accuracy in a large database. Experimental results show that the proposed method is efficient in classifying original videos and forgeries. Furthermore, the proposed method performs also pretty well in classifying frame insertion and frame deletion forgeries.
Full Text Available Video content has increased much on the Internet during last years. In spite of the efforts of different organizations and governments to increase the accessibility of websites, most multimedia content on the Internet is not accessible. This paper describes a system that contributes to make multimedia content more accessible on the Web, by automatically translating subtitles in oral language to SignWriting, a way of writing Sign Language. This system extends the functionality of a general web platform that can provide accessible web content for different needs. This platform has a core component that automatically converts any web page to a web page compliant with level AA of WAI guidelines. Around this core component, different adapters complete the conversion according to the needs of specific users. One adapter is the Deaf People Accessibility Adapter, which provides accessible web content for the Deaf, based on SignWritting. Functionality of this adapter has been extended with the video subtitle translator system. A first prototype of this system has been tested through different methods including usability and accessibility tests and results show that this tool can enhance the accessibility of video content available on the Web for Deaf people.
Allen, A. J.; Terry, J. L.; Garnier, D.; Stillerman, J. A.; Wurden, G. A.
A new system for routine digitization of video images is presently operating on the Alcator C-Mod tokamak. The PC-based system features high resolution video capture, storage, and retrieval. The captured images are stored temporarily on the PC, but are eventually written to CD. Video is captured from one of five filtered RS-170 CCD cameras at 30 frames per second (fps) with 640×480 pixel resolution. In addition, the system can digitize the output from a filtered Kodak Ektapro EM Digital Camera which captures images at 1000 fps with 239×192 resolution. Present views of this set of cameras include a wide angle and a tangential view of the plasma, two high resolution views of gas puff capillaries embedded in the plasma facing components, and a view of ablating, high speed Li pellets. The system is being used to study (1) the structure and location of visible emissions (including MARFEs) from the main plasma and divertor, (2) asymmetries in gas puff plumes due to flows in the scrape-off layer (SOL), and (3) the tilt and cigar-shaped spatial structure of the Li pellet ablation cloud.
The human motion contains valuable information in many situations and people frequently perform an unconscious analysis of the motion of other people to understand their actions, intentions, and state of mind. An automatic analysis of human motion will facilitate many applications and thus has...... received great interest from both industry and research communities. The focus of this thesis is on video-based analysis of human motion and the thesis presents work within three overall topics, namely foreground segmentation, action recognition, and human pose estimation. Foreground segmentation is often...... the first important step in the analysis of human motion. By separating foreground from background the subsequent analysis can be focused and efficient. This thesis presents a robust background subtraction method that can be initialized with foreground objects in the scene and is capable of handling...
Full Text Available Abstract Illumination changes cause challenging problems for video surveillance algorithms, as objects of interest become masked by changes in background appearance. It is desired for such algorithms to maintain a consistent perception of a scene regardless of illumination variation. This work introduces a concept we call BigBackground, which is a model for representing large, persistent scene features based on chromatic self-similarity. This model is found to comprise 50% to 90% of surveillance scenes. The large, stable regions represented by the model are used as reference points for performing illumination compensation. The presented compensation technique is demonstrated to decrease improper false-positive classification of background pixels by an average of 83% compared to the uncompensated case and by 25% to 43% compared to compensation techniques from the literature.
Full Text Available Students often feel disconnected from their introductory social welfare policy courses. Therefore, it is important that instructors employ engaging pedagogical methods in the classroom. A review of the literature reveals that a host of methods have been utilized to attempt to interest students in policy courses, but there is no mention of using internet-based videos in the social welfare policy classroom. This article describes how to select and use appropriate internet-based videos from websites such as YouTube and SnagFilms, to effectively engage students in social welfare policy courses. Four rules are offered for choosing videos based on emotional impact, brevity, and relevance to course topics. The selected videos should elicit students’ passions and stimulate critical thinking when used in concert with instructor-generated discussion questions, writing assignments, and small group dialogue. Examples of the process of choosing videos, discussion questions, and student reactions to the use of videos are provided.
Walton, James S.; Hallamasek, Karen G.
The value of high-speed imaging for making subjective assessments is widely recognized, but the inability to acquire useful data from image sequences in a timely fashion has severely limited the use of the technology. 4DVideo has created a foundation for a generic instrument that can capture kinematic data from high-speed images. The new system has been designed to acquire (1) two-dimensional trajectories of points; (2) three-dimensional kinematics of structures or linked rigid-bodies; and (3) morphological reconstructions of boundaries. The system has been designed to work with an unlimited number of cameras configured as nodes in a network, with each camera able to acquire images at 1000 frames per second (fps) or better, with a spatial resolution of 512 X 512 or better, and an 8-bit gray scale. However, less demanding configurations are anticipated. The critical technology is contained in the custom hardware that services the cameras. This hardware optimizes the amount of information stored, and maximizes the available bandwidth. The system identifies targets using an algorithm implemented in hardware. When complete, the system software will provide all of the functionality required to capture and process video data from multiple perspectives. Thereafter it will extract, edit and analyze the motions of finite targets and boundaries.
Full Text Available Detection of buildings and vegetation, and even more reconstruction of urban terrain from sequences of aerial images and videos is known to be a challenging task. It has been established that those methods that have as input a high-quality Digital Surface Model (DSM, are more straight-forward and produce more robust and reliable results than those image-based methods that require matching line segments or even whole regions. This motivated us to develop a new dense matching technique for DSM generation that is capable of simultaneous integration of multiple images in the reconstruction process. The DSMs generated by this new multi-image matching technique can be used for urban object extraction. In the first contribution of this paper, two examples of external sources of information added to the reconstruction pipeline will be shown. The GIS layers are used for recognition of streets and suppressing false alarms in the depth maps that were caused by moving vehicles while the near infrared channel is applied for separating vegetation from buildings. Three examples of data sets including both UAV-borne video sequences with a relatively high number of frames and high-resolution (10 cm ground sample distance data sets consisting of (few spatial-temporarily diverse images from large-format aerial frame cameras, will be presented. By an extensive quantitative evaluation of the Vaihingen block from the ISPRS benchmark on urban object detection, it will become clear that our procedure allows a straight-forward, efficient, and reliable instantiation of 3D city models.
Whitton, Nicola; Maclure, Maggie
Increasingly prevalent educational discourses promote the use of video games in schools and universities. At the same time, populist discourses persist, particularly in print media, which condemn video games because of putative negative effects on behaviour and socialisation. These contested discourses, we suggest, influence the acceptability of…
Full Text Available This paper presents a steganalytic approach against video steganography which modifies motion vector (MV in content adaptive manner. Current video steganalytic schemes extract features from fixed-length frames of the whole video and do not take advantage of the content diversity. Consequently, the effectiveness of the steganalytic feature is influenced by video content and the problem of cover source mismatch also affects the steganalytic performance. The goal of this paper is to propose a steganalytic method which can suppress the differences of statistical characteristics caused by video content. The given video is segmented to subsequences according to block’s motion in every frame. The steganalytic features extracted from each category of subsequences with close motion intensity are used to build one classifier. The final steganalytic result can be obtained by fusing the results of weighted classifiers. The experimental results have demonstrated that our method can effectively improve the performance of video steganalysis, especially for videos of low bitrate and low embedding ratio.
Full Text Available This paper introduces source-channel adaptive rate control (SARC, a new congestion control algorithm for layered video transmission in large multicast groups. In order to solve the well-known feedback implosion problem in large multicast groups, we first present a mechanism for filtering RTCP receiver reports sent from receivers to the whole session. The proposed filtering mechanism provides a classification of receivers according to a predefined similarity measure. An end-to-end source and FEC rate control based on this distributed feedback aggregation mechanism coupled with a video layered coding system is then described. The number of layers, their rate, and their levels of protection are adapted dynamically to aggregated feedbacks. The algorithms have been validated with the NS2 network simulator.
Machireddy, Archana; van Santen, Jan; Wilson, Jenny L; Myers, Julianne; Hadders-Algra, Mijna; Xubo Song
Cerebral palsy is a non-progressive neurological disorder occurring in early childhood affecting body movement and muscle control. Early identification can help improve outcome through therapy-based interventions. Absence of so-called "fidgety movements" is a strong predictor of cerebral palsy. Currently, infant limb movements captured through either video cameras or accelerometers are analyzed to identify fidgety movements. However both modalities have their limitations. Video cameras do not have the high temporal resolution needed to capture subtle movements. Accelerometers have low spatial resolution and capture only relative movement. In order to overcome these limitations, we have developed a system to combine measurements from both camera and sensors to estimate the true underlying motion using extended Kalman filter. The estimated motion achieved 84% classification accuracy in identifying fidgety movements using Support Vector Machine.
Irondi, Iheanyi; Wang, Qi; Grecos, Christos
Real-time HTTP streaming has gained global popularity for delivering video content over Internet. In particular, the recent MPEG-DASH (Dynamic Adaptive Streaming over HTTP) standard enables on-demand, live, and adaptive Internet streaming in response to network bandwidth fluctuations. Meanwhile, emerging is the new-generation video coding standard, H.265/HEVC (High Efficiency Video Coding) promises to reduce the bandwidth requirement by 50% at the same video quality when compared with the current H.264/AVC standard. However, little existing work has addressed the integration of the DASH and HEVC standards, let alone empirical performance evaluation of such systems. This paper presents an experimental HEVC-DASH system, which is a pull-based adaptive streaming solution that delivers HEVC-coded video content through conventional HTTP servers where the client switches to its desired quality, resolution or bitrate based on the available network bandwidth. Previous studies in DASH have focused on H.264/AVC, whereas we present an empirical evaluation of the HEVC-DASH system by implementing a real-world test bed, which consists of an Apache HTTP Server with GPAC, an MP4Client (GPAC) with open HEVC-based DASH client and a NETEM box in the middle emulating different network conditions. We investigate and analyze the performance of HEVC-DASH by exploring the impact of various network conditions such as packet loss, bandwidth and delay on video quality. Furthermore, we compare the Intra and Random Access profiles of HEVC coding with the Intra profile of H.264/AVC when the correspondingly encoded video is streamed with DASH. Finally, we explore the correlation among the quality metrics and network conditions, and empirically establish under which conditions the different codecs can provide satisfactory performance.
Lajoie, Susanne P.; Hmelo-Silver, Cindy; Wiseman, Jeffrey; Chan, Lap Ki; Lu, Jingyan; Khurana, Chesta; Cruz-Panesso, Ilian; Poitras, Eric; Kazemitabar, Maedeh
The goal of this study is to examine how to facilitate cross-cultural groups in problem-based learning (PBL) using online digital tools and videos. The PBL consisted of two video-based cases used to trigger student-learning issues about giving bad news to HIV-positive patients. Mixed groups of medical students from Canada and Hong Kong worked with…
Kumar, David Devraj
This paper is an invited adaptation of the IEEE Education Society Distinguished Lecture Approaches to Interactive Video Anchors in Problem-Based Science Learning. Interactive video anchors have a cognitive theory base, and they help to enlarge the context of learning with information-rich real-world situations. Carefully selected movie clips and…
de Barros, Rui Sergio Monteiro; Brito, Marcus Vinicius Henriques; de Brito, Marcelo Houat; de Aguiar Lédo Coutinho, Jean Vitor; Teixeira, Renan Kleber Costa; Yamaki, Vitor Nagai; da Silva Costa, Felipe Lobato; Somensi, Danusa Neves
The surgical microscope is an essential tool for microsurgery. Nonetheless, several promising alternatives are being developed, including endoscopes and laparoscopes with video systems. However, these alternatives have only been used for arterial anastomoses so far. The aim of this study was to evaluate the use of a low-cost video-assisted magnification system in end-to-side neurorrhaphy in rats. Forty rats were randomly divided into four matched groups: (1) normality (sciatic nerve was exposed but was kept intact); (2) denervation (fibular nerve was sectioned, and the proximal and distal stumps were sutured-transection without repair); (3) microscope; and (4) video system (fibular nerve was sectioned; the proximal stump was buried inside the adjacent musculature, and the distal stump was sutured to the tibial nerve). Microsurgical procedures were performed with guidance from a microscope or video system. We analyzed weight, nerve caliber, number of stitches, times required to perform the neurorrhaphy, muscle mass, peroneal functional indices, latency and amplitude, and numbers of axons. There were no significant differences in weight, nerve caliber, number of stitches, muscle mass, peroneal functional indices, or latency between microscope and video system groups. Neurorrhaphy took longer using the video system (P microscope group than in the video group. It is possible to perform an end-to-side neurorrhaphy in rats through video system magnification. The success rate is satisfactory and comparable with that of procedures performed under surgical microscopes. Copyright © 2017 Elsevier Inc. All rights reserved.
López Velasco, Juan Pedro; Rodrigo Ferrán, Juan Antonio; Jiménez Bermejo, David; Menendez Garcia, Jose Manuel
Assessing video quality is a complex task. While most pixel-based metrics do not present enough correlation between objective and subjective results, algorithms need to correspond to human perception when analyzing quality in a video sequence. For analyzing the perceived quality derived from concrete video artifacts in determined region of interest we present a novel methodology for generating test sequences which allow the analysis of impact of each individual distortion. Through results obt...
Jung Han-Seung; Lee Young-Yoon; Lee Sang Uk
Watermarking for video sequences should consider additional attacks, such as frame averaging, frame-rate change, frame shuffling or collusion attacks, as well as those of still images. Also, since video is a sequence of analogous images, video watermarking is subject to interframe collusion. In order to cope with these attacks, we propose a scene-based temporal watermarking algorithm. In each scene, segmented by scene-change detection schemes, a watermark is embedded temporally to one-dimens...
Qi Wang; Zhaohong Li; Zhenzhen Zhang; Qinglong Ma
Identifying inter-frame forgery is a hot topic in video forensics. In this paper, we propose a method based on the assumption that the optical flows are consistent in an original video, while in forgeries the consistency will be destroyed. We first extract optical flow from frames of videos and then calculate the optical flow consistency after normalization and quantization as distinguishing feature to identify inter-frame forgeries. We train the Support Vector Machine to classify original vi...
Donghui LIU; Lina TONG; Jiashuo WANG; Xiaoyun SUN; Xiaoying ZUO; Yakun DU; Zhenzhou WANG
An online video edge detection device for bottle caps is designed and implemented using OV7670 video module and FPGA based control unit. By Verilog language programming, the device realizes the menu type parametric setting of the external VGA display, and completes the Roberts edge detection of real-time video image, which improves the speed of image processing. By improving the detection algorithm, the noise is effectively suppressed, and clear and coherent edge images are derived. The desig...
Full Text Available With the continuous expansion of the amount of data with time in mobile video applications such as cloud video surveillance (CVS, the increasing energy consumption in video data centers has drawn widespread attention for the past several years. Addressing the issue of reducing energy consumption, we propose a low energy consumption storage method specially designed for CVS systems based onthe service level agreement (SLA classification. A novel SLA with an extra parameter of access time period is proposed and then utilized as a criterion for dividing virtual machines (VMs and data storage nodes into different classifications. Tasks can be scheduled in real time for running on the homologous VMs and data storage nodes according to their access time periods. Any nodes whose access time periods do not encompass the current time will be placed into the energy saving state while others are in normal state with the capability of undertaking tasks. As a result, overall electric energy consumption in data centers is reduced while the SLA is fulfilled. To evaluate the performance, we compare the method with two related approaches using the Hadoop Distributed File System (HDFS. The results show the superiority and effectiveness of our method.
Sullivan, Gary J.
The extension of video applications to enable 3D perception, which typically is considered to include a stereo viewing experience, is emerging as a mass market phenomenon, as is evident from the recent prevalence of 3D major cinema title releases. For high quality 3D video to become a commonplace user experience beyond limited cinema distribution, adoption of an interoperable coded 3D digital video format will be needed. Stereo-view video can also be studied as a special case of the more general technologies of multiview and "free-viewpoint" video systems. The history of standardization work on this topic is actually richer than people may typically realize. The ISO/IEC Moving Picture Experts Group (MPEG), in particular, has been developing interoperability standards to specify various such coding schemes since the advent of digital video as we know it. More recently, the ITU-T Visual Coding Experts Group (VCEG) has been involved as well in the Joint Video Team (JVT) work on development of 3D features for H.264/14496-10 Advanced Video Coding, including Multiview Video Coding (MVC) extensions. This paper surveys the prior, ongoing, and anticipated future standardization efforts on this subject to provide an overview and historical perspective on feasible approaches to 3D and multiview video coding.
Jubran, Mohammad K; Bansal, Manu; Kondi, Lisimachos P; Grover, Rohan
In this paper, we propose an optimal strategy for the transmission of scalable video over packet-based multiple-input multiple-output (MIMO) systems. The scalable extension of H.264/AVC that provides a combined temporal, quality and spatial scalability is used. For given channel conditions, we develop a method for the estimation of the distortion of the received video and propose different error concealment schemes. We show the accuracy of our distortion estimation algorithm in comparison with simulated wireless video transmission with packet errors. In the proposed MIMO system, we employ orthogonal space-time block codes (O-STBC) that guarantee independent transmission of different symbols within the block code. In the proposed constrained bandwidth allocation framework, we use the estimated end-to-end decoder distortion to optimally select the application layer parameters, i.e., quantization parameter (QP) and group of pictures (GOP) size, and physical layer parameters, i.e., rate-compatible turbo (RCPT) code rate and symbol constellation. Results show the substantial performance gain by using different symbol constellations across the scalable layers as compared to a fixed constellation.
Huang, Zhiwu; Shan, Shiguang; Wang, Ruiping; Zhang, Haihong; Lao, Shihong; Kuerban, Alifu; Chen, Xilin
Face recognition with still face images has been widely studied, while the research on video-based face recognition is inadequate relatively, especially in terms of benchmark datasets and comparisons. Real-world video-based face recognition applications require techniques for three distinct scenarios: 1) Videoto-Still (V2S); 2) Still-to-Video (S2V); and 3) Video-to-Video (V2V), respectively, taking video or still image as query or target. To the best of our knowledge, few datasets and evaluation protocols have benchmarked for all the three scenarios. In order to facilitate the study of this specific topic, this paper contributes a benchmarking and comparative study based on a newly collected still/video face database, named COX(1) Face DB. Specifically, we make three contributions. First, we collect and release a largescale still/video face database to simulate video surveillance with three different video-based face recognition scenarios (i.e., V2S, S2V, and V2V). Second, for benchmarking the three scenarios designed on our database, we review and experimentally compare a number of existing set-based methods. Third, we further propose a novel Point-to-Set Correlation Learning (PSCL) method, and experimentally show that it can be used as a promising baseline method for V2S/S2V face recognition on COX Face DB. Extensive experimental results clearly demonstrate that video-based face recognition needs more efforts, and our COX Face DB is a good benchmark database for evaluation.
H. R. Hosseinpoor
Full Text Available There is an increasingly large number of applications for Unmanned Aerial Vehicles (UAVs from monitoring, mapping and target geolocation. However, most of commercial UAVs are equipped with low-cost navigation sensors such as C/A code GPS and a low-cost IMU on board, allowing a positioning accuracy of 5 to 10 meters. This low accuracy cannot be used in applications that require high precision data on cm-level. This paper presents a precise process for geolocation of ground targets based on thermal video imagery acquired by small UAV equipped with RTK GPS. The geolocation data is filtered using an extended Kalman filter, which provides a smoothed estimate of target location and target velocity. The accurate geo-locating of targets during image acquisition is conducted via traditional photogrammetric bundle adjustment equations using accurate exterior parameters achieved by on board IMU and RTK GPS sensors, Kalman filtering and interior orientation parameters of thermal camera from pre-flight laboratory calibration process. The results of this study compared with code-based ordinary GPS, indicate that RTK observation with proposed method shows more than 10 times improvement of accuracy in target geolocation.
Recent years have seen significant investment and increasingly effective use of Video Analytics (VA) systems to detect intrusion or attacks in sterile areas. Currently there are a number of manufacturers who have achieved the Imagery Library for Intelligent Detection System (i-LIDS) primary detection classification performance standard for the sterile zone detection scenario. These manufacturers have demonstrated the performance of their systems under evaluation conditions using an uncompressed evaluation video. In this paper we consider the effect on the detection rate of an i-LIDS primary approved sterile zone system using compressed sterile zone scenario video clips as the input. The preliminary test results demonstrate a change time of detection rate with compression as the time to alarm increased with greater compression. Initial experiments suggest that the detection performance does not linearly degrade as a function of compression ratio. These experiments form a starting point for a wider set of planned trials that the Home Office will carry out over the next 12 months.
Full Text Available A novel video shot boundary recognition method is proposed, which includes two stages of video feature extraction and shot boundary recognition. Firstly, we use adaptive locality preserving projections (ALPP to extract video feature. Unlike locality preserving projections, we define the discriminating similarity with mode prior probabilities and adaptive neighborhood selection strategy which make ALPP more suitable to preserve the local structure and label information of the original data. Secondly, we use an optimized multiple kernel support vector machine to classify video frames into boundary and nonboundary frames, in which the weights of different types of kernels are optimized with an ant colony optimization method. Experimental results show the effectiveness of our method.
Bavelier, D; Achtman, R L; Mani, M; Föcker, J
Over the past few years, the very act of playing action video games has been shown to enhance several different aspects of visual selective attention, yet little is known about the neural mechanisms...
Liang, Xuan; Xu, Huosheng; Chen, Xi; Fan, Yu
This paper research on a high definition Ship-borne radar and video monitoring system which requires multi-channel TV video and radar video encoding and decoding ability. The real time data transferring is based on RTP/RTCP protocol with guarantee of QoS. In this paper, we propose an effective Feedback control for real time video stream to combine with forward error correction (FEC). In our scheme, the server multicasts the video in parallel with FEC packets and adaptive RTCP feedback control of the video stream. On the server side, we analyze and optimize the number of streams and FEC packets to meet a certain residual loss requirement. For every RTT round trip time, the sender sends a forward RTCP control packet. On the receiver side, we analyze the optimal combination of FEC and packets to minimize its loss. Upon the receipt of a backward RTCP packet with the packet loss ratio from the receiver, the output rate of the source is adjusted. Additive increase and multiplicative decrease (AIMD) model can achieve efficient congestion preventing when the accurate available bandwidth is estimated by the backward RTCP packet.
Conkey, Curtis A; Bowers, Clint; Cannon-Bowers, Janis; Sanchez, Alicia
Multimedia training methods have traditionally relied heavily on video-based technologies, and significant research has shown these to be very effective training tools. However, production of video is time and resource intensive. Machinima technologies are based on videogaming technology. Machinima technology allows videogame technology to be manipulated into unique scenarios based on entertainment or training and practice applications. Machinima is the converting of these unique scenarios into video vignettes that tell a story. These vignettes can be interconnected with branching points in much the same way that education videos are interconnected as vignettes between decision points. This study addressed the effectiveness of machinima-based soft-skills education using avatar actors versus the traditional video teaching application using human actors in the training of frontline healthcare workers. This research also investigated the difference between presence reactions when using avatar actor-produced video vignettes as compared with human actor-produced video vignettes. Results indicated that the difference in training and/or practice effectiveness is statistically insignificant for presence, interactivity, quality, and the skill of assertiveness. The skill of active listening presented a mixed result indicating the need for careful attention to detail in situations where body language and facial expressions are critical to communication. This study demonstrates that a significant opportunity exists for the exploitation of avatar actors in video-based instruction.
Li, Jia; Xia, Changqun; Chen, Xiaowu
Image-based salient object detection (SOD) has been extensively studied in past decades. However, video-based SOD is much less explored due to the lack of large-scale video datasets within which salient objects are unambiguously defined and annotated. Toward this end, this paper proposes a video-based SOD dataset that consists of 200 videos. In constructing the dataset, we manually annotate all objects and regions over 7,650 uniformly sampled keyframes and collect the eye-tracking data of 23 subjects who free-view all videos. From the user data, we find that salient objects in a video can be defined as objects that consistently pop-out throughout the video, and objects with such attributes can be unambiguously annotated by combining manually annotated object/region masks with eye-tracking data of multiple subjects. To the best of our knowledge, it is currently the largest dataset for videobased salient object detection. Based on this dataset, this paper proposes an unsupervised baseline approach for video-based SOD by using saliencyguided stacked autoencoders. In the proposed approach, multiple spatiotemporal saliency cues are first extracted at the pixel, superpixel and object levels. With these saliency cues, stacked autoencoders are constructed in an unsupervised manner that automatically infers a saliency score for each pixel by progressively encoding the high-dimensional saliency cues gathered from the pixel and its spatiotemporal neighbors. In experiments, the proposed unsupervised approach is compared with 31 state-of-the-art models on the proposed dataset and outperforms 30 of them, including 19 imagebased classic (unsupervised or non-deep learning) models, six image-based deep learning models, and five video-based unsupervised models. Moreover, benchmarking results show that the proposed dataset is very challenging and has the potential to boost the development of video-based SOD.
Lin, Yu-Tzu; Yen, Bai-Jang; Chang, Chia-Hu; Lee, Greg C.; Lin, Yu-Chih
Purpose: The purpose of this paper is to propose an indexing and teaching focus mining system for lecture videos recorded in an unconstrained environment. Design/methodology/approach: By applying the proposed algorithms in this paper, the slide structure can be reconstructed by extracting slide images from the video. Instead of applying…
..., ``Nintendo''). The products accused of infringing the asserted patents are gaming systems and related... From the Federal Register Online via the Government Publishing Office INTERNATIONAL TRADE COMMISSION Certain Video Game Systems and Wireless Controllers and Components Thereof; Commission...
National Aeronautics and Space Administration — In this project, the development of a novel panoramic, stereoscopic video system was proposed. The proposed system, which contains no moving parts, uses three-fixed...
Fu, Chang-Hong; Chan, Yui-Lam; Ip, Tak-Piu; Siu, Wan-Chi
MPEG digital video is becoming ubiquitous for video storage and communications. It is often desirable to perform various video cassette recording (VCR) functions such as backward playback in MPEG videos. However, the predictive processing techniques employed in MPEG severely complicate the backward-play operation. A straightforward implementation of backward playback is to transmit and decode the whole group-of-picture (GOP), store all the decoded frames in the decoder buffer, and play the decoded frames in reverse order. This approach requires a significant buffer in the decoder, which depends on the GOP size, to store the decoded frames. This approach could not be possible in a severely constrained memory requirement. Another alternative is to decode the GOP up to the current frame to be displayed, and then go back to decode the GOP again up to the next frame to be displayed. This approach does not need the huge buffer, but requires much higher bandwidth of the network and complexity of the decoder. In this paper, we propose a macroblock-based algorithm for an efficient implementation of the MPEG video streaming system to provide backward playback over a network with the minimal requirements on the network bandwidth and the decoder complexity. The proposed algorithm classifies macroblocks in the requested frame into backward macroblocks (BMBs) and forward/backward macroblocks (FBMBs). Two macroblock-based techniques are used to manipulate different types of macroblocks in the compressed domain and the server then sends the processed macroblocks to the client machine. For BMBs, a VLC-domain technique is adopted to reduce the number of macroblocks that need to be decoded by the decoder and the number of bits that need to be sent over the network in the backward-play operation. We then propose a newly mixed VLC/DCT-domain technique to handle FBMBs in order to further reduce the computational complexity of the decoder. With these compressed-domain techniques, the
Full Text Available In recent years, video surveillance and monitoring have gained importance because of security and safety concerns. Banks, borders, airports, stores, and parking areas are the important application areas. There are two main parts in scenario recognition: Low level processing, including moving object detection and object tracking, and feature extraction. We have developed new features through this work which are RUD (relative upper density, RMD (relative middle density and RLD (relative lower density, and we have used other features such as aspect ratio, width, height, and color of the object. High level processing, including event start-end point detection, activity detection for each frame and scenario recognition for sequence of images. This part is the focus of our research, and different pattern recognition and classification methods are implemented and experimental results are analyzed. We looked into several methods of classification which are decision tree, frequency domain classification, neural network-based classification, Bayes classifier, and pattern recognition methods, which are control charts, and hidden Markov models. The control chart approach, which is a decision methodology, gives more promising results than other methodologies. Overlapping between events is one of the problems, hence we applied fuzzy logic technique to solve this problem. After using this method the total accuracy increased from 95.6 to 97.2.
Yang, Yongchao; Dorn, Charles; Mancini, Tyler; Talken, Zachary; Kenyon, Garrett; Farrar, Charles; Mascareñas, David
Experimental or operational modal analysis traditionally requires physically-attached wired or wireless sensors for vibration measurement of structures. This instrumentation can result in mass-loading on lightweight structures, and is costly and time-consuming to install and maintain on large civil structures, especially for long-term applications (e.g., structural health monitoring) that require significant maintenance for cabling (wired sensors) or periodic replacement of the energy supply (wireless sensors). Moreover, these sensors are typically placed at a limited number of discrete locations, providing low spatial sensing resolution that is hardly sufficient for modal-based damage localization, or model correlation and updating for larger-scale structures. Non-contact measurement methods such as scanning laser vibrometers provide high-resolution sensing capacity without the mass-loading effect; however, they make sequential measurements that require considerable acquisition time. As an alternative non-contact method, digital video cameras are relatively low-cost, agile, and provide high spatial resolution, simultaneous, measurements. Combined with vision based algorithms (e.g., image correlation, optical flow), video camera based measurements have been successfully used for vibration measurements and subsequent modal analysis, based on techniques such as the digital image correlation (DIC) and the point-tracking. However, they typically require speckle pattern or high-contrast markers to be placed on the surface of structures, which poses challenges when the measurement area is large or inaccessible. This work explores advanced computer vision and video processing algorithms to develop a novel video measurement and vision-based operational (output-only) modal analysis method that alleviate the need of structural surface preparation associated with existing vision-based methods and can be implemented in a relatively efficient and autonomous manner with little
Matsumoto, Jumpei; Urakawa, Susumu; Takamura, Yusaku; Malcher-Lopes, Renato; Hori, Etsuro; Tomaz, Carlos; Ono, Taketoshi; Nishijo, Hisao
A large number of studies have analyzed social and sexual interactions between rodents in relation to neural activity. Computerized video analysis has been successfully used to detect numerous behaviors quickly and objectively; however, to date only 2D video recording has been used, which cannot determine the 3D locations of animals and encounters difficulties in tracking animals when they are overlapping, e.g., when mounting. To overcome these limitations, we developed a novel 3D video analysis system for examining social and sexual interactions in rats. A 3D image was reconstructed by integrating images captured by multiple depth cameras at different viewpoints. The 3D positions of body parts of the rats were then estimated by fitting skeleton models of the rats to the 3D images using a physics-based fitting algorithm, and various behaviors were recognized based on the spatio-temporal patterns of the 3D movements of the body parts. Comparisons between the data collected by the 3D system and those by visual inspection indicated that this system could precisely estimate the 3D positions of body parts for 2 rats during social and sexual interactions with few manual interventions, and could compute the traces of the 2 animals even during mounting. We then analyzed the effects of AM-251 (a cannabinoid CB1 receptor antagonist) on male rat sexual behavior, and found that AM-251 decreased movements and trunk height before sexual behavior, but increased the duration of head-head contact during sexual behavior. These results demonstrate that the use of this 3D system in behavioral studies could open the door to new approaches for investigating the neuroscience of social and sexual behavior.
Full Text Available A large number of studies have analyzed social and sexual interactions between rodents in relation to neural activity. Computerized video analysis has been successfully used to detect numerous behaviors quickly and objectively; however, to date only 2D video recording has been used, which cannot determine the 3D locations of animals and encounters difficulties in tracking animals when they are overlapping, e.g., when mounting. To overcome these limitations, we developed a novel 3D video analysis system for examining social and sexual interactions in rats. A 3D image was reconstructed by integrating images captured by multiple depth cameras at different viewpoints. The 3D positions of body parts of the rats were then estimated by fitting skeleton models of the rats to the 3D images using a physics-based fitting algorithm, and various behaviors were recognized based on the spatio-temporal patterns of the 3D movements of the body parts. Comparisons between the data collected by the 3D system and those by visual inspection indicated that this system could precisely estimate the 3D positions of body parts for 2 rats during social and sexual interactions with few manual interventions, and could compute the traces of the 2 animals even during mounting. We then analyzed the effects of AM-251 (a cannabinoid CB1 receptor antagonist on male rat sexual behavior, and found that AM-251 decreased movements and trunk height before sexual behavior, but increased the duration of head-head contact during sexual behavior. These results demonstrate that the use of this 3D system in behavioral studies could open the door to new approaches for investigating the neuroscience of social and sexual behavior.
Fadl, Sondos M; Han, Qi; Li, Qiong
Nowadays, surveillance systems are used to control crimes. Therefore, the authenticity of digital video increases the accuracy of deciding to admit the digital video as legal evidence or not. Inter-frame duplication forgery is the most common type of video forgery methods. However, many existing methods have been proposed for detecting this type of forgery and these methods require high computational time and impractical. In this study, we propose an efficient inter-frame duplication detection algorithm based on standard deviation of residual frames. Standard deviation of residual frame is applied to select some frames and ignore others, which represent a static scene. Then, the entropy of discrete cosine transform coefficients is calculated for each selected residual frame to represent its discriminating feature. Duplicated frames are then detected exactly using subsequence feature analysis. The experimental results demonstrated that the proposed method is effective to identify inter-frame duplication forgery with localization and acceptable running time. © 2017 American Academy of Forensic Sciences.
Wilson, Kaitlyn P.
Purpose: Video modeling is an intervention strategy that has been shown to be effective in improving the social and communication skills of students with autism spectrum disorders, or ASDs. The purpose of this tutorial is to outline empirically supported, step-by-step instructions for the use of video modeling by school-based speech-language…
Bryan, Craig J.; Dhillon-Davis, Luther E.; Dhillon-Davis, Kieran K.
In light of continuing concerns about iatrogenic effects associated with suicide prevention efforts utilizing video-based media, the impact of emotionally-charged videos on two vulnerable subgroups--suicidal viewers and suicide survivors--was explored. Following participation in routine suicide education as a part of the U.S. Air Force Suicide…
Geradts, Zeno J.; Merlijn, Menno; de Groot, Gert; Bijhold, Jurrien
The gait parameters of eleven subjects were evaluated to provide data for recognition purposes of subjects. Video images of these subjects were acquired in frontal, transversal, and sagittal (a plane parallel to the median of the body) view. The subjects walked by at their usual walking speed. The measured parameters were hip, knee and ankle joint angle and their time averaged values, thigh, foot and trunk angle, step length and width, cycle time and walking speed. Correlation coefficients within and between subjects for the hip, knee and ankle rotation pattern in the sagittal aspect and for the trunk rotation pattern in the transversal aspect were almost similar. (were similar or were almost identical) This implies that the intra and inter individual variance were equal. Therefore, these gait parameters could not distinguish between subjects. A simple ANOVA with a follow-up test was used to detect significant differences for the mean hip, knee and ankle joint angle, thigh angle, step length, step width, walking speed, cycle time and foot angle. The number of significant differences between subjects defined the usefulness of the gait parameter. The parameter with the most significant difference between subjects was the foot angle (64 % - 73 % of the maximal attainable significant differences), followed by the time average hip joint angle (58 %) and the step length (45 %). The other parameters scored less than 25 %, which is poor for recognition purposes. The use of gait for identification purposes it not yet possible based on this research.
M. M. Blagoveshchenskaya
Full Text Available Summary. The most important operation of granular mixed fodder production is molding process. Properties of granular mixed fodder are defined during this process. They determine the process of production and final product quality. The possibility of digital video camera usage as intellectual sensor for control system in process of production is analyzed in the article. The developed parametric model of the process of bundles molding from granular fodder mass is presented in the paper. Dynamic characteristics of the molding process were determined. A mathematical model of motion of bundle of granular fodder mass after matrix holes was developed. The developed mathematical model of the automatic control system (ACS with the use of etalon video frame as the set point in the MATLAB software environment was shown. As a parameter of the bundles molding process it is proposed to use the value of the specific area defined in the mathematical treatment of the video frame. The algorithms of the programs to determine the changes in structural and mechanical properties of the feed mass in video frames images were developed. Digital video shooting of various modes of the molding machine was carried out and after the mathematical processing of video the transfer functions for use as a change of adjustable parameters of the specific area were determined. Structural and functional diagrams of the system of regulation of the food bundles molding process with the use of digital camcorders were built and analyzed. Based on the solution of the equations of fluid dynamics mathematical model of bundle motion after leaving the hole matrix was obtained. In addition to its viscosity, creep property was considered that is characteristic of the feed mass. The mathematical model ACS of the bundles molding process allowing to investigate transient processes which occur in the control system that uses a digital video camera as the smart sensor was developed in Simulink
Martini, Maria G.; Villarini, Barbara; Fiorucci, Federico
In image and video compression and transmission, it is important to rely on an objective image/video quality metric which accurately represents the subjective quality of processed images and video sequences. In some scenarios, it is also important to evaluate the quality of the received video sequence with minimal reference to the transmitted one. For instance, for quality improvement of video transmission through closed-loop optimisation, the video quality measure can be evaluated at the receiver and provided as feedback information to the system controller. The original image/video sequence--prior to compression and transmission--is not usually available at the receiver side, and it is important to rely at the receiver side on an objective video quality metric that does not need reference or needs minimal reference to the original video sequence. The observation that the human eye is very sensitive to edge and contour information of an image underpins the proposal of our reduced reference (RR) quality metric, which compares edge information between the distorted and the original image. Results highlight that the metric correlates well with subjective observations, also in comparison with commonly used full-reference metrics and with a state-of-the-art RR metric.
... systems for the delivery of video programming. 63.02 Section 63.02 Telecommunication FEDERAL... systems for the delivery of video programming. (a) Any common carrier is exempt from the requirements of... with respect to the establishment or operation of a system for the delivery of video programming. ...
... COMMISSION In the Matter of Certain Video Game Systems and Wireless Controllers and Components Thereof... importation, and the sale within the United States after importation of certain video game systems and... importation of certain video game systems and wireless controllers and components thereof that infringe one or...
Raiff, Bethany R; Jarvis, Brantley P; Rapoza, Darion
Video games may serve as an ideal platform for developing and implementing technology-based contingency management (CM) interventions for smoking cessation as they can be used to address a number of barriers to the utilization of CM (e.g., replacing monetary rewards with virtual game-based rewards). However, little is known about the relationship between video game playing and cigarette smoking. The current study determined the prevalence of video game use, video game practices, and the acceptability of a video game-based CM intervention for smoking cessation among adult smokers and nonsmokers, including health care professionals. In an online survey, participants (N = 499) answered questions regarding their cigarette smoking and video game playing practices. Participants also reported if they believed a video game-based CM intervention could motivate smokers to quit and if they would recommend such an intervention. Nearly half of the participants surveyed reported smoking cigarettes, and among smokers, 74.5% reported playing video games. Video game playing was more prevalent in smokers than nonsmokers, and smokers reported playing more recently, for longer durations each week, and were more likely to play social games than nonsmokers. Most participants (63.7%), including those who worked as health care professionals, believed that a video game-based CM intervention would motivate smokers to quit and would recommend such an intervention to someone trying to quit (67.9%). Our findings suggest that delivering technology-based smoking cessation interventions via video games has the potential to reach substantial numbers of smokers and that most smokers, nonsmokers, and health care professionals endorsed this approach.
Mota, Paulo; Carvalho, Nuno; Carvalho-Dias, Emanuel; João Costa, Manuel; Correia-Pinto, Jorge; Lima, Estevão
Since the end of the XIX century, teaching of surgery has remained practically unaltered until now. With the dawn of video-assisted laparoscopy, surgery has faced new technical and learning challenges. Due to technological advances, from Internet access to portable electronic devices, the use of online resources is part of the educational armamentarium. In this respect, videos have already proven to be effective and useful, however the best way to benefit from these tools is still not clearly defined. To assess the importance of video-based learning, using an electronic questionnaire applied to residents and specialists of different surgical fields. Importance of video-based learning was assessed in a sample of 141 subjects, using a questionnaire distributed by a GoogleDoc online form. We found that 98.6% of the respondents have already used videos to prepare for surgery. When comparing video sources by formation status, residents were found to use Youtube significantly more often than specialists (p didactic illustrations and procedure narration than specialists (p < 0.001). On the other hand, specialists prized surgeon's technical skill and the presence of tips and tricks much more than residents (p < 0.001). Video-based learning is currently a hallmark of surgical preparation among residents and specialists working in Portugal. Based on these findings we believe that the creation of quality and scientifically accurate videos, and subsequent compilation in available video-libraries appears to be the future landscape for video-based learning. Copyright © 2017 Association of Program Directors in Surgery. Published by Elsevier Inc. All rights reserved.
Damsted, Camma; Larsen, L H; Nielsen, R.O.
and video time frame at initial contact during treadmill running using two-dimensional (2D) video recordings. METHODS: Thirty-one recreational runners were recorded twice, 1 week apart, with a high-speed video camera. Two blinded raters evaluated each video twice with an interval of at least 14 days...
Full Text Available Wavelet coding has been shown to achieve better compression than DCT coding and moreover allows scalability. 2D DWT can be easily extended to 3D and thus applied to video coding. However, 3D subband coding of video suffers from two drawbacks. The first is the amount of memory required for coding large 3D blocks; the second is the lack of temporal quality due to the sequence temporal splitting. In fact, 3D block-based video coders produce jerks. They appear at blocks temporal borders during video playback. In this paper, we propose a new temporal scan-based wavelet transform method for video coding combining the advantages of wavelet coding (performance, scalability with acceptable reduced memory requirements, no additional CPU complexity, and avoiding jerks. We also propose an efficient quality allocation procedure to ensure a constant quality over time.
Parisot, Christophe; Antonini, Marc; Barlaud, Michel
Wavelet coding has been shown to achieve better compression than DCT coding and moreover allows scalability. 2D DWT can be easily extended to 3D and thus applied to video coding. However, 3D subband coding of video suffers from two drawbacks. The first is the amount of memory required for coding large 3D blocks; the second is the lack of temporal quality due to the sequence temporal splitting. In fact, 3D block-based video coders produce jerks. They appear at blocks temporal borders during video playback. In this paper, we propose a new temporal scan-based wavelet transform method for video coding combining the advantages of wavelet coding (performance, scalability) with acceptable reduced memory requirements, no additional CPU complexity, and avoiding jerks. We also propose an efficient quality allocation procedure to ensure a constant quality over time.
Panayides, A S; Pattichis, M S; Constantinides, A G; Pattichis, C S
The emergence of the new, High Efficiency Video Coding (HEVC) standard, combined with wide deployment of 4G wireless networks, will provide significant support toward the adoption of mobile-health (m-health) medical video communication systems in standard clinical practice. For the first time since the emergence of m-health systems and services, medical video communication systems can be deployed that can rival the standards of in-hospital examinations. In this paper, we provide a thorough overview of today's advancements in the field, discuss existing approaches, and highlight the future trends and objectives.
Full Text Available Wireless capsule endoscopy (WCE enables a physician to diagnose a patient's digestive system without surgical procedures. However, it takes 1-2 hours for a gastroenterologist to examine the video. To speed up the review process, a number of analysis techniques based on machine vision have been proposed by computer science researchers. In order to train a machine to understand the semantics of an image, the image contents need to be translated into numerical form first. The numerical form of the image is known as image abstraction. The process of selecting relevant image features is often determined by the modality of medical images and the nature of the diagnoses. For example, there are radiographic projection-based images (e.g., X-rays and PET scans, tomography-based images (e.g., MRT and CT scans, and photography-based images (e.g., endoscopy, dermatology, and microscopic histology. Each modality imposes unique image-dependent restrictions for automatic and medically meaningful image abstraction processes. In this paper, we review the current development of machine-vision-based analysis of WCE video, focusing on the research that identifies specific gastrointestinal (GI pathology and methods of shot boundary detection.
Laws, Priscilla W.; Willis, Maxine C.; Jackson, David P.; Koenig, Kathleen; Teese, Robert
Ever since the first generalized computer-assisted instruction system (PLATO1) was introduced over 50 years ago, educators have been adding computer-based materials to their classes. Today many textbooks have complete online versions that include video lectures and other supplements. In the past 25 years the web has fueled an explosion of online homework and course management systems, both as blended learning and online courses. Meanwhile, introductory physics instructors have been implementing new approaches to teaching based on the outcomes of Physics Education Research (PER). A common theme of PER-based instruction has been the use of active-learning strategies designed to help students overcome alternative conceptions that they often bring to the study of physics.2 Unfortunately, while classrooms have become more active, online learning typically relies on passive lecture videos or Kahn-style3 tablet drawings. To bring active learning online, the LivePhoto Physics Group has been developing Interactive Video Vignettes (IVVs) that add interactivity and PER-based elements to short presentations. These vignettes incorporate web-based video activities that contain interactive elements and typically require students to make predictions and analyze real-world phenomena.
Musdal, Hande; Shiner, Brian; Chen, Techieh; Ceyhan, Mehmet E; Watts, Bradley V; Benneyan, James
Post-traumatic stress disorder (PTSD) is associated with poor health but there is a gap between need and receipt of care. It is useful to understand where to optimally locate in-person care and where video-based PTSD care would be most useful to minimize access to care barriers, care outside the Veterans Affairs system, and total costs. We developed a service location systems engineering model based on 2010 to 2020 projected care needs for veterans across New England to help determine where to best locate and use in-person and video-based care. This analysis determined specific locations and capacities of each type of PTSD care relative to patient home locations to help inform allocation of mental health resources. Not surprisingly Massachusetts, Connecticut, and Rhode Island are well suited for in-person care, whereas some rural areas of Maine, Vermont, and New Hampshire where in-patient services are infeasible could be better served by video-based care than external care, if the latter is even available. Results in New England alone suggest a potential $3,655,387 reduction in average annual total costs by shifting 9.73% of care to video-based treatment, with an average 12.6 miles travel distance for the remaining in-person care. Reprint & Copyright © 2014 Association of Military Surgeons of the U.S.
This paper summarizes research conducted on three computer-based video models’ effectiveness for learning based on memory and comprehension. In this quantitative study, a two-minute video presentation was created and played back in three different types of media players, for a sample of eighty-seven college freshman. The three players evaluated include a standard QuickTime video/audio player, a QuickTime player with embedded triggers that launched HTML-based study guide pages, and a Macromedi...
Ruolin, Zhu; Jianbo, Liu; Yuan, Zhang; Xiaoyu, Wu
The technology of computer vision is used in the training of military shooting. In order to overcome the limitation of the bullet holes recognition using Video Image Analysis that exists over-detection or leak-detection, this paper adopts the support vector machine algorithm and convolutional neural network to extract and recognize Bullet Holes in the digital video and compares their performance. It extracts HOG characteristics of bullet holes and train SVM classifier quickly, though the target is under outdoor environment. Experiments show that support vector machine algorithm used in this paper realize a fast and efficient extraction and recognition of bullet holes, improving the efficiency of shooting training.
Wang, Qia; Lobzhanidze, Alex; Jang, Hyun; Zeng, Wenjun; Shang, Yi; Yang, Jingyu
Smartphones are becoming popular nowadays not only because of its communication functionality but also, more importantly, its powerful sensing and computing capability. In this paper, we describe a novel and accurate image and video based remote target localization and tracking system using the Android smartphones, by leveraging its built-in sensors such as camera, digital compass, GPS, etc. Even though many other distance estimation or localization devices are available, our all-in-one, easy-to-use localization and tracking system on low cost and commodity smartphones is first of its kind. Furthermore, smartphones' exclusive user-friendly interface has been effectively taken advantage of by our system to facilitate low complexity and high accuracy. Our experimental results show that our system works accurately and efficiently.
Li, Hejian; An, Ping; Zhang, Zhaoyang
Three-dimensional (3-D) video brings people strong visual perspective experience, but also introduces large data and complexity processing problems. The depth estimation algorithm is especially complex and it is an obstacle for real-time system implementation. Meanwhile, high-resolution depth maps are necessary to provide a good image quality on autostereoscopic displays which deliver stereo content without the need for 3-D glasses. This paper presents a hardware implementation of a full high-definition (HD) depth estimation system that is capable of processing full HD resolution images with a maximum processing speed of 125 fps and a disparity search range of 240 pixels. The proposed field-programmable gate array (FPGA)-based architecture implements a fusion strategy matching algorithm for efficiency design. The system performs with high efficiency and stability by using a full pipeline design, multiresolution processing, synchronizers which avoid clock domain crossing problems, efficient memory management, etc. The implementation can be included in the video systems for live 3-D television applications and can be used as an independent hardware module in low-power integrated applications.
Vidakis, Nikolaos; Kavallakis, George; Triantafyllidis, Georgios
This paper presents a scheme of creating an emotion index of cover song music video clips by recognizing and classifying facial expressions of the artist in the video. More specifically, it fuses effective and robust algorithms which are employed for expression recognition, along with the use of ...... of a neural network system using the features extracted by the SIFT algorithm. Also we support the need of this fusion of different expression recognition algorithms, because of the way that emotions are linked to facial expressions in music video clips.......This paper presents a scheme of creating an emotion index of cover song music video clips by recognizing and classifying facial expressions of the artist in the video. More specifically, it fuses effective and robust algorithms which are employed for expression recognition, along with the use...
Irondi, Iheanyi; Wang, Qi; Grecos, Christos
The Dynamic Adaptive Streaming over HTTP (DASH) standard is becoming increasingly popular for real-time adaptive HTTP streaming of internet video in response to unstable network conditions. Integration of DASH streaming techniques with the new H.265/HEVC video coding standard is a promising area of research. The performance of HEVC-DASH systems has been previously evaluated by a few researchers using objective metrics, however subjective evaluation would provide a better measure of the user's Quality of Experience (QoE) and overall performance of the system. This paper presents a subjective evaluation of an HEVC-DASH system implemented in a hardware testbed. Previous studies in this area have focused on using the current H.264/AVC (Advanced Video Coding) or H.264/SVC (Scalable Video Coding) codecs and moreover, there has been no established standard test procedure for the subjective evaluation of DASH adaptive streaming. In this paper, we define a test plan for HEVC-DASH with a carefully justified data set employing longer video sequences that would be sufficient to demonstrate the bitrate switching operations in response to various network condition patterns. We evaluate the end user's real-time QoE online by investigating the perceived impact of delay, different packet loss rates, fluctuating bandwidth, and the perceived quality of using different DASH video stream segment sizes on a video streaming session using different video sequences. The Mean Opinion Score (MOS) results give an insight into the performance of the system and expectation of the users. The results from this study show the impact of different network impairments and different video segments on users' QoE and further analysis and study may help in optimizing system performance.
Khosla, Deepak; Moore, Christopher K.; Chelian, Suhas
This paper presents a bio-inspired method for spatio-temporal recognition in static and video imagery. It builds upon and extends our previous work on a bio-inspired Visual Attention and object Recognition System (VARS). The VARS approach locates and recognizes objects in a single frame. This work presents two extensions of VARS. The first extension is a Scene Recognition Engine (SCE) that learns to recognize spatial relationships between objects that compose a particular scene category in static imagery. This could be used for recognizing the category of a scene, e.g., office vs. kitchen scene. The second extension is the Event Recognition Engine (ERE) that recognizes spatio-temporal sequences or events in sequences. This extension uses a working memory model to recognize events and behaviors in video imagery by maintaining and recognizing ordered spatio-temporal sequences. The working memory model is based on an ARTSTORE1 neural network that combines an ART-based neural network with a cascade of sustained temporal order recurrent (STORE)1 neural networks. A series of Default ARTMAP classifiers ascribes event labels to these sequences. Our preliminary studies have shown that this extension is robust to variations in an object's motion profile. We evaluated the performance of the SCE and ERE on real datasets. The SCE module was tested on a visual scene classification task using the LabelMe2 dataset. The ERE was tested on real world video footage of vehicles and pedestrians in a street scene. Our system is able to recognize the events in this footage involving vehicles and pedestrians.
Bruellmann, D D; Tjaden, H; Schwanecke, U; Barth, P
We propose an augmented reality system for the reliable detection of root canals in video sequences based on a k-nearest neighbor color classification and introduce a simple geometric criterion for teeth. The new software was implemented using C++, Qt, and the image processing library OpenCV. Teeth are detected in video images to restrict the segmentation of the root canal orifices by using a k-nearest neighbor algorithm. The location of the root canal orifices were determined using Euclidean distance-based image segmentation. A set of 126 human teeth with known and verified locations of the root canal orifices was used for evaluation. The software detects root canals orifices for automatic classification of the teeth in video images and stores location and size of the found structures. Overall 287 of 305 root canals were correctly detected. The overall sensitivity was about 94 %. Classification accuracy for molars ranged from 65.0 to 81.2 % and from 85.7 to 96.7 % for premolars. The realized software shows that observations made in anatomical studies can be exploited to automate real-time detection of root canal orifices and tooth classification with a software system. Automatic storage of location, size, and orientation of the found structures with this software can be used for future anatomical studies. Thus, statistical tables with canal locations will be derived, which can improve anatomical knowledge of the teeth to alleviate root canal detection in the future. For this purpose the software is freely available at: http://www.dental-imaging.zahnmedizin.uni-mainz.de/.
Zhang, Xiaoli; Zhi, Min
In this paper, an effective slow motion replay detection method for tennis videos which contains logo transition is proposed. This method is based on the theory of color auto-correlogram and achieved by fowllowing steps: First,detect the candidate logo transition areas from the video frame sequence. Second, generate logo template. Then use color auto-correlogram for similarity matching between video frames and logo template in the candidate logo transition areas. Finally, select logo frames according to the matching results and locate the borders of slow motion accurately by using the brightness change during logo transition process. Experiment shows that, unlike previous approaches, this method has a great improvement in border locating accuracy rate, and can be used for other sports videos which have logo transition, too. In addition, as the algorithm only calculate the contents in the central area of the video frames, speed of the algorithm has been improved greatly.
Wang, Jiayun; Zu, Jian; Wang, Likang
In this paper, considering the atmospheric refractive index, we present an approach to extract the recording date and geo-location from videos, based on the alteration of the shadow length. The paper carefully takes different information (photographed date, the length of the selected object) of the given video into consideration and forms a comprehensive approach to extract the temporal and spatial information of the given video. On the basis of this approach, we analyze the shadow length data of a chosen object from a real video and extract the temporal and spatial information of the video. Compared with the actual information, the error is less than 1%, which proves the validity of our approach.
Squire, Kurt D.
Interactive digital media, or video games, are a powerful new medium. They offer immersive experiences in which players solve problems. Players learn more than just facts--ways of seeing and understanding problems so that they "become" different kinds of people. "Serious games" coming from business strategy, advergaming, and entertainment gaming…
In 2005, digital media artist/activist Liv Gjestvang founded a nonprofit organization, Youth Video OUTreach (YVO), to teach lesbian, gay, bisexual, transgender, queer/questioning (LGBTQ) youth skills to create a documentary about their lives that could serve as a centerpiece for outreach and advocacy efforts by/for LGBTQ youth. While…
de Rooij, O.; Snoek, C.G.M.; Worring, M.
Various query methods for video search exist. Because of the semantic gap each method has its limitations. We argue that for effective retrieval query methods need to be combined at retrieval time. However, switching query methods often involves a change in query and browsing interface, which puts a
The laser phosphor concept is currently the common approach for most applications to introduce laser as a projection light source. However, this concept bears quite some disadvantages for rear-projection video walls. Therefore, Barco has developed a RGB laser design for use in the control room market with tailor-made performance.
Mathiak, Krystyna A; Klasen, Martin; Weber, René; Ackermann, Hermann; Shergill, Sukhwinder S; Mathiak, Klaus
.... It was demonstrated that playing a video game leads to striatal dopamine release. It is unclear, however, which aspects of the game cause this reward system activation and if violent content contributes...
Chen, Chen-Yu; Wang, Jia-Ching; Wang, Jhing-Fa; Hu, Yu-Hen
An entropy-based criterion is proposed to characterize the pattern and intensity of object motion in a video sequence as a function of time. By applying a homoscedastic error model-based time series change point detection algorithm to this motion entropy curve, one is able to segment the corresponding video sequence into individual sections, each consisting of a semantically relevant event. The proposed method is tested on six hours of sports videos including basketball, soccer, and tennis. Excellent experimental results are observed.
Full Text Available This work is a review of the block-based algorithms used for motion estimation in video compression. It researches different types of block-based algorithms that range from the simplest named Full Search to the fast adaptive algorithms like Hierarchical Search. The algorithms evaluated in this paper are widely accepted by the video compressing community and have been used in implementing various standards, such as MPEG-4 Visual and H.264. The work also presents a very brief introduction to the entire flow of video compression.
Fritz, H. M.; Phillips, D. A.; Okayasu, A.; Shimozono, T.; Liu, H.; Takeda, S.; Mohammed, F.; Skanavis, V.; Synolakis, C. E.; Takahashi, T.
The March 11, 2011, magnitude Mw 9.0 earthquake off the coast of the Tohoku region caused catastrophic damage and loss of life in Japan. The mid-afternoon tsunami arrival combined with survivors equipped with cameras on top of vertical evacuation buildings provided spontaneous spatially and temporally resolved inundation recordings. This report focuses on the surveys at 9 tsunami eyewitness video recording locations in Myako, Kamaishi, Kesennuma and Yoriisohama along Japan's Sanriku coast and the subsequent video image calibration, processing, tsunami hydrograph and flow velocity analysis. Selected tsunami video recording sites were explored, eyewitnesses interviewed and some ground control points recorded during the initial tsunami reconnaissance in April, 2011. A follow-up survey in June, 2011 focused on terrestrial laser scanning (TLS) at locations with high quality eyewitness videos. We acquired precise topographic data using TLS at the video sites producing a 3-dimensional "point cloud" dataset. A camera mounted on the Riegl VZ-400 scanner yields photorealistic 3D images. Integrated GPS measurements allow accurate georeferencing. The original video recordings were recovered from eyewitnesses and the Japanese Coast Guard (JCG). The analysis of the tsunami videos follows an adapted four step procedure originally developed for the analysis of 2004 Indian Ocean tsunami videos at Banda Aceh, Indonesia (Fritz et al., 2006). The first step requires the calibration of the sector of view present in the eyewitness video recording based on ground control points measured in the LiDAR data. In a second step the video image motion induced by the panning of the video camera was determined from subsequent images by particle image velocimetry (PIV) applied to fixed objects. The third step involves the transformation of the raw tsunami video images from image coordinates to world coordinates with a direct linear transformation (DLT) procedure. Finally, the instantaneous tsunami
This study utilizes web-based video as a strategy to transfer knowledge about the interior design industry in a format that interests the current generation of students. The model of instruction developed is based upon online video as an engaging, economical, and time-saving alternative to a field trip, guest speaker, or video teleconference.…
Huda, C.; Hudha, M. N.; Ain, N.; Nandiyanto, A. B. D.; Abdullah, A. G.; Widiaty, I.
Computer programming course is theoretical. Sufficient practice is necessary to facilitate conceptual understanding and encouraging creativity in designing computer programs/animation. The development of tutorial video in an Android-based blended learning is needed for students’ guide. Using Android-based instructional material, students can independently learn anywhere and anytime. The tutorial video can facilitate students’ understanding about concepts, materials, and procedures of programming/animation making in detail. This study employed a Research and Development method adapting Thiagarajan’s 4D model. The developed Android-based instructional material and tutorial video were validated by experts in instructional media and experts in physics education. The expert validation results showed that the Android-based material was comprehensive and very feasible. The tutorial video was deemed feasible as it received average score of 92.9%. It was also revealed that students’ conceptual understanding, skills, and creativity in designing computer program/animation improved significantly.
Full Text Available The scope of this paper is a video surveillance system constituted of three principal modules, segmentation module, vehicle classification and vehicle counting. The segmentation is based on a background subtraction by using the Codebooks method. This step aims to define the regions of interest associated with vehicles. To classify vehicles in their type, our system uses the histograms of oriented gradient followed by support vector machine. Counting and tracking vehicles will be the last task to be performed. The presence of partial occlusion involves the decrease of the accuracy of vehicle segmentation and classification, which directly impacts the robustness of a video surveillance system. Therefore, a novel method to handle the partial occlusions based on vehicle classification process have developed. The results achieved have shown that the accuracy of vehicle counting and classification exceeds the accuracy measured in some existing systems.
Mantravadi, Anand V. S.; Bansal, Manu; Kondi, Lisimachos P.
In this paper, we address the problem of robust transmission of packet based H.264/AVC video over direct sequence-code division multiple access (DS-CDMA) channels. H.264 based data partitioning is used to produce video packets of unequal importance with regards to their need in terms of the decoded video quality. In the proposed transmission system, the data partitioned video packets are packetized as per IP/UDP/RTP protocol stack and are sorted into different levels for giving unequal error protection (UEP) using Rate Compatible Punctured Convolutional (RCPC) codes. Constant size framing is done at the link layer and Cyclic Redundancy Check header (CRC) is attached for error detection. Link layer buffering and packet interleaving schemes are proposed to improve the efficiency of the system. A multipath Rayleigh fading channel with Additive White Gaussian Noise (AWGN) and interference from other users is considered at the physical layer. The link layer frames are channel encoded, spread and transmitted over the channel. The received data is despread/demodulated using the Auxiliary Vector (AV) filter or RAKE matched filter (RAKE-MF) receiver and subsequently channel and source decoded. Our experimental results show the effectiveness of using data partitioning for wireless transmissions when compared to the system not using data partitioning. Also the superior interference mitigation capabilities of AV receiver is shown in comparison to the RAKE-MF receiver.
Paulus, Yannis M; Thompson, Noel P
We have devised an inexpensive, web-based tele-ultrasound system using commercially-available video streaming equipment. We examined the spatial and grey scale resolution, and the delay time of the system. The receiving PC was tested at various distances from the transmitting site, from 3.2 km to 4828 km. Standard resolution targets and echocardiography movie strips recorded on DVDs were used to assess the image quality. A qualitative assessment was made by an expert sonographer. As the distance between the transmitter and the receiver increased, the scan smoothness decreased and the delay increased. At a distance of 3.2 km the delay was 2-3 s, and at 4828 km it was 10-15 s. The delay was short enough to allow realtime guidance of the scanning technician by telephone. The system allows inexpensive, readily available, realtime tele-ultrasonography.
Full Text Available This paper provides an analysis of the multiplexed video signal transmission through the Dense Wavelength Division Multiplexing (DWDM network based on a quality check algorithm, which determines where the interruption of the transmission quality starts. On the basis of this algorithm, simulations of transmission for specific values of fiber parameters are executed. The analysis of the results shows how the BER and Q-factor change depends on the length of the fiber, i.e. on the number of amplifiers, and what kind of an effect the number of multiplexed channels and the flow rate per channel have on a transmited signals. Analysis of DWDM systems is performed in the software package OptiSystem 7.0, which is designed for systems with flow rates of 2.5 Gb/s and 10 Gb/s per channel.
Full Text Available The paper presents a discussion of properties of object classification methods utilized in processing video streams from a camera. Methods based on feature extraction, model fitting and invariant determination are evaluated. Petri nets are used for modelling the processing flow. Data objects and transitions are defined which are suitable for efficient implementation in FPGA circuits. Processing characteristics and problems of the implementations are shown. An invariant based method is assessed as most suitable for application in a vehicle video detector.
Ahmet, Akgul; Gamze, Kus; Rustem, Mustafaoglu; Sezen, Karaborklu Argut
Visual signs draw more attention during the learning process. Video is one of the most effective tool including a lot of visual cues. This systematic review set out to explore the influence of video in surgical education. We reviewed the current evidence for the video-based surgical education methods, discuss the advantages and disadvantages on the teaching of technical and nontechnical surgical skills. This systematic review was conducted according to the guidelines defined in the preferred reporting items for systematic reviews and meta-analyses statement. The electronic databases: the Cochrane Library, Medline (PubMED), and ProQuest were searched from their inception to the 30 January 2016. The Medical Subject Headings (MeSH) terms and keywords used were "video," "education," and "surgery." We analyzed all full-texts, randomised and nonrandomised clinical trials and observational studies including video-based education methods about any surgery. "Education" means a medical resident's or student's training and teaching process; not patients' education. We did not impose restrictions about language or publication date. A total of nine articles which met inclusion criteria were included. These trials enrolled 507 participants and the total number of participants per trial ranged from 10 to 172. Nearly all of the studies reviewed report significant knowledge gain from video-based education techniques. The findings of this systematic review provide fair to good quality studies to demonstrate significant gains in knowledge compared with traditional teaching. Additional video to simulator exercise or 3D animations has beneficial effects on training time, learning duration, acquisition of surgical skills, and trainee's satisfaction. Video-based education has potential for use in surgical education as trainees face significant barriers in their practice. This method is effective according to the recent literature. Video should be used in addition to standard techniques
Ikegami, Akiko; Ohira, Yoshiyuki; Uehara, Takanori; Noda, Kazutaka; Suzuki, Shingo; Shikino, Kiyoshi; Kajiwara, Hideki; Kondo, Takeshi; Hirota, Yusuke; Ikusaka, Masatomi
Objectives We examined whether problem-based learning tutorials using patient-simulated videos showing daily life are more practical for clinical learning, compared with traditional paper-based problem-based learning, for the consideration rate of psychosocial issues and the recall rate for experienced learning. Methods Twenty-two groups with 120 fifth-year students were each assigned paper-based problem-based learning and video-based problem-based learning using patient-simulated videos. We ...
Moutakki Zakaria; Ouloul Imad Mohamed; Afdel Karim; Amghar Abdellah
The scope of this paper is a video surveillance system constituted of three principal modules, segmentation module, vehicle classification and vehicle counting. The segmentation is based on a background subtraction by using the Codebooks method. This step aims to define the regions of interest associated with vehicles. To classify vehicles in their type, our system uses the histograms of oriented gradient followed by support vector machine. Counting and tracking vehicles will be the last task...
Héctor Alejandro Galvis
Full Text Available This paper introduces video-game based language instruction as a teaching approach catering to the different socio-economic and learning needs of English as a Foreign Language students. First, this paper reviews statistical data revealing the low participation of Colombian students in English as a second language programs abroad (U.S. context especially. This paper also provides solid reasons why the use of video games in education and foreign language education is justified. Additionally, this paper reviews second language acquisition theoretical foundations that provide the rationale for adapting video-game based language instruction in light of important second language acquisition constructs such as culture and identity, among others. Finally, this document provides options for further research to construct and test the efficacy of video-game based language instruction while simultaneously leaving it open for collaborative contributions.
Lehmann, Ronny; Seitz, Anke; Bosse, Hans Martin; Lutz, Thomas; Huwendiek, Sören
Physical examination skills are crucial for a medical doctor. The physical examination of children differs significantly from that of adults. Students often have only limited contact with pediatric patients to practice these skills. In order to improve the acquisition of pediatric physical examination skills during bedside teaching, we have developed a combined video-based training concept, subsequently evaluating its use and perception. Fifteen videos were compiled, demonstrating defined physical examination sequences in children of different ages. Students were encouraged to use these videos as preparation for bedside teaching during their pediatric clerkship. After bedside teaching, acceptance of this approach was evaluated using a 10-item survey, asking for the frequency of video use and the benefits to learning, self-confidence, and preparation of bedside teaching as well as the concluding OSCE. N=175 out of 299 students returned survey forms (58.5%). Students most frequently used videos, either illustrating complete examination sequences or corresponding focus examinations frequently assessed in the OSCE. Students perceived the videos as a helpful method of conveying the practical process and preparation for bedside teaching as well as the OSCE, and altogether considered them a worthwhile learning experience. Self-confidence at bedside teaching was enhanced by preparation with the videos. The demonstration of a defined standardized procedural sequence, explanatory comments, and demonstration of infrequent procedures and findings were perceived as particularly supportive. Long video segments, poor alignment with other curricular learning activities, and technical problems were perceived as less helpful. Students prefer an optional individual use of the videos, with easy technical access, thoughtful combination with the bedside teaching, and consecutive standardized practice of demonstrated procedures. Preparation with instructional videos combined with bedside
Li, Yan; Wang, Ruiping; Cui, Zhen; Shan, Shiguang; Chen, Xilin
We address the problem of face video retrieval in TV-series which searches video clips based on the presence of specific character, given one face track of his/her. This is tremendously challenging because on one hand, faces in TV-series are captured in largely uncontrolled conditions with complex appearance variations, and on the other hand retrieval task typically needs efficient representation with low time and space complexity. To handle this problem, we propose a compact and discriminative representation for the huge body of video data, named Compact Video Code (CVC). Our method first models the face track by its sample (i.e., frame) covariance matrix to capture the video data variations in a statistical manner. To incorporate discriminative information and obtain more compact video signature suitable for retrieval, the high-dimensional covariance representation is further encoded as a much lower-dimensional binary vector, which finally yields the proposed CVC. Specifically, each bit of the code, i.e., each dimension of the binary vector, is produced via supervised learning in a max margin framework, which aims to make a balance between the discriminability and stability of the code. Besides, we further extend the descriptive granularity of covariance matrix from traditional pixel-level to more general patchlevel, and proceed to propose a novel hierarchical video representation named Spatial Pyramid Covariance (SPC) along with a fast calculation method. Face retrieval experiments on two challenging TV-series video databases, i.e., the Big Bang Theory and Prison Break, demonstrate the competitiveness of the proposed CVC over state-of-the-art retrieval methods. In addition, as a general video matching algorithm, CVC is also evaluated in traditional video face recognition task on a standard Internet database, i.e., YouTube Celebrities, showing its quite promising performance by using an extremely compact code with only 128 bits.
Full Text Available Video applications using mobile wireless devices are a challenging task due to the limited capacity of batteries. The higher complex functionality of video decoding needs high resource requirements. Thus, power efficient control has become more critical design with devices integrating complex video processing techniques. Previous works on power efficient control in video decoding systems often aim at the low complexity design and not explicitly consider the scalable impact of subfunctions in decoding process, and seldom consider the relationship with the features of compressed video date. This paper is dedicated to developing an energy-scalable video decoding (ESVD strategy for energy-limited mobile terminals. First, ESVE can dynamically adapt the variable energy resources due to the device aware technique. Second, ESVD combines the decoder control with decoded data, through classifying the data into different partition profiles according to its characteristics. Third, it introduces utility theoretical analysis during the resource allocation process, so as to maximize the resource utilization. Finally, it adapts the energy resource as different energy budget and generates the scalable video decoding output under energy-limited systems. Experimental results demonstrate the efficiency of the proposed approach.
Zuo, Xuguang; Yu, Lu; Yu, Hualong; Mao, Jue; Zhao, Yin
In movies and TV shows, it is common that several scenes repeat alternately. These videos are characterized with the long-term temporal correlation, which can be exploited to improve video coding efficiency. However, in applications supporting random access (RA), a video is typically divided into a number of RA segments (RASs) by RA points (RAPs), and different RASs are coded independently. In such a way, the long-term temporal correlation among RASs with similar scenes cannot be used. We present a scene-library-based video coding scheme for the coding of videos with repeated scenes. First, a compact scene library is built by clustering similar scenes and extracting representative frames in encoding video. Then, the video is coded using a layered scene-library-based coding structure, in which the library frames serve as long-term reference frames. The scene library is not cleared by RAPs so that the long-term temporal correlation between RASs from similar scenes can be exploited. Furthermore, the RAP frames are coded as interframes by only referencing library frames so as to improve coding efficiency while maintaining RA property. Experimental results show that the coding scheme can achieve significant coding gain over state-of-the-art methods.
Full Text Available In recent years, video super resolution techniques becomes mandatory requirements to get high resolution videos. Many super resolution techniques researched but still video super resolution or scaling is a vital challenge. In this paper, we have presented a real-time video scaling based on convolution neural network architecture to eliminate the blurriness in the images and video frames and to provide better reconstruction quality while scaling of large datasets from lower resolution frames to high resolution frames. We compare our outcomes with multiple exiting algorithms. Our extensive results of proposed technique RemCNN (Reconstruction error minimization Convolution Neural Network shows that our model outperforms the existing technologies such as bicubic, bilinear, MCResNet and provide better reconstructed motioning images and video frames. The experimental results shows that our average PSNR result is 47.80474 considering upscale-2, 41.70209 for upscale-3 and 36.24503 for upscale-4 for Myanmar dataset which is very high in contrast to other existing techniques. This results proves our proposed model real-time video scaling based on convolution neural network architecture’s high efficiency and better performance.
Joshi, Amit M.; Mishra, Vivekanand; Patrikar, R. M.
With the advent of technology, video has become a prominent entity that is shared over networks. With easy availability of various editing tools, data integrity and ownership issues have caused great concern worldwide. Video watermarking is an evolving field that may be used to address such issues. Till date, most of the algorithms have been developed for uncompressed domain watermarking and implemented on software platforms. They provide flexibility and simplicity, but at the same time, they are not suited for real-time applications. They work offline where videos are captured and then watermark is embedded in the video. In the present work, a hardware-based implementation of video watermarking is proposed that overcomes the limitation of software watermarking methods and can be readily adapted to the H.264 standard. This paper focuses on an invisible and robust video watermarking scheme, which can be easily implemented as an integral part of the standard H.264 encoder. The proposed watermarking algorithm involves Integer DCT-based watermark embedding method, wherein Integer DCT is calculated with a fully parallel approach resulting in better speed. The proposed video watermarking is designed with pipelining and parallel architecture for real-time implementation. Here, scene change detection technique is used to improve the performance. Different planes of the watermark are embedded in different frames of a particular scene in order to achieve robustness against various temporal attacks.
Saur, Drew D.; Tan, Yap-Peng; Kulkarni, Sanjeev R.; Ramadge, Peter J.
Automated analysis and annotation of video sequences are important for digital video libraries, content-based video browsing and data mining projects. A successful video annotation system should provide users with useful video content summary in a reasonable processing time. Given the wide variety of video genres available today, automatically extracting meaningful video content for annotation still remains hard by using current available techniques. However, a wide range video has inherent structure such that some prior knowledge about the video content can be exploited to improve our understanding of the high-level video semantic content. In this paper, we develop tools and techniques for analyzing structured video by using the low-level information available directly from MPEG compressed video. Being able to work directly in the video compressed domain can greatly reduce the processing time and enhance storage efficiency. As a testbed, we have developed a basketball annotation system which combines the low-level information extracted from MPEG stream with the prior knowledge of basketball video structure to provide high level content analysis, annotation and browsing for events such as wide- angle and close-up views, fast breaks, steals, potential shots, number of possessions and possession times. We expect our approach can also be extended to structured video in other domains.
Agerholm, Niels; Tønning, Charlotte; Madsen, Tanja Kidholm Osmann
has been developed. It works as a watchdog – if a passing road user affects defined part(s) of the video frame, RUBA records the time of the activity. It operates with three type of detectors (defined parts of the video frame): 1) if a road user passes the detector independent of the direction, 2......) if a road user passes the area in one pre-adjusted specific direction and 3) if a road user is standing still in the detector area. Also, RUBA can be adjusted so it registers massive entities (e.g. cars) while less massive ones (e.g. cyclists) are not registered. The software has been used for various...... analyses of traffic behaviour: traffic counts with and without removal of different modes of transportation, traffic conflicts, traffic behaviour for specific traffic flows and modes and comparisons of speeds in rebuilt road areas. While there is still space for improvement regarding data treatment speed...
Kavallakis, George; Vidakis, Nikolaos; Triantafyllidis, Georgios
This paper presents a scheme of creating an emotion index of cover song music video clips by recognizing and classifying facial expressions of the artist in the video. More specifically, it fuses effective and robust algorithms which are employed for expression recognition, along with the use...... of a neural network system using the features extracted by the SIFT algorithm. Also we support the need of this fusion of different expression recognition algorithms, because of the way that emotions are linked to facial expressions in music video clips....
Scollato, A; Perrini, P; Benedetto, N; Di Lorenzo, N
We propose an easy-to-construct digital video editing system ideal to produce video documentation and still images. A digital video editing system applicable to many video sources in the operating room is described in detail. The proposed system has proved easy to use and permits one to obtain videography quickly and easily. Mixing different streams of video input from all the devices in use in the operating room, the application of filters and effects produces a final, professional end-product. Recording on a DVD provides an inexpensive, portable and easy-to-use medium to store or re-edit or tape at a later time. From stored videography it is easy to extract high-quality, still images useful for teaching, presentations and publications. In conclusion digital videography and still photography can easily be recorded by the proposed system, producing high-quality video recording. The use of firewire ports provides good compatibility with next-generation hardware and software. The high standard of quality makes the proposed system one of the lowest priced products available today.
Full Text Available In fixed video scenes, scene motion patterns can be a very useful prior knowledge for pedestrian detection which is still a challenge at present. A new approach of cascade pedestrian detection using an orthogonal scene motion pattern model in a general density video is developed in this paper. To statistically model the pedestrian motion pattern, a probability grid overlaying the whole scene is set up to partition the scene into paths and holding areas. Features extracted from different pattern areas are classified by a group of specific strategies. Instead of using a unitary classifier, the employed classifier is composed of two directional subclassifiers trained, respectively, with different samples which are selected by two orthogonal directions. Considering that the negative images from the detection window scanning are much more than the positive ones, the cascade AdaBoost technique is adopted by the subclassifiers to reduce the negative image computations. The proposed approach is proved effectively by static classification experiments and surveillance video experiments.
Belyaev, Evgeny; Forchhammer, Søren; Codreanu, Marian
Error concealment for video coding based on a 3-D discrete wavelet transform (DWT) is considered. We assume that the video sequence has a sparse representation in a known basis different from the DWT, e.g., in a 2-D discrete cosine transform basis. Then, we formulate the concealment problem as l1......-norm minimization and solve it utilizing an iterative thresholding algorithm. Comparing different thresholding operators, we show that video block-matching and 3-D filtering provide the best reconstruction by utilizing spatial similarity within a frame and temporal similarity between neighbor frames...
Hsu, Tz-Heng; Liang, You-Sheng; Chiang, Meng-Shu
In this paper, we propose a BitTorrent-based dynamic bandwidth adaptation algorithm for video streaming. Two mechanisms to improve the original BitTorrent protocol are proposed: (1) the decoding order frame first (DOFF) frame selection algorithm and (2) the rarest I frame first (RIFF) frame selection algorithm. With the proposed algorithms, a peer can periodically check the number of downloaded frames in the buffer and then allocate the available bandwidth adaptively for video streaming. As a result, users can have smooth video playout experience with the proposed algorithms.
Almendros Gutiérrez, David
[ANGLÈS] The aim of this thesis is to design a tool that performs visual instance search mining for news video summarization. This means to extract the relevant content of the video in order to be able to recognize the storyline of the news. Initially, a sampling of the video is required to get the frames with a desired rate. Then, different relevant contents are detected from each frame, focusing on faces, text and several objects that the user can select. Next, we use a graph-based clusteri...
Full Text Available An online video edge detection device for bottle caps is designed and implemented using OV7670 video module and FPGA based control unit. By Verilog language programming, the device realizes the menu type parametric setting of the external VGA display, and completes the Roberts edge detection of real-time video image, which improves the speed of image processing. By improving the detection algorithm, the noise is effectively suppressed, and clear and coherent edge images are derived. The design improves the working environment, and avoids the harm to human body.
Søgaard, Jacob; Shahid, Muhammad; Pokhrel, Jeevan
Video streaming services are offered over the Internet and since the service providers do not have full control over the network conditions all the way to the end user, streaming technologies have been developed to maintain the quality of service in these varying network conditions i.e. so called...... adaptive video streaming. In order to cater for users' Quality of Experience (QoE) requirements, HTTP based adaptive streaming solutions of video services have become popular. However, the keys to ensure the users a good QoE with this technology is still not completely understood. User QoE feedback...
Parisot Christophe; Antonini Marc; Barlaud Michel
Wavelet coding has been shown to achieve better compression than DCT coding and moreover allows scalability. 2D DWT can be easily extended to 3D and thus applied to video coding. However, 3D subband coding of video suffers from two drawbacks. The first is the amount of memory required for coding large 3D blocks; the second is the lack of temporal quality due to the sequence temporal splitting. In fact, 3D block-based video coders produce jerks. They appear at blocks temporal borders during v...
Heckendorn, F.M.; Robinson, C.W.
Specialized miniature low cost video equipment has been effectively used in a number of remote, radioactive, and contaminated environments at the Savannah River Site (SRS). The equipment and related techniques have reduced the potential for personnel exposure to both radiation and physical hazards. The valuable process information thus provided would not have otherwise been available for use in improving the quality of operation at SRS.
Ehlert, Steven; Kingery, Aaron; Suggs, Robert
We present the results of new calibration tests performed by the NASA Meteoroid Environment Office (MEO) designed to help quantify and minimize systematic uncertainties in meteor photometry from video camera observations. These systematic uncertainties can be categorized by two main sources: an imperfect understanding of the linearity correction for the MEO's Watec 902H2 Ultimate video cameras and uncertainties in meteor magnitudes arising from transformations between the Watec camera's Sony EX-View HAD bandpass and the bandpasses used to determine reference star magnitudes. To address the first point, we have measured the linearity response of the MEO's standard meteor video cameras using two independent laboratory tests on eight cameras. Our empirically determined linearity correction is critical for performing accurate photometry at low camera intensity levels. With regards to the second point, we have calculated synthetic magnitudes in the EX bandpass for reference stars. These synthetic magnitudes enable direct calculations of the meteor's photometric flux within the camera bandpass without requiring any assumptions of its spectral energy distribution. Systematic uncertainties in the synthetic magnitudes of individual reference stars are estimated at ∼ 0.20 mag , and are limited by the available spectral information in the reference catalogs. These two improvements allow for zero-points accurate to ∼ 0.05 - 0.10 mag in both filtered and unfiltered camera observations with no evidence for lingering systematics. These improvements are essential to accurately measuring photometric masses of individual meteors and source mass indexes.
Ehlert, Steven; Kingery, Aaron; Suggs, Robert
We present the results of new calibration tests performed by the NASA Meteoroid Environment Oce (MEO) designed to help quantify and minimize systematic uncertainties in meteor photometry from video camera observations. These systematic uncertainties can be categorized by two main sources: an imperfect understanding of the linearity correction for the MEO's Watec 902H2 Ultimate video cameras and uncertainties in meteor magnitudes arising from transformations between the Watec camera's Sony EX-View HAD bandpass and the bandpasses used to determine reference star magnitudes. To address the rst point, we have measured the linearity response of the MEO's standard meteor video cameras using two independent laboratory tests on eight cameras. Our empirically determined linearity correction is critical for performing accurate photometry at low camera intensity levels. With regards to the second point, we have calculated synthetic magnitudes in the EX bandpass for reference stars. These synthetic magnitudes enable direct calculations of the meteor's photometric ux within the camera band-pass without requiring any assumptions of its spectral energy distribution. Systematic uncertainties in the synthetic magnitudes of individual reference stars are estimated at 0:20 mag, and are limited by the available spectral information in the reference catalogs. These two improvements allow for zero-points accurate to 0:05 ?? 0:10 mag in both ltered and un ltered camera observations with no evidence for lingering systematics.
Ferreira, João, E-mail: firstname.lastname@example.org [Instituto de Plasmas e Fusão Nuclear - Laboratório Associado, Instituto Superior Técnico, Universidade Técnica de Lisboa, Av. Rovisco Pais 1, 1049-001 Lisboa (Portugal); Vale, Alberto [Instituto de Plasmas e Fusão Nuclear - Laboratório Associado, Instituto Superior Técnico, Universidade Técnica de Lisboa, Av. Rovisco Pais 1, 1049-001 Lisboa (Portugal); Ribeiro, Isabel [Laboratório de Robótica e Sistemas em Engenharia e Ciência - Laboratório Associado, Instituto Superior Técnico, Universidade Técnica de Lisboa, Av. Rovisco Pais 1, 1049-001 Lisboa (Portugal)
Highlights: ► Localization of cask and plug remote handling system with video cameras and markers. ► Video cameras already installed on the building for remote operators. ► Fiducial markers glued or painted on cask and plug remote handling system. ► Augmented reality contents on the video streaming as an aid for remote operators. ► Integration with other localization systems for enhanced robustness and precision. -- Abstract: The cask and plug remote handling system (CPRHS) provides the means for the remote transfer of in-vessel components and remote handling equipment between the Hot Cell building and the Tokamak building in ITER. Different CPRHS typologies will be autonomously guided following predefined trajectories. Therefore, the localization of any CPRHS in operation must be continuously known in real time to provide the feedback for the control system and also for the human supervision. This paper proposes a localization system that uses the video streaming captured by the multiple cameras already installed in the ITER scenario to estimate with precision the position and the orientation of any CPRHS. In addition, an augmented reality system can be implemented using the same video streaming and the libraries for the localization system. The proposed localization system was tested in a mock-up scenario with a scale 1:25 of the divertor level of Tokamak building.
Full Text Available Compressive sensing (CS theory has opened up new paths for the development of signal processing applications. Based on this theory, a novel single pixel camera architecture has been introduced to overcome the current limitations and challenges of traditional focal plane arrays. However, video quality based on this method is limited by existing acquisition and recovery methods, and the method also suffers from being time-consuming. In this paper, a multi-frame motion estimation algorithm is proposed in CS video to enhance the video quality. The proposed algorithm uses multiple frames to implement motion estimation. Experimental results show that using multi-frame motion estimation can improve the quality of recovered videos. To further reduce the motion estimation time, a block match algorithm is used to process motion estimation. Experiments demonstrate that using the block match algorithm can reduce motion estimation time by 30%.
Full Text Available Passive video forensics has drawn much attention in recent years. However, research on detection of object-based forgery, especially for forged video encoded with advanced codec frameworks, is still a great challenge. In this paper, we propose a deep learning-based approach to detect object-based forgery in the advanced video. The presented deep learning approach utilizes a convolutional neural network (CNN to automatically extract high-dimension features from the input image patches. Different from the traditional CNN models used in computer vision domain, we let video frames go through three preprocessing layers before being fed into our CNN model. They include a frame absolute difference layer to cut down temporal redundancy between video frames, a max pooling layer to reduce computational complexity of image convolution, and a high-pass filter layer to enhance the residual signal left by video forgery. In addition, an asymmetric data augmentation strategy has been established to get a similar number of positive and negative image patches before the training. The experiments have demonstrated that the proposed CNN-based model with the preprocessing layers has achieved excellent results.
Wu, Pei-Hsun; Agarwal, Ashutosh; Hess, Henry; Khargonekar, Pramod P.; Tseng, Yiider
Abstract The fidelity of the trajectories obtained from video-based particle tracking determines the success of a variety of biophysical techniques, including in situ single cell particle tracking and in vitro motility assays. However, the image acquisition process is complicated by system noise, which causes positioning error in the trajectories derived from image analysis. Here, we explore the possibility of reducing the positioning error by the application of a Kalman filter, a powerful algorithm to estimate the state of a linear dynamic system from noisy measurements. We show that the optimal Kalman filter parameters can be determined in an appropriate experimental setting, and that the Kalman filter can markedly reduce the positioning error while retaining the intrinsic fluctuations of the dynamic process. We believe the Kalman filter can potentially serve as a powerful tool to infer a trajectory of ultra-high fidelity from noisy images, revealing the details of dynamic cellular processes. PMID:20550894
Moezzi, Saied; Katkere, Arun L.; Jain, Ramesh C.
Interactive video and television viewers should have the power to control their viewing position. To make this a reality, we introduce the concept of Immersive Video, which employs computer vision and computer graphics technologies to provide remote users a sense of complete immersion when viewing an event. Immersive Video uses multiple videos of an event, captured from different perspectives, to generate a full 3D digital video of that event. That is accomplished by assimilating important information from each video stream into a comprehensive, dynamic, 3D model of the environment. Using this 3D digital video, interactive viewers can then move around the remote environment and observe the events taking place from any desired perspective. Our Immersive Video System currently provides interactive viewing and `walkthrus' of staged karate demonstrations, basketball games, dance performances, and typical campus scenes. In its full realization, Immersive Video will be a paradigm shift in visual communication which will revolutionize television and video media, and become an integral part of future telepresence and virtual reality systems.
Zydney, Janet Mannheimer; Hooper, Simon
Educators can use video to gain invaluable information about their students. A concern is that collecting videos online can create an increased security risk for children. The purpose of this article is to provide ethical and legal guidelines for designing video-based apps for mobile devices and the web. By reviewing the literature, law, and code…
Love, Elyse M; Manalo, Iviensan F; Chen, Suephy C; Chen, Kuang-Ho; Stoff, Benjamin K
Several treatment options exist for uncomplicated basal cell carcinoma. Standardized and effective informed consent is difficult in busy dermatology clinics. We investigated whether an educational video depicting 3 treatment options for uncomplicated basal cell carcinoma-excision, electrodessication and curettage, and topical therapy-before standard in-office informed consent affected patient knowledge and consent time compared with standard in-office consent alone. Patients were randomized to receive video education plus verbal discussion (video) or standard verbal discussion alone (control). Both groups completed baseline and final knowledge assessments. The primary outcome measure was change in knowledge scores between groups. Secondary outcomes were patient satisfaction, physician satisfaction, and informed consent time. In all, 32 eligible patients (16 control, 16 video) from an academic institution and affiliate Department of Veterans Affairs Medical Center dermatology clinics participated. The video group had significantly greater gains in knowledge compared with the control group (mean ± SD: 9 ± 3.6 vs 2.9 ± 2.2) (P = .0048). There was no significant difference in total consent time between groups. Patients and physicians were highly satisfied with the video. Small sample size and slight methodological difference between recruitment sites are limitations. Video-based education for basal cell carcinoma improved patient knowledge with no additional physician time when compared with standard communication. Published by Elsevier Inc.
Vonk, Matthew; Bohacek, Peter; Militello, Cheryl; Iverson, Ellen
This study focuses on student development of two important laboratory skills in the context of introductory college-level physics. The first skill, which we call model making, is the ability to analyze a phenomenon in a way that produces a quantitative multimodal model. The second skill, which we call model breaking, is the ability to critically evaluate if the behavior of a system is consistent with a given model. This study involved 116 introductory physics students in four different sections, each taught by a different instructor. All of the students within a given class section participated in the same instruction (including labs) with the exception of five activities performed throughout the semester. For those five activities, each class section was split into two groups; one group was scaffolded to focus on model-making skills and the other was scaffolded to focus on model-breaking skills. Both conditions involved direct measurement videos. In some cases, students could vary important experimental parameters within the video like mass, frequency, and tension. Data collected at the end of the semester indicate that students in the model-making treatment group significantly outperformed the other group on the model-making skill despite the fact that both groups shared a common physical lab experience. Likewise, the model-breaking treatment group significantly outperformed the other group on the model-breaking skill. This is important because it shows that direct measurement video-based instruction can help students acquire science-process skills, which are critical for scientists, and which are a key part of current science education approaches such as the Next Generation Science Standards and the Advanced Placement Physics 1 course.
This video supplement contains a set of videos created during the approximately 10-year-long course of developing and testing the Goddard Space Flight Center (GSFC) harpoon-based approach for collecting comet samples. The purpose of the videos is to illustrate various design concepts used in this method of acquiring samples of comet material, the testing used to verify the concepts, and the evolution of designs and testing. To play the videos this PDF needs to be opened in the freeware Adobe Reader. They do not seem to play while within a browser. While this supplement can be used as a stand-alone document, it is intended to augment its parent document of the same title, Development and Testing of Harpoon-Based Approaches for Collecting Comet Samples (NASA/CR-2017-219018; this document is accessible from the website: https://ssed.gsfc.nasa.gov/harpoon/SAS_Paper-V1.pdf). The parent document, which only contains text and figures, describes the overall development and testing effort and contains references to each of the videos in this supplement. Thus, the videos are primarily intended to augment the information provided by the text and figures in the parent document. This approach was followed to allow the file size of the parent document to remain small enough to facilitate downloading and storage. Some of the videos were created by other organizations, Johns Hopkins University Applied Physics Laboratory (JHU APL) and the German Aerospace Center called, the Deutsches Zentrum für Luft- und Raumfahrt (DLR), who are partnering with GSFC on developing this technology. Each video is accompanied by text that provides a summary description of its nature and purpose, as well as the identity of the authors. All videos have been edited to only show key parts of the testing. Although not all videos have sound, the sound has been retained in those that have it. Also, each video has been given one or more title screens to clarify what is going in different phases of the video.
Qian, Jia; Sui, Xiubao
This paper presents a high-definition video display solution based on the FPGA and THS8200. THS8200 is a video decoder chip launched by TI company, this chip has three 10-bit DAC channels which can capture video data in both 4:2:2 and 4:4:4 formats, and its data synchronization can be either through the dedicated synchronization signals HSYNC and VSYNC, or extracted from the embedded video stream synchronization information SAV / EAV code. In this paper, we will utilize the address and control signals generated by FPGA to access to the data-storage array, and then the FPGA generates the corresponding digital video signals YCbCr. These signals combined with the synchronization signals HSYNC and VSYNC that are also generated by the FPGA act as the input signals of THS8200. In order to meet the bandwidth requirements of the high-definition TV, we adopt video input in the 4:2:2 format over 2×10-bit interface. THS8200 is needed to be controlled by FPGA with I2C bus to set the internal registers, and as a result, it can generate the synchronous signal that is satisfied with the standard SMPTE and transfer the digital video signals YCbCr into analog video signals YPbPr. Hence, the composite analog output signals YPbPr are consist of image data signal and synchronous signal which are superimposed together inside the chip THS8200. The experimental research indicates that the method presented in this paper is a viable solution for high-definition video display, which conforms to the input requirements of the new high-definition display devices.
Full Text Available The expansion of Digital Television and the convergence between conventional broadcasting and television over IP contributed to the gradual increase of the number of available channels and on demand video content. Moreover, the dissemination of the use of mobile devices like laptops, smartphones and tablets on everyday activities resulted in a shift of the traditional television viewing paradigm from the couch to everywhere, anytime from any device. Although this new scenario enables a great improvement in viewing experiences, it also brings new challenges given the overload of information that the viewer faces. Recommendation systems stand out as a possible solution to help a watcher on the selection of the content that best fits his/her preferences. This paper describes a web based system that helps the user navigating on broadcasted and online television content by implementing recommendations based on collaborative and content based filtering. The algorithms developed estimate the similarity between items and users and predict the rating that a user would assign to a particular item (television program, movie, etc.. To enable interoperability between different systems, programs? characteristics (title, genre, actors, etc. are stored according to the TV-Anytime standard. The set of recommendations produced are presented through a Web Application that allows the user to interact with the system based on the obtained recommendations.
Full Text Available One of the key requirements for mobile devices is to provide high-performance computing at lower power consumption. The processors used in these devices provide specific hardware resources to handle computationally intensive video processing and interactive graphical applications. Moreover, processors designed for low-power applications may introduce limitations on the availability and usage of resources, which present additional challenges to the system designers. Owing to the specific design of the JZ47x series of mobile application processors, a hybrid software-hardware implementation scheme for H.264/AVC encoder is proposed in this work. The proposed scheme distributes the encoding tasks among hardware and software modules. A series of optimization techniques are developed to speed up the memory access and data transferring among memories. Moreover, an efficient data reusage design is proposed for the deblock filter video processing unit to reduce the memory accesses. Furthermore, fine grained macroblock (MB level parallelism is effectively exploited and a pipelined approach is proposed for efficient utilization of hardware processing cores. Finally, based on parallelism in the proposed design, encoding tasks are distributed between two processing cores. Experiments show that the hybrid encoder is 12 times faster than a highly optimized sequential encoder due to proposed techniques.
Full Text Available Candidate smoke region segmentation is the key link of smoke video detection; an effective and prompt method of candidate smoke region segmentation plays a significant role in a smoke recognition system. However, the interference of heavy fog and smoke-color moving objects greatly degrades the recognition accuracy. In this paper, a novel method of candidate smoke region segmentation based on rough set theory is presented. First, Kalman filtering is used to update video background in order to exclude the interference of static smoke-color objects, such as blue sky. Second, in RGB color space smoke regions are segmented by defining the upper approximation, lower approximation, and roughness of smoke-color distribution. Finally, in HSV color space small smoke regions are merged by the definition of equivalence relation so as to distinguish smoke images from heavy fog images in terms of V component value variety from center to edge of smoke region. The experimental results on smoke region segmentation demonstrated the effectiveness and usefulness of the proposed scheme.
The effectiveness of video-based training of an electronic medical record system: An exploratory study on computer literate health workers in rural Uganda : Ändamålsenligheten hos videobaserad undervisning av ett elektroniskt patientjournalsystem: en explorativ studie av datorvana sjukvårdsarbetare på Ugandas landsbygd
Aims The purpose of this study is to explore the possibilities for video-based learning of computer systems in the field of medical education in rural sub-Saharan Africa. Background Low-income countries are forced to perform healthcare services with resources already spread too thin. The use of electronic medical records can increase the cost-effectiveness of delivering healthcare services, but the low computer literacy in sub-Saharan Africa is an obstacle necessary to overcome. E-learning an...
Chen Homer H
Full Text Available The paradigm shift of network design from performance-centric to constraint-centric has called for new signal processing techniques to deal with various aspects of resource-constrained communication and networking. In this paper, we consider the computational constraints of a multimedia communication system and propose a video adaptation mechanism for live video streaming of multiple channels. The video adaptation mechanism includes three salient features. First, it adjusts the computational resource of the streaming server block by block to provide a fine control of the encoding complexity. Second, as far as we know, it is the first mechanism to allocate the computational resource to multiple channels. Third, it utilizes a complexity-distortion model to determine the optimal coding parameter values to achieve global optimization. These techniques constitute the basic building blocks for a successful application of wireless and Internet video to digital home, surveillance, IPTV, and online games.
Parihar, Vijay; Yadav, Y R; Kher, Yatin; Ratre, Shailendra; Sethi, Ashish; Sharma, Dhananjaya
Steep learning curve is found initially in pure endoscopic procedures. Video telescopic operating monitor (VITOM) is an advance in rigid-lens telescope systems provides an alternative method for learning basics of neuroendoscopy with the help of the familiar principle of microneurosurgery. The aim was to evaluate the clinical utility of VITOM as a learning tool for neuroendoscopy. Video telescopic operating monitor was used 39 cranial and spinal procedures and its utility as a tool for minimally invasive neurosurgery and neuroendoscopy for initial learning curve was studied. Video telescopic operating monitor was used in 25 cranial and 14 spinal procedures. Image quality is comparable to endoscope and microscope. Surgeons comfort improved with VITOM. Frequent repositioning of scope holder and lack of stereopsis is initial limiting factor was compensated for with repeated procedures. Video telescopic operating monitor is found useful to reduce initial learning curve of neuroendoscopy.
渡部, 和雄; 湯瀬, 裕昭; 渡邉, 貴之; 井口, 真彦; 藤田, 広一
The authors have developed a distance education system for interactive education which can transmit 4 video streams between distant lecture rooms. In this paper, we describe the results of our experiments using the system for adult education. We propose some efficient ways to use the system for adult education.
Gao, Pan; Peng, Qiang; Xiang, Wei
View synthesis prediction (VSP) is a crucial coding tool for improving compression efficiency in the next generation 3D video systems. However, VSP is susceptible to catastrophic error propagation when multi-view video plus depth (MVD) data are transmitted over lossy networks. This paper aims at accurately modeling the transmission errors propagated in the inter-view direction caused by VSP. Toward this end, we first study how channel errors gradually propagate along the VSP-based inter-view prediction path. Then, a new recursive model is formulated to estimate the expected end-to-end distortion caused by those channel losses. For the proposed model, the compound impact of the transmission distortions of both the texture video and depth map on the quality of the synthetic reference view is mathematically analyzed. Especially, the expected view synthesis distortion due to depth errors is characterized in the frequency domain using a new approach, which combines the energy densities of the reconstructed texture image and the channel errors. The proposed model also explicitly considers the disparity rounding operation invoked for the sub-pixel precision rendering of the synthesized reference view. Experimental results are presented to demonstrate that the proposed analytic model is capable of effectively modeling the channel-induced distortion for MVD-based 3D video transmission.
Full Text Available Moving object detection and tracking is the computer vision and image processing is a hot research direction, based on the analysis of the moving target detection and tracking algorithm in common use, focus on the sports video target tracking non rigid body. In sports video, non rigid athletes often have physical deformation in the process of movement, and may be associated with the occurrence of moving target under cover. Media data is surging to fast search and query causes more difficulties in data. However, the majority of users want to be able to quickly from the multimedia data to extract the interested content and implicit knowledge (concepts, rules, rules, models and correlation, retrieval and query quickly to take advantage of them, but also can provide the decision support problem solving hierarchy. Based on the motion in sport video object as the object of study, conducts the system research from the theoretical level and technical framework and so on, from the layer by layer mining between low level motion features to high-level semantic motion video, not only provides support for users to find information quickly, but also can provide decision support for the user to solve the problem.
Full Text Available The study aims at developing and exploring a novel video-based assessment that captures classroom management expertise (CME of teachers and for which statistical results are provided. CME measurement is conceptualized by using four video clips that refer to typical classroom management situations in which teachers are heavily challenged (involving the challenges to manage transitions, instructional time, student behavior, and instructional feedback and by applying three cognitive demands posed on respondents when responding to test items related to the video clips (accuracy of perception, holistic perception, and justification of action. Research questions are raised regarding reliability, testlet effects (related to the four video clips applied for measurement, intercorrelations of cognitive demands, and criterion-related validity of the instrument. Evidence is provided that (1 using a video-based assessment CME can be measured in a reliable way, (2 the CME total score represents a general ability that is only slightly influenced by testlet effects related to the four video clips, (3 the three cognitive demands conceptualized for the measurement of CME are highly intercorrelated, and (4 the CME measure is positively correlated with declarative-conceptual general pedagogical knowledge (medium effect size, whereas it shows only small size correlations with non-cognitive teacher variables.
Fritz, H. M.; Phillips, D. A.; Okayasu, A.; Shimozono, T.; Liu, H.; Mohammed, F.; Skanavis, V.; Synolakis, C. E.; Takahashi, T.
On March 11, 2011, a magnitude Mw 9.0 earthquake occurred off the coast of Japan's Tohoku region causing catastrophic damage and loss of life. Numerous tsunami reconnaissance trips were conducted in Japan (Tohoku Earthquake and Tsunami Joint Survey Group). This report focuses on the surveys at 9 tsunami eyewitness video recording locations in Yoriisohama, Kesennuma, Kamaishi and Miyako along Japan's Sanriku coast and the subsequent video image calibration, processing, tsunami hydrograph and flow velocity analysis. Selected tsunami video recording sites were visited, eyewitnesses interviewed and some ground control points recorded during the initial tsunami reconnaissance from April 9 to 25. A follow-up survey from June 9 to 15, 2011 focused on terrestrial laser scanning (TLS) at locations with previously identified high quality eyewitness videos. We acquired precise topographic data using TLS at nine video sites with multiple scans acquired from different instrument positions at each site. These ground-based LiDAR measurements produce a 3-dimensional "point cloud" dataset. Digital photography from a scanner-mounted camera yields photorealistic 3D images. Integrated GPS measurements allow accurate georeferencing of the TLS data in an absolute reference frame such as WGS84. We deployed a Riegl VZ-400 scanner (1550 nm wavelength laser, 42,000 measurements/second, requires the calibration of the sector of view present in the eyewitness video recording based on visually identifiable ground control points measured in the LiDAR point cloud data. In a second step the video image motion induced by the panning of the video camera was determined from subsequent raw color images by means of planar particle image velocimetry (PIV) applied to fixed objects in the field of view. The third step involves the transformation of the raw tsunami video images from image coordinates to world coordinates. The mapping from video frame to real world coordinates follows the direct linear
Suenaga, Ryo; Suzuki, Kazuyoshi; Tezuka, Tomoyuki; Panahpour Tehrani, Mehrdad; Takahashi, Keita; Fujii, Toshiaki
In this paper, we present a free viewpoint video generation system with billboard representation for soccer games. Free viewpoint video generation is a technology that enables users to watch 3-D objects from their desired viewpoints. Practical implementation of free viewpoint video for sports events is highly demanded. However, a commercially acceptable system has not yet been developed. The main obstacles are insufficient user-end quality of the synthesized images and highly complex procedures that sometimes require manual operations. In this work, we aim to develop a commercially acceptable free viewpoint video system with a billboard representation. A supposed scenario is that soccer games during the day can be broadcasted in 3-D, even in the evening of the same day. Our work is still ongoing. However, we have already developed several techniques to support our goal. First, we captured an actual soccer game at an official stadium where we used 20 full-HD professional cameras. Second, we have implemented several tools for free viewpoint video generation as follow. In order to facilitate free viewpoint video generation, all cameras should be calibrated. We calibrated all cameras using checker board images and feature points on the field (cross points of the soccer field lines). We extract each player region from captured images manually. The background region is estimated by observing chrominance changes of each pixel in temporal domain (automatically). Additionally, we have developed a user interface for visualizing free viewpoint video generation using a graphic library (OpenGL), which is suitable for not only commercialized TV sets but also devices such as smartphones. However, practical system has not yet been completed and our study is still ongoing.
This thesis is based on a detailed analysis of various topics related to the question of whether video games can be art. In the first place it analyzes the current academic discussion on this subject and confronts different opinions of both supporters and objectors of the idea, that video games can be a full-fledged art form. The second point of this paper is to analyze the properties, that are inherent to video games, in order to find the reason, why cultural elite considers video games as i...
Se, Stephen; Nadeau, Christian; Wood, Scott
Airborne surveillance and reconnaissance are essential for successful military missions. Such capabilities are critical for troop protection, situational awareness, mission planning, damage assessment, and others. Unmanned Aerial Vehicles (UAVs) gather huge amounts of video data but it is extremely labour-intensive for operators to analyze hours and hours of received data. At MDA, we have developed a suite of tools that can process the UAV video data automatically, including mosaicking, change detection and 3D reconstruction, which have been integrated within a standard GIS framework. In addition, the mosaicking and 3D reconstruction tools have also been integrated in a Service Oriented Architecture (SOA) framework. The Visualization and Exploitation Workstation (VIEW) integrates 2D and 3D visualization, processing, and analysis capabilities developed for UAV video exploitation. Visualization capabilities are supported through a thick-client Graphical User Interface (GUI), which allows visualization of 2D imagery, video, and 3D models. The GUI interacts with the VIEW server, which provides video mosaicking and 3D reconstruction exploitation services through the SOA framework. The SOA framework allows multiple users to perform video exploitation by running a GUI client on the operator's computer and invoking the video exploitation functionalities residing on the server. This allows the exploitation services to be upgraded easily and allows the intensive video processing to run on powerful workstations. MDA provides UAV services to the Canadian and Australian forces in Afghanistan with the Heron, a Medium Altitude Long Endurance (MALE) UAV system. On-going flight operations service provides important intelligence, surveillance, and reconnaissance information to commanders and front-line soldiers.
Full Text Available In the paper are presented the results of strength analysis for the two types of the welded joints made according to conventional and laser technologies of high-strength steel S960QC. The hardness distributions, tensile properties and fracture toughness were determined for the weld material and heat affect zone material for both types of the welded joints. Tests results shown on advantage the laser welded joints in comparison to the convention ones. Tensile properties and fracture toughness in all areas of the laser joints have a higher level than in the conventional one. The heat affect zone of the conventional welded joints is a weakness area, where the tensile properties are lower in comparison to the base material. Verification of the tensile tests, which carried out by using the Aramis video system, confirmed this assumption. The highest level of strains was observed in HAZ material and the destruction process occurred also in HAZ of the conventional welded joint.
Yoon, Bo Young; Choi, Ikseon; Choi, Seokjin; Kim, Tae-Hee; Roh, Hyerin; Rhee, Byoung Doo; Lee, Jong-Tae
The quality of problem representation is critical for developing students' problem-solving abilities in problem-based learning (PBL). This study investigates preclinical students' experience with standardized patients (SPs) as a problem representation method compared to using video cases in PBL. A cohort of 99 second-year preclinical students from Inje University College of Medicine (IUCM) responded to a Likert scale questionnaire on their learning experiences after they had experienced both video cases and SPs in PBL. The questionnaire consisted of 14 items with eight subcategories: problem identification, hypothesis generation, motivation, collaborative learning, reflective thinking, authenticity, patient-doctor communication, and attitude toward patients. The results reveal that using SPs led to the preclinical students having significantly positive experiences in boosting patient-doctor communication skills; the perceived authenticity of their clinical situations; development of proper attitudes toward patients; and motivation, reflective thinking, and collaborative learning when compared to using video cases. The SPs also provided more challenges than the video cases during problem identification and hypotheses generation. SPs are more effective than video cases in delivering higher levels of authenticity in clinical problems for PBL. The interaction with SPs engages preclinical students in deeper thinking and discussion; growth of communication skills; development of proper attitudes toward patients; and motivation. Considering the higher cost of SPs compared with video cases, SPs could be used most advantageously during the preclinical period in the IUCM curriculum.
Liu, Hao; Lu, Jiwen; Feng, Jianjiang; Zhou, Jie
In this paper, we propose a two-stream transformer networks (TSTN) approach for video-based face alignment. Unlike conventional image-based face alignment approaches which cannot explicitly model the temporal dependency in videos and motivated by the fact that consistent movements of facial landmarks usually occur across consecutive frames, our TSTN aims to capture the complementary information of both the spatial appearance on still frames and the temporal consistency information across frames. To achieve this, we develop a two-stream architecture, which decomposes the video-based face alignment into spatial and temporal streams accordingly. Specifically, the spatial stream aims to transform the facial image to the landmark positions by preserving the holistic facial shape structure. Accordingly, the temporal stream encodes the video input as active appearance codes, where the temporal consistency information across frames is captured to help shape refinements. Experimental results on the benchmarking video-based face alignment datasets show very competitive performance of our method in comparisons to the state-of-the-arts.
Luiz Affonso Guedes
Full Text Available Wireless sensor networks typically consist of a great number of tiny low-cost electronic devices with limited sensing and computing capabilities which cooperatively communicate to collect some kind of information from an area of interest. When wireless nodes of such networks are equipped with a low-power camera, visual data can be retrieved, facilitating a new set of novel applications. The nature of video-based wireless sensor networks demands new algorithms and solutions, since traditional wireless sensor networks approaches are not feasible or even efficient for that specialized communication scenario. The coverage problem is a crucial issue of wireless sensor networks, requiring specific solutions when video-based sensors are employed. In this paper, it is surveyed the state of the art of this particular issue, regarding strategies, algorithms and general computational solutions. Open research areas are also discussed, envisaging promising investigation considering coverage in video-based wireless sensor networks.
Costa, Daniel G; Guedes, Luiz Affonso
Wireless sensor networks typically consist of a great number of tiny low-cost electronic devices with limited sensing and computing capabilities which cooperatively communicate to collect some kind of information from an area of interest. When wireless nodes of such networks are equipped with a low-power camera, visual data can be retrieved, facilitating a new set of novel applications. The nature of video-based wireless sensor networks demands new algorithms and solutions, since traditional wireless sensor networks approaches are not feasible or even efficient for that specialized communication scenario. The coverage problem is a crucial issue of wireless sensor networks, requiring specific solutions when video-based sensors are employed. In this paper, it is surveyed the state of the art of this particular issue, regarding strategies, algorithms and general computational solutions. Open research areas are also discussed, envisaging promising investigation considering coverage in video-based wireless sensor networks.
Full Text Available New High Throughput Satellite (HTS systems allow high throughput IP uplinks/contribution at Ka-band frequencies for relatively lower costs when compared to broadcasting satellite uplinks at Ku band. This technology offers an advantage for live video contribution from remote areas, where the terrestrial infrastructure may not be adequate. On the other hand, the Ka-band is more subject to impairments due to rain or bad weather. This paper addresses the target system specification and provides an optimized approach for the transmission of IP-based video flows through HTS commercial services operating at Ka-band frequencies. In particular, the focus of this study is on the service requirements and the propagation analysis that provide a reference architecture to improve the overall link availability. The approach proposed herein leads to the introduction of a new concept of live service contribution using pairs of small satellite antennas and cheap satellite terminals.
Walpitagama, Milanga; Kaslin, Jan; Nugegoda, Dayanthi; Wlodkowic, Donald
The fish embryo toxicity (FET) biotest performed on embryos of zebrafish (Danio rerio) has gained significant popularity as a rapid and inexpensive alternative approach in chemical hazard and risk assessment. The FET was designed to evaluate acute toxicity on embryonic stages of fish exposed to the test chemical. The current standard, similar to most traditional methods for evaluating aquatic toxicity provides, however, little understanding of effects of environmentally relevant concentrations of chemical stressors. We postulate that significant environmental effects such as altered motor functions, physiological alterations reflected in heart rate, effects on development and reproduction can occur at sub-lethal concentrations well below than LC10. Behavioral studies can, therefore, provide a valuable integrative link between physiological and ecological effects. Despite the advantages of behavioral analysis development of behavioral toxicity, biotests is greatly hampered by the lack of dedicated laboratory automation, in particular, user-friendly and automated video microscopy systems. In this work we present a proof-of-concept development of an optical system capable of tracking embryonic vertebrates behavioral responses using automated and vastly miniaturized time-resolved video-microscopy. We have employed miniaturized CMOS cameras to perform high definition video recording and analysis of earliest vertebrate behavioral responses. The main objective was to develop a biocompatible embryo positioning structures that were suitable for high-throughput imaging as well as video capture and video analysis algorithms. This system should support the development of sub-lethal and behavioral markers for accelerated environmental monitoring.
Fernández-Carrión, Eduardo; Martínez-Avilés, Marta; Ivorra, Benjamin; Martínez-López, Beatriz; Ramos, Ángel Manuel; Sánchez-Vizcaíno, José Manuel
Early detection of infectious diseases can substantially reduce the health and economic impacts on livestock production. Here we describe a system for monitoring animal activity based on video and data processing techniques, in order to detect slowdown and weakening due to infection with African swine fever (ASF), one of the most significant threats to the pig industry. The system classifies and quantifies motion-based animal behaviour and daily activity in video sequences, allowing automated and non-intrusive surveillance in real-time. The aim of this system is to evaluate significant changes in animals' motion after being experimentally infected with ASF virus. Indeed, pig mobility declined progressively and fell significantly below pre-infection levels starting at four days after infection at a confidence level of 95%. Furthermore, daily motion decreased in infected animals by approximately 10% before the detection of the disease by clinical signs. These results show the promise of video processing techniques for real-time early detection of livestock infectious diseases.
As web-based courses using videos have become popular in recent years, the issue of managing audio-visual aids has become pertinent. Generally, the contents of audio-visual aids may include a lecture, an interview, a report, or an experiment, which may be transformed into a streaming format capable of making the quality of Internet-based videos…
Fritz, Hermann M.; Phillips, David A.; Okayasu, Akio; Shimozono, Takenori; Liu, Haijiang; Takeda, Seiichi; Mohammed, Fahad; Skanavis, Vassilis; Synolakis, Costas E.; Takahashi, Tomoyuki
The March 11, 2011, magnitude Mw 9.0 earthquake off the Tohoku coast of Japan caused catastrophic damage and loss of life to a tsunami aware population. The mid-afternoon tsunami arrival combined with survivors equipped with cameras on top of vertical evacuation buildings provided fragmented spatially and temporally resolved inundation recordings. This report focuses on the surveys at 9 tsunami eyewitness video recording locations in Myako, Kamaishi, Kesennuma and Yoriisohama along Japan's Sanriku coast and the subsequent video image calibration, processing, tsunami hydrograph and flow velocity analysis. Selected tsunami video recording sites were explored, eyewitnesses interviewed and some ground control points recorded during the initial tsunami reconnaissance in April, 2011. A follow-up survey in June, 2011 focused on terrestrial laser scanning (TLS) at locations with high quality eyewitness videos. We acquired precise topographic data using TLS at the video sites producing a 3-dimensional "point cloud" dataset. A camera mounted on the Riegl VZ-400 scanner yields photorealistic 3D images. Integrated GPS measurements allow accurate georeferencing. The original video recordings were recovered from eyewitnesses and the Japanese Coast Guard (JCG). The analysis of the tsunami videos follows an adapted four step procedure originally developed for the analysis of 2004 Indian Ocean tsunami videos at Banda Aceh, Indonesia (Fritz et al., 2006). The first step requires the calibration of the sector of view present in the eyewitness video recording based on ground control points measured in the LiDAR data. In a second step the video image motion induced by the panning of the video camera was determined from subsequent images by particle image velocimetry (PIV) applied to fixed objects. The third step involves the transformation of the raw tsunami video images from image coordinates to world coordinates with a direct linear transformation (DLT) procedure. Finally, the
Lee, June; Yoon, Seo Young; Lee, Chung Hyun
The purposes of the study are to investigate CHLS (Cyber Home Learning System) in online video conferencing environment in primary school level and to explore the students' responses on CHLS-VC (Cyber Home Learning System through Video Conferencing) in order to explore the possibility of using CHLS-VC as a supportive online learning system. The…
Bavelier, D; Achtman, R L; Mani, M; Föcker, J
Over the past few years, the very act of playing action video games has been shown to enhance several different aspects of visual selective attention, yet little is known about the neural mechanisms that mediate such attentional benefits. A review of the aspects of attention enhanced in action game players suggests there are changes in the mechanisms that control attention allocation and its efficiency (Hubert-Wallander, Green, & Bavelier, 2010). The present study used brain imaging to test this hypothesis by comparing attentional network recruitment and distractor processing in action gamers versus non-gamers as attentional demands increased. Moving distractors were found to elicit lesser activation of the visual motion-sensitive area (MT/MST) in gamers as compared to non-gamers, suggestive of a better early filtering of irrelevant information in gamers. As expected, a fronto-parietal network of areas showed greater recruitment as attentional demands increased in non-gamers. In contrast, gamers barely engaged this network as attentional demands increased. This reduced activity in the fronto-parietal network that is hypothesized to control the flexible allocation of top-down attention is compatible with the proposal that action game players may allocate attentional resources more automatically, possibly allowing more efficient early filtering of irrelevant information. Copyright © 2011 Elsevier Ltd. All rights reserved.
Noëlle Junod Perron
Full Text Available Introduction: Medical students at the Faculty of Medicine, University of Geneva, Switzerland, have the opportunity to practice clinical skills with simulated patients during formative sessions in preparation for clerkships. These sessions are given in two formats: 1 direct observation of an encounter followed by verbal feedback (direct feedback and 2 subsequent review of the videotaped encounter by both student and supervisor (video-based feedback. The aim of the study was to evaluate whether content and process of feedback differed between both formats. Methods: In 2013, all second- and third-year medical students and clinical supervisors involved in formative sessions were asked to take part in the study. A sample of audiotaped feedback sessions involving supervisors who gave feedback in both formats were analyzed (content and process of the feedback using a 21-item feedback scale. Results: Forty-eight audiotaped feedback sessions involving 12 supervisors were analyzed (2 direct and 2 video-based sessions per supervisor. When adjusted for the length of feedback, there were significant differences in terms of content and process between both formats; the number of communication skills and clinical reasoning items addressed were higher in the video-based format (11.29 vs. 7.71, p=0.002 and 3.71 vs. 2.04, p=0.010, respectively. Supervisors engaged students more actively during the video-based sessions than during direct feedback sessions (self-assessment: 4.00 vs. 3.17, p=0.007; active problem-solving: 3.92 vs. 3.42, p=0.009. Students made similar observations and tended to consider that the video feedback was more useful for improving some clinical skills. Conclusion: Video-based feedback facilitates discussion of clinical reasoning, communication, and professionalism issues while at the same time actively engaging students. Different time and conceptual frameworks may explain observed differences. The choice of feedback format should depend on
Brunner, M; Ittner, W
This paper describes VIPER, the video image-processing system Erlangen. It consists of a general purpose microcomputer, commercially available image-processing hardware modules connected directly to the computer, video input/output-modules such as a TV camera, video recorders and monitors, and a software package. The modular structure and the capabilities of this system are explained. The software is user-friendly, menu-driven and performs image acquisition, transfers, greyscale processing, arithmetics, logical operations, filtering display, colour assignment, graphics, and a couple of management functions. More than 100 image-processing functions are implemented. They are available either by typing a key or by a simple call to the function-subroutine library in application programs. Examples are supplied in the area of biomedical research, e.g. in in-vivo microscopy.
Full Text Available This paper presents about the transmission of Digital Video Broadcasting system with streaming video resolution 640x480 on different IQ rate and modulation. In the video transmission, distortion often occurs, so the received video has bad quality. Key frames selection algorithm is flexibel on a change of video, but on these methods, the temporal information of a video sequence is omitted. To minimize distortion between the original video and received video, we aimed at adding methodology using sequential distortion minimization algorithm. Its aim was to create a new video, better than original video without significant loss of content between the original video and received video, fixed sequentially. The reliability of video transmission was observed based on a constellation diagram, with the best result on IQ rate 2 Mhz and modulation 8 QAM. The best video transmission was also investigated using SEDIM (Sequential Distortion Minimization Method and without SEDIM. The experimental result showed that the PSNR (Peak Signal to Noise Ratio average of video transmission using SEDIM was an increase from 19,855 dB to 48,386 dB and SSIM (Structural Similarity average increase 10,49%. The experimental results and comparison of proposed method obtained a good performance. USRP board was used as RF front-end on 2,2 GHz.
Full Text Available We propose a Stereoscopic Visual Attention- (SVA- based regional bit allocation optimization for Multiview Video Coding (MVC by the exploiting visual redundancies from human perceptions. We propose a novel SVA model, where multiple perceptual stimuli including depth, motion, intensity, color, and orientation contrast are utilized, to simulate the visual attention mechanisms of human visual system with stereoscopic perception. Then, a semantic region-of-interest (ROI is extracted based on the saliency maps of SVA. Both objective and subjective evaluations of extracted ROIs indicated that the proposed SVA model based on ROI extraction scheme outperforms the schemes only using spatial or/and temporal visual attention clues. Finally, by using the extracted SVA-based ROIs, a regional bit allocation optimization scheme is presented to allocate more bits on SVA-based ROIs for high image quality and fewer bits on background regions for efficient compression purpose. Experimental results on MVC show that the proposed regional bit allocation algorithm can achieve over 20∼30% bit-rate saving while maintaining the subjective image quality. Meanwhile, the image quality of ROIs is improved by 0.46∼0.61 dB at the cost of insensitive image quality degradation of the background image.
Zhang, Yun; Jiang, Gangyi; Yu, Mei; Chen, Ken; Dai, Qionghai
We propose a Stereoscopic Visual Attention- (SVA-) based regional bit allocation optimization for Multiview Video Coding (MVC) by the exploiting visual redundancies from human perceptions. We propose a novel SVA model, where multiple perceptual stimuli including depth, motion, intensity, color, and orientation contrast are utilized, to simulate the visual attention mechanisms of human visual system with stereoscopic perception. Then, a semantic region-of-interest (ROI) is extracted based on the saliency maps of SVA. Both objective and subjective evaluations of extracted ROIs indicated that the proposed SVA model based on ROI extraction scheme outperforms the schemes only using spatial or/and temporal visual attention clues. Finally, by using the extracted SVA-based ROIs, a regional bit allocation optimization scheme is presented to allocate more bits on SVA-based ROIs for high image quality and fewer bits on background regions for efficient compression purpose. Experimental results on MVC show that the proposed regional bit allocation algorithm can achieve over [InlineEquation not available: see fulltext.]% bit-rate saving while maintaining the subjective image quality. Meanwhile, the image quality of ROIs is improved by [InlineEquation not available: see fulltext.] dB at the cost of insensitive image quality degradation of the background image.
Full Text Available We propose a Stereoscopic Visual Attention- (SVA- based regional bit allocation optimization for Multiview Video Coding (MVC by the exploiting visual redundancies from human perceptions. We propose a novel SVA model, where multiple perceptual stimuli including depth, motion, intensity, color, and orientation contrast are utilized, to simulate the visual attention mechanisms of human visual system with stereoscopic perception. Then, a semantic region-of-interest (ROI is extracted based on the saliency maps of SVA. Both objective and subjective evaluations of extracted ROIs indicated that the proposed SVA model based on ROI extraction scheme outperforms the schemes only using spatial or/and temporal visual attention clues. Finally, by using the extracted SVA-based ROIs, a regional bit allocation optimization scheme is presented to allocate more bits on SVA-based ROIs for high image quality and fewer bits on background regions for efficient compression purpose. Experimental results on MVC show that the proposed regional bit allocation algorithm can achieve over % bit-rate saving while maintaining the subjective image quality. Meanwhile, the image quality of ROIs is improved by dB at the cost of insensitive image quality degradation of the background image.
Full Text Available We present a simple yet efficient scalable scheme for wavelet-based video coders, able to provide on-demand spatial, temporal, and SNR scalability, and fully compatible with the still-image coding standard JPEG2000. Whereas hybrid video coders must undergo significant changes in order to support scalability, our coder only requires a specific wavelet filter for temporal analysis, as well as an adapted bit allocation procedure based on models of rate-distortion curves. Our study shows that scalably encoded sequences have the same or almost the same quality than nonscalably encoded ones, without a significant increase in complexity. A full compatibility with Motion JPEG2000, which tends to be a serious candidate for the compression of high-definition video sequences, is ensured.
Full Text Available We present a simple yet efficient scalable scheme for wavelet-based video coders, able to provide on-demand spatial, temporal, and SNR scalability, and fully compatible with the still-image coding standard JPEG2000. Whereas hybrid video coders must undergo significant changes in order to support scalability, our coder only requires a specific wavelet filter for temporal analysis, as well as an adapted bit allocation procedure based on models of rate-distortion curves. Our study shows that scalably encoded sequences have the same or almost the same quality than nonscalably encoded ones, without a significant increase in complexity. A full compatibility with Motion JPEG2000, which tends to be a serious candidate for the compression of high-definition video sequences, is ensured.
Full Text Available The paper presents a multiple description (MD video coder based on three-dimensional (3D transforms. Two balanced descriptions are created from a video sequence. In the encoder, video sequence is represented in a form of coarse sequence approximation (shaper included in both descriptions and residual sequence (details which is split between two descriptions. The shaper is obtained by block-wise pruned 3D-DCT. The residual sequence is coded by 3D-DCT or hybrid, LOT+DCT, 3D-transform. The coding scheme is targeted to mobile devices. It has low computational complexity and improved robustness of transmission over unreliable networks. The coder is able to work at very low redundancies. The coding scheme is simple, yet it outperforms some MD coders based on motion-compensated prediction, especially in the low-redundancy region. The margin is up to 3 dB for reconstruction from one description.
Leonardo A. Mercado
Full Text Available In this paper, we explore the benefits to teacher evaluation when video-based self-observation is done by teachers as a vehicle for individual, reflective practice. We explore how it was applied systematically at the Instituto Cultural Peruano Norteamericano (ICPNA bi-national center in Lima, Peru among hundreds of English as a foreign language (EFL teachers in two institution-wide initiatives that have relied on self-observation through video professional development. In these cases, we provide a descriptive framework for each initiative as well as information on what was ultimately achieved by teachers, supervisors and the institution as a whole. We conclude with recommendations for implementing video-based self-evaluation.
Steele, Miriam; Steele, Howard; Bate, Jordan; Knafo, Hannah; Kinsey, Michael; Bonuck, Karen; Meisner, Paul; Murphy, Anne
This paper provides an account of multiple potential benefits of using video in clinical interventions designed to promote change in parent-child attachment relationships. The power of video to provide a unique perspective on parents' ways of thinking and feeling about their own behavior and that of their child will be discussed in terms of current attachment-based interventions using video either as the main component of the treatment or in addition to a more comprehensive treatment protocol. Interventions also range from those that use micro-analytic as compared to more global units of analyses, and there are potential bridges to be made with neuro-scientific research findings. In addition, this paper provides a clinical illustration of the utility of showing parents vignettes of video-filmed observations of parent-child interactions from the Group Attachment Based Intervention (GABI) for vulnerable families. Emphasis is placed on the motivational force arising from seeing (and hearing) oneself in interaction with one's child on video, thus serving as a powerful catalyst for reflective functioning and updating one's frame of reference for how to think, feel and behave with one's child.
Sehairi, Kamal; Chouireb, Fatima; Meunier, Jean
The objective of this study is to compare several change detection methods for a monostatic camera and identify the best method for different complex environments and backgrounds in indoor and outdoor scenes. To this end, we used the CDnet video dataset as a benchmark that consists of many challenging problems, ranging from basic simple scenes to complex scenes affected by bad weather and dynamic backgrounds. Twelve change detection methods, ranging from simple temporal differencing to more sophisticated methods, were tested and several performance metrics were used to precisely evaluate the results. Because most of the considered methods have not previously been evaluated on this recent large scale dataset, this work compares these methods to fill a lack in the literature, and thus this evaluation joins as complementary compared with the previous comparative evaluations. Our experimental results show that there is no perfect method for all challenging cases; each method performs well in certain cases and fails in others. However, this study enables the user to identify the most suitable method for his or her needs.
Efstathiou, Nectarios; Skitsas, Michael; Psaroudakis, Chrysostomos; Koutras, Nikolaos
Nowadays, video surveillance cameras are used for the protection and monitoring of a huge number of facilities worldwide. An important element in such surveillance systems is the use of aerial video streams originating from onboard sensors located on Unmanned Aerial Vehicles (UAVs). Video surveillance using UAVs represent a vast amount of video to be transmitted, stored, analyzed and visualized in a real-time way. As a result, the introduction and development of systems able to handle huge amount of data become a necessity. In this paper, a new approach for the collection, transmission and storage of aerial videos and metadata is introduced. The objective of this work is twofold. First, the integration of the appropriate equipment in order to capture and transmit real-time video including metadata (i.e. position coordinates, target) from the UAV to the ground and, second, the utilization of the ADITESS Versatile Media Content Management System (VMCMS-GE) for storing of the video stream and the appropriate metadata. Beyond the storage, VMCMS-GE provides other efficient management capabilities such as searching and processing of videos, along with video transcoding. For the evaluation and demonstration of the proposed framework we execute a use case where the surveillance of critical infrastructure and the detection of suspicious activities is performed. Collected video Transcodingis subject of this evaluation as well.
Chisholm, Joseph D; Kingstone, Alan
Research has demonstrated that experience with action video games is associated with improvements in a host of cognitive tasks. Evidence from paradigms that assess aspects of attention has suggested that action video game players (AVGPs) possess greater control over the allocation of attentional resources than do non-video-game players (NVGPs). Using a compound search task that teased apart selection- and response-based processes (Duncan, 1985), we required participants to perform an oculomotor capture task in which they made saccades to a uniquely colored target (selection-based process) and then produced a manual directional response based on information within the target (response-based process). We replicated the finding that AVGPs are less susceptible to attentional distraction and, critically, revealed that AVGPs outperform NVGPs on both selection-based and response-based processes. These results not only are consistent with the improved-attentional-control account of AVGP benefits, but they suggest that the benefit of action video game playing extends across the full breadth of attention-mediated stimulus-response processes that impact human performance.
Park, Sung Youl; Kim, Soo-Wook; Cha, Seung-Bong; Nam, Min-Woo
This study investigated the effectiveness of e-learning by comparing the learning outcomes in conventional face-to-face lectures and e-learning methods. Two video-based e-learning contents were developed based on the rapid prototyping model and loaded onto the learning management system (LMS), which was available at http://www.greenehrd.com.…
I M.O. Widyantara
Full Text Available Video surveillance system (VSS is an monitoring system based-on IP-camera. VSS implemented in live streaming and serves to observe and monitor a site remotely. Typically, IP- camera in the VSS has a management software application. However, for ad hoc applications, where the user wants to manage VSS independently, application management software has become ineffective. In the IP-camera installation spread over a large area, an administrator would be difficult to describe the location of the IP-camera. In addition, monitoring an area of IP- Camera will also become more difficult. By looking at some of the flaws in VSS, this paper has proposed a VSS application for easy monitoring of each IP Camera. Applications that have been proposed to integrate the concept of web-based geographical information system with the Google Maps API (Web-GIS. VSS applications built with smart features include maps ip-camera, live streaming of events, information on the info window and marker cluster. Test results showed that the application is able to display all the features built well
Azer, Samy A; Algrain, Hala A; AlKhelaif, Rana A; AlEshaiwi, Sarah M
A number of studies have evaluated the educational contents of videos on YouTube. However, little analysis has been done on videos about physical examination. This study aimed to analyze YouTube videos about physical examination of the cardiovascular and respiratory systems. It was hypothesized that the educational standards of videos on YouTube would vary significantly. During the period from November 2, 2011 to December 2, 2011, YouTube was searched by three assessors for videos covering the clinical examination of the cardiovascular and respiratory systems. For each video, the following information was collected: title, authors, duration, number of viewers, and total number of days on YouTube. Using criteria comprising content, technical authority, and pedagogy parameters, videos were rated independently by three assessors and grouped into educationally useful and non-useful videos. A total of 1920 videos were screened. Only relevant videos covering the examination of adults in the English language were identified (n=56). Of these, 20 were found to be relevant to cardiovascular examinations and 36 to respiratory examinations. Further analysis revealed that 9 provided useful information on cardiovascular examinations and 7 on respiratory examinations: scoring mean 14.9 (SD 0.33) and mean 15.0 (SD 0.00), respectively. The other videos, 11 covering cardiovascular and 29 on respiratory examinations, were not useful educationally, scoring mean 11.1 (SD 1.08) and mean 11.2 (SD 1.29), respectively. The differences between these two categories were significant (P.86. A small number of videos about physical examination of the cardiovascular and respiratory systems were identified as educationally useful; these videos can be used by medical students for independent learning and by clinical teachers as learning resources. The scoring system utilized by this study is simple, easy to apply, and could be used by other researchers on similar topics.
Tuong, William; Larsen, Elizabeth R; Armstrong, April W
This systematic review examines the effectiveness of videos in modifying health behaviors. We searched PubMed (1975-2012), PsycINFO (1975-2012), EMBASE (1975-2012), and CINAHL (1983-2012) for controlled clinical trials that examined the effectiveness of video interventions in changing health behaviors. Twenty-eight studies comprised of 12,703 subjects were included in the systematic review. Video interventions were variably effective for modifying health behaviors depending on the target behaviors to be influenced. Video interventions appear to be effective in breast self-examination, prostate cancer screening, sunscreen adherence, self-care in patients with heart failure, HIV testing, treatment adherence, and female condom use. However, videos have not shown to be effective in influencing addiction behaviors when they are not tailored. Compared to loss-framing, gain-framed messages may be more effective in promoting certain types of health behavior change. Also, video modeling may facilitate learning of new behaviors and can be an important consideration in future video interventions.
Karimi Moonaghi, Hossein; Hasanzadeh, Farzaneh; Shamsoddini, Somayyeh; Emamimoghadam, Zahra; Ebrahimzadeh, Saeed
Adherence to diet and fluids is the cornerstone of patients undergoing hemodialysis. By informing hemodialysis patients we can help them have a proper diet and reduce mortality and complications of toxins. Face to face education is one of the most common methods of training in health care system. But advantages of video- based education are being simple and cost-effective, although this method is virtual. Seventy-five hemodialysis patients were divided randomly into face to face and video-based education groups. A training manual was designed based on Orem's self-care model. Content of training manual was same in both the groups. In the face to face group, 2 educational sessions were accomplished during dialysis with a 1-week time interval. In the video-based education group, a produced film, separated to 2 episodes was presented during dialysis with a 1-week time interval. An Attitude questionnaire was completed as a pretest and at the end of weeks 2 and 4. SPSS software version 11.5 was used for analysis. Attitudes about fluid and diet adherence at the end of weeks 2 and 4 are not significantly different in face to face or video-based education groups. The patients' attitude had a significant difference in face to face group between the 3 study phases (pre-, 2, and 4 weeks postintervention). The same results were obtained in 3 phases of video-based education group. Our findings showed that video-based education could be as effective as face to face method. It is recommended that more investment be devoted to video-based education.
Barekatain, Behrang; Khezrimotlagh, Dariush; Aizaini Maarof, Mohd; Ghaeini, Hamid Reza; Salleh, Shaharuddin; Quintana, Alfonso Ariza; Akbari, Behzad; Cabrera, Alicia Triviño
In recent years, Random Network Coding (RNC) has emerged as a promising solution for efficient Peer-to-Peer (P2P) video multicasting over the Internet. This probably refers to this fact that RNC noticeably increases the error resiliency and throughput of the network. However, high transmission overhead arising from sending large coefficients vector as header has been the most important challenge of the RNC. Moreover, due to employing the Gauss-Jordan elimination method, considerable computational complexity can be imposed on peers in decoding the encoded blocks and checking linear dependency among the coefficients vectors. In order to address these challenges, this study introduces MATIN which is a random network coding based framework for efficient P2P video streaming. The MATIN includes a novel coefficients matrix generation method so that there is no linear dependency in the generated coefficients matrix. Using the proposed framework, each peer encapsulates one instead of n coefficients entries into the generated encoded packet which results in very low transmission overhead. It is also possible to obtain the inverted coefficients matrix using a bit number of simple arithmetic operations. In this regard, peers sustain very low computational complexities. As a result, the MATIN permits random network coding to be more efficient in P2P video streaming systems. The results obtained from simulation using OMNET++ show that it substantially outperforms the RNC which uses the Gauss-Jordan elimination method by providing better video quality on peers in terms of the four important performance metrics including video distortion, dependency distortion, End-to-End delay and Initial Startup delay.
Full Text Available In recent years, Random Network Coding (RNC has emerged as a promising solution for efficient Peer-to-Peer (P2P video multicasting over the Internet. This probably refers to this fact that RNC noticeably increases the error resiliency and throughput of the network. However, high transmission overhead arising from sending large coefficients vector as header has been the most important challenge of the RNC. Moreover, due to employing the Gauss-Jordan elimination method, considerable computational complexity can be imposed on peers in decoding the encoded blocks and checking linear dependency among the coefficients vectors. In order to address these challenges, this study introduces MATIN which is a random network coding based framework for efficient P2P video streaming. The MATIN includes a novel coefficients matrix generation method so that there is no linear dependency in the generated coefficients matrix. Using the proposed framework, each peer encapsulates one instead of n coefficients entries into the generated encoded packet which results in very low transmission overhead. It is also possible to obtain the inverted coefficients matrix using a bit number of simple arithmetic operations. In this regard, peers sustain very low computational complexities. As a result, the MATIN permits random network coding to be more efficient in P2P video streaming systems. The results obtained from simulation using OMNET++ show that it substantially outperforms the RNC which uses the Gauss-Jordan elimination method by providing better video quality on peers in terms of the four important performance metrics including video distortion, dependency distortion, End-to-End delay and Initial Startup delay.
Ignacio, Joselito; Center for Homeland Defense and Security Naval Postgraduate School
This proposed system process aims to improve subway safety through better enabling the rapid detection and response to a chemical release in a subway system. The process is designed to be location-independent and generalized to most subway systems despite each system's unique characteristics.
Schuetz, Christopher; Martin, Richard; Dillon, Thomas; Yao, Peng; Mackrides, Daniel; Harrity, Charles; Zablocki, Alicia; Shreve, Kevin; Bonnett, James; Curt, Petersen; Prather, Dennis
Passive imaging using millimeter waves (mmWs) has many advantages and applications in the defense and security markets. All terrestrial bodies emit mmW radiation and these wavelengths are able to penetrate smoke, fog/clouds/marine layers, and even clothing. One primary obstacle to imaging in this spectrum is that longer wavelengths require larger apertures to achieve the resolutions desired for many applications. Accordingly, lens-based focal plane systems and scanning systems tend to require large aperture optics, which increase the achievable size and weight of such systems to beyond what can be supported by many applications. To overcome this limitation, a distributed aperture detection scheme is used in which the effective aperture size can be increased without the associated volumetric increase in imager size. This distributed aperture system is realized through conversion of the received mmW energy into sidebands on an optical carrier. This conversion serves, in essence, to scale the mmW sparse aperture array signals onto a complementary optical array. The side bands are subsequently stripped from the optical carrier and recombined to provide a real time snapshot of the mmW signal. Using this technique, we have constructed a real-time, video-rate imager operating at 75 GHz. A distributed aperture consisting of 220 upconversion channels is used to realize 2.5k pixels with passive sensitivity. Details of the construction and operation of this imager as well as field testing results will be presented herein.
Ballesta, S; Reymond, G; Pozzobon, M; Duhamel, J-R
To date, assessing the solitary and social behaviors of laboratory primates' colonies relies on time-consuming manual scoring methods. Here, we describe a real-time multi-camera 3D tracking system developed to measure the behavior of socially-housed primates. Their positions are identified using non-invasive color markers such as plastic collars, thus allowing to also track colored objects and to measure their usage. Compared to traditional manual ethological scoring, we show that this system can reliably evaluate solitary behaviors (foraging, solitary resting, toy usage, locomotion) as well as spatial proximity with peers, which is considered as a good proxy of their social motivation. Compared to existing video-based commercial systems currently available to measure animal activity, this system offers many possibilities (real-time data, large volume coverage, multiple animal tracking) at a lower hardware cost. Quantitative behavioral data of animal groups can now be obtained automatically over very long periods of time, thus opening new perspectives in particular for studying the neuroethology of social behavior in primates. Copyright © 2014 Elsevier B.V. All rights reserved.
Monsoriu, Juan A; Gimenez, Marcos H; Riera, Jaime; Vidaurre, Ana [Departamento de Fisica Aplicada, Universidad Politecnica de Valencia, E-46022 Valencia (Spain)
The applications of the digital video image to the investigation of physical phenomena have increased enormously in recent years. The advances in computer technology and image recognition techniques allow the analysis of more complex problems. In this work, we study the movement of a damped coupled oscillation system. The motion is considered as a linear combination of two normal modes, i.e. the symmetric and antisymmetric modes. The image of the experiment is recorded with a video camera and analysed by means of software developed in our laboratory. The results show a very good agreement with the theory.
Haque, Mohammad Ahsanul; Irani, Ramin; Nasrollahi, Kamal
Physical fatigue reveals the health condition of a person at for example health checkup, fitness assessment or rehabilitation training. This paper presents an efficient noncontact system for detecting non-localized physi-cal fatigue from maximal muscle activity using facial videos acquired...
Lau, K H Vincent; Farooque, Pue; Leydon, Gary; Schwartz, Michael L; Sadler, R Mark; Moeller, Jeremy J
The video-based lecture (VBL), an important component of the flipped classroom (FC) and massive open online course (MOOC) approaches to medical education, has primarily been evaluated through direct learner feedback. Evaluation may be enhanced through learner analytics (LA) - analysis of quantitative audience usage data generated by video-sharing platforms. We applied LA to an experimental series of ten VBLs on electroencephalography (EEG) interpretation, uploaded to YouTube in the model of a publicly accessible MOOC. Trends in view count; total percentage of video viewed and audience retention (AR) (percentage of viewers watching at a time point compared to the initial total) were examined. The pattern of average AR decline was characterized using regression analysis, revealing a uniform linear decline in viewership for each video, with no evidence of an optimal VBL length. Segments with transient increases in AR corresponded to those focused on core concepts, indicative of content requiring more detailed evaluation. We propose a model for applying LA at four levels: global, series, video, and feedback. LA may be a useful tool in evaluating a VBL series. Our proposed model combines analytics data and learner self-report for comprehensive evaluation.
Full Text Available High Efficiency Video Coding (HEVC, which is the newest video coding standard, has been developed for the efficient compression of ultra high definition videos. One of the important features in HEVC is the adoption of a quad-tree based video coding structure, in which each incoming frame is represented as a set of non-overlapped coding tree blocks (CTB by variable-block sized prediction and coding process. To do this, each CTB needs to be recursively partitioned into coding unit (CU, predict unit (PU and transform unit (TU during the coding process, leading to a huge computational load in the coding of each video frame. This paper proposes to extract visual features in a CTB and uses them to simplify the coding procedure by reducing the depth of quad-tree partition for each CTB in HEVC intra coding mode. A measure for the edge strength in a CTB, which is defined with simple Sobel edge detection, is used to constrain the possible maximum depth of quad-tree partition of the CTB. With the constrained partition depth, the proposed method can reduce a lot of encoding time. Experimental results by HM10.1 show that the average time-savings is about 13.4% under the increase of encoded BD-Rate by only 0.02%, which is a less performance degradation in comparison to other similar methods.
Patki, Aniruddha; Puscas, Liana
To determine whether instructional videos modeling examples of "good" and "bad" patient communication skills are useful as an educational tool for improving resident-patient communication. Retrospective study in which resident participants in the module gave survey responses indicating perceived utility of the exercise. Tertiary academic medical center. A total of 11 otolaryngology trainees from postgraduate year 1-5 who attended the course over 2 separate sessions and provided feedback on the benefits of the module. All 11 residents attended both sessions. Of 22 total survey responses, 21 found that the videos were "realistic and engaging" and were a true representation of commonly encountered clinical scenarios. Residents identified multiple themes and behaviors distinguishing "good" vs "bad" communication with patients and felt they could incorporate these into daily practice. A perceived weakness was the lack of opportunity for "role playing" with a video-based module as opposed to standardized patients. Instructional videos, when realistic, are useful for modeling effective patient communication skills for residents. By watching the videos, residents are able to identify specific techniques they can incorporate into their daily practice. Copyright © 2015 Association of Program Directors in Surgery. Published by Elsevier Inc. All rights reserved.
Hua, My; Yip, Henry; Talbot, Prue
The objective was to analyse and compare puff and exhalation duration for individuals using electronic nicotine delivery systems (ENDS) and conventional cigarettes in YouTube videos. Video data from YouTube videos were analysed to quantify puff duration and exhalation duration during use of conventional tobacco-containing cigarettes and ENDS. For ENDS, comparisons were also made between 'advertisers' and 'non-advertisers', genders, brands of ENDS, and models of ENDS within one brand. Puff duration (mean =2.4 s) for conventional smokers in YouTube videos (N=9) agreed well with prior publications. Puff duration was significantly longer for ENDS users (mean =4.3 s) (N = 64) than for conventional cigarette users, and puff duration varied significantly among ENDS brands. For ENDS users, puff duration and exhalation duration were not significantly affected by 'advertiser' status, gender or variation in models within a brand. Men outnumbered women by about 5:1, and most users were between 19 and 35 years of age. YouTube videos provide a valuable resource for studying ENDS usage. Longer puff duration may help ENDS users compensate for the apparently poor delivery of nicotine from ENDS. As with conventional cigarette smoking, ENDS users showed a large variation in puff duration (range =1.9-8.3 s). ENDS puff duration should be considered when designing laboratory and clinical trials and in developing a standard protocol for evaluating ENDS performance.
Kannan, R.; Ghinea, G; Swaminathan, S
Video summarization aims at producing a compact version of a full-length video while preserving the significant content of the original video. Movie summarization condenses a full-length movie into a summary that still retains the most significant and interesting content of the original movie. In the past, several movie summarization systems have been proposed to generate a movie summary based on low-level video features such as color, motion, texture, etc. However, a generic summary, which i...
Das, V. E.; Thomas, C. W.; Zivotofsky, A. Z.; Leigh, R. J.
Video-based eye-tracking systems are especially suited to studying eye movements during naturally occurring activities such as locomotion, but eye velocity records suffer from broad band noise that is not amenable to conventional filtering methods. We evaluated the effectiveness of combined median and moving-average filters by comparing prefiltered and postfiltered records made synchronously with a video eye-tracker and the magnetic search coil technique, which is relatively noise free. Root-mean-square noise was reduced by half, without distorting the eye velocity signal. To illustrate the practical use of this technique, we studied normal subjects and patients with deficient labyrinthine function and compared their ability to hold gaze on a visual target that moved with their heads (cancellation of the vestibulo-ocular reflex). Patients and normal subjects performed similarly during active head rotation but, during locomotion, patients held their eyes more steadily on the visual target than did subjects.
Cheah Wai Shiang
Full Text Available Agent-oriented methodology (AOM is a comprehensive and unified agent methodology for agent-oriented software development. Although AOM is claimed to be able to cope with a complex system development, it is still not yet determined up to what extent this may be true. Therefore, it is vital to conduct an investigation to validate this methodology. This paper presents the adoption of AOM in developing an agent-oriented video surveillance system (VSS. An intruder handling scenario is designed and implemented through AOM. AOM provides an alternative method to engineer a distributed security system in a systematic manner. It presents the security system at a holistic view; provides a better conceptualization of agent-oriented security system and supports rapid prototyping as well as simulation of video surveillance system.
Burner, A. W.; Rummler, D. R.; Goad, W. K.
A system consisting of a single charge coupled device (CCD) video camera, computer controlled video digitizer, and software to automate the measurement was developed to measure the location of bullet holes in targets at the International Shooters Development Fund (ISDF)/NASA Ballistics Tunnel. The camera/digitizer system is a crucial component of a highly instrumented indoor 50 meter rifle range which is being constructed to support development of wind resistant, ultra match ammunition. The system was designed to take data rapidly (10 sec between shoots) and automatically with little operator intervention. The system description, measurement concept, and procedure are presented along with laboratory tests of repeatability and bias error. The long term (1 hour) repeatability of the system was found to be 4 microns (one standard deviation) at the target and the bias error was found to be less than 50 microns. An analysis of potential errors and a technique for calibration of the system are presented.
Full Text Available We present a supervised learning-based approach for subpixel motion estimation which is then used to perform video super-resolution. The novelty of this work is the formulation of the problem of subpixel motion estimation in a ranking framework. The ranking formulation is a variant of classification and regression formulation, in which the ordering present in class labels namely, the shift between patches is explicitly taken into account. Finally, we demonstrate the applicability of our approach on superresolving synthetically generated images with global subpixel shifts and enhancing real video frames by accounting for both local integer and subpixel shifts.
de Groot, Arjan; Linotte, Peter; van Veen, Django; de Witte, Martijn; Laurent, Nicolas; Hiddema, Arend; Lalkens, Fred; van Spijker, Jan
High quality night vision digital video is nowadays required for many observation, surveillance and targeting applications, including several of the current soldier modernization programs. We present the performance increase that is obtained when combining a state-of-the-art image intensifier with a low power consumption CMOS image sensor. Based on the content of the video signal, the gating and gain of the image intensifier are optimized for best SNR. The options of the interface with a separate laser in the application for range gated imaging are discussed.
Uddin, Md. Zia
In this paper, a novel spatiotemporal feature-based method is proposed to recognize facial expressions from depth video. Independent Component Analysis (ICA) spatial features of the depth faces of facial expressions are first augmented with the optical flow motion features. Then, the augmented features are enhanced by Fisher Linear Discriminant Analysis (FLDA) to make them robust. The features are then combined with on Hidden Markov Models (HMMs) to model different facial expressions that are later used to recognize appropriate expression from a test expression depth video. The experimental results show superior performance of the proposed approach over the conventional methods.
... From the Federal Register Online via the Government Publishing Office INTERNATIONAL TRADE COMMISSION Certain Video Game Systems and Controllers; Investigations: Terminations, Modifications and Rulings AGENCY: U.S. International Trade Commission. ACTION: Notice. Section 337 of the Tariff Act of 1930...
... From the Federal Register Online via the Government Publishing Office INTERNATIONAL TRADE COMMISSION Certain Video Game Systems and Wireless Controllers and Components Thereof, Commission Determination Finding No Violation of the Tariff Act of 1930 AGENCY: U.S. International Trade Commission. ACTION...
Horn, Eva; And Others
Three nonvocal students (ages 5-8) with severe physical handicaps were trained in scan and selection responses (similar to responses needed for operating augmentative communication systems) using a microcomputer-operated video-game format. Results indicated that all three children showed substantial increases in the number of correct responses and…
Pope, Alan T.; Bogart, Edward H.
Describes the Extended Attention Span Training (EAST) system for modifying attention deficits, which takes the concept of biofeedback one step further by making a video game more difficult as the player's brain waves indicate that attention is waning. Notes contributions of this technology to neuropsychology and neurology, where the emphasis is on…
Hung, Chin-Chang; Tsao, Shih-Chieh; Huang, Kuo-Hao; Jang, Jia-Pu; Chang, Hsu-Kuang; Dobbs, Fred C.
The turbid, low-light waters characteristic of aquaculture ponds have made it difficult or impossible for previous video cameras to provide clear imagery of the ponds’ benthic habitat. We developed a highly sensitive, underwater video system (UVS) for this particular application and tested it in shrimp ponds having turbidities typical of those in southern Taiwan. The system’s high-quality video stream and images, together with its camera capacity (up to nine cameras), permit in situ observations of shrimp feeding behavior, shrimp size and internal anatomy, and organic matter residues on pond sediments. The UVS can operate continuously and be focused remotely, a convenience to shrimp farmers. The observations possible with the UVS provide aquaculturists with information critical to provision of feed with minimal waste; determining whether the accumulation of organic-matter residues dictates exchange of pond water; and management decisions concerning shrimp health.
Subhi, Yousif; Todsen, Tobias; Konge, Lars
Assessment of clinical competencies by direct observation is problematic for two main reasons the identity of the examinee influences the assessment scores, and direct observation demands experts at the exact location and the exact time. Recording the performance can overcome these problems......; however, managing video recordings and assessment sheets is troublesome and may lead to missing or incorrect data. Currently, no existing software solution can provide a local solution for the management of videos and assessments but this is necessary as assessment scores are confidential information......, and access to this information should be restricted to select personnel. A local software solution may also ease the need for customization to local needs and integration into existing user databases or project management software. We developed an integrable web-based solution for easy assessment of video...