WorldWideScience

Sample records for preprocessed grace-a level-1b

  1. GIFTS SM EDU Level 1B algorithms

    Science.gov (United States)

    Tian, Jialin; Gazarik, Michael J.; Reisse, Robert A.; Johnson, David G.

    2007-10-01

    The Geosynchronous Imaging Fourier Transform Spectrometer (GIFTS) Sensor Module (SM) Engineering Demonstration Unit (EDU) is a high resolution spectral imager designed to measure infrared (IR) radiances using a Fourier transform spectrometer (FTS). The GIFTS instrument employs three focal plane arrays (FPAs), which gather measurements across the long-wave IR (LWIR), short/mid-wave IR (SMWIR), and visible spectral bands. The raw interferogram measurements are radiometrically and spectrally calibrated to produce radiance spectra, which are further processed to obtain atmospheric profiles via retrieval algorithms. This paper describes the GIFTS SM EDU Level 1B algorithms involved in the calibration. The GIFTS Level 1B calibration procedures can be subdivided into four blocks. In the first block, the measured raw interferograms are first corrected for the detector nonlinearity distortion, followed by the complex filtering and decimation procedure. In the second block, a phase correction algorithm is applied to the filtered and decimated complex interferograms. The resulting imaginary part of the spectrum contains only the noise component of the uncorrected spectrum. Additional random noise reduction can be accomplished by applying a spectral smoothing routine to the phase-corrected spectrum. The phase correction and spectral smoothing operations are performed on a set of interferogram scans for both ambient and hot blackbody references. To continue with the calibration, we compute the spectral responsivity based on the previous results, from which the calibrated ambient blackbody (ABB), hot blackbody (HBB), and scene spectra can be obtained. We can now estimate the noise equivalent spectral radiance (NESR) from the calibrated ABB and HBB spectra. The correction schemes that compensate for the fore-optics offsets and off-axis effects are also implemented. In the third block, we developed an efficient method of generating pixel performance assessments. In addition, a
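    The two-point blackbody calibration step described above can be sketched compactly: the responsivity follows from the ambient and hot blackbody views and the Planck radiances at their temperatures, and the scene radiance follows from the responsivity. A minimal Python illustration, assuming phase-corrected complex spectra on a common wavenumber grid (the function and variable names are illustrative, not the GIFTS SM EDU code):

    import numpy as np

    H = 6.62607015e-34   # Planck constant [J s]
    C = 2.99792458e8     # speed of light [m/s]
    KB = 1.380649e-23    # Boltzmann constant [J/K]

    def planck_radiance(wavenumber_cm, temp_k):
        """Blackbody spectral radiance on a wavenumber grid [cm^-1]."""
        nu = wavenumber_cm * 100.0  # convert to m^-1
        return 2 * H * C**2 * nu**3 / np.expm1(H * C * nu / (KB * temp_k))

    def calibrate_scene(scene, abb, hbb, wn, t_abb, t_hbb):
        """Two-point calibration from ambient (ABB) and hot (HBB) views."""
        b_abb = planck_radiance(wn, t_abb)
        b_hbb = planck_radiance(wn, t_hbb)
        responsivity = (hbb - abb) / (b_hbb - b_abb)   # counts per radiance unit
        radiance = (scene - abb) / responsivity + b_abb
        # After phase correction the imaginary part carries only noise, so an
        # NESR estimate is the std of radiance.imag over repeated scans.
        return radiance.real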

  2. MISR Level 1B1 Local Mode Radiance Data V002

    Data.gov (United States)

    National Aeronautics and Space Administration — This is the Local Mode Level 1B1 Product containing the DNs radiometrically scaled to radiances with no geometric resampling (Suggested Usage: The MISR Instrument...

  3. BOREAS Level-1B MAS Imagery At-sensor Radiance, Relative X and Y Coordinates

    Science.gov (United States)

    Strub, Richard; Newcomer, Jeffrey A.; Ungar, Stephen

    2000-01-01

    For the BOReal Ecosystem-Atmosphere Study (BOREAS), the MODIS Airborne Simulator (MAS) images, along with the other remotely sensed data, were collected to provide spatially extensive information over the primary study areas. This information includes detailed land cover and biophysical parameter maps, such as fraction of Photosynthetically Active Radiation (fPAR) and Leaf Area Index (LAI). Collection of the MAS images occurred over the study areas during the 1994 field campaigns. The level-1b MAS data cover the dates of 21-Jul-1994, 24-Jul-1994, 04-Aug-1994, and 08-Aug-1994. The data are not geographically/geometrically corrected; however, files of relative X and Y coordinates for each image pixel were derived by using the C-130 INS data in a MAS scan model. The data are provided in binary image format files.

  4. LANDSAT data preprocessing

    Science.gov (United States)

    Austin, W. W.

    1983-01-01

    The effect on LANDSAT data of a Sun angle correction, an intersatellite LANDSAT-2 and LANDSAT-3 data range adjustment, and the atmospheric correction algorithm was evaluated. Fourteen 1978 crop year LACIE sites were used as the site data set. The preprocessing techniques were applied to multispectral scanner channel data, and the transformed data were plotted and used to analyze the effectiveness of the techniques. Ratio transformations effectively reduce the need for preprocessing techniques to be applied directly to the data. Subtractive transformations are more sensitive to Sun angle and atmospheric corrections than ratios. Preprocessing techniques, other than those applied at the Goddard Space Flight Center, should be applied only at the user's option. Although the study was performed on LANDSAT data, the results are also applicable to meteorological satellite data.
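    The finding that ratio transformations reduce the need for explicit corrections has a simple explanation: to first order, a Sun angle change acts as a multiplicative factor on every channel and cancels in a ratio, while a subtractive transformation retains it. A toy numerical illustration in Python (not the LACIE processing code; channel values are synthetic):

    import numpy as np

    rng = np.random.default_rng(0)
    band5 = rng.uniform(20.0, 60.0, 1000)     # synthetic MSS channel values
    band7 = rng.uniform(40.0, 120.0, 1000)

    for cos_sza in (1.0, 0.6):                # two Sun elevations
        b5, b7 = band5 * cos_sza, band7 * cos_sza
        print(f"cos(SZA)={cos_sza}: ratio mean={np.mean(b7 / b5):.3f}, "
              f"difference mean={np.mean(b7 - b5):.3f}")
    # The ratio statistics are identical at both Sun angles, while the
    # difference scales with illumination, matching the study's conclusion.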

  5. Normalization: A Preprocessing Stage

    OpenAIRE

    Patro, S. Gopal Krishna; Sahu, Kishore Kumar

    2015-01-01

    Normalization is a preprocessing stage for almost any type of problem statement. It plays an especially important role in fields such as soft computing and cloud computing, where the range of the data must be scaled down or scaled up before it is used in further stages. There are several normalization techniques, namely Min-Max normalization, Z-score normalization and Decimal scaling normalization. By referring to these normalization techniques we are ...
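    The three techniques named above are easy to state precisely. A short Python sketch (the array values are illustrative):

    import numpy as np

    def min_max(x, new_min=0.0, new_max=1.0):
        """Min-Max normalization: linearly rescale to [new_min, new_max]."""
        return (x - x.min()) / (x.max() - x.min()) * (new_max - new_min) + new_min

    def z_score(x):
        """Z-score normalization: zero mean and unit standard deviation."""
        return (x - x.mean()) / x.std()

    def decimal_scaling(x):
        """Decimal scaling: divide by 10^j so that max(|x|) < 1."""
        j = int(np.floor(np.log10(np.abs(x).max()))) + 1
        return x / 10.0 ** j

    data = np.array([120.0, 250.0, 310.0, 95.0])
    print(min_max(data))
    print(z_score(data))
    print(decimal_scaling(data))   # [0.12  0.25  0.31  0.095]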

  6. High speed preprocessing system

    Indian Academy of Sciences (India)

    M Sankar Kishore

    2000-10-01

    In systems employing tracking, the area of interest is recognized using a high resolution camera and is handed over to the low resolution receiver. The images seen by the low resolution receiver and by the operator through the high resolution camera are different in spatial resolution. In order to establish the correlation between these two images, the high-resolution camera image needs to be preprocessed and made similar to the low-resolution receiver image. This paper discusses the implementation of a suitable preprocessing technique, emphasis being given to developing a system both in hardware and software to reduce processing time. By applying different software/hardware techniques, the execution time has been brought down from a few seconds to a few milliseconds for a typical set of conditions. The hardware is designed around i486 processors and the software is developed in PL/M. The system is tested to match the images obtained by two different sensors of the same scene. The hardware and software have been evaluated with different sets of images.

  7. Application of Accelerometer Data in Precise Orbit Determination of GRACE-A and -B

    Institute of Scientific and Technical Information of China (English)

    Dong-Ju Peng; Bin Wu

    2008-01-01

    We investigate how well the GRACE satellite orbits can be determined using the onboard GPS data combined with the accelerometer data. The preprocessing of the accelerometer data and the methods and models used in the orbit determination are presented. In order to assess the orbit accuracy, a number of tests are made, including external orbit comparison, and through Satellite Laser Ranging (SLR) residuals and K-band ranging (KBR) residuals. It is shown that the standard deviations of the position differences between the so-called precise science orbits (PSO) produced by GFZ, and the single-difference (SD) and zero-difference (ZD) dynamic orbits are about 7 cm and 6 cm, respectively. The independent SLR validation indicates that the overall root-mean-squared (RMS) errors of the SD solution for days 309-329 of 2002 are about 4.93 cm and 5.22 cm for GRACE-A and -B, respectively; the overall RMS errors of the ZD solution are about 4.25 cm and 4.71 cm, respectively. The relative accuracy between the two GRACE satellites is validated by the KBR data to be on a level of 1.29 cm for the SD, and 1.03 cm for the ZD solution.

  8. ITSG-Grace2016 data preprocessing methodologies revisited: impact of using Level-1A data products

    Science.gov (United States)

    Klinger, Beate; Mayer-Gürr, Torsten

    2017-04-01

    For the ITSG-Grace2016 release, the gravity field recovery is based on the use of official GRACE (Gravity Recovery and Climate Experiment) Level-1B data products, generated by the Jet Propulsion Laboratory (JPL). Before gravity field recovery, the Level-1B instrument data are preprocessed. This data preprocessing step includes the combination of Level-1B star camera (SCA1B) and angular acceleration (ACC1B) data for an improved attitude determination (sensor fusion), instrument data screening and ACC1B data calibration. Based on a Level-1A test dataset, provided for individual months throughout the GRACE period by the Center for Space Research at the University of Texas at Austin (UTCSR), the impact of using Level-1A instead of Level-1B data products within the ITSG-Grace2016 processing chain is analyzed. We discuss (1) the attitude determination through an optimal combination of SCA1A and ACC1A data using our sensor fusion approach, (2) the impact of the new attitude product on temporal gravity field solutions, and (3) possible benefits of using Level-1A data for instrument data screening and calibration. As the GRACE mission is currently reaching its end-of-life, the presented work aims not only at a better understanding of GRACE science data to reduce the impact of possible error sources on the gravity field recovery, but also at preparing Level-1A data handling capabilities for the GRACE Follow-On mission.

  9. Preprocessing of NMR metabolomics data.

    Science.gov (United States)

    Euceda, Leslie R; Giskeødegård, Guro F; Bathen, Tone F

    2015-05-01

    Metabolomics involves the large scale analysis of metabolites and thus provides information regarding cellular processes in a biological sample. Independently of the analytical technique used, a vast amount of data is always acquired when carrying out metabolomics studies; this results in complex datasets with large numbers of variables. This type of data requires multivariate statistical analysis for its proper biological interpretation. Prior to multivariate analysis, preprocessing of the data must be carried out to remove unwanted variation such as instrumental or experimental artifacts. This review aims to outline the steps in the preprocessing of NMR metabolomics data and to describe some of the methods used to perform them. Since different preprocessing methods may produce different results, it is important that an appropriate pipeline exists for the selection of the optimal combination of methods in the preprocessing workflow.

  10. Preprocessing of raw metabonomic data.

    Science.gov (United States)

    Vettukattil, Riyas

    2015-01-01

    Recent advances in metabolic profiling techniques allow global profiling of metabolites in cells, tissues, or organisms, using a wide range of analytical techniques such as nuclear magnetic resonance (NMR) spectroscopy and mass spectrometry (MS). The raw data acquired from these instruments carry considerable technical and structural complexity, which makes it statistically difficult to extract meaningful information. Preprocessing involves various computational procedures in which data from the instruments (gas chromatography (GC)/liquid chromatography (LC)-MS, NMR spectra) are converted into a usable form for further analysis and biological interpretation. This chapter covers the common data preprocessing techniques used in metabonomics, focusing primarily on baseline correction, normalization, scaling, peak alignment, peak detection, and quantification. Recent years have witnessed the development of several software tools for data preprocessing, and an overview of the tools frequently used in the data preprocessing pipeline is given.
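    Two of the steps listed above, baseline correction and normalization, can be sketched on a synthetic 1-D spectrum. The iterative polynomial baseline below is one common choice, not the chapter's prescribed method, and all names are illustrative:

    import numpy as np

    def polynomial_baseline(spectrum, order=3, iterations=20):
        """Iteratively fit a low-order polynomial under the spectrum,
        clipping points above the fit so peaks do not pull it up."""
        x = np.arange(spectrum.size)
        work = spectrum.copy()
        for _ in range(iterations):
            coeffs = np.polyfit(x, work, order)
            baseline = np.polyval(coeffs, x)
            work = np.minimum(work, baseline)
        return baseline

    def total_area_normalize(spectrum):
        """Scale so the integrated intensity is 1, making samples comparable."""
        return spectrum / spectrum.sum()

    ppm = np.linspace(0, 10, 2000)
    spectrum = np.exp(-((ppm - 4.0) ** 2) / 0.01) + 0.05 * ppm + 0.2  # one peak on a drifting baseline
    corrected = spectrum - polynomial_baseline(spectrum)
    normalized = total_area_normalize(corrected - corrected.min())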

  11. Development of Level 1b Calibration and Validation Readiness, Implementation and Management Plans for GOES-R

    Science.gov (United States)

    Kunkee, David B.; Farley, Robert W.; Kwan, Betty P.; Hecht, James H.; Walterscheid, Richard L.; Claudepierre, Seth G.; Bishop, Rebecca L.; Gelinas, Lynette J.; Deluccia, Frank J.

    2017-01-01

    A complement of Readiness, Implementation and Management Plans (RIMPs) to facilitate management of post-launch product test activities for the official Geostationary Operational Environmental Satellite (GOES-R) Level 1b (L1b) products has been developed and documented. Separate plans have been created for each of the GOES-R sensors, including the Advanced Baseline Imager (ABI), the Extreme Ultraviolet and X-ray Irradiance Sensors (EXIS), the Geostationary Lightning Mapper (GLM), the GOES-R Magnetometer (MAG), the Space Environment In-Situ Suite (SEISS), and the Solar Ultraviolet Imager (SUVI). The GOES-R program has implemented these RIMPs in order to address the full scope of CalVal activities required for a successful demonstration of GOES-R L1b data product quality throughout the three validation stages: Beta, Provisional and Full Validation. For each product maturity level, the RIMPs include specific performance criteria and required artifacts that provide evidence a given validation stage has been reached, the timing when each stage will be complete, a description of every applicable Post-Launch Product Test (PLPT), roles and responsibilities of personnel, upstream dependencies, and analysis methods and tools to be employed during validation. Instrument-level Post-Launch Tests (PLTs) are also referenced and apply primarily to functional check-out of the instruments.

  12. Data preprocessing in data mining

    CERN Document Server

    García, Salvador; Herrera, Francisco

    2015-01-01

    Data Preprocessing for Data Mining addresses one of the most important issues within the well-known Knowledge Discovery from Data process. Data taken directly from the source will likely have inconsistencies and errors, and, most importantly, it is not ready to be considered for the data mining process. Furthermore, the increasing amount of data in recent science, industry and business applications calls for more complex tools to analyze it. Thanks to data preprocessing, it is possible to convert the impossible into possible, adapting the data to fulfill the input demands of each data mining algorithm. Data preprocessing includes the data reduction techniques, which aim at reducing the complexity of the data, detecting or removing irrelevant and noisy elements from the data. This book is intended to review the tasks that fill the gap between the data acquisition from the source and the data mining process. A comprehensive look from a practical point of view, including basic concepts and surveying t...

  13. Optimal Preprocessing Of GPS Data

    Science.gov (United States)

    Wu, Sien-Chong; Melbourne, William G.

    1994-01-01

    Improved technique for preprocessing data from Global Positioning System receiver reduces processing time and the amount of data to be stored. Optimal in the sense that it maintains the strength of the data. Also increases the ability to resolve ambiguities in the numbers of cycles of received GPS carrier signals.

  14. Effective Feature Preprocessing for Time Series Forecasting

    DEFF Research Database (Denmark)

    Zhao, Junhua; Dong, Zhaoyang; Xu, Zhao

    2006-01-01

    Time series forecasting is an important area in data mining research. Feature preprocessing techniques have significant influence on forecasting accuracy and are therefore essential in a forecasting model. Although several feature preprocessing techniques have been applied in time series forecasting, there is so far no systematic research to study and compare their performance. How to select effective techniques of feature preprocessing in a forecasting model remains a problem. In this paper, the authors conduct a comprehensive study of existing feature preprocessing techniques to evaluate their empirical performance in time series forecasting. It is demonstrated in our experiment that effective feature preprocessing can significantly enhance forecasting accuracy. This research can be a useful guidance for researchers on effectively selecting feature preprocessing techniques and integrating them with time series forecasting models.
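    As a concrete illustration of feature preprocessing for forecasting, the sketch below applies z-score scaling and then builds lagged feature vectors; the paper compares a range of such techniques, and this is not its specific experimental setup:

    import numpy as np

    def make_supervised(series, n_lags):
        """Turn a 1-D series into (X, y) pairs of lagged values and targets."""
        X = np.column_stack([series[i:len(series) - n_lags + i] for i in range(n_lags)])
        y = series[n_lags:]
        return X, y

    rng = np.random.default_rng(1)
    series = np.sin(np.linspace(0, 20, 300)) + 0.1 * rng.standard_normal(300)
    scaled = (series - series.mean()) / series.std()   # scale before windowing
    X, y = make_supervised(scaled, n_lags=5)
    print(X.shape, y.shape)   # (295, 5) (295,)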

  15. Preprocessing of compressed digital video

    Science.gov (United States)

    Segall, C. Andrew; Karunaratne, Passant V.; Katsaggelos, Aggelos K.

    2000-12-01

    Pre-processing algorithms improve the performance of a video compression system by removing spurious noise and insignificant features from the original images. This increases compression efficiency and attenuates coding artifacts. Unfortunately, determining the appropriate amount of pre-filtering is a difficult problem, as it depends on both the content of an image and the target bit rate of the compression algorithm. In this paper, we explore a pre-processing technique that is loosely coupled to the quantization decisions of a rate control mechanism. This technique results in a pre-processing system that operates directly on the Displaced Frame Difference (DFD) and is applicable to any standard-compatible compression system. Results explore the effect of several standard filters on the DFD. An adaptive technique is then considered.

  16. Preprocessing and Morphological Analysis in Text Mining

    Directory of Open Access Journals (Sweden)

    Krishna Kumar Mohbey; Sachin Tiwari

    2011-12-01

    This paper addresses the preprocessing activities that must be performed before mining algorithms are applied to large data collections. Text mining is an important area of data mining and plays a vital role in extracting useful information from large databases or data warehouses. Before text mining or information extraction can be applied, preprocessing is essential, because the given data or dataset may contain noisy, incomplete, inconsistent, dirty and unformatted entries. In this paper we collect the necessary requirements for preprocessing; once the preprocessing task is complete, meaningful information can easily be extracted using a mining strategy. The paper also covers the analysis of data, such as tokenization and stemming, and semantic analysis, such as phrase recognition and parsing, and describes how stemming, tokenization and parsing are applied.
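    A minimal Python sketch of the tokenization and stemming steps discussed above; the tiny suffix-stripping rule stands in for a real stemmer such as Porter's, and the stop-word list is illustrative:

    import re

    STOP_WORDS = {"the", "is", "a", "an", "of", "and", "or", "to"}
    SUFFIXES = ("ing", "edly", "ed", "es", "s")   # checked longest-first

    def tokenize(text):
        """Lowercase and split on non-alphanumeric characters."""
        return [t for t in re.split(r"[^a-z0-9]+", text.lower()) if t]

    def stem(token):
        """Naive suffix stripping; a real system would use Porter/Snowball."""
        for suffix in SUFFIXES:
            if token.endswith(suffix) and len(token) > len(suffix) + 2:
                return token[: -len(suffix)]
        return token

    text = "Preprocessing removes the noisy and unformatted parts of documents."
    tokens = [stem(t) for t in tokenize(text) if t not in STOP_WORDS]
    print(tokens)  # ['preprocess', 'remov', 'noisy', 'unformatt', 'part', 'document']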

  17. Facilitating Watermark Insertion by Preprocessing Media

    Directory of Open Access Journals (Sweden)

    Matt L. Miller

    2004-10-01

    There are several watermarking applications that require the deployment of a very large number of watermark embedders. These applications often have severe budgetary constraints that limit the computation resources that are available. Under these circumstances, only simple embedding algorithms can be deployed, which have limited performance. In order to improve performance, we propose preprocessing the original media. It is envisaged that this preprocessing occurs during content creation and has no budgetary or computational constraints. Preprocessing combined with simple embedding creates a watermarked Work, the performance of which exceeds that of simple embedding alone. However, this performance improvement is obtained without any increase in the computational complexity of the embedder. Rather, the additional computational burden is shifted to the preprocessing stage. A simple example of this procedure is described and experimental results confirm our assertions.

  18. Tongji-GRACE01: A GRACE-only static gravity field model recovered from GRACE Level-1B data using modified short arc approach

    Science.gov (United States)

    Chen, Qiujie; Shen, Yunzhong; Zhang, Xingfu; Chen, Wu; Hsu, Houze

    2015-09-01

    The modified short arc approach, in which the position vectors in the force model are regarded as pseudo-observations, is implemented in the SAtellite Gravimetry Analysis Software (SAGAS) developed by Tongji University. Based on the SAGAS platform, a static gravity field model (namely Tongji-GRACE01) complete to degree and order 160 is computed from 49 months of real GRACE Level-1B data spanning the period 2003-2007 (including the observations of K-band range-rate, reduced dynamic orbits, non-conservative accelerations and attitudes). The Tongji-GRACE01 model is compared with recent GRACE-only models (such as GGM05S, AIUB-GRACE03S, ITG-GRACE03, ITG-GRACE2010S, and ITSG-GRACE2014S) and validated with GPS-leveling data sets in different countries. The results show that the Tongji-GRACE01 model is of comparable quality to GGM05S, AIUB-GRACE03S and ITG-GRACE03. The Tongji-GRACE01 model is available at the International Centre for Global Earth Models (ICGEM) web page (http://icgem.gfz-potsdam.de/ICGEM/).

  1. Preprocessing of ionospheric echo Doppler spectra

    Institute of Scientific and Technical Information of China (English)

    FANG Liang; ZHAO Zhengyu; WANG Feng; SU Fanfan

    2007-01-01

    The real-time information of the distant ionosphere can be acquired by using the Wuhan Ionospheric Oblique Backscattering Sounding System (WIOBSS), which adopts a discontinuous wave mechanism. After the characteristics of the ionospheric echo Doppler spectra were analyzed, a signal preprocessing stage was developed in this paper, aimed at improving the Doppler spectra. The results indicate that the preprocessing not only gives the system a higher target-detection capability but also suppresses the radio frequency interference by 6-7 dB.

  2. Preprocessing Moist Lignocellulosic Biomass for Biorefinery Feedstocks

    Energy Technology Data Exchange (ETDEWEB)

    Neal Yancey; Christopher T. Wright; Craig Conner; J. Richard Hess

    2009-06-01

    Biomass preprocessing is one of the primary operations in the feedstock assembly system of a lignocellulosic biorefinery. Preprocessing is generally accomplished using industrial grinders to format biomass materials into a suitable biorefinery feedstock for conversion to ethanol and other bioproducts. Many factors affect machine efficiency and the physical characteristics of preprocessed biomass. For example, moisture content of the biomass as received from the point of production has a significant impact on overall system efficiency and can significantly affect the characteristics (particle size distribution, flowability, storability, etc.) of the size-reduced biomass. Many different grinder configurations are available on the market, each with advantages under specific conditions. Ultimately, the capacity and/or efficiency of the grinding process can be enhanced by selecting the grinder configuration that optimizes grinder performance based on moisture content and screen size. This paper discusses the relationships of biomass moisture with respect to preprocessing system performance and product physical characteristics and compares data obtained on corn stover, switchgrass, and wheat straw as model feedstocks during Vermeer HG 200 grinder testing. During the tests, grinder screen configuration and biomass moisture content were varied and tested to provide a better understanding of their relative impact on machine performance and the resulting feedstock physical characteristics and uniformity relative to each crop tested.

  3. Efficient Preprocessing technique using Web log mining

    Science.gov (United States)

    Raiyani, Sheetal A.; jain, Shailendra

    2012-11-01

    Web usage mining can be described as the discovery and analysis of user access patterns through mining of log files and associated data from a particular web site. Numerous visitors interact daily with web sites around the world; the enormous amount of data they generate can be of great value to a company seeking to understand customer behavior. This paper presents a complete preprocessing scheme comprising data cleaning and user and session identification activities to improve the quality of the data. User identification, a key issue in the preprocessing phase, aims to identify the unique web users. Traditional user identification is based on the site structure, supported by heuristic rules, which reduces its efficiency. To address this difficulty we introduce a proposed technique, DUI (Distinct User Identification), based on IP address, agent, session time, and pages referred during the desired session time. It can be used in counter-terrorism, fraud detection and detection of unusual access to secure data, and, through detection of users' regular access behavior, it can improve the overall design and performance of the site.
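    A sketch of the DUI idea as described in the abstract: treat the combination of IP address and user agent as a distinct user, and start a new session after a period of inactivity. The log format and the 30-minute timeout are assumptions, not the paper's parameters:

    from datetime import datetime, timedelta

    SESSION_TIMEOUT = timedelta(minutes=30)

    def distinct_users(entries):
        """entries: iterable of (timestamp, ip, agent, page) tuples, sorted by
        timestamp. Returns a dict: (ip, agent, session_id) -> visited pages."""
        users = {}      # (ip, agent) -> (last_seen, session_id)
        sessions = {}   # (ip, agent, session_id) -> pages
        for ts, ip, agent, page in entries:
            key = (ip, agent)
            last_seen, session_id = users.get(key, (None, 0))
            if last_seen is not None and ts - last_seen > SESSION_TIMEOUT:
                session_id += 1          # same IP/agent after a gap: new session
            users[key] = (ts, session_id)
            sessions.setdefault((ip, agent, session_id), []).append(page)
        return sessions

    log = [
        (datetime(2012, 11, 1, 9, 0), "10.0.0.1", "Firefox", "/index"),
        (datetime(2012, 11, 1, 9, 5), "10.0.0.1", "Firefox", "/news"),
        (datetime(2012, 11, 1, 11, 0), "10.0.0.1", "Firefox", "/index"),  # new session
    ]
    print(distinct_users(log))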

  4. A PREPROCESSING LS-CMA IN HIGHLY CORRUPTIVE ENVIRONMENT

    Institute of Scientific and Technical Information of China (English)

    Guo Yan; Fang Dagang; Thomas N.C.Wang; Liang Changhong

    2002-01-01

    A fast preprocessing Least Square-Constant Modulus Algorithm (LS-CMA) is proposed for blind adaptive beamforming. This new preprocessing method precludes noise capture caused by the original LS-CMA with the preprocessing procedure controlled by the static Constant Modulus Algorithm (CMA). The simulation results have shown that the proposed fast preprocessing LS-CMA can effectively reject the co-channel interference, and quickly lock onto the constant modulus desired signal with only one snapshot in a highly corruptive environment.

  5. The preprocessing of multispectral data. II. [of Landsat satellite]

    Science.gov (United States)

    Quiel, F.

    1976-01-01

    It is pointed out that a correction of atmospheric effects is an important requirement for a full utilization of the possibilities provided by preprocessing techniques. The most significant characteristics of original and preprocessed data are considered, taking into account the solution of classification problems by means of the preprocessing procedure. Improvements obtainable with different preprocessing techniques are illustrated with the aid of examples involving Landsat data regarding an area in Colorado.

  6. Pre-processing Tasks in Indonesian Twitter Messages

    Science.gov (United States)

    Hidayatullah, A. F.; Ma’arif, M. R.

    2017-01-01

    Twitter text messages are very noisy; moreover, tweet data are unstructured and quite complicated. The focus of this work is to investigate pre-processing techniques for Twitter messages in Bahasa Indonesia. The main goal of this experiment is to clean the tweet data for further analysis; thus, the objective of this pre-processing task is simply to remove all meaningless characters and keep valuable words. In this research, we divide our proposed pre-processing experiments into two parts. The first part consists of common pre-processing tasks; the second part is a pre-processing task specific to tweet data. From the experimental results we conclude that by employing pre-processing tasks tailored to the characteristics of tweet data we obtain more valuable results: far fewer meaningless words remain than when only common pre-processing tasks are run.
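    A sketch separating the two stages described above, common cleaning versus tweet-specific cleaning; the exact rules used in the paper may differ:

    import re

    def common_preprocess(text):
        """Generic cleaning: lowercase, strip punctuation and extra spaces."""
        text = text.lower()
        text = re.sub(r"[^\w\s]", " ", text)
        return re.sub(r"\s+", " ", text).strip()

    def tweet_preprocess(text):
        """Tweet-specific cleaning first: retweet markers, URLs, mentions, hashtags."""
        text = re.sub(r"\brt\b", " ", text, flags=re.IGNORECASE)  # retweet marker
        text = re.sub(r"https?://\S+", " ", text)                 # URLs
        text = re.sub(r"[@#]\w+", " ", text)                      # mentions, hashtags
        return common_preprocess(text)

    tweet = "RT @budi: Macet parah di Jakarta pagi ini #macet http://t.co/xyz"
    print(tweet_preprocess(tweet))   # -> "macet parah di jakarta pagi ini"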

  7. GRace: a MATLAB-based application for fitting the discrimination-association model.

    Science.gov (United States)

    Stefanutti, Luca; Vianello, Michelangelo; Anselmi, Pasquale; Robusto, Egidio

    2014-10-28

    The Implicit Association Test (IAT) is a computerized two-choice discrimination task in which stimuli have to be categorized as belonging to target categories or attribute categories by pressing, as quickly and accurately as possible, one of two response keys. The discrimination association model has been recently proposed for the analysis of reaction time and accuracy of an individual respondent to the IAT. The model disentangles the influences of three qualitatively different components on the responses to the IAT: stimuli discrimination, automatic association, and termination criterion. The article presents General Race (GRace), a MATLAB-based application for fitting the discrimination association model to IAT data. GRace has been developed for Windows as a standalone application. It is user-friendly and does not require any programming experience. The use of GRace is illustrated on the data of a Coca Cola-Pepsi Cola IAT, and the results of the analysis are interpreted and discussed.

  8. Acquisition and preprocessing of LANDSAT data

    Science.gov (United States)

    Horn, T. N.; Brown, L. E.; Anonsen, W. H. (Principal Investigator)

    1979-01-01

    The original configuration of the GSFC data acquisition, preprocessing, and transmission subsystem, designed to provide LANDSAT data inputs to the LACIE system at JSC, is described. Enhancements made to support LANDSAT -2, and modifications for LANDSAT -3 are discussed. Registration performance throughout the 3 year period of LACIE operations satisfied the 1 pixel root-mean-square requirements established in 1974, with more than two of every three attempts at data registration proving successful, notwithstanding cosmetic faults or content inadequacies to which the process is inherently susceptible. The cloud/snow rejection rate experienced throughout the last 3 years has approached 50%, as expected in most LANDSAT data use situations.

  9. Approximate Distance Oracles with Improved Preprocessing Time

    CERN Document Server

    Wulff-Nilsen, Christian

    2011-01-01

    Given an undirected graph $G$ with $m$ edges, $n$ vertices, and non-negative edge weights, and given an integer $k \geq 1$, we show that for some universal constant $c$, a $(2k-1)$-approximate distance oracle for $G$ of size $O(kn^{1 + 1/k})$ can be constructed in $O(\sqrt{k}\,m + kn^{1 + c/\sqrt{k}})$ time and can answer queries in $O(k)$ time. We also give an oracle which is faster for smaller $k$. Our results break the quadratic preprocessing time bound of Baswana and Kavitha for all $k \geq 6$ and improve the $O(kmn^{1/k})$ time bound of Thorup and Zwick except for very sparse graphs and small $k$. When $m = \Omega(n^{1 + c/\sqrt{k}})$ and $k = O(1)$, our oracle is optimal w.r.t. stretch, size, preprocessing time, and query time, assuming a widely believed girth conjecture by Erdős.

  10. The Registration of Knee Joint Images with Preprocessing

    Directory of Open Access Journals (Sweden)

    Zhenyan Ji

    2011-06-01

    The registration of CT and MR images is important for analyzing the effect of PCL and ACL deficiency on the knee joint. Because CT and MR images have different limitations, we need to register the CT and MR images of the knee joint and then build a model to analyze the stress distribution on the joint. In our project, we adopt image registration based on mutual information. In knee joint images, the information about adipose, muscle and other soft tissue affects the registration accuracy. To eliminate this interference, we propose a combined preprocessing solution, BEBDO, which consists of five steps: image blurring, image enhancement, image blurring, image edge detection and image outline preprocessing. We also designed the algorithm for image outline preprocessing. At the end of the paper, an experiment compares the image registration results with and without the preprocessing; the results prove that the preprocessing can improve the image registration accuracy.
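    The registration criterion itself is compact: mutual information can be computed from the joint grayscale histogram of the two (tentatively aligned) images. A minimal sketch, independent of the BEBDO preprocessing details:

    import numpy as np

    def mutual_information(img_a, img_b, bins=32):
        """MI of two equally sized images from their joint histogram."""
        joint, _, _ = np.histogram2d(img_a.ravel(), img_b.ravel(), bins=bins)
        pxy = joint / joint.sum()
        px, py = pxy.sum(axis=1), pxy.sum(axis=0)
        nz = pxy > 0                            # avoid log(0)
        return float(np.sum(pxy[nz] * np.log(pxy[nz] / np.outer(px, py)[nz])))

    # A registration loop would transform one image over candidate rotations
    # and translations and keep the pose that maximizes the MI score.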

  11. An effective preprocessing method for finger vein recognition

    Science.gov (United States)

    Peng, JiaLiang; Li, Qiong; Wang, Ning; Abd El-Latif, Ahmed A.; Niu, Xiamu

    2013-07-01

    Image preprocessing plays an important role in a finger vein recognition system; however, previous preprocessing schemes have weaknesses that must be resolved to achieve high finger vein recognition performance. In this paper, we propose a new finger vein preprocessing scheme that includes finger region localization, alignment, finger vein ROI segmentation and enhancement. The experimental results show that the proposed scheme is capable of enhancing the quality of finger vein images effectively and reliably.

  12. User microprogrammable processors for high data rate telemetry preprocessing

    Science.gov (United States)

    Pugsley, J. H.; Ogrady, E. P.

    1973-01-01

    The use of microprogrammable processors for the preprocessing of high data rate satellite telemetry is investigated. The following topics are discussed along with supporting studies: (1) evaluation of commercial microprogrammable minicomputers for telemetry preprocessing tasks; (2) microinstruction sets for telemetry preprocessing; and (3) the use of multiple minicomputers to achieve high data processing rates. The simulation of small microprogrammed processors is discussed along with examples of microprogrammed processors.

  13. Preprocessing and Analysis of Digitized ECGs

    Science.gov (United States)

    Villalpando, L. E. Piña; Kurmyshev, E.; Ramírez, S. Luna; Leal, L. Delgado

    2008-08-01

    In this work we propose a methodology, with programs in MATLAB, that performs the preprocessing and analysis of lead D1 of ECGs. The program corrects each beat to the isoelectric line, calculates the average cardiac frequency and its standard deviation, and generates a file with the amplitudes of the P, Q and T waves, as well as the important segments and intervals of each beat. The software normalizes beats to a standard rate of 80 beats per minute; the superposition of beats is done by centering the R waves, before and after normalizing the amplitude of each beat. The data and graphics provide relevant information to the doctor for diagnosis. In addition, some results are displayed similar to those presented by a Holter recording.
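    Two of the described steps, R-wave detection for the cardiac frequency statistics and per-beat isoelectric correction, can be sketched with scipy; the thresholds and the 360 Hz sampling rate are assumptions, not the authors' MATLAB settings:

    import numpy as np
    from scipy.signal import find_peaks

    FS = 360.0  # sampling frequency [Hz], assumed

    def heart_rate(ecg):
        """Locate R waves; return mean rate and its standard deviation [bpm]."""
        peaks, _ = find_peaks(ecg, height=0.5 * ecg.max(), distance=int(0.3 * FS))
        rr = np.diff(peaks) / FS            # R-R intervals in seconds
        rates = 60.0 / rr
        return rates.mean(), rates.std(), peaks

    def correct_isoelectric(ecg, peaks):
        """Shift each beat so its pre-R (PQ) segment sits at the zero level."""
        out = ecg.copy()
        bounds = [0, *((peaks[:-1] + peaks[1:]) // 2), len(ecg)]
        for start, end, r in zip(bounds[:-1], bounds[1:], peaks):
            pq = out[max(start, r - int(0.08 * FS)):r]   # ~80 ms before R
            if pq.size:
                out[start:end] -= pq.mean()
        return out

    # rate_mean, rate_std, r_peaks = heart_rate(ecg_signal)
    # corrected = correct_isoelectric(ecg_signal, r_peaks)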

  14. The Effect of Preprocessing on Arabic Document Categorization

    Directory of Open Access Journals (Sweden)

    Abdullah Ayedh

    2016-04-01

    Preprocessing is one of the main components in a conventional document categorization (DC) framework. This paper aims to highlight the effect of preprocessing tasks on the efficiency of an Arabic DC system. In this study, three classification techniques are used, namely naive Bayes (NB), k-nearest neighbor (KNN), and support vector machine (SVM). Experimental analysis on Arabic datasets reveals that preprocessing techniques have a significant impact on classification accuracy, especially with the complicated morphological structure of the Arabic language. Choosing appropriate combinations of preprocessing tasks provides significant improvement in the accuracy of document categorization, depending on the feature size and classification technique. The findings of this study show that the SVM technique outperformed the KNN and NB techniques, achieving a 96.74% micro-F1 value with the combination of normalization and stemming as preprocessing tasks.
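    Arabic-specific normalization of the kind evaluated in such studies typically unifies alef variants, teh marbuta and alef maqsura, and strips diacritics. A minimal sketch (the paper's exact normalization and stemmer may differ):

    import re

    DIACRITICS = re.compile(r"[\u064B-\u0652\u0640]")  # harakat + tatweel

    def normalize_arabic(text):
        text = DIACRITICS.sub("", text)
        text = re.sub("[\u0622\u0623\u0625]", "\u0627", text)  # alef variants -> bare alef
        text = text.replace("\u0629", "\u0647")                # teh marbuta -> heh
        text = text.replace("\u0649", "\u064A")                # alef maqsura -> yeh
        return text

    print(normalize_arabic("أُمَّة"))  # -> "امه"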

  15. Forensic considerations for preprocessing effects on clinical MDCT scans.

    Science.gov (United States)

    Wade, Andrew D; Conlogue, Gerald J

    2013-05-01

    Manipulation of digital photographs destined for medico-legal inquiry must be thoroughly documented and presented with explanation of any manipulations. Unlike digital photography, computed tomography (CT) data must pass through an additional step before viewing. Reconstruction of raw data involves reconstruction algorithms to preprocess the raw information into display data. Preprocessing of raw data, although it occurs at the source, alters the images and must be accounted for in the same way as postprocessing. Repeated CT scans of a gunshot wound phantom were made using the Toshiba Aquilion 64-slice multidetector CT scanner. The appearance of fragments, high-density inclusion artifacts, and soft tissue were assessed. Preprocessing with different algorithms results in substantial differences in image output. It is important to appreciate that preprocessing affects the image, that it does so differently in the presence of high-density inclusions, and that preprocessing algorithms and scanning parameters may be used to overcome the resulting artifacts.

  16. Feature detection techniques for preprocessing proteomic data.

    Science.gov (United States)

    Sellers, Kimberly F; Miecznikowski, Jeffrey C

    2010-01-01

    Numerous gel-based and nongel-based technologies are used to detect protein changes potentially associated with disease. The raw data, however, are abundant with technical and structural complexities, making statistical analysis a difficult task. Low-level analysis issues (including normalization, background correction, gel and/or spectral alignment, feature detection, and image registration) are substantial problems that need to be addressed, because any large-level data analyses are contingent on appropriate and statistically sound low-level procedures. Feature detection approaches are particularly interesting due to the increased computational speed associated with subsequent calculations. Such summary data corresponding to image features provide a significant reduction in overall data size and structure while retaining key information. In this paper, we focus on recent advances in feature detection as a tool for preprocessing proteomic data. This work highlights existing and newly developed feature detection algorithms for proteomic datasets, particularly relating to time-of-flight mass spectrometry, and two-dimensional gel electrophoresis. Note, however, that the associated data structures (i.e., spectral data, and images containing spots) used as input for these methods are obtained via all gel-based and nongel-based methods discussed in this manuscript, and thus the discussed methods are likewise applicable.

  17. OrbView-3 Level 1B

    Data.gov (United States)

    U.S. Geological Survey, Department of the Interior — GeoEye's OrbView-3 satellite was among the world's first commercial satellites to provide high-resolution imagery from space. OrbView-3 collected one meter...

  18. Enhanced bone structural analysis through pQCT image preprocessing.

    Science.gov (United States)

    Cervinka, T; Hyttinen, J; Sievanen, H

    2010-05-01

    Several factors, including preprocessing of the image, can affect the reliability of pQCT-measured bone traits, such as cortical area and trabecular density. Using repeated scans of four different liquid phantoms and repeated in vivo scans of distal tibiae from 25 subjects, the performance of two novel preprocessing methods, based on the down-sampling of the grayscale intensity histogram and the statistical approximation of image data, was compared to 3 x 3 and 5 x 5 median filtering. According to phantom measurements, the signal to noise ratio in the raw pQCT images (XCT 3000) was low (approximately 20 dB), which posed a challenge for preprocessing. Concerning the cortical analysis, the reliability coefficient (R) was 67% for the raw image and increased to 94-97% after preprocessing, without apparent preference for any method. Concerning the trabecular density, the R-values were already high (approximately 99%) in the raw images, leaving virtually no room for improvement. However, some coarse structural patterns could be seen in the preprocessed images, in contrast to a disperse distribution of density levels in the raw image. In conclusion, preprocessing cannot suppress the high noise level to the extent that the analysis of mean trabecular density is essentially improved, whereas preprocessing can enhance cortical bone analysis and also facilitate coarse structural analyses of the trabecular region.
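    The two baseline methods the study compares against, 3 x 3 and 5 x 5 median filtering, together with a crude grayscale histogram down-sampling, can be sketched as follows; the statistical-approximation method itself is not reproduced here, and the noisy image is a synthetic stand-in for a pQCT slice:

    import numpy as np
    from scipy.ndimage import median_filter

    rng = np.random.default_rng(7)
    image = rng.normal(100.0, 20.0, size=(128, 128))   # synthetic noisy slice

    smoothed_3x3 = median_filter(image, size=3)
    smoothed_5x5 = median_filter(image, size=5)

    def downsample_gray_levels(img, n_levels=64):
        """Down-sample the grayscale intensity histogram to n_levels bins."""
        edges = np.linspace(img.min(), img.max(), n_levels + 1)
        centers = (edges[:-1] + edges[1:]) / 2
        idx = np.clip(np.digitize(img, edges) - 1, 0, n_levels - 1)
        return centers[idx]

    quantized = downsample_gray_levels(smoothed_3x3)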

  1. An adaptive preprocessing algorithm for low bitrate video coding

    Institute of Scientific and Technical Information of China (English)

    LI Mao-quan; XU Zheng-quan

    2006-01-01

    At low bit rates, all block discrete cosine transform (BDCT) based video coding algorithms suffer from visible blocking and ringing artifacts in the reconstructed images, because the quantization is too coarse and high-frequency DCT coefficients tend to be quantized to zero. Preprocessing algorithms can enhance coding efficiency, and thus reduce the likelihood of blocking and ringing artifacts generated in the video coding process, by applying a low-pass filter before video encoding to remove some relatively insignificant high-frequency components. In this paper, we introduce a new adaptive preprocessing algorithm, which employs an improved bilateral filter to provide adaptive edge-preserving low-pass filtering that is adjusted according to the quantization parameters. Whether at low or high bit rate, the preprocessing provides proper filtering to make the video encoder more efficient and yield better reconstructed image quality. Experimental results demonstrate that our proposed preprocessing algorithm can significantly improve both subjective and objective quality.

  2. Solid Earth ARISTOTELES mission data preprocessing simulation of gravity gradiometer

    Science.gov (United States)

    Avanzi, G.; Stolfa, R.; Versini, B.

    Data preprocessing for the ARISTOTELES mission, which measures the Earth gravity gradient in a near polar orbit, was studied. The mission measures the gravity field at sea level through indirect measurements performed on the orbit, so the evaluation steps consist of processing data from GRADIO accelerometer measurements. Due to the physical phenomena involved in the data collection experiment, it is possible to isolate at an initial stage a preprocessing of the gradiometer data based only on GRADIO measurements, not needing detailed knowledge of the attitude and attitude rate sensor outputs. This preprocessing produces intermediate quantities used in later stages of the reduction. Software was designed and run to evaluate, for this level of data reduction, the achievable accuracy as a function of the knowledge of instrument and satellite status parameters. The architecture of this element of preprocessing is described.

  3. Preprocessing Algorithm for Deciphering Historical Inscriptions Using String Metric

    Directory of Open Access Journals (Sweden)

    Lorand Lehel Toth

    2016-07-01

    The article presents improvements to the preprocessing part of a deciphering method (in short, the preprocessing algorithm) for historical inscriptions of unknown origin. Glyphs used in historical inscriptions changed through time; therefore, various versions of the same script may contain different glyphs for each grapheme. The purpose of the preprocessing algorithm is to reduce the running time of the deciphering process by filtering out the less probable interpretations of the examined inscription. However, the first version of the preprocessing algorithm produced an incorrect outcome, or no result at all, in certain cases. Therefore, an improved version was developed to find the most similar words in the dictionary by specifying the search conditions more accurately, while remaining computationally efficient. Moreover, a sophisticated similarity metric used to determine the possible meaning of the unknown inscription is introduced. The results of the evaluations are also detailed.
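    The dictionary-filtering idea can be sketched with a plain edit distance: rank dictionary words by their distance to a candidate transliteration and keep only the closest ones. The paper's actual similarity metric is more sophisticated than Levenshtein distance, and the sample dictionary is illustrative:

    def levenshtein(a, b):
        """Classic dynamic-programming edit distance between two strings."""
        prev = list(range(len(b) + 1))
        for i, ca in enumerate(a, 1):
            curr = [i]
            for j, cb in enumerate(b, 1):
                curr.append(min(prev[j] + 1,                 # deletion
                                curr[j - 1] + 1,             # insertion
                                prev[j - 1] + (ca != cb)))   # substitution
            prev = curr
        return prev[-1]

    def closest_words(candidate, dictionary, max_distance=2):
        scored = [(levenshtein(candidate, w), w) for w in dictionary]
        return sorted((d, w) for d, w in scored if d <= max_distance)

    dictionary = ["kata", "kutya", "kapu", "hal", "kating"]
    print(closest_words("katu", dictionary))  # -> [(1, 'kapu'), (1, 'kata')]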

  4. A review of statistical methods for preprocessing oligonucleotide microarrays.

    Science.gov (United States)

    Wu, Zhijin

    2009-12-01

    Microarrays have become an indispensable tool in biomedical research. This powerful technology not only makes it possible to quantify a large number of nucleic acid molecules simultaneously, but also produces data with many sources of noise. A number of preprocessing steps are therefore necessary to convert the raw data, usually in the form of hybridisation images, to measures of biological meaning that can be used in further statistical analysis. Preprocessing of oligonucleotide arrays includes image processing, background adjustment, data normalisation/transformation and sometimes summarisation when multiple probes are used to target one genomic unit. In this article, we review the issues encountered in each preprocessing step and introduce the statistical models and methods in preprocessing.

  5. Preprocessing for classification of thermograms in breast cancer detection

    Science.gov (United States)

    Neumann, Łukasz; Nowak, Robert M.; Okuniewski, Rafał; Oleszkiewicz, Witold; Cichosz, Paweł; Jagodziński, Dariusz; Matysiewicz, Mateusz

    2016-09-01

    The performance of binary classification of breast cancer suffers from high imbalance between classes. In this article we present a preprocessing module designed to negate the discrepancy in training examples. The preprocessing module is based on standardization, the Synthetic Minority Oversampling Technique (SMOTE) and undersampling. We show how each algorithm influences classification accuracy. Results indicate that the described module improves the overall Area Under Curve by up to 10% on the tested dataset. Furthermore, we propose other methods of dealing with imbalanced datasets in breast cancer classification.
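    A sketch of such a module with scikit-learn and imbalanced-learn; the synthetic data, sampling ratios and classifier are placeholders rather than the authors' configuration:

    from imblearn.over_sampling import SMOTE
    from imblearn.under_sampling import RandomUnderSampler
    from imblearn.pipeline import Pipeline
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.preprocessing import StandardScaler

    # 90/10 imbalanced toy data standing in for thermogram features
    X, y = make_classification(n_samples=1000, weights=[0.9], random_state=0)

    # imblearn's Pipeline (unlike sklearn's) applies samplers during fit only
    pipeline = Pipeline(steps=[
        ("standardize", StandardScaler()),                  # zero mean, unit variance
        ("oversample", SMOTE(sampling_strategy=0.5)),       # synthesize minority cases
        ("undersample", RandomUnderSampler(sampling_strategy=1.0)),  # trim majority
        ("classify", LogisticRegression(max_iter=1000)),
    ])
    pipeline.fit(X, y)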

  6. Evaluating the impact of image preprocessing on iris segmentation

    Directory of Open Access Journals (Sweden)

    José F. Valencia-Murillo

    2014-08-01

    Segmentation is one of the most important stages in iris recognition systems. In this paper, image preprocessing algorithms are applied in order to evaluate their impact on successful iris segmentation. The preprocessing algorithms are based on histogram adjustment, Gaussian filters and suppression of specular reflections in human eye images. The segmentation method introduced by Masek is applied to 199 images acquired under unconstrained conditions, belonging to the CASIA-IrisV3 database, before and after applying the preprocessing algorithms. The impact of the image preprocessing algorithms on the percentage of successful iris segmentation is then evaluated by means of a visual inspection of the images, in order to determine whether the circumferences of the iris and pupil were detected correctly. An increase from 59% to 73% in the percentage of successful iris segmentation is obtained with an algorithm that combines elimination of specular reflections with a Gaussian filter having a 5x5 kernel. The results highlight the importance of a preprocessing stage as a previous step to improve performance during the edge detection and iris segmentation processes.
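    The best-performing combination reported above (specular-reflection removal followed by a 5x5 Gaussian filter) can be sketched with OpenCV; the brightness threshold and inpainting radius are assumptions:

    import cv2
    import numpy as np

    def preprocess_eye(gray):
        """gray: 8-bit grayscale eye image, made ready for segmentation."""
        _, highlights = cv2.threshold(gray, 240, 255, cv2.THRESH_BINARY)
        highlights = cv2.dilate(highlights, np.ones((3, 3), np.uint8))
        filled = cv2.inpaint(gray, highlights, 5, cv2.INPAINT_TELEA)  # remove specular spots
        return cv2.GaussianBlur(filled, (5, 5), 0)                    # 5x5 Gaussian kernel

    # eye = cv2.imread("eye.png", cv2.IMREAD_GRAYSCALE)
    # ready_for_segmentation = preprocess_eye(eye)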

  7. Effect of microaerobic fermentation in preprocessing fibrous lignocellulosic materials.

    Science.gov (United States)

    Alattar, Manar Arica; Green, Terrence R; Henry, Jordan; Gulca, Vitalie; Tizazu, Mikias; Bergstrom, Robby; Popa, Radu

    2012-06-01

    Amending soil with organic matter is common in agricultural and logging practices. Such amendments have benefits to soil fertility and crop yields. These benefits may be increased if material is preprocessed before introduction into soil. We analyzed the efficiency of microaerobic fermentation (MF), also referred to as Bokashi, in preprocessing fibrous lignocellulosic (FLC) organic materials using varying produce amendments and leachate treatments. Adding produce amendments increased leachate production and fermentation rates and decreased the biological oxygen demand of the leachate. Continuously draining leachate without returning it to the fermentors led to acidification and decreased concentrations of polysaccharides (PS) in leachates. PS fragmentation and the production of soluble metabolites and gases stabilized in fermentors in about 2-4 weeks. About 2 % of the carbon content was lost as CO(2). PS degradation rates, upon introduction of processed materials into soil, were similar to unfermented FLC. Our results indicate that MF is insufficient for adequate preprocessing of FLC material.

  8. Exploration, visualization, and preprocessing of high-dimensional data.

    Science.gov (United States)

    Wu, Zhijin; Wu, Zhiqiang

    2010-01-01

    The rapid advances in biotechnology have given rise to a variety of high-dimensional data. Many of these data, including DNA microarray data, mass spectrometry protein data, and high-throughput screening (HTS) assay data, are generated by complex experimental procedures that involve multiple steps such as sample extraction, purification and/or amplification, labeling, fragmentation, and detection. Therefore, the quantity of interest is not directly obtained and a number of preprocessing procedures are necessary to convert the raw data into the format with biological relevance. This also makes exploratory data analysis and visualization essential steps to detect possible defects, anomalies or distortion of the data, to test underlying assumptions and thus ensure data quality. The characteristics of the data structure revealed in exploratory analysis often motivate decisions in preprocessing procedures to produce data suitable for downstream analysis. In this chapter we review the common techniques in exploring and visualizing high-dimensional data and introduce the basic preprocessing procedures.

  9. Data Preprocessing in Cluster Analysis of Gene Expression

    Institute of Scientific and Technical Information of China (English)

    杨春梅; 万柏坤; 高晓峰

    2003-01-01

    Considering that DNA microarray technology has generated an explosion of gene expression data, and that it is urgent to analyse and visualize such massive datasets with efficient methods, we investigate the data preprocessing methods used in cluster analysis, normalization and logarithm of the matrix, by using hierarchical clustering, principal component analysis (PCA) and self-organizing maps (SOMs). The results illustrate that, when using the Euclidean distance as the measuring metric, the logarithm of the relative expression level is the best preprocessing method, while data preprocessed by normalization cannot attain the expected results because the data structure is ruined. If there are only a few principal components, PCA is an effective method to extract the frame structure, while SOMs are more suitable for a specific structure.
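    The favoured preprocessing, the logarithm of relative expression levels, can be sketched ahead of hierarchical clustering; the synthetic matrix and the choice of reference sample are illustrative:

    import numpy as np
    from scipy.cluster.hierarchy import linkage

    rng = np.random.default_rng(3)
    expression = rng.lognormal(mean=2.0, sigma=1.0, size=(50, 6))  # genes x samples
    reference = expression[:, :1]                  # hypothetical reference sample
    log_ratio = np.log2(expression / reference)    # log of relative expression level
    tree = linkage(log_ratio, method="average", metric="euclidean")  # Euclidean metric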

  10. Micro-Analyzer: automatic preprocessing of Affymetrix microarray data.

    Science.gov (United States)

    Guzzi, Pietro Hiram; Cannataro, Mario

    2013-08-01

    A current trend in genomics is the investigation of the cell mechanism using different technologies, in order to explain the relationship among genes, molecular processes and diseases. For instance, the combined use of gene-expression arrays and genomic arrays has been demonstrated as an effective instrument in clinical practice. Consequently, in a single experiment different kinds of microarrays may be used, resulting in the production of different types of binary data (images and textual raw data). The analysis of microarray data requires an initial preprocessing phase that makes raw data suitable for use on existing analysis platforms, such as the TIGR M4 (TM4) Suite. An additional challenge to be faced by emerging data analysis platforms is the ability to treat in a combined way those different microarray formats coupled with clinical data. In fact, resulting integrated data may include both numerical and symbolic data (e.g. gene expression and SNPs regarding molecular data), as well as temporal data (e.g. the response to a drug, time to progression and survival rate) regarding clinical data. Raw data preprocessing is a crucial step in analysis but is often performed in a manual and error-prone way using different software tools. Thus novel, platform-independent, and possibly open source tools enabling the semi-automatic preprocessing and annotation of different microarray data are needed. The paper presents Micro-Analyzer (Microarray Analyzer), a cross-platform tool for the automatic normalization, summarization and annotation of Affymetrix gene expression and SNP binary data. It represents the evolution of the μ-CS tool, extending the preprocessing to SNP arrays, which were not supported in μ-CS. Micro-Analyzer is provided as a standalone Java tool and enables users to read, preprocess and analyse binary microarray data (gene expression and SNPs) by invoking the TM4 platform. It avoids: (i) the manual invocation of external tools (e.g. the Affymetrix Power

  11. Image preprocessing study on KPCA-based face recognition

    Science.gov (United States)

    Li, Xuan; Li, Dehua

    2015-12-01

    Face recognition, as an important biometric identification method with friendly, natural and convenient advantages, has attracted more and more attention. This paper studies a face recognition system comprising face detection, feature extraction and recognition, reviewing the theory and key technology of the various preprocessing methods used in the detection stage and, using the kernel principal component analysis (KPCA) method, focusing on how different preprocessing choices affect the recognition results. We choose the YCbCr color space for skin segmentation and integral projection for face location. Face images are preprocessed with the morphological opening and closing operations (erosion and dilation) and an illumination compensation method, and are then analysed with KPCA-based recognition; the experiments are carried out on a typical face database, with all algorithms implemented on the MATLAB platform. Experimental results show that the kernel extension of PCA, being a nonlinear feature extraction method, makes the extracted features represent the original image information better under certain conditions and can thus obtain a higher recognition rate. In the image preprocessing stage, different operations on the images can produce different results and hence different recognition rates. At the same time, in kernel principal component analysis, the degree of the polynomial kernel function affects the recognition result.
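
    A brief sketch of the recognition stage, assuming faces have already been detected and preprocessed; the synthetic data and the 1-NN classifier are stand-ins, and the loop varies the polynomial-kernel degree that the paper reports as affecting the recognition rate:

      import numpy as np
      from sklearn.decomposition import KernelPCA
      from sklearn.model_selection import train_test_split
      from sklearn.neighbors import KNeighborsClassifier

      # Stand-in for preprocessed faces: rows are flattened grayscale images.
      rng = np.random.default_rng(1)
      X = rng.random((200, 64 * 64))
      y = rng.integers(0, 10, size=200)
      X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

      # The degree ("power") of the polynomial kernel is the knob reported
      # to influence the final recognition rate.
      for degree in (2, 3, 4):
          kpca = KernelPCA(n_components=50, kernel="poly", degree=degree)
          clf = KNeighborsClassifier(n_neighbors=1)
          clf.fit(kpca.fit_transform(X_tr), y_tr)
          print(degree, clf.score(kpca.transform(X_te), y_te))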

  12. Pre-Processing Rules for Triangulation of Probabilistic Networks

    NARCIS (Netherlands)

    Bodlaender, H.L.; Koster, A.M.C.A.; Eijkhof, F. van den

    2003-01-01

    The currently most efficient algorithm for inference with a probabilistic network builds upon a triangulation of a network’s graph. In this paper, we show that pre-processing can help in finding good triangulations for probabilistic networks, that is, triangulations with a minimal maximum clique size.

  13. Pre-processing for Triangulation of Probabilistic Networks

    NARCIS (Netherlands)

    Bodlaender, H.L.; Koster, A.M.C.A.; Eijkhof, F. van den; Gaag, L.C. van der

    2001-01-01

    The currently most efficient algorithm for inference with a probabilistic network builds upon a triangulation of a network's graph. In this paper, we show that pre-processing can help in finding good triangulations for probabilistic networks, that is, triangulations with a minimal maximum clique size.

  14. The minimal preprocessing pipelines for the Human Connectome Project.

    Science.gov (United States)

    Glasser, Matthew F; Sotiropoulos, Stamatios N; Wilson, J Anthony; Coalson, Timothy S; Fischl, Bruce; Andersson, Jesper L; Xu, Junqian; Jbabdi, Saad; Webster, Matthew; Polimeni, Jonathan R; Van Essen, David C; Jenkinson, Mark

    2013-10-15

    The Human Connectome Project (HCP) faces the challenging task of bringing multiple magnetic resonance imaging (MRI) modalities together in a common automated preprocessing framework across a large cohort of subjects. The MRI data acquired by the HCP differ in many ways from data acquired on conventional 3 Tesla scanners and often require newly developed preprocessing methods. We describe the minimal preprocessing pipelines for structural, functional, and diffusion MRI that were developed by the HCP to accomplish many low level tasks, including spatial artifact/distortion removal, surface generation, cross-modal registration, and alignment to standard space. These pipelines are specially designed to capitalize on the high quality data offered by the HCP. The final standard space makes use of a recently introduced CIFTI file format and the associated grayordinate spatial coordinate system. This allows for combined cortical surface and subcortical volume analyses while reducing the storage and processing requirements for high spatial and temporal resolution data. Here, we provide the minimum image acquisition requirements for the HCP minimal preprocessing pipelines and additional advice for investigators interested in replicating the HCP's acquisition protocols or using these pipelines. Finally, we discuss some potential future improvements to the pipelines.

  15. OPSN: The IMS COMSYS 1 and 2 Data Preprocessing System.

    Science.gov (United States)

    Yu, John

    The Instructional Management System (IMS) developed by the Southwest Regional Laboratory (SWRL) processes student and teacher-generated data through the use of an optical scanner that produces a magnetic tape (Scan Tape) for input to IMS. A series of computer routines, OPSN, preprocesses the Scan Tape and prepares the data for transmission to the…

  16. An effective measured data preprocessing method in electrical impedance tomography.

    Science.gov (United States)

    Yu, Chenglong; Yue, Shihong; Wang, Jianpei; Wang, Huaxiang

    2014-01-01

    As an advanced process detection technology, electrical impedance tomography (EIT) has received wide attention and study in industrial fields, but EIT techniques are greatly limited by low spatial resolution. This problem may result from incorrect preprocessing of the measured data and the lack of a general criterion to evaluate different preprocessing procedures. In this paper, an EIT data preprocessing method based on taking roots of all measured data is proposed and evaluated by two indexes constructed on the rooted measured data. By finding the optima of the two indexes, the proposed method can be applied to improve EIT imaging spatial resolution. For a theoretical model, the optimal rooting exponents for the two indexes range in [0.23, 0.33] and [0.22, 0.35], respectively. The factors that affect the correctness of the proposed method are also analysed in general terms. Preprocessing of the measured data is necessary and helpful for any imaging process, so the proposed method can be used generally and widely. Experimental results validate the two proposed indexes.

  18. Research on pre-processing of QR Code

    Science.gov (United States)

    Sun, Haixing; Xia, Haojie; Dong, Ning

    2013-10-01

    QR codes encode many kinds of information thanks to their advantages: large storage capacity, high reliability, high-speed omnidirectional reading, small printing size and efficient representation of Chinese characters. In order to obtain a clearer binarized image from a complex background and improve the recognition rate of QR codes, this paper researches pre-processing methods for the QR (Quick Response) code and presents algorithms and results of image pre-processing for QR code recognition. The conventional method is improved by adapting Sauvola's adaptive binarization method for text recognition. Additionally, a QR code extraction step that adapts to different image sizes and a flexible image correction approach are introduced, improving the efficiency and accuracy of QR code image processing.
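
    For illustration, a Sauvola-style adaptive binarization using scikit-image; the window size and k are typical defaults, not the paper's tuned values:

      from skimage.filters import threshold_sauvola

      def binarize_qr(gray):
          # gray: 2-D grayscale image array; True marks dark QR modules.
          thresh = threshold_sauvola(gray, window_size=25, k=0.2)
          return gray < thresh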

  19. An Efficient and Configurable Preprocessing Algorithm to Improve Stability Analysis.

    Science.gov (United States)

    Sesia, Ilaria; Cantoni, Elena; Cernigliaro, Alice; Signorile, Giovanna; Fantino, Gianluca; Tavella, Patrizia

    2016-04-01

    The Allan variance (AVAR) is widely used to measure the stability of experimental time series. Specifically, AVAR is commonly used in space applications such as monitoring the clocks of the global navigation satellite systems (GNSSs). In these applications, the experimental data present some peculiar aspects which are not generally encountered when the measurements are carried out in a laboratory. Space clocks' data can in fact present outliers, jumps, and missing values, which corrupt the clock characterization. Therefore, efficient preprocessing is fundamental to ensure proper data analysis and to improve the stability estimation performed with the AVAR or other similar variances. In this work, we propose a preprocessing algorithm and its implementation in a robust software code (in MATLAB language) able to deal with time series of experimental data affected by nonstationarities and missing data; our method properly detects and removes anomalous behaviors, hence making the subsequent stability analysis more reliable.
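
    A minimal sketch of such a preprocessing step on clock phase data, assuming evenly spaced samples; the robust outlier threshold and the non-overlapping AVAR estimator are illustrative simplifications of the MATLAB implementation described in the paper:

      import numpy as np

      def preprocess_phase(x, z=5.0):
          # Flag outliers on the first differences with a robust z-score,
          # then fill the flagged/missing points by linear interpolation.
          x = np.asarray(x, dtype=float).copy()
          d = np.diff(x)
          med = np.median(d)
          mad = max(np.median(np.abs(d - med)), 1e-12)
          bad = np.abs(d - med) / (1.4826 * mad) > z
          x[1:][bad] = np.nan            # samples after anomalous increments
          idx = np.arange(len(x))
          ok = ~np.isnan(x)
          return np.interp(idx, idx[ok], x[ok])

      def allan_variance(x, tau0, m):
          # Non-overlapping Allan variance of phase data at tau = m * tau0.
          y = x[::m]
          d2 = np.diff(y, n=2)           # second differences of phase
          return np.mean(d2 ** 2) / (2.0 * (m * tau0) ** 2)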

  20. Adaptive fingerprint image enhancement with emphasis on preprocessing of data.

    Science.gov (United States)

    Bartůnek, Josef Ström; Nilsson, Mikael; Sällberg, Benny; Claesson, Ingvar

    2013-02-01

    This article proposes several improvements to an adaptive fingerprint enhancement method that is based on contextual filtering. The term adaptive implies that parameters of the method are automatically adjusted based on the input fingerprint image. Five processing blocks comprise the adaptive fingerprint enhancement method, four of which are updated in our proposed system; hence, the proposed overall system is novel. The four updated processing blocks are: 1) preprocessing; 2) global analysis; 3) local analysis; and 4) matched filtering. In the preprocessing and local analysis blocks, a nonlinear dynamic range adjustment method is used. In the global analysis and matched filtering blocks, different forms of order statistical filters are applied. These processing blocks yield an improved and new adaptive fingerprint image processing method. The performance of the updated processing blocks is presented in the evaluation part of this paper. The algorithm is evaluated against the NIST-developed NBIS software for fingerprint recognition on FVC databases.

  1. Linguistic Preprocessing and Tagging for Problem Report Trend Analysis

    Science.gov (United States)

    Beil, Robert J.; Malin, Jane T.

    2012-01-01

    Mr. Robert Beil, Systems Engineer at Kennedy Space Center (KSC), requested the NASA Engineering and Safety Center (NESC) develop a prototype tool suite that combines complementary software technology used at Johnson Space Center (JSC) and KSC for problem report preprocessing and semantic tag extraction, to improve input to data mining and trend analysis. This document contains the outcome of the assessment and the Findings, Observations and NESC Recommendations.

  2. Research on Digital Watermark Using Pre-Processing Technology

    Institute of Scientific and Technical Information of China (English)

    Ru Guo-bao; Niu Hui-fang; Yang Rui; Sun Hong; Shi Hong-ling; Huang Tian-xi

    2003-01-01

    We have realized a watermark embedding system based on audio perceptual masking and put forward a watermark detection system using pre-processing technology. With this method we can detect a watermark in watermarked audio without the original audio. The results indicate that this embedding and detection method is robust: on the premise of not affecting hearing quality, it can resist attacks such as MPEG compression, filtering and the addition of white noise.

  3. Biosignal data preprocessing: a voice pathology detection application

    Directory of Open Access Journals (Sweden)

    Genaro Daza Santacoloma

    2010-05-01

    A methodology for biosignal data preprocessing is presented. Experiments were mainly carried out with voice signals for automatically detecting pathologies. The proposed methodology was structured on three elements: outlier detection, normality verification and distribution transformation. It improved classification performance when basic assumptions about the data structure were met; this entailed more accurate detection of voice pathologies and reduced the computational complexity of the classification algorithms. Classification performance improved by 15%.

  4. Integration of geometric modeling and advanced finite element preprocessing

    Science.gov (United States)

    Shephard, Mark S.; Finnigan, Peter M.

    1987-01-01

    The structure of a geometry-based finite element preprocessing system is presented. The key features of the system are the use of geometric operators to support all geometric calculations required for analysis model generation, and the use of a hierarchic boundary-based data structure for the major data sets within the system. The approach presented can support the finite element modeling procedures used today as well as the fully automated procedures under development.

  5. Review of feed forward neural network classification preprocessing techniques

    Science.gov (United States)

    Asadi, Roya; Kareem, Sameem Abdul

    2014-06-01

    The best feature of artificial-intelligence Feed Forward Neural Network (FFNN) classification models is that they learn the input data through their weights. Data preprocessing and pre-training are contributing factors in developing efficient techniques for low training time and high classification accuracy. In this study, we investigate and review the powerful preprocessing functions of FFNN models. Currently the weights are initialized at random, which is the main source of problems; multilayer auto-encoder networks, the latest technique, are, like other related techniques, unable to solve them. Weight Linear Analysis (WLA) is a combination of data pre-processing and pre-training that generates real weights through the use of normalized input values. Using WLA, the FFNN model increases classification accuracy and improves training time in a single epoch, without any training cycle, gradient of the mean-square-error function, or updating of the weights. The results of comparison and evaluation show that WLA is a powerful technique in the FFNN classification area.

  6. A Survey on Preprocessing Methods for Web Usage Data

    Directory of Open Access Journals (Sweden)

    V.Chitraa

    2010-03-01

    The World Wide Web is a huge repository of web pages and links that provides an abundance of information to Internet users. The growth of the web is tremendous, with approximately one million pages added daily. Users' accesses are recorded in web logs, and because of the tremendous usage of the web, these log files are growing at a fast rate and becoming huge. Web data mining is the application of data mining techniques to web data. Web usage mining applies mining techniques to log data to extract the behavior of users, which is used in various applications like personalized services, adaptive web sites, customer profiling, prefetching and creating attractive web sites. Web usage mining consists of three phases: preprocessing, pattern discovery and pattern analysis. Web log data are usually noisy and ambiguous, and preprocessing is an important process before mining; for discovering patterns, sessions must also be constructed efficiently. This paper reviews existing work done in the preprocessing stage, gives a brief overview of various data mining techniques for discovering patterns and of pattern analysis, and finally presents a glimpse of the various applications of web usage mining.
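
    As a small illustration of the preprocessing phase, the sketch below cleans static-resource requests from log records and builds sessions with the common 30-minute inactivity heuristic; the record format and the timeout are assumptions, not prescriptions from the survey:

      from datetime import timedelta

      SESSION_GAP = timedelta(minutes=30)       # common inactivity heuristic
      STATIC = (".css", ".js", ".png", ".gif", ".ico")

      def sessionize(records):
          # records: (user_id, timestamp, url) tuples sorted by timestamp.
          sessions, last_seen = {}, {}
          for user, ts, url in records:
              if url.lower().endswith(STATIC):  # cleaning: drop resource hits
                  continue
              if user not in sessions or ts - last_seen[user] > SESSION_GAP:
                  sessions.setdefault(user, []).append([])  # new session
              sessions[user][-1].append(url)
              last_seen[user] = ts
          return sessions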

  7. Optimization of miRNA-seq data preprocessing.

    Science.gov (United States)

    Tam, Shirley; Tsao, Ming-Sound; McPherson, John D

    2015-11-01

    The past two decades of microRNA (miRNA) research have solidified the role of these small non-coding RNAs as key regulators of many biological processes and promising biomarkers for disease. The concurrent development in high-throughput profiling technology has further advanced our understanding of the impact of their dysregulation on a global scale. Currently, next-generation sequencing is the platform of choice for the discovery and quantification of miRNAs. Despite this, there is no clear consensus on how the data should be preprocessed before conducting downstream analyses. Often overlooked, data preprocessing is an essential step in data analysis: the presence of unreliable features and noise can affect the conclusions drawn from downstream analyses. Using a spike-in dilution study, we evaluated the effects of several general-purpose aligners (BWA, Bowtie, Bowtie 2 and Novoalign), and normalization methods (counts-per-million, total count scaling, upper quartile scaling, Trimmed Mean of M, DESeq, linear regression, cyclic loess and quantile) with respect to the final miRNA count data distribution, variance, bias and accuracy of differential expression analysis. We make practical recommendations on the optimal preprocessing methods for the extraction and interpretation of miRNA count data from small RNA-sequencing experiments.
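
    For example, counts-per-million, one of the normalization methods compared above, can be sketched as follows for a features-by-samples count matrix; the prior count and log scale are common conventions, not necessarily the exact variant evaluated:

      import numpy as np

      def cpm(counts, prior=0.5):
          # Log2 counts-per-million with a small prior to avoid log(0).
          lib_sizes = counts.sum(axis=0, keepdims=True)
          return np.log2((counts + prior) / (lib_sizes + 1.0) * 1e6)

      # Toy example: 3 miRNAs x 3 libraries of raw read counts.
      raw = np.array([[120., 300., 0.],
                      [15., 40., 8.],
                      [5000., 9000., 7000.]])
      print(cpm(raw).round(2))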

  8. A Stereo Music Preprocessing Scheme for Cochlear Implant Users.

    Science.gov (United States)

    Buyens, Wim; van Dijk, Bas; Wouters, Jan; Moonen, Marc

    2015-10-01

    Listening to music is still one of the more challenging aspects of using a cochlear implant (CI) for most users. Simple musical structures, a clear rhythm/beat, and lyrics that are easy to follow are among the top factors contributing to music appreciation for CI users. Modifying the audio mix of complex music potentially improves music enjoyment in CI users. A stereo music preprocessing scheme is described in which vocals, drums, and bass are emphasized based on the representation of the harmonic and the percussive components in the input spectrogram, combined with the spatial allocation of instruments in typical stereo recordings. The scheme is assessed with postlingually deafened CI subjects (N = 7) using pop/rock music excerpts with different complexity levels. The scheme is capable of modifying relative instrument level settings, with the aim of improving music appreciation in CI users, and allows individual preference adjustments. The assessment with CI subjects confirms the preference for more emphasis on vocals, drums, and bass as offered by the preprocessing scheme, especially for songs with higher complexity. The stereo music preprocessing scheme has the potential to improve music enjoyment in CI users by modifying the audio mix in widespread (stereo) music recordings. Since music enjoyment in CI users is generally poor, this scheme can assist the music listening experience of CI users as a training or rehabilitation tool.
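
    The core idea can be sketched with a harmonic/percussive decomposition plus mid/side processing; librosa's HPSS stands in for the spectrogram decomposition, the file name is a placeholder, and the gains mimic the user-adjustable preference settings rather than the authors' exact values:

      import librosa

      # Load a stereo pop/rock excerpt (the file name is a placeholder).
      y, sr = librosa.load("song.wav", sr=None, mono=False)  # shape (2, n)
      mid = 0.5 * (y[0] + y[1])    # vocals/bass/drums sit mostly in the centre
      side = 0.5 * (y[0] - y[1])   # accompaniment is often panned to the sides

      # Split the centre channel into harmonic (vocals/bass) and
      # percussive (drums) components.
      harmonic, percussive = librosa.effects.hpss(mid)

      # Re-mix with extra weight on the centred components.
      g_harm, g_perc, g_side = 1.5, 1.3, 0.6
      out_mid = g_harm * harmonic + g_perc * percussive
      left, right = out_mid + g_side * side, out_mid - g_side * side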

  9. Comparison of multivariate preprocessing techniques as applied to electronic tongue based pattern classification for black tea

    Energy Technology Data Exchange (ETDEWEB)

    Palit, Mousumi [Department of Electronics and Telecommunication Engineering, Central Calcutta Polytechnic, Kolkata 700014 (India); Tudu, Bipan, E-mail: bt@iee.jusl.ac.in [Department of Instrumentation and Electronics Engineering, Jadavpur University, Kolkata 700098 (India); Bhattacharyya, Nabarun [Centre for Development of Advanced Computing, Kolkata 700091 (India); Dutta, Ankur; Dutta, Pallab Kumar [Department of Instrumentation and Electronics Engineering, Jadavpur University, Kolkata 700098 (India); Jana, Arun [Centre for Development of Advanced Computing, Kolkata 700091 (India); Bandyopadhyay, Rajib [Department of Instrumentation and Electronics Engineering, Jadavpur University, Kolkata 700098 (India); Chatterjee, Anutosh [Department of Electronics and Communication Engineering, Heritage Institute of Technology, Kolkata 700107 (India)

    2010-08-18

    In an electronic tongue, preprocessing of the raw data precedes pattern analysis, and the choice of an appropriate preprocessing technique is crucial for the performance of the pattern classifier. While attempting to classify different grades of black tea using a voltammetric electronic tongue, different preprocessing techniques have been explored and a comparison of their performances is presented in this paper. The preprocessing techniques are compared first by a quantitative measurement of separability followed by principal component analysis, and then two different supervised pattern recognition models based on neural networks are used to evaluate the performance of the preprocessing techniques.

  11. Preprocessing Techniques for High-Efficiency Data Compression in Wireless Multimedia Sensor Networks

    Directory of Open Access Journals (Sweden)

    Junho Park

    2015-01-01

    We propose preprocessing techniques for high-efficiency data compression in wireless multimedia sensor networks, based on an analysis of the characteristics of multimedia data in such networks. The proposed techniques take the characteristics of the sensed multimedia data into account: the first preprocessing stage deletes the low-priority bits that do not affect image quality, and the second stage further processes the remaining high-priority bits. Performing these two preprocessing stages makes it possible to reduce the multimedia data size substantially. To show the superiority of our techniques, we simulated an existing multimedia data compression scheme with and without our preprocessing. The experimental results show that the proposed techniques increase the compression ratio while reducing compression operations compared to the existing compression scheme without preprocessing.
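
    A sketch of the first preprocessing stage, under the assumption that the "low priority bits" are the low-order bit planes of an 8-bit image; the number of dropped planes is illustrative:

      import numpy as np

      def drop_low_bitplanes(img, n_bits=2):
          # Zero the n_bits lowest bit planes of an 8-bit image: this barely
          # affects perceived quality but leaves longer runs of identical
          # values, which compress far better.
          mask = np.uint8((0xFF << n_bits) & 0xFF)
          return img & mask

      img = np.arange(256, dtype=np.uint8).reshape(16, 16)
      print(np.unique(drop_low_bitplanes(img)).size)  # 64 distinct values left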

  12. Preprocessing and parameterizing bioimpedance spectroscopy measurements by singular value decomposition.

    Science.gov (United States)

    Nejadgholi, Isar; Caytak, Herschel; Bolic, Miodrag; Batkin, Izmail; Shirmohammadi, Shervin

    2015-05-01

    In several applications of bioimpedance spectroscopy, the measured spectrum is parameterized by being fitted into the Cole equation. However, the extracted Cole parameters seem to be inconsistent from one measurement session to another, which leads to a high standard deviation of extracted parameters. This inconsistency is modeled with a source of random variations added to the voltage measurement carried out in the time domain. These random variations may originate from biological variations that are irrelevant to the evidence that we are investigating. Yet, they affect the voltage measured by using a bioimpedance device based on which magnitude and phase of impedance are calculated. By means of simulated data, we showed that Cole parameters are highly affected by this type of variation. We further showed that singular value decomposition (SVD) is an effective tool for parameterizing bioimpedance measurements, which results in more consistent parameters than Cole parameters. We propose to apply SVD as a preprocessing method to reconstruct denoised bioimpedance measurements. In order to evaluate the method, we calculated the relative difference between parameters extracted from noisy and clean simulated bioimpedance spectra. Both mean and standard deviation of this relative difference are shown to effectively decrease when Cole parameters are extracted from preprocessed data in comparison to being extracted from raw measurements. We evaluated the performance of the proposed method in distinguishing three arm positions, for a set of experiments including eight subjects. It is shown that Cole parameters of different positions are not distinguishable when extracted from raw measurements. However, one arm position can be distinguished based on SVD scores. Moreover, all three positions are shown to be distinguished by two parameters, R0/R∞ and Fc, when Cole parameters are extracted from preprocessed measurements. These results suggest that SVD could be considered as an
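
    A minimal sketch of SVD-based denoising of repeated measurements, assuming rows are measurement sessions and columns are frequencies; the rank-1 truncation and synthetic spectra are illustrative:

      import numpy as np

      def svd_denoise(Z, k):
          # Keep the top-k singular components; the scores U[:, :k] * s[:k]
          # can also serve directly as SVD-based parameters.
          U, s, Vt = np.linalg.svd(Z, full_matrices=False)
          return (U[:, :k] * s[:k]) @ Vt[:k]

      rng = np.random.default_rng(2)
      clean = np.outer(np.ones(20), np.linspace(1.0, 0.2, 50))  # ideal spectra
      noisy = clean + 0.05 * rng.standard_normal(clean.shape)
      print(np.abs(svd_denoise(noisy, 1) - clean).mean())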

  13. Contour extraction of echocardiographic images based on pre-processing

    Energy Technology Data Exchange (ETDEWEB)

    Hussein, Zinah Rajab; Rahmat, Rahmita Wirza; Abdullah, Lili Nurliyana [Department of Multimedia, Faculty of Computer Science and Information Technology, Department of Computer and Communication Systems Engineering, Faculty of Engineering University Putra Malaysia 43400 Serdang, Selangor (Malaysia); Zamrin, D M [Department of Surgery, Faculty of Medicine, National University of Malaysia, 56000 Cheras, Kuala Lumpur (Malaysia); Saripan, M Iqbal

    2011-02-15

    In this work we present a technique to extract heart contours from noisy echocardiograph images. Our technique is based on improving the image before applying contour detection, to reduce the heavy noise and obtain better image quality. To this end, we combine several pre-processing techniques (filtering, morphological operations, and contrast adjustment) to avoid unclear edges and enhance the low contrast of echocardiograph images; after applying these techniques we obtain a legible detection of heart boundaries and valve movement using traditional edge detection methods.
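
    An illustrative pre-processing chain in that spirit, using OpenCV; kernel sizes, CLAHE settings and Canny thresholds are placeholders, not the authors' tuned values:

      import cv2

      def preprocess_echo(gray):
          # gray: 8-bit echocardiograph frame.
          den = cv2.medianBlur(gray, 5)                  # heavy-noise reduction
          k = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
          opened = cv2.morphologyEx(den, cv2.MORPH_OPEN, k)
          closed = cv2.morphologyEx(opened, cv2.MORPH_CLOSE, k)
          clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
          enhanced = clahe.apply(closed)                 # contrast adjustment
          return cv2.Canny(enhanced, 30, 90)             # boundary candidates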

  14. Preprocessing and Analysis of LC-MS-Based Proteomic Data.

    Science.gov (United States)

    Tsai, Tsung-Heng; Wang, Minkun; Ressom, Habtom W

    2016-01-01

    Liquid chromatography coupled with mass spectrometry (LC-MS) has been widely used for profiling protein expression levels. This chapter is focused on LC-MS data preprocessing, which is a crucial step in the analysis of LC-MS based proteomics. We provide a high-level overview, highlight associated challenges, and present a step-by-step example for analysis of data from LC-MS based untargeted proteomic study. Furthermore, key procedures and relevant issues with the subsequent analysis by multiple reaction monitoring (MRM) are discussed.

  15. Effects of preprocessing Landsat MSS data on derived features

    Science.gov (United States)

    Parris, T. M.; Cicone, R. C.

    1983-01-01

    Important to the use of multitemporal Landsat MSS data for earth resources monitoring, such as agricultural inventories, is the ability to minimize the effects of varying atmospheric and satellite viewing conditions while extracting physically meaningful features from the data. In general, the approaches to the preprocessing problem have been derived from either physical or statistical models. This paper compares three proposed algorithms: XSTAR haze correction, Color Normalization, and Multiple Acquisition Mean Level Adjustment. These techniques represent physical, statistical, and hybrid physical-statistical models, respectively. The comparisons are made in the context of three feature extraction techniques: the Tasseled Cap, the Cate Color Cube, and the Normalized Difference.

  16. Preprocessing of GPR data for syntactic landmine detection and classification

    Science.gov (United States)

    Nasif, Ahmed O.; Hintz, Kenneth J.; Peixoto, Nathalia

    2010-04-01

    Syntactic pattern recognition is being used to detect and classify non-metallic landmines in terms of their range impedance discontinuity profile. This profile, extracted from the ground penetrating radar's return signal, constitutes a high-range-resolution and unique description of the inner structure of a landmine. In this paper, we discuss two preprocessing steps necessary to extract such a profile, namely, inverse filtering (deconvolving) and binarization. We validate the use of an inverse filter to effectively decompose the observed composite signal resulting from the different layers of dielectric materials of a landmine. It is demonstrated that the transmitted radar waveform undergoing multiple reflections with different materials does not change appreciably, and mainly depends on the transmit and receive processing chains of the particular radar being used. Then, a new inversion approach for the inverse filter is presented based on the cumulative contribution of the different frequency components to the original Fourier spectrum. We discuss the tradeoffs and challenges involved in such a filter design. The purpose of the binarization scheme is to localize the impedance discontinuities in range, by assigning a '1' to the peaks of the inverse filtered output, and '0' to all other values. The paper is concluded with simulation results showing the effectiveness of the proposed preprocessing technique.
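
    A rough sketch of the two steps, assuming the transmitted waveform is known; the Wiener-style regularization and the fraction of retained frequency components stand in for the cumulative-contribution inversion criterion discussed in the paper:

      import numpy as np
      from scipy.signal import find_peaks

      def impedance_profile(received, waveform, frac=0.99):
          # Inverse filtering: divide out the transmitted waveform in the
          # frequency domain, keeping only the dominant components.
          n = len(received)
          H = np.fft.rfft(waveform, n)
          power = np.abs(H) ** 2
          thresh = np.sort(power)[int((1.0 - frac) * len(power))]
          Hinv = np.where(power >= thresh, np.conj(H) / (power + 1e-12), 0.0)
          x = np.fft.irfft(np.fft.rfft(received) * Hinv, n)

          # Binarization: '1' at the peaks of the deconvolved profile
          # (impedance discontinuities), '0' elsewhere.
          peaks, _ = find_peaks(np.abs(x), height=0.3 * np.abs(x).max())
          binary = np.zeros(n, dtype=int)
          binary[peaks] = 1
          return binary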

  17. Simple and Effective Way for Data Preprocessing Selection Based on Design of Experiments.

    Science.gov (United States)

    Gerretzen, Jan; Szymańska, Ewa; Jansen, Jeroen J; Bart, Jacob; van Manen, Henk-Jan; van den Heuvel, Edwin R; Buydens, Lutgarde M C

    2015-12-15

    The selection of optimal preprocessing is among the main bottlenecks in chemometric data analysis. Preprocessing currently is a burden, since a multitude of different preprocessing methods is available for, e.g., baseline correction, smoothing, and alignment, but it is not clear beforehand which method(s) should be used for which data set. The process of preprocessing selection is often limited to trial-and-error and is therefore considered somewhat subjective. In this paper, we present a novel, simple, and effective approach for preprocessing selection. The defining feature of this approach is a design of experiments. On the basis of the design, model performance of a few well-chosen preprocessing methods, and combinations thereof (called strategies) is evaluated. Interpretation of the main effects and interactions subsequently enables the selection of an optimal preprocessing strategy. The presented approach is applied to eight different spectroscopic data sets, covering both calibration and classification challenges. We show that the approach is able to select a preprocessing strategy which improves model performance by at least 50% compared to the raw data; in most cases, it leads to a strategy very close to the true optimum. Our approach makes preprocessing selection fast, insightful, and objective.
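
    The flavor of the approach can be sketched with a tiny two-factor, two-level design (the paper's designs cover more steps, such as baseline correction and alignment); the synthetic spectra, PLS settings and SNV/derivative choices are illustrative:

      import itertools
      import numpy as np
      from scipy.signal import savgol_filter
      from sklearn.cross_decomposition import PLSRegression
      from sklearn.model_selection import cross_val_score

      def snv(X):
          # Standard normal variate: scale each spectrum (row) individually.
          return (X - X.mean(axis=1, keepdims=True)) / X.std(axis=1, keepdims=True)

      def sg_deriv(X):
          # Savitzky-Golay first derivative along the wavelength axis.
          return savgol_filter(X, window_length=15, polyorder=2, deriv=1, axis=1)

      rng = np.random.default_rng(3)
      X = rng.random((60, 200))                     # stand-in spectra
      y = X[:, 40] + 0.1 * rng.standard_normal(60)  # stand-in reference values

      # Full two-level factorial design over the two candidate steps.
      for use_snv, use_deriv in itertools.product([False, True], repeat=2):
          Xp = snv(X) if use_snv else X
          Xp = sg_deriv(Xp) if use_deriv else Xp
          rmsecv = -cross_val_score(PLSRegression(n_components=5), Xp, y, cv=5,
                                    scoring="neg_root_mean_squared_error").mean()
          print(f"SNV={use_snv!s:5} deriv={use_deriv!s:5} RMSECV={rmsecv:.4f}")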

  18. The Pre-Processing of Images Technique for the Materia

    Directory of Open Access Journals (Sweden)

    Yevgeniy P. Putyatin

    2016-08-01

    Image processing analysis is one of the most powerful tools in various research fields, especially in material/polymer science. In the present article an attempt has been made to study pre-processing techniques for images of material samples taken with a Scanning Electron Microscope (SEM). We first prepared material samples of coir fibre (natural) and its polymer composite, after which the image analysis was performed by the SEM technique and the said studies were conducted. The results presented here were found satisfactory and are in good agreement with our earlier work and that of other workers in the same field.

  19. A Gender Recognition Approach with an Embedded Preprocessing

    Directory of Open Access Journals (Sweden)

    Md. Mostafijur Rahman

    2015-05-01

    Gender recognition from facial images has become an important topic in the present world; it is one of the main problems of computer vision and research on it is ongoing. Though several techniques have been proposed, most of them focus on facial images taken in controlled situations. The problem arises when classification is performed in uncontrolled conditions such as high rates of noise, lack of illumination, etc. To overcome these problems, we propose a new gender recognition framework that first preprocesses and enhances the input images using Adaptive Gamma Correction with Weighting Distribution. For our experiments we used the Labeled Faces in the Wild (LFW) database, which contains real-life images taken in uncontrolled conditions. For measuring the performance of our proposed method, we used the confusion matrix, precision, recall, F-measure, True Positive Rate (TPR) and False Positive Rate (FPR). In every case, our proposed framework performs superior to other existing state-of-the-art techniques.
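
    A sketch of the enhancement step based on our reading of adaptive gamma correction with a weighting distribution; the weighting exponent alpha and the exact formulation are assumptions for illustration, not the authors' implementation:

      import numpy as np

      def agcwd(img, alpha=0.5):
          # img: 8-bit grayscale image (apply per channel for colour).
          hist = np.bincount(img.ravel(), minlength=256).astype(float)
          pdf = hist / hist.sum()
          pdf_w = pdf.max() * ((pdf - pdf.min()) /
                               (pdf.max() - pdf.min() + 1e-12)) ** alpha
          cdf_w = np.cumsum(pdf_w) / (pdf_w.sum() + 1e-12)
          gamma = 1.0 - cdf_w                   # per-intensity gamma values
          levels = np.arange(256) / 255.0
          lut = np.round(255.0 * levels ** gamma).astype(np.uint8)
          return lut[img]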

  20. Constant-overhead secure computation of Boolean circuits using preprocessing

    DEFF Research Database (Denmark)

    Damgård, Ivan Bjerre; Zakarias, S.

    2013-01-01

    We present a protocol for securely computing a Boolean circuit C in the presence of a dishonest and malicious majority. The protocol is unconditionally secure, assuming a preprocessing functionality that is not given the inputs. For a large number of players the work for each player is the same as computing the circuit in the clear, up to a constant factor. Our protocol is the first to obtain these properties for Boolean circuits. On the technical side, we develop new homomorphic authentication schemes based on asymptotically good codes with an additional multiplication property. We also show a new algorithm for verifying the product of Boolean matrices in quadratic time with exponentially small error probability, where previous methods only achieved constant error.

  1. Pre-Processing and Modeling Tools for Bigdata

    Directory of Open Access Journals (Sweden)

    Hashem Hadi

    2016-09-01

    Modeling tools and operators help the user/developer to identify the relevant processing field at the top of the sequence and to send into the computing module only the data related to the requested result; the remaining data are not relevant and would only slow down the processing. The biggest challenge nowadays is to obtain high-quality processing results with reduced computing time and cost. To do so, the processing sequence must be reviewed and several modeling tools added. Existing processing models do not take this aspect into consideration and focus on achieving high calculation performance, which increases computing time and costs. In this paper we provide a study of the main modeling tools for BigData and a new model based on pre-processing.

  2. Constant-Overhead Secure Computation of Boolean Circuits using Preprocessing

    DEFF Research Database (Denmark)

    Damgård, Ivan Bjerre; Zakarias, Sarah Nouhad Haddad

    We present a protocol for securely computing a Boolean circuit $C$ in the presence of a dishonest and malicious majority. The protocol is unconditionally secure, assuming access to a preprocessing functionality that is not given the inputs to compute on. For a large number of players the work done by each player is the same as the work needed to compute the circuit in the clear, up to a constant factor. Our protocol is the first to obtain these properties for Boolean circuits. On the technical side, we develop new homomorphic authentication schemes based on asymptotically good codes with an additional multiplication property. We also show a new algorithm for verifying the product of Boolean matrices in quadratic time with exponentially small error probability, where previous methods would only give a constant error.

  3. Pre-processing in AI based Prediction of QSARs

    CERN Document Server

    Patri, Om Prasad

    2009-01-01

    Machine learning, data mining and artificial intelligence (AI) based methods have been used to determine the relations between chemical structure and biological activity, called quantitative structure activity relationships (QSARs) for the compounds. Pre-processing of the dataset, which includes the mapping from a large number of molecular descriptors in the original high dimensional space to a small number of components in the lower dimensional space while retaining the features of the original data, is the first step in this process. A common practice is to use a mapping method for a dataset without prior analysis. This pre-analysis has been stressed in our work by applying it to two important classes of QSAR prediction problems: drug design (predicting anti-HIV-1 activity) and predictive toxicology (estimating hepatocarcinogenicity of chemicals). We apply one linear and two nonlinear mapping methods on each of the datasets. Based on this analysis, we conclude the nature of the inherent relationships betwee...

  4. Digital soil mapping: strategy for data pre-processing

    Directory of Open Access Journals (Sweden)

    Alexandre ten Caten

    2012-08-01

    The region of greatest variability on soil maps is along the edges of their polygons, causing disagreement among pedologists about the appropriate description of soil classes at these locations. The objective of this work was to propose a strategy for data pre-processing applied to digital soil mapping (DSM). Soil polygons on a training map were shrunk by 100 m and 160 m. This strategy prevented covariates located near the edges of the soil classes from being used by the Decision Tree (DT) models. Three DT models derived from eight predictive covariates related to relief and organism factors, sampled on the original polygons of a soil map and on polygons shrunk by 100 m and 160 m, were used to predict soil classes. The DT model derived from observations 160 m away from the edges of the polygons on the original map is less complex and has a better predictive performance.
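
    The polygon-shrinking step can be sketched with shapely's negative buffer; the geometry is purely illustrative while the distances follow the 100 m and 160 m values used in the paper:

      from shapely.geometry import Polygon

      # A soil-map polygon (coordinates in metres, purely illustrative).
      soil_unit = Polygon([(0, 0), (1000, 0), (1000, 800), (0, 800)])

      # Negative buffers shrink the polygon so that training observations
      # are never sampled near the uncertain class edges.
      inner_100 = soil_unit.buffer(-100)
      inner_160 = soil_unit.buffer(-160)
      print(soil_unit.area, inner_100.area, inner_160.area)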

  5. Real-Time Rendering of Teeth with No Preprocessing

    DEFF Research Database (Denmark)

    Larsen, Christian Thode; Frisvad, Jeppe Revall; Jensen, Peter Dahl Ejby

    2012-01-01

    We present a technique for real-time rendering of teeth with no need for computational or artistic preprocessing. Teeth constitute a translucent material consisting of several layers: a highly scattering material (dentine) beneath a semitransparent layer (enamel) with a transparent coating (saliva). In this study we examine how light interacts with this multilayered structure. In the past, rendering of teeth has mostly been done using image-based texturing or volumetric scans. We work with surface scans and have therefore developed a simple way of estimating layer thicknesses. We use scattering properties based on measurements reported in the optics literature, and we compare rendered results qualitatively to images of ceramic teeth created by denturists.

  6. Sparse and Unique Nonnegative Matrix Factorization Through Data Preprocessing

    CERN Document Server

    Gillis, Nicolas

    2012-01-01

    Nonnegative matrix factorization (NMF) has become a very popular technique in machine learning because it automatically extracts meaningful features through a sparse and part-based representation. However, NMF has the drawback of being highly ill-posed, that is, there typically exist many different but equivalent factorizations. In this paper, we introduce a completely new way of obtaining more well-posed NMF problems whose solutions are sparser. Our technique is based on preprocessing the nonnegative input data matrix and relies on the theory of M-matrices and the geometric interpretation of NMF. This approach provably leads to optimal and sparse solutions under the separability assumption of Donoho and Stodden (NIPS, 2003) and, for rank-three matrices, makes the number of exact factorizations finite. We illustrate the effectiveness of our technique on several image datasets.

  7. Statistics in experimental design, preprocessing, and analysis of proteomics data.

    Science.gov (United States)

    Jung, Klaus

    2011-01-01

    High-throughput experiments in proteomics, such as 2-dimensional gel electrophoresis (2-DE) and mass spectrometry (MS), usually yield high-dimensional data sets of expression values for hundreds or thousands of proteins which are, however, observed on only a relatively small number of biological samples. Statistical methods for the planning and analysis of experiments are important to avoid false conclusions and to obtain tenable results. In this chapter, the most frequent experimental designs for proteomics experiments are illustrated. In particular, the focus is put on studies for the detection of differentially regulated proteins. Furthermore, issues of sample size planning, statistical analysis of expression levels as well as methods for data preprocessing are covered.

  8. Preprocessing in a Tiered Sensor Network for Habitat Monitoring

    Directory of Open Access Journals (Sweden)

    Hanbiao Wang

    2003-03-01

    We investigate task decomposition and collaboration in a two-tiered sensor network for habitat monitoring. The system recognizes and localizes a specified type of birdcall. The system has a few powerful macronodes in the first tier and many less powerful micronodes in the second tier. Each macronode combines data collected by multiple micronodes for target classification and localization. We describe two types of lightweight preprocessing which significantly reduce data transmission from micronodes to macronodes. Micronodes classify events according to their cross-zero (zero-crossing) rates and discard irrelevant events. Data about events of interest are reduced and compressed before being transmitted to macronodes for target localization. Preliminary experiments illustrate the effectiveness of event filtering and data reduction at micronodes.
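
    A sketch of the micronode-side event filter, interpreting the cross-zero rate as the usual zero-crossing rate; the acceptance band is a placeholder to be calibrated for the target birdcall:

      import numpy as np

      def zero_crossing_rate(frame):
          # Fraction of successive samples whose sign changes.
          return np.mean(np.abs(np.diff(np.sign(frame))) > 0)

      def is_candidate_event(frame, lo=0.05, hi=0.35):
          # Keep a frame only if its rate falls in a band typical of the
          # target call; frames outside the band are discarded locally.
          return lo < zero_crossing_rate(frame) < hi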

  9. Data acquisition and preprocessing techniques for remote sensing field research

    Science.gov (United States)

    Biehl, L. L.; Robinson, B. F.

    1983-01-01

    A crops and soils data base has been developed at Purdue University's Laboratory for Applications of Remote Sensing using spectral and agronomic measurements made by several government and university researchers. The data are being used to (1) quantitatively determine the relationships of spectral and agronomic characteristics of crops and soils, (2) define future sensor systems, and (3) develop advanced data analysis techniques. Researchers follow defined data acquisition and preprocessing techniques to provide fully annotated and calibrated sets of spectral, agronomic, and meteorological data. These procedures enable the researcher to combine his data with that acquired by other researchers for remote sensing research. The key elements or requirements for developing a field research data base of spectral data that can be transported across sites and years are appropriate experiment design, accurate spectral data calibration, defined field procedures, and thorough experiment documentation.

  10. Radar image preprocessing. [of SEASAT-A SAR data

    Science.gov (United States)

    Frost, V. S.; Stiles, J. A.; Holtzman, J. C.; Held, D. N.

    1980-01-01

    Standard image processing techniques are not applicable to radar images because of the coherent nature of the sensor, so there is a need to develop preprocessing techniques for radar images which will then allow these standard methods to be applied. A random field model for radar image data is developed that describes the image data as the result of a multiplicative-convolved process. Standard techniques, namely those based on additive noise and homomorphic processing, are not directly applicable to this class of sensor data. Therefore, a minimum mean square error (MMSE) filter was designed to treat this class of sensor data. The resulting filter was implemented in an adaptive format to account for changes in local statistics and edges. The result of this study is a radar image processing technique which provides the MMSE estimate inside homogeneous areas and tends to preserve edge structure. Digitally correlated Seasat-A synthetic aperture radar (SAR) imagery was used to test the technique.
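
    The adaptive MMSE idea can be sketched as a Lee-style local filter for multiplicative speckle; the window size and the assumed number of looks are illustrative, and this is a simplification of the paper's multiplicative-convolved model:

      import numpy as np
      from scipy.ndimage import uniform_filter

      def mmse_speckle_filter(img, win=7, looks=4.0):
          # Local statistics over a sliding window (img: float array).
          mean = uniform_filter(img, win)
          sq_mean = uniform_filter(img * img, win)
          var = np.maximum(sq_mean - mean ** 2, 0.0)
          # Multiplicative speckle strength, sigma_n^2 ~ 1/looks.
          noise_var = (mean ** 2) / looks
          # Gain -> 0 in homogeneous areas (output = local mean) and
          # gain -> 1 near edges (output keeps the original pixel).
          gain = np.clip((var - noise_var) / np.maximum(var, 1e-12), 0.0, 1.0)
          return mean + gain * (img - mean)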

  11. Multiple Criteria Decision-Making Preprocessing Using Data Mining Tools

    CERN Document Server

    Mosavi, A

    2010-01-01

    Real-life engineering optimization problems need Multiobjective Optimization (MOO) tools, and these problems are highly nonlinear. As the process of Multiple Criteria Decision-Making (MCDM) has expanded greatly, most MOO problems in different disciplines can be classified on the basis of it, and MCDM methods have gained wide popularity in different sciences and applications. Meanwhile, the increasing number of components, variables, parameters, constraints and objectives involved has made the process very complicated. Although the new generation of MOO tools has made the optimization process more automated, initializing the process, setting the initial values of the simulation tools, and identifying the effective input variables and objectives in order to reach a smaller design space are still complicated. In this situation, adding a preprocessing step into the MCDM procedure could make a huge difference in terms of organizing the input variables according to their effects on the optimizati...

  12. Preprocessing Solar Images while Preserving their Latent Structure

    CERN Document Server

    Stein, Nathan M; Kashyap, Vinay L

    2015-01-01

    Telescopes such as the Atmospheric Imaging Assembly aboard the Solar Dynamics Observatory, a NASA satellite, collect massive streams of high resolution images of the Sun through multiple wavelength filters. Reconstructing pixel-by-pixel thermal properties based on these images can be framed as an ill-posed inverse problem with Poisson noise, but this reconstruction is computationally expensive and there is disagreement among researchers about what regularization or prior assumptions are most appropriate. This article presents an image segmentation framework for preprocessing such images in order to reduce the data volume while preserving as much thermal information as possible for later downstream analyses. The resulting segmented images reflect thermal properties but do not depend on solving the ill-posed inverse problem. This allows users to avoid the Poisson inverse problem altogether or to tackle it on each of $\\sim$10 segments rather than on each of $\\sim$10$^7$ pixels, reducing computing time by a facto...

  13. Prediction of speech intelligibility based on an auditory preprocessing model

    DEFF Research Database (Denmark)

    Christiansen, Claus Forup Corlin; Pedersen, Michael Syskind; Dau, Torsten

    2010-01-01

    Classical speech intelligibility models, such as the speech transmission index (STI) and the speech intelligibility index (SII), are based on calculations on the physical acoustic signals. The present study predicts speech intelligibility by combining a psychoacoustically validated model of auditory preprocessing [Dau et al., 1997. J. Acoust. Soc. Am. 102, 2892-2905] with a simple central stage that describes the similarity of the test signal with the corresponding reference signal at the level of the internal representation of the signals. The model was compared with previous approaches, whereby a speech-in-noise experiment was used for training and an ideal binary mask experiment was used for evaluation. All three models were able to capture the trends in the speech-in-noise training data well, but the proposed model provides a better prediction of the binary mask test data, particularly when the binary

  14. Performance of Pre-processing Schemes with Imperfect Channel State Information

    DEFF Research Database (Denmark)

    Christensen, Søren Skovgaard; Kyritsi, Persa; De Carvalho, Elisabeth

    2006-01-01

    Pre-processing techniques have several benefits when the CSI is perfect. In this work we investigate three linear pre-processing filters, assuming imperfect CSI caused by noise degradation and channel temporal variation. Results indicate that the LMMSE filter achieves the lowest BER and the high

  15. A New Indicator for Optimal Preprocessing and Wavelengths Selection of Near-Infrared Spectra

    NARCIS (Netherlands)

    Skibsted, E.; Boelens, H.F.M.; Westerhuis, J.A.; Witte, D.T.; Smilde, A.K.

    2004-01-01

    Preprocessing of near-infrared spectra to remove unwanted, i.e., non-related spectral variation and selection of informative wavelengths is considered to be a crucial step prior to the construction of a quantitative calibration model. The standard methodology when comparing various preprocessing

  17. Ensemble preprocessing of near-infrared (NIR) spectra for multivariate calibration.

    Science.gov (United States)

    Xu, Lu; Zhou, Yan-Ping; Tang, Li-Juan; Wu, Hai-Long; Jiang, Jian-Hui; Shen, Guo-Li; Yu, Ru-Qin

    2008-06-01

    Preprocessing of raw near-infrared (NIR) spectral data is indispensable in multivariate calibration when the measured spectra are subject to significant noise, baselines and other undesirable factors. However, due to the lack of sufficient prior information and an incomplete knowledge of the raw data, NIR spectra preprocessing in multivariate calibration is still trial and error. How to select a proper method depends largely on both the nature of the data and the expertise and experience of the practitioners. This might limit the applications of multivariate calibration in many fields, where researchers are not very familiar with the characteristics of many preprocessing methods unique in chemometrics and have difficulties selecting the most suitable methods. Another problem is that many preprocessing methods, when used alone, might degrade the data in certain aspects or lose some useful information while improving certain qualities of the data. In order to tackle these problems, this paper proposes a new concept of data preprocessing, the ensemble preprocessing method, in which partial least squares (PLS) models built on differently preprocessed data are combined by Monte Carlo cross validation (MCCV) stacked regression. Little or no prior information of the data and expertise are required. Moreover, fusion of complementary information obtained by different preprocessing methods often leads to a more stable and accurate calibration model. The investigation of two real data sets has demonstrated the advantages of the proposed method.
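
    A condensed sketch of the idea: build PLS models on differently preprocessed copies of the data and learn combination weights from cross-validated predictions. Plain K-fold stacking via least squares stands in for the paper's MCCV stacked regression, and the synthetic spectra and preprocessing choices are illustrative:

      import numpy as np
      from scipy.signal import savgol_filter
      from sklearn.cross_decomposition import PLSRegression
      from sklearn.linear_model import LinearRegression
      from sklearn.model_selection import cross_val_predict

      def snv(X):
          return (X - X.mean(axis=1, keepdims=True)) / X.std(axis=1, keepdims=True)

      preprocs = {
          "raw": lambda X: X,
          "snv": snv,
          "sg1": lambda X: savgol_filter(X, 15, 2, deriv=1, axis=1),
      }

      rng = np.random.default_rng(4)
      X = rng.random((80, 150))                           # stand-in spectra
      y = X[:, 30] - X[:, 90] + 0.05 * rng.standard_normal(80)

      # Cross-validated predictions of each single-preprocessing PLS model
      # become the inputs of a stacked least-squares combiner.
      Z = np.column_stack([cross_val_predict(PLSRegression(n_components=5),
                                             f(X), y, cv=5)
                           for f in preprocs.values()])
      stack = LinearRegression().fit(Z, y)
      print(dict(zip(preprocs, stack.coef_.round(3))))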

  19. Automatic selection of preprocessing methods for improving predictions on mass spectrometry protein profiles.

    Science.gov (United States)

    Pelikan, Richard C; Hauskrecht, Milos

    2010-11-13

    Mass spectrometry proteomic profiling has potential to be a useful clinical screening tool. One obstacle is providing a standardized method for preprocessing the noisy raw data. We have developed a system for automatically determining a set of preprocessing methods among several candidates. Our system's automated nature relieves the analyst of the need to be knowledgeable about which methods to use on any given dataset. Each stage of preprocessing is approached with many competing methods. We introduce metrics which are used to balance each method's attempts to correct noise versus preserving valuable discriminative information. We demonstrate the benefit of our preprocessing system on several SELDI and MALDI mass spectrometry datasets. Downstream classification is improved when using our system to preprocess the data.

  20. ASAP: an environment for automated preprocessing of sequencing data

    Directory of Open Access Journals (Sweden)

    Torstenson Eric S

    2013-01-01

    Background: Next-generation sequencing (NGS) has yielded an unprecedented amount of data for genetics research. It is a daunting task to process the data from raw sequence reads to variant calls, and manually processing these data can significantly delay downstream analysis and increase the possibility of human error. The research community has produced tools to properly prepare sequence data for analysis and established guidelines on how to apply those tools to achieve the best results; however, existing pipeline programs that automate the process in its entirety are either inaccessible to investigators, or web-based and require a certain amount of administrative expertise to set up. Findings: Advanced Sequence Automated Pipeline (ASAP) was developed to provide a framework for automating the translation of sequencing data into annotated variant calls with the goal of minimizing user involvement, without the need for dedicated hardware or administrative rights. ASAP works both on computer clusters and on standalone machines with minimal human involvement and maintains high data integrity, while allowing complete control over the configuration of its component programs. It offers an easy-to-use interface for submitting and tracking jobs as well as resuming failed jobs, and provides tools for quality checking and for dividing jobs into pieces for maximum throughput. Conclusions: ASAP provides a flexible environment for building an automated pipeline for NGS data preprocessing, open to future development. It is freely available at http://biostat.mc.vanderbilt.edu/ASAP.

  1. Breast image pre-processing for mammographic tissue segmentation.

    Science.gov (United States)

    He, Wenda; Hogg, Peter; Juette, Arne; Denton, Erika R E; Zwiggelaar, Reyer

    2015-12-01

    During mammographic image acquisition, a compression paddle is used to even the breast thickness in order to obtain optimal image quality. Clinical observation has indicated that some mammograms may exhibit abrupt intensity change and low visibility of tissue structures in the breast peripheral areas. Such appearance discrepancies can affect image interpretation and may not be desirable for computer aided mammography, leading to incorrect diagnosis and/or detection which can have a negative impact on sensitivity and specificity of screening mammography. This paper describes a novel mammographic image pre-processing method to improve image quality for analysis. An image selection process is incorporated to better target problematic images. The processed images show improved mammographic appearances not only in the breast periphery but also across the mammograms. Mammographic segmentation and risk/density classification were performed to facilitate a quantitative and qualitative evaluation. When using the processed images, the results indicated more anatomically correct segmentation in tissue specific areas, and subsequently better classification accuracies were achieved. Visual assessments were conducted in a clinical environment to determine the quality of the processed images and the resultant segmentation. The developed method has shown promising results. It is expected to be useful in early breast cancer detection, risk-stratified screening, and aiding radiologists in the process of decision making prior to surgery and/or treatment.

  2. Adaptive preprocessing algorithms of corneal topography in polar coordinate system

    Institute of Scientific and Technical Information of China (English)

    郭雁文

    2014-01-01

    New adaptive preprocessing algorithms based on the polar coordinate system were put forward to obtain high-precision corneal topography calculation results. Adaptive algorithms for locating the concentric circle center were created to accurately capture the circle center of the original Placido-based image, expand the image into a matrix centered on the circle center, and convert the matrix into the polar coordinate system with the circle center as the pole. Adaptive image smoothing followed, and the characteristics of the useful circles were extracted via horizontal edge detection, exploiting the fact that useful circles appear as approximately horizontal lines while noise signals appear as vertical lines or lines at other angles. Effective combinations of different morphological operators were designed to remedy data loss caused by noise disturbances and to obtain complete circle edges that satisfy the precision requirements of the follow-up parameter calculations. The experimental data show that the algorithms meet the requirements of practical detection, with less data loss, higher data accuracy and easier implementation.
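
    The polar re-mapping step described above is easy to sketch: once the ring center has been estimated, the image is resampled around it so that the concentric Placido rings become near-horizontal lines, which is what makes horizontal edge detection effective. A minimal sketch, with the center assumed to be already located:

    ```python
    # Resample a Placido image into polar coordinates around a known center.
    import numpy as np
    from scipy.ndimage import map_coordinates

    def to_polar(image, center, n_radii=256, n_angles=360):
        cy, cx = center
        radii = np.linspace(0, min(image.shape) / 2, n_radii)
        angles = np.linspace(0, 2 * np.pi, n_angles, endpoint=False)
        r, a = np.meshgrid(radii, angles, indexing="ij")
        # Rows of the output are radii, columns are angles, so each ring
        # becomes a (near-)horizontal line in the polar image.
        coords = np.stack([cy + r * np.sin(a), cx + r * np.cos(a)])
        return map_coordinates(image, coords, order=1, mode="nearest")
    ```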

  3. Multimodal image fusion with SIMS: Preprocessing with image registration.

    Science.gov (United States)

    Tarolli, Jay Gage; Bloom, Anna; Winograd, Nicholas

    2016-06-14

    In order to utilize complementary imaging techniques to supply higher resolution data for fusion with secondary ion mass spectrometry (SIMS) chemical images, there are a number of aspects that, if not given proper consideration, could produce results which are easy to misinterpret. One of the most critical aspects is that the two input images must be of the exact same analysis area. With the desire to explore new higher resolution data sources that exist outside of the mass spectrometer, this requirement becomes even more important. To ensure that two input images are of the same region, an implementation of the insight segmentation and registration toolkit (ITK) was developed to act as a preprocessing step before performing image fusion. This implementation of ITK allows several degrees of movement between two input images to be accounted for, including translation, rotation, and scale transforms. First, the implementation was confirmed to accurately register two multimodal images by supplying a known transform. Once validated, two model systems, a copper mesh grid and a group of RAW 264.7 cells, were used to demonstrate the use of the ITK implementation to register a SIMS image with a microscopy image for the purpose of performing image fusion.
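
    The registration in the record is done with ITK and handles translation, rotation and scale; as a simplified stand-in for the translation-only case, the sketch below estimates and applies a sub-pixel shift with scikit-image's phase correlation. The function choice is an assumption of this sketch, not the paper's implementation.

    ```python
    # Translation-only registration before fusion: estimate the shift that
    # aligns `moving` onto `reference`, then apply it.
    from scipy.ndimage import shift as nd_shift
    from skimage.registration import phase_cross_correlation

    def register_translation(reference, moving):
        offset, _error, _ = phase_cross_correlation(reference, moving,
                                                    upsample_factor=10)
        return nd_shift(moving, offset)  # resample moving image onto reference
    ```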

  4. Software for Preprocessing Data from Rocket-Engine Tests

    Science.gov (United States)

    Cheng, Chiu-Fu

    2004-01-01

    Three computer programs have been written to preprocess digitized outputs of sensors during rocket-engine tests at Stennis Space Center (SSC). The programs apply exclusively to the SSC E test-stand complex and utilize the SSC file format. The programs are the following: Engineering Units Generator (EUGEN) converts sensor-output-measurement data to engineering units. The inputs to EUGEN are raw binary test-data files, which include the voltage data, a list identifying the data channels, and time codes. EUGEN effects conversion by use of a file that contains calibration coefficients for each channel. QUICKLOOK enables immediate viewing of a few selected channels of data, in contradistinction to viewing only after post-test processing (which can take 30 minutes to several hours depending on the number of channels and other test parameters) of data from all channels. QUICKLOOK converts the selected data into a form in which they can be plotted in engineering units by use of Winplot (a free graphing program written by Rick Paris). EUPLOT provides a quick means for looking at data files generated by EUGEN without the necessity of relying on the PV-WAVE based plotting software.
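
    A hedged sketch of the EUGEN-style conversion step: per-channel calibration coefficients map raw sensor voltages to engineering units. The polynomial form and the coefficient values are illustrative assumptions; the actual SSC calibration file format is not described in this record.

    ```python
    # Convert raw voltages to engineering units via per-channel polynomials.
    import numpy as np

    CAL = {  # channel -> coefficients, highest order first (hypothetical values)
        "chamber_pressure": [0.0, 250.0, -3.2],
        "lox_temperature": [0.0, 80.5, 12.1],
    }

    def to_engineering_units(channel, volts):
        return np.polyval(CAL[channel], np.asarray(volts, dtype=float))

    # Example: convert a few raw voltage samples for one channel.
    eu = to_engineering_units("chamber_pressure", [0.10, 0.25, 0.50])
    ```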

  5. Nonlinear preprocessing method for detecting peaks from gas chromatograms

    Directory of Open Access Journals (Sweden)

    Min Hyeyoung

    2009-11-01

    Full Text Available Abstract Background The problem of locating valid peaks in data corrupted by noise frequently arises while analyzing experimental data. In various biological and chemical data analysis tasks, peak detection thus constitutes a critical preprocessing step that greatly affects downstream analysis and the eventual quality of experiments. Many existing techniques require the users to adjust parameters by trial and error, which is error-prone, time-consuming and often leads to incorrect analysis results. Worse, conventional approaches tend to report an excessive number of false alarms by finding fictitious peaks generated by mere noise. Results We have designed a novel peak detection method that can significantly reduce parameter sensitivity while providing excellent peak detection performance and negligible false alarm rates on gas chromatographic data. The key feature of our new algorithm is the successive use of peak enhancement algorithms that are deliberately designed for a gradual improvement of peak detection quality. We tested our approach with real gas chromatograms as well as intentionally contaminated spectra that contain Gaussian or speckle-type noise. Conclusion Our results demonstrate that the proposed method can achieve near-perfect peak detection performance while maintaining very small false alarm probabilities in the case of gas chromatograms. Given that biological signals appear in the form of peaks in various experimental data and that the proposed method can easily be extended to such data, our approach will be a useful and robust tool that can help researchers highlight valid signals in their noisy measurements.
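
    The staged "enhance, then detect" idea is straightforward to illustrate. The paper's specific enhancement operators are not reproduced here, so this sketch substitutes a Savitzky-Golay smoothing stage followed by a prominence test:

    ```python
    # Smooth first to suppress noise, then keep only sufficiently prominent
    # peaks relative to a rough noise estimate.
    import numpy as np
    from scipy.signal import find_peaks, savgol_filter

    def detect_peaks(chromatogram, window=11, polyorder=3, snr=3.0):
        smoothed = savgol_filter(chromatogram, window, polyorder)
        noise = np.std(chromatogram - smoothed)  # residual as noise proxy
        peaks, props = find_peaks(smoothed, prominence=snr * noise)
        return peaks, props["prominences"]
    ```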

  6. Visualisation and pre-processing of peptide microarray data.

    Science.gov (United States)

    Reilly, Marie; Valentini, Davide

    2009-01-01

    The data files produced by digitising peptide microarray images contain detailed information on the location, feature, response parameters and quality of each spot on each array. In this chapter, we will describe how such peptide microarray data can be read into the R statistical package and pre-processed in preparation for subsequent comparative or predictive analysis. We illustrate how the information in the data can be visualised using images and graphical displays that highlight the main features, enabling the quality of the data to be assessed and invalid data points to be identified and excluded. The log-ratio of the foreground to background signal is used as a response index. Negative control responses serve as a reference against which "detectable" responses can be defined, and slides incubated with only buffer and secondary antibody help identify false-positive responses from peptides. For peptides that have a detectable response on at least one subarray, and no false-positive response, we use linear mixed models to remove artefacts due to the arrays and their architecture. The resulting normalized responses provide the input data for further analysis.
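
    The chapter itself works in R; below is a language-neutral sketch of the response index and detectability rule it describes (the k-sigma form of the cutoff is an assumption):

    ```python
    # Log-ratio response index and a negative-control-based detectability call.
    import numpy as np

    def response_index(foreground, background):
        return np.log2(foreground / background)

    def detectable(responses, negative_controls, k=2.0):
        # Detectable = exceeds the negative-control mean by k standard deviations.
        cutoff = negative_controls.mean() + k * negative_controls.std()
        return responses > cutoff
    ```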

  7. Data Preprocessing in Data Mining

    Institute of Scientific and Technical Information of China (English)

    刘明吉; 王秀峰; 黄亚楼

    2000-01-01

    Data Mining (DM) is a hot new research area in the database field. Because real-world data are not ideal, it is necessary to do some data preprocessing to meet the requirements of DM algorithms. In this paper, we discuss the procedure of data preprocessing and present the work of data preprocessing in detail. We also discuss the methods and technologies used in data preprocessing.

  8. Automated Pre-processing for NMR Assignments with Reduced Tedium

    Energy Technology Data Exchange (ETDEWEB)

    2004-05-11

    An important rate-limiting step in the resonance assignment process is accurate identification of resonance peaks in NMR spectra. NMR spectra are noisy. Hence, automatic peak-picking programs must navigate between the Scylla of reliable but incomplete picking, and the Charybdis of noisy but complete picking. Each of these extremes complicates the assignment process: incomplete peak-picking results in the loss of essential connectivities, while noisy picking conceals the true connectivities under a combinatorial explosion of false positives. Intermediate processing can simplify the assignment process by preferentially removing false peaks from noisy peak lists. This is accomplished by requiring consensus between multiple NMR experiments, exploiting a priori information about NMR spectra, and drawing on empirical statistical distributions of chemical shifts extracted from the BioMagResBank. Experienced NMR practitioners currently apply many of these techniques "by hand", which is tedious, and may appear arbitrary to the novice. To increase efficiency, we have created a systematic and automated approach to this process, known as APART. Automated pre-processing has three main advantages: reduced tedium, standardization, and pedagogy. In the hands of experienced spectroscopists, the main advantage is reduced tedium (a rapid increase in the ratio of true peaks to false peaks with minimal effort). When a project is passed from hand to hand, the main advantage is standardization. APART automatically documents the peak filtering process by archiving its original recommendations, the accompanying justifications, and whether a user accepted or overrode a given filtering recommendation. In the hands of a novice, this tool can reduce the stumbling block of learning to differentiate between real peaks and noise, by providing real-time examples of how such decisions are made.
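
    Of the filters named above, the consensus requirement is the simplest to sketch: a pick survives only if it is corroborated, within a chemical-shift tolerance, by peak lists from other experiments. The tolerance, support threshold and flat input format are assumptions:

    ```python
    # Keep peaks (chemical shifts, ppm) confirmed by at least `min_support` lists.
    import numpy as np

    def consensus_peaks(peak_lists, tol=0.02, min_support=2):
        kept = []
        for i, peaks in enumerate(peak_lists):
            for p in peaks:
                support = 1 + sum(  # the pick itself counts once
                    bool(np.any(np.abs(np.asarray(other) - p) <= tol))
                    for j, other in enumerate(peak_lists) if j != i
                )
                if support >= min_support:
                    kept.append(p)
        return sorted(set(kept))
    ```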

  9. Spatial-spectral preprocessing for endmember extraction on GPU's

    Science.gov (United States)

    Jimenez, Luis I.; Plaza, Javier; Plaza, Antonio; Li, Jun

    2016-10-01

    Spectral unmixing is focused on the identification of spectrally pure signatures, called endmembers, and their corresponding abundances in each pixel of a hyperspectral image. Although mainly focused on the spectral information contained in hyperspectral images, endmember extraction techniques have recently included spatial information to achieve more accurate results. Several algorithms have been developed for automatic or semi-automatic identification of endmembers using spatial and spectral information, including spectral-spatial endmember extraction (SSEE) where, within a preprocessing step in the technique, both sources of information are extracted from the hyperspectral image and equally used for this purpose. Previous works have implemented the SSEE technique in four main steps: 1) local eigenvector calculation in each sub-region into which the original hyperspectral image is divided; 2) computation of the maximum and minimum projections of all eigenvectors over the entire hyperspectral image in order to obtain a set of candidate pixels; 3) expansion and averaging of the signatures of the candidate set; 4) ranking based on the spectral angle distance (SAD). The result of this method is a list of candidate signatures from which the endmembers can be extracted using various spectral-based techniques, such as orthogonal subspace projection (OSP), vertex component analysis (VCA) or N-FINDR. Considering the large volume of data and the complexity of the calculations, there is a need for efficient implementations. Latest-generation hardware accelerators such as commodity graphics processing units (GPUs) offer a good chance for improving the computational performance in this context. In this paper, we develop two different implementations of the SSEE algorithm using GPUs. Both are based on the eigenvector computation within each sub-region of the first step, one using singular value decomposition (SVD) and another one using principal component analysis (PCA). Based
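
    Steps 1 and 2 of the SSEE scheme summarized above can be sketched compactly on the CPU; the paper's GPU implementations parallelize exactly these loops. The block size and the number of retained singular vectors are assumptions:

    ```python
    # Per-block SVD, then global projections; projection extremes become
    # endmember candidates.
    import numpy as np

    def ssee_candidates(cube, block=16, n_vec=3):
        rows, cols, bands = cube.shape
        pixels = cube.reshape(-1, bands)
        candidates = set()
        for r0 in range(0, rows, block):
            for c0 in range(0, cols, block):
                sub = cube[r0:r0 + block, c0:c0 + block].reshape(-1, bands)
                sub = sub - sub.mean(axis=0)
                # Right singular vectors span the block's local spectral subspace.
                _, _, vt = np.linalg.svd(sub, full_matrices=False)
                for v in vt[:n_vec]:
                    proj = pixels @ v  # project the whole image, not just the block
                    candidates.add(int(np.argmax(proj)))
                    candidates.add(int(np.argmin(proj)))
        return sorted(candidates)
    ```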

  10. ASTER Level 1B Registered Radiance at the Sensor

    Data.gov (United States)

    U.S. Geological Survey, Department of the Interior — The Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER) is an advanced multispectral imager that was launched on board NASA's Terra spacecraft in...

  11. MISR Level 1B2 Ellipsoid Data V003

    Data.gov (United States)

    National Aeronautics and Space Administration — This file contains Ellipsoid-projected TOA Radiance,resampled at the surface and topographically corrected, as well as geometrically corrected by PGE22

  12. GPM GMI Level 1B Brightness Temperatures VV03A

    Data.gov (United States)

    National Aeronautics and Space Administration — The 1BGMI algorithm uses a non-linear three-point in-flight calibration to derive antenna temperature (Ta) and convert Ta to Tb using GMI antenna pattern...

  13. GPM GMI Level 1B Brightness Temperatures VV03B

    Data.gov (United States)

    National Aeronautics and Space Administration — The 1BGMI algorithm uses a non-linear three-point in-flight calibration to derive antenna temperature (Ta) and convert Ta to Tb using GMI antenna pattern...

  14. An Automated, Adaptive Framework for Optimizing Preprocessing Pipelines in Task-Based Functional MRI.

    Directory of Open Access Journals (Sweden)

    Nathan W Churchill

    2015-01-01

    Full Text Available BOLD fMRI is sensitive to blood-oxygenation changes correlated with brain function; however, it is limited by relatively weak signal and significant noise confounds. Many preprocessing algorithms have been developed to control noise and improve signal detection in fMRI. Although the chosen set of preprocessing and analysis steps (the "pipeline") significantly affects signal detection, pipelines are rarely quantitatively validated in the neuroimaging literature, due to complex preprocessing interactions. This paper outlines and validates an adaptive resampling framework for evaluating and optimizing preprocessing choices by optimizing data-driven metrics of task prediction and spatial reproducibility. Compared to standard "fixed" preprocessing pipelines, this optimization approach significantly improves independent validation measures of within-subject test-retest, between-subject activation overlap, and behavioural prediction accuracy. We demonstrate that preprocessing choices function as implicit model regularizers, and that improvements due to pipeline optimization generalize across a range of simple to complex experimental tasks and analysis models. Results are shown for brief scanning sessions (<3 minutes each), demonstrating that with pipeline optimization, it is possible to obtain reliable results and brain-behaviour correlations in relatively small datasets.

  15. An Automated, Adaptive Framework for Optimizing Preprocessing Pipelines in Task-Based Functional MRI.

    Science.gov (United States)

    Churchill, Nathan W; Spring, Robyn; Afshin-Pour, Babak; Dong, Fan; Strother, Stephen C

    2015-01-01

    BOLD fMRI is sensitive to blood-oxygenation changes correlated with brain function; however, it is limited by relatively weak signal and significant noise confounds. Many preprocessing algorithms have been developed to control noise and improve signal detection in fMRI. Although the chosen set of preprocessing and analysis steps (the "pipeline") significantly affects signal detection, pipelines are rarely quantitatively validated in the neuroimaging literature, due to complex preprocessing interactions. This paper outlines and validates an adaptive resampling framework for evaluating and optimizing preprocessing choices by optimizing data-driven metrics of task prediction and spatial reproducibility. Compared to standard "fixed" preprocessing pipelines, this optimization approach significantly improves independent validation measures of within-subject test-retest, between-subject activation overlap, and behavioural prediction accuracy. We demonstrate that preprocessing choices function as implicit model regularizers, and that improvements due to pipeline optimization generalize across a range of simple to complex experimental tasks and analysis models. Results are shown for brief scanning sessions (<3 minutes each), demonstrating that with pipeline optimization, it is possible to obtain reliable results and brain-behaviour correlations in relatively small datasets.

  16. The Effects of Pre-processing Strategies for Pediatric Cochlear Implant Recipients

    Science.gov (United States)

    Rakszawski, Bernadette; Wright, Rose; Cadieux, Jamie H.; Davidson, Lisa S.; Brenner, Christine

    2016-01-01

    Background Cochlear implants (CIs) have been shown to improve children's speech recognition over traditional amplification when severe-to-profound sensorineural hearing loss is present. Despite improvements, understanding speech at low intensity levels or in the presence of background noise remains difficult. In an effort to improve speech understanding in challenging environments, Cochlear Ltd. offers pre-processing strategies that apply various algorithms prior to mapping the signal to the internal array. Two of these strategies include Autosensitivity Control™ (ASC) and Adaptive Dynamic Range Optimization (ADRO®). Based on previous research, the manufacturer's default pre-processing strategy for pediatric everyday programs combines ASC+ADRO®. Purpose The purpose of this study is to compare pediatric speech perception performance across various pre-processing strategies while applying a specific programming protocol utilizing increased threshold (T) levels to ensure access to very low-level sounds. Research Design This was a prospective, cross-sectional, observational study. Participants completed speech perception tasks in four pre-processing conditions: no pre-processing, ADRO®, ASC, ASC+ADRO®. Study Sample Eleven pediatric Cochlear Ltd. cochlear implant users were recruited: six bilateral, one unilateral, and four bimodal. Intervention Four programs, with the participants' everyday map, were loaded into the processor with different pre-processing strategies applied in each of the four positions: no pre-processing, ADRO®, ASC, and ASC+ADRO®. Data Collection and Analysis Participants repeated CNC words presented at 50 and 70 dB SPL in quiet and HINT sentences presented adaptively with competing R-Space noise at 60 and 70 dB SPL. Each measure was completed as participants listened with each of the four pre-processing strategies listed above. Test order and condition were randomized. A repeated-measures analysis of variance (ANOVA) was used to

  17. Examination of Speed Contribution of Parallelization for Several Fingerprint Pre-Processing Algorithms

    Directory of Open Access Journals (Sweden)

    GORGUNOGLU, S.

    2014-05-01

    Full Text Available In the analysis of minutiae-based fingerprint systems, fingerprints need to be pre-processed. The pre-processing is carried out to enhance the quality of the fingerprint and to obtain more accurate minutiae points. Reducing the pre-processing time is important for identification and verification in real-time systems, and especially for databases holding large amounts of fingerprint information. Parallel processing and parallel CPU computing can be considered as the distribution of processes over multi-core processors, done by using parallel programming techniques. Reducing the execution time is the main objective in parallel processing. In this study, the pre-processing of a minutiae-based fingerprint system is implemented by parallel processing on multi-core computers using OpenMP, and on a graphics processor using CUDA, to improve execution time. The execution times and speedup ratios are compared with those of a single-core processor. The results show that by using parallel processing, execution time is substantially improved. The improvement ratios obtained for different pre-processing algorithms allowed us to make suggestions on the more suitable approaches for parallelization.
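
    The study parallelizes with OpenMP and CUDA; the same idea, distributing independent per-image pre-processing across workers, can be shown with a process pool. The `enhance` stage is a placeholder for the real enhancement chain:

    ```python
    # Fan per-fingerprint pre-processing out over multiple CPU cores.
    from multiprocessing import Pool

    from scipy.ndimage import median_filter

    def enhance(fingerprint):
        # Placeholder stage; a real pipeline would also normalize, binarize
        # and thin the ridge structure before minutiae extraction.
        return median_filter(fingerprint, size=3)

    def preprocess_all(fingerprints, workers=4):
        with Pool(workers) as pool:
            return pool.map(enhance, fingerprints)
    ```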

  18. Study on preprocessing of surface defect images of cold steel strip

    Directory of Open Access Journals (Sweden)

    Xiaoye GE

    2016-06-01

    Full Text Available Image preprocessing is an important part of the field of digital image processing, and it is also the precondition for the image-based detection of cold steel strip surface defects. Factors including the complicated on-site environment and distortion of the optical system cause image degradation, which directly affects the feature extraction and classification of the images. Aiming at these problems, a method combining an adaptive median filter and a homomorphic filter is proposed to preprocess the images. The adaptive median filter is effective for image denoising, and the Gaussian homomorphic filter can steadily remove the nonuniform illumination of images. Finally, the original and preprocessed images and their features are analyzed and compared. The results show that this method can improve image quality effectively.
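
    A compact sketch of the two-stage chain described above. A plain (rather than adaptive) median filter stands in for the denoising stage, and illumination is removed homomorphically in the log domain; the smoothing sigma and gain are assumptions:

    ```python
    # Median denoising followed by a homomorphic-style illumination correction.
    import numpy as np
    from scipy.ndimage import gaussian_filter, median_filter

    def preprocess(image, sigma=30.0, gain=1.5):
        denoised = median_filter(image.astype(float), size=3)
        log_img = np.log1p(denoised)
        illumination = gaussian_filter(log_img, sigma)  # low-frequency component
        # Suppress illumination, emphasize reflectance (defect detail).
        corrected = gain * (log_img - illumination) + illumination.mean()
        return np.expm1(corrected)
    ```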

  19. Optimization of Preprocessing and Densification of Sorghum Stover at Full-scale Operation

    Energy Technology Data Exchange (ETDEWEB)

    Neal A. Yancey; Jaya Shankar Tumuluru; Craig C. Conner; Christopher T. Wright

    2011-08-01

    Transportation costs can be a prohibitive step in bringing biomass to a preprocessing location or biofuel refinery. One alternative to transporting biomass in baled or loose format to a preprocessing location is to utilize a mobile preprocessing system that can be relocated to various locations where biomass is stored, preprocess and densify the biomass, then ship it to the refinery as needed. The Idaho National Laboratory has a full-scale 'Process Demonstration Unit' (PDU), which includes a stage 1 grinder, hammer mill, drier, pellet mill, and cooler with the associated conveyance system components. Testing at bench and pilot scale has been conducted to determine the effects of moisture and crop variety on preprocessing efficiency and product quality. The INL's PDU provides an opportunity to test the conclusions made at the bench and pilot scale on full industrial-scale systems. Each component of the PDU is operated from a central operating station where data is collected to determine power consumption rates for each step in the process. The power for each electrical motor in the system is monitored from the control station to watch for problems and determine optimal conditions for system performance. The data can then be viewed to observe how changes in biomass input parameters (moisture and crop type, for example), mechanical changes (screen size, biomass drying, pellet size, grinding speed, etc.), or other variations affect the power consumption of the system. Sorghum in four-foot round bales was tested in the system using a series of 6 different screen sizes: 3/16 in., 1 in., 2 in., 3 in., 4 in., and 6 in. The effects on power consumption, product quality, and production rate were measured to determine optimal conditions.

  20. Boosting model performance and interpretation by entangling preprocessing selection and variable selection.

    Science.gov (United States)

    Gerretzen, Jan; Szymańska, Ewa; Bart, Jacob; Davies, Antony N; van Manen, Henk-Jan; van den Heuvel, Edwin R; Jansen, Jeroen J; Buydens, Lutgarde M C

    2016-09-28

    The aim of data preprocessing is to remove data artifacts (such as a baseline, scatter effects or noise) and to enhance the contextually relevant information. Many preprocessing methods exist to deliver one or more of these benefits, but which method or combination of methods should be used for the specific data being analyzed is difficult to select. Recently, we have shown that a preprocessing selection approach based on Design of Experiments (DoE) enables correct selection of highly appropriate preprocessing strategies within reasonable time frames. In that approach, the focus was solely on improving the predictive performance of the chemometric model. This is, however, only one of the two relevant criteria in modeling: interpretation of the model results can be just as important. Variable selection is often used to achieve such interpretation. Data artifacts, however, may hamper proper variable selection by masking the true relevant variables. The choice of preprocessing therefore has a huge impact on the outcome of variable selection methods and may thus hamper an objective interpretation of the final model. To enhance such objective interpretation, we here integrate variable selection into the preprocessing selection approach that is based on DoE. We show that the entanglement of preprocessing selection and variable selection not only improves the interpretation, but also the predictive performance of the model. This is achieved by analyzing several experimental data sets of which the true relevant variables are available as prior knowledge. We show that a selection of variables is provided that complies more with the true informative variables compared to individual optimization of both model aspects. Importantly, the approach presented in this work is generic. Different types of models (e.g. PCR, PLS, …) can be incorporated into it, as well as different variable selection methods and different preprocessing methods, according to the taste and experience of

  1. Genetic Algorithm for Optimization: Preprocessing with n Dimensional Bisection and Error Estimation

    Science.gov (United States)

    Sen, S. K.; Shaykhian, Gholam Ali

    2006-01-01

    Knowledge of the appropriate values of the parameters of a genetic algorithm (GA), such as the population size, the shrunk search space containing the solution, and the crossover and mutation probabilities, is not available a priori for a general optimization problem. Recommended here is a polynomial-time preprocessing scheme that includes an n-dimensional bisection and that determines the foregoing parameters before deciding upon an appropriate GA for all problems of similar nature and type. Such preprocessing is not only fast but also enables us to get the global optimal solution and its reasonably narrow error bounds with a high degree of confidence.

  2. Performance of Pre-processing Schemes with Imperfect Channel State Information

    DEFF Research Database (Denmark)

    Christensen, Søren Skovgaard; Kyritsi, Persa; De Carvalho, Elisabeth

    2006-01-01

    Pre-processing techniques have several benefits when the CSI is perfect. In this work we investigate three linear pre-processing filters, assuming imperfect CSI caused by noise degradation and channel temporal variation. Results indicate that the LMMSE filter achieves the lowest BER and the highest SINR when the CSI is perfect, whereas the simple matched filter may be a good choice when the CSI is imperfect. Additionally, the results give insight into the inherent trade-off between robustness against CSI imperfections and spatial focusing ability.

  3. ACTS (Advanced Communications Technology Satellite) Propagation Experiment: Preprocessing Software User's Manual

    Science.gov (United States)

    Crane, Robert K.; Wang, Xuhe; Westenhaver, David

    1996-01-01

    The preprocessing software manual describes the Actspp program, originally developed to observe and diagnose Advanced Communications Technology Satellite (ACTS) propagation terminal/receiver problems. However, it has been quite useful for automating the preprocessing functions needed to convert the terminal output to useful attenuation estimates. Prior to having data acceptable for archival functions, the individual receiver system must be calibrated and the power level shifts caused by ranging tone modulation must be removed. Actspp provides three output files: the daylog, the diurnal coefficient file, and the file that contains calibration information.

  4. Data acquisition, preprocessing and analysis for the Virginia Tech OLYMPUS experiment

    Science.gov (United States)

    Remaklus, P. Will

    1991-01-01

    Virginia Tech is conducting a slant path propagation experiment using the 12, 20, and 30 GHz OLYMPUS beacons. Beacon signal measurements are made using separate terminals for each frequency. In addition, short baseline diversity measurements are collected through a mobile 20 GHz terminal. Data collection is performed with a custom data acquisition and control system. Raw data are preprocessed to remove equipment biases and discontinuities prior to analysis. Preprocessed data are then statistically analyzed to investigate parameters such as frequency scaling, fade slope and duration, and scintillation intensity.

  5. Preprocessing of Tandem Mass Spectrometric Data Based on Decision Tree Classification

    Institute of Scientific and Technical Information of China (English)

    Jing-Fen Zhang; Si-Min He; Jin-Jin Cai; Xing-Jun Cao; Rui-Xiang Sun; Yan Fu; Rong Zeng; Wen Gao

    2005-01-01

    In this study, we present a preprocessing method for quadrupole time-of-flight (Q-TOF) tandem mass spectra to increase the accuracy of database searching for peptide (protein) identification. Based on the natural isotopic information inherent in tandem mass spectra, we construct a decision tree after feature selection to classify the noise and ion peaks in tandem spectra. Furthermore, we recognize overlapping peaks to find the monoisotopic masses of ions for the following identification process. The experimental results show that this preprocessing method increases the search speed and the reliability of peptide identification.
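
    A minimal sketch of the classification step: per-peak features derived from the isotopic pattern feed a decision tree that separates ion peaks from noise. The feature set and training values below are illustrative, not the paper's:

    ```python
    # Train a small tree on hypothetical per-peak features and classify a peak.
    import numpy as np
    from sklearn.tree import DecisionTreeClassifier

    # Columns: intensity ratio to the +1 Da neighbour, ratio to the +2 Da
    # neighbour, and a local signal-to-noise estimate.
    X = np.array([[0.60, 0.20, 12.0],
                  [0.05, 0.01, 1.5],
                  [0.55, 0.25, 9.0],
                  [0.02, 0.00, 1.1]])
    y = np.array([1, 0, 1, 0])  # 1 = ion peak, 0 = noise

    clf = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)
    is_ion = clf.predict([[0.50, 0.22, 8.0]])
    ```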

  6. Influence of Hemp Fibers Pre-processing on Low Density Polyethylene Matrix Composites Properties

    Science.gov (United States)

    Kukle, S.; Vidzickis, R.; Zelca, Z.; Belakova, D.; Kajaks, J.

    2016-04-01

    In the present research, LLDPE matrix composites reinforced with short hemp fibres (fibre content in the range from 30 to 50 wt%) subjected to four different pre-processing technologies were produced, and properties such as tensile strength and elongation at break, tensile modulus, melt flow index, micro-hardness and water absorption dynamics were investigated. Capillary viscosimetry was used for fluidity evaluation, and the melt flow index (MFI) was evaluated for all variants. The MFI of fibres from two of the pre-processing variants was high enough to increase the hemp fibre content from 30 to 50 wt% with only a moderate increase in water sorption capability.

  7. A Real-Time Embedded System for Stereo Vision Preprocessing Using an FPGA

    DEFF Research Database (Denmark)

    Kjær-Nielsen, Anders; Jensen, Lars Baunegaard With; Sørensen, Anders Stengaard

    2008-01-01

    In this paper a low-level vision processing node for use in existing IEEE 1394 camera setups is presented. The processing node is a small embedded system that utilizes an FPGA to perform stereo vision preprocessing at rates limited by the bandwidth of IEEE 1394a (400 Mbit/s). The system is used...

  8. Evaluation of Microarray Preprocessing Algorithms Based on Concordance with RT-PCR in Clinical Samples

    DEFF Research Database (Denmark)

    Hansen, Kasper Lage; Szallasi, Zoltan Imre; Eklund, Aron Charles

    2009-01-01

    We evaluated consistency using the Pearson correlation between measurements obtained on the two platforms. Also, we introduce the log-ratio discrepancy as a more relevant measure of discordance between gene expression platforms. Of nine preprocessing algorithms tested, PLIER+16 produced expression values

  9. Scene matching based on non-linear pre-processing on reference image and sensed image

    Institute of Scientific and Technical Information of China (English)

    Zhong Sheng; Zhang Tianxu; Sang Nong

    2005-01-01

    To solve the heterogeneous image scene matching problem, a non-linear pre-processing method applied to the original images before intensity-based correlation is proposed. The results show that the probability of correct matching is raised greatly, and the effect is especially remarkable for low-S/N image pairs.

  10. A New Endmember Preprocessing Method for the Hyperspectral Unmixing of Imagery Containing Marine Oil Spills

    Directory of Open Access Journals (Sweden)

    Can Cui

    2017-09-01

    Full Text Available Current methods that use hyperspectral remote sensing imagery to extract and monitor marine oil spills are quite popular. However, the automatic extraction of endmembers from hyperspectral imagery remains a challenge. This paper proposes a data field-spectral preprocessing (DSPP) algorithm for endmember extraction. The method first derives a set of extreme points from the data field of an image. At the same time, it identifies a set of spectrally pure points in the spectral space. Finally, the preprocessing algorithm fuses the data field with the spectral calculation to generate a new subset of endmember candidates for the subsequent endmember extraction. The processing time is greatly shortened compared with directly using endmember extraction algorithms. The proposed algorithm provides accurate endmember detection, including the detection of anomalous endmembers. Therefore, it offers greater accuracy and stronger noise resistance, and is less time-consuming. Using both synthetic hyperspectral images and real airborne hyperspectral images, we utilized the proposed preprocessing algorithm in combination with several endmember extraction algorithms to compare the proposed algorithm with the existing endmember extraction preprocessing algorithms. The experimental results show that the proposed method can effectively extract marine oil spill data.

  11. affyPara: a Bioconductor Package for Parallelized Preprocessing Algorithms of Affymetrix Microarray Data.

    Science.gov (United States)

    Schmidberger, Markus; Vicedo, Esmeralda; Mansmann, Ulrich

    2009-07-22

    Microarray data repositories as well as large clinical applications of gene expression allow the analysis of several hundred microarrays at one time. The preprocessing of large numbers of microarrays is still a challenge. The algorithms are limited by the available computer hardware. For example, building classification or prognostic rules from large microarray sets will be very time-consuming. Here, preprocessing has to be a part of the cross-validation and resampling strategy that is necessary to estimate the rule's prediction quality honestly. This paper proposes the new Bioconductor package affyPara for parallelized preprocessing of Affymetrix microarray data. Partition of data can be applied on arrays, and parallelization of algorithms is a straightforward consequence. The partition of data and distribution to several nodes solves the main memory problems and accelerates preprocessing by up to a factor of 20 for 200 or more arrays. affyPara is a free and open source package, under GPL license, available from the Bioconductor project at www.bioconductor.org. A user guide and examples are provided with the package.

  12. Pre-processing filter design at transmitters for IBI mitigation in an OFDM system

    Institute of Scientific and Technical Information of China (English)

    Xia Wang; Lei Wang

    2013-01-01

    In order to meet the demands for high transmission rates and high service quality in broadband wireless communication systems, orthogonal frequency division multiplexing (OFDM) has been adopted in some standards. However, inter-block interference (IBI) and inter-carrier interference (ICI) in an OFDM system affect the performance. To mitigate IBI and ICI, some pre-processing approaches have been proposed based on full channel state information (CSI), which improved the system performance. A pre-processing filter based on partial CSI at the transmitter is designed and investigated. The filter coefficients are given by the optimization processing, the symbol error rate (SER) is tested, and the computational complexity of the proposed scheme is analyzed. Computer simulation results show that the proposed pre-processing filter can effectively mitigate IBI and ICI and that the performance can be improved. Compared with pre-processing approaches at the transmitter based on full CSI, the proposed scheme has high spectral efficiency, limited CSI feedback and low computational complexity.

  13. Inter-Rater Reliability of Preprocessing EEG Data: Impact of Subjective Artifact Removal on Associative Memory Task ERP Results

    Directory of Open Access Journals (Sweden)

    Steven D. Shirk

    2017-06-01

    Full Text Available The processing of EEG data routinely involves subjective removal of artifacts during a preprocessing stage. Preprocessing inter-rater reliability (IRR), and how differences in preprocessing may affect outcomes of primary event-related potential (ERP) analyses, has not been previously assessed. Three raters independently preprocessed EEG data of 16 cognitively healthy adult participants (ages 18–39 years) who performed a memory task. Using intraclass correlations (ICCs), IRR was assessed for Early-frontal, Late-frontal, and Parietal Old/new memory effects contrasts across eight regions of interest (ROIs). IRR was good to excellent for all ROIs; 22 of 26 ICCs were above 0.80. Raters were highly consistent in preprocessing across ROIs, although the frontal pole ROI (ICC range 0.60–0.90) showed less consistency. Old/new parietal effects had the highest ICCs with the lowest variability. Rater preprocessing differences did not alter primary ERP results. IRR for EEG preprocessing was good to excellent, and subjective rater-removal of EEG artifacts did not alter primary memory-task ERP results. Findings provide preliminary support for the robustness of cognitive/memory task-related ERP results against significant inter-rater preprocessing variability, and suggest the reliability of EEG for assessing cognitive-neurophysiological processes when multiple preprocessors are involved.

  14. Predictive modeling of colorectal cancer using a dedicated pre-processing pipeline on routine electronic medical records

    NARCIS (Netherlands)

    Kop, Reinier; Hoogendoorn, Mark; Teije, Annette Ten; Büchner, Frederike L; Slottje, Pauline; Moons, Leon M G; Numans, Mattijs E

    2016-01-01

    Over the past years, research utilizing routine care data extracted from Electronic Medical Records (EMRs) has increased tremendously. Yet there are no straightforward, standardized strategies for pre-processing these data. We propose a dedicated medical pre-processing pipeline aimed at taking on

  15. Reproducible cancer biomarker discovery in SELDI-TOF MS using different pre-processing algorithms.

    Directory of Open Access Journals (Sweden)

    Jinfeng Zou

    Full Text Available BACKGROUND: There has been much interest in differentiating diseased and normal samples using biomarkers derived from mass spectrometry (MS) studies. However, biomarker identification for specific diseases has been hindered by irreproducibility. Specifically, a peak profile extracted from a dataset for biomarker identification depends on the data pre-processing algorithm. Until now, no widely accepted agreement has been reached. RESULTS: In this paper, we investigated the consistency of biomarker identification using differentially expressed (DE) peaks from peak profiles produced by three widely used average spectrum-dependent pre-processing algorithms based on SELDI-TOF MS data for prostate and breast cancers. Our results revealed two important factors that affect the consistency of DE peak identification using different algorithms. One factor is that some DE peaks selected from one peak profile were not detected as peaks in other profiles, and the second factor is that the statistical power of identifying DE peaks in large peak profiles with many peaks may be low due to the large scale of the tests and the small number of samples. Furthermore, we demonstrated that the DE peak detection power in large profiles could be improved by the stratified false discovery rate (FDR) control approach and that the reproducibility of DE peak detection could thereby be increased. CONCLUSIONS: Comparing and evaluating pre-processing algorithms in terms of reproducibility can elucidate the relationship among different algorithms and also help in selecting a pre-processing algorithm. The DE peaks selected from small peak profiles with few peaks for a dataset tend to be reproducibly detected in large peak profiles, which suggests that a suitable pre-processing algorithm should be able to produce peaks sufficient for identifying useful and reproducible biomarkers.

  16. Data preprocessing methods of FT-NIR spectral data for the classification of cooking oil

    Science.gov (United States)

    Ruah, Mas Ezatul Nadia Mohd; Rasaruddin, Nor Fazila; Fong, Sim Siong; Jaafar, Mohd Zuli

    2014-12-01

    This recent work describes data pre-processing methods for FT-NIR spectroscopy datasets of cooking oil and its quality parameters using chemometric methods. Pre-processing of near-infrared (NIR) spectral data has become an integral part of chemometrics modelling. Hence, this work is dedicated to investigating the utility and effectiveness of pre-processing algorithms, namely row scaling, column scaling and single scaling with Standard Normal Variate (SNV). The combinations of these scaling methods have an impact on exploratory analysis and classification via Principal Component Analysis (PCA) plots. The samples were divided into palm oil and non-palm cooking oil. The classification model was built using FT-NIR cooking oil spectra datasets in absorbance mode over the range 4000 cm-1 to 14000 cm-1. A Savitzky-Golay derivative was applied before developing the classification model. The data were then separated into a training set and a test set using the Duplex method; the number in each class was kept equal to 2/3 of the class with the minimum number of samples. The t-statistic was then employed as a variable selection method in order to select which variables are significant for the classification models. The evaluation of the data pre-processing considered the modified silhouette width (mSW), PCA plots and the percentage correctly classified (%CC). The results show that different data pre-processing strategies lead to substantial differences in model performance, with the effects of row scaling, column standardisation and single scaling with Standard Normal Variate indicated by mSW and %CC. With a two-PC model, all five classifiers gave high %CC except Quadratic Distance Analysis.
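
    Of the methods named above, SNV is the simplest to show: each spectrum is centred and scaled by its own mean and standard deviation, which removes multiplicative scatter differences between samples. A minimal sketch:

    ```python
    # Standard Normal Variate: row-wise centring and scaling of spectra.
    import numpy as np

    def snv(spectra):
        spectra = np.asarray(spectra, dtype=float)  # rows = samples
        mean = spectra.mean(axis=1, keepdims=True)
        std = spectra.std(axis=1, keepdims=True)
        return (spectra - mean) / std
    ```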

  17. Value of Distributed Preprocessing of Biomass Feedstocks to a Bioenergy Industry

    Energy Technology Data Exchange (ETDEWEB)

    Christopher T Wright

    2006-07-01

    Biomass preprocessing is one of the primary operations in the feedstock assembly system and the front-end of a biorefinery. Its purpose is to chop, grind, or otherwise format the biomass into a suitable feedstock for conversion to ethanol and other bioproducts. Many variables such as equipment cost and efficiency, and feedstock moisture content, particle size, bulk density, compressibility, and flowability affect the location and implementation of this unit operation. Previous conceptual designs show this operation to be located at the front-end of the biorefinery. However, data are presented that show distributed preprocessing at the field-side or in a fixed preprocessing facility can provide significant cost benefits by producing a higher value feedstock with improved handling, transporting, and merchandising potential. In addition, data supporting the preferential deconstruction of feedstock materials due to their bio-composite structure identify the potential for significant improvements in equipment efficiencies and compositional quality upgrades. These data are collected from full-scale low and high capacity hammermill grinders with various screen sizes. Multiple feedstock varieties with a range of moisture values were used in the preprocessing tests. The comparative values of the different grinding configurations, feedstock varieties, and moisture levels are assessed through post-grinding analysis of the different particle fractions separated with a medium-scale forage particle separator and a Rototap separator. The results show that distributed preprocessing produces a material that has bulk flowable properties and fractionation benefits that can improve the ease of transporting, handling and conveying the material to the biorefinery and improve the biochemical and thermochemical conversion processes.

  18. A comprehensive analysis about the influence of low-level preprocessing techniques on mass spectrometry data for sample classification.

    Science.gov (United States)

    López-Fernández, Hugo; Reboiro-Jato, Miguel; Glez-Peña, Daniel; Fernández-Riverola, Florentino

    2014-01-01

    Matrix-Assisted Laser Desorption Ionisation Time-of-Flight (MALDI-TOF) is one of the high-throughput mass spectrometry technologies able to produce data requiring extensive preprocessing before subsequent analyses. In this context, several low-level preprocessing techniques have been successfully developed for different tasks, including baseline correction, smoothing, normalisation, peak detection and peak alignment. In this work, we present a systematic comparison of different software packages aiding in the compulsory preprocessing of MALDI-TOF data. In order to guarantee the validity of our study, we test multiple configurations of each preprocessing technique, which are subsequently used to train a set of classifiers whose performance (kappa and accuracy) provides accurate information for the final comparison. Results from the experiments show the real impact of preprocessing techniques on classification, evidencing that MassSpecWavelet provides the best performance and that Support Vector Machines (SVM) are among the most accurate classifiers.

  19. Influence of data preprocessing on the quantitative determination of nutrient content in poultry manure by near infrared spectroscopy.

    Science.gov (United States)

    Chen, L J; Xing, L; Han, L J

    2010-01-01

    With increasing concern over potential pollution from farm wastes, there is a need for rapid and robust methods that can analyze livestock manure nutrient content. The near infrared spectroscopy (NIRS) method was used to determine nutrient content in diverse poultry manure samples (n=91). Various standard preprocessing methods (derivatives, multiplicative scatter correction, Savitzky-Golay smoothing, and standard normal variate) were applied to reduce systemic noise in the data. In addition, a new preprocessing method known as direct orthogonal signal correction (DOSC) was tested. Calibration models for ammonium nitrogen, total potassium, total nitrogen, and total phosphorus were developed with the partial least squares (PLS) method. The results showed that all the preprocessing methods improved prediction results compared with no preprocessing. Compared with the other preprocessing methods, the DOSC method gave the best results, achieving moderately successful prediction for ammonium nitrogen, total nitrogen, and total phosphorus. However, none of the preprocessing methods provided reliable prediction for total potassium. This indicates that the DOSC method, especially combined with other preprocessing methods, needs further study to allow a more complete predictive analysis of manure nutrient content.
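
    Two of the standard steps named above, a Savitzky-Golay derivative followed by a PLS calibration, are easy to sketch; DOSC is more involved and is omitted here. The window length, polynomial order and component count are assumptions:

    ```python
    # Derivative preprocessing plus a PLS calibration model.
    from scipy.signal import savgol_filter
    from sklearn.cross_decomposition import PLSRegression

    def calibrate(spectra, reference_values, n_components=5):
        # First derivative along the wavelength axis reduces baseline effects.
        d1 = savgol_filter(spectra, window_length=11, polyorder=2,
                           deriv=1, axis=1)
        model = PLSRegression(n_components=n_components).fit(d1, reference_values)
        return model, d1
    ```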

  20. Acoustic Biometric System Based on Preprocessing Techniques and Linear Support Vector Machines.

    Science.gov (United States)

    del Val, Lara; Izquierdo-Fuente, Alberto; Villacorta, Juan J; Raboso, Mariano

    2015-06-17

    Drawing on the results of an acoustic biometric system based on an MSE classifier, a new biometric system has been implemented. This new system preprocesses acoustic images, extracts several parameters and finally classifies them based on a Support Vector Machine (SVM). The preprocessing techniques used are spatial filtering, segmentation (based on a Gaussian Mixture Model, GMM, to separate the person from the background), masking (to reduce the dimensions of the images) and binarization (to reduce the size of each image). An analysis of the classification error and a study of the sensitivity of the error versus the computational burden of each implemented algorithm are presented. This allows the selection of the most relevant algorithms, according to the benefits required by the system. A significant improvement of the biometric system has been achieved by reducing the classification error, the computational burden and the storage requirements.
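
    The final two stages above, binarization and SVM classification, can be sketched as follows. The threshold and image layout are assumptions, and the GMM segmentation stage is omitted:

    ```python
    # Binarize acoustic images into flat feature vectors for a linear SVM.
    import numpy as np
    from sklearn.svm import LinearSVC

    def binarize(images, threshold=0.5):
        # images: (n_samples, height, width) array scaled to [0, 1]
        return (images > threshold).reshape(len(images), -1).astype(np.uint8)

    # Hypothetical training call, with images `train_imgs` and person labels `y`:
    # clf = LinearSVC().fit(binarize(train_imgs), y)
    ```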

  1. Acoustic Biometric System Based on Preprocessing Techniques and Linear Support Vector Machines

    Directory of Open Access Journals (Sweden)

    Lara del Val

    2015-06-01

    Full Text Available Drawing on the results of an acoustic biometric system based on an MSE classifier, a new biometric system has been implemented. This new system preprocesses acoustic images, extracts several parameters and finally classifies them based on a Support Vector Machine (SVM). The preprocessing techniques used are spatial filtering, segmentation (based on a Gaussian Mixture Model, GMM, to separate the person from the background), masking (to reduce the dimensions of the images) and binarization (to reduce the size of each image). An analysis of classification error and a study of the sensitivity of the error versus the computational burden of each implemented algorithm are presented. This allows the selection of the most relevant algorithms, according to the benefits required by the system. A significant improvement of the biometric system has been achieved by reducing the classification error, the computational burden and the storage requirements.

  2. Pre-Processing for Video Coding with Rate-Distortion Optimization Decision

    Institute of Scientific and Technical Information of China (English)

    QI Yi; HUANG Yong-gui; QI Hong-gang

    2006-01-01

    This paper proposes an adaptive video pre-processing algorithm for video coding. The algorithm works on the original image before intra- or inter-prediction. It adopts a Gaussian filter to remove noise and insignificant features in the video frames. Edge detection and restoration follow, to restore edges that were excessively filtered out in the filtered images. Rate-Distortion Optimization (RDO) is employed to decide adaptively whether a processed block or an unprocessed block is coded into the bit-stream, for more efficient coding. Our experimental results show that the algorithm achieves good coding performance in both subjective and objective terms. In addition, the proposed pre-processing algorithm is transparent to the decoder, and thus can be compliant with any video coding standard without modifying the decoder.

  3. PREPROCESSING FOR LUNG AND HEART IMAGE SEGMENTATION USING AN ANISOTROPIC DIFFUSION FILTER

    Directory of Open Access Journals (Sweden)

    A. T. A Prawira Kusuma

    2015-12-01

    Full Text Available This paper proposes a preprocessing technique for a lung segmentation scheme, using Anisotropic Diffusion filters. The aim is to improve the accuracy, sensitivity and specificity of the segmentation results. This method was chosen because of its edge-preserving ability: while smoothing, it can blur noise yet maintain the edges of objects in the image. Characteristics such as these are needed when filtering medical images, where the boundary between an organ and the background is not so clear. The segmentation itself is done by K-means Clustering and Active Contour to segment the lungs. The segmentation results were validated using the Receiver Operating Characteristic (ROC) and showed increased accuracy, sensitivity and specificity when compared with the segmentation results of the previous paper, in which the preprocessing method used was a Gaussian lowpass filter.
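
    A classical Perona-Malik implementation, one common form of the anisotropic diffusion filter named above, is short enough to sketch: the update suppresses smoothing across strong gradients, so organ boundaries survive while noise is blurred. The conductance constant, step size and iteration count are assumptions:

    ```python
    # Perona-Malik anisotropic diffusion with an exponential edge-stopping term.
    import numpy as np

    def anisotropic_diffusion(img, n_iter=20, kappa=30.0, gamma=0.2):
        u = img.astype(float).copy()
        for _ in range(n_iter):
            # Finite differences to the four neighbours (wrap-around borders).
            diffs = [np.roll(u, s, axis=ax) - u
                     for ax in (0, 1) for s in (-1, 1)]
            # Conductance g = exp(-(|grad u| / kappa)^2) gates each direction.
            u = u + gamma * sum(np.exp(-(d / kappa) ** 2) * d for d in diffs)
        return u
    ```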

  4. A Study on Pre-processing Algorithms for Metal Parts Inspection

    Directory of Open Access Journals (Sweden)

    Haider Sh. Hashim

    2011-06-01

    Full Text Available Pre-processing is very useful in a variety of situations, since it helps to suppress information that is not related to the exact image processing or analysis task. Mathematical morphology is used for analysis, understanding and image processing. It is an influential method in geometric morphological analysis and image understanding, and has become a new theory in the digital image processing domain. Edge detection and noise reduction are crucial and very important pre-processing steps. Classical edge detection and filtering methods are less accurate in detecting complex edges and in filtering various types of noise. This paper proposes some useful mathematical morphology techniques to detect edges and to filter noise in metal part images. The experimental results show that the proposed algorithm helps to increase the accuracy of the metal parts inspection system.
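
    A small sketch of the kind of morphological pass discussed above: a grey-level opening suppresses small bright noise, then a morphological gradient (dilation minus erosion) serves as the edge map. The structuring-element size is an assumption:

    ```python
    # Morphological noise filtering followed by a gradient-based edge map.
    from scipy.ndimage import grey_dilation, grey_erosion, grey_opening

    def morphological_edges(image, size=3):
        # Cast to int so the subtraction below cannot wrap around on uint8 input.
        cleaned = grey_opening(image.astype(int), size=(size, size))
        return (grey_dilation(cleaned, size=(size, size))
                - grey_erosion(cleaned, size=(size, size)))
    ```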

  5. Analog preprocessing in a SNS 2 micrometers low-noise CMOS folding ADC

    Science.gov (United States)

    Carr, Richard D.

    1994-12-01

    Significant research in high-performance analog-to-digital converters (ADCs) has been directed at retaining part of the high-speed flash ADC architecture, while reducing the total number of comparators in the circuit. The symmetrical number system (SNS) can be used to preprocess the analog input signal, reducing the number of comparators and thus the chip area and power consumption of the ADC. This thesis examines a Very Large Scale Integration (VLSI) design for a folding circuit for an SNS analog preprocessing architecture in a 9-bit folding ADC with a total of 23 comparators. The analog folding circuit layout uses the Orbit 2-micrometer CMOS N-well double-metal, double-poly low-noise analog process. The effects of Spice level 2 parameter tolerances during fabrication on the operation of the folding circuit are investigated numerically. The frequency response of the circuit is also quantified. An Application Specific Integrated Circuit (ASIC) is designed.

  6. Radar signal pre-processing to suppress surface bounce and multipath

    Science.gov (United States)

    Paglieroni, David W; Mast, Jeffrey E; Beer, N. Reginald

    2013-12-31

    A method and system for detecting the presence of subsurface objects within a medium is provided. In some embodiments, the imaging and detection system operates in a multistatic mode to collect radar return signals generated by an array of transceiver antenna pairs that is positioned across the surface and that travels down the surface. The imaging and detection system pre-processes the return signals to suppress certain undesirable effects. The imaging and detection system then generates synthetic aperture radar images from real aperture radar images generated from the pre-processed return signals. The imaging and detection system then post-processes the synthetic aperture radar images to improve detection of subsurface objects. The imaging and detection system identifies peaks in the energy levels of the post-processed image frame, which indicate the presence of a subsurface object.

  7. Preprocessing, classification modeling and feature selection using flow injection electrospray mass spectrometry metabolite fingerprint data.

    Science.gov (United States)

    Enot, David P; Lin, Wanchang; Beckmann, Manfred; Parker, David; Overy, David P; Draper, John

    2008-01-01

    Metabolome analysis by flow injection electrospray mass spectrometry (FIE-MS) fingerprinting generates measurements relating to large numbers of m/z signals. Such data sets often exhibit high variance with a paucity of replicates, thus providing a challenge for data mining. We describe data preprocessing and modeling methods that have proved reliable in projects involving samples from a range of organisms. The protocols interact with software resources specifically for metabolomics provided in a Web-accessible data analysis package FIEmspro (http://users.aber.ac.uk/jhd) written in the R environment and requiring a moderate knowledge of R command-line usage. Specific emphasis is placed on describing the outcome of modeling experiments using FIE-MS data that require further preprocessing to improve quality. The salient features of both poor and robust (i.e., highly generalizable) multivariate models are outlined together with advice on validating classifiers and avoiding false discovery when seeking explanatory variables.

  8. Acoustic Biometric System Based on Preprocessing Techniques and Linear Support Vector Machines

    Science.gov (United States)

    del Val, Lara; Izquierdo-Fuente, Alberto; Villacorta, Juan J.; Raboso, Mariano

    2015-01-01

    Drawing on the results of an acoustic biometric system based on an MSE classifier, a new biometric system has been implemented. This new system preprocesses acoustic images, extracts several parameters and finally classifies them, based on a Support Vector Machine (SVM). The preprocessing techniques used are spatial filtering; segmentation, based on a Gaussian Mixture Model (GMM), to separate the person from the background; masking, to reduce the dimensions of the images; and binarization, to reduce the size of each image. An analysis of the classification error and a study of the sensitivity of the error versus the computational burden of each implemented algorithm are presented. This allows the selection of the most relevant algorithms, according to the benefits required by the system. A significant improvement of the biometric system has been achieved by reducing the classification error, the computational burden and the storage requirements. PMID:26091392
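
    A compact sketch of that pipeline (Python with scikit-learn; the mock image data and every parameter choice are placeholders, not the paper's) could look like:

      import numpy as np
      from sklearn.mixture import GaussianMixture
      from sklearn.svm import LinearSVC

      def segment_and_binarize(img):
          """Split pixels into person/background with a 2-component GMM,
          then return a binary mask of the brighter component."""
          gmm = GaussianMixture(n_components=2, random_state=0)
          labels = gmm.fit_predict(img.reshape(-1, 1))
          fg = np.argmax(gmm.means_.ravel())      # assume the person is brighter
          return (labels == fg).reshape(img.shape).astype(np.uint8)

      # Mock acoustic images: 20 samples, 2 classes, 32x32 pixels.
      rng = np.random.default_rng(2)
      images = rng.random((20, 32, 32))
      images[:10, 8:24, 8:24] += 1.0            # class 0 pattern
      images[10:, 4:28, 12:20] += 1.0           # class 1 pattern
      y = np.repeat([0, 1], 10)

      X = np.array([segment_and_binarize(im).ravel() for im in images])
      clf = LinearSVC(C=1.0).fit(X, y)          # linear SVM, as in the paper
      print("training accuracy:", clf.score(X, y))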

  9. A Hybrid System based on Multi-Agent System in the Data Preprocessing Stage

    CERN Document Server

    Kularbphettong, Kobkul; Meesad, Phayung

    2010-01-01

    We describe the usage of a multi-agent system (MAS) in the data preprocessing stage of an on-going project called e-Wedding. The aim of this project is to utilize MAS and various approaches, like Web services, ontologies, and data mining techniques, in e-Business, so as to improve the responsiveness and efficiency of systems and to extract customer behavior models in wedding businesses. In this paper, however, we propose and implement a multi-agent system, based on JADE, that copes only with the data preprocessing stage, specifically with techniques for handling missing values. JADE is quite easy to learn and use. Moreover, it supports many agent approaches such as agent communication, protocols, behaviors and ontologies. This framework has been experimented with and evaluated in a simple but realistic setting. The results, though still preliminary, are quite promising.
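
    The missing-value step itself can be as simple as per-column imputation; the sketch below (Python/pandas, an illustrative stand-in for the work the JADE agents perform) fills numeric gaps with the column mean and categorical gaps with the mode.

      import pandas as pd

      def impute_missing(df):
          """Fill numeric NaNs with the column mean, categorical with the mode."""
          out = df.copy()
          for col in out.columns:
              if pd.api.types.is_numeric_dtype(out[col]):
                  out[col] = out[col].fillna(out[col].mean())
              else:
                  out[col] = out[col].fillna(out[col].mode().iloc[0])
          return out

      df = pd.DataFrame({
          "budget": [1200.0, None, 3000.0, 2500.0],
          "venue":  ["hotel", "garden", None, "hotel"],
      })
      print(impute_missing(df))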

  10. Input data preprocessing method for exchange rate forecasting via neural network

    Directory of Open Access Journals (Sweden)

    Antić Dragan S.

    2014-01-01

    The aim of this paper is to present a method for the selection and preprocessing of neural network input parameters. The purpose of this network is to forecast foreign exchange rates using artificial intelligence. Two data sets are formed for two different economic systems. Each system is represented by six categories with 70 economic parameters which are used in the analysis. Reduction of these parameters within each category was performed using the principal component analysis method. Component interdependencies are established and relations between them are formed. The newly formed relations were used to create the input vectors of a neural network. A multilayer feed-forward neural network is formed and trained using batch training. Finally, simulation results are presented and it is concluded that the input data preparation method is an effective way of preprocessing neural network data. [Project of the Ministry of Science of the Republic of Serbia, no. TR 35005, no. III 43007 and no. III 44006]
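
    A minimal version of that preparation step (Python/scikit-learn; the mock data, category sizes and two components per category are assumptions for illustration) reduces each category's parameters with PCA and feeds the concatenated components to a feed-forward network:

      import numpy as np
      from sklearn.decomposition import PCA
      from sklearn.neural_network import MLPRegressor

      rng = np.random.default_rng(3)
      n_samples = 200
      # Six categories with ~70 economic indicators in total (mock data).
      categories = [rng.standard_normal((n_samples, k)) for k in (12, 12, 12, 12, 11, 11)]
      rate = rng.standard_normal(n_samples)          # exchange-rate target (mock)

      # Reduce each category independently, then concatenate the scores.
      inputs = np.hstack([
          PCA(n_components=2).fit_transform(cat) for cat in categories
      ])

      net = MLPRegressor(hidden_layer_sizes=(16,), solver="lbfgs",
                         max_iter=2000, random_state=0)
      net.fit(inputs, rate)
      print("inputs per sample:", inputs.shape[1], "R^2:", round(net.score(inputs, rate), 3))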

  11. The Role of GRAIL Orbit Determination in Preprocessing of Gravity Science Measurements

    Science.gov (United States)

    Kruizinga, Gerhard; Asmar, Sami; Fahnestock, Eugene; Harvey, Nate; Kahan, Daniel; Konopliv, Alex; Oudrhiri, Kamal; Paik, Meegyeong; Park, Ryan; Strekalov, Dmitry; Watkins, Michael; Yuan, Dah-Ning

    2013-01-01

    The Gravity Recovery And Interior Laboratory (GRAIL) mission has constructed a lunar gravity field with unprecedented uniform accuracy on the farside and nearside of the Moon. GRAIL lunar gravity field determination begins with preprocessing of the gravity science measurements by applying corrections for time tag error, general relativity, measurement noise and biases. Gravity field determination requires the generation of spacecraft ephemerides of an accuracy not attainable with the pre-GRAIL lunar gravity fields. Therefore, a bootstrapping strategy was developed, iterating between science data preprocessing and lunar gravity field estimation in order to construct sufficiently accurate orbit ephemerides. This paper describes the GRAIL measurements, their dependence on the spacecraft ephemerides, and the role of orbit determination in the bootstrapping strategy. Simulation results will be presented that validate the bootstrapping strategy, followed by bootstrapping results for flight data, which have led to the latest GRAIL lunar gravity fields.

  12. The impact of data preprocessing in traumatic brain injury detection using functional magnetic resonance imaging.

    Science.gov (United States)

    Vergara, Victor M; Damaraju, Eswar; Mayer, Andrew B; Miller, Robyn; Cetin, Mustafa S; Calhoun, Vince

    2015-01-01

    Traumatic brain injury (TBI) can adversely affect a person's thinking, memory, personality and behavior. For this reason, new and better biomarkers are being investigated. Resting-state functional network connectivity (rsFNC) derived from functional magnetic resonance imaging (fMRI) is emerging as a possible biomarker. One of the main concerns with this technique is the appropriateness of methods used to correct for subject movement. In this work we used 50 mild TBI patients and matched healthy controls to explore the outcomes obtained from different fMRI data preprocessing. Results suggest that correction for motion variance before spatial smoothing is the best alternative. With this preprocessing option, a significant group difference was found between the cerebellum and the supplementary motor area/paracentral lobule. In this case the mTBI group exhibits an increase in rsFNC.
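
    The favored ordering (regress out motion-related variance, then smooth spatially) can be sketched as follows in Python with NumPy and SciPy; the array shapes and the six motion regressors are illustrative assumptions, not the study's actual pipeline:

      import numpy as np
      from scipy.ndimage import gaussian_filter

      def remove_motion_then_smooth(bold, motion, fwhm_vox=2.0):
          """Regress motion parameters out of each voxel time series,
          then smooth each volume spatially (the order the study favors).

          bold:   (t, x, y, z) fMRI time series
          motion: (t, 6) realignment parameters
          """
          t = bold.shape[0]
          X = np.column_stack([np.ones(t), motion])   # design: intercept + motion
          flat = bold.reshape(t, -1)
          beta, *_ = np.linalg.lstsq(X, flat, rcond=None)
          resid = (flat - X @ beta).reshape(bold.shape)
          sigma = fwhm_vox / 2.355                    # FWHM -> Gaussian sigma
          return np.stack([gaussian_filter(vol, sigma) for vol in resid])

      rng = np.random.default_rng(4)
      bold = rng.standard_normal((120, 8, 8, 8))
      motion = rng.standard_normal((120, 6))
      print(remove_motion_then_smooth(bold, motion).shape)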

  13. KONFIG and REKONFIG: Two interactive preprocessing programs for the Navy/NASA Engine Program (NNEP)

    Science.gov (United States)

    Fishbach, L. H.

    1981-01-01

    The NNEP is a computer program that is currently being used to simulate the thermodynamic cycle performance of almost all types of turbine engines by many government, industry, and university personnel. The NNEP uses arrays of input data to set up the engine simulation and component matching method as well as to describe the characteristics of the components. A preprocessing program (KONFIG) is described with which the user, at a terminal on a time-shared computer, can interactively prepare the arrays of data required. It is intended to make it easier for the occasional or new user to operate NNEP. Another preprocessing program (REKONFIG), with which the user can modify the component specifications of a previously configured NNEP dataset, is also described. It is intended to aid in preparing data for parametric studies and/or studies of similar engines such as mixed-flow turbofans, turboshafts, etc.

  14. Effective automated prediction of vertebral column pathologies based on logistic model tree with SMOTE preprocessing.

    Science.gov (United States)

    Karabulut, Esra Mahsereci; Ibrikci, Turgay

    2014-05-01

    This study develops a logistic model tree based automation system for accurate recognition of the types of vertebral column pathologies. Six biomechanical measures are used for this purpose: pelvic incidence, pelvic tilt, lumbar lordosis angle, sacral slope, pelvic radius and grade of spondylolisthesis. A two-phase classification model is employed, in which the first step is preprocessing the data by use of the Synthetic Minority Over-sampling Technique (SMOTE), and the second is feeding the preprocessed data to the Logistic Model Tree (LMT) classifier. We have achieved an accuracy of 89.73% and an area under the curve (AUC) of 0.964 in computer-based automatic detection of the pathology. This was validated via a 10-fold cross-validation experiment conducted on the clinical records of 310 patients. The study also presents a comparative analysis of the vertebral column data with the use of several machine learning algorithms.
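
    The two-phase idea (oversample the minority classes, then train the classifier) is easy to reproduce. The sketch below uses Python with imbalanced-learn's SMOTE and, since scikit-learn has no logistic model tree, a decision tree as a stand-in classifier; the mock data and the stand-in model are assumptions, not the paper's exact setup.

      from imblearn.over_sampling import SMOTE
      from imblearn.pipeline import Pipeline
      from sklearn.datasets import make_classification
      from sklearn.model_selection import cross_val_score
      from sklearn.tree import DecisionTreeClassifier

      # Mock stand-in for the 310-patient, 6-feature vertebral column data.
      X, y = make_classification(n_samples=310, n_features=6, n_informative=4,
                                 n_redundant=0, n_classes=3, n_clusters_per_class=1,
                                 weights=[0.6, 0.3, 0.1], random_state=0)

      pipe = Pipeline([
          ("smote", SMOTE(random_state=0)),                  # phase 1: balance classes
          ("tree", DecisionTreeClassifier(random_state=0)),  # phase 2: classify
      ])
      scores = cross_val_score(pipe, X, y, cv=10)  # 10-fold CV, as in the study
      print("mean accuracy:", scores.mean().round(3))

    Using the imbalanced-learn Pipeline matters here: it applies SMOTE only to the training folds, so the cross-validated accuracy is not inflated by synthetic test samples.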

  15. The Combined Effect of Filters in ECG Signals for Pre-Processing

    Directory of Open Access Journals (Sweden)

    Isha V. Upganlawar

    2014-05-01

    The ECG signal is abruptly changing and continuous in nature. For heart diseases such as paroxysmal arrhythmia, diagnosis depends on intelligent health-care decisions, so the ECG signal needs to be pre-processed accurately before further actions on it, such as feature extraction, wavelet decomposition, locating the QRS complexes in ECG recordings and related information such as heart rate and RR interval, and classification of the signal with various classifiers. Filters play a very important role in analyzing the low-frequency components of ECG signals. Since biomedical signals are of low frequency, the removal of power-line interference and baseline wander is a very important step at the pre-processing stage of ECG. In this paper we deal with the study of median filtering and FIR (Finite Impulse Response) filtering of ECG signals under noisy conditions.
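
    A standard version of this pre-processing (Python/SciPy; the sampling rate, window lengths and cutoffs are common textbook choices assumed here, not taken from the paper) removes baseline wander with cascaded median filters and power-line interference with an FIR band-stop filter:

      import numpy as np
      from scipy.signal import medfilt, firwin, filtfilt

      fs = 360                                   # sampling rate in Hz (assumed)
      t = np.arange(10 * fs) / fs
      ecg = np.sin(2 * np.pi * 1.2 * t)          # toy ECG-like rhythm
      ecg += 0.5 * np.sin(2 * np.pi * 0.3 * t)   # baseline wander
      ecg += 0.2 * np.sin(2 * np.pi * 50 * t)    # power-line interference

      def odd(n):                                # median windows must be odd
          return n if n % 2 else n + 1

      # Baseline wander: estimate with cascaded 200 ms / 600 ms median
      # filters, then subtract the estimate.
      baseline = medfilt(medfilt(ecg, odd(int(0.2 * fs))), odd(int(0.6 * fs)))
      ecg_nobase = ecg - baseline

      # Power-line interference: FIR band-stop around 50 Hz, applied
      # forward-backward for zero phase distortion.
      taps = firwin(301, [45.0, 55.0], fs=fs)    # pass_zero=True -> band-stop
      clean = filtfilt(taps, 1.0, ecg_nobase)
      print(round(float(np.std(ecg)), 3), round(float(np.std(clean)), 3))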

  16. The Combined Effect of Filters in ECG Signals for Pre-Processing

    OpenAIRE

    Isha V. Upganlawar; Harshal Chowhan

    2014-01-01

    The ECG signal is abruptly changing and continuous in nature. For heart diseases such as paroxysmal arrhythmia, diagnosis depends on intelligent health-care decisions, so the ECG signal needs to be pre-processed accurately before further actions on it, such as feature extraction, wavelet decomposition, locating the QRS complexes in ECG recordings and related information such as heart rate and RR interval, and classification of the signal with various classifiers. Filters p...

  17. Data preprocessing for a vehicle-based localization system used in road traffic applications

    Science.gov (United States)

    Patelczyk, Timo; Löffler, Andreas; Biebl, Erwin

    2016-09-01

    This paper presents a fixed-point implementation, on a field programmable gate array (FPGA), of the preprocessing required for multipath joint angle and delay estimation (JADE) in road traffic applications. This preprocessing lays the foundation for many model-based parameter estimation methods. Here, a simulation of a vehicle-based localization system for protecting vulnerable road users, who are equipped with appropriate transponders, is considered. For such safety-critical applications, the robustness and real-time capability of the localization are particularly important. An additional motivation for a fixed-point implementation of the data preprocessing is the limited computing power of the head unit of a vehicle. This study aims to process the raw data provided by the localization system considered here. The data preprocessing applied includes a wideband calibration of the physical localization system, separation of the relevant information from the received sampled signal, and preparation of the incoming data via further processing. Furthermore, a channel matrix estimation was implemented to complete the data preprocessing; the channel matrix contains information on channel parameters, e.g., the positions of the objects to be located. In the presented case of a vehicle-based localization system we assume an urban environment, in which multipath propagation occurs. Since most methods for localization are based on uncorrelated signals, this fact must be addressed; hence, a decorrelation of the incoming data stream is required before further localization. This decorrelation was accomplished by considering several snapshots in different time slots. As a final aspect of the use of fixed-point arithmetic, quantization errors are considered. In addition, the resources and runtime of the presented implementation are discussed; these factors are strongly linked to a practical implementation.
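
    The fixed-point side of such an implementation comes down to scaling real values into integer Q-format words. A minimal sketch (Python/NumPy; the Q1.15 format is an assumed example, not necessarily what the FPGA design uses):

      import numpy as np

      def to_fixed_point(x, frac_bits=15, word_bits=16):
          """Quantize floats to signed Qm.n fixed point (default Q1.15)."""
          scale = 2 ** frac_bits
          lo = -(2 ** (word_bits - 1))
          hi = 2 ** (word_bits - 1) - 1
          return np.clip(np.round(x * scale), lo, hi).astype(np.int16)

      def to_float(q, frac_bits=15):
          return q.astype(np.float64) / 2 ** frac_bits

      rng = np.random.default_rng(5)
      snapshots = 0.9 * rng.standard_normal(1000).clip(-1, 1)  # samples in [-0.9, 0.9]
      q = to_fixed_point(snapshots)
      err = snapshots - to_float(q)
      print("max quantization error:", err.max(), "LSB =", 2.0 ** -15)

    The quantization error stays within half a least significant bit, which is the kind of bound the paper's error analysis would need to propagate through the calibration and channel-matrix stages.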

  18. A clinical evaluation of the RNCA study using Fourier filtering as a preprocessing method

    Energy Technology Data Exchange (ETDEWEB)

    Robeson, W.; Alcan, K.E.; Graham, M.C.; Palestro, C.; Oliver, F.H.; Benua, R.S.

    1984-06-01

    Forty-one patients (25 male, 16 female) were studied by Radionuclide Cardangiography (RNCA) in our institution. There were 42 rest studies and 24 stress studies (66 studies total). Sixteen patients were normal, 15 had ASHD, seven had a cardiomyopathy, and three had left-sided valvular regurgitation. Each study was preprocessed using both the standard nine-point smoothing method and Fourier filtering. Amplitude and phase images were also generated. Both preprocessing methods were compared with respect to image quality, border definition, reliability and reproducibility of the LVEF, and cine wall motion interpretation. Image quality and border definition were judged superior by the consensus of two independent observers in 65 of 66 studies (98%) using Fourier-filtered data. The LVEF differed between the two processes by greater than 0.05 in 17 of 66 studies (26%), including five studies in which the LVEF could not be determined using nine-point smoothed data. LV wall motion was normal by both techniques in all control patients by cine analysis. However, cine wall motion analysis using Fourier-filtered data demonstrated additional abnormalities in 17 of 25 studies (68%) in the ASHD group, including three uninterpretable studies using nine-point smoothed data. In the cardiomyopathy/valvular heart disease group, ten of 18 studies (56%) had additional wall motion abnormalities using Fourier-filtered data (including four uninterpretable studies using nine-point smoothed data). We conclude that Fourier filtering is superior to the nine-point smoothing preprocessing method now in general use in terms of image quality, border definition, generation of an LVEF, and cine wall motion analysis. The advent of the array processor makes routine preprocessing by Fourier filtering a feasible technologic advance in the development of the RNCA study.
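
    Both preprocessing options are simple to state digitally. In the sketch below (Python with NumPy and SciPy; frame counts and the number of retained harmonics are illustrative, not the study's acquisition parameters), the nine-point smooth is taken to be the usual 3 x 3 spatial average of each frame, while the Fourier filter low-passes each pixel's time-activity curve:

      import numpy as np
      from scipy.ndimage import uniform_filter

      def fourier_filter(frames, keep_harmonics=4):
          """Temporal Fourier low-pass: filter each pixel's time-activity curve."""
          spec = np.fft.rfft(frames, axis=0)
          spec[keep_harmonics + 1:] = 0          # discard high temporal harmonics
          return np.fft.irfft(spec, n=frames.shape[0], axis=0)

      def nine_point_smooth(frames):
          """Spatial nine-point (3 x 3) average applied to every frame."""
          return uniform_filter(frames, size=(1, 3, 3))

      rng = np.random.default_rng(6)
      t = np.arange(16)[:, None, None]
      frames = 100 - 30 * np.sin(np.pi * t / 8) ** 2 + rng.normal(0, 5, (16, 64, 64))
      print(fourier_filter(frames).shape, nine_point_smooth(frames).shape)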

  19. Pre-Processing and Re-Weighting Jet Images with Different Substructure Variables

    CERN Document Server

    Huynh, Lynn

    2016-01-01

    This work is an extension of Monte Carlo simulation based studies in tagging boosted, hadronically decaying W bosons at a center-of-mass energy of √s = 13 TeV. Two pre-processing techniques used with jet images, translation and rotation, are first examined. The generated jet images for W signal jets and QCD background jets are then rescaled and weighted with five different substructure variables for visual comparison.
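
    The two pre-processing steps amount to standard image transforms on the pixelized (eta, phi) grid. A minimal sketch (Python/SciPy, with a toy jet image; the centering and rotation conventions are common choices in the jet-image literature, not necessarily those of the note):

      import numpy as np
      from scipy.ndimage import rotate, shift

      def preprocess_jet_image(img):
          """Translate the leading-pT pixel to the center, then rotate by the
          subleading deposit's polar angle so all images share one orientation."""
          ny, nx = img.shape
          iy, ix = np.unravel_index(np.argmax(img), img.shape)
          centered = shift(img, (ny // 2 - iy, nx // 2 - ix), order=1)

          tmp = centered.copy()
          tmp[ny // 2, nx // 2] = 0              # mask the leading deposit
          jy, jx = np.unravel_index(np.argmax(tmp), tmp.shape)
          angle = np.degrees(np.arctan2(jy - ny // 2, jx - nx // 2))
          return rotate(centered, angle + 90, reshape=False, order=1)

      img = np.zeros((25, 25))
      img[10, 7] = 5.0     # leading subjet
      img[16, 15] = 2.0    # subleading subjet
      out = preprocess_jet_image(img)
      print(out.shape, np.unravel_index(np.argmax(out), out.shape))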

  20. Preprocessing techniques to reduce atmospheric and sensor variability in multispectral scanner data.

    Science.gov (United States)

    Crane, R. B.

    1971-01-01

    Multispectral scanner data are potentially useful in a variety of remote sensing applications. Large-area surveys of earth resources carried out by automated recognition processing of these data are particularly important. However, the practical realization of such surveys is limited by a variability in the scanner signals that results in improper recognition of the data. This paper discusses ways by which some of this variability can be removed from the data by preprocessing with resultant improvements in recognition results.

  1. Performance evaluation of preprocessing techniques utilizing expert information in multivariate calibration.

    Science.gov (United States)

    Sharma, Sandeep; Goodarzi, Mohammad; Ramon, Herman; Saeys, Wouter

    2014-04-01

    Partial Least Squares (PLS) regression is one of the most used methods for extracting chemical information from Near Infrared (NIR) spectroscopic measurements. The success of a PLS calibration relies largely on the representativeness of the calibration data set. This is not trivial, because not only the expected variation in the analyte of interest, but also the variation of other contributing factors (interferents) should be included in the calibration data. This also implies that changes in interferent concentrations not covered in the calibration step can deteriorate the prediction ability of the calibration model. Several researchers have suggested that PLS models can be robustified against changes in the interferent structure by incorporating expert knowledge in the preprocessing step, with the aim of efficiently filtering out the influence of the spectral interferents. However, these methods have not yet been compared against each other. Therefore, in the present study, various preprocessing techniques exploiting expert knowledge were compared on two experimental data sets. In both data sets, the calibration and test set were designed to have a different interferent concentration range. The performance of these techniques was compared to that of preprocessing techniques which do not use any expert knowledge. Using expert knowledge was found to improve the prediction performance for both data sets. For data set 1, the prediction error improved nearly 32% when pure component spectra of the analyte and the interferents were used in the Extended Multiplicative Signal Correction framework. Similarly, for data set 2, a nearly 63% improvement in the prediction error was observed when the interferent information was utilized in Spectral Interferent Subtraction preprocessing.
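
    The EMSC step mentioned above is at heart a least-squares projection. A minimal sketch (Python/NumPy; the synthetic pure spectra and the polynomial baseline terms are illustrative assumptions):

      import numpy as np

      def emsc_correct(spectrum, reference, interferents, poly_order=2):
          """Extended Multiplicative Signal Correction with known interferents.

          Model: spectrum ~ b*reference + interferent terms + polynomial baseline.
          Returns the corrected spectrum: baseline and interferents removed,
          multiplicative effect b divided out.
          """
          n = spectrum.size
          wav = np.linspace(-1, 1, n)
          baseline = np.vander(wav, poly_order + 1, increasing=True).T
          M = np.vstack([reference, interferents, baseline])   # design matrix rows
          coef, *_ = np.linalg.lstsq(M.T, spectrum, rcond=None)
          b = coef[0]
          unwanted = coef[1:] @ M[1:]
          return (spectrum - unwanted) / b

      n = 500
      x = np.linspace(0, 1, n)
      ref = np.exp(-((x - 0.5) / 0.05) ** 2)        # analyte pure spectrum
      water = np.exp(-((x - 0.7) / 0.1) ** 2)       # interferent pure spectrum
      measured = 1.3 * ref + 0.8 * water + 0.2 + 0.1 * x
      corrected = emsc_correct(measured, ref, np.atleast_2d(water))
      print(np.allclose(corrected, ref, atol=1e-8))  # exact recovery on clean data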

  2. Pre-Processing Noise Cross-Correlations with Equalizing the Network Covariance Matrix Eigen-Spectrum

    Science.gov (United States)

    Seydoux, L.; de Rosny, J.; Shapiro, N.

    2016-12-01

    Theoretically, the extraction of Green's functions from noise cross-correlation requires the ambient seismic wavefield to be generated by uncorrelated sources evenly distributed in the medium. Yet, this condition is often not met. Strong events such as earthquakes often produce highly coherent transient signals. Also, the microseismic noise is generated at specific places on the Earth's surface, with source regions often very localized in space. Different localized and persistent seismic sources may contaminate the cross-correlations of continuous records, resulting in spurious arrivals or asymmetry and, finally, in biased travel-time measurements. Pre-processing techniques therefore must be applied to the seismic data in order to reduce the effect of noise anisotropy and the influence of strong localized events. Here we describe a pre-processing approach that uses the covariance matrix computed from signals recorded by a network of seismographs. We extend the widely used time and spectral equalization pre-processing to the equalization of the covariance matrix spectrum (i.e., its ordered eigenvalues). This approach can be considered as a spatial equalization. This method allows us to correct for the wavefield anisotropy in two ways: (1) the influence of strong directive sources is substantially attenuated, and (2) the weakly excited modes are reinforced, partially recovering the conditions required for Green's function retrieval. We also present an eigenvector-based spatial filter used to distinguish between surface and body waves. This last filter is used together with the equalization of the eigenvalue spectrum. We simulate a two-dimensional wavefield in a heterogeneous medium with a strongly dominating source. We show that our method greatly improves the travel-time measurements obtained from the inter-station cross-correlation functions. Also, we apply the developed method to the USArray data and pre-process the continuous records strongly influenced by localized sources.
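
    The core operation, equalizing the ordered eigenvalues of the network covariance matrix, can be sketched as follows (Python/NumPy; the sub-window averaging and whitening details are simplified assumptions relative to the authors' processing):

      import numpy as np

      def equalize_covariance_spectrum(records, n_windows=8):
          """Spatial whitening of a seismic network's covariance matrix.

          records: (n_stations, n_samples). The covariance is averaged over
          n_windows sub-windows (so it has usable rank), its eigenvalues are
          set to one while keeping the eigenvectors, and the last window's
          data are re-expanded in the whitened basis.
          """
          n_sta, n_samp = records.shape
          win = n_samp // n_windows
          segs = records[:, :win * n_windows].reshape(n_sta, n_windows, win)
          spectra = np.fft.rfft(segs, axis=2)       # (stations, windows, freqs)

          out = np.empty((n_sta, spectra.shape[2]), dtype=complex)
          for k in range(spectra.shape[2]):
              V = spectra[:, :, k]                  # network vectors at freq k
              C = V @ V.conj().T / n_windows        # averaged covariance
              w, U = np.linalg.eigh(C)
              # Equalization: replace the eigenvalue spectrum by ones.
              whitener = U @ np.diag(1.0 / np.sqrt(np.maximum(w, 1e-12))) @ U.conj().T
              out[:, k] = whitener @ V[:, -1]       # whiten one data window
          return np.fft.irfft(out, n=win, axis=1)

      rng = np.random.default_rng(8)
      # Ten stations dominated by one coherent directive source + weak noise.
      source = rng.standard_normal(4096)
      records = np.outer(rng.random(10) + 0.5, source) + 0.1 * rng.standard_normal((10, 4096))
      print(equalize_covariance_spectrum(records).shape)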

  3. Hyperspectral imaging in medicine: image pre-processing problems and solutions in Matlab.

    Science.gov (United States)

    Koprowski, Robert

    2015-11-01

    The paper presents problems and solutions related to hyperspectral image pre-processing. New methods of preliminary image analysis are proposed. The paper shows problems occurring in Matlab when trying to analyse this type of image. Moreover, new methods are discussed which provide source code in Matlab that can be used in practice without any licensing restrictions. A proposed application and a sample result of hyperspectral image analysis are also presented.

  4. Review of Data Preprocessing Methods for Sign Language Recognition Systems based on Artificial Neural Networks

    Directory of Open Access Journals (Sweden)

    Zorins Aleksejs

    2016-12-01

    The article presents an introductory analysis of a research topic relevant for the Latvian deaf society: the development of a Latvian Sign Language Recognition System. More specifically, data preprocessing methods are discussed in the paper and several approaches are shown, with a focus on systems based on artificial neural networks, which are one of the most successful solutions for the sign language recognition task.

  5. Data Cleaning In Data Warehouse: A Survey of Data Pre-processing Techniques and Tools

    Directory of Open Access Journals (Sweden)

    Anosh Fatima

    2017-03-01

    A data warehouse is a computer system designed for storing and analyzing an organization's historical data from day-to-day operations in Online Transaction Processing (OLTP) systems. Usually, an organization summarizes and copies information from its operational systems to the data warehouse on a regular schedule, and management performs complex queries and analysis on the information without slowing down the operational systems. Data need to be pre-processed to improve their quality before being stored in the data warehouse. This survey paper presents data cleaning problems and the approaches currently in use for preprocessing. The main goal of this paper is to determine which preprocessing technique is best in which scenario for improving the performance of a data warehouse. Many techniques have been analyzed for data cleansing, using certain evaluation attributes, and tested on different kinds of data sets. Data quality tools such as YALE, ALTERYX, and WEKA have been used to obtain conclusive results, to ready the data for the warehouse and to ensure that only cleaned data populates the warehouse, thus enhancing its usability. The results of the paper can be useful in many future activities, like cleansing, standardizing, correction, matching and transformation. This research can help in data auditing and pattern detection in the data.

  6. Supervised pre-processing approaches in multiple class variables classification for fish recruitment forecasting

    KAUST Repository

    Fernandes, José Antonio

    2013-02-01

    A multi-species approach to fisheries management requires taking into account the interactions between species in order to improve recruitment forecasting of fish species. Recent advances in Bayesian networks enable the learning of models in which several interrelated variables are forecasted simultaneously. These models are known as multi-dimensional Bayesian network classifiers (MDBNs). Pre-processing steps are critical for the posterior learning of the model in these kinds of domains. Therefore, in the present study, a set of 'state-of-the-art' uni-dimensional pre-processing methods, within the categories of missing data imputation, feature discretization and feature subset selection, are adapted to be used with MDBNs. A framework that includes the proposed multi-dimensional supervised pre-processing methods, coupled with a MDBN classifier, is tested with synthetic datasets and the real domain of fish recruitment forecasting. The rate of correctly forecasting three fish species (anchovy, sardine and hake) simultaneously is doubled (from 17.3% to 29.5%) using the multi-dimensional approach in comparison to mono-species models. The probability assessments also show a large improvement, reducing the average error (estimated by means of the Brier score) from 0.35 to 0.27. Finally, these differences are superior to the forecasting of species by pairs. © 2012 Elsevier Ltd.
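
    As a rough analogue of forecasting several interrelated species at once (Python/scikit-learn; a generic multi-output classifier standing in for the MDBN, with mock data, so this illustrates the problem shape rather than the paper's model):

      import numpy as np
      from sklearn.ensemble import RandomForestClassifier
      from sklearn.model_selection import train_test_split
      from sklearn.multioutput import MultiOutputClassifier

      rng = np.random.default_rng(9)
      X = rng.standard_normal((400, 8))              # environmental covariates
      # Three correlated binary targets: anchovy, sardine, hake recruitment.
      latent = X @ rng.standard_normal((8, 3)) + 0.5 * rng.standard_normal((400, 3))
      Y = (latent > 0).astype(int)

      Xtr, Xte, Ytr, Yte = train_test_split(X, Y, random_state=0)
      model = MultiOutputClassifier(RandomForestClassifier(random_state=0)).fit(Xtr, Ytr)
      pred = model.predict(Xte)
      # The paper's headline metric: all three species correct simultaneously.
      print("all-three-correct rate:", (pred == Yte).all(axis=1).mean().round(3))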

  7. Super-resolution algorithm based on sparse representation and wavelet preprocessing for remote sensing imagery

    Science.gov (United States)

    Ren, Ruizhi; Gu, Lingjia; Fu, Haoyang; Sun, Chenglin

    2017-04-01

    An effective super-resolution (SR) algorithm is proposed for actual spectral remote sensing images based on sparse representation and wavelet preprocessing. The proposed SR algorithm mainly consists of dictionary training and image reconstruction. Wavelet preprocessing is used to establish four subbands, i.e., low frequency, horizontal, vertical, and diagonal high frequency, for an input image. As compared to the traditional approaches involving the direct training of image patches, the proposed approach focuses on the training of features derived from these four subbands. The proposed algorithm is verified using different spectral remote sensing images, e.g., moderate-resolution imaging spectroradiometer (MODIS) images with different bands, and the latest Chinese Jilin-1 satellite images with high spatial resolution. According to the visual experimental results obtained from the MODIS remote sensing data, the SR images using the proposed SR algorithm are superior to those using a conventional bicubic interpolation algorithm or traditional SR algorithms without preprocessing. Fusion algorithms, e.g., standard intensity-hue-saturation, principal component analysis, wavelet transform, and the proposed SR algorithms are utilized to merge the multispectral and panchromatic images acquired by the Jilin-1 satellite. The effectiveness of the proposed SR algorithm is assessed by parameters such as peak signal-to-noise ratio, structural similarity index, correlation coefficient, root-mean-square error, relative dimensionless global error in synthesis, relative average spectral error, spectral angle mapper, and the quality index Q4, and its performance is better than that of the standard image fusion algorithms.
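
    The four subbands come from a single-level 2-D discrete wavelet transform. A minimal sketch (Python with PyWavelets; the Haar wavelet and the random stand-in image are assumed choices):

      import numpy as np
      import pywt

      img = np.random.default_rng(10).random((128, 128))  # stand-in input image

      # One-level 2-D DWT: approximation (low-frequency) plus horizontal,
      # vertical and diagonal high-frequency detail subbands.
      LL, (LH, HL, HH) = pywt.dwt2(img, "haar")
      print(LL.shape, LH.shape, HL.shape, HH.shape)       # each (64, 64)

      # Dictionary-training features can be drawn from the detail subbands
      # rather than from raw image patches, which is the paper's approach.
      features = np.stack([LH, HL, HH], axis=0)
      print(features.shape)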

  8. Desktop Software for Patch-Clamp Raw Binary Data Conversion and Preprocessing

    Directory of Open Access Journals (Sweden)

    Ning Zhang

    2011-01-01

    Since raw data recorded by patch-clamp systems are always stored in binary format, electrophysiologists may experience difficulties with patch-clamp data preprocessing, especially when they want to analyze the data with custom-designed algorithms. In this study, we present desktop software, called PCDReader, which can be an effective and convenient solution for patch-clamp data preprocessing in daily laboratory use. We designed a novel class module, called clsPulseData, to directly read the raw data along with the parameters recorded from HEKA instruments without any other program support. Through a graphical user interface, raw binary data files can be converted into several kinds of ASCII text files for further analysis, with several preprocessing options. The parameters can also be viewed, modified and exported to ASCII files via a user-friendly Explorer-style window. The real-time data loading technique and optimized memory management programming make PCDReader a fast and efficient tool. The compiled software, along with the source code of the clsPulseData class module, is freely available to academic and nonprofit users.
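
    Reading such raw binary recordings is typically a matter of interpreting the byte stream with the right dtype and layout. A generic sketch (Python/NumPy; the int16 dtype, channel count, header size and gain are illustrative assumptions, not HEKA's actual file format):

      import numpy as np

      def read_raw_binary(path, n_channels=2, dtype="<i2", header_bytes=0,
                          gain=3.2e-4):
          """Load an interleaved raw binary recording into (channels, samples).

          dtype "<i2" means little-endian int16; gain converts ADC counts
          to physical units. All layout parameters here are illustrative.
          """
          raw = np.fromfile(path, dtype=dtype, offset=header_bytes)
          raw = raw[: raw.size - raw.size % n_channels]
          return raw.reshape(-1, n_channels).T.astype(np.float64) * gain

      # Round-trip demo with a synthetic file, then ASCII export.
      demo = (np.arange(20) % 7).astype("<i2")
      demo.tofile("demo.raw")
      traces = read_raw_binary("demo.raw")
      np.savetxt("demo.txt", traces.T, fmt="%.6f")
      print(traces.shape)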

  9. Learning-based image preprocessing for robust computer-aided detection

    Science.gov (United States)

    Raghupathi, Laks; Devarakota, Pandu R.; Wolf, Matthias

    2013-03-01

    Recent studies have shown that low-dose computed tomography (LDCT) can be an effective screening tool to reduce lung cancer mortality. Computer-aided detection (CAD) would be a beneficial second reader for radiologists in such cases. Studies demonstrate that while iterative reconstruction (IR) improves LDCT diagnostic quality, it degrades CAD performance significantly (increased false positives) when applied directly. For improving CAD performance, solutions such as retraining with newer data or applying a standard preprocessing technique may not suffice, due to the wide variety of CT scanners and non-uniform acquisition protocols. Here, we present a learning-based framework that can adaptively transform a wide variety of input data to boost an existing CAD's performance. This not only enhances the robustness of CAD tools but also their applicability in clinical workflows. Our solution consists of automatically applying a suitable pre-processing filter to the given image based on its characteristics. This requires the preparation of ground truth (GT) for the choice of an appropriate filter that results in improved CAD performance. Accordingly, we propose an efficient consolidation process with a novel metric. Using key anatomical landmarks, we then derive consistent feature descriptors for the classification scheme, which uses a priority mechanism to automatically choose an optimal preprocessing filter. We demonstrate CAD prototype performance improvement using hospital-scale datasets acquired from North America, Europe and Asia. Though we demonstrate our results for a lung nodule CAD, this scheme is straightforward to extend to other post-processing tools dedicated to other organs and modalities.

  10. Data pre-processing for web log mining: Case study of commercial bank website usage analysis

    Directory of Open Access Journals (Sweden)

    Jozef Kapusta

    2013-01-01

    We use data cleaning, integration, reduction and data conversion methods at the pre-processing level of data analysis. Data processing techniques improve the overall quality of the patterns mined. The paper describes the use of standard pre-processing methods for preparing data from a commercial bank website, in the form of a log file obtained from the web server. Data cleaning, as the simplest step of data pre-processing, is non-trivial here because the analysed content is highly specific. We had to deal with the problem of frequent changes of the content and even frequent changes of the structure. Regular changes in the structure make the use of the sitemap impossible. We present approaches for dealing with this problem, and we were able to create the sitemap dynamically, based solely on the content of the log file. In this case study, we also examined just one part of the website rather than performing the standard analysis of an entire website, as we did not have access to all log files for security reasons. As a result, the traditional practices had to be adapted for this special case. Analysing just a small fraction of the website resulted in short session times for regular visitors. We were not able to use the recommended methods to determine the optimal value of the session time. Therefore, we propose in this paper new methods based on outlier identification for raising the accuracy of the session length.
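
    Sessionization itself is usually a timeout rule over per-visitor request streams. A minimal sketch (Python; the 30-minute threshold is the conventional default that the paper argues can be inappropriate for such short-session traffic):

      from datetime import datetime, timedelta

      def sessionize(requests, timeout=timedelta(minutes=30)):
          """Group (visitor_id, timestamp) pairs into sessions per visitor.

          A new session starts whenever the gap to the previous request of
          the same visitor exceeds the timeout.
          """
          sessions = {}
          last_seen = {}
          for visitor, ts in sorted(requests, key=lambda r: (r[0], r[1])):
              if visitor not in last_seen or ts - last_seen[visitor] > timeout:
                  sessions.setdefault(visitor, []).append([])  # open new session
              sessions[visitor][-1].append(ts)
              last_seen[visitor] = ts
          return sessions

      log = [
          ("10.0.0.1", datetime(2013, 1, 7, 9, 0)),
          ("10.0.0.1", datetime(2013, 1, 7, 9, 10)),
          ("10.0.0.1", datetime(2013, 1, 7, 11, 0)),   # gap > 30 min: new session
          ("10.0.0.2", datetime(2013, 1, 7, 9, 5)),
      ]
      print({v: len(s) for v, s in sessionize(log).items()})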

  11. Flexibility and utility of pre-processing methods in converting STXM setups for ptychography - Final Paper

    Energy Technology Data Exchange (ETDEWEB)

    Fromm, Catherine [SLAC National Accelerator Lab., Menlo Park, CA (United States)

    2015-08-20

    Ptychography is an advanced diffraction-based imaging technique that can achieve resolution of 5 nm and below. It is done by scanning a sample through a beam of focused X-rays using discrete yet overlapping scan steps. Scattering data are collected on a CCD camera, and the phase of the scattered light is reconstructed with sophisticated iterative algorithms. Because the experimental setup is similar, ptychography setups can be created by retrofitting existing STXM beamlines with new hardware. The other challenge comes in the reconstruction of the collected scattering images. Scattering data must be adjusted and packaged with experimental parameters to calibrate the reconstruction software. The necessary pre-processing of data prior to reconstruction is unique to each beamline setup, and even to the optical alignments used on that particular day. Pre-processing software must be developed to be flexible and efficient in order to allow experimenters appropriate control and freedom in the analysis of their hard-won data. This paper describes the implementation of pre-processing software which successfully connects the data collection steps to the reconstruction steps, letting the user accomplish accurate and reliable ptychography.

  12. Evaluating the validity of spectral calibration models for quantitative analysis following signal preprocessing.

    Science.gov (United States)

    Chen, Da; Grant, Edward

    2012-11-01

    When paired with high-powered chemometric analysis, spectrometric methods offer great promise for the high-throughput analysis of complex systems. Effective classification or quantification often relies on signal preprocessing to reduce spectral interference and optimize the apparent performance of a calibration model. However, less frequently addressed by systematic research is the effect of preprocessing on the statistical accuracy of a calibration result. The present work demonstrates the effectiveness of two criteria for validating the performance of signal preprocessing in multivariate models in the important dimensions of bias and precision. To assess the extent of bias, we explore the applicability of the elliptic joint confidence region (EJCR) test and devise a new means to evaluate precision by a bias-corrected root mean square error of prediction. We show how these criteria can effectively gauge the success of signal pretreatments in suppressing spectral interference while providing a straightforward means to determine the optimal level of model complexity. This methodology offers a graphical diagnostic by which to visualize the consequences of pretreatment on complex multivariate models, enabling optimization with greater confidence. To demonstrate the application of the EJCR criterion in this context, we evaluate the validity of representative calibration models using standard pretreatment strategies on three spectral data sets. The results indicate that the proposed methodology facilitates the reliable optimization of a well-validated calibration model, thus improving the capability of spectrophotometric analysis.
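
    The precision criterion can be written down directly: the bias-corrected error is what remains of the RMSEP after removing the systematic offset. A small sketch (Python/NumPy; the decomposition SEP^2 = RMSEP^2 - bias^2 is the standard relation assumed here, and the toy model is illustrative):

      import numpy as np

      def prediction_error_metrics(y_true, y_pred):
          """Split prediction error into bias and bias-corrected precision."""
          residuals = y_pred - y_true
          rmsep = np.sqrt(np.mean(residuals ** 2))         # total prediction error
          bias = residuals.mean()                          # systematic offset
          sep = np.sqrt(np.mean((residuals - bias) ** 2))  # bias-corrected RMSEP
          return rmsep, bias, sep

      rng = np.random.default_rng(11)
      y = rng.uniform(0, 10, 50)
      y_hat = y + 0.4 + rng.normal(0, 0.2, 50)   # biased but fairly precise model
      rmsep, bias, sep = prediction_error_metrics(y, y_hat)
      print(f"RMSEP={rmsep:.3f}  bias={bias:.3f}  SEP={sep:.3f}")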

  13. Characterizing the continuously acquired cardiovascular time series during hemodialysis, using median hybrid filter preprocessing noise reduction.

    Science.gov (United States)

    Wilson, Scott; Bowyer, Andrea; Harrap, Stephen B

    2015-01-01

    The clinical characterization of cardiovascular dynamics during hemodialysis (HD) has important pathophysiological implications from diagnostic, cardiovascular risk assessment, and treatment efficacy perspectives. Currently the diagnosis of significant intradialytic systolic blood pressure (SBP) changes among HD patients is imprecise and opportunistic, reliant upon the presence of hypotensive symptoms in conjunction with coincident but isolated noninvasive brachial cuff blood pressure (NIBP) readings. Considering hemodynamic variables as a time series makes a continuous recording approach more desirable than intermittent measures; however, in the clinical environment, the data signal is susceptible to corruption due to both impulsive and Gaussian-type noise. Signal preprocessing is an attractive solution to this problem. Prospectively collected continuous noninvasive SBP data over the short-break intradialytic period in ten patients were preprocessed using a novel median hybrid filter (MHF) algorithm and compared with 50 time-coincident pairs of intradialytic NIBP measures from routine HD practice. The median hybrid preprocessing technique for continuously acquired cardiovascular data yielded a dynamic regression without significant noise and artifact, suitable for high-level profiling of time-dependent SBP behavior. Signal accuracy is highly comparable with standard NIBP measurement, with the added clinical benefit of dynamic real-time hemodynamic information.
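
    A median hybrid filter replaces each sample by the median of linear sub-filter outputs, which rejects impulsive spikes while following genuine trends. The sketch below (Python/NumPy) implements the basic three-branch FIR-median-hybrid form; it is an illustrative variant, not the authors' exact algorithm, and the window length and toy signal are assumptions:

      import numpy as np

      def median_hybrid_filter(x, half_window=5):
          """FIR-median-hybrid filter: median of (left mean, center, right mean).

          Impulsive artifacts corrupt the center sample but barely move the
          window means, so the median discards them; genuine ramps survive
          because both means follow the trend.
          """
          n = len(x)
          y = x.astype(float).copy()
          for i in range(half_window, n - half_window):
              left = x[i - half_window:i].mean()
              right = x[i + 1:i + 1 + half_window].mean()
              y[i] = np.median([left, x[i], right])
          return y

      rng = np.random.default_rng(12)
      sbp = 120 + 10 * np.sin(np.linspace(0, 3 * np.pi, 600))   # slow SBP drift
      sbp += rng.normal(0, 1.5, 600)                            # Gaussian noise
      sbp[::97] += 60                                           # impulsive artifacts
      filtered = median_hybrid_filter(sbp)
      print("max before/after:", sbp.max().round(1), filtered.max().round(1))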

  14. Foveal processing difficulty does not affect parafoveal preprocessing in young readers

    Science.gov (United States)

    Marx, Christina; Hawelka, Stefan; Schuster, Sarah; Hutzler, Florian

    2017-01-01

    Recent evidence suggested that parafoveal preprocessing develops early during reading acquisition, that is, young readers profit from valid parafoveal information and exhibit a resultant preview benefit. For young readers, however, it is unknown whether the processing demands of the currently fixated word modulate the extent to which the upcoming word is parafoveally preprocessed – as it has been postulated (for adult readers) by the foveal load hypothesis. The present study used the novel incremental boundary technique to assess whether 4th and 6th Graders exhibit an effect of foveal load. Furthermore, we attempted to distinguish the foveal load effect from the spillover effect. These effects are hard to differentiate with respect to the expected pattern of results, but are conceptually different. The foveal load effect is supposed to reflect modulations of the extent of parafoveal preprocessing, whereas the spillover effect reflects the ongoing processing of the previous word whilst the reader’s fixation is already on the next word. The findings revealed that the young readers did not exhibit an effect of foveal load, but a substantial spillover effect. The implications for previous studies with adult readers and for models of eye movement control in reading are discussed. PMID:28139718

  15. HEp-2 Cell Classification: The Role of Gaussian Scale Space Theory as A Pre-processing Approach

    OpenAIRE

    Qi, Xianbiao; Zhao, Guoying; Chen, Jie; Pietikäinen, Matti

    2015-01-01

    Indirect Immunofluorescence Imaging of Human Epithelial Type 2 (HEp-2) cells is an effective way to identify the presence of Anti-Nuclear Antibody (ANA). Most existing works on HEp-2 cell classification mainly focus on feature extraction, feature encoding and classifier design. Very few efforts have been devoted to study the importance of the pre-processing techniques. In this paper, we analyze the importance of the pre-processing, and investigate the role of Gaussian Scale Space (GS...
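
    Gaussian scale space pre-processing simply means analyzing each image at several Gaussian blur levels. A minimal sketch (Python/SciPy; the sigma values and random stand-in image are illustrative):

      import numpy as np
      from scipy.ndimage import gaussian_filter

      def gaussian_scale_space(img, sigmas=(0.0, 1.0, 2.0, 4.0)):
          """Stack progressively blurred copies of an image.

          Feature extraction can then operate at every scale, making the
          representation less sensitive to cell size and staining detail.
          """
          return np.stack([gaussian_filter(img, s) if s > 0 else img.copy()
                           for s in sigmas])

      cell = np.random.default_rng(13).random((64, 64))  # stand-in HEp-2 image
      stack = gaussian_scale_space(cell)
      print(stack.shape)                                  # (4, 64, 64)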

  16. Pre-Processing Effect on the Accuracy of Event-Based Activity Segmentation and Classification through Inertial Sensors

    Directory of Open Access Journals (Sweden)

    Benish Fida

    2015-09-01

    Inertial sensors are increasingly being used to recognize and classify physical activities in a variety of applications. For monitoring and fitness applications, it is crucial to develop methods able to segment each activity cycle, e.g., a gait cycle, so that the successive classification step may be more accurate. To increase detection accuracy, pre-processing is often used, with a concurrent increase in computational cost. In this paper, the effect of pre-processing operations on the detection and classification of locomotion activities was investigated, to check whether the presence of pre-processing significantly contributes to an increase in accuracy. The pre-processing stages evaluated in this study were inclination correction and de-noising. Level walking, step ascending, descending and running were monitored by using a shank-mounted inertial sensor. Raw and filtered segments, obtained from a modified version of a rule-based gait detection algorithm optimized for sequential processing, were processed to extract time- and frequency-based features for physical activity classification through a support vector machine classifier. The proposed method accurately detected >99% of gait cycles from raw data and produced >98% accuracy on these segmented gait cycles. Pre-processing did not substantially increase classification accuracy, thus highlighting the possibility of reducing the amount of pre-processing for real-time applications.

  17. Complex and magnitude-only preprocessing of 2D and 3D BOLD fMRI data at 7 T.

    Science.gov (United States)

    Barry, Robert L; Strother, Stephen C; Gore, John C

    2012-03-01

    A challenge to ultra high field functional magnetic resonance imaging is the predominance of noise associated with physiological processes unrelated to tasks of interest. This degradation in data quality may be partially reversed using a series of preprocessing algorithms designed to retrospectively estimate and remove the effects of these noise sources. However, such algorithms are routinely validated only in isolation, and thus consideration of their efficacies within realistic preprocessing pipelines and on different data sets is often overlooked. We investigate the application of eight possible combinations of three pseudo-complementary preprocessing algorithms - phase regression, Stockwell transform filtering, and retrospective image correction - to suppress physiological noise in 2D and 3D functional data at 7 T. The performance of each preprocessing pipeline was evaluated using data-driven metrics of reproducibility and prediction. The optimal preprocessing pipeline for both 2D and 3D functional data included phase regression, Stockwell transform filtering, and retrospective image correction. This result supports the hypothesis that a complex preprocessing pipeline is preferable to a magnitude-only pipeline, and suggests that functional magnetic resonance imaging studies should retain complex images and externally monitor subjects' respiratory and cardiac cycles so that these supplementary data may be used to retrospectively reduce noise and enhance overall data quality.

  18. EARLINET Single Calculus Chain - technical - Part 1: Pre-processing of raw lidar data

    Science.gov (United States)

    D'Amico, Giuseppe; Amodeo, Aldo; Mattis, Ina; Freudenthaler, Volker; Pappalardo, Gelsomina

    2016-02-01

    In this paper we describe an automatic tool for the pre-processing of aerosol lidar data called ELPP (EARLINET Lidar Pre-Processor). It is one of two calculus modules of the EARLINET Single Calculus Chain (SCC), the automatic tool for the analysis of EARLINET data. ELPP is an open source module that executes instrumental corrections and data handling of the raw lidar signals, making the lidar data ready to be processed by the optical retrieval algorithms. According to the specific lidar configuration, ELPP automatically performs dead-time correction, atmospheric and electronic background subtraction, gluing of lidar signals, and trigger-delay correction. Moreover, the signal-to-noise ratio of the pre-processed signals can be improved by means of configurable time integration of the raw signals and/or spatial smoothing. ELPP delivers the statistical uncertainties of the final products by means of error propagation or Monte Carlo simulations. During the development of ELPP, particular attention has been paid to making the tool flexible enough to handle all lidar configurations currently used within the EARLINET community. Moreover, it has been designed in a modular way to allow an easy extension to lidar configurations not yet implemented. The primary goal of ELPP is to enable the application of quality-assured procedures in the lidar data analysis starting from the raw lidar data. This provides the added value of full traceability of each delivered lidar product. Several tests have been performed to check the proper functioning of ELPP. The whole SCC has been tested with the same synthetic data sets, which were used for the EARLINET algorithm inter-comparison exercise. ELPP has been successfully employed for the automatic near-real-time pre-processing of the raw lidar data measured during several EARLINET inter-comparison campaigns as well as during intense field campaigns.
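
    Of the corrections listed, dead-time correction has a classic closed form for a non-paralyzable photon-counting channel: n = m / (1 - m*tau), where m is the measured count rate and tau the dead time. The sketch below (Python/NumPy; the 4 ns dead time and the background-window length are illustrative assumptions, and ELPP's actual implementation may differ) also shows background subtraction from signal-free far-range bins:

      import numpy as np

      def deadtime_correct(rate_mhz, tau_ns=4.0):
          """Non-paralyzable dead-time correction: n = m / (1 - m * tau)."""
          m = np.asarray(rate_mhz, dtype=float)   # measured rate, counts per microsecond
          tau_us = tau_ns * 1e-3                  # dead time in microseconds
          return m / (1.0 - m * tau_us)           # estimated true rate (MHz)

      def subtract_background(profile, bg_bins=500):
          """Remove atmospheric/electronic background estimated from the
          far (signal-free) range gates of a lidar profile."""
          return profile - profile[-bg_bins:].mean()

      measured = np.array([1.0, 10.0, 50.0, 100.0])   # MHz
      print(deadtime_correct(measured).round(2))      # correction grows with rate

      profile = np.exp(-np.linspace(0, 5, 2000)) + 0.02
      print(round(float(subtract_background(profile)[-1]), 4))  # near 0 at far range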

  19. AN ENHANCED PRE-PROCESSING RESEARCH FRAMEWORK FOR WEB LOG DATA USING A LEARNING ALGORITHM

    Directory of Open Access Journals (Sweden)

    V.V.R. Maheswara Rao

    2011-01-01

    With the continued growth and proliferation of Web services and Web-based information systems, the volumes of user data have reached astronomical proportions. Before analyzing such data using web mining techniques, the web log has to be pre-processed, integrated and transformed. As the World Wide Web is continuously and rapidly growing, it is necessary for web miners to utilize intelligent tools in order to find, extract, filter and evaluate the desired information. The data pre-processing stage is the most important phase in the investigation of web user usage behaviour. To do this, one must extract only the human user accesses from the weblog data, which is critical and complex. The web log is incremental in nature, thus conventional data pre-processing techniques have proved unsuitable, and an extensive learning algorithm is required in order to get the desired information. This paper introduces an extensive research framework capable of pre-processing web log data completely and efficiently. The learning algorithm of the proposed research framework can separate human user and search engine accesses intelligently, in less time. In order to create suitable target data, the further essential pre-processing tasks of data cleansing, user identification, sessionization and path completion are designed collectively. The framework reduces the error rate and significantly improves the learning performance of the algorithm. The work ensures the goodness of splits by using popular measures like entropy and the Gini index. This framework helps to investigate web user usage behaviour efficiently. Experimental results supporting this claim are given in this paper.

  20. EARLINET Single Calculus Chain – technical – Part 1: Pre-processing of raw lidar data

    Directory of Open Access Journals (Sweden)

    G. D'Amico

    2015-10-01

    In this paper we describe an automatic tool for the pre-processing of lidar data called ELPP (EARLINET Lidar Pre-Processor). It is one of two calculus modules of the EARLINET Single Calculus Chain (SCC), the automatic tool for the analysis of EARLINET data. The ELPP is an open source module that executes instrumental corrections and data handling of the raw lidar signals, making the lidar data ready to be processed by the optical retrieval algorithms. According to the specific lidar configuration, the ELPP automatically performs dead-time correction, atmospheric and electronic background subtraction, gluing of lidar signals, and trigger-delay correction. Moreover, the signal-to-noise ratio of the pre-processed signals can be improved by means of configurable time integration of the raw signals and/or spatial smoothing. The ELPP delivers the statistical uncertainties of the final products by means of error propagation or Monte Carlo simulations. During the development of the ELPP module, particular attention has been paid to making the tool flexible enough to handle all lidar configurations currently used within the EARLINET community. Moreover, it has been designed in a modular way to allow an easy extension to lidar configurations not yet implemented. The primary goal of the ELPP module is to enable the application of quality-assured procedures in the lidar data analysis starting from the raw lidar data. This provides the added value of full traceability of each delivered lidar product. Several tests have been performed to check the proper functioning of the ELPP module. The whole SCC has been tested with the same synthetic data sets, which were used for the EARLINET algorithm inter-comparison exercise. The ELPP module has been successfully employed for the automatic near-real-time pre-processing of the raw lidar data measured during several EARLINET inter-comparison campaigns as well as during intense field campaigns.

  1. Classification-based comparison of pre-processing methods for interpretation of mass spectrometry generated clinical datasets

    Directory of Open Access Journals (Sweden)

    Hoefsloot Huub CJ

    2009-05-01

    Background: Mass spectrometry is increasingly being used to discover proteins or protein profiles associated with disease. Experimental design of mass-spectrometry studies has come under close scrutiny and the importance of strict protocols for sample collection is now understood. However, the question of how best to process the large quantities of data generated is still unanswered. The main challenges for the analysis are the choice of proper pre-processing and classification methods. While these two issues have been investigated in isolation, we propose to use the classification of patient samples as a clinically relevant benchmark for the evaluation of pre-processing methods. Results: Two in-house generated clinical SELDI-TOF MS datasets are used in this study as examples of high-throughput mass-spectrometry data. We perform a systematic comparison of two commonly used pre-processing methods as implemented in Ciphergen ProteinChip Software and in the Cromwell package. With respect to reproducibility, Ciphergen and Cromwell pre-processing are largely comparable. We find that the overlap between peaks detected by either Ciphergen ProteinChip Software or Cromwell is large. This is especially the case for the more stringent peak detection settings. Moreover, the similarity of the estimated intensities between matched peaks is high. We evaluate the pre-processing methods using five different classification methods. Classification is done in a double cross-validation protocol using repeated random sampling to obtain an unbiased estimate of classification accuracy. No pre-processing method significantly outperforms the other for all peak detection settings evaluated. Conclusion: We use classification of patient samples as a clinically relevant benchmark for the evaluation of pre-processing methods. Both pre-processing methods lead to similar classification results on an ovarian cancer and a Gaucher disease dataset. However, the settings for pre-processing

  2. Comparative Evaluation of Preprocessing Freeware on Chromatography/Mass Spectrometry Data for Signature Discovery

    Energy Technology Data Exchange (ETDEWEB)

    Coble, Jamie B.; Fraga, Carlos G.

    2014-07-07

    Preprocessing software is crucial for the discovery of chemical signatures in metabolomics, chemical forensics, and other signature-focused disciplines that involve analyzing large data sets from chemical instruments. Here, four freely available and published preprocessing tools known as metAlign, MZmine, SpectConnect, and XCMS were evaluated for impurity profiling using nominal mass GC/MS data and accurate mass LC/MS data. Both data sets were previously collected from the analysis of replicate samples from multiple stocks of a nerve-agent precursor. Each of the four tools had their parameters set for the untargeted detection of chromatographic peaks from impurities present in the stocks. The peak table generated by each preprocessing tool was analyzed to determine the number of impurity components detected in all replicate samples per stock. A cumulative set of impurity components was then generated using all available peak tables and used as a reference to calculate the percent of component detections for each tool, in which 100% indicated the detection of every component. For the nominal mass GC/MS data, metAlign performed the best followed by MZmine, SpectConnect, and XCMS with detection percentages of 83, 60, 47, and 42%, respectively. For the accurate mass LC/MS data, the order was metAlign, XCMS, and MZmine with detection percentages of 80, 45, and 35%, respectively. SpectConnect did not function for the accurate mass LC/MS data. Larger detection percentages were obtained by combining the top performer with at least one of the other tools, such as 96% by combining metAlign with MZmine for the GC/MS data and 93% by combining metAlign with XCMS for the LC/MS data. In terms of quantitative performance, the reported peak intensities had average absolute biases of 41, 4.4, 1.3 and 1.3% for SpectConnect, metAlign, XCMS, and MZmine, respectively, for the GC/MS data. For the LC/MS data, the average absolute biases were 22, 4.5, and 3.1% for metAlign, MZmine, and XCMS, respectively.

  3. A Multi-channel Pre-processing Circuit for Signals from Thermocouple/Thermistor

    Institute of Scientific and Technical Information of China (English)

    2001-01-01

    In this paper, a newly developed multi-channel pre-processing circuit for signals from temperature sensors is introduced in brief. This circuit was developed to collect and amplify the signals from temperature sensors. It is a universal circuit: it can be used to process the signals from thermocouples and also the signals from thermistors. This circuit was mounted in a standard box (440 W × 405 D × 125 H mm) as an instrument. The

  4. Experimental examination of similarity measures and preprocessing methods used for image registration

    Science.gov (United States)

    Svedlow, M.; Mcgillem, C. D.; Anuta, P. E.

    1976-01-01

    The criterion used to measure the similarity between images and thus find the position where the images are registered is examined. The three similarity measures considered are the correlation coefficient, the sum of the absolute differences, and the correlation function. Three basic types of preprocessing are then discussed: taking the magnitude of the gradient of the images, thresholding the images at their medians, and thresholding the magnitude of the gradient of the images at an arbitrary level to be determined experimentally. These multitemporal registration techniques are applied to remote imagery of agricultural areas.
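
    The first two similarity measures are one-liners evaluated over candidate offsets. A sketch (Python/NumPy, with a toy template search; the normalization and exhaustive scan are common conventions assumed here, not the paper's exact procedure):

      import numpy as np

      def correlation_coefficient(a, b):
          """Pearson correlation between two equally sized image patches."""
          a, b = a - a.mean(), b - b.mean()
          return (a * b).sum() / np.sqrt((a * a).sum() * (b * b).sum())

      def sum_abs_diff(a, b):
          """Sum of absolute differences (smaller means more similar)."""
          return np.abs(a - b).sum()

      rng = np.random.default_rng(14)
      scene = rng.random((64, 64))
      patch = scene[20:36, 28:44]                 # true offset (20, 28)

      best = max(
          ((dy, dx) for dy in range(49) for dx in range(49)),
          key=lambda p: correlation_coefficient(
              scene[p[0]:p[0] + 16, p[1]:p[1] + 16], patch),
      )
      print("registered offset:", best)           # expect (20, 28)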

  5. Preprocessing for Optimization of Probabilistic-Logic Models for Sequence Analysis

    DEFF Research Database (Denmark)

    Christiansen, Henning; Lassen, Ole Torp

    2009-01-01

    ... and approximation are needed. The first steps are taken towards a methodology for optimizing such models by approximations, using auxiliary models for preprocessing or splitting them into submodels. Evaluation of such approximating models is challenging, as authoritative test data may be sparse. On the other hand, the original complex models may be used for generating artificial evaluation data by efficient sampling, which can be used in the evaluation, although it does not constitute a foolproof test procedure. These models and evaluation processes are illustrated in the PRISM system developed by other authors, and we ...

  6. Combined principal component preprocessing and n-tuple neural networks for improved classification

    DEFF Research Database (Denmark)

    Høskuldsson, Agnar; Linneberg, Christian

    2000-01-01

    We present a combined principal component analysis/neural network scheme for classification. The data used to illustrate the method consist of spectral fluorescence recordings from seven different production facilities, and the task is to relate an unknown sample to one of these seven factories. The data are first preprocessed by performing an individual principal component analysis on each of the seven groups of data. The components found are then used for classifying the data, but instead of making a single multiclass classifier, we follow the idea of turning a multiclass problem into a number of two-class problems.
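
    A sketch of the scheme's first stage (Python/scikit-learn, with mock spectra; using each group's PCA reconstruction error as a simple class score is an illustrative assumption, not the paper's n-tuple network):

      import numpy as np
      from sklearn.decomposition import PCA

      rng = np.random.default_rng(15)
      # Mock spectra: 7 factories, each with its own low-dimensional structure.
      groups = [rng.standard_normal((40, 3)) @ rng.standard_normal((3, 50))
                + rng.standard_normal(50) for _ in range(7)]

      # Fit one PCA per factory: the individual preprocessing step.
      models = [PCA(n_components=3).fit(g) for g in groups]

      def classify(sample):
          """Assign the factory whose PCA subspace reconstructs the sample best."""
          errors = [np.linalg.norm(
                        sample - m.inverse_transform(m.transform(sample[None]))[0])
                    for m in models]
          return int(np.argmin(errors))

      test = groups[4][0]
      print("predicted factory:", classify(test))  # typically 4 for this data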

  7. Fast randomized point location without preprocessing in two- and three-dimensional Delaunay triangulations

    Energy Technology Data Exchange (ETDEWEB)

    Muecke, E.P.; Saias, I.; Zhu, B.

    1996-05-01

    This paper studies the point location problem in Delaunay triangulations without preprocessing and additional storage. The proposed procedure finds the query point simply by walking through the triangulation, after selecting a good starting point by random sampling. The analysis generalizes and extends a recent result for d = 2 dimensions by proving that this procedure takes expected time close to O(n^(1/(d+1))) for point location in Delaunay triangulations of n random points in d = 3 dimensions. Empirical results in both two and three dimensions show that this procedure is efficient in practice.

  8. Interest rate prediction: a neuro-hybrid approach with data preprocessing

    Science.gov (United States)

    Mehdiyev, Nijat; Enke, David

    2014-07-01

    This research implements a differential evolution-based fuzzy-type clustering method with a fuzzy inference neural network, after input preprocessing with regression analysis, in order to predict future interest rates, particularly 3-month T-bill rates. The empirical results of the proposed model are compared against nonparametric models, such as locally weighted regression and least squares support vector machines, along with two linear benchmark models, the autoregressive model and the random walk model. The root mean square error is reported for comparison.

  9. Reservoir computing with a slowly modulated mask signal for preprocessing using a mutually coupled optoelectronic system

    Science.gov (United States)

    Tezuka, Miwa; Kanno, Kazutaka; Bunsen, Masatoshi

    2016-08-01

    Reservoir computing is a machine-learning paradigm based on information processing in the human brain. We numerically demonstrate reservoir computing with a slowly modulated mask signal for preprocessing, using a mutually coupled optoelectronic system. The performance of our system is quantitatively evaluated on a chaotic time series prediction task. Our system can produce performance comparable to that of reservoir computing with a single feedback system and a fast modulated mask signal, showing that the mutually coupled system makes it possible to slow down the modulation speed of the mask signal in reservoir computing.

  10. Comparative evaluation of preprocessing freeware on chromatography/mass spectrometry data for signature discovery.

    Science.gov (United States)

    Coble, Jamie B; Fraga, Carlos G

    2014-09-01

    Preprocessing software, which converts large instrumental data sets into a manageable format for data analysis, is crucial for the discovery of chemical signatures in metabolomics, chemical forensics, and other signature-focused disciplines. Here, four freely available and published preprocessing tools known as MetAlign, MZmine, SpectConnect, and XCMS were evaluated for impurity profiling using nominal mass GC/MS data and accurate mass LC/MS data. Both data sets were previously collected from the analysis of replicate samples from multiple stocks of a nerve-agent precursor and method blanks. Parameters were optimized for each of the four tools for the untargeted detection, matching, and cataloging of chromatographic peaks from impurities present in the stock samples. The peak table generated by each preprocessing tool was analyzed to determine the number of impurity components detected in all replicate samples per stock and absent in the method blanks. A cumulative set of impurity components was then generated using all available peak tables and used as a reference to calculate the percent of component detections for each tool, in which 100% indicated the detection of every known component present in a stock. For the nominal mass GC/MS data, MetAlign had the most component detections followed by MZmine, SpectConnect, and XCMS with detection percentages of 83, 60, 47, and 41%, respectively. For the accurate mass LC/MS data, the order was MetAlign, XCMS, and MZmine with detection percentages of 80, 45, and 35%, respectively. SpectConnect did not function for the accurate mass LC/MS data. Larger detection percentages were obtained by combining the top performer with at least one of the other tools such as 96% by combining MetAlign with MZmine for the GC/MS data and 93% by combining MetAlign with XCMS for the LC/MS data. In terms of quantitative performance, the reported peak intensities from each tool had averaged absolute biases (relative to peak intensities obtained

  11. Computer-assisted bone age assessment: image preprocessing and epiphyseal/metaphyseal ROI extraction.

    Science.gov (United States)

    Pietka, E; Gertych, A; Pospiech, S; Cao, F; Huang, H K; Gilsanz, V

    2001-08-01

    Clinical assessment of skeletal maturity is based on a visual comparison of a left-hand wrist radiograph with atlas patterns. Using a new digital hand atlas, an image analysis methodology is being developed to assist radiologists in bone age estimation. The analysis starts with a preprocessing function yielding epiphyseal/metaphyseal regions of interest (EMROIs). Then, these regions are subjected to a feature extraction function. Accuracy has been measured independently at three stages of the image analysis: detection of the phalangeal tip, extraction of the EMROIs, and location of the diameters and lower edge of the EMROIs. The extracted features describe the stage of skeletal development more objectively than visual comparison.

  12. Mapping of electrical potentials from the chest surface - preprocessing and visualization

    Directory of Open Access Journals (Sweden)

    Vaclav Chudacek

    2005-01-01

    The aim of the paper is to present current research activity in the area of computer-supported ECG processing. Analysis of the heart's electric field based on the standard 12-lead system is at present the most frequently used method of heart disease diagnostics. However, body surface potential mapping (BSPM), which measures electric potentials from several tens to hundreds of electrodes placed on the thorax surface, has in certain cases higher diagnostic value, because it collects data in areas that are inaccessible to the standard 12-lead ECG. For preprocessing, the wavelet transform is used; it allows significant features of the ECG signal to be detected. Several types of maps are presented, namely immediate potential, integral, isochronous, and differential maps.
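
    A minimal sketch of wavelet-based ECG preprocessing, assuming the PyWavelets package (the wavelet family, decomposition level, and thresholding rule are illustrative choices, not those of the paper):

        import numpy as np
        import pywt

        def wavelet_denoise(ecg, wavelet="db4", level=4):
            """Soft-threshold the detail coefficients of a multilevel DWT,
            a common way to suppress noise before detecting ECG features."""
            coeffs = pywt.wavedec(ecg, wavelet, level=level)
            # Universal threshold estimated from the finest detail band.
            sigma = np.median(np.abs(coeffs[-1])) / 0.6745
            thr = sigma * np.sqrt(2.0 * np.log(len(ecg)))
            coeffs[1:] = [pywt.threshold(c, thr, mode="soft") for c in coeffs[1:]]
            return pywt.waverec(coeffs, wavelet)[: len(ecg)]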

  13. Experimental evaluation of video preprocessing algorithms for automatic target hand-off

    Science.gov (United States)

    McIngvale, P. H.; Guyton, R. D.

    It is pointed out that the Automatic Target Hand-Off Correlator (ATHOC) hardware has been modified to permit operation in a nonreal-time mode as a programmable laboratory test unit using video recordings as inputs and allowing several preprocessing algorithms to be software programmable. In parallel with this hardware modification effort, an analysis and simulation effort has been underway to help determine which of the many available preprocessing algorithms should be implemented in the ATHOC software. It is noted that videotapes from a current technology airborne target acquisition system and an imaging infrared missile seeker were recorded and used in the laboratory experiments. These experiments are described and the results are presented. A set of standard parameters is found for each case. Consideration of the background in the target scene is found to be important. Analog filter cutoff frequencies of 2.5 MHz for low pass and 300 kHz for high pass are found to give best results. EPNC = 1 is found to be slightly better than EPNC = 0. It is also shown that trilevel gives better results than bilevel.

  14. Automated cleaning and pre-processing of immunoglobulin gene sequences from high-throughput sequencing

    Directory of Open Access Journals (Sweden)

    Miri eMichaeli

    2012-12-01

    High-throughput sequencing (HTS) yields tens of thousands to millions of sequences that require a large amount of pre-processing work to clean various artifacts. Such cleaning cannot be performed manually. Existing programs are not suitable for immunoglobulin (Ig) genes, which are variable and often highly mutated. This paper describes Ig-HTS-Cleaner (Ig High Throughput Sequencing Cleaner), a program containing a simple cleaning procedure that successfully deals with pre-processing of Ig sequences derived from HTS, and Ig-Indel-Identifier (Ig Insertion-Deletion Identifier), a program for identifying legitimate and artifact insertions and/or deletions (indels). Our programs were designed for analyzing Ig gene sequences obtained by 454 sequencing, but they are applicable to all types of sequences and sequencing platforms. Ig-HTS-Cleaner and Ig-Indel-Identifier have been implemented in Java and saved as executable JAR files, supported on Linux and MS Windows. No special requirements are needed in order to run the programs, except for correctly constructing the input files as explained in the text. The programs' performance has been tested and validated on real and simulated data sets.

  15. Preprocessing of A-scan GPR data based on energy features

    Science.gov (United States)

    Dogan, Mesut; Turhan-Sayan, Gonul

    2016-05-01

    There is an increasing demand for noninvasive real-time detection and classification of buried objects in various civil and military applications. The problem of detection and annihilation of landmines is particularly important due to strong safety concerns. The requirement for a fast real-time decision process is as important as the requirements for high detection rates and low false alarm rates. In this paper, we introduce and demonstrate a computationally simple, time-efficient, energy-based preprocessing approach that can be used in ground penetrating radar (GPR) applications to eliminate reflections from the air-ground boundary and to locate the buried objects, simultaneously, in one easy step. The instantaneous power signals, the total energy values and the cumulative energy curves are extracted from the A-scan GPR data. The cumulative energy curves, in particular, are shown to be useful for detecting the presence and location of buried objects in a fast and simple way while preserving the spectral content of the original A-scan data for further steps of physics-based target classification. The proposed method is demonstrated using GPR data collected at the outdoor test lanes of IPA Defense, Ankara. Cylindrically shaped plastic containers were buried in fine-medium sand to simulate buried landmines. These plastic containers were half-filled with ammonium nitrate including metal pins. The results of this pilot study are highly promising and motivate further research on the use of energy-based preprocessing features in the landmine detection problem.
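
    The energy features described above reduce to a few lines of NumPy; this sketch (the input signal is a placeholder) computes the instantaneous power, total energy, and cumulative energy curve of a single A-scan:

        import numpy as np

        def energy_features(ascan):
            """Instantaneous power, total energy, and normalized cumulative
            energy curve of a 1-D A-scan; a knee in the cumulative curve
            after the air-ground reflection hints at a buried object."""
            power = ascan.astype(float) ** 2               # instantaneous power
            total_energy = power.sum()                     # total energy
            cumulative = np.cumsum(power) / total_energy   # cumulative energy curve
            return power, total_energy, cumulative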

  16. Selections of data preprocessing methods and similarity metrics for gene cluster analysis

    Institute of Scientific and Technical Information of China (English)

    YANG Chunmei; WAN Baikun; GAO Xiaofeng

    2006-01-01

    Clustering is one of the major exploratory techniques for gene expression data analysis. Results of high quality can be obtained in cluster analysis only with suitable similarity metrics and properly preprocessed datasets. In this study, gene expression datasets with external evaluation criteria were preprocessed by normalization by line, normalization by column, or base-2 logarithm transformation, and were subsequently clustered by hierarchical clustering, k-means clustering and self-organizing maps (SOMs), with the Pearson correlation coefficient or Euclidean distance as similarity metric. Finally, the quality of the clusters was evaluated by the adjusted Rand index. The results illustrate that k-means clustering and SOMs have distinct advantages over hierarchical clustering in gene clustering, and SOMs are a bit better than k-means when randomly initialized. It also shows that hierarchical clustering prefers the Pearson correlation coefficient as similarity metric and datasets normalized by line, while k-means clustering and SOMs produce better clusters with Euclidean distance and logarithm-transformed datasets. These results provide a valuable reference for the implementation of gene expression cluster analysis.
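
    A minimal scikit-learn sketch of this kind of comparison on synthetic data (the dataset, cluster count, and seeds are placeholders; the preprocessing variants mirror the ones named above):

        import numpy as np
        from sklearn.cluster import KMeans
        from sklearn.metrics import adjusted_rand_score

        rng = np.random.default_rng(0)
        X = np.vstack([rng.normal(loc=m, scale=5, size=(30, 12))
                       for m in (50, 100, 200)])    # fake expression matrix
        labels_true = np.repeat([0, 1, 2], 30)      # external evaluation criterion

        variants = {
            "normalized by line": (X - X.mean(axis=1, keepdims=True)) / X.std(axis=1, keepdims=True),
            "normalized by column": (X - X.mean(axis=0)) / X.std(axis=0),
            "log2 transformed": np.log2(X),
        }
        for name, Xp in variants.items():
            labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(Xp)
            print(name, round(adjusted_rand_score(labels_true, labels), 3))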

  17. A Technical Review on Biomass Processing: Densification, Preprocessing, Modeling and Optimization

    Energy Technology Data Exchange (ETDEWEB)

    Jaya Shankar Tumuluru; Christopher T. Wright

    2010-06-01

    It is now a well-acclaimed fact that burning fossil fuels and deforestation are major contributors to climate change. Biomass from plants can serve as an alternative renewable and carbon-neutral raw material for the production of bioenergy. Low densities of 40–60 kg/m3 for lignocellulosic and 200–400 kg/m3 for woody biomass limit their application for energy purposes. Prior to use in energy applications, these materials need to be densified. Densified biomass can have bulk densities over 10 times that of the raw material, helping to significantly reduce the technical limitations associated with storage, loading and transportation. Pelleting, briquetting, and extrusion processing are commonly used methods for densification. The aim of the present research is to develop a comprehensive review of biomass processing that includes densification, preprocessing, modeling and optimization. The specific objectives include carrying out a technical review on (a) mechanisms of particle bonding during densification; (b) methods of densification including extrusion, briquetting, pelleting, and agglomeration; (c) effects of process and feedstock variables and biomass biochemical composition on densification; (d) effects of preprocessing such as grinding, preheating, steam explosion, and torrefaction on biomass quality and binding characteristics; (e) models for understanding the compression characteristics; and (f) procedures for response surface modeling and optimization.

  18. [Research on preprocessing method of near-infrared spectroscopy detection of coal ash calorific value].

    Science.gov (United States)

    Zhang, Lin; Lu, Hui-Shan; Yan, Hong-Wei; Gao, Qiang; Wang, Fu-Jie

    2013-12-01

    The calorific value of coal ash is an important indicator for evaluating coal quality. In the experiment, the effects of spectral preprocessing methods such as smoothing, differentiation, multiplicative scatter correction (MSC) and standard normal variate (SNV) on improving the signal-to-noise ratio of the near-infrared diffuse reflection spectra were analyzed first; then partial least squares (PLS) and principal component regression (PCR) were used to establish calorific value models of coal ash for the spectra processed with each preprocessing method. It was found that model performance can be obviously improved with 5-point smoothing, MSC and SNV, of which 5-point smoothing has the best effect: the correlation coefficient, calibration standard deviation and prediction standard deviation are 0.9899, 0.00049 and 0.00052, respectively. When 25-point smoothing is adopted, over-smoothing occurs and worsens model performance, while the model established with differentially preprocessed spectra shows no obvious change, so the influence of differentiation on the model is small.
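
    Two of the preprocessing steps named above, standard normal variate and k-point smoothing, are simple to state in NumPy; this sketch is illustrative rather than the paper's implementation:

        import numpy as np

        def snv(spectrum):
            """Standard normal variate: center and scale each spectrum."""
            return (spectrum - spectrum.mean()) / spectrum.std()

        def smooth(spectrum, k=5):
            """k-point moving-average smoothing (k odd); too large a k
            over-smooths and degrades the calibration model."""
            kernel = np.ones(k) / k
            return np.convolve(spectrum, kernel, mode="same")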

  19. Satellite Dwarf Galaxies in a Hierarchical Universe: Infall Histories, Group Preprocessing, and Reionization

    CERN Document Server

    Wetzel, Andrew R; Garrison-Kimmel, Shea

    2015-01-01

    In the Local Group, almost all satellite dwarf galaxies that are within the virial radius of the Milky Way (MW) and M31 exhibit strong environmental influence. The orbital histories of these satellites provide the key to understanding the role of the MW/M31 halo, lower-mass groups, and cosmic reionization on the evolution of dwarf galaxies. We examine the virial-infall histories of satellites with M_star = 10^{3-9} M_sun using the ELVIS suite of cosmological zoom-in dissipationless simulations of 48 MW/M31-like halos. Satellites at z = 0 fell into the MW/M31 halos typically 5 - 8 Gyr ago at z = 0.5 - 1. However, they first fell into any host halo typically 7 - 10 Gyr ago at z = 0.7 - 1.5. This difference arises because many satellites experienced "group preprocessing" in another host halo, typically of M_vir ~ 10^{10-12} M_sun, before falling into the MW/M31 halos. Lower-mass satellites and/or those closer to the MW/M31 fell in earlier and are more likely to have experienced group preprocessing; ...

  20. Tactile on-chip pre-processing with techniques from artificial retinas

    Science.gov (United States)

    Maldonado-Lopez, R.; Vidal-Verdu, F.; Linan, G.; Roca, E.; Rodriguez-Vazquez, A.

    2005-06-01

    The interest in tactile sensors is increasing as their use in complex unstructured environments is demanded, as in telepresence, minimally invasive surgery, robotics, etc. The matrix of pressure data these devices provide can be managed with many image processing algorithms to extract the required information. However, as in the case of vision chips or artificial retinas, problems arise when the array size and the computational complexity increase. Looking at the skin, the information collected by every mechanoreceptor is not carried to the brain for processing; instead, some complex pre-processing is performed to fit the limited throughput of the nervous system. This is especially important for tasks demanding high bandwidth. Experimental work reports that the neural response of skin mechanoreceptors encodes the change in local shape from an offset level rather than the absolute force or pressure distributions. This is also the behavior of the retina, which implements a spatio-temporal averaging. We propose the same strategy in tactile preprocessing, and we show preliminary results for slip detection, which requires fast real-time processing.

  1. Using a Web Crawler to Collect Tweets with a Text Mining Pre-Processing Method

    Directory of Open Access Journals (Sweden)

    Bayu Rima Aditya

    2015-11-01

    The volume of data on social media is now very large, but it is still rarely exploited or processed into something of practical value; one example is the tweets on the social network Twitter. This paper describes the use of a web crawler engine with a text mining pre-processing method. The web crawler engine collects tweets through the Twitter API as unstructured text data, which are then re-presented in web form. The pre-processing method filters the tweets in three stages: cleansing, case folding, and parsing. The application designed in this study was developed with the waterfall software development model and implemented in the PHP programming language; black-box testing was used to check whether the design runs as expected. The result of this research is an application that turns the collected tweets into data ready for further processing according to the user's needs, based on search keywords and dates. This is relevant because several related studies show that social media data, Twitter in particular, have become a target for companies and institutions seeking to understand public opinion.
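
    The three pre-processing stages named in the abstract (cleansing, case folding, parsing) can be sketched in a few lines; the paper's application is written in PHP, so this Python version only illustrates the stages, with illustrative regular expressions:

        import re

        def cleanse(tweet):
            """Remove URLs, @mentions, hashtags, and punctuation noise."""
            tweet = re.sub(r"https?://\S+", " ", tweet)
            tweet = re.sub(r"[@#]\w+", " ", tweet)
            return re.sub(r"[^0-9A-Za-z\s]", " ", tweet)

        def case_fold(tweet):
            """Lower-case everything so tokens compare consistently."""
            return tweet.lower()

        def parse(tweet):
            """Split the cleaned text into tokens for later mining."""
            return tweet.split()

        print(parse(case_fold(cleanse("Great product! http://t.co/xyz @acme #happy"))))
        # ['great', 'product']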

  2. Review of Intelligent Techniques Applied for Classification and Preprocessing of Medical Image Data

    Directory of Open Access Journals (Sweden)

    H S Hota

    2013-01-01

    Medical image data like ECG, EEG, MRI and CT-scan images are the most important means of diagnosing human disease precisely and are widely used by physicians; problems can be clearly identified with the help of these medical images, and a robust model can classify the medical image data well. In this paper, intelligent techniques such as neural networks and fuzzy logic are explored for MRI medical image data to identify tumors in the human brain, and the need for preprocessing of medical image data is examined. Classification techniques have been used extensively in the field of medical imaging. The conventional method in medical science for classifying medical image data is human inspection, which may sometimes misclassify data; this kind of problem identification is also impractical for large amounts of data and for noisy data. Noisy data may be produced by technical faults of the machine or by human error and can lead to misclassification of medical image data. We have collected a number of papers based on neural networks and fuzzy logic, along with hybrid techniques, to explore the efficiency and robustness of models for brain MRI data. The analysis indicates that an intelligent model together with data preprocessing using principal component analysis (PCA) and segmentation may be the most competitive model in this domain.

  3. Statistical Downscaling Output GCM Modeling with Continuum Regression and Pre-Processing PCA Approach

    Directory of Open Access Journals (Sweden)

    Sutikno Sutikno

    2010-08-01

    One of the climate models used to predict climatic conditions is the Global Circulation Model (GCM). A GCM is a computer-based model consisting of different equations; it uses numerical, deterministic equations that follow the laws of physics. GCMs are the main tools for predicting climate and weather, and they serve as a primary information source for assessing the effects of climate change. The Statistical Downscaling (SD) technique is used to bridge the large-scale GCM and the small scale of a study area. GCM data are spatial and temporal, so spatial correlation between grid points within a single domain is likely, and the resulting multicollinearity requires pre-processing of the predictor variables X. Continuum Regression (CR) with pre-processing by Principal Component Analysis (PCA) is an alternative for SD modelling. CR, developed by Stone and Brooks (1990), is a generalization of the Ordinary Least Squares (OLS), Principal Component Regression (PCR) and Partial Least Squares (PLS) methods, used to overcome multicollinearity problems. Data processing for the stations at Ambon, Pontianak, Losarang, Indramayu and Yuntinyuat shows that, in terms of RMSEP and predictive R2 on the 8x8 and 12x12 domains, the CR method produces better results than PCR and PLS.
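
    As a minimal illustration of pre-processing collinear predictors with PCA before regression (principal component regression, one of the special cases that CR generalizes), assuming scikit-learn and synthetic data rather than actual GCM output:

        import numpy as np
        from sklearn.decomposition import PCA
        from sklearn.linear_model import LinearRegression
        from sklearn.pipeline import make_pipeline

        rng = np.random.default_rng(0)
        base = rng.normal(size=(200, 4))
        # 64 highly collinear GCM-like grid predictors driven by 4 factors.
        X = np.hstack([base + 0.01 * rng.normal(size=(200, 4)) for _ in range(16)])
        y = base @ np.array([1.0, -2.0, 0.5, 3.0]) + 0.1 * rng.normal(size=200)

        # PCA strips the collinearity; regression is fit on the leading components.
        pcr = make_pipeline(PCA(n_components=4), LinearRegression()).fit(X, y)
        print(round(pcr.score(X, y), 3))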

  4. Fast data preprocessing with Graphics Processing Units for inverse problem solving in light-scattering measurements

    Science.gov (United States)

    Derkachov, G.; Jakubczyk, T.; Jakubczyk, D.; Archer, J.; Woźniak, M.

    2017-07-01

    Utilising the Compute Unified Device Architecture (CUDA) platform for Graphics Processing Units (GPUs) enables a significant reduction of computation time at moderate cost by means of parallel computing. In the paper [Jakubczyk et al., Opto-Electron. Rev., 2016] we reported using a GPU for Mie scattering inverse problem solving (up to 800-fold speed-up). Here we report the development of two subroutines utilising the GPU at the data preprocessing stages of the inversion procedure: (i) a subroutine, based on ray tracing, for finding the spherical aberration correction function, and (ii) a subroutine performing the conversion of an image to a 1D distribution of light intensity versus azimuth angle (i.e. a scattering diagram), fed from a movie-reading CPU subroutine running in parallel. All subroutines are incorporated in the PikeReader application, which we make available in a GitHub repository. PikeReader returns a sequence of intensity distributions versus a common azimuth angle vector, corresponding to the recorded movie. We obtained an overall ~400-fold speed-up of calculations at the data preprocessing stages using CUDA code running on the GPU in comparison to single-thread MATLAB-only code running on the CPU.
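
    The second subroutine's task, collapsing an image to intensity versus azimuth angle, can be expressed compactly on the CPU with NumPy (the image center and number of bins are parameters of this sketch); the paper's contribution is doing this on the GPU with CUDA:

        import numpy as np

        def scattering_diagram(image, center, n_bins=360):
            """Average pixel intensity in azimuth-angle bins around
            `center` = (cx, cy), producing a 1-D scattering diagram."""
            h, w = image.shape
            yy, xx = np.mgrid[0:h, 0:w]
            theta = np.arctan2(yy - center[1], xx - center[0])    # -pi..pi
            bins = ((theta + np.pi) / (2 * np.pi) * n_bins).astype(int) % n_bins
            sums = np.bincount(bins.ravel(), weights=image.ravel(), minlength=n_bins)
            counts = np.bincount(bins.ravel(), minlength=n_bins)
            return sums / np.maximum(counts, 1)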

  5. Evaluation of preprocessing, mapping and postprocessing algorithms for analyzing whole genome bisulfite sequencing data.

    Science.gov (United States)

    Tsuji, Junko; Weng, Zhiping

    2016-11-01

    Cytosine methylation regulates many biological processes such as gene expression, chromatin structure and chromosome stability. The whole genome bisulfite sequencing (WGBS) technique measures the methylation level at each cytosine throughout the genome. There are an increasing number of publicly available pipelines for analyzing WGBS data, reflecting many choices of read mapping algorithms as well as preprocessing and postprocessing methods. We simulated single-end and paired-end reads based on three experimental data sets, and comprehensively evaluated 192 combinations of three preprocessing, five postprocessing and five widely used read mapping algorithms. We also compared paired-end data with single-end data at the same sequencing depth for performance of read mapping and methylation level estimation. Bismark and LAST were the most robust mapping algorithms. We found that Mott trimming and quality filtering individually improved the performance of both read mapping and methylation level estimation, but combining them did not lead to further improvement. Furthermore, we confirmed that paired-end sequencing reduced error rate and enhanced sensitivity for both read mapping and methylation level estimation, especially for short reads and in repetitive regions of the human genome.

  6. Data Acquisition and Preprocessing in Studies on Humans: What Is Not Taught in Statistics Classes?

    Science.gov (United States)

    Zhu, Yeyi; Hernandez, Ladia M; Mueller, Peter; Dong, Yongquan; Forman, Michele R

    2013-01-01

    The aim of this paper is to address issues in research that may be missing from statistics classes and important for (bio-)statistics students. In the context of a case study, we discuss data acquisition and preprocessing steps that fill the gap between research questions posed by subject matter scientists and statistical methodology for formal inference. Issues include participant recruitment, data collection training and standardization, variable coding, data review and verification, data cleaning and editing, and documentation. Despite the critical importance of these details in research, most of these issues are rarely discussed in an applied statistics program. One reason for the lack of more formal training is the difficulty in addressing the many challenges that can possibly arise in the course of a study in a systematic way. This article can help to bridge this gap between research questions and formal statistical inference by using an illustrative case study for a discussion. We hope that reading and discussing this paper and practicing data preprocessing exercises will sensitize statistics students to these important issues and achieve optimal conduct, quality control, analysis, and interpretation of a study.

  7. A data preprocessing strategy for metabolomics to reduce the mask effect in data analysis.

    Science.gov (United States)

    Yang, Jun; Zhao, Xinjie; Lu, Xin; Lin, Xiaohui; Xu, Guowang

    2015-01-01

    Highlights: (i) a data preprocessing strategy was developed to cope with missing values and the mask effects that high variation of abundant metabolites causes in data analysis; (ii) a new method, 'x-VAST', was developed to amend the enlargement of measurement deviation; (iii) applying the strategy, several low-abundant masked differential metabolites were rescued. Metabolomics is a booming research field. Its success relies heavily on the discovery of differential metabolites by comparing different data sets (for example, patients vs. controls). One of the challenges is that differences in the low-abundant metabolites between groups are often masked by the high variation of the abundant metabolites. In order to meet this challenge, a novel data preprocessing strategy consisting of three steps was proposed in this study. In step 1, a 'modified 80%' rule is used to reduce the effect of missing values; in step 2, unit-variance and Pareto scaling are used to reduce the mask effect from the abundant metabolites; in step 3, in order to fix the adverse effects of scaling, stability information of the variables, deduced from the intensity information and the class information, is used to assign suitable weights to the variables. When applied to an LC/MS-based metabolomics dataset from a chronic hepatitis B patient study and two simulated datasets, the mask effect was found to be partially eliminated and several new low-abundant differential metabolites were rescued.
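
    Unit-variance and Pareto scaling (step 2) differ only in the divisor; a minimal NumPy sketch, with the 'modified 80%' rule reduced to an illustrative presence filter (the exact rule in the paper may differ):

        import numpy as np

        def presence_filter(X, groups, frac=0.8):
            """Keep a metabolite (column) if it is measured (non-NaN) in
            at least `frac` of the samples of some group -- one reading
            of an '80% rule' for missing values."""
            keep = np.zeros(X.shape[1], dtype=bool)
            for g in np.unique(groups):
                keep |= (~np.isnan(X[groups == g])).mean(axis=0) >= frac
            return X[:, keep]

        def scale(X, mode="pareto"):
            """Unit-variance scaling divides by s; Pareto scaling by
            sqrt(s), which shrinks abundant metabolites less aggressively."""
            mu, s = np.nanmean(X, axis=0), np.nanstd(X, axis=0)
            return (X - mu) / (s if mode == "uv" else np.sqrt(s))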

  8. Effective Preprocessing Procedures Virtually Eliminate Distance-Dependent Motion Artifacts in Resting State FMRI.

    Science.gov (United States)

    Jo, Hang Joon; Gotts, Stephen J; Reynolds, Richard C; Bandettini, Peter A; Martin, Alex; Cox, Robert W; Saad, Ziad S

    2013-05-21

    Artifactual sources of resting-state (RS) FMRI can originate from head motion, physiology, and hardware. Of these sources, motion has received considerable attention and was found to induce corrupting effects by differentially biasing correlations between regions depending on their distance. Numerous corrective approaches have relied on the identification and censoring of high-motion time points and on the use of the brain-wide average time series as a nuisance regressor to which the data are orthogonalized (Global Signal Regression, GSReg). We first replicate the previously reported head-motion bias on correlation coefficients using data generously contributed by Power et al. (2012). We then show that while motion can be the source of artifact in correlations, the distance-dependent bias, taken to be a manifestation of the motion effect on correlation, is exacerbated by the use of GSReg. Put differently, correlation estimates obtained after GSReg are more susceptible to the presence of motion and, by extension, to the level of censoring. More generally, the effect of motion on correlation estimates depends on the preprocessing steps leading to the correlation estimate, with certain approaches performing markedly worse than others. For this purpose, we consider various models for RS FMRI preprocessing and show that WMeLOCAL, a subset of the ANATICOR denoising approach discussed by Jo et al. (2010), results in minimal sensitivity to motion and, by extension, reduces the dependence of correlation results on censoring.

  9. Data preprocessing method for liquid chromatography-mass spectrometry based metabolomics.

    Science.gov (United States)

    Wei, Xiaoli; Shi, Xue; Kim, Seongho; Zhang, Li; Patrick, Jeffrey S; Binkley, Joe; McClain, Craig; Zhang, Xiang

    2012-09-18

    A set of data preprocessing algorithms for peak detection and peak list alignment is reported for the analysis of liquid chromatography-mass spectrometry (LC-MS)-based metabolomics data. For spectrum deconvolution, peak picking is achieved at the extracted ion chromatogram (XIC) level. To estimate and remove the noise in XICs, each XIC is first segmented into several peak groups based on the continuity of scan number, and the noise level is estimated from all the XIC signals except the regions where metabolite ion peaks are potentially present. After removing noise, the peaks of molecular ions are detected using both the first and the second derivatives, followed by an efficient exponentially-modified-Gaussian-based peak deconvolution method for peak fitting. A two-stage alignment algorithm is also developed, in which the retention times of all peaks are first transferred into the z-score domain and the peaks are aligned based on the measure of their mixture scores after retention time correction using partial linear regression. Analysis of a set of spike-in LC-MS data from three groups of samples, containing 16 metabolite standards mixed with metabolite extract from mouse livers, demonstrates that the developed data preprocessing method performs better than two existing popular data analysis packages, MZmine 2.6 and XCMS2, for peak picking, peak list alignment, and quantification.
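
    A toy version of derivative-based peak detection on a single XIC, with the noise level estimated robustly from the signal (the thresholds and the MAD estimator are choices of this sketch, not of the paper):

        import numpy as np

        def detect_peaks(xic, snr=3.0):
            """Flag local maxima: the first derivative changes sign from
            + to - and the intensity clears a robust noise threshold."""
            noise = 1.4826 * np.median(np.abs(xic - np.median(xic)))  # MAD estimate
            d1 = np.diff(xic)
            apex = np.where((d1[:-1] > 0) & (d1[1:] <= 0))[0] + 1
            return apex[xic[apex] > snr * noise]

        t = np.linspace(0, 10, 500)
        xic = np.exp(-((t - 3) ** 2) / 0.02) + 0.5 * np.exp(-((t - 7) ** 2) / 0.05)
        xic += 0.02 * np.random.default_rng(1).normal(size=t.size)
        print(t[detect_peaks(xic)])   # retention times near 3 and 7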

  10. A Lightweight Data Preprocessing Strategy with Fast Contradiction Analysis for Incremental Classifier Learning

    Directory of Open Access Journals (Sweden)

    Simon Fong

    2015-01-01

    A prime objective in constructing data streaming mining models is to achieve good accuracy, fast learning, and robustness to noise. Although many techniques have been proposed in the past, efforts to improve the accuracy of classification models have been somewhat disparate. These techniques include, but are not limited to, feature selection, dimensionality reduction, and the removal of noise from training data. One limitation common to all of these techniques is the assumption that the full training dataset must be applied. Although this has been effective for traditional batch training, it may not be practical for incremental classifier learning, also known as data stream mining, where only a single pass over the data stream is seen at a time. Because data streams can be effectively unbounded (the so-called big data phenomenon), data preprocessing time must be kept to a minimum. This paper introduces a new data preprocessing strategy suitable for the progressive purging of noisy data from the training dataset without the need to process the whole dataset at one time. This strategy is shown via computer simulation to provide the significant benefit of allowing for the dynamic removal of bad records from the incremental classifier learning process.

  11. Robust symmetrical number system preprocessing for minimizing encoding errors in photonic analog-to-digital converters

    Science.gov (United States)

    Arvizo, Mylene R.; Calusdian, James; Hollinger, Kenneth B.; Pace, Phillip E.

    2011-08-01

    A photonic analog-to-digital converter (ADC) preprocessing architecture based on the robust symmetrical number system (RSNS) is presented. The RSNS preprocessing architecture is a modular scheme in which a modulus number of comparators are used at the output of each Mach-Zehnder modulator channel. The number of comparators with a logic 1 in each channel represents the integer values within each RSNS modulus sequence. When considered together, the integers within each sequence change one at a time at the next code position, resulting in an integer Gray code property. The RSNS ADC has the feature that the maximum nonlinearity is less than a least significant bit (LSB). Although the observed dynamic range (greatest length of combined sequences that contain no ambiguities) of the RSNS ADC is less than the optimum symmetrical number system ADC, the integer Gray code properties make it attractive for error control. A prototype is presented to demonstrate the feasibility of the concept and to show the important RSNS property that the largest nonlinearity is always less than a LSB. Also discussed are practical considerations related to multi-gigahertz implementations.

  12. Effective Preprocessing Procedures Virtually Eliminate Distance-Dependent Motion Artifacts in Resting State FMRI

    Directory of Open Access Journals (Sweden)

    Hang Joon Jo

    2013-01-01

    Artifactual sources of resting-state (RS) FMRI can originate from head motion, physiology, and hardware. Of these sources, motion has received considerable attention and was found to induce corrupting effects by differentially biasing correlations between regions depending on their distance. Numerous corrective approaches have relied on the identification and censoring of high-motion time points and on the use of the brain-wide average time series as a nuisance regressor to which the data are orthogonalized (Global Signal Regression, GSReg). We replicate the previously reported head-motion bias on correlation coefficients and then show that while motion can be the source of artifact in correlations, the distance-dependent bias is exacerbated by GSReg. Put differently, correlation estimates obtained after GSReg are more susceptible to the presence of motion and, by extension, to the level of censoring. More generally, the effect of motion on correlation estimates depends on the preprocessing steps leading to the correlation estimate, with certain approaches performing markedly worse than others. For this purpose, we consider various models for RS FMRI preprocessing and show that the local white matter regressor (WMeLOCAL), a subset of ANATICOR, results in minimal sensitivity to motion and, by extension, reduces the dependence of correlation results on censoring.

  13. MODIStsp: An R package for automatic preprocessing of MODIS Land Products time series

    Science.gov (United States)

    Busetto, L.; Ranghetti, L.

    2016-12-01

    MODIStsp is a new R package that automates the creation of raster time series derived from MODIS Land Products. It performs several preprocessing steps (e.g. download, mosaicking, reprojection and resizing) on MODIS products for a selected time period and area. All processing parameters can be set through a user-friendly GUI, allowing users to select which specific layers of the original MODIS HDF files have to be processed and which Quality Indicators have to be extracted from the aggregated MODIS Quality Assurance layers. Moreover, the tool allows on-the-fly computation of time series of Spectral Indexes (either standard or custom-specified by the user through the GUI) from surface reflectance bands. Outputs are saved as single-band rasters corresponding to each available acquisition date and output layer. Virtual files allowing easy access to the entire time series as a single file with common image processing/GIS software or R scripts can also be created. Non-interactive execution within an R script and stand-alone execution outside an R environment exploiting a previously created Options File are also possible; the latter allows scheduling MODIStsp runs so that a time series is automatically updated whenever a new image is available. The proposed software constitutes a very useful tool for the Remote Sensing community, since it allows performing all the main preprocessing steps required for the creation of MODIS time series within a common framework, without requiring any particular programming skills of its users.

  14. A preprocessing tool for removing artifact from cardiac RR interval recordings using three-dimensional spatial distribution mapping.

    Science.gov (United States)

    Stapelberg, Nicolas J C; Neumann, David L; Shum, David H K; McConnell, Harry; Hamilton-Craig, Ian

    2016-04-01

    Artifact is common in cardiac RR interval data recorded for heart rate variability (HRV) analysis. A novel algorithm for artifact detection and interpolation in RR interval data is described. It is based on spatial distribution mapping of RR interval magnitude and relationships to adjacent values in three dimensions. The characteristics of normal physiological RR intervals and artifact intervals were established using 24-h recordings from 20 technician-assessed human cardiac recordings. The algorithm was incorporated into a preprocessing tool and validated using 30 artificial RR (ARR) interval data files, to which known quantities of artifact (0.5%, 1%, 2%, 3%, 5%, 7%, 10%) were added. The impact of preprocessing ARR files with 1% added artifact was also assessed using 10 time domain and frequency domain HRV metrics. The preprocessing tool was then used to preprocess 69 24-h human cardiac recordings. The tool was able to remove artifact from the technician-assessed human cardiac recordings (sensitivity 0.84, SD = 0.09; specificity 1.00, SD = 0.01) and from the artificial data files. The removal of artifact had a low impact on time domain and frequency domain HRV metrics (0% to 2.5% change in values). This novel preprocessing tool can be used with human 24-h cardiac recordings to remove artifact while minimally affecting physiological data, and therefore has a low impact on the HRV measures of that data.
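
    The paper's detector is built on three-dimensional spatial distribution mapping; as a much simpler stand-in, this sketch flags RR intervals that deviate strongly from a running median and patches them by linear interpolation (the window and threshold are arbitrary choices of this sketch):

        import numpy as np

        def clean_rr(rr, window=11, tol=0.3):
            """Flag RR intervals deviating more than `tol` (fractional)
            from a running median, then linearly interpolate over the
            flagged beats. A crude stand-in for 3-D distribution mapping."""
            half = window // 2
            padded = np.pad(rr, half, mode="edge")
            med = np.array([np.median(padded[i:i + window]) for i in range(len(rr))])
            bad = np.abs(rr - med) > tol * med
            idx = np.arange(len(rr))
            cleaned = rr.astype(float).copy()
            cleaned[bad] = np.interp(idx[bad], idx[~bad], rr[~bad])
            return cleaned, bad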

  15. Increasing conclusiveness of metabonomic studies by chem-informatic preprocessing of capillary electrophoretic data on urinary nucleoside profiles.

    Science.gov (United States)

    Szymańska, E; Markuszewski, M J; Capron, X; van Nederkassel, A-M; Heyden, Y Vander; Markuszewski, M; Krajka, K; Kaliszan, R

    2007-01-17

    Nowadays, bioinformatics offers advanced tools and procedures of data mining aimed at finding consistent patterns or systematic relationships between variables. Numerous metabolite concentrations can readily be determined in a given biological system by high-throughput analytical methods. However, such raw analytical data comprise noninformative components due to the many disturbances normally occurring in the analysis of biological samples. Advanced chemometric data preprocessing methods can help eliminate those unwanted components from the original analytical data. Here, such methods are applied to electrophoretic nucleoside profiles in urine samples of cancer patients and healthy volunteers. The electrophoretic nucleoside profiles were obtained under the following conditions: 100 mM borate, 72.5 mM phosphate, 160 mM SDS, pH 6.7; 25 kV voltage, 30 degrees C temperature; untreated fused silica capillary of 70 cm effective length and 50 microm I.D. Several of the most advanced preprocessing tools were applied for baseline correction, denoising and alignment of the electrophoretic data, and this approach was compared to the standard procedure of electrophoretic peak integration. The best preprocessing results were obtained after application of the so-called correlation optimized warping (COW) to align the data. Principal component analysis (PCA) of the preprocessed data provides clearly better consistency of the nucleoside electrophoretic profiles with the health status of the subjects than PCA of peak areas of the original data (without preprocessing).

  16. Evaluating the reliability of different preprocessing steps to estimate graph theoretical measures in resting state fMRI data.

    Science.gov (United States)

    Aurich, Nathassia K; Alves Filho, José O; Marques da Silva, Ana M; Franco, Alexandre R

    2015-01-01

    With resting-state functional MRI (rs-fMRI) there are a variety of post-processing methods that can be used to quantify the human brain connectome. However, there is also a choice of which preprocessing steps will be used prior to calculating the functional connectivity of the brain. In this manuscript, we tested seven different preprocessing schemes and assessed the reliability between, and reproducibility within, the various strategies by means of graph theoretical measures. The preprocessing schemes were tested on a publicly available dataset that includes rs-fMRI data of healthy controls. The brain was parcellated into 190 nodes and four graph theoretical (GT) measures were calculated: global efficiency (GEFF), characteristic path length (CPL), average clustering coefficient (ACC), and average local efficiency (ALE). Our findings indicate that results can differ significantly depending on which preprocessing steps are selected. We also found a dependence between motion and GT measurements in most preprocessing strategies. We conclude that using censoring based on outliers within the functional time series as a preprocessing step increases the reliability of GT measurements and reduces their dependency on head motion.
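
    The four graph measures are all standard; a minimal sketch with NetworkX on a thresholded random correlation matrix (the parcellation size matches the abstract, but the time series, threshold, and binarization are placeholders):

        import numpy as np
        import networkx as nx

        rng = np.random.default_rng(0)
        ts = rng.normal(size=(200, 190))               # time series for 190 nodes
        corr = np.corrcoef(ts.T)
        adj = (corr > 0.1) & ~np.eye(190, dtype=bool)  # threshold into a binary graph
        G = nx.from_numpy_array(adj.astype(int))

        print("GEFF:", nx.global_efficiency(G))
        print("ACC:", nx.average_clustering(G))
        print("ALE:", nx.local_efficiency(G))
        if nx.is_connected(G):                         # CPL needs a connected graph
            print("CPL:", nx.average_shortest_path_length(G))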

  17. On image pre-processing for PIV of single- and two-phase flows over reflecting objects

    Energy Technology Data Exchange (ETDEWEB)

    Deen, Niels G.; Willems, Paul; Sint Annaland, Martin van; Kuipers, J.A.M.; Lammertink, Rob G.H.; Kemperman, Antoine J.B.; Wessling, Matthias; Meer, Walter G.J. van der [University of Twente, Faculty of Science and Technology, Institute of Mechanics, Processes and Control Twente (IMPACT), Enschede (Netherlands)

    2010-08-15

    A novel image pre-processing scheme for PIV of single- and two-phase flows over reflecting objects, which does not require the use of additional hardware, is discussed. The approach for single-phase flow consists of image normalization and intensity stretching followed by background subtraction. For two-phase flow, an additional masking step is added after the background subtraction. The effectiveness of the pre-processing scheme is shown for two examples: PIV of single-phase flow in spacer-filled channels, and of two-phase flow in these channels. The pre-processing scheme increased the displacement peak detectability significantly and produced high-quality vector fields, without the use of additional hardware.

  18. A simpler method of preprocessing MALDI-TOF MS data for differential biomarker analysis: stem cell and melanoma cancer studies

    Directory of Open Access Journals (Sweden)

    Tong Dong L

    2011-09-01

    Introduction: Raw spectral data from matrix-assisted laser desorption/ionisation time-of-flight (MALDI-TOF) MS profiling techniques usually contain complex information not readily providing biological insight into disease. The association of identified features within raw data to a known peptide is extremely difficult. Data preprocessing to remove uncertainty characteristics in the data is normally required before performing any further analysis. This study proposes an alternative yet simple solution to preprocess raw MALDI-TOF-MS data for identification of candidate marker ions. Two in-house MALDI-TOF-MS data sets from two different sample sources (melanoma serum and cord blood plasma) are used in our study. Method: Raw MS spectral profiles were preprocessed using the proposed approach to identify peak regions in the spectra. The preprocessed data were then analysed using bespoke machine learning algorithms for data reduction and ion selection. Using the selected ions, an ANN-based predictive model was constructed to examine the predictive power of these ions for classification. Results: Our model identified 10 candidate marker ions for both data sets. These ion panels achieved over 90% classification accuracy on blind validation data. Receiver operating characteristic analysis was performed, and the area under the curve for the melanoma and cord blood classifiers was 0.991 and 0.986, respectively. Conclusion: The results suggest that our data preprocessing technique removes unwanted characteristics of the raw data while preserving the predictive components of the data. Ion identification analysis can be carried out on MALDI-TOF-MS data with the proposed data preprocessing technique coupled with bespoke algorithms for data reduction and ion selection.

  19. Preprocessing and Quality Control Strategies for Illumina DASL Assay-Based Brain Gene Expression Studies with Semi-Degraded Samples.

    Science.gov (United States)

    Chow, Maggie L; Winn, Mary E; Li, Hai-Ri; April, Craig; Wynshaw-Boris, Anthony; Fan, Jian-Bing; Fu, Xiang-Dong; Courchesne, Eric; Schork, Nicholas J

    2012-01-01

    Available statistical preprocessing or quality control analysis tools for gene expression microarray datasets are known to greatly affect downstream data analysis, especially when degraded samples, unique tissue samples, or novel expression assays are used. It is therefore important to assess the validity and impact of the assumptions built in to preprocessing schemes for a dataset. We developed and assessed a data preprocessing strategy for use with the Illumina DASL-based gene expression assay with partially degraded postmortem prefrontal cortex samples. The samples were obtained from individuals with autism as part of an investigation of the pathogenic factors contributing to autism. Using statistical analysis methods and metrics such as those associated with multivariate distance matrix regression and mean inter-array correlation, we developed a DASL-based assay gene expression preprocessing pipeline to accommodate and detect problems with microarray-based gene expression values obtained with degraded brain samples. Key steps in the pipeline included outlier exclusion, data transformation and normalization, and batch effect and covariate corrections. Our goal was to produce a clean dataset for subsequent downstream differential expression analysis. We ultimately settled on available transformation and normalization algorithms in the R/Bioconductor package lumi based on an assessment of their use in various combinations. A log2-transformed, quantile-normalized, and batch and seizure-corrected procedure was likely the most appropriate for our data. We empirically tested different components of our proposed preprocessing strategy and believe that our results suggest that a preprocessing strategy that effectively identifies outliers, normalizes the data, and corrects for batch effects can be applied to all studies, even those pursued with degraded samples.
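
    Quantile normalization, one of the pipeline steps settled on above, has a compact NumPy form (a generic sketch, not the lumi implementation used in the paper; ties are handled crudely):

        import numpy as np

        def quantile_normalize(X):
            """Force every array (column) to share the same empirical
            distribution: rank each column, then replace each rank with
            the mean of the values holding that rank across columns."""
            ranks = np.argsort(np.argsort(X, axis=0), axis=0)
            mean_of_sorted = np.sort(X, axis=0).mean(axis=1)
            return mean_of_sorted[ranks]

        X = np.log2(1 + np.random.default_rng(0).gamma(2.0, 50.0, size=(1000, 6)))
        Xn = quantile_normalize(X)
        print(np.allclose(np.sort(Xn[:, 0]), np.sort(Xn[:, 1])))  # True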

  20. From Voids to Coma: the prevalence of pre-processing in the local Universe

    CERN Document Server

    Cybulski, Ryan; Fazio, Giovanni G; Gutermuth, Robert A

    2014-01-01

    We examine the effects of pre-processing across the Coma Supercluster, including 3505 galaxies over 500 sq deg, by quantifying the degree to which star-forming (SF) activity is quenched as a function of environment. We characterise environment using the complementary techniques of Voronoi Tessellation, to measure the density field, and the Minimal Spanning Tree, to define continuous structures; we thus measure SF activity as a function of local density and of the type of environment (cluster, group, filament, and void), and quantify the contribution of environment to the quenching of SF activity. Our sample covers over two orders of magnitude in stellar mass (10^8.5 to 10^11 Msun), and consequently we trace the effects of environment on SF activity for dwarf and massive galaxies, distinguishing so-called "mass quenching" from "environment quenching". Environmentally driven quenching of SF activity, measured relative to the void galaxies, occurs to progressively greater degrees in filaments, groups, and...

  1. Joint preprocessor-based detector for cooperative networks with limited hardware processing capability

    KAUST Repository

    Abuzaid, Abdulrahman I.

    2015-02-01

    In this letter, a joint detector for cooperative communication networks is proposed for the case where the destination has limited hardware processing capability. The transmitter sends its symbols with the help of L relays. As the destination has limited hardware, only U out of the L signals are processed and the energy of the remaining relays is lost. To solve this problem, a joint preprocessing-based detector is proposed, which operates on the principle of minimizing the symbol error rate (SER). For a realistic assessment, pilot-symbol-aided channel estimation is incorporated into the proposed detector. From our simulations, it can be observed that the proposed detector achieves the same SER performance as the maximum likelihood (ML) detector with all participating relays. Additionally, our detector outperforms selection combining (SC), the channel shortening (CS) scheme and reduced-rank techniques when using the same U. The proposed scheme has low computational complexity.

  2. DUAL CHANNEL SPEECH ENHANCEMENT USING HADAMARD-LMS ALGORITHM WITH DCT PREPROCESSING TECHNIQUE

    Directory of Open Access Journals (Sweden)

    D.DEEPA,

    2010-09-01

    Speech enhancement and noise reduction have wide applications in speech processing and are often employed as a pre-processing stage in various applications. Two points often need to be considered in signal de-noising: eliminating the undesired noise from the signal, to improve the signal-to-noise ratio (SNR), and preserving the shape and characteristics of the original signal. Background noise in a speech signal reduces speech intelligibility for people with hearing loss, especially for patients with sensorineural loss. The proposed algorithm combines the Hadamard least mean square (LMS) algorithm with a DCT pre-processing technique to improve the SNR and to reduce the mean square error (MSE). The DCT is separable and has the energy compaction property. Although the DCT does not separate frequencies, it is a powerful signal decorrelator; it is also a real-valued transform and thus can be used effectively in real-time operation.
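
    A minimal sketch of dual-channel noise cancellation with DCT pre-processing, using SciPy: the reference-channel tap vector is decorrelated by a DCT and power-normalized before the adaptive update. The filter length, step size, and the plain transform-domain LMS (in place of the paper's Hadamard-structured variant) are simplifications of this sketch.

        import numpy as np
        from scipy.fft import dct

        def dct_lms(primary, reference, n_taps=16, mu=0.1):
            """Transform-domain LMS noise canceller: the error signal e
            (primary minus the filtered noise reference) is the enhanced
            speech; DCT decorrelation speeds up LMS convergence."""
            w = np.zeros(n_taps)
            p = np.ones(n_taps)                 # running power per DCT bin
            e = np.zeros(len(primary))
            for n in range(n_taps, len(primary)):
                u = dct(reference[n - n_taps:n], norm="ortho")
                p = 0.99 * p + 0.01 * u * u     # track coefficient power
                e[n] = primary[n] - w @ u
                w += mu * e[n] * u / p          # power-normalized LMS update
            return e

        rng = np.random.default_rng(0)
        speech = np.sin(2 * np.pi * 0.01 * np.arange(8000))
        noise = rng.normal(size=speech.size)
        primary = speech + np.convolve(noise, [0.6, 0.3, 0.1], mode="same")
        enhanced = dct_lms(primary, noise)
        print(round(np.mean((enhanced[1000:] - speech[1000:]) ** 2), 4))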

  3. Synthetic aperture radar image correlation by use of preprocessing for enhancement of scattering centers.

    Science.gov (United States)

    Khoury, J; Gianino, P D; Woods, C L

    2000-10-15

    We demonstrate that a significant improvement can be obtained in the recognition of complicated synthetic aperture radar images taken from the Moving and Stationary Target Acquisition and Recognition database. These images typically have a low number of scattering centers and high noise. We first preprocess the images, and the templates formed from them, so that their scattering centers are enhanced. Our technique can produce high-quality performance under several correlation criteria. For realistic automatic target recognition systems, our approach should make it easy to implement optical recognition systems with binarized data for many different types of correlation filter, and should greatly ease the feeding of data-compressed (binarized) information into either digital or optical processors.

  4. [Sample preprocessing method for residual quinolones in honey using immunoaffinity resin].

    Science.gov (United States)

    Ihara, Yoshiharu; Kato, Mihoko; Kodaira, Tsukasa; Itoh, Shinji; Terakawa, Mika; Horie, Masakazu; Saito, Koichi; Nakazawa, Hiroyuki

    2009-06-01

    A sample preparation method was developed for determination of quinolones in honey using immunoaffinity resin. For this purpose, an immunoaffinity resin for quinolones was prepared by coupling a quinolone-specific monoclonal antibody to agarose resin. Honey samples diluted with phosphate buffer were reacted with immunoaffinity resin. After the resin was washed, quinolones were eluted with glycine-HCl. Quinolones in the eluate were determined by HPLC with fluorescence detection. No interfering peak was found on the chromatograms of honey samples. The recoveries of quinolones from samples were over 70% at fortification levels of 20 ng/g (for norfloxacin, ciprofloxacin and enrofloxacin) and 10 ng/g (for danofloxacin). The quantification limits of quinolones were 2 ng/g. This sample preprocessing method using immunoaffinity resin was found to be effective and suitable for determining residual quinolones in honey.

  5. Base resolution methylome profiling: considerations in platform selection, data preprocessing and analysis.

    Science.gov (United States)

    Sun, Zhifu; Cunningham, Julie; Slager, Susan; Kocher, Jean-Pierre

    2015-08-01

    Bisulfite treatment-based methylation microarrays (mainly the Illumina 450K Infinium array) and next-generation sequencing (reduced representation bisulfite sequencing, Agilent SureSelect Human Methyl-Seq, NimbleGen SeqCap Epi CpGiant, or whole-genome bisulfite sequencing) are commonly used for base-resolution DNA methylome research. Although multiple tools and methods have been developed and used for data preprocessing and analysis, confusion remains regarding these platforms, including whether and how the 450K array should be normalized, which platform best fits a researcher's needs, and which statistical models are more appropriate for differential methylation analysis. This review presents the commonly used platforms and compares the pros and cons of each in methylome profiling. We then discuss approaches to study design, data normalization, bias correction and model selection for differentially methylated individual CpGs and regions.

  6. Feasibility investigation of integrated optics Fourier transform devices. [holographic subtraction for multichannel data preprocessing

    Science.gov (United States)

    Verber, C. M.; Vahey, D. W.; Wood, V. E.; Kenan, R. P.; Hartman, N. F.

    1977-01-01

    The possibility of producing an integrated optics data processing device based upon Fourier transformations or other parallel processing techniques, and the ways in which such techniques may be used to upgrade the performance of present and projected NASA systems were investigated. Activities toward this goal include; (1) production of near-diffraction-limited geodesic lenses in glass waveguides; (2) development of grinding and polishing techniques for the production of geodesic lenses in LiNbO3 waveguides; (3) development of a characterization technique for waveguide lenses; and (4) development of a theory for corrected aspheric geodesic lenses. A holographic subtraction system was devised which should be capable of rapid on-board preprocessing of a large number of parallel data channels. The principle involved is validated in three demonstrations.

  7. Data preprocessing method for fluorescence molecular tomography using a priori information provided by CT.

    Science.gov (United States)

    Fu, Jianwei; Yang, Xiaoquan; Meng, Yuanzheng; Luo, Qingming; Gong, Hui

    2012-01-01

    The combined system of micro-CT and fluorescence molecular tomography (FMT) offers a new tool providing anatomical and functional information about small animals in a single study. To take advantage of the combined system, a data preprocessing method is proposed to extract the valid data for FMT reconstruction algorithms using a priori information provided by CT. The boundary information of the animal and the animal holder is extracted from the reconstructed CT volume data. A ray tracing method is used to trace the path of the excitation beam, calculate the locations and directions of the optional sources, and determine whether those sources are valid. To accurately calculate the projections of the detectors on the optical images and judge their validity, a combination of perspective projection and inverse ray tracing is adopted to offer optimal performance. The imaging performance of the combined system with the presented method is validated through experimental rat imaging.

  8. Analyzing ChIP-seq data: preprocessing, normalization, differential identification, and binding pattern characterization.

    Science.gov (United States)

    Taslim, Cenny; Huang, Kun; Huang, Tim; Lin, Shili

    2012-01-01

    Chromatin immunoprecipitation followed by sequencing (ChIP-seq) is a high-throughput antibody-based method to study genome-wide protein-DNA binding interactions. ChIP-seq technology allows scientists to obtain more accurate data, providing genome-wide coverage with less starting material and in a shorter time compared to older ChIP-chip experiments. Herein we describe a step-by-step guideline for analyzing ChIP-seq data, including data preprocessing, nonlinear normalization to enable comparison between different samples and experiments, a statistical method to identify differential binding sites using mixture modeling and local false discovery rates (fdrs), and binding pattern characterization. In addition, we provide a sample analysis of ChIP-seq data using the steps provided in the guideline.
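
    The mixture-modeling idea can be sketched compactly. Below is a hedged, simplified stand-in (not the authors' exact model): a two-component Gaussian mixture is fitted to per-site binding differences, and the posterior probability of the null component serves as a local false discovery rate; component roles, thresholds and data are illustrative assumptions.

    ```python
    import numpy as np
    from sklearn.mixture import GaussianMixture

    rng = np.random.default_rng(0)
    # toy per-site differences: a null bulk plus a shifted differential component
    scores = np.concatenate([rng.normal(0.0, 1.0, 9000),
                             rng.normal(3.0, 1.0, 1000)]).reshape(-1, 1)

    gmm = GaussianMixture(n_components=2, random_state=0).fit(scores)
    null_idx = int(np.argmin(np.abs(gmm.means_.ravel())))  # component nearest 0 = null
    local_fdr = gmm.predict_proba(scores)[:, null_idx]     # P(null | score)
    print((local_fdr < 0.05).sum(), "sites called differential")
    ```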

  9. Data pre-processing for quantification in tomography and radiography with a digital flat panel detector

    Science.gov (United States)

    Rinkel, Jean; Gerfault, Laurent; Estève, François; Dinten, Jean-Marc

    2006-03-01

    In order to obtain accurate quantitative results, flat panel detectors require specific calibration and correction of acquisitions. The main artefacts are due to bad pixels, variations in photodiode characteristics and inhomogeneity of the X-ray sensitivity of the scintillator layer. Other limitations for quantification are the non-linearity of the detector, due to charge trapping in the transistors, and the scattering generated inside the detector, called detector scattering. Based on physical models of artefact generation, this paper presents a unified framework for the calibration and correction of these artefacts. The following specific algorithms have been developed to correct them. A new method for correction of deviation from linearity is based on the comparison between experimental and simulated data. A method of detector scattering correction is performed in two steps: off-line characterization of detector scattering by considering its spatial distribution through a convolution model, and on-line correction based on a deconvolution approach. Radiographic results on an anthropomorphic thorax phantom imaged with a flat panel detector, which converts X-rays into visible light using a scintillator coupled to an amorphous silicon transistor array for photon-to-electron conversion, demonstrate that experimental X-ray attenuation images are significantly improved qualitatively and quantitatively by applying non-linearity correction and detector scattering correction. Results obtained on tomographic reconstructions from pre-processed acquisitions of the phantom are in very good agreement with the expected attenuation coefficient values obtained with a multi-slice CT scanner. Thus, this paper demonstrates the efficiency of the proposed pre-processing steps for accurate quantification in radiography and tomography.
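
    The two-step detector scattering correction lends itself to a compact sketch. Assuming, as a simplification of the convolution model above, that the measured image is the true image convolved with a kernel delta + k*Gaussian, the on-line correction reduces to a division by the kernel's transfer function in Fourier space; the kernel parameters below are illustrative assumptions, not calibrated values.

    ```python
    import numpy as np

    def correct_detector_scatter(measured, k=0.08, sigma=25.0):
        # Deconvolve a scatter kernel (delta + k * Gaussian) in Fourier space.
        ny, nx = measured.shape
        fy, fx = np.meshgrid(np.fft.fftfreq(ny), np.fft.fftfreq(nx), indexing="ij")
        # transfer function of a unit-integral Gaussian of width sigma (pixels)
        gaussian_tf = np.exp(-2.0 * (np.pi * sigma) ** 2 * (fx ** 2 + fy ** 2))
        kernel_tf = 1.0 + k * gaussian_tf            # always >= 1: well conditioned
        return np.fft.ifft2(np.fft.fft2(measured) / kernel_tf).real
    ```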

  10. Pre-processing ambient noise cross-correlations with equalizing the covariance matrix eigenspectrum

    Science.gov (United States)

    Seydoux, Léonard; de Rosny, Julien; Shapiro, Nikolai M.

    2017-09-01

    Passive imaging techniques from ambient seismic noise require a nearly isotropic distribution of the noise sources in order to ensure reliable traveltime measurements between seismic stations. However, real ambient seismic noise often only partially fulfils this condition. It is generated in preferential areas (in deep ocean or near continental shores), and some highly coherent pulse-like signals may be present in the data, such as those generated by earthquakes. Several pre-processing techniques have been developed in order to attenuate the directional and deterministic behaviour of this real ambient noise. Most of them are applied to individual seismograms before cross-correlation computation. The most widely used techniques are spectral whitening and temporal smoothing of the individual seismic traces. We here propose an additional pre-processing step to be used together with the classical ones, which is based on the spatial analysis of the seismic wavefield. We compute the cross-spectra between all available station pairs in the spectral domain, leading to the data covariance matrix. We apply a one-bit normalization to the covariance matrix eigenspectrum before extracting the cross-correlations in the time domain. The efficiency of the method is shown with several numerical tests. We apply the method to the data collected by the USArray when the M8.8 Maule earthquake occurred on 2010 February 27. The method shows a clear improvement compared with the classical equalization in attenuating the highly energetic and coherent waves incoming from the earthquake, and allows reliable traveltime measurements even in the presence of the earthquake.
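
    The covariance-matrix step can be sketched in a few lines of numpy. This is a schematic illustration under simplified conventions (a single frequency, plain sign-based equalization), not a reproduction of the paper's full processing chain.

    ```python
    import numpy as np

    def equalize_covariance_eigenspectrum(spectra):
        # spectra: complex (n_stations, n_windows) Fourier coefficients at one
        # frequency; returns the covariance matrix with an equalized eigenspectrum.
        C = spectra @ spectra.conj().T / spectra.shape[1]  # data covariance matrix
        eigval, eigvec = np.linalg.eigh(C)                 # Hermitian decomposition
        # one-bit normalization: replace each eigenvalue by its sign (unity for PSD)
        return eigvec @ np.diag(np.sign(eigval)) @ eigvec.conj().T
    ```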

  11. Image pre-processing method for near-wall PIV measurements over moving curved interfaces

    Science.gov (United States)

    Jia, L. C.; Zhu, Y. D.; Jia, Y. X.; Yuan, H. J.; Lee, C. B.

    2017-03-01

    PIV measurements near a moving interface are always difficult. This paper presents a PIV image pre-processing method that returns high spatial resolution velocity profiles near the interface. Instead of re-shaping or re-orientating the interrogation windows, interface tracking and an image transformation are used to stretch the particle image strips near a curved interface into rectangles. Then the adaptive structured interrogation windows can be arranged at specified distances from the interface. Synthetic particles are also added into the solid region to minimize interfacial effects and to restrict particles on both sides of the interface. Since a high spatial resolution is only required in high-velocity-gradient regions, adaptive meshing and stretching of the image strips in the normal direction is used to improve the cross-correlation signal-to-noise ratio (SNR) by reducing the velocity difference and the particle image distortion within the interrogation window. A two-dimensional Gaussian fit is used to compensate for the effects of stretching particle images. The working hypothesis is that fluid motion near the interface is ‘quasi-tangential flow’, which is reasonable in most fluid-structure interaction scenarios. The method was validated against the window deformation iterative multi-grid scheme (WIDIM) using synthetic image pairs with different velocity profiles. The method was tested for boundary layer measurements of a supersonic turbulent boundary layer on a flat plate, near a rotating blade and near a flexible flapping flag. This image pre-processing method provides higher spatial resolution than conventional WIDIM and good robustness for measuring velocity profiles near moving interfaces.

  12. Chang'E-3 data pre-processing system based on scientific workflow

    Science.gov (United States)

    Tan, Xu; Liu, Jianjun; Wang, Yuanyuan; Yan, Wei; Zhang, Xiaoxia; Li, Chunlai

    2016-04-01

    The Chang'E-3 (CE3) mission has obtained a huge amount of lunar scientific data. Data pre-processing is an important segment of the CE3 ground research and application system. With a dramatic increase in the demand for data research and application, a Chang'E-3 data pre-processing system (CEDPS) based on scientific workflow is proposed for the purpose of making scientists more flexible and productive by automating data-driven processing. The system should allow the planning, conduct and control of the data processing procedure with the following possibilities: • describe a data processing task, including: 1) define input/output data, 2) define the data relationships, 3) define the sequence of tasks, 4) define the communication between tasks, 5) define mathematical formulas, 6) define the relationship between tasks and data; • automatic processing of tasks. Accordingly, describing a task is the key point determining whether the system is flexible. We design a workflow designer, a visual environment for capturing processes as workflows, and discuss its three-level model: 1) the data relationships are established through a product tree; 2) the process model is constructed based on a directed acyclic graph (DAG), where a set of workflow constructs, including Sequence, Loop, Merge and Fork, can be composed with one another; 3) to reduce the modeling complexity of the mathematical formulas in the DAG, semantic modeling based on MathML is adopted. On top of that, we present how the CE3 data are processed with CEDPS.
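
    The DAG-based process model corresponds to a dependency-respecting execution order, which in Python can be obtained with the standard library; the task names and dependencies below are hypothetical stand-ins for CE3 pre-processing steps.

    ```python
    from graphlib import TopologicalSorter  # Python 3.9+

    # hypothetical tasks: each key depends on the tasks in its set
    dag = {
        "radiometric_correction": {"read_raw"},
        "geometric_correction": {"radiometric_correction"},
        "product_generation": {"geometric_correction", "read_calibration"},
    }

    for task in TopologicalSorter(dag).static_order():
        print("running", task)   # dependency-respecting execution order
    ```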

  13. Image preprocessing for improving computational efficiency in implementation of restoration and superresolution algorithms.

    Science.gov (United States)

    Sundareshan, Malur K; Bhattacharjee, Supratik; Inampudi, Radhika; Pang, Ho-Yuen

    2002-12-10

    Computational complexity is a major impediment to the real-time implementation of image restoration and superresolution algorithms in many applications. Although powerful restoration algorithms have been developed within the past few years utilizing sophisticated mathematical machinery (based on statistical optimization and convex set theory), these algorithms are typically iterative in nature and require a sufficient number of iterations to be executed to achieve the desired resolution improvement that may be needed to meaningfully perform postprocessing image exploitation tasks in practice. Additionally, recent technological breakthroughs have facilitated novel sensor designs (focal plane arrays, for instance) that make it possible to capture megapixel imagery data at video frame rates. A major challenge in the processing of these large-format images is to complete the execution of the image processing steps within the frame capture times and to keep up with the output rate of the sensor so that all data captured by the sensor can be efficiently utilized. Consequently, development of novel methods that facilitate real-time implementation of image restoration and superresolution algorithms is of significant practical interest and is the primary focus of this study. The key to designing computationally efficient processing schemes lies in strategically introducing appropriate preprocessing steps together with the superresolution iterations to tailor optimized overall processing sequences for imagery data of specific formats. For substantiating this assertion, three distinct methods for tailoring a preprocessing filter and integrating it with the superresolution processing steps are outlined. These methods consist of a region-of-interest extraction scheme, a background-detail separation procedure, and a scene-derived information extraction step for implementing a set-theoretic restoration of the image that is less demanding in computation compared with the

  14. Image preprocessing for improving computational efficiency in implementation of restoration and superresolution algorithms

    Science.gov (United States)

    Sundareshan, Malur K.; Bhattacharjee, Supratik; Inampudi, Radhika; Pang, Ho-Yuen

    2002-12-01

    Computational complexity is a major impediment to the real-time implementation of image restoration and superresolution algorithms in many applications. Although powerful restoration algorithms have been developed within the past few years utilizing sophisticated mathematical machinery (based on statistical optimization and convex set theory), these algorithms are typically iterative in nature and require a sufficient number of iterations to be executed to achieve the desired resolution improvement that may be needed to meaningfully perform postprocessing image exploitation tasks in practice. Additionally, recent technological breakthroughs have facilitated novel sensor designs (focal plane arrays, for instance) that make it possible to capture megapixel imagery data at video frame rates. A major challenge in the processing of these large-format images is to complete the execution of the image processing steps within the frame capture times and to keep up with the output rate of the sensor so that all data captured by the sensor can be efficiently utilized. Consequently, development of novel methods that facilitate real-time implementation of image restoration and superresolution algorithms is of significant practical interest and is the primary focus of this study. The key to designing computationally efficient processing schemes lies in strategically introducing appropriate preprocessing steps together with the superresolution iterations to tailor optimized overall processing sequences for imagery data of specific formats. For substantiating this assertion, three distinct methods for tailoring a preprocessing filter and integrating it with the superresolution processing steps are outlined. These methods consist of a region-of-interest extraction scheme, a background-detail separation procedure, and a scene-derived information extraction step for implementing a set-theoretic restoration of the image that is less demanding in computation compared with the
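
    Of the three preprocessing methods, region-of-interest extraction is the simplest to illustrate. The following hedged sketch (an assumption-laden stand-in, not the paper's implementation) crops a frame to the bounding box of above-threshold energy so that the iterative restoration runs on a much smaller array.

    ```python
    import numpy as np

    def extract_roi(image, threshold=0.1, pad=8):
        # Crop to the bounding box of pixels above a fraction of the maximum.
        mask = image > threshold * image.max()
        rows, cols = np.any(mask, axis=1), np.any(mask, axis=0)
        r0 = max(int(np.argmax(rows)) - pad, 0)
        r1 = min(len(rows) - int(np.argmax(rows[::-1])) + pad, image.shape[0])
        c0 = max(int(np.argmax(cols)) - pad, 0)
        c1 = min(len(cols) - int(np.argmax(cols[::-1])) + pad, image.shape[1])
        return image[r0:r1, c0:c1]   # restoration iterations run on this crop
    ```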

  15. CudaPre3D: An Alternative Preprocessing Algorithm for Accelerating 3D Convex Hull Computation on the GPU

    Directory of Open Access Journals (Sweden)

    MEI, G.

    2015-05-01

    Full Text Available In the calculation of convex hulls for point sets, a preprocessing procedure that filters the input points by discarding non-extreme points is commonly used to improve computational efficiency. We previously proposed a quite straightforward preprocessing approach for accelerating 2D convex hull computation on the GPU. In this paper, we extend that algorithm to 3D cases. The basic ideas behind the two preprocessing algorithms are similar: first, several groups of extreme points are found according to the original set of input points and several rotated versions of the input set; then, a convex polyhedron is created using the found extreme points; and finally those interior points located inside the formed convex polyhedron are discarded. Experimental results show that the proposed preprocessing algorithm achieves speedups of about 4x on average, and 5x to 6x in the best cases, over the cases where it is not used. In addition, more than 95 percent of the input points can be discarded in most experimental tests.
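
    The filtering idea generalizes directly to scipy: find extreme points along many directions, then discard everything inside the polyhedron they span. The sketch below is a CPU approximation of the approach (directions, counts and the point-in-polyhedron test are assumptions; the paper's version runs on the GPU).

    ```python
    import numpy as np
    from scipy.spatial import ConvexHull, Delaunay

    rng = np.random.default_rng(1)
    points = rng.random((100000, 3))

    dirs = rng.normal(size=(16, 3))
    dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)
    extremes = np.unique(np.concatenate([np.argmax(points @ dirs.T, axis=0),
                                         np.argmin(points @ dirs.T, axis=0)]))

    # keep points outside the polyhedron of extremes (plus the extremes themselves)
    keep = Delaunay(points[extremes]).find_simplex(points) < 0
    keep[extremes] = True
    survivors = points[keep]
    print(f"kept {survivors.shape[0]} of {points.shape[0]} points")
    hull = ConvexHull(survivors)   # final hull is computed on far fewer points
    ```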

  16. Convergence Properties of an Iterative Procedure of Ipsatizing and Standardizing a Data Matrix, with Applications to Parafac/Candecomp Preprocessing.

    Science.gov (United States)

    ten Berge, Jos M. F.; Kiers, Henk A. L.

    1989-01-01

    Centering a matrix row-wise and rescaling it column-wise to a unit sum of squares requires an iterative procedure. It is shown that this procedure converges to a stable solution that need not be centered row-wise. The results bear directly on several types of preprocessing methods in Parafac/Candecomp. (Author/TJH)
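
    The procedure itself is a two-line alternation. A minimal numpy sketch, with a convergence check on successive iterates (tolerances and sizes are arbitrary):

    ```python
    import numpy as np

    def ipsatize_standardize(X, n_iter=500, tol=1e-12):
        # Alternately center rows and rescale columns to unit sum of squares.
        X = X.astype(float).copy()
        for _ in range(n_iter):
            prev = X.copy()
            X -= X.mean(axis=1, keepdims=True)      # row-wise centering
            X /= np.sqrt((X ** 2).sum(axis=0))      # column-wise unit sum of squares
            if np.max(np.abs(X - prev)) < tol:      # stable solution reached
                break
        return X

    Y = ipsatize_standardize(np.random.default_rng(2).normal(size=(8, 5)))
    # note: the limit need not be row-centered, which is the paper's point
    print(np.abs(Y.mean(axis=1)).max(), (Y ** 2).sum(axis=0))
    ```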

  17. A comparative analysis of preprocessing techniques of cardiac event series for the study of heart rhythm variability using simulated signals

    Directory of Open Access Journals (Sweden)

    Guimarães H.N.

    1998-01-01

    Full Text Available In the present study, using noise-free simulated signals, we performed a comparative examination of several preprocessing techniques that are used to transform the cardiac event series into a regularly sampled time series appropriate for spectral analysis of heart rhythm variability (HRV). First, a group of noise-free simulated point event series, representing time series of heartbeats, was generated by an integral pulse frequency modulation model. In order to evaluate the performance of the preprocessing methods, the differences between the spectra of the preprocessed simulated signals and the true spectrum (the spectrum of the model input modulating signals) were surveyed by visual analysis and by contrasting merit indices. It is desired that estimated spectra match the true spectrum as closely as possible, showing a minimum of harmonic components and other artifacts. The merit indices proposed to quantify these mismatches were the leakage rate, defined as the proportion of leakage components (those located outside narrow windows centered at the frequencies of the model input modulating signals) with respect to the whole set of spectral components, and the numbers of leakage components with amplitudes greater than 1%, 5% and 10% of the total spectral components. Our data, obtained from a noise-free simulation, indicate that using heart rate values instead of heart period values in the derivation of signals representative of heart rhythm results in more accurate spectra. Furthermore, our data support the efficiency of the widely used preprocessing technique based on the convolution of inverse interval function values with a rectangular window, and suggest the preprocessing technique based on cubic polynomial interpolation of inverse interval function values and subsequent spectral analysis as another efficient and fast method for the analysis of HRV signals.
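
    The preprocessing route the study recommends, interpolating inverse interval (heart rate) values onto a uniform grid before spectral estimation, can be sketched as follows; the beat generator and all parameters are illustrative assumptions.

    ```python
    import numpy as np
    from scipy.interpolate import CubicSpline
    from scipy.signal import welch

    beats = np.cumsum(0.8 + 0.04 * np.sin(2 * np.pi * 0.25 *
                                          np.arange(300) * 0.8))  # beat times (s)
    ibi = np.diff(beats)                   # inter-beat intervals
    rate = 1.0 / ibi                       # inverse interval function (heart rate)
    t_mid = beats[:-1] + ibi / 2

    fs = 4.0                               # uniform resampling frequency (Hz)
    t_uniform = np.arange(t_mid[0], t_mid[-1], 1 / fs)
    hr = CubicSpline(t_mid, rate)(t_uniform)            # cubic interpolation

    f, pxx = welch(hr - hr.mean(), fs=fs, nperseg=256)  # HRV spectrum estimate
    ```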

  18. Evaluation of standard and advanced preprocessing methods for the univariate analysis of blood serum 1H-NMR spectra.

    Science.gov (United States)

    De Meyer, Tim; Sinnaeve, Davy; Van Gasse, Bjorn; Rietzschel, Ernst-R; De Buyzere, Marc L; Langlois, Michel R; Bekaert, Sofie; Martins, José C; Van Criekinge, Wim

    2010-10-01

    Proton nuclear magnetic resonance ((1)H-NMR)-based metabolomics enables the high-resolution and high-throughput assessment of a broad spectrum of metabolites in biofluids. Despite the straightforward character of the experimental methodology, the analysis of spectral profiles is rather complex, particularly due to the requirement of numerous data preprocessing steps. Here, we evaluate how several of the most common preprocessing procedures affect the subsequent univariate analyses of blood serum spectra, with a particular focus on how the standard methods perform compared to more advanced examples. Carr-Purcell-Meiboom-Gill 1D (1)H spectra were obtained for 240 serum samples from healthy subjects of the Asklepios study. We studied the impact of different preprocessing steps--integral (standard method) and probabilistic quotient normalization; no, equidistant (standard), and adaptive-intelligent binning; mean (standard) and maximum bin intensity data summation--on the resonance intensities of three different types of metabolites: triglycerides, glucose, and creatinine. The effects were evaluated by correlating the differently preprocessed NMR data with the independently measured metabolite concentrations. The analyses revealed that the standard methods performed inferiorly and that a combination of probabilistic quotient normalization after adaptive-intelligent binning and maximum intensity variable definition yielded the best overall results (triglycerides, R = 0.98; glucose, R = 0.76; creatinine, R = 0.70). Therefore, at least in the case of serum metabolomics, these or equivalent methods should be preferred above the standard preprocessing methods, particularly for univariate analyses. Additional optimization of the normalization procedure might further improve the analyses.
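
    Probabilistic quotient normalization, part of the best-performing combination here, is short enough to sketch directly (a generic implementation, assuming spectra as rows of a matrix; binning and summation are left out):

    ```python
    import numpy as np

    def pqn(spectra):
        # 1) integral normalization of every spectrum to unit total intensity
        X = spectra / spectra.sum(axis=1, keepdims=True)
        # 2) reference spectrum = median across samples
        ref = np.median(X, axis=0)
        mask = ref > 0
        # 3) per-sample dilution factor = median quotient against the reference
        dilution = np.median(X[:, mask] / ref[mask], axis=1, keepdims=True)
        return X / dilution
    ```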

  19. Prognosis classification in glioblastoma multiforme using multimodal MRI derived heterogeneity textural features: impact of pre-processing choices

    Science.gov (United States)

    Upadhaya, Taman; Morvan, Yannick; Stindel, Eric; Le Reste, Pierre-Jean; Hatt, Mathieu

    2016-03-01

    Heterogeneity image-derived features of Glioblastoma multiforme (GBM) tumors from multimodal MRI sequences may provide higher prognostic value than standard parameters used in routine clinical practice. We previously developed a framework for automatic extraction and combination of image-derived features (also called "Radiomics") through support vector machines (SVM) for predictive model building. The results we obtained in a cohort of 40 GBM patients suggested these features could be used to identify patients with poorer outcome. However, extraction of these features is a delicate multi-step process and their values may therefore depend on the pre-processing of images. The original workflow included skull removal, bias homogeneity correction, and multimodal tumor segmentation, followed by textural feature computation, and lastly ranking, selection and combination through an SVM-based classifier. The goal of the present work was to specifically investigate the potential benefit and respective impact of the addition of several MRI pre-processing steps (spatial resampling for isotropic voxels, intensity quantization and normalization) before textural feature computation on the resulting accuracy of the classifier. Eighteen patient datasets were added for the present work (58 patients in total). A classification accuracy of 83% (sensitivity 79%, specificity 85%) was obtained using the original framework. The addition of the new pre-processing steps increased it to 93% (sensitivity 93%, specificity 93%) in identifying patients with poorer survival (below the median of 12 months). Among the three considered pre-processing steps, spatial resampling was found to have the most important impact. This shows the crucial importance of investigating appropriate image pre-processing steps to be used for methodologies based on textural feature extraction in medical imaging.
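
    The two pre-processing steps found most relevant, spatial resampling to isotropic voxels and intensity quantization, might look as follows (a hedged sketch; interpolation order, bin count and the helper names are assumptions, not the study's exact settings):

    ```python
    import numpy as np
    from scipy import ndimage

    def resample_isotropic(volume, spacing, new_spacing=1.0):
        # Resample an MRI volume to isotropic voxels before texture extraction.
        zoom_factors = np.asarray(spacing, dtype=float) / new_spacing
        return ndimage.zoom(volume, zoom_factors, order=3)  # cubic interpolation

    def quantize(volume, n_bins=64):
        # Quantize intensities to a fixed number of gray levels for texture matrices.
        v = (volume - volume.min()) / (np.ptp(volume) + 1e-12)
        return np.clip((v * n_bins).astype(int), 0, n_bins - 1)
    ```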

  20. Research on Data Preprocessing Technology in Web Log Mining

    Institute of Scientific and Technical Information of China (English)

    杨玉梅

    2014-01-01

    Preprocessing is the key step of Web log mining: its results strongly influence the rules and patterns produced by the mining algorithms and are therefore critical to the quality of Web mining. This paper presents the DUI technique to strengthen the preprocessing stage. Experiments show that the improved data preprocessing technique raises the quality of the preprocessing results.

  1. Automatic pre-processing for an object-oriented distributed hydrological model using GRASS-GIS

    Science.gov (United States)

    Sanzana, P.; Jankowfsky, S.; Branger, F.; Braud, I.; Vargas, X.; Hitschfeld, N.

    2012-04-01

    Landscapes are very heterogeneous, which impacts the hydrological processes occurring in catchments, especially in the modeling of peri-urban catchments. Hydrological Response Units (HRUs), resulting from the intersection of different maps such as land use, soil types, geology and flow networks, allow the representation of these elements in an explicit way, preserving natural and artificial contours of the different layers. These HRUs are used as the model mesh in some distributed object-oriented hydrological models, allowing the application of a topologically oriented approach. The connectivity between polygons and polylines provides a detailed representation of the water balance and overland flow in these distributed hydrological models, based on irregular hydro-landscape units. When computing fluxes between these HRUs, geometrical parameters, such as the distance between the centroid of an HRU and the river network, or the length of the perimeter, can impact the realism of the calculated overland, sub-surface and groundwater fluxes. Therefore, it is necessary to process the original model mesh in order to avoid these numerical problems. We present an automatic pre-processing tool implemented in the open source GRASS-GIS software, for which several Python scripts and some already available algorithms were used, such as the Triangle software. First, some scripts were developed to improve the topology of the various elements, such as snapping the river network to the closest contours. When data are derived from remote sensing, such as vegetation areas, their perimeters contain many right angles, which were smoothed. Second, the algorithms more particularly address badly shaped elements of the model mesh, such as polygons with narrow shapes, markedly irregular contours and/or centroids outside the polygons. To identify these elements we used shape descriptors. The convexity index was considered the best descriptor to identify them with a threshold
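
    The convexity index used to flag badly shaped HRUs is simply the ratio of a polygon's area to that of its convex hull. A small sketch with shapely (the 0.8 threshold is an illustrative assumption):

    ```python
    from shapely.geometry import Polygon

    def convexity_index(polygon):
        # 1.0 for a convex polygon, smaller for narrow or ragged shapes
        return polygon.area / polygon.convex_hull.area

    hru = Polygon([(0, 0), (4, 0), (4, 1), (1, 1), (1, 3), (0, 3)])  # L-shaped unit
    if convexity_index(hru) < 0.8:
        print("badly shaped element:", round(convexity_index(hru), 2))  # 0.67
    ```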

  2. Tools and Databases of the KOMICS Web Portal for Preprocessing, Mining, and Dissemination of Metabolomics Data

    Directory of Open Access Journals (Sweden)

    Nozomu Sakurai

    2014-01-01

    Full Text Available A metabolome—the collection of comprehensive quantitative data on metabolites in an organism—has been increasingly utilized for applications such as data-intensive systems biology, disease diagnostics, biomarker discovery, and assessment of food quality. A considerable number of tools and databases have been developed to date for the analysis of data generated by various combinations of chromatography and mass spectrometry. We report here a web portal named KOMICS (The Kazusa Metabolomics Portal), where the tools and databases that we developed are available for free to academic users. KOMICS includes the tools and databases for preprocessing, mining, visualization, and publication of metabolomics data. Improvements in the annotation of unknown metabolites and dissemination of comprehensive metabolomic data are the primary aims behind the development of this portal. For this purpose, PowerGet and FragmentAlign include a manual curation function for the results of metabolite feature alignments. A metadata-specific wiki-based database, Metabolonote, functions as a hub of web resources related to the submitters' work. This feature is expected to increase citation of the submitters' work, thereby promoting data publication. As an example of the practical use of KOMICS, a workflow for a study on Jatropha curcas is presented. The tools and databases available at KOMICS should contribute to enhanced production, interpretation, and utilization of metabolomic Big Data.

  3. Parametric Study to Improve Subpixel Accuracy of Nitric Oxide Tagging Velocimetry with Image Preprocessing

    Directory of Open Access Journals (Sweden)

    Ravi Teja Vedula

    2017-01-01

    Full Text Available Biacetyl phosphorescence has been the commonly used molecular tagging velocimetry (MTV) technique to investigate in-cylinder flow evolution and cycle-to-cycle variations in an optical engine. As the phosphorescence of the biacetyl tracer deteriorates in the presence of oxygen, nitrogen was adopted as the working medium in the past. Recently, the nitric oxide MTV technique was employed to measure the velocity profile of an air jet. The authors here plan to investigate the potential application of this technique for engine flow studies. A possible experimental setup for this task indicated different permutations of image signal-to-noise ratio (SNR) and laser line width. In the current work, a numerical analysis is performed to study the effect of these two factors on displacement error in MTV image processing. Also, several image filtering techniques were evaluated and the performance of selected filters was analyzed in terms of enhancing the image quality and minimizing displacement errors. The flow displacement error without image preprocessing was observed to be inversely proportional to SNR and directly proportional to laser line width. The mean filter resulted in the smallest errors for line widths smaller than 9 pixels. The effect of filter size on subpixel accuracy showed that error levels increased as the filter size increased.
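
    Displacement in MTV/PIV-style processing is typically located to subpixel precision with a three-point Gaussian fit around the correlation peak; the preprocessing filters above matter because the fit uses the logarithm of noisy correlation values. A minimal 1-D sketch (toy data, not the study's code):

    ```python
    import numpy as np

    def gaussian_subpixel_peak(corr):
        # Three-point Gaussian fit around the (positive-valued) correlation peak.
        i = int(np.argmax(corr))
        lm, l0, lp = np.log(corr[i - 1]), np.log(corr[i]), np.log(corr[i + 1])
        return i + 0.5 * (lm - lp) / (lm - 2.0 * l0 + lp)

    x = np.arange(20)
    corr = np.exp(-0.5 * ((x - 10.3) / 1.5) ** 2)   # peak truly at 10.3 pixels
    print(gaussian_subpixel_peak(corr))             # ~10.3
    ```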

  4. Classifying human voices by using hybrid SFX time-series preprocessing and ensemble feature selection.

    Science.gov (United States)

    Fong, Simon; Lan, Kun; Wong, Raymond

    2013-01-01

    Voice biometrics relies on a physiological characteristic: each individual's voice is unique. Due to this uniqueness, voice classification has found useful applications in classifying speakers' gender, mother tongue or ethnicity (accent), emotional state, identity verification, verbal command control, and so forth. In this paper, we adopt a new preprocessing method named Statistical Feature Extraction (SFX) for extracting important features for training a classification model, based on piecewise transformations treating an audio waveform as a time series. Using SFX we can faithfully remodel the statistical characteristics of the time series; together with spectral analysis, a substantial number of features are extracted in combination. An ensemble is used to select only the influential features for classification model induction. We focus on comparing the effects of various popular data mining algorithms on multiple datasets. Our experiment consists of classification tests over four typical categories of human voice data, namely Female and Male, Emotional Speech, Speaker Identification, and Language Recognition. The experiments yield encouraging results supporting the fact that heuristically choosing significant features from both the time and frequency domains indeed produces better performance in voice classification than traditional signal processing techniques alone, like wavelets and LPC-to-CC.
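
    The time-domain half of such a feature extractor reduces to windowed summary statistics. A hedged sketch of the general idea (window sizes, the statistics chosen and the data are assumptions, not the paper's SFX definition):

    ```python
    import numpy as np
    from scipy.stats import kurtosis, skew

    def statistical_features(waveform, win=1024, hop=512):
        # Per-window summary statistics of an audio waveform (time domain).
        feats = []
        for start in range(0, len(waveform) - win + 1, hop):
            w = waveform[start:start + win]
            zcr = np.abs(np.diff(np.sign(w))).sum() / (2 * win)  # zero-crossing rate
            feats.append([w.mean(), w.std(), skew(w), kurtosis(w), zcr])
        return np.asarray(feats)

    audio = np.random.default_rng(3).normal(size=16000)  # stand-in for speech
    print(statistical_features(audio).shape)             # (n_windows, 5)
    ```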

  5. Pre-Processing of Point-Data from Contact and Optical 3D Digitization Sensors

    Directory of Open Access Journals (Sweden)

    Mirko Soković

    2012-01-01

    Full Text Available Contemporary 3D digitization systems employed by reverse engineering (RE) feature ever-growing scanning speeds with the ability to generate a large quantity of points in a unit of time. Although advantageous for the quality and efficiency of RE modelling, the huge number of point data can turn into a serious practical problem later on, when the CAD model is generated. In addition, 3D digitization processes are very often plagued by measuring errors, which can be attributed to the very nature of measuring systems, various characteristics of the digitized objects and subjective errors by the operator, all of which contribute to problems in the CAD model generation process. This paper presents an integral system for the pre-processing of point data, i.e., filtering, smoothing and reduction, based on a cross-sectional RE approach. In the course of the proposed system's development, major emphasis was placed on the module for point data reduction, which was designed according to a novel approach with integrated deviation analysis and fuzzy logic reasoning. The developed system was verified through its application in three case studies, on point data from objects of versatile geometries obtained by contact and laser 3D digitization systems. The obtained results demonstrate the effectiveness of the system.
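
    A common reduction strategy of this kind is to keep one representative point per occupied cell of a grid; the following numpy sketch (a generic voxel down-sampling, not the paper's fuzzy-logic module) illustrates the effect on point counts.

    ```python
    import numpy as np

    def voxel_downsample(points, voxel=0.5):
        # Keep one centroid per occupied voxel of edge length `voxel`.
        keys = np.floor(points / voxel).astype(int)
        _, inverse = np.unique(keys, axis=0, return_inverse=True)
        n = inverse.max() + 1
        sums = np.zeros((n, points.shape[1]))
        np.add.at(sums, inverse, points)
        return sums / np.bincount(inverse, minlength=n)[:, None]

    pts = np.random.default_rng(4).random((50000, 3)) * 10.0
    print(voxel_downsample(pts).shape)   # far fewer points than 50000
    ```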

  6. Pre-processing Algorithm for Rectification of Geometric Distortions in Satellite Images

    Directory of Open Access Journals (Sweden)

    Narayan Panigrahi

    2011-02-01

    Full Text Available A number of algorithms have been reported to process and remove geometric distortions in satellite images; ortho-correction, geometric error correction and radiometric error removal are a few important examples. These algorithms require supplementary meta-information about the satellite images, such as ground control points and correspondence, sensor orientation details and the elevation profile of the terrain, to establish the corresponding transformations. In this paper, a pre-processing algorithm is proposed which removes systematic distortions of a satellite image and thereby removes the blank portion of the image. It is an input-to-output mapping of image pixels, where the transformation computes the coordinates of each output pixel corresponding to an input pixel of the image. The transformation is established by the exact amount of scaling, rotation and translation needed for each pixel in the input image, so that the distortion induced during the recording stage is corrected. Defence Science Journal, 2011, 61(2), pp. 174-179. DOI: http://dx.doi.org/10.14429/dsj.61.421
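
    The input-to-output pixel mapping described above is an inverse-mapping resampling: for every output pixel, compute where it came from in the input and interpolate there. A hedged sketch with a global rotation/scale/translation model (per-pixel models, as in the paper, would vary the matrix and offset):

    ```python
    import numpy as np
    from scipy import ndimage

    def rectify(image, angle_deg, scale, shift):
        # Output -> input coordinate transform (row/col convention).
        t = np.deg2rad(angle_deg)
        A = (1.0 / scale) * np.array([[np.cos(t), -np.sin(t)],
                                      [np.sin(t),  np.cos(t)]])
        return ndimage.affine_transform(image, A, offset=shift, order=1)

    img = np.random.default_rng(5).random((256, 256))
    out = rectify(img, angle_deg=3.5, scale=1.02, shift=(4.0, -2.5))
    ```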

  7. A comparative analysis of pre-processing techniques in colour retinal images

    Energy Technology Data Exchange (ETDEWEB)

    Salvatelli, A [Artificial Intelligence Group, Facultad de Ingenieria, Universidad Nacional de Entre Rios (Argentina); Bizai, G [Artificial Intelligence Group, Facultad de Ingenieria, Universidad Nacional de Entre Rios (Argentina); Barbosa, G [Artificial Intelligence Group, Facultad de Ingenieria, Universidad Nacional de Entre Rios (Argentina); Drozdowicz, B [Artificial Intelligence Group, Facultad de Ingenieria, Universidad Nacional de Entre Rios (Argentina); Delrieux, C [Electric and Computing Engineering Department, Universidad Nacional del Sur, Alem 1253, Bahía Blanca (Partially funded by SECyT-UNS) (Argentina)], E-mail: claudio@acm.org

    2007-11-15

    Diabetic retinopathy (DR) is a chronic disease of the ocular retina, which most of the time is only discovered when the disease is at an advanced stage and most of the damage is irreversible. For that reason, early diagnosis is paramount for avoiding the most severe consequences of DR, of which complete blindness is not uncommon. Unsupervised or supervised image processing of retinal images emerges as a feasible tool for this diagnosis. The preprocessing stages are key for any further assessment, since these images exhibit several defects, including non-uniform illumination, sampling noise, uneven contrast due to pigmentation loss during sampling, and many others. Any feasible diagnosis system should work with images where these defects have been compensated. In this work we analyze and test several correction techniques. Non-uniform illumination is compensated using morphology and homomorphic filtering; uneven contrast is compensated using morphology and local enhancement. We tested our processing stages using Fuzzy C-Means and the local Hurst (self-correlation) coefficient for unsupervised segmentation of the abnormal blood vessels. The results over a standard set of DR images are more than promising.
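
    The homomorphic illumination correction mentioned above follows a standard pattern: take logarithms, remove the low-frequency illumination field, and exponentiate back. A minimal sketch (Gaussian low-pass as the illumination estimate; sigma is an assumption):

    ```python
    import numpy as np
    from scipy import ndimage

    def homomorphic_correction(channel, sigma=50.0, eps=1.0):
        # log -> subtract low-pass (illumination) -> exp (reflectance estimate)
        log_img = np.log(channel.astype(float) + eps)
        illumination = ndimage.gaussian_filter(log_img, sigma)
        return np.exp(log_img - illumination)
    ```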

  8. Tools and databases of the KOMICS web portal for preprocessing, mining, and dissemination of metabolomics data.

    Science.gov (United States)

    Sakurai, Nozomu; Ara, Takeshi; Enomoto, Mitsuo; Motegi, Takeshi; Morishita, Yoshihiko; Kurabayashi, Atsushi; Iijima, Yoko; Ogata, Yoshiyuki; Nakajima, Daisuke; Suzuki, Hideyuki; Shibata, Daisuke

    2014-01-01

    A metabolome--the collection of comprehensive quantitative data on metabolites in an organism--has been increasingly utilized for applications such as data-intensive systems biology, disease diagnostics, biomarker discovery, and assessment of food quality. A considerable number of tools and databases have been developed to date for the analysis of data generated by various combinations of chromatography and mass spectrometry. We report here a web portal named KOMICS (The Kazusa Metabolomics Portal), where the tools and databases that we developed are available for free to academic users. KOMICS includes the tools and databases for preprocessing, mining, visualization, and publication of metabolomics data. Improvements in the annotation of unknown metabolites and dissemination of comprehensive metabolomic data are the primary aims behind the development of this portal. For this purpose, PowerGet and FragmentAlign include a manual curation function for the results of metabolite feature alignments. A metadata-specific wiki-based database, Metabolonote, functions as a hub of web resources related to the submitters' work. This feature is expected to increase citation of the submitters' work, thereby promoting data publication. As an example of the practical use of KOMICS, a workflow for a study on Jatropha curcas is presented. The tools and databases available at KOMICS should contribute to enhanced production, interpretation, and utilization of metabolomic Big Data.

  9. Preprocessing: Geocoding of AVIRIS data using navigation, engineering, DEM, and radar tracking system data

    Science.gov (United States)

    Meyer, Peter; Larson, Steven A.; Hansen, Earl G.; Itten, Klaus I.

    1993-01-01

    Remotely sensed data have geometric characteristics and representations which depend on the type of acquisition system used. To correlate such data over large regions with other real-world representation tools, like conventional maps or a Geographic Information System (GIS), for verification purposes or for further treatment within different data sets, a coregistration has to be performed. In addition to the geometric characteristics of the sensor, there are two other dominating factors which affect the geometry: the stability of the platform and the topography. There are two basic approaches for a geometric correction on a pixel-by-pixel basis: (1) a parametric approach using the location of the airplane and inertial navigation system data to simulate the observation geometry; and (2) a non-parametric approach using tie points or ground control points. It is well known that the non-parametric approach is not reliable enough for the unstable flight conditions of airborne systems, and is not satisfactory in areas with significant topography, e.g. mountains and hills. The present work describes a parametric preprocessing procedure which corrects the effects of flight line and attitude variation as well as topographic influences; it is described in more detail by Meyer.

  10. A comparative study on preprocessing techniques in diabetic retinopathy retinal images: illumination correction and contrast enhancement.

    Science.gov (United States)

    Rasta, Seyed Hossein; Partovi, Mahsa Eisazadeh; Seyedarabi, Hadi; Javadzadeh, Alireza

    2015-01-01

    To investigate the effect of preprocessing techniques, including contrast enhancement and illumination correction, on retinal image quality, a comparative study was carried out. We studied and implemented several illumination correction and contrast enhancement techniques on color retinal images to find the best techniques for optimum image enhancement. To compare and choose the best illumination correction technique, we analyzed the corrected red and green components of color retinal images statistically and visually. The two contrast enhancement techniques were analyzed using a vessel segmentation algorithm by calculating the sensitivity and specificity. The statistical evaluation of the illumination correction techniques was carried out by calculating the coefficients of variation. The dividing method, using the median filter to estimate background illumination, showed the lowest coefficient of variation in the red component. The quotient and homomorphic filtering methods, after the dividing method, presented good results based on their low coefficients of variation. Contrast limited adaptive histogram equalization (CLAHE) increased the sensitivity of the vessel segmentation algorithm by up to 5% at the same level of accuracy, and has a higher sensitivity than the polynomial transformation operator as a contrast enhancement technique for vessel segmentation. Three techniques, the dividing method using the median filter to estimate background, the quotient-based method and homomorphic filtering, were found to be effective illumination correction techniques based on the statistical evaluation. Applying a local contrast enhancement technique, such as CLAHE, to fundus images showed good potential for enhancing vasculature segmentation.
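
    The two best-performing operations, background division with a median filter and CLAHE, are both thin wrappers around library calls. A hedged sketch (filter size and clip limit are assumptions, not the study's tuned values):

    ```python
    import numpy as np
    from scipy import ndimage
    from skimage import exposure

    def divide_background(green, size=51):
        # "Dividing" correction: divide by a median-filter background estimate.
        background = ndimage.median_filter(green.astype(float), size=size)
        corrected = green / (background + 1e-6)
        return corrected / corrected.max()          # rescale to [0, 1]

    def enhance_contrast(image01):
        # Contrast limited adaptive histogram equalization (CLAHE).
        return exposure.equalize_adapthist(image01, clip_limit=0.02)
    ```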

  11. A new approach to pre-processing digital image for wavelet-based watermark

    Science.gov (United States)

    Agreste, Santa; Andaloro, Guido

    2008-11-01

    The growth of the Internet has increased the phenomenon of digital piracy of multimedia objects like software, images, video, audio and text. It is therefore strategic to devise and develop methods and numerical algorithms, stable and of low computational cost, that address these problems. We describe a digital watermarking algorithm for color image protection and authenticity: robust, non-blind, and wavelet-based. The use of the Discrete Wavelet Transform is motivated by its good time-frequency features and good match with Human Visual System directives. These two combined elements are important for building an invisible and robust watermark. Moreover, our algorithm can work with any image, thanks to a pre-processing step that includes resizing techniques adapting the original image size to the wavelet transform. The watermark signal is calculated in correlation with the image features and statistical properties. In the detection step, we apply a re-synchronization between the original and watermarked images according to the Neyman-Pearson statistical criterion. Experiments on a large set of different images have shown resistance against geometric, filtering, and StirMark attacks with a low false alarm rate.
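
    The general shape of such a scheme, embedding a pseudo-random mark into wavelet detail coefficients scaled by local image features and detecting it by correlation, can be sketched as follows. This is a generic spread-spectrum illustration, not the authors' algorithm; strength, subband choice and the detector statistic are assumptions.

    ```python
    import numpy as np
    import pywt

    def embed_watermark(image, strength=0.05, seed=42):
        # Additive watermark in the horizontal detail coefficients of one DWT level.
        cA, (cH, cV, cD) = pywt.dwt2(image.astype(float), "haar")
        mark = np.random.default_rng(seed).standard_normal(cH.shape)
        cH_w = cH + strength * np.abs(cH) * mark    # scale mark by local features
        return pywt.idwt2((cA, (cH_w, cV, cD)), "haar"), mark

    def detect_watermark(image, mark):
        # Normalized correlation; the decision threshold would follow a
        # Neyman-Pearson criterion on the false-alarm rate.
        _, (cH, _, _) = pywt.dwt2(image.astype(float), "haar")
        return float(np.sum(cH * mark) / np.sqrt(np.sum(mark ** 2)))
    ```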

  12. Effect of preprocessing olive storage conditions on virgin olive oil quality and composition.

    Science.gov (United States)

    Inarejos-García, Antonio M; Gómez-Rico, Aurora; Desamparados Salvador, M; Fregapane, Giuseppe

    2010-04-28

    The quality of virgin olive oil (VOO) is intimately related to the characteristics and composition of the olive fruit at the moment of its milling. In this study, the determination of suitable olive storage conditions and feasibility of using this preprocessing operation to modulate the sensory taste of VOO are reported. Several olive batches were stored in different conditions (from monolayer up to 60 cm thickness, at 20 and 10 degrees C) for a period of up to three weeks, and the quality and composition of minor constituents, mainly phenols and volatiles, in the corresponding VOO were monitored. Cornicabra cultivar VOO obtained from drupes stored for 5 or 8 days at 20 or 10 degrees C, respectively, retained the "extra virgin" category, according to chemical quality indices, since only small increases in free acidity and peroxide values were observed, and the bitter index of this monovarietal oil was reduced by 30-40%. Storage under monolayer conditions at 10 degrees C for up to two weeks is also feasible because "off-odor" development was delayed, a 50% reduction in bitterness was obtained, and the overall good quality of the final product was preserved.

  13. An Application for Data Preprocessing and Models Extractions in Web Usage Mining

    Directory of Open Access Journals (Sweden)

    Claudia Elena DINUCA

    2011-11-01

    Full Text Available Web servers worldwide generate a vast amount of information on web users’ browsing activities. Several researchers have studied these so-called clickstream or web access log data to better understand and characterize web users. The goal of this application is to analyze user behaviour by mining enriched web access log data. With the continued growth and proliferation of e-commerce, Web services, and Web-based information systems, the volumes of clickstream and user data collected by Web-based organizations in their daily operations have reached astronomical proportions. This information can be exploited in various ways, such as enhancing the effectiveness of websites or developing directed web marketing campaigns. The discovered patterns are usually represented as collections of pages, objects, or resources that are frequently accessed by groups of users with common needs or interests. In this paper we focus on how the application for data preprocessing and for extracting different data models from web log data was implemented, using association rules as a data mining technique to extract potentially useful knowledge from web usage data. We find data models of navigation patterns by analysing the log files of the website. The application was implemented in Java using the NetBeans IDE. For exemplification, we used log file data from a commercial web site, www.nice-layouts.com.

  14. The PREP pipeline: standardized preprocessing for large-scale EEG analysis.

    Science.gov (United States)

    Bigdely-Shamlo, Nima; Mullen, Tim; Kothe, Christian; Su, Kyung-Min; Robbins, Kay A

    2015-01-01

    The technology to collect brain imaging and physiological measures has become portable and ubiquitous, opening the possibility of large-scale analysis of real-world human imaging. By its nature, such data is large and complex, making automated processing essential. This paper shows how lack of attention to the very early stages of an EEG preprocessing pipeline can reduce the signal-to-noise ratio and introduce unwanted artifacts into the data, particularly for computations done in single precision. We demonstrate that ordinary average referencing improves the signal-to-noise ratio, but that noisy channels can contaminate the results. We also show that identification of noisy channels depends on the reference and examine the complex interaction of filtering, noisy channel identification, and referencing. We introduce a multi-stage robust referencing scheme to deal with the noisy channel-reference interaction. We propose a standardized early-stage EEG processing pipeline (PREP) and discuss the application of the pipeline to more than 600 EEG datasets. The pipeline includes an automatically generated report for each dataset processed. Users can download the PREP pipeline as a freely available MATLAB library from http://eegstudy.org/prepcode.
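
    The interaction PREP addresses, noisy channels contaminating the very reference used to find them, motivates referencing against robustly selected channels. The sketch below is a drastically simplified stand-in for PREP's multi-stage scheme (MAD-based channel rejection, then an ordinary average reference):

    ```python
    import numpy as np

    def robust_average_reference(eeg, z_thresh=5.0):
        # eeg: channels x samples. Exclude channels with outlying amplitude
        # (robust z-score of the per-channel std), then average-reference.
        spread = eeg.std(axis=1)
        mad = np.median(np.abs(spread - np.median(spread))) + 1e-12
        z = (spread - np.median(spread)) / (1.4826 * mad)
        good = np.abs(z) < z_thresh
        reference = eeg[good].mean(axis=0)
        return eeg - reference, np.where(~good)[0]   # data, noisy channel indices
    ```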

  15. Robust preprocessing for stimulus-based functional MRI of the moving fetus.

    Science.gov (United States)

    You, Wonsang; Evangelou, Iordanis E; Zun, Zungho; Andescavage, Nickie; Limperopoulos, Catherine

    2016-04-01

    Fetal motion manifests as signal degradation and image artifact in the acquired time series of blood oxygen level dependent (BOLD) functional magnetic resonance imaging (fMRI) studies. We present a robust preprocessing pipeline to specifically address fetal and placental motion-induced artifacts in stimulus-based fMRI with slowly cycled block design in the living fetus. In the proposed pipeline, motion correction is optimized to the experimental paradigm, and it is performed separately in each phase as well as in each region of interest (ROI), recognizing that each phase and organ experiences different types of motion. To obtain the averaged BOLD signals for each ROI, both misaligned volumes and noisy voxels are automatically detected and excluded, and the missing data are then imputed by statistical estimation based on local polynomial smoothing. Our experimental results demonstrate that the proposed pipeline was effective in mitigating the motion-induced artifacts in stimulus-based fMRI data of the fetal brain and placenta.

  16. A data preprocessing strategy for metabolomics to reduce the mask effect in data analysis

    Directory of Open Access Journals (Sweden)

    Jun Yang

    2015-02-01

    Full Text Available Metabolomics is a booming research field. Its success highly relies on the discovery of differential metabolites by comparing different data sets (for example, patients vs. controls). One of the challenges is that differences in the low-abundance metabolites between groups are often masked by the high variation of abundant metabolites. In order to address this challenge, a novel data preprocessing strategy consisting of three steps was proposed in this study. In step 1, a ‘modified 80%’ rule was used to reduce the effect of missing values; in step 2, unit-variance and Pareto scaling methods were used to reduce the mask effect from the abundant metabolites; in step 3, in order to fix the adverse effect of scaling, stability information of the variables, deduced from the intensity and class information, was used to assign suitable weights to the variables. When applied to an LC/MS-based metabolomics dataset from a chronic hepatitis B patient study and two simulated datasets, the mask effect was found to be partially eliminated and several new low-abundance differential metabolites were rescued.
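
    Step 2 contrasts two standard scalings, whose difference is just the exponent on the standard deviation; a minimal sketch (rows = samples, columns = metabolite features):

    ```python
    import numpy as np

    def autoscale(X):
        # Unit-variance scaling: every metabolite gets equal weight.
        return (X - X.mean(axis=0)) / (X.std(axis=0) + 1e-12)

    def pareto_scale(X):
        # Pareto scaling: divide by sqrt(std), damping (not removing) the
        # dominance of highly abundant, highly varying metabolites.
        return (X - X.mean(axis=0)) / (np.sqrt(X.std(axis=0)) + 1e-12)
    ```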

  17. The Impact of the Preprocessing Methods in Downstream Analysis of Agilent Microarray Data

    Directory of Open Access Journals (Sweden)

    Loredana BĂLĂCESCU

    2015-12-01

    Full Text Available Over the past decades, gene expression microarrays have been used extensively in biomedical research. However, these high-throughput experiments are affected by technical variation and biases introduced at different levels, such as mRNA processing, labeling, hybridization, scanning and/or imaging. Therefore, data preprocessing is important to minimize these systematic errors in order to identify actual biological changes. The aim of this study was to compare all possible combinations of two normalization, four summarization, and two background correction options, using two different foreground estimates. The results show that the background correction of the raw median signal and the summarization methods used here have no impact on downstream analysis. In contrast, the choice of the normalization method influences the results, with quantile normalization leading to better biological sensitivity of the data. When the Agilent processed signal was considered, regardless of the summarization and normalization options, consistently more differentially expressed genes (DEGs) were identified than when the raw median signal was used. Nevertheless, the greater number of DEGs did not result in an improvement of the biological relevance.
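
    Quantile normalization, the method favoured above, forces every array to share one intensity distribution; a compact numpy sketch (ties are broken arbitrarily, which generic implementations usually accept):

    ```python
    import numpy as np

    def quantile_normalize(X):
        # Columns = arrays. Each value is replaced by the mean, across arrays,
        # of the values having the same rank.
        ranks = np.argsort(np.argsort(X, axis=0), axis=0)
        reference = np.sort(X, axis=0).mean(axis=1)   # shared target distribution
        return reference[ranks]

    X = np.random.default_rng(6).lognormal(size=(1000, 4))  # toy probe intensities
    Xn = quantile_normalize(X)
    ```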

  18. Optimized data preprocessing for multivariate analysis applied to 99mTc-ECD SPECT data sets of Alzheimer's patients and asymptomatic controls.

    Science.gov (United States)

    Merhof, Dorit; Markiewicz, Pawel J; Platsch, Günther; Declerck, Jerome; Weih, Markus; Kornhuber, Johannes; Kuwert, Torsten; Matthews, Julian C; Herholz, Karl

    2011-01-01

    Multivariate image analysis has shown potential for classification between Alzheimer's disease (AD) patients and healthy controls with high diagnostic performance. As image analysis of positron emission tomography (PET) and single photon emission computed tomography (SPECT) data critically depends on appropriate data preprocessing, the focus of this work is to investigate the impact of data preprocessing on the outcome of the analysis, and to identify an optimal data preprocessing method. In this work, technetium-99m ethyl cysteinate dimer ((99m)Tc-ECD) SPECT data sets of 28 AD patients and 28 asymptomatic controls were used for the analysis. For a series of different data preprocessing methods, which includes methods for spatial normalization, smoothing, and intensity normalization, multivariate image analysis based on principal component analysis (PCA) and Fisher discriminant analysis (FDA) was applied. Bootstrap resampling was used to investigate the robustness of the analysis and the classification accuracy, depending on the data preprocessing method. Depending on the combination of preprocessing methods, significant differences regarding the classification accuracy were observed. For (99m)Tc-ECD SPECT data, the optimal data preprocessing method in terms of robustness and classification accuracy is based on affine registration, smoothing with a Gaussian of 12 mm full width at half maximum, and intensity normalization based on the 25% brightest voxels within the whole-brain region.

  19. Quality assessment of baby food made of different pre-processed organic raw materials under industrial processing conditions.

    Science.gov (United States)

    Seidel, Kathrin; Kahl, Johannes; Paoletti, Flavio; Birlouez, Ines; Busscher, Nicolaas; Kretzschmar, Ursula; Särkkä-Tirkkonen, Marjo; Seljåsen, Randi; Sinesio, Fiorella; Torp, Torfinn; Baiamonte, Irene

    2015-02-01

    The market for processed food is rapidly growing. The industry needs methods for "processing with care" leading to high quality products in order to meet consumers' expectations. Processing influences the quality of the finished product through various factors. In carrot baby food, these are the raw material, the pre-processing and storage treatments as well as the processing conditions. In this study, a quality assessment was performed on baby food made from different pre-processed raw materials. The experiments were carried out under industrial conditions using fresh, frozen and stored organic carrots as raw material. Statistically significant differences were found for sensory attributes among the three autoclaved puree samples (e.g. overall odour F = 90.72, p food.

  20. lop-DWI: A Novel Scheme for Pre-Processing of Diffusion Weighted Images in the Gradient Direction Domain

    Directory of Open Access Journals (Sweden)

    Farshid Sepehrband

    2015-01-01

    Full Text Available We describe and evaluate a pre-processing method based on a periodic spiral sampling of diffusion gradient directions for high angular resolution diffusion magnetic resonance imaging. Our pre-processing method incorporates prior knowledge about the acquired diffusion-weighted signal, facilitating noise reduction. Periodic spiral sampling of gradient direction encodings results in an acquired signal in each voxel that is pseudo periodic with characteristics that allow separation of low-frequency signal from high frequency noise. Consequently, it enhances local reconstruction of the orientation distribution function used to define fibre tracks in the brain. Denoising with periodic spiral sampling was tested using synthetic data and in vivo human brain images. The level of improvement in signal-to-noise ratio and in the accuracy of local reconstruction of fibre tracks was significantly improved using our method.

  1. lop-DWI: A Novel Scheme for Pre-Processing of Diffusion-Weighted Images in the Gradient Direction Domain.

    Science.gov (United States)

    Sepehrband, Farshid; Choupan, Jeiran; Caruyer, Emmanuel; Kurniawan, Nyoman D; Gal, Yaniv; Tieng, Quang M; McMahon, Katie L; Vegh, Viktor; Reutens, David C; Yang, Zhengyi

    2014-01-01

    We describe and evaluate a pre-processing method based on a periodic spiral sampling of diffusion-gradient directions for high angular resolution diffusion magnetic resonance imaging. Our pre-processing method incorporates prior knowledge about the acquired diffusion-weighted signal, facilitating noise reduction. Periodic spiral sampling of gradient direction encodings results in an acquired signal in each voxel that is pseudo-periodic with characteristics that allow separation of low-frequency signal from high frequency noise. Consequently, it enhances local reconstruction of the orientation distribution function used to define fiber tracks in the brain. Denoising with periodic spiral sampling was tested using synthetic data and in vivo human brain images. The level of improvement in signal-to-noise ratio and in the accuracy of local reconstruction of fiber tracks was significantly improved using our method.

  2. Data Preprocessing of Web Log Mining

    Institute of Scientific and Technical Information of China (English)

    何波; 涂飞; 程勇军

    2011-01-01

    Data preprocessing plays an essential role in the process of Web log mining. This paper analyses the main steps of Web log mining data preprocessing and designs the key algorithms for three of these steps: user identification, session identification and path completion. Experimental results show that the key algorithms are effective.
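
    Two of these steps are easy to make concrete: user identification (same IP but a different user agent implies a different user) and session identification (split a user's clicks on an inactivity timeout). The log format, field choices and 30-minute timeout below are common-practice assumptions, not this paper's exact algorithms.

    ```python
    from datetime import datetime, timedelta

    SESSION_TIMEOUT = timedelta(minutes=30)

    # toy log entries: (ip, user_agent, timestamp, url)
    log = [
        ("10.0.0.1", "Mozilla", datetime(2011, 5, 1, 10, 0), "/index"),
        ("10.0.0.1", "Mozilla", datetime(2011, 5, 1, 10, 5), "/news"),
        ("10.0.0.1", "Opera",   datetime(2011, 5, 1, 10, 6), "/index"),
        ("10.0.0.1", "Mozilla", datetime(2011, 5, 1, 11, 0), "/contact"),
    ]

    users = {}                                   # user identification
    for ip, agent, ts, url in sorted(log, key=lambda r: r[2]):
        users.setdefault((ip, agent), []).append((ts, url))

    sessions = []                                # session identification
    for clicks in users.values():
        current = [clicks[0]]
        for prev, cur in zip(clicks, clicks[1:]):
            if cur[0] - prev[0] > SESSION_TIMEOUT:
                sessions.append(current)
                current = []
            current.append(cur)
        sessions.append(current)

    print(len(users), "users,", len(sessions), "sessions")   # 2 users, 3 sessions
    ```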

  3. PRACTICAL RECOMMENDATIONS OF DATA PREPROCESSING AND GEOSPATIAL MEASURES FOR OPTIMIZING THE NEUROLOGICAL AND OTHER PEDIATRIC EMERGENCIES MANAGEMENT

    Directory of Open Access Journals (Sweden)

    Ionela MANIU

    2017-08-01

    Full Text Available Time management, optimal and timed determination of emergency severity as well as optimizing the use of available human and material resources are crucial areas of emergency services. A starting point for achieving these optimizations can be considered the analysis and preprocess of real data from the emergency services. The benefits of performing this method consist in exposing more useful structures to data modelling algorithms which consequently will reduce overfitting and improves accuracy. This paper aims to offer practical recommendations for data preprocessing measures including feature selection and discretization of numeric attributes regarding age, duration of the case, season, period, week period (workday, weekend and geospatial location of neurological and other pediatric emergencies. An analytical, retrospective study was conducted on a sample consisting of 933 pediatric cases, from UPU-SMURD Sibiu, 01.01.2014 – 27.02.2017 period.

  4. Impact of functional MRI data preprocessing pipeline on default-mode network detectability in patients with disorders of consciousness

    Directory of Open Access Journals (Sweden)

    Adrian eAndronache

    2013-08-01

    Full Text Available An emerging application of resting-state functional MRI is the study of patients with disorders of consciousness (DoC), where the integrity of default-mode network (DMN) activity is associated with the clinical level of preservation of consciousness. Due to the inherent inability to follow verbal instructions, arousal induced by scanning noise, and postural pain, these patients tend to exhibit substantial levels of movement. This results in spurious, non-neural fluctuations of the blood-oxygen level-dependent (BOLD) signal, which impair the evaluation of residual functional connectivity. Here, the effect of data preprocessing choices on the detectability of the DMN was systematically evaluated in a representative cohort of 30 clinically and etiologically heterogeneous DoC patients and 33 healthy controls. Starting from a standard preprocessing pipeline, additional steps were gradually inserted, namely band-pass filtering, removal of co-variance with the movement vectors, removal of co-variance with the global brain parenchyma signal, rejection of realignment outlier volumes and ventricle masking. Both independent-component analysis (ICA) and seed-based analysis (SBA) were performed, and DMN detectability was assessed quantitatively as well as visually. The results of the present study strongly show that the detection of DMN activity in the sub-optimal fMRI series acquired on DoC patients is contingent on the use of adequate filtering steps. ICA and SBA are differently affected but give convergent findings for high-grade preprocessing. We propose that future studies in this area should adopt the described preprocessing procedures as a minimum standard to reduce the probability of wrongly inferring that DMN activity is absent.
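
    Two of the preprocessing steps evaluated here, band-pass filtering and removal of co-variance with the movement vectors, can be sketched with standard tools. A sketch assuming a typical resting-state pass band and filter order, not necessarily the study's settings:

        import numpy as np
        from scipy.signal import butter, filtfilt

        def preprocess_bold(ts, motion, tr=2.0, low=0.01, high=0.1):
            # ts: (T, V) BOLD array, one column per voxel;
            # motion: (T, 6) realignment parameters.
            nyq = 0.5 / tr
            b, a = butter(2, [low / nyq, high / nyq], btype='band')
            ts = filtfilt(b, a, ts, axis=0)            # zero-phase band-pass
            X = np.column_stack([np.ones(len(motion)), motion])
            beta, *_ = np.linalg.lstsq(X, ts, rcond=None)
            return ts - X @ beta                       # regress out movement

        rng = np.random.default_rng(1)
        T, V = 200, 10
        motion = np.cumsum(rng.standard_normal((T, 6)) * 0.01, axis=0)
        bold = rng.standard_normal((T, V)) + motion[:, :1]   # motion-contaminated
        cleaned = preprocess_bold(bold, motion)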

  5. A signal pre-processing algorithm designed for the needs of hardware implementation of neural classifiers used in condition monitoring

    DEFF Research Database (Denmark)

    Dabrowski, Dariusz; Hashemiyan, Zahra; Adamczyk, Jan

    2015-01-01

    and bucket wheel excavators. In this paper, a signal pre-processing algorithm designed for condition monitoring of planetary gears working in non-stationary operation is presented. The algorithm is dedicated for hardware implementation on Field Programmable Gate Arrays (FPGAs). The purpose of the algorithm......%, it can be performed in real-time conditions and its implementation does not require many resources of FPGAs....

  6. Impact of functional MRI data preprocessing pipeline on default-mode network detectability in patients with disorders of consciousness.

    Science.gov (United States)

    Andronache, Adrian; Rosazza, Cristina; Sattin, Davide; Leonardi, Matilde; D'Incerti, Ludovico; Minati, Ludovico

    2013-01-01

    An emerging application of resting-state functional MRI (rs-fMRI) is the study of patients with disorders of consciousness (DoC), where the integrity of default-mode network (DMN) activity is associated with the clinical level of preservation of consciousness. Due to the inherent inability to follow verbal instructions, arousal induced by scanning noise, and postural pain, these patients tend to exhibit substantial levels of movement. This results in spurious, non-neural fluctuations of the rs-fMRI signal, which impair the evaluation of residual functional connectivity. Here, the effect of data preprocessing choices on the detectability of the DMN was systematically evaluated in a representative cohort of 30 clinically and etiologically heterogeneous DoC patients and 33 healthy controls. Starting from a standard preprocessing pipeline, additional steps were gradually inserted, namely band-pass filtering (BPF), removal of co-variance with the movement vectors, removal of co-variance with the global brain parenchyma signal, rejection of realignment outlier volumes and ventricle masking. Both independent-component analysis (ICA) and seed-based analysis (SBA) were performed, and DMN detectability was assessed quantitatively as well as visually. The results of the present study strongly show that the detection of DMN activity in the sub-optimal fMRI series acquired on DoC patients is contingent on the use of adequate filtering steps. ICA and SBA are differently affected but give convergent findings for high-grade preprocessing. We propose that future studies in this area should adopt the described preprocessing procedures as a minimum standard to reduce the probability of wrongly inferring that DMN activity is absent.

  7. The Effect of LC-MS Data Preprocessing Methods on the Selection of Plasma Biomarkers in Fed vs. Fasted Rats.

    Science.gov (United States)

    Gürdeniz, Gözde; Kristensen, Mette; Skov, Thomas; Dragsted, Lars O

    2012-01-18

    The metabolic composition of plasma is affected by the time passed since the last meal and by individual variation in metabolite clearance rates. Rat plasma in fed and fasted states was analyzed with liquid chromatography quadrupole time-of-flight mass spectrometry (LC-QTOF) for an untargeted investigation of these metabolite patterns. The dataset was used to investigate the effect of data preprocessing on biomarker selection using three different software packages, MarkerLynx™, MZmine and XCMS, along with a customized preprocessing method that performs binning of m/z channels followed by summation through retention time. Direct comparison of selected features representing the fed or fasted state showed large differences between the packages. Many false positive markers were obtained from the custom data preprocessing compared with the dedicated packages, while MarkerLynx™ provided better coverage of markers. However, marker selection was more reliable with the gap-filling (or peak-finding) algorithms present in MZmine and XCMS. Further identification of the putative markers revealed that many of the differences between the markers selected were due to variations in features representing adducts or daughter ions of the same metabolites, or compounds from the same chemical subclasses, e.g., lyso-phosphatidylcholines (LPCs) and lyso-phosphatidylethanolamines (LPEs). We conclude that despite considerable differences in the performance of the preprocessing tools, we could extract the same biological information with any of them. Carnitine, branched-chain amino acids, LPCs and LPEs were identified by all methods as markers of the fed state, whereas acetylcarnitine was abundant during fasting in rats.
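
    The customized preprocessing the authors compared against, binning of m/z channels followed by summation through retention time, reduces to a histogram-style accumulation. A sketch with an arbitrarily chosen bin width:

        import numpy as np

        def bin_and_sum(mz, intensity, mz_step=0.01):
            # Accumulate all peak intensities into fixed-width m/z bins; summing
            # over the whole run collapses the retention-time dimension.
            mz = np.asarray(mz, dtype=float)
            edges = np.arange(mz.min(), mz.max() + mz_step, mz_step)
            idx = np.clip(np.digitize(mz, edges) - 1, 0, len(edges) - 1)
            profile = np.zeros(len(edges))
            np.add.at(profile, idx, intensity)
            return edges, profile

        mz = [100.004, 100.006, 250.013, 250.018, 399.990]
        inten = [10.0, 5.0, 7.0, 3.0, 1.0]
        edges, profile = bin_and_sum(mz, inten)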

  8. HapMap filter 1.0: A tool to preprocess the HapMap genotypic data for association studies

    OpenAIRE

    2008-01-01

    The International HapMap Project provides a resource of genotypic data on single nucleotide polymorphisms (SNPs), which can be used in various association studies to identify the genetic determinants for phenotypic variations. Prior to the association studies, the HapMap dataset should be preprocessed in order to reduce the computation time and control the multiple testing problem. The less informative SNPs including those with very low genotyping rate and SNPs with rare minor allele frequenc...

  9. Performance Comparison of Several Pre-Processing Methods in a Hand Gesture Recognition System based on Nearest Neighbor for Different Background Conditions

    Directory of Open Access Journals (Sweden)

    Iwan Setyawan

    2012-12-01

    Full Text Available This paper presents a performance analysis and comparison of several pre-processing methods used in a hand gesture recognition system. The pre-processing methods are based on combinations of several image processing operations, namely edge detection, low-pass filtering, histogram equalization, thresholding and desaturation. The hand gesture recognition system is designed to classify an input image into one of six possible classes. The input images are taken against various background conditions. Our experiments showed that the best result, a classification accuracy of up to 83.15%, is achieved when the pre-processing consists of only a desaturation operation.
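
    Desaturation itself is a one-line operation. A sketch using the common ITU-R BT.601 luma weights; the paper does not state which grayscale conversion was used.

        import numpy as np

        def desaturate(rgb):
            # rgb: (H, W, 3) float image in [0, 1]; returns an (H, W) gray image.
            weights = np.array([0.299, 0.587, 0.114])   # BT.601 luma (assumed)
            return rgb @ weights

        img = np.random.default_rng(2).random((4, 4, 3))
        gray = desaturate(img)   # ready to be fed to the nearest neighbor classifier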

  10. Novel low-power ultrasound digital preprocessing architecture for wireless display.

    Science.gov (United States)

    Levesque, Philippe; Sawan, Mohamad

    2010-03-01

    A complete hardware-based ultrasound preprocessing unit (PPU) is presented as an alternative to available power-hungry devices. Intended to expand ultrasonic applications, the proposed unit allows replacement of the ultrasonic probe cable by a wireless link to transfer data from the probe to a remote monitor. The digital back-end architecture of this PPU is fully pipelined, which permits sampling of ultrasonic signals at a frequency equal to the field-programmable gate array (FPGA) based system clock, up to 100 MHz. Experimental results show that the proposed processing unit has excellent performance, an equivalent 53.15 Dhrystone 2.1 MIPS/MHz (DMIPS/MHz), compared with other software-based architectures that allow a maximum of 1.6 DMIPS/MHz. In addition, an adaptive subsampling method is proposed to operate the pixel compressor, which allows real-time image zooming and, by removing high-frequency noise, enhances the lateral and axial resolutions by 25% and 33%, respectively. Real-time images, acquired from a reference phantom, validated the feasibility of the proposed architecture. For a display rate of 15 frames per second and a 5-MHz single-element piezoelectric transducer, the proposed digital PPU requires a dynamic power of only 242 mW, which represents around 20% of the best available software-based system. Furthermore, composed of the ultrasound processor and the image interpolation unit, the digital processing core of the PPU presents good power-performance ratios of 26 DMIPS/mW and 43.9 DMIPS/mW at 20-MHz and 100-MHz sample frequencies, respectively.

  11. Functional MRI preprocessing in lesioned brains: manual versus automated region of interest analysis

    Directory of Open Access Journals (Sweden)

    Kathleen A Garrison

    2015-09-01

    Full Text Available Functional magnetic resonance imaging has significant potential in the study and treatment of neurological disorders and stroke. Region of interest (ROI) analysis in such studies allows for testing of strong a priori clinical hypotheses with improved statistical power. A commonly used automated approach to ROI analysis is to spatially normalize each participant’s structural brain image to a template brain image and define ROIs using an atlas. However, in studies of individuals with structural brain lesions such as stroke, the gold standard approach may be to manually hand-draw ROIs on each participant’s non-normalized structural brain image. Automated approaches to ROI analysis are faster and more standardized, yet are susceptible to preprocessing error (e.g., normalization error) that can be greater in lesioned brains. The manual approach to ROI analysis has high demand for time and expertise but may provide a more accurate estimate of brain response. In this study, we directly compare commonly used automated and manual approaches to ROI analysis by reanalyzing data from a previously published hypothesis-driven cognitive fMRI study involving individuals with stroke. The ROI evaluated is the pars opercularis of the inferior frontal gyrus. We found a significant difference in task-related effect size and percent activated voxels in this ROI between the automated and manual approaches to ROI analysis. Task interactions, however, were consistent across ROI analysis approaches. These findings support the use of automated approaches to ROI analysis in studies of lesioned brains, provided they employ a task interaction design.

  12. MeteoIO 2.4.2: a preprocessing library for meteorological data

    Directory of Open Access Journals (Sweden)

    M. Bavay

    2014-12-01

    Full Text Available Using numerical models which require large meteorological data sets is sometimes difficult and problems can often be traced back to the Input/Output functionality. Complex models are usually developed by the environmental sciences community with a focus on the core modelling issues. As a consequence, the I/O routines that are costly to properly implement are often error-prone, lacking flexibility and robustness. With the increasing use of such models in operational applications, this situation ceases to be simply uncomfortable and becomes a major issue. The MeteoIO library has been designed for the specific needs of numerical models that require meteorological data. The whole task of data preprocessing has been delegated to this library, namely retrieving, filtering and resampling the data if necessary as well as providing spatial interpolations and parameterizations. The focus has been to design an Application Programming Interface (API) that (i) provides a uniform interface to meteorological data in the models, (ii) hides the complexity of the processing taking place, and (iii) guarantees a robust behaviour in the case of format errors, erroneous or missing data. Moreover, in an operational context, this error handling should avoid unnecessary interruptions in the simulation process. A strong emphasis has been put on simplicity and modularity in order to make it extremely easy to support new data formats or protocols and to allow contributors with diverse backgrounds to participate. This library is also regularly evaluated for computing performance and further optimized where necessary. Finally, it is released under an Open Source license and is available at http://models.slf.ch/p/meteoio. This paper gives an overview of the MeteoIO library from the point of view of conceptual design, architecture, features and computational performance. A scientific evaluation of the produced results is not given here since the scientific algorithms that are used
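
    The kind of preprocessing that MeteoIO takes off the model's hands, filtering out-of-range values and resampling across gaps, looks roughly like this in pandas terms. MeteoIO itself is a C++ library; this is only an illustration of the processing chain, with invented data and thresholds.

        import numpy as np
        import pandas as pd

        # Hypothetical 30-minute air-temperature series with a spike and a gap.
        idx = pd.date_range('2014-01-01', periods=10, freq='30min')
        ta = pd.Series([1.2, 1.1, 45.0, 1.0, np.nan, 0.8, 0.7, 0.9, 1.0, 1.1], index=idx)

        ta = ta.where(ta.between(-50, 40))   # min/max filter: reject impossible values
        ta = ta.interpolate(method='time')   # fill the gap by temporal interpolation
        hourly = ta.resample('1h').mean()    # deliver at the model's time step
        print(hourly)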

  13. Hardware Design and Implementation of a Wavelet De-Noising Procedure for Medical Signal Preprocessing

    Directory of Open Access Journals (Sweden)

    Szi-Wen Chen

    2015-10-01

    Full Text Available In this paper, a discrete wavelet transform (DWT) based de-noising method and its application to noise reduction for medical signal preprocessing are introduced. This work focuses on the hardware realization of a real-time wavelet de-noising procedure. The proposed de-noising circuit mainly consists of three modules: a DWT, a thresholding, and an inverse DWT (IDWT) modular circuit. We also proposed a novel adaptive thresholding scheme and incorporated it into our wavelet de-noising procedure. Performance was then evaluated on both the software and the hardware architectural designs. In addition, the de-noising circuit was implemented by downloading the Verilog codes to a field programmable gate array (FPGA) based platform so that its ability in noise reduction could be further validated in actual practice. Simulation results produced by applying a set of simulated noise-contaminated electrocardiogram (ECG) signals to the de-noising circuit showed that the circuit could not only meet the requirement of real-time processing, but also achieve satisfactory performance for noise reduction, while the sharp features of the ECG signals are well preserved. The proposed de-noising circuit was further synthesized using the Synopsys Design Compiler with an Artisan Taiwan Semiconductor Manufacturing Company (TSMC, Hsinchu, Taiwan) 40 nm standard cell library. The integrated circuit (IC) synthesis simulation results showed that the proposed design can achieve a clock frequency of 200 MHz with a power consumption of only 17.4 mW when operated at 200 MHz.
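
    A software counterpart of the DWT-thresholding-IDWT chain can be written with PyWavelets. The universal soft threshold below is a standard textbook choice standing in for the paper's hardware-oriented adaptive scheme:

        import numpy as np
        import pywt

        def wavelet_denoise(x, wavelet='db4', level=4):
            coeffs = pywt.wavedec(x, wavelet, level=level)
            sigma = np.median(np.abs(coeffs[-1])) / 0.6745      # noise estimate
            thr = sigma * np.sqrt(2 * np.log(len(x)))           # universal threshold
            coeffs = [coeffs[0]] + [pywt.threshold(c, thr, mode='soft')
                                    for c in coeffs[1:]]
            return pywt.waverec(coeffs, wavelet)[:len(x)]

        rng = np.random.default_rng(3)
        t = np.linspace(0, 1, 512)
        ecg_like = np.sin(2 * np.pi * 5 * t) + 0.2 * rng.standard_normal(512)
        denoised = wavelet_denoise(ecg_like)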

  14. Detection of epileptic seizure in EEG signals using linear least squares preprocessing.

    Science.gov (United States)

    Roshan Zamir, Z

    2016-09-01

    An epileptic seizure is a transient event of abnormal excessive neuronal discharge in the brain. This unwanted event can be obstructed by detection of the electrical changes in the brain that happen before the seizure takes place. The automatic detection of seizures is necessary since the visual screening of EEG recordings is a time-consuming task and requires experts to improve the diagnosis. Much of the prior research in detection of seizures has been based on artificial neural networks, genetic programming, and wavelet transforms. Although the highest achieved classification accuracy is 100%, there are drawbacks, such as the existence of unbalanced datasets and the lack of investigation of performance consistency. To address these, four linear least squares-based preprocessing models are proposed to extract key features of an EEG signal in order to detect seizures. The first two models are newly developed. The original signal (EEG) is approximated by a sinusoidal curve. Its amplitude is formed by a polynomial function and compared with the predeveloped spline function. Different statistical measures, namely classification accuracy, true positive and negative rates, false positive and negative rates and precision, are utilised to assess the performance of the proposed models. These metrics are derived from confusion matrices obtained from classifiers. Different classifiers are used over the original dataset and the set of extracted features. The proposed models significantly reduce the dimension of the classification problem and the computational time while the classification accuracy is improved in most cases. The first and third models are promising feature extraction methods with a classification accuracy of 100%. Logistic, LazyIB1, LazyIB5, and J48 are the best classifiers. Their true positive and negative rates are 1 while false positive and negative rates are 0 and the corresponding precision values are 1. Numerical results suggest that these
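
    The first two models approximate the EEG by a sinusoid whose amplitude is a polynomial; once a frequency is fixed, the model is linear in its coefficients, so ordinary least squares applies. A sketch under that reading (the paper's exact parameterization and frequency choice are not given in the abstract):

        import numpy as np

        def sinusoid_poly_features(x, omega, degree=2):
            # Fit x(t) ~ sum_k t**k * (a_k sin(wt) + b_k cos(wt)) and return the
            # coefficients (features) plus the fitted approximation.
            t = np.linspace(0.0, 1.0, len(x))
            cols = [t**k * f(omega * t) for k in range(degree + 1)
                    for f in (np.sin, np.cos)]
            A = np.column_stack(cols)
            coef, *_ = np.linalg.lstsq(A, x, rcond=None)
            return coef, A @ coef

        rng = np.random.default_rng(4)
        t = np.linspace(0, 1, 400)
        eeg = (1 + 2 * t) * np.sin(2 * np.pi * 10 * t) + 0.3 * rng.standard_normal(400)
        coef, fit = sinusoid_poly_features(eeg, omega=2 * np.pi * 10)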

  15. New supervised alignment method as a preprocessing tool for chromatographic data in metabolomic studies.

    Science.gov (United States)

    Struck, Wiktoria; Wiczling, Paweł; Waszczuk-Jankowska, Małgorzata; Kaliszan, Roman; Markuszewski, Michał Jan

    2012-09-21

    The purpose of this work was to develop a new alignment algorithm, called supervised alignment, and to compare its performance with correlation optimized warping. The supervised alignment is based on a "supervised" selection of a few common peaks present in each chromatogram. The selected peaks are aligned based on the difference in the retention times of the selected analytes between the sample and the reference chromatogram. The retention times of the fragments between the known peaks are subsequently linearly interpolated. The performance of the proposed algorithm has been tested on a series of simulated and experimental chromatograms. The simulated chromatograms comprised analytes with systematic or random retention time shifts. The experimental chromatographic (RP-HPLC) data were obtained during the analysis of nucleosides from 208 urine samples and contain both systematic and random displacements. All the data sets were aligned using correlation optimized warping and the supervised alignment. The time required to complete the alignment, the overall complexity of both algorithms, and their performance measured by the average correlation coefficients were compared to assess the tested methods. In the case of systematic shifts, both methods lead to successful alignment. However, for random shifts, correlation optimized warping requires more time than the supervised alignment (a few hours versus a few minutes) and the quality of the alignment, described as the correlation coefficient of the newly aligned matrix, is worse (0.8593 versus 0.9629). For the experimental dataset, the supervised alignment successfully aligned the 208 samples using 10 previously identified peaks. Knowledge of the retention times of a few analytes in the data sets is necessary to perform the supervised alignment for both systematic and random shifts. The supervised alignment method is a faster, more effective and simpler preprocessing method than correlation optimized warping.
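
    The mechanics of the method are compact: map the retention times of the selected anchor peaks in the sample onto their positions in the reference, and linearly interpolate everything in between. A sketch (np.interp clamps points outside the outermost anchors):

        import numpy as np

        def supervised_align(rt, anchors_sample, anchors_ref):
            # rt: retention times of the sample's data points;
            # anchors_*: retention times of the same known peaks in the
            # sample and in the reference chromatogram (sorted ascending).
            return np.interp(rt, anchors_sample, anchors_ref)

        rt = np.linspace(0, 20, 9)
        aligned_rt = supervised_align(rt, anchors_sample=[2.0, 10.5, 18.0],
                                      anchors_ref=[2.2, 10.0, 18.3])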

  16. Functional MRI Preprocessing in Lesioned Brains: Manual Versus Automated Region of Interest Analysis.

    Science.gov (United States)

    Garrison, Kathleen A; Rogalsky, Corianne; Sheng, Tong; Liu, Brent; Damasio, Hanna; Winstein, Carolee J; Aziz-Zadeh, Lisa S

    2015-01-01

    Functional magnetic resonance imaging (fMRI) has significant potential in the study and treatment of neurological disorders and stroke. Region of interest (ROI) analysis in such studies allows for testing of strong a priori clinical hypotheses with improved statistical power. A commonly used automated approach to ROI analysis is to spatially normalize each participant's structural brain image to a template brain image and define ROIs using an atlas. However, in studies of individuals with structural brain lesions, such as stroke, the gold standard approach may be to manually hand-draw ROIs on each participant's non-normalized structural brain image. Automated approaches to ROI analysis are faster and more standardized, yet are susceptible to preprocessing error (e.g., normalization error) that can be greater in lesioned brains. The manual approach to ROI analysis has high demand for time and expertise, but may provide a more accurate estimate of brain response. In this study, commonly used automated and manual approaches to ROI analysis were directly compared by reanalyzing data from a previously published hypothesis-driven cognitive fMRI study, involving individuals with stroke. The ROI evaluated is the pars opercularis of the inferior frontal gyrus. Significant differences were identified in task-related effect size and percent-activated voxels in this ROI between the automated and manual approaches to ROI analysis. Task interactions, however, were consistent across ROI analysis approaches. These findings support the use of automated approaches to ROI analysis in studies of lesioned brains, provided they employ a task interaction design.

  17. Joint Preprocessor-Based Detectors for One-Way and Two-Way Cooperative Communication Networks

    KAUST Repository

    Abuzaid, Abdulrahman I.

    2014-05-01

    Efficient receiver designs for cooperative communication networks are becoming increasingly important. In previous work, cooperative networks communicated with the use of L relays. As the receiver is constrained, channel shortening and reduced-rank techniques were employed to design the preprocessing matrix that reduces the length of the received vector from L to U. In the first part of the work, a receiver structure is proposed which combines our proposed threshold selection criteria with the joint iterative optimization (JIO) algorithm that is based on the mean square error (MSE). Our receiver assists in determining the optimal U. Furthermore, this receiver provides the freedom to choose U for each frame depending on the tolerable difference allowed for MSE. Our study and simulation results show that by choosing an appropriate threshold, it is possible to gain in terms of complexity savings while having no or minimal effect on the BER performance of the system. Furthermore, the effect of channel estimation on the performance of the cooperative system is investigated. In the second part of the work, a joint preprocessor-based detector for cooperative communication networks is proposed for one-way and two-way relaying. This joint preprocessor-based detector operates on the principles of minimizing the symbol error rate (SER) instead of minimizing MSE. For a realistic assessment, pilot symbols are used to estimate the channel. From our simulations, it can be observed that our proposed detector achieves the same SER performance as that of the maximum likelihood (ML) detector with all participating relays. Additionally, our detector outperforms selection combining (SC), channel shortening (CS) scheme and reduced-rank techniques when using the same U. Finally, our proposed scheme has the lowest computational complexity.

  18. A non-linear preprocessing for opto-digital image encryption using multiple-parameter discrete fractional Fourier transform

    Science.gov (United States)

    Azoug, Seif Eddine; Bouguezel, Saad

    2016-01-01

    In this paper, a novel opto-digital image encryption technique is proposed by introducing a new non-linear preprocessing and using the multiple-parameter discrete fractional Fourier transform (MPDFrFT). The non-linear preprocessing is performed digitally on the input image in the spatial domain using a piecewise linear chaotic map (PLCM) coupled with the bitwise exclusive OR (XOR). The resulting image is multiplied by a random phase mask before applying the MPDFrFT to whiten the image. Then, a chaotic permutation is performed on the output of the MPDFrFT using another PLCM different from the one used in the spatial domain. Finally, another MPDFrFT is applied to obtain the encrypted image. The parameters of the PLCMs together with the multiple fractional orders of the MPDFrFTs constitute the secret key for the proposed cryptosystem. Computer simulation results and security analysis are presented to show the robustness of the proposed opto-digital image encryption technique and the great importance of the new non-linear preprocessing introduced to enhance the security of the cryptosystem and overcome the problem of linearity encountered in the existing permutation-based opto-digital image encryption schemes.
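
    The spatial-domain stage can be sketched independently of the MPDFrFT stages: a piecewise linear chaotic map (PLCM) generates a keystream that is XORed with the pixel bytes, and the map's initial value and control parameter act as part of the secret key. The parameter values and burn-in below are illustrative assumptions:

        import numpy as np

        def plcm(x, p):
            # Piecewise linear chaotic map on [0, 1), symmetric about 0.5.
            if x < p:
                return x / p
            if x < 0.5:
                return (x - p) / (0.5 - p)
            if x > 0.5:
                return plcm(1.0 - x, p)
            return 1.0 - 1e-12      # sidestep the fixed point at exactly 0.5

        def chaotic_xor(img_bytes, x0=0.37, p=0.25, burn_in=100):
            x, out = x0, np.empty_like(img_bytes)
            for _ in range(burn_in):            # discard the transient
                x = plcm(x, p)
            for i, b in enumerate(img_bytes.flat):
                x = plcm(x, p)
                out.flat[i] = b ^ (int(x * 256) % 256)
            return out

        img = np.arange(16, dtype=np.uint8).reshape(4, 4)
        enc = chaotic_xor(img)
        assert np.array_equal(chaotic_xor(enc), img)   # XOR is self-inverse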

  19. Design of radial basis function neural network classifier realized with the aid of data preprocessing techniques: design and analysis

    Science.gov (United States)

    Oh, Sung-Kwun; Kim, Wook-Dong; Pedrycz, Witold

    2016-05-01

    In this paper, we introduce a new architecture of an optimized Radial Basis Function neural network classifier developed with the aid of fuzzy clustering and data preprocessing techniques, and discuss its comprehensive design methodology. In the preprocessing part, the Linear Discriminant Analysis (LDA) or Principal Component Analysis (PCA) algorithm forms the front end of the network, and the transformed data produced there are used as the inputs of the network. In the premise part, the Fuzzy C-Means (FCM) algorithm determines the receptive field associated with the condition part of the rules. The connection weights of the classifier are of functional nature and come as polynomial functions forming the consequent part. The Particle Swarm Optimization algorithm optimizes a number of essential parameters needed to improve the accuracy of the classifier. Those optimized parameters include the type of data preprocessing, the dimensionality of the feature vectors produced by the LDA (or PCA), the number of clusters (rules), the fuzzification coefficient used in the FCM algorithm and the orders of the polynomials of the networks. The performance of the proposed classifier is reported for several benchmark datasets and is compared with the performance of other classifiers reported in previous studies.

  20. Signal Feature Extraction and Quantitative Evaluation of Metal Magnetic Memory Testing for Oil Well Casing Based on Data Preprocessing Technique

    Directory of Open Access Journals (Sweden)

    Zhilin Liu

    2014-01-01

    Full Text Available The metal magnetic memory (MMM) technique is an effective method for detecting stress concentration (SC) zones in oil well casing. It can provide an early diagnosis of microdamage for preventive protection. The MMM signal is a natural space-domain signal which is weak and vulnerable to noise interference, so it is difficult to achieve effective feature extraction, especially in the hostile subsurface environment of high temperature, high pressure, high humidity and multiple interfering sources. In this paper, a median-filter preprocessing method based on data preprocessing techniques is proposed to eliminate outlier points in the MMM signal. Based on the wavelet transform (WT), an adaptive wavelet denoising method and data smoothing algorithms are then applied in the MMM testing system. By using these data preprocessing techniques, the underlying data are preserved while the noise in the signal is reduced, so that the correct localization of the SC zone can be achieved. In the meantime, characteristic parameters of a new diagnostic approach are put forward to ensure the reliable determination of the casing danger level through a least squares support vector machine (LS-SVM) and a nonlinear quantitative mapping relationship. The effectiveness and feasibility of this method are verified through experiments.
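
    Median filtering is a natural choice for the outlier-elimination step: isolated spikes are replaced by local medians while genuine signal transitions are preserved. A minimal sketch; the kernel size is an assumption.

        import numpy as np
        from scipy.signal import medfilt

        rng = np.random.default_rng(5)
        mmm = np.cumsum(rng.standard_normal(500)) * 0.1   # stand-in for an MMM trace
        mmm[[50, 220, 400]] += 25                         # inject outlier spikes

        filtered = medfilt(mmm, kernel_size=5)   # outliers removed; wavelet
                                                 # denoising would follow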

  1. THE EFFECT OF DECOMPOSITION METHOD AS DATA PREPROCESSING ON NEURAL NETWORKS MODEL FOR FORECASTING TREND AND SEASONAL TIME SERIES

    Directory of Open Access Journals (Sweden)

    Subanar Subanar

    2006-01-01

    Full Text Available Recently, one of the central topics for the neural networks (NN) community is the issue of data preprocessing for the use of NN. In this paper, we investigate this topic, particularly the effect of the Decomposition method as data preprocessing, for the effective NN modelling of time series with both trend and seasonal patterns. The limited empirical studies on seasonal time series forecasting with neural networks are inconclusive: some find that neural networks are able to model seasonality directly and that prior deseasonalization is not necessary, while others conclude just the opposite. In this research, we study the effectiveness of data preprocessing, including detrending and deseasonalization by applying the Decomposition method, on NN modelling and forecasting performance. We use two kinds of data, simulated and real. The simulated data are examined for multiplicative trend and seasonality patterns. The results are compared to those obtained from a classical time series model. Our results show that a combination of detrending and deseasonalization by the Decomposition method is the effective data preprocessing for the use of NN in forecasting trend and seasonal time series.
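
    A sketch of the detrending-plus-deseasonalization preprocessing using classical multiplicative decomposition; the residual series is what would be fed to the NN. The single centered moving average below is a simplified stand-in for the textbook 2x12 average used with even periods.

        import numpy as np
        import pandas as pd

        def decompose_preprocess(y, period=12):
            s = pd.Series(y, dtype=float)
            trend = s.rolling(window=period, center=True).mean()
            detrended = s / trend
            seasonal = detrended.groupby(np.arange(len(s)) % period).transform('mean')
            residual = s / (trend * seasonal)   # NaN at the edges where no trend exists
            return trend, seasonal, residual

        t = np.arange(120)
        y = (100 + t) * (1 + 0.2 * np.sin(2 * np.pi * t / 12))  # trend x seasonality
        trend, seasonal, residual = decompose_preprocess(y)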

  2. PathoQC: Computationally Efficient Read Preprocessing and Quality Control for High-Throughput Sequencing Data Sets.

    Science.gov (United States)

    Hong, Changjin; Manimaran, Solaiappan; Johnson, William Evan

    2014-01-01

    Quality control and read preprocessing are critical steps in the analysis of data sets generated from high-throughput genomic screens. In the most extreme cases, improper preprocessing can negatively affect downstream analyses and may lead to incorrect biological conclusions. Here, we present PathoQC, a streamlined toolkit that seamlessly combines the benefits of several popular quality control software approaches for preprocessing next-generation sequencing data. PathoQC provides a variety of quality control options appropriate for most high-throughput sequencing applications. PathoQC is primarily developed as a module in the PathoScope software suite for metagenomic analysis. However, PathoQC is also available as an open-source Python module that can run as a stand-alone application or can be easily integrated into any bioinformatics workflow. PathoQC achieves high performance by supporting parallel computation and is an effective tool that removes technical sequencing artifacts and facilitates robust downstream analysis. The PathoQC software package is available at http://sourceforge.net/projects/PathoScope/.

  3. Facilitating neuronal connectivity analysis of evoked responses by exposing local activity with principal component analysis preprocessing: simulation of evoked MEG.

    Science.gov (United States)

    Gao, Lin; Zhang, Tongsheng; Wang, Jue; Stephen, Julia

    2013-04-01

    When connectivity analysis is carried out for event-related EEG and MEG, the presence of strong spatial correlations from spontaneous background activity may mask the local neuronal evoked activity and lead to spurious connections. In this paper, we hypothesized that PCA decomposition could be used to diminish the background activity and thereby improve the performance of connectivity analysis in event-related experiments. The idea was tested using simulation, where we found that, for the 306-channel Elekta Neuromag system, the first 4 PCs represent the dominant background activity, and the source connectivity pattern after preprocessing is consistent with the true connectivity pattern designed in the simulation. Discarding the first few PCs improved the signal-to-noise ratio of the evoked responses and increased coherences at the major physiological frequency bands. Furthermore, the evoked information was maintained after PCA preprocessing. In conclusion, it is demonstrated that the first few PCs represent background activity, and PCA decomposition can be employed to remove them and expose the evoked activity in the channels under investigation. Therefore, PCA can be applied as a preprocessing approach to improve neuronal connectivity analysis for event-related data.
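
    Removing the leading PCs amounts to projecting the data onto the orthogonal complement of their spatial patterns. A sketch for a 306-channel array, with the number of discarded components set to the four identified in the simulation:

        import numpy as np

        def remove_leading_pcs(data, n_remove=4):
            # data: (channels, samples), mean-removed per channel.
            data = data - data.mean(axis=1, keepdims=True)
            U, s, Vt = np.linalg.svd(data, full_matrices=False)
            B = U[:, :n_remove]                  # spatial patterns of background
            return data - B @ (B.T @ data)       # evoked activity exposed

        rng = np.random.default_rng(6)
        meg = rng.standard_normal((306, 1000))
        cleaned = remove_leading_pcs(meg)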

  4. Finding differentially expressed genes in two-channel DNA microarray datasets: how to increase reliability of data preprocessing.

    Science.gov (United States)

    Rotter, Ana; Hren, Matjaz; Baebler, Spela; Blejec, Andrej; Gruden, Kristina

    2008-09-01

    Due to the great variety of preprocessing tools in two-channel expression microarray data analysis, it is difficult to choose the most appropriate one for a given experimental setup. In our study, two independent two-channel in-house microarray experiments, as well as a publicly available dataset, were used to investigate the influence of the selection of preprocessing methods (background correction, normalization, and duplicate-spot correlation calculation) on the discovery of differentially expressed genes. Here we show that both the list of differentially expressed genes and the expression values of selected genes depend significantly on the preprocessing approach applied. The choice of normalization method had the highest impact on the results. We propose a simple but efficient approach to increase the reliability of the obtained results, in which two normalization methods that are theoretically distinct from one another are used on the same dataset. The intersection of the results, that is, of the lists of differentially expressed genes, is then used to get a more accurate estimation of the genes that were de facto differentially expressed.

  5. A Conversation on Data Mining Strategies in LC-MS Untargeted Metabolomics: Pre-Processing and Pre-Treatment Steps.

    Science.gov (United States)

    Tugizimana, Fidele; Steenkamp, Paul A; Piater, Lizelle A; Dubery, Ian A

    2016-11-03

    Untargeted metabolomic studies generate information-rich, high-dimensional, and complex datasets that remain challenging to handle and fully exploit. Despite the remarkable progress in the development of tools and algorithms, the "exhaustive" extraction of information from these metabolomic datasets is still a non-trivial undertaking. A conversation on data mining strategies for a maximal information extraction from metabolomic data is needed. Using a liquid chromatography-mass spectrometry (LC-MS)-based untargeted metabolomic dataset, this study explored the influence of collection parameters in the data pre-processing step, of scaling and data transformation on the statistical models generated, and of feature selection thereafter. Data obtained in positive mode from an LC-MS-based untargeted metabolomic study (sorghum plants responding dynamically to infection by a fungal pathogen) were used. Raw data were pre-processed with MarkerLynx™ software (Waters Corporation, Manchester, UK). Here, two parameters were varied: the intensity threshold (50-100 counts) and the mass tolerance (0.005-0.01 Da). After the pre-processing, the datasets were imported into SIMCA (Umetrics, Umea, Sweden) for further data cleaning and statistical modeling. In addition, different scaling (unit variance, Pareto, etc.) and data transformation (log and power) methods were explored. The results showed that the pre-processing parameters (or algorithms) influence the output dataset with regard to the number of defined features. Furthermore, the study demonstrates that the pre-treatment of data prior to statistical modeling affects the subspace approximation outcome, e.g., the amount of variation in X-data that the model can explain and predict. The pre-processing and pre-treatment steps subsequently influence the number of statistically significant extracted/selected features (variables). Thus, as informed by the results, to maximize the value of untargeted metabolomic data, understanding of the influence of these pre-processing and pre-treatment steps is essential.
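
    Pareto scaling, one of the pre-treatment options compared, divides each mean-centred feature by the square root of its standard deviation, damping (rather than fully equalizing, as unit-variance scaling does) the dominance of high-intensity features:

        import numpy as np

        def pareto_scale(X):
            # X: (samples, features) data matrix.
            return (X - X.mean(axis=0)) / np.sqrt(X.std(axis=0, ddof=1))

        X = np.random.default_rng(7).random((20, 5)) * [1, 10, 100, 1000, 10000]
        print(pareto_scale(X).std(axis=0))   # spreads compressed, not equalized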

  6. Miniature, Low Power Gas Chromatograph with Sample Pre-Processing Capability and Enhanced G-Force Survivability for Planetary Missions Project

    Data.gov (United States)

    National Aeronautics and Space Administration — Thorleaf Research, Inc. proposes to develop a miniaturized, low power gas chromatograph (GC) with sample pre-processing capability and enhanced capability for...

  7. Optimizing preprocessing and analysis pipelines for single-subject fMRI: 2. Interactions with ICA, PCA, task contrast and inter-subject heterogeneity.

    Science.gov (United States)

    Churchill, Nathan W; Yourganov, Grigori; Oder, Anita; Tam, Fred; Graham, Simon J; Strother, Stephen C

    2012-01-01

    A variety of preprocessing techniques are available to correct subject-dependent artifacts in fMRI, caused by head motion and physiological noise. Although it has been established that the chosen preprocessing steps (or "pipeline") may significantly affect fMRI results, it is not well understood how preprocessing choices interact with other parts of the fMRI experimental design. In this study, we examine how two experimental factors interact with preprocessing: between-subject heterogeneity, and strength of task contrast. Two levels of cognitive contrast were examined in an fMRI adaptation of the Trail-Making Test, with data from young, healthy adults. The importance of standard preprocessing with motion correction, physiological noise correction, motion parameter regression and temporal detrending was examined for the two task contrasts. We also tested subspace estimation using Principal Component Analysis (PCA), and Independent Component Analysis (ICA). Results were obtained for Penalized Discriminant Analysis, and model performance quantified with reproducibility (R) and prediction metrics (P). Simulation methods were also used to test for potential biases from individual-subject optimization. Our results demonstrate that (1) individual pipeline optimization is not significantly more biased than fixed preprocessing. In addition, (2) when applying a fixed pipeline across all subjects, the task contrast significantly affects pipeline performance; in particular, the effects of PCA and ICA models vary with contrast, and are not by themselves optimal preprocessing steps. Also, (3) selecting the optimal pipeline for each subject improves within-subject (P,R) and between-subject overlap, with the weaker cognitive contrast being more sensitive to pipeline optimization. These results demonstrate that sensitivity of fMRI results is influenced not only by preprocessing choices, but also by interactions with other experimental design factors. This paper outlines a

  8. PreP+07: improvements of a user friendly tool to preprocess and analyse microarray data

    Directory of Open Access Journals (Sweden)

    Claros M Gonzalo

    2009-01-01

    Full Text Available Abstract Background Nowadays, microarray gene expression analysis is a widely used technology that scientists handle but whose final interpretation usually requires the participation of a specialist. The need for this participation is due to the requirement of some background in statistics that most users lack or have only a vague notion of. Moreover, programming skills could also be essential to analyse these data. An interactive, easy-to-use application therefore seems necessary to help researchers extract full information from their data and analyse them in a simple, powerful and confident way. Results PreP+07 is a standalone Windows XP application that presents a friendly interface for spot filtration, inter- and intra-slide normalization, duplicate resolution, dye-swapping, error removal and statistical analyses. Additionally, it contains two unique implementations of the procedures (double scan and Supervised Lowess), a complete set of graphical representations (MA plot, RG plot, QQ plot, PP plot, PN plot) and can deal with many data formats, such as tabulated text, GenePix GPR and ArrayPRO. PreP+07 performance has been compared with the equivalent functions in Bioconductor using a tomato chip with 13056 spots. The number of differentially expressed genes, considering p-values coming from the PreP+07 and Bioconductor Limma packages, was statistically identical when the data set was only normalized; however, slight variability was observed when the data were both normalized and scaled. Conclusion The PreP+07 implementation provides a high degree of freedom in selecting and organizing a small set of widely used data processing protocols, and can handle many data formats. Its reliability has been proven, so that a laboratory researcher can perform a statistical pre-processing of his/her microarray results and obtain a list of differentially expressed genes using PreP+07 without any programming skills. All of this gives support to scientists

  9. A New Unsupervised Pre-processing Algorithm Based on Artificial Immune System for ERP Assessment in a P300-based GKT

    Directory of Open Access Journals (Sweden)

    S. Shojaeilangari

    2012-09-01

    Full Text Available In recent years, an increasing number of studies have focused on bio-inspired algorithms to solve elaborate engineering problems. The Artificial Immune System (AIS) is an artificial intelligence technique with the potential to solve problems in various fields. The immune system, due to its self-regulating nature, has been a source of inspiration for unsupervised learning methods in pattern recognition tasks. The purpose of this study is to apply the AIS to pre-process a lie-detection dataset to promote the recognition of guilty and innocent subjects. A new Unsupervised AIS (UAIS) was proposed in this study as a pre-processing method before classification. We then applied three different classifiers on the pre-processed data for Event Related Potential (ERP) assessment in a P300-based Guilty Knowledge Test (GKT). Experimental results showed that UAIS is a successful pre-processing method which is able to improve the classification rate. In our experiments, we observed that the classification accuracies for three different classifiers, K-Nearest Neighbour (KNN), Support Vector Machine (SVM) and Linear Discriminant Analysis (LDA), were increased after applying the UAIS pre-processing. Using a scattering criterion to assess the features before and after pre-processing proved that our proposed method was able to map the data from the primary feature space to a new space where the data separability was significantly improved.

  10. [The net analyte preprocessing combined with radial basis partial least squares regression applied in noninvasive measurement of blood glucose].

    Science.gov (United States)

    Li, Qing-Bo; Huang, Zheng-Wei

    2014-02-01

    In order to improve the prediction accuracy of quantitative analysis models for near-infrared spectroscopy of blood glucose, this paper combines the net analyte preprocessing (NAP) algorithm with radial basis function partial least squares (RBFPLS) regression to build a nonlinear modelling method suitable for human glucose measurement, named NAP-RBFPLS. First, NAP is used to pre-process the near-infrared spectra of blood glucose, in order to extract from the original spectra only the information that relates to the glucose signal. This effectively weakens the chance correlations between glucose changes and interfering factors caused by the absorption of water, albumin, hemoglobin, fat and other blood components, changes in body temperature, drift of the measuring instruments, and changes in the measuring environment and conditions. A nonlinear quantitative analysis model is then built from the NAP-processed near-infrared data, in order to capture the nonlinear relationship between glucose concentrations and near-infrared spectra that is caused by strong scattering in body tissue. In this paper, the new method is compared with three other quantitative analysis models built on partial least squares (PLS), net analyte preprocessing partial least squares (NAP-PLS) and RBFPLS, respectively. The experimental results show that the nonlinear calibration model developed by combining the NAP algorithm and RBFPLS regression greatly improves the prediction accuracy on prediction sets, suggesting that this nonlinear modelling method has practical applications in research on non-invasive detection of human glucose concentrations.
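
    Generic NAP implementations project each spectrum onto the orthogonal complement of an interference subspace estimated from glucose-free variation. A sketch under that reading; the rank of the interference subspace is an assumption.

        import numpy as np

        def nap_filter(X, X_interf, rank=3):
            # X:        (samples, wavelengths) spectra to correct;
            # X_interf: spectra assumed to carry no glucose information.
            _, _, Vt = np.linalg.svd(X_interf - X_interf.mean(axis=0),
                                     full_matrices=False)
            V = Vt[:rank].T                  # dominant interference directions
            return X - (X @ V) @ V.T         # orthogonal projection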

  11. MISR Near Real Time (NRT) Level 1B2 Terrain Data V001

    Data.gov (United States)

    National Aeronautics and Space Administration — This file contains Terrain-projected TOA Radiance, resampled at the surface and topographically corrected, as well as geometrically corrected by PGE22. It is used for...

  12. MISR Near Real Time (NRT) Level 1B2 Ellipsoid Data V001

    Data.gov (United States)

    National Aeronautics and Space Administration — This file contains Ellipsoid-projected TOA Radiance, resampled at the surface and topographically corrected, as well as geometrically corrected by PGE22. It is used...

  13. AIRS/Aqua Level 1B HSB geolocated and calibrated brightness temperatures V005

    Data.gov (United States)

    National Aeronautics and Space Administration — The Atmospheric Infrared Sounder (AIRS) is a facility instrument aboard the second Earth Observing System (EOS) polar-orbiting platform, EOS Aqua. In combination...

  14. Current breathomics-a review on data pre-processing techniques and machine learning in metabolomics breath analysis

    DEFF Research Database (Denmark)

    Smolinska, A.; Hauschild, A. C.; Fijten, R. R. R.

    2014-01-01

    been extensively developed. Yet, the application of machine learning methods for fingerprinting VOC profiles in the breathomics is still in its infancy. Therefore, in this paper, we describe the current state of the art in data pre-processing and multivariate analysis of breathomics data. We start...... different conditions (e.g. disease stage, treatment). Independently of the utilized analytical method, the most important question, 'which VOCs are discriminatory?', remains the same. Answers can be given by several modern machine learning techniques (multivariate statistics) and, therefore, are the focus...

  15. RETRACTED: Identifying halophilic proteins based on random forests with preprocessing of the pseudo-amino acid composition.

    Science.gov (United States)

    Ge, Huihua; Zhang, Guangya

    2014-11-21

    This article has been retracted: please see Elsevier Policy on Article Withdrawal (http://www.elsevier.com/locate/withdrawalpolicy). This article has been retracted at the request of the authors. When using the resampling method to preprocess the raw data of the paper used, some of the types of the proteins (i.e., the HI, HO and NP) were changed; thus, the predicting accuracy cannot reflect the real results. This means the effectiveness of resampling methods in this article gives false results. The Publisher apologizes for any inconvenience this may cause. Copyright © 2014 Elsevier Ltd. All rights reserved.

  16. River flow forecasting with Artificial Neural Networks using satellite observed precipitation pre-processed with flow length and travel time information: case study of the Ganges river basin

    Directory of Open Access Journals (Sweden)

    M. K. Akhtar

    2009-04-01

    Full Text Available This paper explores the use of flow length and travel time as a pre-processing step for incorporating spatial precipitation information into Artificial Neural Network (ANN) models used for river flow forecasting. Spatially distributed precipitation is commonly required when modelling large basins, and it is usually incorporated in distributed physically-based hydrological modelling approaches. However, these modelling approaches are recognised to be quite complex and expensive, especially due to the data collection of multiple inputs and parameters, which vary in space and time. On the other hand, ANN models for flow forecasting are frequently developed only with precipitation and discharge as inputs, usually without taking into consideration the spatial variability of precipitation. Full inclusion of spatially distributed inputs into ANN models still leads to a complex computational process that may not give acceptable results. Therefore, here we present an analysis of flow length and travel time as a basis for pre-processing remotely sensed (satellite) rainfall data. This pre-processed rainfall is used together with local stream flow measurements of previous days as input to ANN models. The case study for this modelling approach is the Ganges river basin. A comparative analysis of multiple ANN models with different hydrological pre-processing is presented. The ANN showed its ability to forecast discharges 3 days ahead with acceptable accuracy. Within this forecast horizon, the influence of the pre-processed rainfall is marginal, because of the dominant influence of the strongly auto-correlated discharge inputs. For forecast horizons of 7 to 10 days, the influence of the pre-processed rainfall is noticeable, although the overall model performance deteriorates. The incorporation of remotely sensed, spatially distributed precipitation information as a pre-processing step proved to be a promising alternative for the setting-up of ANN models for
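
    One plausible reading of the pre-processing is that each rainfall grid cell is assigned a travel time derived from its flow length, and basin rainfall is then aggregated into travel-time classes that become separate ANN inputs. A sketch under that assumption, with invented dimensions:

        import numpy as np

        def travel_time_inputs(rain, travel_time, max_lag_days=10):
            # rain:        (days, cells) satellite rainfall per grid cell;
            # travel_time: (cells,) days for runoff to reach the basin outlet.
            # Column k of the result holds, per day, the rainfall expected
            # to arrive at the outlet k days later.
            days, _ = rain.shape
            out = np.zeros((days, max_lag_days))
            lag = np.clip(np.round(travel_time).astype(int), 0, max_lag_days - 1)
            for k in range(max_lag_days):
                out[:, k] = rain[:, lag == k].sum(axis=1)
            return out

        rng = np.random.default_rng(8)
        rain = rng.random((30, 100))              # 30 days, 100 grid cells
        travel = rng.uniform(0, 9, size=100)      # per-cell travel time in days
        X = travel_time_inputs(rain, travel)      # (30, 10) ANN input block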

  17. Generalized regression neural network trained preprocessing of frequency domain correlation filter for improved face recognition and its optical implementation

    Science.gov (United States)

    Banerjee, Pradipta K.; Datta, Asit K.

    2013-02-01

    The paper proposes an improved strategy for face recognition using a correlation filter under varying lighting conditions and occlusion, where spatial-domain preprocessing is carried out by two convolution kernels. The first is a contour kernel that emphasizes the high-frequency components of the face image, and the other is a smoothing kernel used to minimize noise that may arise from the preprocessing. The convolution kernels are obtained by training a generalized regression neural network using enhanced face features, where the face features are enhanced by conventional principal component analysis. The proposed method reduces the false acceptance rate and false rejection rate in comparison to other standard correlation filtering techniques. Moreover, the processing is fast compared with existing illumination normalization techniques. A hardware implementation of an all-optical correlation technique is also suggested, based on a single spatial light modulator in a beam-folding architecture. Two benchmark databases, YaleB and PIE, are used for performance verification of the proposed scheme, and improved results are obtained for both illumination variations and occlusions in test face images.

  18. Quantitative Performance Evaluator for Proteomics (QPEP): Web-based Application for Reproducible Evaluation of Proteomics Preprocessing Methods.

    Science.gov (United States)

    Strbenac, Dario; Zhong, Ling; Raftery, Mark J; Wang, Penghao; Wilson, Susan R; Armstrong, Nicola J; Yang, Jean Y H

    2017-07-07

    Tandem mass spectrometry is one of the most popular techniques for quantitation of proteomes. There exists a large variety of options in each stage of data preprocessing that impact the bias and variance of the summarized protein-level values. Using a newly released data set satisfying a replicated Latin squares design, a diverse set of performance metrics has been developed and implemented in a web-based application, Quantitative Performance Evaluator for Proteomics (QPEP). QPEP has the flexibility to allow users to apply their own method to preprocess this data set and share the results, allowing direct and straightforward comparison of new methodologies. Application of these new metrics to three case studies highlights that (i) the summarization of peptides to proteins is robust to the choice of peptide summary used, (ii) the differences between iTRAQ labels are stronger than the differences between experimental runs, and (iii) the commercial software ProteinPilot performs between-sample normalization as well as more complicated methods developed by academics. Importantly, finding (ii) underscores the benefits of using the principles of randomization and blocking to avoid the experimental measurements being confounded by technical factors. Data are available via ProteomeXchange with identifier PXD003608.

  19. Research on a QR Code Image Preprocessing Scheme

    Institute of Scientific and Technical Information of China (English)

    李筱楠; 郑华; 刘会杰

    2016-01-01

    Image preprocessing is an important step in the process of QR code decoding. Building on the traditional recognition scheme, this paper proposes a practical image preprocessing method for QR code recognition: the captured image is filtered and binarized to reduce the processing load, the position and distortion angle of the QR code are located by means of its finder patterns, and geometric distortion is corrected through a perspective transform. Experimental results demonstrate that the proposed approach overcomes the influence of noise, non-uniform illumination and geometric distortion, and thus significantly increases the recognition rate of QR codes.

  20. A Novel Pre-Processing Technique for Original Feature Matrix of Electronic Nose Based on Supervised Locality Preserving Projections

    Directory of Open Access Journals (Sweden)

    Pengfei Jia

    2016-06-01

    Full Text Available An electronic nose (E-nose) consisting of 14 metal oxide gas sensors and one electrochemical gas sensor has been constructed to identify four different classes of wound infection. However, the classification results of the E-nose are not ideal if the original feature matrix, containing the maximum steady-state response value of each sensor, is passed to the classifier directly. A novel pre-processing technique based on supervised locality preserving projections (SLPP) is therefore proposed in this paper to process the original feature matrix before classification, in order to improve the performance of the E-nose. SLPP is good at finding and keeping the nonlinear structure of data; furthermore, it provides an explicit mapping expression, which is unattainable with traditional manifold learning methods. Additionally, effective optimization methods are applied to tune the parameters of SLPP and of the classifier. Experimental results prove that the classification accuracy of a support vector machine (SVM) combined with data pre-processed by SLPP outperforms the other considered methods. All results make it clear that SLPP has the better performance in processing the original feature matrix of the E-nose.

  1. Study of pre-processing model of coal-mine hoist wire-rope fatigue damage signal

    Institute of Scientific and Technical Information of China (English)

    Tian Jie; Wang Hongyao; Zhou Junying; Meng Guoying

    2015-01-01

    In this paper, we propose a pre-processing method for the detection of wire-rope signals, necessitated by the shortcomings of currently employed processing methods. First, we investigated the one-dimensional discrete morphological transform and the wavelet transform. Then, we developed a pre-processing model based on a morphological wavelet-filtering algorithm and proposed a modified morphology filtering algorithm. We also designed an experimental platform for wire-rope detection, in which eight levels of localized flaws (LFs) and damage were formed in the wire-rope specimen. We performed a series of experimental studies, and the results show that the proposed method can effectively filter the drift signal. The signal-to-noise ratio of the new filtering algorithm was over 26 dB, whereas that of the existing method is less than 15 dB, an improvement of 73%. Based on our results, the filtering effect of the proposed method is better than that of the existing method. This study has great significance and practical value in engineering applications.

  2. Preprocessing significantly improves the peptide/protein identification sensitivity of high-resolution isobarically labeled tandem mass spectrometry data.

    Science.gov (United States)

    Sheng, Quanhu; Li, Rongxia; Dai, Jie; Li, Qingrun; Su, Zhiduan; Guo, Yan; Li, Chen; Shyr, Yu; Zeng, Rong

    2015-02-01

    Isobaric labeling techniques coupled with high-resolution mass spectrometry have been widely employed in proteomic workflows requiring relative quantification. For each high-resolution tandem mass spectrum (MS/MS), isobaric labeling techniques can be used not only to quantify the peptide from different samples by reporter ions, but also to identify the peptide it is derived from. Because the ions related to isobaric labeling may act as noise in database searching, the MS/MS spectrum should be preprocessed before peptide or protein identification. In this article, we demonstrate that there are many high-frequency, high-abundance isobaric-related ions in the MS/MS spectrum, and that removing these ions, combined with deisotoping and deconvolution in the MS/MS preprocessing procedure, significantly improves peptide/protein identification sensitivity. The user-friendly software package TurboRaw2MGF (v2.0) has been implemented for converting raw TIC data files to Mascot generic format files and can be downloaded for free from https://github.com/shengqh/RCPA.Tools/releases as part of the software suite ProteomicsTools. The data have been deposited to the ProteomeXchange with identifier PXD000994.
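
    A hedged sketch of one preprocessing step named above, removing label-related ions before database searching; the reporter m/z list (TMT 6-plex region) and the tolerance are illustrative, not the package's actual defaults:

    # Drop peaks near known isobaric reporter masses from one spectrum.
    import numpy as np

    REPORTER_MZ = np.array([126.1277, 127.1248, 128.1344,
                            129.1315, 130.1411, 131.1382])   # assumed label ions

    def strip_label_ions(mz, intensity, tol_ppm=20.0):
        mz, intensity = np.asarray(mz), np.asarray(intensity)
        keep = np.ones(mz.shape, dtype=bool)
        for r in REPORTER_MZ:
            keep &= np.abs(mz - r) / r * 1e6 > tol_ppm       # outside tolerance
        return mz[keep], intensity[keep]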

  3. Localization of spatially distributed brain sources after a tensor-based preprocessing of interictal epileptic EEG data.

    Science.gov (United States)

    Albera, L; Becker, H; Karfoul, A; Gribonval, R; Kachenoura, A; Bensaid, S; Senhadji, L; Hernandez, A; Merlet, I

    2015-01-01

    This paper addresses the localization of spatially distributed sources from interictal epileptic electroencephalographic data after a tensor-based preprocessing. Justifying the Canonical Polyadic (CP) model of the space-time-frequency and space-time-wave-vector tensors is not an easy task when two or more extended sources have to be localized. On the other hand, the occurrence of several amplitude-modulated spikes originating from the same epileptic region can be used to build a space-time-spike tensor from the EEG data. While the CP model of this tensor appears better justified, the exact computation of its loading matrices can be limited by the presence of highly correlated sources and/or a strong background noise. An efficient extended-source localization scheme must then be set up to follow the tensor-based preprocessing. Different strategies are thus investigated and compared on realistic simulated data: the "disk algorithm" using a precomputed dictionary of circular patches, a standardized Tikhonov regularization, and a fused LASSO scheme.
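
    A minimal sketch of fitting a rank-R CP model to a space-time-spike tensor, assuming the tensorly library; the tensor sizes and rank below are arbitrary stand-ins, and a real tensor would stack time-locked spike epochs:

    # CP (PARAFAC) decomposition of a channels x time x spikes tensor.
    import numpy as np
    import tensorly as tl
    from tensorly.decomposition import parafac

    rng = np.random.default_rng(0)
    T = tl.tensor(rng.standard_normal((64, 200, 30)))   # space x time x spike
    weights, (space, time, spike) = parafac(T, rank=3, normalize_factors=True)
    print(space.shape, time.shape, spike.shape)         # (64,3) (200,3) (30,3)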

  4. Combined data preprocessing and multivariate statistical analysis characterizes fed-batch culture of mouse hybridoma cells for rational medium design.

    Science.gov (United States)

    Selvarasu, Suresh; Kim, Do Yun; Karimi, Iftekhar A; Lee, Dong-Yup

    2010-10-01

    We present an integrated framework for characterizing fed-batch cultures of mouse hybridoma cells producing monoclonal antibody (mAb). This framework systematically combines data preprocessing, elemental balancing and statistical analysis techniques. Initially, specific rates of cell growth, glucose/amino acid consumption and mAb/metabolite production were calculated via curve fitting using logistic equations, with subsequent elemental balancing of the preprocessed data indicating the presence of experimental measurement errors. Multivariate statistical analysis was then employed to understand physiological characteristics of the cellular system. The results from principal component analysis (PCA) revealed three major clusters of amino acids with similar trends in their consumption profiles: (i) arginine, threonine and serine, (ii) glycine, tyrosine, phenylalanine, methionine, histidine and asparagine, and (iii) lysine, valine and isoleucine. Further analysis using partial least square (PLS) regression identified key amino acids which were positively or negatively correlated with the cell growth, mAb production and the generation of lactate and ammonia. Based on these results, the optimal concentrations of key amino acids in the feed medium can be inferred, potentially leading to an increase in cell viability and productivity, as well as a decrease in toxic waste production. The study demonstrated how the current methodological framework using multivariate statistical analysis techniques can serve as a potential tool for deriving rational medium design strategies. Copyright © 2010 Elsevier B.V. All rights reserved.
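
    A small sketch of the first step described, fitting a logistic curve to viable-cell data and differentiating it to obtain a specific rate; the data points, parameter names and starting guesses are invented for illustration:

    # Logistic fit of viable-cell density, then specific growth rate mu.
    import numpy as np
    from scipy.optimize import curve_fit

    def logistic(t, K, A, r, t0):
        return A + (K - A) / (1.0 + np.exp(-r * (t - t0)))

    t = np.array([0, 24, 48, 72, 96, 120], float)     # hours (example data)
    x = np.array([0.3, 0.9, 2.1, 3.4, 3.9, 4.0])      # 1e6 cells/mL (example)
    p, _ = curve_fit(logistic, t, x, p0=[4.0, 0.3, 0.1, 60.0])

    dt = 1e-3
    xc = logistic(t, *p)
    dx = (logistic(t + dt, *p) - logistic(t - dt, *p)) / (2 * dt)
    mu = dx / xc                                      # specific growth rate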

  5. Web Log Pre-processing and Analysis for Generation of Learning Profiles in Adaptive E-learning

    Directory of Open Access Journals (Sweden)

    Radhika M. Pai

    2016-04-01

    Adaptive E-learning Systems (AESs) enhance the efficiency of online courses in education by providing personalized content and user interfaces that change according to learners' requirements and usage patterns. This paper presents an approach to generate a learning profile for each learner, which helps to identify learning styles and provide an Adaptive User Interface that includes adaptive learning components and learning material. The proposed method analyzes captured web usage data to identify the learning profile of each learner. The learning profiles are identified by an algorithmic approach based on the frequency of access to materials and the time spent on the various learning components on the portal. The captured log data are pre-processed and converted into standard XML format to generate learners' sequence data corresponding to the different sessions and time spent. The learning style model adopted in this approach is the Felder-Silverman Learning Style Model (FSLSM). This paper also presents an analysis of learner activities, preprocessed XML files and generated sequences.
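
    An illustrative sketch of the log-to-XML conversion step; the record fields and tag names are assumptions, not the paper's actual schema:

    # Group access-log records by (learner, session) and emit XML.
    import xml.etree.ElementTree as ET
    from collections import defaultdict

    def logs_to_xml(records):
        """records: iterable of (learner_id, session_id, url, seconds)."""
        sessions = defaultdict(list)
        for learner, session, url, secs in records:
            sessions[(learner, session)].append((url, secs))
        root = ET.Element("learners")
        for (learner, session), hits in sessions.items():
            s = ET.SubElement(root, "session", learner=learner, id=session)
            for url, secs in hits:
                ET.SubElement(s, "access", url=url, seconds=str(secs))
        return ET.tostring(root, encoding="unicode")

    print(logs_to_xml([("u1", "s1", "/video/intro", 120),
                       ("u1", "s1", "/text/ch1", 300)]))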

  7. Development and integration of block operations for data invariant automation of digital preprocessing and analysis of biological and biomedical Raman spectra.

    Science.gov (United States)

    Schulze, H Georg; Turner, Robin F B

    2015-06-01

    High-throughput information extraction from large numbers of Raman spectra is becoming an increasingly taxing problem due to the proliferation of new applications enabled using advances in instrumentation. Fortunately, in many of these applications, the entire process can be automated, yielding reproducibly good results with significant time and cost savings. Information extraction consists of two stages, preprocessing and analysis. We focus here on the preprocessing stage, which typically involves several steps, such as calibration, background subtraction, baseline flattening, artifact removal, smoothing, and so on, before the resulting spectra can be further analyzed. Because the results of some of these steps can affect the performance of subsequent ones, attention must be given to the sequencing of steps, the compatibility of these sequences, and the propensity of each step to generate spectral distortions. We outline here important considerations to effect full automation of Raman spectral preprocessing: what is considered full automation; putative general principles to effect full automation; the proper sequencing of processing and analysis steps; conflicts and circularities arising from sequencing; and the need for, and approaches to, preprocessing quality control. These considerations are discussed and illustrated with biological and biomedical examples reflecting both successful and faulty preprocessing.
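
    A minimal sketch of the sequencing principle discussed above: preprocessing as an explicit, ordered list of named steps with a quality check after each one. The step implementations are crude stand-ins for real calibration, baseline flattening and smoothing routines:

    # Ordered preprocessing pipeline with per-step QC.
    import numpy as np

    def despike(s):  return np.clip(s, None, np.median(s) + 5 * s.std())
    def baseline(s):
        x = np.arange(s.size)
        return s - np.polyval(np.polyfit(x, s, 3), x)
    def smooth(s):   return np.convolve(s, np.ones(5) / 5, mode="same")

    PIPELINE = [("despike", despike), ("baseline", baseline), ("smooth", smooth)]

    def preprocess(spectrum, qc=lambda s: np.isfinite(s).all()):
        for name, step in PIPELINE:          # order matters: despiking before
            spectrum = step(spectrum)        # baseline, baseline before smoothing
            if not qc(spectrum):
                raise ValueError(f"QC failed after step '{name}'")
        return spectrum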

  8. Simultaneous data pre-processing and SVM classification model selection based on a parallel genetic algorithm applied to spectroscopic data of olive oils.

    Science.gov (United States)

    Devos, Olivier; Downey, Gerard; Duponchel, Ludovic

    2014-04-01

    Classification is an important task in chemometrics. For several years now, support vector machines (SVMs) have proven to be powerful for infrared spectral data classification. However such methods require optimisation of parameters in order to control the risk of overfitting and the complexity of the boundary. Furthermore, it is established that the prediction ability of classification models can be improved using pre-processing in order to remove unwanted variance in the spectra. In this paper we propose a new methodology based on a genetic algorithm (GA) for the simultaneous optimisation of SVM parameters and pre-processing (GENOPT-SVM). The method has been tested for the discrimination of the geographical origin of Italian olive oil (Ligurian and non-Ligurian) on the basis of near infrared (NIR) or mid infrared (FTIR) spectra. Different classification models (PLS-DA, SVM with mean-centred data, GENOPT-SVM) have been tested and statistically compared using McNemar's test. For both datasets, SVM with optimised pre-processing gives models with higher accuracy than those obtained with PLS-DA on pre-processed data. In the case of the NIR dataset, most of this accuracy improvement (86.3% compared with 82.8% for PLS-DA) occurred using only a single pre-processing step. For the FTIR dataset, three optimised pre-processing steps were required to obtain an SVM model with a significant accuracy improvement (82.2%) over the PLS-DA model (78.6%). Furthermore, this study demonstrates that even SVM models have to be developed on the basis of well-corrected spectral data in order to obtain higher classification rates.
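
    A toy sketch in the spirit of GENOPT-SVM, assuming scikit-learn and scipy: a chromosome encodes one preprocessing choice plus log-scaled SVM hyper-parameters and is scored by cross-validation. The GA settings and the preprocessing menu are illustrative, not the authors' design:

    # Joint GA search over (preprocessing, C, gamma); X is (samples, features)
    # with at least 11 spectral points for the Savitzky-Golay option.
    import random
    import numpy as np
    from scipy.signal import savgol_filter
    from sklearn.model_selection import cross_val_score
    from sklearn.svm import SVC

    PREPROC = {0: lambda X: X,                                 # none
               1: lambda X: X - X.mean(axis=1, keepdims=True), # row centring
               2: lambda X: savgol_filter(X, 11, 2, deriv=1)}  # 1st derivative

    def fitness(gene, X, y):
        prep, logC, logG = gene
        clf = SVC(C=10.0 ** logC, gamma=10.0 ** logG)
        return cross_val_score(clf, PREPROC[prep](X), y, cv=3).mean()

    def evolve(X, y, pop=20, gens=15):
        P = [(random.randrange(3), random.uniform(-2, 3), random.uniform(-5, 0))
             for _ in range(pop)]
        for _ in range(gens):
            elite = sorted(P, key=lambda g: fitness(g, X, y),
                           reverse=True)[:pop // 2]
            children = []
            for _ in range(pop - len(elite)):     # crossover + Gaussian mutation
                p1, p2 = random.choices(elite, k=2)
                children.append((p1[0],
                                 p2[1] + random.gauss(0, 0.3),
                                 p2[2] + random.gauss(0, 0.3)))
            P = elite + children
        return max(P, key=lambda g: fitness(g, X, y))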

  9. QSpike Tools: a Generic Framework for Parallel Batch Preprocessing of Extracellular Neuronal Signals Recorded by Substrate Microelectrode Arrays

    Directory of Open Access Journals (Sweden)

    Mufti Mahmud

    2014-03-01

    Micro-Electrode Arrays (MEAs) have emerged as a mature technique to investigate brain (dys)functions in vivo and in in vitro animal models. Often referred to as "smart" Petri dishes, MEAs have demonstrated great potential, particularly for medium-throughput studies in vitro, in both academic and pharmaceutical industrial contexts. Enabling rapid comparison of ionic/pharmacological/genetic manipulations with control conditions, MEAs are often employed to screen compounds by monitoring non-invasively the spontaneous and evoked neuronal electrical activity in longitudinal studies, with relatively inexpensive equipment. However, in order to acquire sufficient statistical significance, recordings last up to tens of minutes and generate large amounts of raw data (e.g., 60 channels/MEA, 16-bit A/D conversion, 20 kHz sampling rate: ~8 GB/MEA·h uncompressed). Thus, when the experimental conditions to be tested are numerous, the availability of fast, standardized, and automated signal preprocessing becomes pivotal for any subsequent analysis and data archiving. To this aim, we developed an in-house cloud-computing system, named QSpike Tools, where CPU-intensive operations required for preprocessing of each recorded channel (e.g., filtering, multi-unit activity detection, spike-sorting, etc.) are decomposed and batch-queued to a multi-core architecture or to a computer cluster. With the commercial availability of new and inexpensive high-density MEAs, we believe that disseminating QSpike Tools might facilitate its wide adoption and customization, and possibly inspire the creation of community-supported cloud-computing facilities for MEA users.

  10. On semantics-based spatial data preprocessing: a case study in non-ortho RS images mosaic

    Science.gov (United States)

    Wang, Daojun; Gong, Jianhua; Ma, Ai-Nai

    2008-10-01

    A three-level information architecture comprising syntactic, semantic and pragmatic information is put forward in Comprehensive Information Theory (CIT). From this point of view, spatial data analysis operates on semantic information, while Spatial Data Preprocessing (SDP) corresponds to syntactic information. In many practical applications, however, SDP based only on syntactic information does not give good results, and semantics-based preprocessing may be an effective alternative. RS image mosaicking is a typical SDP task in which extraction of an optimal mosaic line is the crux, and most existing approaches based on syntactic information are effective only for orthophoto maps. In this paper, an overall optimal mosaic-line extraction scheme is presented for non-ortho RS images. It is argued that there is no projection error on the projection datum fitted by Ground Control Points (GCPs), or on the regional main height surface, which can be recognized in medium-resolution RS images. The method therefore suggests that GCPs for precise geometric correction should be collected on the main height surface, and likewise the mosaic line for RS image mosaicking. Three CBERS CCD scenes of Taiyuan are taken as the experimental data. Following this method, GCPs were collected in wide riverbeds, all three scenes were rectified to an existing ETM+ mosaic image, and the central lines of the wide riverbeds in the overlapping areas were extracted as the mosaic line. The experimental result indicates that this method can extract an overall optimal mosaic line and effectively eliminate the visual texture seam-line, even for non-ortho RS images. It concludes that SDP based on semantic information can play a valuable role in spatial data applications.

  11. QSpike tools: a generic framework for parallel batch preprocessing of extracellular neuronal signals recorded by substrate microelectrode arrays.

    Science.gov (United States)

    Mahmud, Mufti; Pulizzi, Rocco; Vasilaki, Eleni; Giugliano, Michele

    2014-01-01

    Micro-Electrode Arrays (MEAs) have emerged as a mature technique to investigate brain (dys)functions in vivo and in in vitro animal models. Often referred to as "smart" Petri dishes, MEAs have demonstrated a great potential particularly for medium-throughput studies in vitro, both in academic and pharmaceutical industrial contexts. Enabling rapid comparison of ionic/pharmacological/genetic manipulations with control conditions, MEAs are employed to screen compounds by monitoring non-invasively the spontaneous and evoked neuronal electrical activity in longitudinal studies, with relatively inexpensive equipment. However, in order to acquire sufficient statistical significance, recordings last up to tens of minutes and generate large amounts of raw data (e.g., 60 channels/MEA, 16-bit A/D conversion, 20 kHz sampling rate: approximately 8 GB/MEA·h uncompressed). Thus, when the experimental conditions to be tested are numerous, the availability of fast, standardized, and automated signal preprocessing becomes pivotal for any subsequent analysis and data archiving. To this aim, we developed an in-house cloud-computing system, named QSpike Tools, where CPU-intensive operations, required for preprocessing of each recorded channel (e.g., filtering, multi-unit activity detection, spike-sorting, etc.), are decomposed and batch-queued to a multi-core architecture or to a computer cluster. With the commercial availability of new and inexpensive high-density MEAs, we believe that disseminating QSpike Tools might facilitate its wide adoption and customization, and inspire the creation of community-supported cloud-computing facilities for MEA users.
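
    A minimal per-channel sketch (not the QSpike Tools code) of the batch pattern described: band-pass filtering and threshold spike detection fanned out over CPU cores with multiprocessing; the cutoffs and the robust threshold rule are common defaults, assumed here:

    # Per-channel preprocessing batched over a process pool.
    import numpy as np
    from multiprocessing import Pool
    from scipy.signal import butter, filtfilt

    FS = 20_000.0                                        # sampling rate (Hz)
    B, A = butter(3, [300 / (FS / 2), 3000 / (FS / 2)], btype="band")

    def preprocess_channel(trace):
        filtered = filtfilt(B, A, trace)
        thr = 5 * np.median(np.abs(filtered)) / 0.6745   # robust noise estimate
        crossings = (filtered[1:] < -thr) & (filtered[:-1] >= -thr)
        return np.flatnonzero(crossings) / FS            # spike times (s)

    if __name__ == "__main__":
        channels = [np.random.randn(int(FS)) for _ in range(60)]  # fake MEA data
        with Pool() as pool:
            spike_times = pool.map(preprocess_channel, channels)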

  12. Predictive modeling of colorectal cancer using a dedicated pre-processing pipeline on routine electronic medical records.

    Science.gov (United States)

    Kop, Reinier; Hoogendoorn, Mark; Teije, Annette Ten; Büchner, Frederike L; Slottje, Pauline; Moons, Leon M G; Numans, Mattijs E

    2016-09-01

    Over the past years, research utilizing routine care data extracted from Electronic Medical Records (EMRs) has increased tremendously. Yet there are no straightforward, standardized strategies for pre-processing these data. We propose a dedicated medical pre-processing pipeline aimed at taking on many problems and opportunities contained within EMR data, such as their temporal, inaccurate and incomplete nature. The pipeline is demonstrated on a dataset of routinely recorded data in general practice EMRs of over 260,000 patients, in which the occurrence of colorectal cancer (CRC) is predicted using various machine learning techniques (i.e., CART, LR, RF) and subsets of the data. CRC is a common type of cancer, of which early detection has proven to be important yet challenging. The results are threefold. First, the predictive models generated using our pipeline reconfirmed known predictors and identified new, medically plausible, predictors derived from the cardiovascular and metabolic disease domain, validating the pipeline's effectiveness. Second, the difference between the best model generated by the data-driven subset (AUC 0.891) and the best model generated by the current state of the art hypothesis-driven subset (AUC 0.864) is statistically significant at the 95% confidence interval level. Third, the pipeline itself is highly generic and independent of the specific disease targeted and the EMR used. In conclusion, the application of established machine learning techniques in combination with the proposed pipeline on EMRs has great potential to enhance disease prediction, and hence early detection and intervention in medical practice. Copyright © 2016 Elsevier Ltd. All rights reserved.
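
    A hedged sketch of the modeling step downstream of such a pipeline: train a random forest on a patient-by-feature table and report AUC. The synthetic table below merely stands in for the pipeline's real output:

    # Random forest + AUC on a (patients x features) table.
    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.metrics import roc_auc_score
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(1)
    X = rng.standard_normal((5000, 40))      # stand-in for engineered features
    y = (X[:, 0] + 0.5 * X[:, 1] + rng.standard_normal(5000) > 2).astype(int)

    X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)
    clf = RandomForestClassifier(n_estimators=300, random_state=0).fit(X_tr, y_tr)
    print("AUC:", roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1]))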

  13. An improved framework for confound regression and filtering for control of motion artifact in the preprocessing of resting-state functional connectivity data.

    Science.gov (United States)

    Satterthwaite, Theodore D; Elliott, Mark A; Gerraty, Raphael T; Ruparel, Kosha; Loughead, James; Calkins, Monica E; Eickhoff, Simon B; Hakonarson, Hakon; Gur, Ruben C; Gur, Raquel E; Wolf, Daniel H

    2013-01-01

    Several recent reports in large, independent samples have demonstrated the influence of motion artifact on resting-state functional connectivity MRI (rsfc-MRI). Standard rsfc-MRI preprocessing typically includes regression of confounding signals and band-pass filtering. However, substantial heterogeneity exists in how these techniques are implemented across studies, and no prior study has examined the effect of differing approaches for the control of motion-induced artifacts. To better understand how in-scanner head motion affects rsfc-MRI data, we describe the spatial, temporal, and spectral characteristics of motion artifacts in a sample of 348 adolescents. Analyses utilize a novel approach for describing head motion on a voxelwise basis. Next, we systematically evaluate the efficacy of a range of confound regression and filtering techniques for the control of motion-induced artifacts. Results reveal that the effectiveness of preprocessing procedures on the control of motion is heterogeneous, and that improved preprocessing provides a substantial benefit beyond typical procedures. These results demonstrate that the effect of motion on rsfc-MRI can be substantially attenuated through improved preprocessing procedures, but not completely removed.
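
    A sketch of the two standard steps named above, confound regression followed by band-pass filtering of the residuals; the 0.01-0.08 Hz band and the TR are common choices, assumed here:

    # Nuisance regression then band-pass, applied along the time axis.
    import numpy as np
    from scipy.signal import butter, filtfilt

    def clean_voxels(data, confounds, tr=2.0, band=(0.01, 0.08)):
        """data: (time, voxels); confounds: (time, regressors)."""
        C = np.column_stack([np.ones(len(data)), confounds])   # add intercept
        beta, *_ = np.linalg.lstsq(C, data, rcond=None)        # confound fit
        residuals = data - C @ beta
        nyq = 0.5 / tr
        b, a = butter(2, [band[0] / nyq, band[1] / nyq], btype="band")
        return filtfilt(b, a, residuals, axis=0)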

  14. Preprocessing of multispectral data and simulation of ERTS data channels to make computer terrain maps of a Yellowstone National Park test site

    Science.gov (United States)

    Smedes, H. W.; Spencer, M. M.; Thomson, F. J.

    1970-01-01

    The possibility of improving the accuracy of terrain classification by preprocessing spectral data was investigated. Terrain maps were made using the following techniques: 1) preprocessing by scan angle function transformation, using the computer-selected best set of three channels; and 2) preprocessing by ratio transformation, using the specified ERTS data channels, simulated by fitting the spectral response of each of the 12 data channels to the ERTS channels by a set of weighting coefficients. By using a simple technique during printout, the maps were produced in color. The normalized scan angle function transformation resulted in the most accurate classification. The best ratio transformation for the Yellowstone Park data was the ratio of each channel to the sum of all channels. A supervised training program involving maximum likelihood decision for selecting the best spectrometer channels and similar techniques for digitizing the data of the analog magnetic tapes were used. Cloud shadows were recognized in addition to eight classes of terrain. Preprocessing of data resulted in more accurate maps, required fewer training areas (hence less preparation and computer time), and enabled much of the area formerly classified as shadow to be reclassified according to actual terrain type.

  15. Optimizing preprocessing and analysis pipelines for single-subject fMRI. I. Standard temporal motion and physiological noise correction methods.

    Science.gov (United States)

    Churchill, Nathan W; Oder, Anita; Abdi, Hervé; Tam, Fred; Lee, Wayne; Thomas, Christopher; Ween, Jon E; Graham, Simon J; Strother, Stephen C

    2012-03-01

    Subject-specific artifacts caused by head motion and physiological noise are major confounds in BOLD fMRI analyses. However, there is little consensus on the optimal choice of data preprocessing steps to minimize these effects. To evaluate the effects of various preprocessing strategies, we present a framework which comprises a combination of (1) nonparametric testing including reproducibility and prediction metrics of the data-driven NPAIRS framework (Strother et al. [2002]: NeuroImage 15:747-771), and (2) intersubject comparison of SPM effects, using DISTATIS, a three-way version of metric multidimensional scaling (Abdi et al. [2009]: NeuroImage 45:89-95). It is shown that the quality of brain activation maps may be significantly limited by sub-optimal choices of data preprocessing steps (or "pipeline") in a clinical task design, an fMRI adaptation of the widely used Trail-Making Test. The relative importance of motion correction, physiological noise correction, motion parameter regression, and temporal detrending was examined for fMRI data acquired in young, healthy adults. Analysis performance and the quality of activation maps were evaluated based on Penalized Discriminant Analysis (PDA). The relative importance of different preprocessing steps was assessed by (1) a nonparametric Friedman rank test for fixed sets of preprocessing steps, applied to all subjects; and (2) evaluating pipelines chosen specifically for each subject. Results demonstrate that preprocessing choices have significant, but subject-dependent effects, and that individually optimized pipelines may significantly improve the reproducibility of fMRI results over fixed pipelines. This was demonstrated by the detection of a significant interaction with motion parameter regression and physiological noise correction, even though the range of subject head motion was small across the group (≪ 1 voxel). Optimizing pipelines on an individual-subject basis also revealed brain activation patterns

  16. A graphical method to evaluate spectral preprocessing in multivariate regression calibrations: example with Savitzky-Golay filters and partial least squares regression.

    Science.gov (United States)

    Delwiche, Stephen R; Reeves, James B

    2010-01-01

    In multivariate regression analysis of spectroscopy data, spectral preprocessing is often performed to reduce unwanted background information (offsets, sloped baselines) or accentuate absorption features in intrinsically overlapping bands. These procedures, also known as pretreatments, are commonly smoothing operations or derivatives. While such operations are often useful in reducing the number of latent variables of the actual decomposition and lowering residual error, they also run the risk of misleading the practitioner into accepting calibration equations that are poorly adapted to samples outside of the calibration. The current study developed a graphical method to examine this effect on partial least squares (PLS) regression calibrations of near-infrared (NIR) reflection spectra of ground wheat meal with two analytes, protein content and sodium dodecyl sulfate sedimentation (SDS) volume (an indicator of the quantity of the gluten proteins that contribute to strong doughs). These two properties were chosen because of their differing abilities to be modeled by NIR spectroscopy: excellent for protein content, fair for SDS sedimentation volume. To further demonstrate the potential pitfalls of preprocessing, an artificial component, a randomly generated value, was included in PLS regression trials. Savitzky-Golay (digital filter) smoothing, first-derivative, and second-derivative preprocess functions (5 to 25 centrally symmetric convolution points, derived from quadratic polynomials) were applied to PLS calibrations of 1 to 15 factors. The results demonstrated the danger of an over-reliance on preprocessing when the number of samples used in a multivariate calibration is low. The method has application to the evaluation of other preprocess functions and various types of spectroscopy data.
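
    A sketch of one cell of such an evaluation, assuming scipy and scikit-learn: a given Savitzky-Golay pretreatment followed by a PLS fit scored by cross-validation. Window, derivative order and factor count would be swept over a grid:

    # Score one (window, derivative, factors) pretreatment combination.
    import numpy as np
    from scipy.signal import savgol_filter
    from sklearn.cross_decomposition import PLSRegression
    from sklearn.model_selection import cross_val_score

    def score_pretreatment(X, y, window=11, deriv=1, n_factors=8):
        Xp = savgol_filter(X, window_length=window, polyorder=2,
                           deriv=deriv, axis=1)
        pls = PLSRegression(n_components=n_factors)
        return cross_val_score(pls, Xp, y, cv=5, scoring="r2").mean()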

  17. Automatic Preprocessing of Tidal Gravity Observation Data

    Institute of Scientific and Technical Information of China (English)

    许闯; 罗志才; 林旭; 周波阳

    2013-01-01

    The preprocessing of tidal gravity observation data is very important for obtaining high-quality tidal harmonic analysis results. The preprocessing methods for tidal gravity observation data are studied systematically: an average-filtering method and a wavelet-filtering method for downsampling the original observations are given, as well as linear interpolation and cubic-spline interpolation methods for handling interrupted data. Automatic preprocessing software for tidal gravity observation data (APTsoft) has been developed, which can automatically flag and correct abnormal data such as spikes, steps and interruptions. Experimental results show that the preprocessing methods and APTsoft are very effective, and that APTsoft can be applied to the automatic preprocessing of tidal gravity observation data.
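
    A minimal sketch of two of the stated techniques, average-filter downsampling and cubic-spline bridging of gaps, assuming 1 Hz input samples; APTsoft's actual implementation is not public here, so this only illustrates the named methods:

    # Average filtering to 1-minute means, plus spline gap filling.
    import numpy as np
    from scipy.interpolate import CubicSpline

    def downsample_minute(samples_1hz):
        n = len(samples_1hz) // 60 * 60
        return samples_1hz[:n].reshape(-1, 60).mean(axis=1)  # average filter

    def fill_gaps(t, g):
        ok = np.isfinite(g)                                  # NaN marks gaps
        return CubicSpline(t[ok], g[ok])(t)                  # spline interpolation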

  18. Telemetry Data Preprocessing Based on Regular Expressions

    Institute of Scientific and Technical Information of China (English)

    陈红英; 张昌明; 何晶; 黄琼

    2015-01-01

    Telemetry data processing for a carrier rocket was applied for the first time in a maritime tracking, telemetry and control mission. The raw post-mission telemetry data received by the shipborne equipment contain a great deal of invalid, garbled code, the recorded data volume is very large, and the data must be processed within a short time. This paper first introduces the principles of telemetry data preprocessing and studies the preprocessing methods, then proposes a preprocessing scheme in which regular expressions are used to extract the original data from the huge data frames. With this scheme, the data preprocessing is completed in a short time, processing efficiency is improved, and a key technical problem in telemetry data processing is solved.
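
    A hedged sketch of the regex extraction idea, pulling fixed-format frames out of a byte stream littered with garbage; the sync word and frame layout below are invented for illustration:

    # Extract 64-byte payloads that follow an assumed sync marker.
    import re

    SYNC = b"\xeb\x90"                          # assumed frame sync marker
    FRAME = re.compile(re.escape(SYNC) + rb"(?P<payload>.{64})(?P<crc>.{2})",
                       re.DOTALL)

    def extract_frames(raw: bytes):
        return [m.group("payload") for m in FRAME.finditer(raw)]

    stream = b"\x00garbage" + SYNC + bytes(66) + b"noise" + SYNC + bytes(66)
    print(len(extract_frames(stream)))          # -> 2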

  19. An open-access software platform for the pre-processing of Earth Observation data from the MSG SEVIRI radiometer

    Science.gov (United States)

    Petropoulos, George; Sandric, Ionut; Anagnostopoulos, Vasilios

    2015-04-01

    The Spinning Enhanced Visible and Infrared Imager (SEVIRI) is a multispectral sensor and one of the main instruments on board the MSG series of platforms. From geostationary orbit the radiometer covers Europe every 15 minutes, and it can also acquire data every 5 minutes in the Rapid Scanning Service mode at the expense of coverage. SEVIRI has 12 spectral bands, five of which operate in the infrared wavelengths. The instrument has a geometric resolution of 1 km at nadir for the high-resolution visible channel and 3 km for the other spectral bands; detailed information on the SEVIRI specification and operation can be found on the EUMETSAT website. Data from the SEVIRI instrument are currently provided operationally by EUMETSAT, making a significant contribution to weather forecasting and global climate monitoring. Herein, a software tool developed in the Python programming language is presented that performs basic pre-processing of the raw SEVIRI data acquired from EUMETSAT. The tool implements key image processing steps on the SEVIRI data, including but not limited to data registration, country subsetting, masking and reprojection to any national or global coordinate system. SEVIRI data validation with reference data (e.g., from in-situ measurements if available) and generation of new datasets with ordinary linear regressions are other capabilities. The tool makes use of present-day multicore processors and is able to process very large datasets quickly. The practical usefulness of the software tool is also demonstrated using a variety of examples. Our work is significant to the users' community of the model and very timely, given that to our knowledge there is no similar tool available at present to the SEVIRI users' community, particularly so in the light of the wide range of operationally distributed EO products from

  20. Quality changes of pomegranate arils throughout shelf life affected by deficit irrigation and pre-processing storage.

    Science.gov (United States)

    Peña-Estévez, María E; Artés-Hernández, Francisco; Artés, Francisco; Aguayo, Encarna; Martínez-Hernández, Ginés Benito; Galindo, Alejandro; Gómez, Perla A

    2016-10-15

    This study investigated the influence of sustained deficit irrigation (SDI, 78% less water supply than the reference evapotranspiration, ET0) compared to a control (100% ET0) on the physicochemical and sensory qualities and health-promoting compounds of pomegranate arils stored for 14 days at 5°C. Prior to processing, the fruits were stored for 0, 30, 60 or 90 days at 5°C, and the effect of the pre-processing storage duration was also examined. Physicochemical and sensory qualities were maintained throughout the storage period. Arils from SDI fruit had lower punicalagin-α and ellagic acid losses than the control (13% vs 50%). However, the anthocyanin content decreased during the shelf life (72%) regardless of the treatment, and ascorbic acid decreased slightly. Arils from SDI showed a smaller glucose/fructose ratio loss (19%) than the control (35%). In general, arils from SDI fruit showed better quality and health attributes during the shelf life than the control samples.

  1. Advanced Recording and Preprocessing of Physiological Signals. [data processing equipment for flow measurement of blood flow by ultrasonics

    Science.gov (United States)

    Bentley, P. B.

    1975-01-01

    The measurement of the volume flow-rate of blood in an artery or vein requires both an estimate of the flow velocity and its spatial distribution and the corresponding cross-sectional area. Transcutaneous measurements of these parameters can be performed using ultrasonic techniques that are analogous to the measurement of moving objects by use of a radar. Modern digital data recording and preprocessing methods were applied to the measurement of blood-flow velocity by means of the CW Doppler ultrasonic technique. Only the average flow velocity was measured and no distribution or size information was obtained. Evaluations of current flowmeter design and performance, ultrasonic transducer fabrication methods, and other related items are given. The main thrust was the development of effective data-handling and processing methods by application of modern digital techniques. The evaluation resulted in useful improvements in both the flowmeter instrumentation and the ultrasonic transducers. Effective digital processing algorithms that provided enhanced blood-flow measurement accuracy and sensitivity were developed. Block diagrams illustrative of the equipment setup are included.

  2. A New Hybrid Model Based on Data Preprocessing and an Intelligent Optimization Algorithm for Electrical Power System Forecasting

    Directory of Open Access Journals (Sweden)

    Ping Jiang

    2015-01-01

    The construction of an electrical power system not only benefits the reasonable distribution and management of energy resources, but also satisfies the increasing demand for electricity, and it is often a pivotal part of national and regional economic development plans. This paper constructs a hybrid model, known as the E-MFA-BP model, that can forecast indices in the electrical power system, including wind speed, electrical load, and electricity price. First, ensemble empirical mode decomposition is applied to eliminate the noise of the original time series data. After data preprocessing, a back propagation neural network model carries out the forecasting. Owing to the instability of its structure, the modified firefly algorithm is employed to optimize the weight and threshold values of the back propagation network, yielding a hybrid model with higher forecasting quality. Three experiments are carried out to verify the effectiveness of the model. Through comparison with other traditional well-known forecasting models, and with models optimized by other optimization algorithms, the experimental results demonstrate that the hybrid model has the best forecasting performance.
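
    A sketch of the decompose-then-forecast pattern, assuming the PyEMD package for EEMD, with a plain MLP standing in for the firefly-tuned BP network (the optimization step is omitted here):

    # EEMD denoising followed by a lagged-input neural forecast.
    import numpy as np
    from PyEMD import EEMD                       # assumed: PyEMD package
    from sklearn.neural_network import MLPRegressor

    def forecast(series, lags=12):
        imfs = EEMD().eemd(series)               # decompose (needs >= 2 IMFs)
        denoised = imfs[1:].sum(axis=0)          # drop highest-frequency IMF
        X = np.array([denoised[i:i + lags]
                      for i in range(len(denoised) - lags)])
        y = denoised[lags:]
        model = MLPRegressor(hidden_layer_sizes=(16,), max_iter=2000,
                             random_state=0).fit(X, y)
        return model.predict(denoised[-lags:].reshape(1, -1))[0]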

  3. LoCuSS: The slow quenching of star formation in cluster galaxies and the need for pre-processing

    CERN Document Server

    Haines, C P; Smith, G P; Egami, E; Babul, A; Finoguenov, A; Ziparo, F; McGee, S L; Rawle, T D; Okabe, N; Moran, S M

    2015-01-01

    We present a study of the spatial distribution and kinematics of star-forming galaxies in 30 massive clusters at 0.15 < z < 0.30. The suppression of star formation at large cluster-centric radii cannot be reproduced by models in which star formation is quenched in infalling field galaxies only once they pass within r200 of the cluster, but is consistent with some of them being first pre-processed within galaxy groups. Despite the increasing f_SF-radius trend, the surface density of star-forming galaxies actually declines steadily with radius, falling ~15x from the core to 2r200. This requires star-formation to survive within recently accreted spirals for 2--3Gyr to build up the apparent over-density of star-forming galaxies within clusters...

  4. Exploration of preprocessing architectures for field-programmable gate array-based thermal-visual smart camera

    Science.gov (United States)

    Imran, Muhammad; Rinner, Bernhard; Zand, Sajjad Zandi; O'Nils, Mattias

    2016-07-01

    Embedded smart cameras are gaining in popularity for a number of real-time outdoor surveillance applications. However, challenges remain, i.e., computational latency, variation in illumination, and occlusion. To address these challenges, multimodal systems integrating multiple imagers can be utilized; the trade-off is more stringent processing and communication requirements for embedded platforms. We therefore investigated two low-complexity, high-performance preprocessing architectures for a multi-imager node on a field-programmable gate array (FPGA). In the proposed architectures, the majority of the tasks are performed on the thermal images because of their lower spatial resolution. Analysis with different sets of images shows that the system with the proposed architectures offers better detection performance and can reduce output data by a factor of 1.7 to 99 compared with full-size images. The proposed architectures achieve a frame rate of 53 fps, logic utilization of 2.1% to 4.1%, memory consumption of 148 to 987 KB, and power consumption in the range of 141 to 163 mW on an Artix-7 FPGA. We conclude that the proposed architectures offer reduced design complexity and lower processing and communication requirements while retaining the configurability of the system.

  5. Requirement Analysis of Data Preprocessing in Exploratory Simulation Experiments

    Institute of Scientific and Technical Information of China (English)

    李斌; 李春洪; 刘苏洋; 谢涌纹

    2012-01-01

    Exploratory simulation is a scientific means of studying complex systems, and data mining is an important method for processing the massive data that exploratory simulations produce; how to preprocess the data effectively before data mining is a difficult problem currently facing the simulation field. To address the unclear goals and unfocused emphasis of current data preprocessing work in exploratory simulation, a requirement analysis of data preprocessing for exploratory simulation experiments is put forward. Combining the characteristics of data produced in exploratory simulation experiments, the paper first analyzes the general requirements of data preprocessing, and then analyzes the preprocessing requirements of three typical data mining algorithms: decision trees, association rules, and cluster analysis. The research results meet the requirements of data preprocessing in exploratory simulation experiments.

  6. Performance Comparison of Several Pre-Processing Methods in a Hand Gesture Recognition System based on Nearest Neighbor for Different Background Conditions

    Directory of Open Access Journals (Sweden)

    Regina Lionnie

    2013-09-01

    This paper presents a performance analysis and comparison of several pre-processing methods used in a hand gesture recognition system. The preprocessing methods are based on combinations of several image processing operations, namely edge detection, low pass filtering, histogram equalization, thresholding and desaturation. The hand gesture recognition system is designed to classify an input image into one of six possible classes. The input images are taken with various background conditions. Our experiments showed that the best result is achieved when the pre-processing method consists of only a desaturation operation, achieving a classification accuracy of up to 83.15%.
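
    A tiny sketch of the winning configuration reported above, desaturation followed by nearest-neighbour classification on raw pixels; dataset loading is assumed and images are taken to be (H, W, 3) uint8 arrays:

    # Desaturate, flatten, 1-NN classify.
    import numpy as np
    from sklearn.neighbors import KNeighborsClassifier

    def desaturate(img):                     # simple channel-mean grayscale
        return img.astype(np.float32).mean(axis=2)

    def train(images, labels):
        X = np.stack([desaturate(im).ravel() for im in images])
        return KNeighborsClassifier(n_neighbors=1).fit(X, labels)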

  7. Preprocessing-Free All-Optical Clock Recovery from NRZ and NRZ-DPSK Signals Using an FP-SOA Based Active Filter

    Science.gov (United States)

    Wang, Fei; Zhang, Xin-Liang; Yu, Yu; Xu, En-Ming

    2011-06-01

    We demonstrate a simple scheme to perform all-optical clock recovery from input nonreturn-to-zero (NRZ) and nonreturn-to-zero differential phase-shift keying (NRZ-DPSK) data without any preprocessing measures. A multi-quantum-well Fabry-Pérot semiconductor optical amplifier plays the dual role of data format converter and clock recovery device. Using this scheme, a stable, low-jitter 35.80-GHz optical clock pulse sequence is extracted directly from the input NRZ or NRZ-DPSK data. The scheme has several distinct advantages, including simple device fabrication, transparency to data format, multiwavelength operation, preprocessing-free operation and convenient tuning. The potentially powerful adaptability of this scheme is very important for next-generation optical networks, in which various modulation formats coexist and the devices used are required to be transparent to data formats.

  8. Research on License Plate Image Preprocessing Methods Based on VC++

    Institute of Scientific and Technical Information of China (English)

    李德峰; 丁玉飞; 邱细亚

    2011-01-01

    License plate image preprocessing is an important component of a license plate recognition system. After briefly describing the characteristics of license plate images affected by environmental factors, this paper systematically describes each step of image preprocessing in a license plate recognition system, including gray-scale conversion, median filtering, gray-level stretching, Sobel-operator gradient sharpening, binarization, and plate tilt correction. An image preprocessing scheme is proposed and implemented in software developed with VC++; the experimental results of the various stages confirm that the scheme achieves good preprocessing performance.

  9. Investigation of thermochemical biorefinery sizing and environmental sustainability impacts for conventional supply system and distributed preprocessing supply system designs

    Energy Technology Data Exchange (ETDEWEB)

    Muth, jr., David J. [Idaho National Lab. (INL), Idaho Falls, ID (United States); Langholtz, Matthew H. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Tan, Eric [National Renewable Energy Lab. (NREL), Golden, CO (United States); Jacobson, Jacob [Idaho National Lab. (INL), Idaho Falls, ID (United States); Schwab, Amy [National Renewable Energy Lab. (NREL), Golden, CO (United States); Wu, May [Argonne National Lab. (ANL), Argonne, IL (United States); Argo, Andrew [Sundrop Fuels, Golden, CO (United States); Brandt, Craig C. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Cafferty, Kara [Idaho National Lab. (INL), Idaho Falls, ID (United States); Chiu, Yi-Wen [Argonne National Lab. (ANL), Argonne, IL (United States); Dutta, Abhijit [National Renewable Energy Lab. (NREL), Golden, CO (United States); Eaton, Laurence M. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Searcy, Erin [Idaho National Lab. (INL), Idaho Falls, ID (United States)

    2014-03-31

    The 2011 US Billion-Ton Update estimates that by 2030 there will be enough agricultural and forest resources to sustainably provide at least one billion dry tons of biomass annually, enough to displace approximately 30% of the country's current petroleum consumption. A portion of these resources are inaccessible at current cost targets with conventional feedstock supply systems because of their remoteness or low yields. Reliable analyses and projections of US biofuels production depend on assumptions about the supply system and biorefinery capacity, which, in turn, depend upon economic value, feedstock logistics, and sustainability. A cross-functional team has examined combinations of advances in feedstock supply systems and biorefinery capacities with rigorous design information, improved crop yield and agronomic practices, and improved estimates of sustainable biomass availability. A previous report on biochemical refinery capacity noted that under advanced feedstock logistic supply systems that include depots and pre-processing operations there are cost advantages that support larger biorefineries, up to 10 000 DMT/day facilities, compared to the smaller 2000 DMT/day facilities. This report focuses on analyzing conventional versus advanced depot biomass supply systems for a thermochemical conversion and refinery sizing based on woody biomass. The results of this analysis demonstrate that the economies of scale enabled by advanced logistics offset much of the added logistics costs from additional depot processing and transportation, resulting in a small overall increase to the minimum ethanol selling price compared to the conventional logistic supply system. While the overall costs do increase slightly for the advanced logistic supply systems, the ability to mitigate moisture and ash in the system will improve the storage and conversion processes. In addition, being able to draw on feedstocks from further distances will decrease the risk of biomass supply to

  10. Rapid prototyping of SoC-based real-time vision system: application to image preprocessing and face detection

    Science.gov (United States)

    Jridi, Maher; Alfalou, Ayman

    2017-05-01

    The major goal of this paper is to investigate the multi-CPU/FPGA SoC (System on Chip) design flow and to transfer the know-how and skills needed to rapidly design an embedded real-time vision system. Our aim is to show how the use of these devices can benefit system-level integration, since they make simultaneous hardware and software development possible. We take face detection and image pre-processing as a case study, since they have great potential to be used in several applications such as video surveillance, building access control and criminal identification. The designed system uses the Xilinx Zedboard platform, which is the central element of the developed vision system. Video acquisition is performed using either a standard webcam connected to the Zedboard via the USB interface or several IP camera devices. Visualization of the video content and intermediate results is possible through an HDMI interface connected to an HD display. The treatments embedded in the system are as follows: (i) pre-processing such as edge detection, implemented in the ARM and in the reconfigurable logic; (ii) software implementation of motion detection and face detection using either Viola-Jones or LBP (Local Binary Patterns); and (iii) an application layer to select the processing application and display results in a web page. One uniquely interesting feature of the proposed system is that two functions have been developed to transmit data from and to the VDMA port. With the proposed optimization, the hardware implementation of the Sobel filter takes 27 ms and 76 ms for 640x480 and 720p resolutions, respectively. Hence, with the FPGA implementation, an acceleration of 5 times is obtained, which allows the processing of 37 fps and 13 fps for 640x480 and 720p resolutions, respectively.

  11. Research on Fingerprint Image Preprocessing

    Institute of Scientific and Technical Information of China (English)

    周奇

    2012-01-01

    This paper discusses and implements a fingerprint image enhancement method based on Gabor filtering. Locally, fingerprint ridges and valleys alternate approximately like a plane sinusoidal wave, so the Fourier spectrum has a pronounced peak whose orientation and position depend on the direction and frequency of the ridge lines; a Gabor filter, with its good orientation- and frequency-selective properties, can therefore achieve a strong enhancement effect. In this fingerprint image enhancement method, the key is how to calculate the ridge orientation and ridge frequency accurately. The paper introduces the fingerprint image preprocessing methods in detail, including segmentation and enhancement of the fingerprint images, and presents the flow of the enhancement algorithm together with experimental results.
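
    An illustrative OpenCV sketch of the core operation: enhancing a fingerprint block with a Gabor filter tuned to a previously estimated local ridge orientation and frequency. The estimation itself, which the abstract identifies as the key step, is assumed done, and the kernel size and sigma are illustrative:

    # Gabor enhancement of one fingerprint block.
    import cv2
    import numpy as np

    def enhance_block(block, theta, ridge_freq):
        lambd = 1.0 / ridge_freq              # wavelength from ridge frequency
        kernel = cv2.getGaborKernel(ksize=(17, 17), sigma=4.0, theta=theta,
                                    lambd=lambd, gamma=0.5, psi=0.0)
        return cv2.filter2D(block.astype(np.float32), -1, kernel)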

  12. Remote Sensing Data Preprocessing Based on Spaceborne Hyperspectral Data

    Institute of Scientific and Technical Information of China (English)

    张丹丹; 岳彩荣

    2012-01-01

    Remote sensing image preprocessing is the basis of remote sensing data applications, and the quality of the pretreatment affects the image quality and all subsequent studies. This paper describes the preprocessing of one EO-1 Hyperion hyperspectral scene crossing the central part of Shangri-La County. The preprocessing comprises band combination of the data under the ENVI patch, removal of uncalibrated and water-vapor-affected bands, conversion to absolute radiance, and atmospheric correction. The results show that the image quality is improved and the data volume for subsequent computation is reduced, laying a good foundation for application research.

  13. Research and Progress of Hyperspectral Image Preprocessing Methods

    Institute of Scientific and Technical Information of China (English)

    杨仁欣; 杨燕; 原晶晶

    2015-01-01

    Hyperspectral image preprocessing methods are of great significance in hyperspectral image processing. Effective preprocessing can reduce or even eliminate the influence of irrelevant information (such as sample background, electrical noise and stray light) on the hyperspectral image, providing a more reliable data source for subsequent hyperspectral image analysis. Based on the data structure of hyperspectral imagery, which integrates spectra and images, this article reviews image preprocessing methods such as histogram equalization, median filtering and edge detection, as well as spectral preprocessing methods such as smoothing, derivatives, standard normalization and multiplicative scatter correction, and gives application examples of these methods. The article also introduces in detail two spectral preprocessing methods based on data compression and information extraction, the Fourier transform and the wavelet transform. Research on these new methods lays a good foundation for the subsequent data analysis of hyperspectral images.

  14. Preprocessing Techniques for Hyperspectral Images

    Institute of Scientific and Technical Information of China (English)

    杨燕杰; 赵英俊; 秦凯; 陆冬华

    2013-01-01

    Taking the processing of the CASI/SASI hyperspectral data obtained by the Remote Sensing Key Laboratory of the Beijing Research Institute of Uranium Geology as an example, this paper summarizes the process flow, technical details and technical points of hyperspectral data preprocessing, and, in view of the problems encountered, such as the large data volume, long processing time and difficulty of image mosaicking, sets out the corresponding countermeasures, providing a good application example and technical support for the preprocessing of high-spatial-resolution hyperspectral data. Airborne hyperspectral images are distorted by the terrain, so they should be orthorectified with high-resolution DEM data. Because the ground images are acquired from the aircraft under varying conditions, the same object can have different colors on two images, so the color difference must be eliminated; a great number of tests show that the CROSS model of the ENVI software can eliminate this color difference. As the spatial and spectral resolutions of hyperspectral images improve, the data volume increases, which adds difficulty and slows hyperspectral image processing, so the spare time of the airborne acquisition procedure should be used for some of the pre-processing work. The characteristic bands are used for re-sampling of the hyperspectral images, which can improve the

  15. Implications of different digital elevation models and preprocessing techniques to delineate debris flow inundation hazard zones in El Salvador

    Science.gov (United States)

    Anderson, E. R.; Griffin, R.; Irwin, D.

    2013-12-01

    Heavy rains and steep, volcanic slopes in El Salvador cause numerous landslides every year, posing a persistent threat to the population, economy and environment. Although potential debris inundation hazard zones have been delineated using digital elevation models (DEMs), some disparities exist between the simulated zones and actual affected areas. Moreover, these hazard zones have only been identified for volcanic lahars and not the shallow landslides that occur nearly every year. This is despite the availability of tools to delineate a variety of landslide types (e.g., the USGS-developed LAHARZ software). Limitations in DEM spatial resolution, age of the data, and hydrological preprocessing techniques can contribute to inaccurate hazard zone definitions. This study investigates the impacts of using different elevation models and pit filling techniques in the final debris hazard zone delineations, in an effort to determine which combination of methods most closely agrees with observed landslide events. In particular, a national DEM digitized from topographic sheets from the 1970s and 1980s provide an elevation product at a 10 meter resolution. Both natural and anthropogenic modifications of the terrain limit the accuracy of current landslide hazard assessments derived from this source. Global products from the Shuttle Radar Topography Mission (SRTM) and the Advanced Spaceborne Thermal Emission and Reflection Radiometer Global DEM (ASTER GDEM) offer more recent data but at the cost of spatial resolution. New data derived from the NASA Uninhabited Aerial Vehicle Synthetic Aperture Radar (UAVSAR) in 2013 provides the opportunity to update hazard zones at a higher spatial resolution (approximately 6 meters). Hydrological filling of sinks or pits for current hazard zone simulation has previously been achieved through ArcInfo spatial analyst. Such hydrological processing typically only fills pits and can lead to drastic modifications of original elevation values

  16. LoCuSS: THE SLOW QUENCHING OF STAR FORMATION IN CLUSTER GALAXIES AND THE NEED FOR PRE-PROCESSING

    Energy Technology Data Exchange (ETDEWEB)

    Haines, C. P. [Departamento de Astronomía, Universidad de Chile, Casilla 36-D, Correo Central, Santiago (Chile); Pereira, M. J.; Egami, E.; Rawle, T. D. [Steward Observatory, University of Arizona, 933 North Cherry Avenue, Tucson, AZ 85721 (United States); Smith, G. P.; Ziparo, F.; McGee, S. L. [School of Physics and Astronomy, University of Birmingham, Edgbaston, Birmingham, B15 2TT (United Kingdom); Babul, A. [Department of Physics and Astronomy, University of Victoria, 3800 Finnerty Road, Victoria, BC, V8P 1A1 (Canada); Finoguenov, A. [Department of Physics, University of Helsinki, Gustaf Hällströmin katu 2a, FI-0014 Helsinki (Finland); Okabe, N. [Academia Sinica Institute of Astronomy and Astrophysics (ASIAA), P.O. Box 23-141, Taipei 10617, Taiwan (China); Moran, S. M., E-mail: cphaines@das.uchile.cl [Smithsonian Astrophysical Observatory, 60 Garden Street, Cambridge, MA 02138 (United States)

    2015-06-10

    We present a study of the spatial distribution and kinematics of star-forming galaxies in 30 massive clusters at 0.15 < z < 0.30, combining wide-field Spitzer 24 μm and GALEX near-ultraviolet imaging with highly complete spectroscopy of cluster members. The fraction (f_SF) of star-forming cluster galaxies rises steadily with cluster-centric radius, increasing fivefold by 2r_200, but remains well below field values even at 3r_200. This suppression of star formation at large radii cannot be reproduced by models in which star formation is quenched in infalling field galaxies only once they pass within r_200 of the cluster, but is consistent with some of them being first pre-processed within galaxy groups. Despite the increasing f_SF-radius trend, the surface density of star-forming galaxies actually declines steadily with radius, falling ∼15× from the core to 2r_200. This requires star formation to survive within recently accreted spirals for 2–3 Gyr to build up the apparent over-density of star-forming galaxies within clusters. The velocity dispersion profile of the star-forming galaxy population shows a sharp peak of 1.44 σ_ν at 0.3r_500, and is 10%–35% higher than that of the inactive cluster members at all cluster-centric radii, while their velocity distribution shows a flat, top-hat profile within r_500. All of these results are consistent with star-forming cluster galaxies being an infalling population, but one that must also survive ∼0.5–2 Gyr beyond passing within r_200. By comparing the observed distribution of star-forming galaxies in the stacked caustic diagram with predictions from the Millennium simulation, we obtain a best-fit model in which star formation rates decline exponentially on quenching timescales of 1.73 ± 0.25 Gyr upon accretion into the cluster.

  16. MUSIC algorithm based on delay correlation preprocessing

    Institute of Scientific and Technical Information of China (English)

    初萍; 司伟建

    2013-01-01

    To address the problem that the resolution of the MUSIC algorithm is limited by factors such as signal-to-noise ratio (SNR), the number of snapshots and the number of array elements, a MUSIC algorithm based on delay correlation preprocessing is proposed, in which the covariance matrix is reconstructed from the delay correlation functions of the data received at each array element. According to the relationship between the delay correlation functions of the array elements and the original array manifold and the signal delay correlation functions, four delay correlation function matrices whose manifold is the same as (or the conjugate of) the original array manifold are derived. The covariance of each matrix is computed and the matrices are summed according to given rules to obtain a new covariance matrix, which is then eigendecomposed. A spectrum function is constructed that exploits both the high robustness of signal-subspace processing and the high estimation accuracy of noise-subspace processing; spectral peaks are searched, and DOA estimation is realized. Simulation experiments verify the feasibility and effectiveness of the proposed algorithm.
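    For orientation, the sketch below implements the classical MUSIC pseudospectrum for a uniform linear array with half-wavelength spacing; the paper's delay-correlation reconstruction would replace the plain sample covariance R computed here, and all array and signal parameters are illustrative assumptions.

```python
# Classical MUSIC DOA spectrum sketch (assumptions: narrowband signals,
# uniform linear array, half-wavelength element spacing).
import numpy as np

def music_spectrum(X, n_sources, n_grid=361):
    """X: (n_elements, n_snapshots) array of complex snapshots."""
    n_elem = X.shape[0]
    R = X @ X.conj().T / X.shape[1]          # sample covariance matrix
    eigvals, eigvecs = np.linalg.eigh(R)     # eigenvalues in ascending order
    En = eigvecs[:, : n_elem - n_sources]    # noise subspace
    angles = np.linspace(-90, 90, n_grid)
    p = np.empty(n_grid)
    for i, theta in enumerate(angles):
        a = np.exp(-1j * np.pi * np.arange(n_elem) * np.sin(np.radians(theta)))
        p[i] = 1.0 / np.real(a.conj() @ En @ En.conj().T @ a)
    return angles, p

# Two sources at -20 and 30 degrees, 8 elements, 200 snapshots.
rng = np.random.default_rng(0)
n_elem, n_snap = 8, 200
doas = np.radians([-20, 30])
A = np.exp(-1j * np.pi * np.outer(np.arange(n_elem), np.sin(doas)))
S = rng.standard_normal((2, n_snap)) + 1j * rng.standard_normal((2, n_snap))
N = 0.3 * (rng.standard_normal((n_elem, n_snap))
           + 1j * rng.standard_normal((n_elem, n_snap)))
angles, p = music_spectrum(A @ S + N, n_sources=2)
print(angles[np.argsort(p)[-5:]])  # grid angles near the spectral peaks
```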

  18. Application of TEQC and CF2PS in BDS Data Pre-Processing

    Institute of Scientific and Technical Information of China (English)

    党金涛; 李建文; 黄海; 罗璠

    2014-01-01

    This paper analyzes the advantages and disadvantages of TEQC and CF2PS applied in the pre-processing of BDS observational data, and describes the principles and methods of TEQC and CF2PS in quality checking, data editing and file plotting. To overcome the poor human-computer interaction and slow manual operation of TEQC and CF2PS, a program for pre-processing BDS observational data was written, realizing batch processing and graphical visualization. Finally, combined with a practical application example, the basic approach of BDS data pre-processing is explored, and the results show that the data quality is greatly improved after pre-processing.
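    A minimal sketch of the kind of batch wrapper described above, driving TEQC's quality check over a directory of RINEX observation files from Python; the "+qc" switch is standard TEQC usage, while the directory layout and file pattern are assumptions.

```python
# Batch TEQC quality-check sketch (assumptions: RINEX 2 observation files
# named like site1230.14o under data/; teqc is on the PATH).
import glob
import subprocess

for obs_file in sorted(glob.glob("data/*.??o")):
    # teqc +qc writes an ASCII quality report next to the input file.
    result = subprocess.run(["teqc", "+qc", obs_file],
                            capture_output=True, text=True)
    print(obs_file, "->", "ok" if result.returncode == 0 else result.stderr[:80])
```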

  19. Parallelizing flow-accumulation calculations on graphics processing units—From iterative DEM preprocessing algorithm to recursive multiple-flow-direction algorithm

    Science.gov (United States)

    Qin, Cheng-Zhi; Zhan, Lijun

    2012-06-01

    As one of the important tasks in digital terrain analysis, the calculation of flow accumulations from gridded digital elevation models (DEMs) usually involves two steps in a real application: (1) using an iterative DEM preprocessing algorithm to remove the depressions and flat areas commonly contained in real DEMs, and (2) using a recursive flow-direction algorithm to calculate the flow accumulation for every cell in the DEM. Because both algorithms are computationally intensive, quick calculation of the flow accumulations from a DEM (especially for a large area) presents a practical challenge to personal computer (PC) users. In recent years, rapid increases in hardware capacity of the graphics processing units (GPUs) provided in modern PCs have made it possible to meet this challenge in a PC environment. Parallel computing on GPUs using a compute-unified-device-architecture (CUDA) programming model has been explored to speed up the execution of the single-flow-direction algorithm (SFD). However, the parallel implementation on a GPU of the multiple-flow-direction (MFD) algorithm, which generally performs better than the SFD algorithm, has not been reported. Moreover, GPU-based parallelization of the DEM preprocessing step in the flow-accumulation calculations has not been addressed. This paper proposes a parallel approach to calculate flow accumulations (including both iterative DEM preprocessing and a recursive MFD algorithm) on a CUDA-compatible GPU. For the parallelization of an MFD algorithm (MFD-md), two different parallelization strategies using a GPU are explored. The first parallelization strategy, which has been used in the existing parallel SFD algorithm on GPU, has the problem of computing redundancy. Therefore, we designed a parallelization strategy based on graph theory. The application results show that the proposed parallel approach to calculate flow accumulations on a GPU performs much faster than either sequential algorithms or other parallel GPU
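    As a point of reference for the second step, the sketch below gives a sequential recursive multiple-flow-direction (MFD) accumulation on an already depression-free, flat-free DEM, assuming unit cell size and a simple slope-proportional flow partition; the paper's contribution is mapping this kind of recursion onto a CUDA GPU, which is not shown here.

```python
# Sequential MFD flow-accumulation sketch (assumptions: depression-free DEM
# with no flat areas, unit cell size, slope-proportional partitioning).
import numpy as np

def mfd_accumulation(dem):
    rows, cols = dem.shape
    acc = np.full(dem.shape, -1.0)            # -1 marks "not yet computed"
    nbrs = [(-1,-1),(-1,0),(-1,1),(0,-1),(0,1),(1,-1),(1,0),(1,1)]

    def upslope(r, c):
        if acc[r, c] >= 0:
            return acc[r, c]
        acc[r, c] = 1.0                       # each cell contributes itself
        for dr, dc in nbrs:
            ur, uc = r + dr, c + dc
            if 0 <= ur < rows and 0 <= uc < cols and dem[ur, uc] > dem[r, c]:
                # Fraction of (ur,uc)'s flow routed here: slope-weighted over
                # all of (ur,uc)'s downslope neighbors.
                drops = [(dem[ur, uc] - dem[ur+a, uc+b]) / np.hypot(a, b)
                         for a, b in nbrs
                         if 0 <= ur+a < rows and 0 <= uc+b < cols
                         and dem[ur+a, uc+b] < dem[ur, uc]]
                my_drop = (dem[ur, uc] - dem[r, c]) / np.hypot(dr, dc)
                acc[r, c] += upslope(ur, uc) * my_drop / sum(drops)
        return acc[r, c]

    for r in range(rows):
        for c in range(cols):
            upslope(r, c)
    return acc

dem = np.array([[9, 8, 7],
                [8, 6, 5],
                [7, 5, 3]], dtype=float)
print(mfd_accumulation(dem))   # accumulation grows toward the low corner
```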

  20. Preprocessing-Free All-Optical Clock Recovery from NRZ and NRZ-DPSK Signals Using an FP-SOA Based Active Filter

    Institute of Scientific and Technical Information of China (English)

    WANG Fei; ZHANG Xin-Liang; YU Yu; XU En-Ming

    2011-01-01

    We demonstrate a simple scheme to perform all-optical clock recovery from input nonreturn-to-zero (NRZ) and nonreturn-to-zero differential phase shifted keying (NRZ-DPSK) data that avoids any preprocessing measures. A multi-quantum-well Fabry-Perot semiconductor optical amplifier plays the dual role of the data format converter and the clock recovery device. Using this scheme, a stable, low-jitter 35.80-GHz optical clock pulse sequence is directly extracted from the input NRZ or NRZ-DPSK data. This scheme has some distinct advantages such as simple device fabrication, transparency to data format, multiwavelength operation, preprocessing-free operation and convenient tuning. The potentially powerful adaptability of this scheme is very important for next-generation optical networks, in which various modulation formats coexist and the devices used are required to be transparent to data formats.

  1. International Best Practices for Pre-Processing and Co-Processing Municipal Solid Waste and Sewage Sludge in the Cement Industry

    Energy Technology Data Exchange (ETDEWEB)

    Hasanbeigi, Ali [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Lu, Hongyou [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Williams, Christopher [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Price, Lynn [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)

    2012-07-01

    The purpose of this report is to describe international best practices for pre-processing and co-processing of MSW and sewage sludge in cement plants, for the benefit of countries that wish to develop co-processing capacity. The report is divided into three main sections. Section 2 describes the fundamentals of co-processing, Section 3 describes exemplary international regulatory and institutional frameworks for co-processing, and Section 4 describes international best practices related to the technological aspects of co-processing.

  2. Enhanced methane and hydrogen yields from catalytic supercritical water gasification of pine wood sawdust via pre-processing in subcritical water

    OpenAIRE

    Onwudili, JA; Williams, PT

    2013-01-01

    A two-stage batch hydrothermal process has been investigated with the aim of enhancing the yields of hydrogen and methane from sawdust. Samples of the sawdust were rapidly treated in subcritical water and with added Na2CO3 (alkaline compound) and Nb2O3 (solid acid) at 280 °C, 8 MPa. Each pre-processing route resulted in a solid recovered product (SRP), an aqueous residue and a small amount of gas composed mainly of CO2. In the second stage, the SRP and the liquid residues were gasified in sup...

  3. Analysis on Data Pre-processing of TCM Data Mining

    Institute of Scientific and Technical Information of China (English)

    刘广; 孙宏

    2012-01-01

    This essay introduces the characteristics of TCM (traditional Chinese medicine) data, explains the necessity of preprocessing raw TCM data, and describes methods for standardizing the structure of TCM experimental data, so as to greatly improve the efficiency of data mining.

  4. b-Bit Minwise Hashing in Practice: Large-Scale Batch and Online Learning and Using GPUs for Fast Preprocessing with Simple Hash Functions

    CERN Document Server

    Li, Ping; Konig, Arnd Christian

    2012-01-01

    In this paper, we study several critical issues which must be tackled before one can apply b-bit minwise hashing to the volumes of data often used in industrial applications, especially in the context of search. 1. (b-bit) Minwise hashing requires an expensive preprocessing step that computes k (e.g., 500) minimal values after applying the corresponding permutations for each data vector. We developed a parallelization scheme using GPUs and observed that the preprocessing time can be reduced by a factor of 20-80 and becomes substantially smaller than the data loading time. 2. One major advantage of b-bit minwise hashing is that it can substantially reduce the amount of memory required for batch learning. However, as online algorithms become increasingly popular for large-scale learning in the context of search, it is not clear if b-bit minwise hashing yields significant improvements for them. This paper demonstrates that b-bit minwise hashing provides an effective data size/dimension reduction scheme and hence it can d...
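    To make the core idea concrete, the sketch below estimates Jaccard similarity from b-bit min-hash signatures, substituting universal hashing for true random permutations; k, b and the example sets are illustrative assumptions, and the collision correction is the usual first-order approximation rather than the paper's exact estimator.

```python
# b-bit minwise hashing sketch (assumptions: universal hashing instead of
# random permutations; integer item sets).
import random

def bbit_minhash(items, k=500, b=2, seed=1):
    rng = random.Random(seed)
    prime = (1 << 61) - 1
    coeffs = [(rng.randrange(1, prime), rng.randrange(prime)) for _ in range(k)]
    sigs = []
    for a, c in coeffs:
        m = min((a * hash(x) + c) % prime for x in items)
        sigs.append(m & ((1 << b) - 1))      # keep only the lowest b bits
    return sigs

s1 = set(range(0, 80))
s2 = set(range(20, 100))                     # true Jaccard = 60/100 = 0.6
sig1, sig2 = bbit_minhash(s1), bbit_minhash(s2)
match = sum(x == y for x, y in zip(sig1, sig2)) / len(sig1)
# Correct for accidental b-bit collisions of non-matching min-hashes:
# E[match] ~ J + (1 - J) / 2**b  =>  J ~ (match - 2**-b) / (1 - 2**-b)
b = 2
print((match - 2**-b) / (1 - 2**-b))         # rough estimate of 0.6
```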

  5. INFLUENCE OF RAW IMAGE PREPROCESSING AND OTHER SELECTED PROCESSES ON ACCURACY OF CLOSE-RANGE PHOTOGRAMMETRIC SYSTEMS ACCORDING TO VDI 2634

    Directory of Open Access Journals (Sweden)

    J. Reznicek

    2016-06-01

    Full Text Available This paper examines the influence of raw image preprocessing and other selected processes on the accuracy of close-range photogrammetric measurement. The examined processes and features include: raw image preprocessing, sensor unflatness, distance-dependent lens distortion, extending the input observations (image measurements) by incorporating all RGB colour channels, ellipse centre eccentricity and target detecting. The examination of each effect is carried out experimentally by performing the validation procedure proposed in the German VDI guideline 2634/1. The validation procedure is based on performing standard photogrammetric measurements of highly accurate calibrated measuring lines (multi-scale bars) with known lengths (typical uncertainty = 5 μm at 2 sigma). The comparison of the measured lengths with the known values gives the maximum length measurement error LME, which characterizes the accuracy of the validated photogrammetric system. For higher reliability the VDI test field was photographed ten times independently with the same configuration and camera settings. The images were acquired with the metric ALPA 12WA camera. The tests are performed on all ten measurements which gives the possibility to measure the repeatability of the estimated parameters as well. The influences are examined by comparing the quality characteristics of the reference and tested settings.
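    As a small worked example, the sketch below computes the VDI/VDE 2634/1 acceptance figure referred to above, the maximum length measurement error (LME) over a set of measured test lengths; the nominal and measured values are made-up numbers for illustration.

```python
# LME computation sketch per VDI/VDE 2634/1 (assumption: made-up test lengths).
import numpy as np

nominal  = np.array([200.000, 400.000, 800.000, 1200.000])   # mm, calibrated
measured = np.array([200.012, 399.995, 800.020, 1199.978])   # mm, photogrammetry

errors = measured - nominal
lme = np.max(np.abs(errors))
print(f"LME = {lme * 1000:.0f} um")  # system passes if LME <= the stated limit
```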

  6. Land 3D-Seismic Data: Preprocessing Quality Control Utilizing Survey Design Specifications, Noise Properties, Normal Moveout, First Breaks, and Offset

    Institute of Scientific and Technical Information of China (English)

    Abdelmoneam Raef

    2009-01-01

    The recent proliferation of the 3D reflection seismic method into the near-surface area of geophysical applications, especially in response to the emergence of the need to comprehensively characterize and monitor near-surface carbon dioxide sequestration in shallow saline aquifers around the world, justifies the emphasis on cost-effective and robust quality control and assurance (QC/QA) workflow of 3D seismic data preprocessing that is suitable for near-surface applications. The main purpose of our seismic data preprocessing QC is to enable the use of appropriate header information, data that are free of noise-dominated traces, and/or flawed vertical stacking in subsequent processing steps. In this article, I provide an account of utilizing survey design specifications, noise properties, first breaks, and normal moveout for rapid and thorough graphical QC/QA diagnostics, which are easy to apply and efficient in the diagnosis of inconsistencies. A correlated vibroseis time-lapse 3D-seismic data set from a CO2-flood monitoring survey is used for demonstrating QC diagnostics. An important by-product of the QC workflow is establishing the number of layers for a refraction statics model in a data-driven graphical manner that capitalizes on the spatial coverage of the 3D seismic data.

  7. Land 3D-seismic data: Preprocessing quality control utilizing survey design specifications, noise properties, normal moveout, first breaks, and offset

    Science.gov (United States)

    Raef, A.

    2009-01-01

    The recent proliferation of the 3D reflection seismic method into the near-surface area of geophysical applications, especially in response to the emergence of the need to comprehensively characterize and monitor near-surface carbon dioxide sequestration in shallow saline aquifers around the world, justifies the emphasis on cost-effective and robust quality control and assurance (QC/QA) workflow of 3D seismic data preprocessing that is suitable for near-surface applications. The main purpose of our seismic data preprocessing QC is to enable the use of appropriate header information, data that are free of noise-dominated traces, and/or flawed vertical stacking in subsequent processing steps. In this article, I provide an account of utilizing survey design specifications, noise properties, first breaks, and normal moveout for rapid and thorough graphical QC/QA diagnostics, which are easy to apply and efficient in the diagnosis of inconsistencies. A correlated vibroseis time-lapse 3D-seismic data set from a CO2-flood monitoring survey is used for demonstrating QC diagnostics. An important by-product of the QC workflow is establishing the number of layers for a refraction statics model in a data-driven graphical manner that capitalizes on the spatial coverage of the 3D seismic data. © China University of Geosciences (Wuhan) and Springer-Verlag GmbH 2009.
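    One simple instance of the "noise-dominated traces" screening mentioned above is an RMS-amplitude test against the ensemble median; the sketch below applies it to a synthetic gather, with the threshold factor and data entirely illustrative assumptions.

```python
# Noise-dominated trace flagging sketch (assumptions: synthetic gather,
# 3x-median RMS threshold).
import numpy as np

rng = np.random.default_rng(0)
n_traces, n_samples = 60, 1000
gather = 0.1 * rng.standard_normal((n_traces, n_samples))
gather[::10] += 5.0 * rng.standard_normal((n_traces // 10, n_samples))  # bad traces

rms = np.sqrt(np.mean(gather**2, axis=1))
threshold = 3.0 * np.median(rms)
bad = np.flatnonzero(rms > threshold)
print("noise-dominated traces:", bad)   # candidates to exclude before stacking
```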

  8. Research on telemetry data preprocessing of missiles

    Institute of Scientific and Technical Information of China (English)

    张东; 吴晓琳

    2011-01-01

    Data preprocessing is an important step in missile telemetry data processing. This paper systematically expounds the methods and procedures of data preprocessing, gives engineering practice methods, and formulates the corresponding decision rules and calculation formulas according to the requirements of test missions. Extensive trial applications show that these methods are simple, effective and highly reliable, laying the foundation for telemetry data processing.

  9. Maintenance Support Data Preprocessing Technology for Complex Devices

    Institute of Scientific and Technical Information of China (English)

    李季; 孙凯; 白文

    2016-01-01

    The preprocessing of maintenance support data for complex devices is an important guarantee of the data's correctness and accuracy, and determines the quality and effectiveness of subsequent maintenance support work. This paper analyzes the application requirements and characteristics of complex equipment maintenance support data, summarizes the related technologies and methods, and constructs the basic process of data preprocessing. Finally, a numerical example of data reduction based on information entropy theory is analyzed using maintainability data of a certain device.

  10. Effects of non-local diffusion on structural MRI preprocessing and default network mapping: statistical comparisons with isotropic/anisotropic diffusion.

    Directory of Open Access Journals (Sweden)

    Xi-Nian Zuo

    Full Text Available The neuroimaging community usually employs spatial smoothing to denoise magnetic resonance imaging (MRI) data, e.g., with Gaussian smoothing kernels. Such isotropic diffusion (ISD) based smoothing is widely adopted for denoising purposes due to its easy implementation and efficient computation. Beyond these advantages, Gaussian smoothing kernels tend to blur the edges, curvature and texture of images. Researchers have proposed anisotropic diffusion (ASD) and non-local diffusion (NLD) kernels. We recently demonstrated the effect of these new filtering paradigms on preprocessing real degraded MRI images from three individual subjects. Here, to further systematically investigate the effects at a group level, we collected both structural and functional MRI data from 23 participants. We first evaluated the three smoothing strategies' impact on brain extraction, segmentation and registration. Finally, we investigated how they affect subsequent mapping of the default network based on resting-state functional MRI (R-fMRI) data. Our findings suggest that NLD-based spatial smoothing may be more effective and reliable at improving the quality of both MRI data preprocessing and default network mapping. We thus recommend NLD as a promising smoothing method for structural MRI images in R-fMRI pipelines.
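    The sketch below contrasts the two endpoints of the smoothing spectrum compared above, isotropic Gaussian smoothing versus non-local means, on a synthetic 2D slice using scikit-image; the filter parameters and the edge metric are illustrative assumptions, not the paper's settings.

```python
# Gaussian (ISD) vs non-local (NLD-style) smoothing sketch (assumptions:
# synthetic 2D slice, illustrative filter parameters).
import numpy as np
from scipy import ndimage
from skimage.restoration import denoise_nl_means, estimate_sigma

rng = np.random.default_rng(0)
slice2d = np.zeros((64, 64))
slice2d[16:48, 16:48] = 1.0                      # a sharp-edged "structure"
noisy = slice2d + 0.2 * rng.standard_normal(slice2d.shape)

gauss = ndimage.gaussian_filter(noisy, sigma=2)  # isotropic: blurs edges
sigma = float(np.mean(estimate_sigma(noisy)))
nld = denoise_nl_means(noisy, h=1.15 * sigma, patch_size=5,
                       patch_distance=6, fast_mode=True)

for name, img in [("gaussian", gauss), ("nl-means", nld)]:
    edge = np.abs(np.diff(img[32, :])).max()     # preserved edge => larger step
    print(name, round(edge, 3))
```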

  11. Data Preprocessing Study in GDP Forecasting Based on BP Neural Networks

    Institute of Scientific and Technical Information of China (English)

    李小飞

    2011-01-01

    Data preprocessing is one of the critical steps in modeling, and different preprocessing methods applied to the same model lead to different results. Based on a GDP forecasting model using a BP neural network implemented in MATLAB, this article discusses six existing methods of data preprocessing and adds two new ones. Comparing and analyzing the results shows that the proposed cut-division method is both simple and the most accurate.

  12. Influence of Raw Image Preprocessing and Other Selected Processes on Accuracy of Close-Range Photogrammetric Systems According to Vdi 2634

    Science.gov (United States)

    Reznicek, J.; Luhmann, T.; Jepping, C.

    2016-06-01

    This paper examines the influence of raw image preprocessing and other selected processes on the accuracy of close-range photogrammetric measurement. The examined processes and features include: raw image preprocessing, sensor unflatness, distance-dependent lens distortion, extending the input observations (image measurements) by incorporating all RGB colour channels, ellipse centre eccentricity and target detecting. The examination of each effect is carried out experimentally by performing the validation procedure proposed in the German VDI guideline 2634/1. The validation procedure is based on performing standard photogrammetric measurements of highly accurate calibrated measuring lines (multi-scale bars) with known lengths (typical uncertainty = 5 μm at 2 sigma). The comparison of the measured lengths with the known values gives the maximum length measurement error LME, which characterizes the accuracy of the validated photogrammetric system. For higher reliability the VDI test field was photographed ten times independently with the same configuration and camera settings. The images were acquired with the metric ALPA 12WA camera. The tests are performed on all ten measurements which gives the possibility to measure the repeatability of the estimated parameters as well. The influences are examined by comparing the quality characteristics of the reference and tested settings.

  13. Design and Implementation of a Patent Data Preprocessing System Based on Data Mining Theory

    Institute of Scientific and Technical Information of China (English)

    赵蕴华; 张静

    2011-01-01

    To address the low efficiency, high resource consumption and limited accuracy of current patent data preprocessing, and drawing on the preprocessing techniques of data mining theory, we designed and implemented a patent data preprocessing system for DOCDB data (patent bibliographic information in XML format from the European Patent Office). The system parses the structure of DOCDB patent data files, extracts and reorganizes the relevant patent attributes, and loads the processed patent information into a database. The experimental results show that the system processes patent data efficiently and effectively raises the automation level of patent data preprocessing.
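    The sketch below shows the generic parse-extract-load pattern such a system implements, using Python's ElementTree and an in-memory SQLite table; the XML tag names are hypothetical stand-ins, not the actual DOCDB schema.

```python
# Parse-extract-load sketch (assumption: the tag names below are hypothetical,
# not the real DOCDB schema).
import sqlite3
import xml.etree.ElementTree as ET

xml_text = """<patents>
  <patent><doc-number>EP1234567</doc-number><title>Example title</title></patent>
  <patent><doc-number>EP7654321</doc-number><title>Another title</title></patent>
</patents>"""

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE patent (doc_number TEXT PRIMARY KEY, title TEXT)")
root = ET.fromstring(xml_text)
for rec in root.iter("patent"):
    conn.execute("INSERT INTO patent VALUES (?, ?)",
                 (rec.findtext("doc-number"), rec.findtext("title")))
conn.commit()
print(conn.execute("SELECT * FROM patent").fetchall())
```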

  14. Groundwork for integration of hot water extraction as a potential pre-process in a biorefinery for downstream conversion and nano-fibrillation

    Science.gov (United States)

    Zhu, Rui

    The economic competitiveness of biofuels production is highly dependent on feedstock cost, which constitutes 35-50% of the total biofuels production cost. An economically viable feedstock pre-process has a significant influence on all the subsequent downstream processes in the biorefinery supply chain. In this work, hot water extraction (HWE) was exploited as a pre-process to initially fractionate the cell wall structure of softwood Douglas fir, which is considerably more recalcitrant compared to hardwoods and agricultural feedstocks. A response surface model was developed and the highest hemicellulose extraction yield (HEY) was obtained when the temperature is 180 °C and the time is 79 min. The HWE process partially removed hemicelluloses, reduced the moisture absorption and improved the thermal stability of wood. To investigate the effects of the HWE pre-process on sulfite pretreatment to overcome recalcitrance of lignocellulose (SPORL), a series of SPORL runs with reduced combined severity factor (CSF) were conducted using HWE-treated Douglas fir. Sugar analysis after enzymatic hydrolysis indicated that SPORL can be conducted at lower temperature (145 °C), shorter time (80 min), and lower acid volume (3%), while still maintaining considerably high enzymatic digestibility (55-60%). Deriving valuable co-products would increase the overall revenue and improve the economics of the biofuels supply chain. The feasibility of extracting cellulose nanofibrils (CNFs) from HWE-treated Douglas fir by ultrasonication and the CNFs' reinforcing potential in a nylon 6 matrix were evaluated. Morphology analysis indicated that finer fibrils can be obtained by increasing ultrasonication time and/or amplitude. CNFs were found to have higher crystallinity and maintained thermal stability compared to untreated fiber. A method of fabricating nylon 6/CNFs as-spun nanocomposite filaments using a combination of extrusion, compounding and capillary rheometer to minimize thermal degradation of CNFs was

  15. Clinical data miner: an electronic case report form system with integrated data preprocessing and machine-learning libraries supporting clinical diagnostic model research.

    Science.gov (United States)

    Installé, Arnaud Jf; Van den Bosch, Thierry; De Moor, Bart; Timmerman, Dirk

    2014-10-20

    Using machine-learning techniques, clinical diagnostic model research extracts diagnostic models from patient data. Traditionally, patient data are often collected using electronic Case Report Form (eCRF) systems, while mathematical software is used for analyzing these data using machine-learning techniques. Due to the lack of integration between eCRF systems and mathematical software, extracting diagnostic models is a complex, error-prone process. Moreover, due to the complexity of this process, it is usually only performed once, after a predetermined number of data points have been collected, without insight into the predictive performance of the resulting models. The objective of the Clinical Data Miner (CDM) software framework study is to offer an eCRF system with integrated data preprocessing and machine-learning libraries, improving the efficiency of the clinical diagnostic model research workflow, and to enable optimization of patient inclusion numbers through study performance monitoring. The CDM software framework was developed using a test-driven development (TDD) approach, to ensure high software quality. Architecturally, CDM's design is split over a number of modules, to ensure future extensibility. The TDD approach has enabled us to deliver high software quality. CDM's eCRF Web interface is in active use by the studies of the International Endometrial Tumor Analysis consortium, with over 4000 enrolled patients, and more studies planned. Additionally, a derived user interface has been used in six separate interrater agreement studies. CDM's integrated data preprocessing and machine-learning libraries simplify some otherwise manual and error-prone steps in the clinical diagnostic model research workflow. Furthermore, CDM's libraries provide study coordinators with a method to monitor a study's predictive performance as patient inclusions increase. To our knowledge, CDM is the only eCRF system integrating data preprocessing and machine-learning libraries

  16. Research on image preprocessing techniques in gesture recognition

    Institute of Scientific and Technical Information of China (English)

    梁娜

    2015-01-01

    In gesture recognition, gesture images are subject to interference from various factors during generation, transmission and transformation, and image quality is degraded by noise to varying degrees, so the images must be preprocessed first. Image preprocessing includes image smoothing and image binarization. This paper studies and experimentally compares a variety of smoothing techniques and binarization methods, providing effective data samples for gesture segmentation and feature extraction in the gesture recognition process.
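    The sketch below strings together the two preprocessing stages studied above, smoothing followed by binarization, using OpenCV; the synthetic "hand" blob and kernel sizes are illustrative assumptions.

```python
# Smoothing + Otsu binarization sketch (assumptions: synthetic image,
# illustrative kernel sizes).
import cv2
import numpy as np

img = np.zeros((120, 160), dtype=np.uint8)
cv2.circle(img, (80, 60), 30, 180, -1)                    # stand-in "hand" blob
noise = np.random.randint(0, 40, img.shape, dtype=np.uint8)
noisy = cv2.add(img, noise)

median = cv2.medianBlur(noisy, 5)                         # good for salt noise
gauss = cv2.GaussianBlur(noisy, (5, 5), 0)                # general smoothing
_, binary = cv2.threshold(gauss, 0, 255,
                          cv2.THRESH_BINARY + cv2.THRESH_OTSU)
print("foreground pixels:", int((binary == 255).sum()))   # segmented region
```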

  17. Data Pre-Processing Method to Remove Interference of Gas Bubbles and Cell Clusters During Anaerobic and Aerobic Yeast Fermentations in a Stirred Tank Bioreactor

    Science.gov (United States)

    Princz, S.; Wenzel, U.; Miller, R.; Hessling, M.

    2014-11-01

    One aerobic and four anaerobic batch fermentations of the yeast Saccharomyces cerevisiae were conducted in a stirred bioreactor and monitored inline by NIR spectroscopy and a transflectance dip probe. From the acquired NIR spectra, chemometric partial least squares regression (PLSR) models for predicting biomass, glucose and ethanol were constructed. The spectra were directly measured in the fermentation broth and successfully inspected for adulteration using our novel data pre-processing method. These adulterations manifested as strong fluctuations in the shape and offset of the absorption spectra. They resulted from cells, cell clusters, or gas bubbles intercepting the optical path of the dip probe. In the proposed data pre-processing method, adulterated signals are removed by passing the time-scanned non-averaged spectra through two filter algorithms with a 5% quantile cutoff. The filtered spectra containing meaningful data are then averaged. A second step checks whether the whole time scan is analyzable. If true, the average is calculated and used to prepare the PLSR models. This new method distinctly improved the prediction results. To dissociate possible correlations between analyte concentrations, such as glucose and ethanol, the feeding analytes were alternately supplied at different concentrations (spiking) at the end of the four anaerobic fermentations. This procedure yielded low-error (anaerobic) PLSR models with prediction errors of 0.31 g/l for biomass, 3.41 g/l for glucose, and 2.17 g/l for ethanol. The maximum concentrations were 14 g/l biomass, 167 g/l glucose, and 80 g/l ethanol. Data from the aerobic fermentation, carried out under high agitation and high aeration, were incorporated to realize combined PLSR models, which have not been previously reported to our knowledge.
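    The sketch below illustrates the two-filter, quantile-cutoff idea on synthetic spectra: within one time scan, raw spectra whose offset or shape falls into the extreme tails are discarded before averaging. The 5% cutoff follows the paper; the deviation measures, two-sided trimming and synthetic data are assumptions.

```python
# Quantile-cutoff spectrum filtering sketch (assumptions: synthetic scan,
# offset/shape deviation measures, two-sided 5% trimming).
import numpy as np

rng = np.random.default_rng(0)
scan = rng.normal(1.0, 0.01, size=(200, 256))      # 200 raw spectra per scan
scan[::25] += 0.5                                  # bubble/cell-cluster events

offset = scan.mean(axis=1)                         # filter 1: baseline offset
shape = np.abs(scan - np.median(scan, axis=0)).mean(axis=1)  # filter 2: shape
keep = np.ones(len(scan), dtype=bool)
for metric in (offset, shape):
    lo, hi = np.quantile(metric, [0.05, 0.95])
    keep &= (metric >= lo) & (metric <= hi)

clean_mean = scan[keep].mean(axis=0)               # adulteration-free average
print(f"kept {keep.sum()} of {len(scan)} spectra")
```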

  18. Microarray meta-analysis database (M2DB): a uniformly pre-processed, quality controlled, and manually curated human clinical microarray database

    Directory of Open Access Journals (Sweden)

    Cheng Wei-Chung

    2010-08-01

    Full Text Available Background Over the past decade, gene expression microarray studies have greatly expanded our knowledge of genetic mechanisms of human diseases. Meta-analysis of substantial amounts of accumulated data, by integrating valuable information from multiple studies, is becoming more important in microarray research. However, collecting data of special interest from public microarray repositories often presents major practical problems. Moreover, including low-quality data may significantly reduce meta-analysis efficiency. Results M2DB is a human-curated microarray database designed for easy querying, based on clinical information, and for interactive retrieval of either raw or uniformly pre-processed data, along with a set of quality-control metrics. The database contains more than 10,000 previously published Affymetrix GeneChip arrays, performed using human clinical specimens. M2DB allows online querying according to a flexible combination of five clinical annotations describing disease state and sampling location. These annotations were manually curated by controlled vocabularies, based on information obtained from GEO, ArrayExpress, and published papers. For array-based assessment control, the online query provides sets of QC metrics, generated using three available QC algorithms. Arrays with poor data quality can easily be excluded from the query interface. The query provides values from two algorithms for gene-based filtering, and raw data and three kinds of pre-processed data for downloading. Conclusion M2DB utilizes a user-friendly interface for QC parameters, sample clinical annotations, and data formats to help users obtain clinical metadata. This database provides a lower entry threshold and an integrated process of meta-analysis. We hope that this research will promote further evolution of microarray meta-analysis.

  19. Strain Data Acquisition and Preprocessing System for Aircraft Structure

    Institute of Scientific and Technical Information of China (English)

    薛军; 纪敦; 李猛; 吴志超

    2009-01-01

    This paper describes the design and construction of an airborne strain data acquisition and preprocessing system for fatigue-critical locations of an aircraft structure. The hardware platform is built on CompactRIO; through FPGA development, and using file subdivision and breakpoint protection, the software completes data acquisition and preprocessing automatically. The system reduces the strain data to valid peak and valley values and fills them into a frequency matrix, solving the limited storage space problem of the airborne strain acquisition equipment. The system has completed more than 200 flight hours of research test flights, and the results show that it is simple and effective.
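    The sketch below shows the core of this onboard reduction: keeping only the turning points (peaks and valleys) of the strain signal and tallying them into a range-mean frequency matrix; the synthetic signal and bin edges are illustrative assumptions, and small noise-induced cycles are not filtered out as a production system would.

```python
# Peak-valley extraction and frequency-matrix sketch (assumptions: synthetic
# strain signal, illustrative bin edges, no small-cycle filtering).
import numpy as np

def turning_points(x):
    d = np.diff(x)
    idx = np.flatnonzero(d[:-1] * d[1:] < 0) + 1    # slope sign changes
    return x[np.concatenate(([0], idx, [len(x) - 1]))]

rng = np.random.default_rng(0)
t = np.linspace(0, 20, 5000)
strain = 300 * np.sin(2 * np.pi * t) + 50 * rng.standard_normal(t.size)

pv = turning_points(strain)
ranges = np.abs(np.diff(pv))                        # half-cycle ranges
means = (pv[:-1] + pv[1:]) / 2
matrix, _, _ = np.histogram2d(ranges, means, bins=[8, 8],
                              range=[[0, 800], [-400, 400]])
print(matrix.astype(int))   # compact frequency matrix replaces the raw record
```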

  20. Analysis of Landsat8 satellite remote sensing data preprocessing

    Institute of Scientific and Technical Information of China (English)

    祝佳

    2016-01-01

    The Landsat series satellites are resource remote sensing satellites jointly managed by the National Aeronautics and Space Administration and the United States Geological Survey; over the past forty years they have provided large quantities of clear and stable image data for earth remote sensing activities. Satellite remote sensing data preprocessing is the first step in obtaining high-quality basic remote sensing images, and it has an important impact on the quality of subsequent satellite remote sensing products. Aimed at the Landsat8 raw data, this paper analyzes in detail the space data transmission protocol and data transmission format used for the satellite downlink, and describes the preprocessing steps from frame synchronization, transfer frame parsing and mission data packet parsing through image data extraction to the generation of the level-0 image product. In particular, for the losslessly compressed Operational Land Imager (OLI) data, the method and procedure of lossless decompression based on the relevant Consultative Committee for Space Data Systems (CCSDS) standards are discussed. The level-0 image products obtained through this preprocessing can provide high-quality basic images for Landsat8 data applications.
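    The sketch below illustrates the frame-synchronization step in this chain: locating the CCSDS attached sync marker (ASM, 0x1ACFFC1D) in a raw downlink byte stream and slicing out fixed-length transfer frames; the frame length and synthetic stream are illustrative assumptions.

```python
# CCSDS frame synchronization sketch (assumption: fixed 1024-byte frames;
# 0x1ACFFC1D is the standard CCSDS attached sync marker).
ASM = bytes.fromhex("1ACFFC1D")
FRAME_LEN = 1024                      # bytes per transfer frame (assumed)

def extract_frames(raw: bytes):
    frames, i = [], raw.find(ASM)
    while i != -1 and i + len(ASM) + FRAME_LEN <= len(raw):
        frames.append(raw[i + len(ASM): i + len(ASM) + FRAME_LEN])
        i = raw.find(ASM, i + len(ASM) + FRAME_LEN)
    return frames

# Synthetic stream: leading noise, then two ASM-prefixed frames.
stream = b"\x00" * 37 + ASM + bytes(range(256)) * 4 + ASM + b"\xff" * FRAME_LEN
print(len(extract_frames(stream)), "frames extracted")
```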

  1. Improved Data Preprocessing Algorithm and Its Application

    Institute of Scientific and Technical Information of China (English)

    许必宵; 陈升波; 韩重阳; 马梦环; 宫婧

    2015-01-01

    Clustering analysis is an important topic in data mining, and preprocessing of repeated and isolated data can optimize clustering results. For repeated data, the ideas of an elastic window and variable movement speed are added to the traditional sorted-neighborhood method (SNM) to improve the accuracy and efficiency of duplicate detection. For isolated data, a search algorithm based on hierarchical clustering is proposed: the data are first divided into independent clusters by hierarchical clustering, and isolated points are then searched cluster by cluster, which improves query speed; a recovery check is added to restore non-isolated points deleted by mistake, improving search accuracy. In the experiments, part of the data is first extracted to verify the accuracy of the improved preprocessing algorithm; the algorithm is then used to preprocess the consumption data of mobile customers, and the processed data are clustered to identify the customers' home locations. The experimental results indicate that the proposed preprocessing algorithm achieves high accuracy and efficiency.
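    For reference, the sketch below shows the baseline sorted-neighborhood method the paper improves: sort records on a key and compare each record only with those inside a sliding window. The crude window widening/shrinking and the similarity function are simplified assumptions, not the paper's elastic-window rules.

```python
# Sorted-neighborhood method (SNM) sketch with a crude adaptive window
# (assumptions: record text as sort key, difflib ratio as similarity).
from difflib import SequenceMatcher

records = ["Zhang Wei, Nanjing", "Zhang Wei, Nanjing ", "Li Na, Beijing",
           "Li Na, Bejing", "Wang Fang, Chengdu"]

def similar(a, b, threshold=0.9):
    return SequenceMatcher(None, a, b).ratio() >= threshold

sorted_recs = sorted(records)
window = 3
duplicates = []
for i, rec in enumerate(sorted_recs):
    for j in range(i + 1, min(i + window, len(sorted_recs))):
        if similar(rec, sorted_recs[j]):
            duplicates.append((rec, sorted_recs[j]))
            window = min(window + 1, 6)     # widen where matches cluster
        else:
            window = 3                      # shrink back otherwise
print(duplicates)
```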

  2. Simulation of UAV Surveillance Image Adaptive Preprocessing

    Institute of Scientific and Technical Information of China (English)

    刘慧霞; 张清; 席庆彪

    2012-01-01

    This paper studies image enhancement for UAV (Unmanned Aerial Vehicle) automatic target recognition, focusing on two problems common in UAV reconnaissance: interference from clouds and fog, and image blur caused by the long imaging distance of high-altitude flight. An adaptive preprocessing algorithm is proposed. First, the algorithm determines whether clouds are present in the image. If not, the image is enhanced with a guided filter, which refines edges, increases contrast, and adds feature points usable for recognition. If clouds or fog are present, the image is dehazed using a dark-channel-prior algorithm: for mist, the fog-free scene can be restored fairly clearly, and in dense fog, meaningful targets such as roads and villages can be clearly distinguished. Simulation results show that image clarity is improved after the adaptive preprocessing and can meet recognition needs.
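    The sketch below computes the dark channel at the core of the dehazing branch; the patch size and synthetic images are illustrative, and the rest of the He et al. pipeline (atmospheric light estimation, transmission refinement) is omitted.

```python
# Dark channel computation sketch (assumptions: synthetic RGB images,
# illustrative 15x15 patch; full dehazing pipeline omitted).
import numpy as np
from scipy.ndimage import minimum_filter

def dark_channel(rgb, patch=15):
    # Per-pixel minimum over color channels, then a local minimum filter.
    return minimum_filter(rgb.min(axis=2), size=patch)

rng = np.random.default_rng(0)
clear = rng.uniform(0.0, 1.0, size=(64, 64, 3))
hazy = 0.6 * clear + 0.4                 # haze lifts the dark channel

for name, img in [("clear", clear), ("hazy", hazy)]:
    print(name, "mean dark channel:", round(dark_channel(img).mean(), 3))
# A high dark-channel mean signals haze and routes the image to dehazing;
# a low one routes it to guided-filter enhancement instead.
```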

  3. Comparison of Pre-Processing and Classification Techniques for Single-Trial and Multi-Trial P300-Based Brain Computer Interfaces

    Directory of Open Access Journals (Sweden)

    Chanan S. Syan

    2010-01-01

    Full Text Available The P300 component of Event Related Brain Potentials (ERP) is commonly used in Brain Computer Interfaces (BCI) to translate the intentions of an individual into commands for external devices. The P300 response, however, resides in a signal environment of high background noise. Consequently, the main problem in developing a P300-based BCI lies in identifying the P300 response in the presence of this noise. Traditionally, attenuating the background activity of P300 data is done by averaging multiple trials of recorded signals. This method, though effective, suffers two drawbacks. First, collecting multiple trials of data is time consuming and delays the BCI response. Second, latency distortions may appear in the averaged result due to variable time-locking of the P300 in the individual trials. Problem statement: The use of single-trial P300 data overcomes both these shortcomings. However, single-trial data must be properly denoised to allow for reliable BCI operation. Single-trial P300-based BCIs have been implemented using a variety of signal processing techniques and classification methodologies. However, comparing the accuracies of these systems to other multi-trial systems is likely to include the comparison of more than just the trial format (single-trial/multi-trial) as the data quality and recording circumstances are likely to be dissimilar. Approach: This issue was directly addressed by comparing the performance of three different preprocessing agents and three classification methodologies on the same data set over both the single-trial and multi-trial settings. The P300 data set of BCI Competition II was used to facilitate this comparison. Results: The LDA classifier exhibited the best performance in classifying unseen P300 spatiotemporal features in both the single-trial (74.19%) and multi-trial (100%) formats. It is also very efficient in terms of computational and memory requirements. Conclusion: This study can serve as a general

  4. Effects of different correlation metrics and preprocessing factors on small-world brain functional networks: a resting-state functional MRI study.

    Directory of Open Access Journals (Sweden)

    Xia Liang

    Full Text Available Graph theoretical analysis of brain networks based on resting-state functional MRI (R-fMRI) has attracted a great deal of attention in recent years. These analyses often involve the selection of correlation metrics and specific preprocessing steps. However, the influence of these factors on the topological properties of functional brain networks has not been systematically examined. Here, we investigated the influences of correlation metric choice (Pearson's correlation versus partial correlation), global signal presence (regressed or not) and frequency band selection [slow-5 (0.01-0.027 Hz) versus slow-4 (0.027-0.073 Hz)] on the topological properties of both binary and weighted brain networks derived from them, and we employed test-retest (TRT) analyses for further guidance on how to choose the "best" network modeling strategy from the reliability perspective. Our results show significant differences in global network metrics associated with both correlation metrics and global signals. Analysis of nodal degree revealed differing hub distributions for brain networks derived from Pearson's correlation versus partial correlation. TRT analysis revealed that the reliability of both global and local topological properties are modulated by correlation metrics and the global signal, with the highest reliability observed for Pearson's-correlation-based brain networks without global signal removal (WOGR-PEAR). The nodal reliability exhibited a spatially heterogeneous distribution wherein regions in association and limbic/paralimbic cortices showed moderate TRT reliability in Pearson's-correlation-based brain networks. Moreover, we found that there were significant frequency-related differences in topological properties of WOGR-PEAR networks, and brain networks derived in the 0.027-0.073 Hz band exhibited greater reliability than those in the 0.01-0.027 Hz band. Taken together, our results provide direct evidence regarding the influences of correlation metrics

  5. Design of a Radar Echo Pre-processing Module Based on COM Express

    Institute of Scientific and Technical Information of China (English)

    潘奇; 倪卫芳; 张宏超

    2012-01-01

    To meet the retrofit requirements of a shipborne radar data processing platform, a radar echo pre-processing module based on COM Express was designed. By adjusting the implementation of the data processing platform and its software and moving the echo pre-processing forward onto the interface board, the echo data receiving bottleneck of the radar system was solved with minimal changes. This paper analyzes the retrofit requirements of the data processing platform and describes the design of the module, its hardware, the data flow of echo pre-processing, and the driver design of the PEX8311 bridge chip under the VxWorks operating system. Engineering practice shows that the module greatly improves the radar's plot processing capability and can correctly receive and process radar echo data even under extreme conditions, meeting the retrofit requirements.

  6. Saving Grace - A Climate Change Documentary Education Program

    Science.gov (United States)

    Byrne, J. M.; McDaniel, S.; Graham, J.; Little, L.; Hoggan, J. C.

    2012-12-01

    Saving Grace conveys climate change knowledge from the best international scientists and social scientists using a series of new media formats. An Education and Communication Plan (ECP) has been developed to disseminate climate change knowledge on impacts, mitigation and adaptation for individuals, and for all sectors of society. The research team is seeking contacts with science and social science colleagues around the world to provide the knowledge base for the ECP. Poverty enslaves…and climate change has, and will, spread and deepen poverty to hundreds of millions of people, primarily in the developing world. And make no mistake; we are enslaving hundreds of millions of people in a depressing and debilitating poverty that in numbers will far surpass the horrors of the slave trade of past centuries. Saving Grace is the story of that poverty - and minimizing that poverty. Saving Grace stars the best of the world's climate researchers. Saving Grace presents the science; who, where and why of greenhouse gases that drive climate change; current and projected impacts of a changing climate around the world; and most important, solutions to the climate change challenges we face.

  7. Preprocessing by a Bayesian Single-Trial Event-Related Potential Estimation Technique Allows Feasibility of an Assistive Single-Channel P300-Based Brain-Computer Interface

    Directory of Open Access Journals (Sweden)

    Anahita Goljahani

    2014-01-01

    Full Text Available A major clinical goal of brain-computer interfaces (BCIs is to allow severely paralyzed patients to communicate their needs and thoughts during their everyday lives. Among others, P300-based BCIs, which resort to EEG measurements, have been successfully operated by people with severe neuromuscular disabilities. Besides reducing the number of stimuli repetitions needed to detect the P300, a current challenge in P300-based BCI research is the simplification of system’s setup and maintenance by lowering the number N of recording channels. By using offline data collected in 30 subjects (21 amyotrophic lateral sclerosis patients and 9 controls through a clinical BCI with N=5 channels, in the present paper we show that a preprocessing approach based on a Bayesian single-trial ERP estimation technique allows reducing N to 1 without affecting the system’s accuracy. The potentially great benefit for the practical usability of BCI devices (including patient acceptance that would be given by the reduction of the number N of channels encourages further development of the present study, for example, in an online setting.

  8. Sound quality prediction based on systematic metric selection and shrinkage: Comparison of stepwise, lasso, and elastic-net algorithms and clustering preprocessing

    Science.gov (United States)

    Gauthier, Philippe-Aubert; Scullion, William; Berry, Alain

    2017-07-01

    Sound quality is the impression of quality that is transmitted by the sound of a device. Its importance in sound and acoustical design of consumer products no longer needs to be demonstrated. One of the challenges is the creation of a prediction model that is able to predict the results of a listening test while using metrics derived from the sound stimuli. Often, these models are either derived using linear regression on a limited set of experimenter-selected metrics, or using more complex algorithms such as neural networks. In the former case, the user-selected metrics can bias the model and reflect the engineer pre-conceived idea of sound quality while missing potential features. In the latter case, although prediction might be efficient, the model is often in the form of a black-box which is difficult to use as a sound design guideline for engineers. In this paper, preprocessing by participants clustering and three different algorithms are compared in order to construct a sound quality prediction model that does not suffer from these limitations. The lasso, elastic-net and stepwise algorithms are tested for listening tests of consumer product for which 91 metrics are used as potential predictors. Based on the reported results, it is shown that the most promising algorithm is the lasso which is able to (1) efficiently limit the number of metrics, (2) most accurately predict the results of listening tests, and (3) provide a meaningful model that can be used as understandable design guidelines.
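    The sketch below shows the cross-validated lasso favored above: L1-regularized regression that zeroes out most candidate metrics, leaving a sparse, interpretable model. The synthetic data stand in for the 91 metrics and the listening-test scores.

```python
# Lasso metric-selection sketch (assumptions: synthetic stand-ins for the
# 91 candidate metrics and the listening-test scores).
import numpy as np
from sklearn.linear_model import LassoCV

rng = np.random.default_rng(0)
n_stimuli, n_metrics = 60, 91
X = rng.standard_normal((n_stimuli, n_metrics))
true_coef = np.zeros(n_metrics)
true_coef[[3, 17, 42]] = [2.0, -1.5, 1.0]          # only 3 metrics matter
y = X @ true_coef + 0.3 * rng.standard_normal(n_stimuli)

model = LassoCV(cv=5).fit(X, y)
selected = np.flatnonzero(model.coef_)
print("selected metrics:", selected)               # sparse, interpretable model
print("R^2 on training data:", round(model.score(X, y), 3))
```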

  9. A Novel Semantically-Time-Referrer based Approach of Web Usage Mining for Improved Sessionization in Pre-Processing of Web Log

    Directory of Open Access Journals (Sweden)

    Navjot Kaur

    2017-01-01

    Full Text Available Web usage mining (WUM), also known as web log mining, is the application of data mining techniques to large volumes of data to extract useful and interesting user behaviour patterns from web logs, in order to improve web-based applications. This paper aims to improve data discovery by mining the usage data from log files. The work is done in three phases. The first and second phases, data cleaning and user identification respectively, are completed using traditional methods. The third phase, session identification, is done using three different methods. The main focus of this paper is on sessionization of the log file, which is a critical step for extracting usage patterns. The proposed referrer-time and semantically-time-referrer methods overcome the limitations of traditional methods. The main advantage of the pre-processing model presented in this paper over other methods is that it can process a text or Excel log file of any format. The experiments are performed on three different log files and indicate that the proposed semantically-time-referrer based heuristic approach achieves better results than the traditional time- and referrer-time-based methods. The proposed methods are not complex to use. The web log files were collected from different servers and contain the public information of visitors. In addition, this paper also discusses different types of web log formats.
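    The sketch below shows a simple referrer-plus-timeout sessionization in the spirit of the proposed methods: a new session starts when the time gap exceeds a timeout or the referrer does not come from a page already in the session. The log records and the 30-minute timeout are illustrative assumptions.

```python
# Referrer + timeout sessionization sketch (assumptions: toy log records,
# 30-minute timeout heuristic).
from datetime import datetime, timedelta

log = [  # (user, timestamp, url, referrer)
    ("u1", "2017-01-05 10:00:00", "/home", "-"),
    ("u1", "2017-01-05 10:05:00", "/products", "/home"),
    ("u1", "2017-01-05 10:50:00", "/home", "-"),   # > 30 min gap: new session
]

TIMEOUT = timedelta(minutes=30)
sessions, current = [], []
for user, ts, url, ref in log:
    t = datetime.strptime(ts, "%Y-%m-%d %H:%M:%S")
    new = (not current
           or t - current[-1][0] > TIMEOUT
           or (ref != "-" and ref not in {u for _, u in current}))
    if new and current:
        sessions.append(current)
        current = []
    current.append((t, url))
sessions.append(current)
print(len(sessions), "sessions:", [[u for _, u in s] for s in sessions])
```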

  10. Making the most of RNA-seq: Pre-processing sequencing data with Opossum for reliable SNP variant detection [version 1; referees: 2 approved, 1 approved with reservations

    Directory of Open Access Journals (Sweden)

    Laura Oikkonen

    2017-01-01

    Full Text Available Identifying variants from RNA-seq (transcriptome sequencing) data is a cost-effective and versatile alternative to whole-genome sequencing. However, current variant callers do not generally behave well with RNA-seq data due to reads encompassing intronic regions. We have developed a software programme called Opossum to address this problem. Opossum pre-processes RNA-seq reads prior to variant calling, and although it has been designed to work specifically with Platypus, it can be used equally well with other variant callers such as GATK HaplotypeCaller. In this work, we show that using Opossum in conjunction with either Platypus or GATK HaplotypeCaller maintains precision and improves the sensitivity for SNP detection compared to the GATK Best Practices pipeline. In addition, using it in combination with Platypus offers a substantial reduction in run times compared to the GATK pipeline so it is ideal when there are only limited time or computational resources available.

  11. Making the most of RNA-seq: Pre-processing sequencing data with Opossum for reliable SNP variant detection [version 2; referees: 2 approved, 1 approved with reservations

    Directory of Open Access Journals (Sweden)

    Laura Oikkonen

    2017-03-01

    Full Text Available Identifying variants from RNA-seq (transcriptome sequencing) data is a cost-effective and versatile complement to whole-exome (WES) and whole-genome sequencing (WGS) analysis. RNA-seq is primarily considered a method of gene expression analysis but it can also be used to detect DNA variants in expressed regions of the genome. However, current variant callers do not generally behave well with RNA-seq data due to reads encompassing intronic regions. We have developed a software programme called Opossum to address this problem. Opossum pre-processes RNA-seq reads prior to variant calling, and although it has been designed to work specifically with Platypus, it can be used equally well with other variant callers such as GATK HaplotypeCaller. In this work, we show that using Opossum in conjunction with either Platypus or GATK HaplotypeCaller maintains precision and improves the sensitivity for SNP detection compared to the GATK Best Practices pipeline. In addition, using it in combination with Platypus offers a substantial reduction in run times compared to the GATK pipeline so it is ideal when there are only limited time or computational resources available.

  12. Preprocessing of Software Defect Data Based on GitHub

    Institute of Scientific and Technical Information of China (English)

    类兴明; 杨春花

    2016-01-01

    Defect data is the basis of research in the field of software defects. Since current defect data sources are limited by acquisition methods and data origins, a software defect data preprocessing system was designed and implemented to acquire, preprocess, and manage the defect data sources of specified projects hosted on GitHub. Through the system, users can easily obtain the defect data set of any desired GitHub project.

  13. Research on Hadoop-Based Massive Taxi Trajectory Data Preprocessing Technology

    Institute of Scientific and Technical Information of China (English)

    吕江波; 张永忠

    2016-01-01

    Massive taxi trajectory data preprocessing is the precondition of trajectory data mining and application. Taxi trajectory data are typical big data, and traditional data processing technology cannot solve the error analysis and preprocessing problems of large-scale taxi trajectory data. Based on an analysis of the sources and types of trajectory data errors, this paper proposes a Hadoop-based preprocessing model for massive taxi trajectory data, uses Hive to implement statistical analysis of trajectory errors, and designs MapReduce parallel processing programs to implement trajectory data preprocessing. Experimental results show that the model can effectively solve the preprocessing problem for large-scale taxi trajectory data with high reliability, greatly improves the efficiency of trajectory data preprocessing, and lays a foundation for subsequent in-depth mining and analysis of trajectory data.

  14. Research on Methods of Preprocessing Satellite-Borne Atomic Clock Data

    Institute of Scientific and Technical Information of China (English)

    李斌; 杨富春; 江峻毅

    2015-01-01

    Satellite-borne atomic clock data preprocessing is the basis of atomic clock performance analysis and clock offset forecasting. Phase-to-frequency data conversion and abnormal data analysis and processing methods are used to preprocess the clock data; these methods can effectively guarantee the reliability of the clock data.
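    The sketch below shows the two steps named above on synthetic data: converting phase (time-offset) samples to fractional frequency, then screening the frequency series for abnormal points with a median/MAD rule; the sampling interval, noise model and threshold are illustrative assumptions.

```python
# Clock data preprocessing sketch (assumptions: 300 s sampling, synthetic
# phase series, 5-sigma median/MAD screening).
import numpy as np

rng = np.random.default_rng(0)
tau = 300.0                                  # s between clock-offset samples
phase = np.cumsum(rng.normal(1e-13 * tau, 2e-12, size=500))   # seconds
phase[200] += 5e-11                          # an injected phase jump

freq = np.diff(phase) / tau                  # fractional frequency y_i
med = np.median(freq)
mad = np.median(np.abs(freq - med))
outliers = np.flatnonzero(np.abs(freq - med) > 5 * 1.4826 * mad)
print("flagged epochs:", outliers)           # e.g. around the injected jump
```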

  15. Overview of Preprocessing Technologies for Massive Traffic Data in Cloud Center and Application Examples

    Institute of Scientific and Technical Information of China (English)

    李敏; 刘晨; 谯志

    2015-01-01

    Cloud computing connects various computing resources through rapidly developing high-speed networks, providing a virtually unlimited computing platform for big data and supporting the computation of massive traffic data. After expatiating on the transportation industry's requirements for massive data processing, this paper analyzes the data preprocessing technologies in common use, their technological features, and the advantages of the traffic cloud in preprocessing massive traffic data. It presents application scenarios for traffic-cloud data preprocessing technologies and carries out a test on a preprocessing example using traffic big data of Yuzhong District, Chongqing, from September 2014. The test results can provide a reference for further development of the relevant studies.

  16. DATA PREPROCESSING AND WEAK SIGNAL DETECTION IN BOREHOLE MAGNETIC SURVEY

    Institute of Scientific and Technical Information of China (English)

    王赛昕; 刘天佑; 欧洋; 高文利; 邱礼泉; 冯杰

    2014-01-01

    As the logging instrument is placed near the field source, borehole magnetic data are affected by the internal magnetic field when the drill hole penetrates a magnetic body; the useful signal is often suppressed by strong interference and is not easy to identify. It is therefore important to study data preprocessing and weak-signal identification methods for borehole magnetic surveys. In this paper, two boundary-identification methods used in ground gravity and magnetic surveys, the total horizontal derivative of the tilt angle and the analytical signal amplitude, were adapted to borehole magnetic data preprocessing, and the authors propose the vertical derivative of the tilt angle, obtained through coordinate transformation, and a new method called the equalized analytical signal amplitude. Model tests show that both methods can identify magnetic bodies of different strengths beside the borehole and reflect boundary information, and a criterion is given for choosing a suitable window when calculating the equalized analytical signal amplitude. The two methods were applied to the preprocessing of the magnetic data of borehole ZK002 in a work area in Jiangsu: the vertical derivative of the tilt angle amplifies interference in field data, whereas the equalized analytical signal amplitude enhances the weak anomaly information of the borehole while suppressing interference, and the processing results agree with the magnetic susceptibility logging and drilling data.
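    For background, the sketch below computes the tilt-angle attribute underlying these methods on a gridded total-field anomaly, tilt = arctan(VDR/THDR), with the vertical derivative obtained by the standard FFT |k| filter; the synthetic Gaussian anomaly is an assumption, and borehole profiles would be 1D analogues of this 2D case.

```python
# Tilt-angle attribute sketch (assumptions: synthetic gridded anomaly;
# FFT |k| filter for the vertical derivative).
import numpy as np

def tilt_angle(T, dx=1.0):
    ny, nx = T.shape
    kx = 2 * np.pi * np.fft.fftfreq(nx, dx)
    ky = 2 * np.pi * np.fft.fftfreq(ny, dx)
    KX, KY = np.meshgrid(kx, ky)
    k = np.sqrt(KX**2 + KY**2)
    dTdz = np.real(np.fft.ifft2(np.fft.fft2(T) * k))   # vertical derivative
    dTdy, dTdx = np.gradient(T, dx)                    # horizontal derivatives
    thdr = np.hypot(dTdx, dTdy)                        # total horizontal deriv.
    return np.arctan2(dTdz, thdr)

# Gaussian-blob stand-in for an anomaly over a magnetic body.
x = np.linspace(-50, 50, 128)
X, Y = np.meshgrid(x, x)
T = 100 * np.exp(-(X**2 + Y**2) / 200)
tilt = tilt_angle(T)
print("tilt range (rad):", round(tilt.min(), 2), "to", round(tilt.max(), 2))
# The zero crossing of the tilt angle tracks source edges; its vertical
# derivative sharpens that boundary response.
```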

  17. Investigation of thermochemical biorefinery sizing and environmental sustainability impacts for conventional supply system and distributed pre-processing supply system designs

    Energy Technology Data Exchange (ETDEWEB)

    David J. Muth, Jr.; Matthew H. Langholtz; Eric C. D. Tan; Jacob J. Jacobson; Amy Schwab; May M. Wu; Andrew Argo; Craig C. Brandt; Kara G. Cafferty; Yi-Wen Chiu; Abhijit Dutta; Laurence M. Eaton; Erin M. Searcy

    2014-08-01

    The 2011 US Billion-Ton Update estimates that by 2030 there will be enough agricultural and forest resources to sustainably provide at least one billion dry tons of biomass annually, enough to displace approximately 30% of the country's current petroleum consumption. A portion of these resources is inaccessible at current cost targets with conventional feedstock supply systems because of remoteness or low yields. Reliable analyses and projections of US biofuels production depend on assumptions about the supply system and biorefinery capacity, which in turn depend upon economic value, feedstock logistics, and sustainability. A cross-functional team has examined combinations of advances in feedstock supply systems and biorefinery capacities with rigorous design information, improved crop yield and agronomic practices, and improved estimates of sustainable biomass availability. A previous report on biochemical refinery capacity noted that advanced feedstock logistics supply systems that include depots and pre-processing operations offer cost advantages that support larger biorefineries, up to 10 000 DMT/day facilities, compared to the smaller 2000 DMT/day facilities. This report focuses on analyzing conventional versus advanced depot biomass supply systems for thermochemical conversion and refinery sizing based on woody biomass. The results of this analysis demonstrate that the economies of scale enabled by advanced logistics offset much of the added logistics costs from additional depot processing and transportation, resulting in a small overall increase in the minimum ethanol selling price compared to the conventional logistics supply system. While the overall costs do increase slightly for the advanced logistics supply systems, the ability to mitigate moisture and ash in the system will improve the storage and conversion processes. In addition, being able to draw on feedstocks from further distances will decrease the risk of biomass supply to

  18. Remote Sensing Image Preprocessing Based on WorldView-2

    Institute of Scientific and Technical Information of China (English)

    赵莹; 王环; 方圆

    2014-01-01

    For the country's new geographic conditions survey projects, satellite remote sensing data serve as the basic data, play a vital role, and directly affect the results of visual interpretation. This paper analyzes WorldView-2 imagery and its whole preprocessing workflow, and concludes that WorldView-2 images can be preprocessed along the technical route of radiometric correction, geometric correction, image fusion, and orthorectification. This provides a reference and basis for preprocessing image data in geographic conditions survey projects and lays a good foundation for the subsequent visual interpretation.

  19. Research on Method of Data Preprocessing and Its Application for Bridge Structural Monitoring

    Institute of Scientific and Technical Information of China (English)

    雷旺龙; 孙洪鑫; 郭雪涛

    2012-01-01

    This paper introduces several data preprocessing methods for bridge structural monitoring. Based on the Jiujiang Bridge health monitoring system, two of these preprocessing methods are first applied to the data collected by the system to identify abnormal values and eliminate gross errors; a smoothing method is then used to process the noisy data. These reasonable preprocessing steps ultimately yield effective data that play an important role in assessing the health condition of the bridge structure.
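    The record names the two stages (gross-error elimination, then smoothing) without specifying them. As an illustration only, the sketch below realizes them as a windowed 3-sigma (Pauta) check followed by a moving average; the window sizes and threshold are assumptions.

```python
# Illustrative two-step preprocessing sketch: local 3-sigma gross-error
# rejection followed by moving-average smoothing; parameters are assumed.
import numpy as np

def remove_gross_errors(x, window=50, k=3.0):
    """Replace samples deviating more than k*sigma from the local mean
    (computed in non-overlapping windows) with the local mean."""
    y = x.astype(float).copy()
    for start in range(0, len(y), window):
        seg = y[start:start + window]          # view: writes back into y
        mu, sigma = seg.mean(), seg.std()
        seg[np.abs(seg - mu) > k * sigma] = mu
    return y

def moving_average(x, width=9):
    return np.convolve(x, np.ones(width) / width, mode='same')

signal = np.sin(np.linspace(0, 20, 2000)) + 0.05 * np.random.randn(2000)
signal[[100, 700, 1500]] += 5.0               # injected gross errors
smoothed = moving_average(remove_gross_errors(signal))
```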

  20. Study of preprocessing of drilling data for building 3D strata model

    Institute of Scientific and Technical Information of China (English)

    夏艳华; 白世伟

    2012-01-01

    Drilling data are one of the main sources for building 3D strata models. Stratigraphic interfaces are usually constructed by interpolation based on drilling data, so processing drilling data correctly is the basic premise of building a realistic 3D strata model. Because the choice of interpolation data for the stratigraphic interfaces is not mathematically unique, the complex topological relationships of the strata often cause the resulting 3D model to contradict geological reality. To obtain unique and realistic interpolation data, a stratigraphic column, which records the chronological order of the strata and their topological relationships, is defined in accordance with geological history. The topological relationships of the strata are generalized into two classes, 'onlap' and 'erosion'. According to the stratigraphic column, unique interpolation data for the stratigraphic interfaces can be determined and then used to build a realistic 3D strata model. Python is used to implement the drilling-data preprocessing algorithm. Examples show that the scheme can validly predict realistic strata.

  1. Design and Application of Video Image Preprocessing System for UAV Ground Control Station

    Institute of Scientific and Technical Information of China (English)

    李大健; 贾伟; 臧频阳; 孙美蕊

    2011-01-01

    In order to overlay graphics and characters on UAV reconnaissance images, an image preprocessing system loosely coupled with the control computer is designed. The DSP-based system is controlled through an RS232 serial port to receive overlay information and to switch between image filtering and enhancement algorithms. Packaging the preprocessing system in a separate, external standard 1U box largely reduces interference between it and the control computer. The system principle and its hardware and software design are described in detail. Test results show that the system meets all performance requirements and that the image filtering and enhancement functions are effective and real-time. The design concept can serve as a reference for similar systems.

  2. Analysis and application of TCM data preprocessing method based on Apriori algorithm

    Institute of Scientific and Technical Information of China (English)

    仝武宁; 李宏斌; 王亚丽

    2015-01-01

    Objective: To present a preprocessing method for traditional Chinese medicine (TCM) data based on the Apriori algorithm, in order to improve the efficiency of data mining and ensure the accuracy of the knowledge or conclusions mined. Methods: The importance of data preprocessing in data mining was analyzed, along with the characteristics of TCM data and the requirements the Apriori algorithm places on mining data. Several new functions were constructed with reference to examples, and preprocessing was carried out in terms of terminology standardization, elimination of unqualified data, structuring of prescription data, data ordering, and so on. Results: The new functions are clear and easy to invoke, and the preprocessed data greatly improve the efficiency of TCM data mining. Conclusion: The Apriori-based preprocessing method for TCM data is simple and easy to operate, and to a large extent solves the prerequisite problems of TCM data mining.
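    A toy sketch, not the paper's functions, of the cleaning steps named above: terminology standardization through a synonym table, regular-expression substitution to strip dosage noise, and structuring prescriptions into item sets suitable for Apriori-style mining. The synonym table and field delimiter are assumptions.

```python
# Toy TCM data-cleaning sketch: regex cleanup, synonym normalization, and
# conversion of prescriptions into Apriori-ready transactions.
import re

SYNONYMS = {'甘草': 'licorice', 'licorice root': 'licorice'}  # assumed table

def clean_herb(token):
    token = re.sub(r'\d+(\.\d+)?\s*(g|克|两)', '', token)   # drop dosages
    token = token.strip().lower()
    return SYNONYMS.get(token, token)                       # standardize

def to_transactions(prescriptions):
    """Each prescription string becomes an ordered, deduplicated item set."""
    txns = []
    for p in prescriptions:
        herbs = {clean_herb(t) for t in p.split(';') if clean_herb(t)}
        txns.append(sorted(herbs))
    return txns

print(to_transactions(['甘草 6g; ginseng 9g', 'licorice root; ginseng']))
# -> [['ginseng', 'licorice'], ['ginseng', 'licorice']]
```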

  3. Preprocessing Algorithm for Fingerprint Image Based on Orientation and Gabor Filter

    Institute of Scientific and Technical Information of China (English)

    付玉虎; 杜月荣; 李哲哲

    2014-01-01

    Fingerprint images are often degraded by many factors during preprocessing and sometimes cannot meet the requirements of a fingerprint identification system. Building on traditional preprocessing methods, this article presents an effective improved fingerprint preprocessing algorithm. First, a segmentation method based on block gradient variance separates the fingerprint from the background region. Then, according to the fingerprint features, an orientation map and a mean filter are used to enhance the image, and a simplified Gabor filter with an improved filtering template removes edge-blurring effects. After binarization, thinning, and elimination of pseudo-minutiae, the fingerprint ridge skeleton is obtained and the fingerprint feature points are extracted. Experiments show that the preprocessing algorithm works well on images of different qualities; it is flexible, efficient, easy to implement, and accurate, and meets the requirements of a fingerprint identification system.

  4. An Effective Facial Image Preprocessing Method to Increase the Recognition Rate

    Institute of Scientific and Technical Information of China (English)

    谭阳; 贺璐

    2012-01-01

    The localization and sampling of key facial feature points directly determine the facial recognition rate. This paper proposes a preprocessing method that extracts details from human facial images, highlights them with an orientation (line-element) field, and finally binarizes the features. It effectively suppresses blurring of the feature-point samples extracted from facial images. Experiments show that preprocessing facial images with this method can appreciably improve the recognition rate of general face recognition algorithms.

  5. The Application of Filtering Preprocessing in Watershed Segmentation Algorithm

    Institute of Scientific and Technical Information of China (English)

    刘晨

    2014-01-01

    To reduce the over-segmentation that image noise causes in the watershed algorithm, filtering preprocessing is performed before segmentation. Because noise-smoothing filters also lose the edge-structure information of the image, a filter function based on a similarity measure is used to preserve edge structure during preprocessing, and the similarity measure is extended from a pixel-based approach to an image-block-based approach.
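    The "similarity measure extended from pixels to image blocks" is closest in spirit to non-local means filtering. Under that assumption only, the sketch below uses scikit-image's implementation before a standard marker-based watershed; the test image and all parameters are guesses, not the paper's.

```python
# Sketch under assumptions: block-similarity filtering approximated by
# non-local means, followed by a standard marker-based watershed.
import numpy as np
from scipy import ndimage as ndi
from skimage import data, filters
from skimage.restoration import denoise_nl_means, estimate_sigma
from skimage.segmentation import watershed
from skimage.feature import peak_local_max

image = data.coins() / 255.0
sigma = np.mean(estimate_sigma(image))
denoised = denoise_nl_means(image, h=1.15 * sigma, patch_size=5,
                            patch_distance=6)    # block-based similarity
binary = denoised > filters.threshold_otsu(denoised)
distance = ndi.distance_transform_edt(binary)
coords = peak_local_max(distance, footprint=np.ones((3, 3)), labels=binary)
markers = np.zeros(distance.shape, dtype=int)
markers[tuple(coords.T)] = np.arange(1, len(coords) + 1)
labels = watershed(-distance, markers, mask=binary)  # fewer spurious basins
```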

  6. Study of EMR Data Preprocessing in the Generation of Clinical Pathway Based on Data Mining

    Institute of Scientific and Technical Information of China (English)

    曹洪欣; 蔡海英; 王侠; 王霞

    2013-01-01

    High-quality decisions depend on high-quality data. As in other areas of data mining, generating clinical pathways by mining electronic medical records (EMR) requires preprocessing the EMR data to be mined, so that clean, accurate, and targeted data are provided for the final mining step, improving its efficiency and accuracy. EMR preprocessing in this setting, however, has characteristic features of its own, which this paper analyzes and explores.

  7. Effect of Packaging on Physicochemical Characteristics of Irradiated Pre-Processed Chicken

    Institute of Scientific and Technical Information of China (English)

    姜秀杰; 张德权; 张东杰; 李淑荣; 高美须; 王志东

    2011-01-01

    To explore the effect of modified atmosphere packaging and antioxidants on the physicochemical characteristics of irradiated pre-processed chicken, antioxidants were added to the pre-processed chicken, which was then packaged in common, vacuum, or modified-atmosphere packaging, irradiated at a dose of 5 kGy, and stored at 4°C. pH, TBA, TVB-N, and color were evaluated after 0, 3, 7, 10, 14, 18, and 21 d of storage. The results showed that the pH of antioxidant-treated, vacuum-packaged chicken increased with storage time, but differences among treatments were not significant. The TBA value also increased, though not significantly (P > 0.05), indicating that vacuum packaging inhibited lipid oxidation. TVB-N increased with storage time; for vacuum-packaged samples it reached 14.29 mg/100 g at 21 d of storage, which did not exceed the reference index for fresh meat. The a* value of vacuum-packaged and oxygen-free-packaged samples increased significantly during storage (P < 0.05), and the chicken remained bright red after 21 d under vacuum packaging. It is concluded that vacuum packaging of irradiated pre-processed chicken effectively preserves its physical and chemical properties during storage.

  8. Performance evaluation of relay-aided downlink direct sequence CDMA based on transmitter preprocessing

    Institute of Scientific and Technical Information of China (English)

    张淑萍; 赵桂钦

    2014-01-01

    In a multi-user downlink direct-sequence code-division multiple-access (DS-CDMA) system, the downlink signals may interfere with each other, causing multi-user interference (MUI) at the relays. To solve this problem, this paper proposes a multi-user transmitter preprocessing (MUTP) system based on vector-quantized (VQ) channel impulse responses (CIRs). The CIRs are estimated at each relay with a training-sequence-based estimation technique; the estimated CIRs are vector quantized, and their magnitudes and phases are fed back to the base station (BS) through feedback channels subject to noise and fading. At the BS, the CIRs recovered by a linear detector are exploited to formulate the preprocessing matrix that mitigates downlink MUI at the relays. The study shows that the attainable bit-error-rate (BER) degrades when noise- and fading-contaminated VQ-CIRs are invoked to realize the preprocessing matrix; nevertheless, the BER performance of MUTP based on VQ-CIRs acquired via ideal feedback remains close to that achieved under the perfect-CIR assumption.

  9. Preprocessing Methods of Near-Infrared Spectrum Based on NLMS Adaptive Filtering

    Institute of Scientific and Technical Information of China (English)

    陈丛; 卢启鹏; 彭忠琦

    2012-01-01

    The normalized least mean square (NLMS) adaptive filtering method is introduced for preprocessing near-infrared (NIR) spectra in order to remove the noise contained in directly acquired spectra. Taking the NIR spectra of 51 soil samples as the object of study, the application of NLMS adaptive filtering to NIR spectral preprocessing is discussed; the denoised results are correlated with the organic matter content of the soil samples and a calibration model is built. Experimental results show that after denoising with the NLMS adaptive filter, the correlation coefficient r of the prediction set improves from 0.8284 to 0.9654 and the root mean square error of prediction (RMSEP) decreases from 0.3385 to 0.1606. NLMS adaptive filtering thus denoises NIR spectra markedly, effectively improves analytical precision and model robustness, and provides a new method for NIR spectral preprocessing.
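    A textbook NLMS sketch, not the paper's code: the update w <- w + mu * e * x / (eps + x^T x), used here to denoise a synthetic spectrum by linearly predicting each point from its neighbors. The filter order, step size, and synthetic data are assumptions.

```python
# Minimal NLMS adaptive filter applied as a spectral denoiser.
import numpy as np

def nlms_denoise(d, order=8, mu=0.5, eps=1e-6):
    """Linear prediction of d[n] from the previous `order` samples."""
    w = np.zeros(order)
    y = np.zeros_like(d, dtype=float)
    for n in range(order, len(d)):
        x = d[n - order:n][::-1]            # regressor vector
        y[n] = w @ x                        # filter output (denoised value)
        e = d[n] - y[n]                     # prediction error
        w += mu * e * x / (eps + x @ x)     # normalized LMS update
    return y

wavelengths = np.linspace(1000, 2500, 1500)           # synthetic NIR axis, nm
clean = np.exp(-((wavelengths - 1700) / 120.0) ** 2)  # synthetic band
noisy = clean + 0.05 * np.random.randn(wavelengths.size)
denoised = nlms_denoise(noisy)
```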

  10. Application of an Improved Clustering Analysis Method in Data Preprocessing

    Institute of Scientific and Technical Information of China (English)

    钟波; 肖智

    2002-01-01

    Defective (missing) data often arise in the course of data mining, making data preprocessing necessary. Building on traditional cluster analysis methods, this paper presents an improved clustering analysis method for repairing missing data in a data source. Case calculations show that this method is more reasonable and yields higher confidence than traditional data preprocessing methods.
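    The record does not give its algorithm, so the following is only one common way to realize cluster-based repair of missing values: cluster the rows on their complete columns, then fill each gap from its cluster's mean. All names and parameters are illustrative.

```python
# Hedged sketch of cluster-based imputation of missing values.
import numpy as np
from sklearn.cluster import KMeans

def cluster_impute(X, n_clusters=3, random_state=0):
    X = X.astype(float).copy()
    missing = np.isnan(X)
    complete_cols = ~missing.any(axis=0)     # assumes some gap-free columns
    km = KMeans(n_clusters=n_clusters, n_init=10,
                random_state=random_state).fit(X[:, complete_cols])
    for c in range(n_clusters):
        rows = km.labels_ == c
        for j in np.where(~complete_cols)[0]:
            vals = X[rows, j]
            fill = (np.nanmean(vals) if not np.all(np.isnan(vals))
                    else np.nanmean(X[:, j]))   # fall back to column mean
            hole = rows & np.isnan(X[:, j])
            X[hole, j] = fill
    return X

X = np.array([[1.0, 2.0, np.nan], [1.1, 2.1, 3.0],
              [8.0, 9.0, 10.0], [8.2, 9.1, np.nan]])
print(cluster_impute(X, n_clusters=2))
```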

  11. Modified algorithm of bit-plane complexity segmentation steganography based on preprocessing

    Institute of Scientific and Technical Information of China (English)

    刘虎; 袁海东

    2012-01-01

    Since bit-plane complexity segmentation (BPCS) steganography is vulnerable to complexity-histogram attacks, this paper proposes an improved algorithm based on preprocessing. The method performs a quantitative statistical analysis of the cover image to derive a compensation rule for reverse preprocessing, which then compensates for the complexity changes caused by the embedded information. Experimental results show that the improved algorithm hides information properly while resisting complexity-histogram attacks; because the compensation happens before the secret information is hidden, the algorithm also preserves the large embedding capacity of the original BPCS scheme.
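    A minimal sketch of the quantity at the heart of BPCS: bit-plane "complexity", the number of adjacent 0/1 transitions in a block divided by the maximum possible. Blocks above a threshold (commonly around 0.3) are treated as noise-like and embeddable; the threshold and block size here are the usual textbook choices, not necessarily the paper's.

```python
# Bit-plane complexity of a binary block, as used by BPCS steganography.
import numpy as np

def complexity(block):
    """block: 2-D array of 0/1 bits; returns a value in [0, 1]."""
    h_changes = np.abs(np.diff(block, axis=1)).sum()
    v_changes = np.abs(np.diff(block, axis=0)).sum()
    rows, cols = block.shape
    max_changes = rows * (cols - 1) + cols * (rows - 1)
    return (h_changes + v_changes) / max_changes

rng = np.random.default_rng(0)
noisy_block = rng.integers(0, 2, (8, 8))   # random bits: high complexity
flat_block = np.zeros((8, 8), dtype=int)   # uniform block: complexity 0
print(complexity(noisy_block), complexity(flat_block))
```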

  12. Data Preprocessing Method for 3D Solid Modeling of Geologic Bodies

    Institute of Scientific and Technical Information of China (English)

    孟耀伟; 程菊明; 田胜利

    2011-01-01

    The requirements that 3D solid modeling of geologic bodies places on geological exploration data are analyzed, and, in combination with the related modeling processes and operations, the importance of data preprocessing is expounded. Data preprocessing is divided into three phases: data checking, data modification, and the addition of auxiliary data for modeling. The main problems in these phases, such as outlines intersecting themselves or one another, computing the centroid of an arbitrary simple polygon, and adding auxiliary markers in three-dimensional space, are discussed in detail together with the corresponding algorithms. Experimental results show that these methods can simplify the modeling algorithms and are highly practicable.

  13. Efficient Preprocessing of Subspace Skyline Queries in P2P Networks

    Institute of Scientific and Technical Information of China (English)

    黄震华; 王智慧; 郭建魁; 汪卫; 施伯乐

    2009-01-01

    Skyline query processing over multidimensional spaces has recently received much attention in the database community. Vlachou et al. first considered how to efficiently process subspace skyline queries in peer-to-peer networks and proposed the concept of an "extended skyline set" to reduce the volume of data transferred during preprocessing. Experimental evaluation shows, however, that the extended skyline set reduces this transfer volume only to a limited degree. Motivated by this, an efficient algorithm called TPAOSS (three-phase algorithm for optimizing skyline scalar) is proposed to reduce the volume of data transferred in the preprocessing phase. Based on the semantic relationship between full-space skylines and subspace skylines, TPAOSS transfers the necessary data in three phases: the first phase sends the full-space skyline objects; the second phase receives seed skyline objects, for which two effective receiving strategies are given to lower the transfer volume; and the third phase uses Bloom filter techniques to send the subspace duplicates of the seed skyline objects. Theoretical analysis and experimental evaluation show that the algorithm is effective and practical.

  14. Preprocessing Method for Internet Search Data and Its Application to Stock Market Analysis

    Institute of Scientific and Technical Information of China (English)

    刘颖; 吕本富; 彭赓

    2011-01-01

    The correlation between Internet search data and socio-economic behavior has been confirmed by many studies, but the foundation of such work, data preprocessing, still lacks a systematic methodology. This paper presents a complete preprocessing workflow for Internet search data, covering the key steps of keyword selection, time-lag (lead-lag) measurement, and leading-index composition, with methods and criteria given for each step. The workflow yields a leading keyword index with a stable lead relationship and a high degree of fit. Taking the Shanghai Composite Index as the object of study, the empirical results show that the correlation coefficient between the composed leading keyword index and the Shanghai Composite Index reaches 0.979; a Granger test confirms that the leading keyword index has significant predictive ability for the Shanghai Composite Index; and regression results show that for each percentage-point change in the keyword index, the Shanghai Composite Index moves 0.518 percentage points in the same direction in the next period.
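    A minimal sketch of the time-lag measurement step: slide one series against the other and pick the lag with the highest Pearson correlation. The series names, lag range, and synthetic data are assumptions for the example.

```python
# Lead-lag detection by maximizing lagged Pearson correlation.
import numpy as np

def best_lead(keyword_index, stock_index, max_lag=8):
    """Positive lag k means the keyword index leads the stock index by k."""
    best = (0, -1.0)
    for k in range(0, max_lag + 1):
        x = keyword_index[:-k] if k else keyword_index
        y = stock_index[k:]
        r = np.corrcoef(x, y)[0, 1]
        if r > best[1]:
            best = (k, r)
    return best

rng = np.random.default_rng(1)
stock = np.cumsum(rng.normal(size=300))
keyword = np.roll(stock, -3) + 0.2 * rng.normal(size=300)  # leads by ~3
print(best_lead(keyword, stock))   # expect a lag near 3
```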

  15. A Low Complexity Frequency-Domain Linear Equalization Algorithm Based on Pre-processing in OFDM System

    Institute of Scientific and Technical Information of China (English)

    何晨; 徐行辉; 陈勇; 蒋铃鸽

    2011-01-01

    To combat the severe inter-carrier interference (ICI) in high-frequency OFDM wireless transmission systems, a low-complexity frequency-domain linear equalization algorithm based on pre-processing is proposed. At the receiver, a time-variant factor is first extracted from the channel response at each instant and a one-dimensional time-domain filtering pre-processing step is applied; a banded structure is then used to approximate the frequency-domain matrix of the channel after factor extraction, which enables low-complexity frequency-domain linear equalization. Analysis and simulation show that, relative to the MMSE algorithm, the proposed algorithm reduces system complexity to below 10% at a performance cost of 1-2 dB, and that, compared with an existing low-complexity algorithm that discards part of the channel information, it obtains a performance gain of more than 5 dB at high SNR without increasing complexity.

  16. A Low-complexity Image Pre-processing Method for Quick Response Barcode Recognition

    Institute of Scientific and Technical Information of China (English)

    陈威兵; 杨高波; 冯璐

    2012-01-01

    The Quick Response (QR) barcode is a common two-dimensional barcode, and image preprocessing is a key step in its automatic recognition under complex conditions. To lower the barrier to using QR codes, a practical low-complexity image preprocessing method for QR barcode recognition is proposed; it increases the decoder's recognition speed, making the algorithm suitable for embedding in mobile terminals. Instead of traditional edge detection and line detection, the method locates the code using the encoding characteristics of QR itself, which reduces the influence of geometric distortion and background noise. In addition, to improve the recognition rate, the alignment patterns are used and the QR code is sampled adaptively by region to generate the bit stream. Experimental results show that the algorithm overcomes the effects of noise, non-uniform illumination, and geometric distortion during QR recognition and meets real-time decoding requirements.

  17. Evaluation of the efficiency of continuous wavelet transform as processing and preprocessing algorithm for resolution of overlapped signals in univariate and multivariate regression analyses; an application to ternary and quaternary mixtures

    Science.gov (United States)

    Hegazy, Maha A.; Lotfy, Hayam M.; Mowaka, Shereen; Mohamed, Ekram Hany

    2016-07-01

    Wavelets have been adapted for a vast number of signal-processing applications due to the amount of information that can be extracted from a signal. In this work, a comparative study was conducted on the efficiency of the continuous wavelet transform (CWT) as a signal-processing tool in univariate regression and as a pre-processing tool in multivariate analysis using partial least squares (CWT-PLS). Both were applied to complex spectral signals of ternary and quaternary mixtures. The CWT-PLS method succeeded in the simultaneous determination of a quaternary mixture of drotaverine (DRO), caffeine (CAF), paracetamol (PAR) and p-aminophenol (PAP, the major impurity of paracetamol). The univariate CWT, by contrast, failed to determine the quaternary mixture simultaneously; it could determine only PAR and PAP, as well as the ternary mixtures of DRO, CAF, and PAR and of CAF, PAR, and PAP. Different wavelet families were tested during the CWT calculations. The univariate CWT method was validated according to the ICH guidelines. For the development of the CWT-PLS model, a calibration set was prepared by means of an orthogonal experimental design, and its absorption spectra were recorded and processed by CWT; the model was then constructed by regression between the wavelet coefficient and concentration matrices, and validation was performed with both cross-validation and external validation sets. Both methods were successfully applied to the determination of the studied drugs in pharmaceutical formulations.
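    As a hedged illustration of the CWT-plus-PLS idea (not the authors' actual code or settings), the sketch below transforms each synthetic spectrum with a continuous wavelet transform via PyWavelets and regresses concentrations on the coefficients with scikit-learn's PLS. The wavelet ('mexh'), the single scale, and the synthetic four-component mixtures are all assumptions.

```python
# Hedged CWT-PLS pipeline sketch on synthetic mixture spectra.
import numpy as np
import pywt
from sklearn.cross_decomposition import PLSRegression

def cwt_features(spectra, scale=16, wavelet='mexh'):
    feats = []
    for s in spectra:
        coef, _ = pywt.cwt(s, scales=[scale], wavelet=wavelet)
        feats.append(coef[0])            # coefficients at the chosen scale
    return np.array(feats)

rng = np.random.default_rng(0)
conc = rng.uniform(0.1, 1.0, (40, 4))    # 4-component "mixtures"
axis = np.linspace(0, 1, 256)
bands = [np.exp(-((axis - c) / 0.05) ** 2) for c in (0.2, 0.4, 0.6, 0.8)]
spectra = conc @ np.array(bands) + 0.01 * rng.normal(size=(40, 256))

pls = PLSRegression(n_components=4)
pls.fit(cwt_features(spectra), conc)     # regression on wavelet coefficients
print(pls.score(cwt_features(spectra), conc))
```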

  18. Development of the FEM Preprocessing and Postprocessing Assistant Software Applied to Dental Biomechanics

    Institute of Scientific and Technical Information of China (English)

    蒋文涛; 蒲放; 樊瑜波

    2000-01-01

    The limitations of general finite element analysis software as applied to dental biomechanics are analyzed, along with the shortcomings of the two approaches commonly used in this field. Combining the strengths of both approaches, FEM preprocessing and postprocessing assistant software was developed to automate the construction of dental finite element models and to extract and display result data intuitively. Practice shows that this software markedly improves modeling efficiency, reduces workload and research cost, and shortens the research cycle.

  19. Application of OpenCV-Based Image Preprocessing Technology to UAV Video

    Institute of Scientific and Technical Information of China (English)

    吴川平; 黄文恺; 伍冯洁; 张雯雯; 梁俊杰

    2015-01-01

    OpenCV (Open Source Computer Vision) is an open-source library for digital image processing and computer vision. From the perspective of applying image processing to UAV video, a preprocessing method is proposed that applies Gaussian filtering, bilateral smoothing, image gain adjustment, and image fusion to video captured by unmanned aircraft. By operating on the video frames, the method addresses the jitter and low resolution of the captured video and effectively reduces the influence of ambient light and noise.

  20. Scan matching preprocessing for simultaneous localization and mapping

    Institute of Scientific and Technical Information of China (English)

    温安邦; 吴怀宇; 赵季

    2009-01-01

    The simultaneous localization and mapping (SLAM) problem and its solutions for indoor autonomous mobile robots are analyzed. To improve accuracy and robustness, a SLAM method with a scan-matching preprocessing step is proposed: scan matching supplies SLAM with a priori pose information for the robot. Analysis of the experimental results and data shows that the method is valid and practical and further improves the accuracy and robustness of SLAM.

  1. Evaluation of the efficiency of continuous wavelet transform as processing and preprocessing algorithm for resolution of overlapped signals in univariate and multivariate regression analyses; an application to ternary and quaternary mixtures.

    Science.gov (United States)

    Hegazy, Maha A; Lotfy, Hayam M; Mowaka, Shereen; Mohamed, Ekram Hany

    2016-07-05

    Wavelets have been adapted for a vast number of signal-processing applications due to the amount of information that can be extracted from a signal. In this work, a comparative study was conducted on the efficiency of the continuous wavelet transform (CWT) as a signal-processing tool in univariate regression and as a pre-processing tool in multivariate analysis using partial least squares (CWT-PLS). Both were applied to complex spectral signals of ternary and quaternary mixtures. The CWT-PLS method succeeded in the simultaneous determination of a quaternary mixture of drotaverine (DRO), caffeine (CAF), paracetamol (PAR) and p-aminophenol (PAP, the major impurity of paracetamol). The univariate CWT, by contrast, failed to determine the quaternary mixture simultaneously; it could determine only PAR and PAP, as well as the ternary mixtures of DRO, CAF, and PAR and of CAF, PAR, and PAP. Different wavelet families were tested during the CWT calculations. The univariate CWT method was validated according to the ICH guidelines. For the development of the CWT-PLS model, a calibration set was prepared by means of an orthogonal experimental design, and its absorption spectra were recorded and processed by CWT; the model was then constructed by regression between the wavelet coefficient and concentration matrices, and validation was performed with both cross-validation and external validation sets. Both methods were successfully applied to the determination of the studied drugs in pharmaceutical formulations.

  2. Lunar Image Data Preprocessing and Quality Evaluation of CCD Stereo Camera on Chang'E-2

    Institute of Scientific and Technical Information of China (English)

    刘建军; 任鑫; 谭旭; 李春来

    2013-01-01

    Chang'E-2, launched on October 1, 2010, is the precursor satellite for the soft landing and rover exploration of the second phase of China's lunar exploration program. After more than half a year of scientific exploration, the CCD stereo camera on Chang'E-2 acquired global lunar image data with a spatial resolution of 7 m, which is of great significance for subsequent lunar exploration and lunar science research. This paper describes the data acquisition, data preprocessing, data format, and data quality of the CCD stereo camera, helping lunar scientists understand and make use of these image data products and extract more scientific information for lunar research.

  3. Signal Preprocessing Method for On-Line Monitoring of Insulation Partial Discharge

    Institute of Scientific and Technical Information of China (English)

    蒋国顺; 金向朝

    2011-01-01

    By analyzing on-line monitoring technology for insulation partial discharge, this paper presents a digital signal preprocessing method for partial discharge based on signal theory. The method mainly adopts normalized signal processing to extract characteristic information of the signal in the time and frequency domains. Tests on transformers subject to interference signals such as noise, internal discharge, and surface discharge verify the method's recognition characteristics and the accuracy of insulation fault pattern recognition.

  4. Does HDR Pre-Processing Improve the Accuracy of 3D Models Obtained by Means of two Conventional SfM-MVS Software Packages? The Case of the Corral del Veleta Rock Glacier

    Directory of Open Access Journals (Sweden)

    Álvaro Gómez-Gutiérrez

    2015-08-01

    Full Text Available The accuracy of different workflows using Structure-from-Motion and Multi-View-Stereo (SfM-MVS) techniques is tested. Twelve point clouds of the Corral del Veleta rock glacier, in Spain, were produced with two different software packages (123D Catch and Agisoft Photoscan), using Low Dynamic Range (LDR) images and High Dynamic Range (HDR) compositions, for three different years (2011, 2012 and 2014). The accuracy of the resulting point clouds was assessed using benchmark models acquired every year with a Terrestrial Laser Scanner. Three parameters were used to estimate the accuracy of each point cloud: the RMSE, the Cloud-to-Cloud distance (C2C) and the Multiscale-Model-to-Model comparison (M3C2). The M3C2 mean error ranged from 0.084 m (standard deviation of 0.403 m) to 1.451 m (standard deviation of 1.625 m). Agisoft Photoscan outperformed 123D Catch, producing more accurate and denser point clouds in 11 out of 12 cases; this work is the first comparison of the two software packages available in the literature. No significant improvement was observed using HDR pre-processing. To our knowledge, this is the first time that the geometric accuracy of 3D models obtained using LDR and HDR compositions has been compared. These findings may be of interest for researchers who wish to estimate geomorphic change using SfM-MVS approaches.

  5. An Attempt at Data Preprocessing for Text Mining in a TCM Prescription Database

    Institute of Scientific and Technical Information of China (English)

    吴磊; 李舒

    2015-01-01

    Objective: To propose a data preprocessing method, centered on data cleaning, for TCM prescription data mining, making the data standard, accurate, and orderly and convenient for follow-up processing. Methods: Text data sources were retrieved from prescription databases using bibliographic search techniques, and non-normalized data were cleaned through auxiliary word-group line processing, regular-expression substitution, and synonym handling, with the purpose of improving data quality. Results: 1758 records were retrieved from the Chinese prescription database and 91 records from the prescription modern-application database. After preprocessing, 6913 effective herb records were obtained, which could be successfully imported into the relevant information mining system for extraction of prescription names and Chinese herbal medicine terms. Conclusion: This method is applicable to text mining and knowledge discovery based on TCM prescription databases. It successfully cleans the source text data, yields standardized, noise-free data, enables effective extraction of the required prescription information, and provides a useful reference for the analysis and mining of TCM prescription text data.

  6. The Effects of Three Preprocessing Protocols on Touch DNA Extraction by the Magnetic Bead Method

    Institute of Scientific and Technical Information of China (English)

    杨电; 刘超; 徐曲毅; 李越; 刘宏

    2011-01-01

    Objective: To compare the effects of three common preprocessing protocols on touch DNA extraction by the magnetic bead method. Methods: DNA was extracted from 10 cigarette butts, 10 toothbrushes, and 10 gloves with the DNA IQ magnetic bead method after direct lysis at 95°C or 70°C, or after digestion with TNE, SDS, and proteinase K. The protocols were compared in terms of DNA yield, IPC CT values, and typing results with the Sinofiler system. Results: For all three preprocessing protocols, the IPC CT values of the DNA extracted by the magnetic bead method were between 26.63 and 27.19, indicating high purity. Digesting samples with TNE, SDS, and proteinase K before magnetic bead purification yielded more DNA than the direct lysis protocols, and the STR typing success rate of the digestion protocol was correspondingly higher. Whether lysis was at 95°C or 70°C, no significant difference was observed in either DNA yield or STR typing success rate. Conclusion: Digestion before magnetic bead purification can increase the STR typing success rate for touch DNA samples.

  7. Research and Application of Image Enhancement Algorithms in License Plate Image Preprocessing

    Institute of Scientific and Technical Information of China (English)

    江治国; 章飞

    2012-01-01

    License plate images captured by cameras in foggy scenes are of low quality, which strongly affects image preprocessing in license plate recognition systems and degrades plate location and recognition. Taking license plate images actually collected in fog as examples, simulation experiments were performed in which the multi-scale Retinex algorithm with color restoration and the dark channel prior dehazing algorithm were each used to enhance the images. The experimental results show that enhancing images with the dark channel prior dehazing algorithm improves image contrast and removes fog, so that better license plate location and recognition results are achieved.

  8. Image Recognition Preprocessing of Stored-Grain Insect Pests Based on Improved Algorithms

    Institute of Scientific and Technical Information of China (English)

    刘丽娟; 刘仲鹏

    2014-01-01

    Image preprocessing and pattern recognition techniques are introduced to preprocess images of stored-grain insect pests. Combining the characteristics of pest images, the traditional grayscale conversion is improved by using an HSI transform; histogram equalization adjusts the gray-level spacing to enhance image contrast; the traditional filtering method is optimized into a directional filtering algorithm that protects image edges while removing noise; and FCM (fuzzy c-means) segmentation extracts the main features of the pest images. The target images are thus denoised and enhanced, laying a foundation for further intelligent recognition and processing of stored-grain pests.

  9. Preprocessing Design in Data Mining System for Sucker-rod Pumping System in Inclined Well

    Institute of Scientific and Technical Information of China (English)

    高书香

    2012-01-01

    Data mining can be used to optimize the production management of sucker-rod pumping systems, with single-well production cost as a suitable mining target. The mining data should include all relevant data in seven categories: basic reservoir data, wellbore trajectory data, downhole fluid data, downhole tool data, downhole operation data, surface equipment data, and daily production data. The preprocessing design should cover data standardization, noise filtering, normalization of data units, unification of terminology, and cost quantification.

  10. Practical Secure Computation with Pre-Processing

    DEFF Research Database (Denmark)

    Zakarias, Rasmus Winther

    are implemented in practice and show state-of-the-art performance for the Oblivious AES benchmark application. We do 680 AES circuits in parallel within 3 seconds, resulting in an amortized execution time of 4 ms per AES block. The latency of 3 seconds is hard to cope with in practical scenarios...

  11. Comparing Binaural Pre-processing Strategies II

    Directory of Open Access Journals (Sweden)

    Regina M. Baumgärtel

    2015-12-01

    Full Text Available Several binaural audio signal enhancement algorithms were evaluated with respect to their potential to improve speech intelligibility in noise for users of bilateral cochlear implants (CIs). 50% speech reception thresholds (SRT50) were assessed using an adaptive procedure in three distinct, realistic noise scenarios. All scenarios were highly nonstationary, complex, and included a significant amount of reverberation. Other aspects, such as the perfectly frontal target position, were idealized laboratory settings, allowing the algorithms to perform better than in corresponding real-world conditions. Eight bilaterally implanted CI users, wearing devices from three manufacturers, participated in the study. In all noise conditions, a substantial improvement in SRT50 compared to the unprocessed signal was observed for most of the algorithms tested, with the largest improvements generally provided by binaural minimum variance distortionless response (MVDR) beamforming algorithms. The largest overall improvement in speech intelligibility was achieved by an adaptive binaural MVDR in a spatially separated, single competing talker noise scenario. A no-pre-processing condition and adaptive differential microphones without a binaural link served as the two baseline conditions. SRT50 improvements provided by the binaural MVDR beamformers surpassed the performance of the adaptive differential microphones in most cases. Speech intelligibility improvements predicted by instrumental measures were shown to account for some but not all aspects of the perceptually obtained SRT50 improvements measured in bilaterally implanted CI users.

  12. Three modes of spatiotemporal preprocessing by eyes

    NARCIS (Netherlands)

    Hateren, J.H. van

    1993-01-01

    Optimal spatiotemporal filters for early vision were computed as a function of signal-to-noise ratio (SNR) and α, a parameter defined as the ratio of the width of the probability distribution of velocities as perceived by the naturally behaving animal, and the characteristic velocity of the photorec

  13. Automated preprocessing of spaceborne SAR data

    Science.gov (United States)

    Curlander, J. C.; Wu, C.; Pang, A.

    1982-01-01

    An efficient algorithm has been developed for estimating the echo phase delay in spaceborne synthetic aperture radar (SAR) data. This algorithm utilizes the spacecraft ephemeris data and the radar echo data to produce estimates of two parameters: (1) the centroid of the Doppler frequency spectrum f(d) and (2) the Doppler frequency rate. Results are presented from tests conducted with Seasat SAR data; they indicate that estimation accuracies of 3 Hz for f(d) and 0.3 Hz/sec for the Doppler frequency rate are attainable. The clutterlock and autofocus techniques used for estimating f(d) and the Doppler frequency rate, respectively, are discussed, and the algorithm developed for optimal implementation of these techniques is presented.

  14. Characterization of advanced preprocessed materials (Hydrothermal)

    Energy Technology Data Exchange (ETDEWEB)

    Rachel Emerson; Garold Gresham

    2012-09-01

    The initial hydrothermal treatment parameters did not achieve the proposed objective of this effort, the reduction of intrinsic ash in the corn stover. However, the liquid fractions from the 170°C treatments indicated that some of the elements routinely found in the ash that negatively impact biochemical conversion processes had been removed. After reviewing other options for facilitating ash removal, sodium citrate (a chelating agent) was included in the hydrothermal treatment process, resulting in a 69% reduction in the physiological ash. These results indicate that chelation-hydrothermal treatment is one possible approach to reducing the overall ash content of feedstock materials while having a positive impact on conversion performance.

  15. Retinal image analysis: preprocessing and feature extraction

    Energy Technology Data Exchange (ETDEWEB)

    Marrugo, Andres G; Millan, Maria S, E-mail: andres.marrugo@upc.edu [Grup d'Optica Aplicada i Processament d'Imatge, Departament d'Optica i Optometria, Universitat Politecnica de Catalunya (Spain)]

    2011-01-01

    Image processing, analysis and computer vision techniques are found today in all fields of medical science. These techniques are especially relevant to modern ophthalmology, a field heavily dependent on visual data. Retinal images are widely used for diagnostic purposes by ophthalmologists, but they often need visual enhancement before digital analysis for pathological risk or damage detection can be applied. In this work we propose the use of an image enhancement technique to compensate for the non-uniform contrast and luminosity distribution in retinal images. We also explore optic nerve head segmentation by means of color mathematical morphology and the use of active contours.

  16. Determination of heavy metal content in vegetable oils with different preprocessing methods

    Institute of Scientific and Technical Information of China (English)

    倪张林; 汤富彬; 屈明华

    2012-01-01

    The effects of three preprocessing methods, wet digestion, dry ashing, and microwave digestion, on the determination of heavy metals in vegetable oil were studied. Comparison showed that wet digestion with HNO3-H2O2-HClO4 was the best preprocessing method, and graphite furnace atomic absorption spectrometry (GFAAS) was used to determine lead, cadmium, and chromium in vegetable oil. The method showed good linearity within the working range (r > 0.99); the recoveries of lead, cadmium, and chromium were 88.2%, 94.5%, and 84.9%, with relative standard deviations of 8.5%, 5.3%, and 9.1%, and detection limits of 6.21, 0.89, and 1.99 μg/kg, respectively. The method is fast and simple to operate, and its application to the determination of heavy metals in vegetable oil gave satisfactory results.

  17. A Partial Order Reduction Based Method for Big Data Preprocessing in Smart Grid Environment

    Institute of Scientific and Technical Information of China (English)

    李刚; 焦谱; 文福拴; 宋雨; 尚金成; 何洋

    2016-01-01

    Considering the multi-dimensional, spatially and temporally hybrid characteristics of data from primary power systems and power information systems, a big-data attribute-reduction preprocessing method based on partial order reduction is presented. The method exploits the parallelism of MapReduce and focuses on the independence of concurrent events, so as to meet the coverage requirements of attribute dimensionality and reduction for power big data. Finally, the proposed attribute-reduction method is simulated on examples including the monitoring data of a photovoltaic (PV) power generation system, transformer fault-diagnosis data, and real-time and reliability prediction data from the communication system of an intelligent substation, and tests on a Hadoop platform show good performance.

  18. Noise-polluted image segmentation based on histogram preprocessing and BF algorithm

    Institute of Scientific and Technical Information of China (English)

    柳新妮; 马苗

    2014-01-01

    Image noise directly affects the quality of image segmentation. To distinguish targets in noise-polluted images quickly and accurately, a method based on histogram preprocessing and the bacterial foraging (BF) algorithm is proposed. First, a discrete wavelet transform suppresses the noise in the image; then the histogram of the denoised image is analyzed to shrink the distribution range of the optimal threshold; finally, with the two-dimensional Otsu criterion as the segmentation objective function, the bacterial foraging algorithm searches for the optimal threshold in parallel. Experimental results show that the method outperforms segmentation methods based on other swarm-intelligence algorithms, such as the genetic algorithm and the artificial fish swarm algorithm, in convergence speed, stability, and segmentation quality.
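    The record maximizes a two-dimensional Otsu criterion with a bacterial-foraging search. As a simplified stand-in only, the sketch below implements the classic one-dimensional Otsu between-class variance with an exhaustive threshold search; the swarm search in the paper replaces exactly this exhaustive loop, and the 2-D histogram extension is omitted here.

```python
# One-dimensional Otsu threshold by exhaustive between-class variance search.
import numpy as np

def otsu_threshold(gray):
    hist = np.bincount(gray.ravel(), minlength=256).astype(float)
    p = hist / hist.sum()
    best_t, best_var = 0, 0.0
    for t in range(1, 256):
        w0, w1 = p[:t].sum(), p[t:].sum()
        if w0 == 0 or w1 == 0:
            continue
        mu0 = (np.arange(t) * p[:t]).sum() / w0
        mu1 = (np.arange(t, 256) * p[t:]).sum() / w1
        var = w0 * w1 * (mu0 - mu1) ** 2       # between-class variance
        if var > best_var:
            best_t, best_var = t, var
    return best_t

img = np.clip(np.random.normal(100, 20, (64, 64)), 0, 255).astype(np.uint8)
img[20:40, 20:40] = np.clip(np.random.normal(200, 10, (20, 20)), 0, 255)
print(otsu_threshold(img))
```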

  19. Methods of signal pre-processing with massive road surface measurement data

    Institute of Scientific and Technical Information of China (English)

    段虎明; 谢飞; 张开斌; 马颖; 石锋

    2011-01-01

    The components of road surface measurement systems and the methods for measuring road surface profiles in automotive tests are reviewed. For massive road surface test data, several signal pre-processing methods are applied and correspondingly improved, including the identification and removal of gross errors, the extraction of signal trends, the smooth connection of segmented test data, randomness testing of the road surface data, and the handling of moments of abnormal vehicle speed. After the principles of these methods are analyzed, actually measured road surface data are used as examples, fully demonstrating the processing effect. The example analyses show that these signal pre-processing methods are simple and convenient to use and clearly effective for processing and correcting road surface test data; they can be widely applied to road surface data processing as well as to the pre-processing of other engineering vibration signals.
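    A sketch, with assumed parameters, of three of the named steps: gross-error rejection, trend extraction, and the smooth connection of two data segments by cross-fading over an overlap window.

```python
# Illustrative road-profile preprocessing: outlier rejection, segment
# joining by cross-fade, and linear-trend removal.
import numpy as np
from scipy.signal import detrend

def reject_gross_errors(x, k=3.0):
    mu, sigma = x.mean(), x.std()
    y = x.copy()
    y[np.abs(y - mu) > k * sigma] = mu     # simple replacement by the mean
    return y

def connect_segments(a, b, overlap=100):
    """Cross-fade the tail of segment a into the head of segment b."""
    w = np.linspace(1.0, 0.0, overlap)
    blend = w * a[-overlap:] + (1 - w) * b[:overlap]
    return np.concatenate([a[:-overlap], blend, b[overlap:]])

seg1 = np.random.randn(1000) + np.linspace(0, 2, 1000)   # drifting segment
seg2 = np.random.randn(1000) + 2.0
profile = connect_segments(reject_gross_errors(seg1),
                           reject_gross_errors(seg2))
profile = detrend(profile)                 # remove the linear trend term
```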

  20. Project GRACE A grid based search tool for the global digital library

    CERN Document Server

    Scholze, Frank; Vigen, Jens; Prazak, Petra; The Seventh International Conference on Electronic Theses and Dissertations

    2004-01-01

    The paper will report on the progress of an ongoing EU project called GRACE - Grid Search and Categorization Engine (http://www.grace-ist.org). The project participants are CERN, Sheffield Hallam University, Stockholm University, Stuttgart University, GL 2006 and Telecom Italia. The project started in 2002 and will finish in 2005, resulting in a Grid-based search engine that will search across a variety of content sources, including a number of electronic thesis and dissertation repositories. The Open Archives Initiative (OAI) is expanding and is clearly an interesting movement for a community advocating open access to ETD. However, the OAI approach alone may not be sufficiently scalable to achieve a truly global ETD Digital Library. Many universities simply offer their collections to the world via their local web services without being part of any federated system for archiving, and even those dissertations that are provided with OAI-compliant metadata will not necessarily be picked up by a centralized OAI Ser...

  1. Grace: a Cross-platform Micromagnetic Simulator On Graphics Processing Units

    CERN Document Server

    Zhu, Ru

    2014-01-01

    A micromagnetic simulator running on a graphics processing unit (GPU) is presented. It achieves a significant performance boost, up to two orders of magnitude for large input problems, compared to previous central processing unit (CPU) simulators. Unlike the GPU implementations of other research groups, this simulator is developed with C++ Accelerated Massive Parallelism (C++ AMP) and is hardware-platform compatible: it runs on GPUs from vendors including NVidia, AMD and Intel, paving the way for fast micromagnetic simulation on both high-end workstations with dedicated graphics cards and low-end personal computers with integrated graphics. A copy of the simulator software is publicly available.

  2. Fast Pre-processing of MERSI Data on FY-3 Meteorological Satellite Using IDL

    Institute of Scientific and Technical Information of China (English)

    杨何群; 周红妹; 尹球; 韩涛; 葛伟强

    2012-01-01

    As a new source of satellite data, the Medium Resolution Spectral Imager (MERSI) on board FY-3, China's second-generation polar-orbiting meteorological satellite, offers fine surface-observing capability but also many channels and a huge data volume, so correcting its imagery with commercial remote sensing software is time-consuming and requires frequent I/O operations. This paper proposes a workflow for fast preprocessing of FY-3/MERSI data. Its key steps are: according to the study area's shapefile and the self-positioning data carried by FY-3/MERSI, the corresponding region of all 20 channels is georeferenced using a geometric correction algorithm based on a triangular network; the corrected pixel DN values are produced by backward-mapping resampling; and radiometric calibration, solar elevation angle correction, and other preprocessing steps follow. The procedures are implemented in IDL with a module interface for user interaction. Test results show that the algorithms in this module take little memory and greatly improve the efficiency of FY-3/MERSI data processing: for an area of about 450 000 km², all channels are processed in only 450 s, significantly faster than geometric correction with the geographic lookup table (GLT) algorithm in the ENVI software, which takes over 30 min for a single channel, and the quality and geographic accuracy of the corrected images are better. The module greatly enhances FY-3/MERSI data processing efficiency and completes the prerequisite work for applying FY-3/MERSI to the thermal environment, snow, and other fields.
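    The core of this workflow is triangulation-based geometric correction. The record's implementation is in IDL; purely as a conceptual illustration in Python, the sketch below performs the analogous resampling with scipy.interpolate.griddata, whose 'linear' mode interpolates over a Delaunay triangulation. The synthetic per-pixel longitude/latitude arrays stand in for MERSI's self-positioning data; sizes and coordinates are invented.

```python
# Conceptual sketch of triangulation-based geometric correction: resample
# irregularly located pixel DNs onto a regular geographic grid.
import numpy as np
from scipy.interpolate import griddata

rng = np.random.default_rng(0)
dn = rng.integers(0, 1024, (200, 200)).astype(float)    # one channel's DNs
lon = (np.linspace(113.0, 115.0, 200)[None, :]
       + 0.01 * rng.normal(size=(200, 200)))            # per-pixel longitude
lat = (np.linspace(21.0, 23.0, 200)[:, None]
       + 0.01 * rng.normal(size=(200, 200)))            # per-pixel latitude

# target: a regular geographic grid over the study area
glon, glat = np.meshgrid(np.linspace(113.2, 114.8, 400),
                         np.linspace(21.2, 22.8, 400))
points = np.column_stack([lon.ravel(), lat.ravel()])
corrected = griddata(points, dn.ravel(), (glon, glat), method='linear')
```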

  3. Research on Preprocessing Algorithm for Differential Polarization Spectrum of Oil Spills on Water

    Institute of Scientific and Technical Information of China (English)

    袁越明; 方勇华; 崔方晓; 李大成

    2011-01-01

    When oil spills on water are passively detected by differential polarization Fourier transform infrared (FTIR) spectroscopy in the 3-5 μm band, the measured differential polarization spectrum is a mixture of the strong atmospheric absorption signal and the weak signal of the oil pollutant, which complicates the spectral identification of oil spills on water. In addition, the thickness distribution and surface roughness of the oil film change under the influence of environmental factors and the film's own tension, so the amount of effective oil spectral information contained in the differential polarization spectrum varies continuously during measurement. Exploiting this characteristic, an algorithm based on a fast fixed-point algorithm for principal component analysis (FastPCA) is proposed for preprocessing the differential polarization spectrum of oil pollutants on water. Experimental results show that the algorithm can separate the spectral information of the oil pollutant from the differential polarization spectrum containing the strong atmospheric absorption signal. The reconstructed spectral characteristic signal of the oil pollutant can be used for further qualitative and quantitative analyses.
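
    The separation idea can be sketched with an ordinary PCA in Python (scikit-learn) standing in for the paper's fixed-point FastPCA; the synthetic spectra below are invented purely to make the example self-contained.

        import numpy as np
        from sklearn.decomposition import PCA

        rng = np.random.default_rng(0)
        wavenumbers = np.linspace(2000.0, 3333.0, 800)          # roughly the 3-5 um band
        atmos = np.exp(-((wavenumbers - 2350.0) / 30.0) ** 2)   # strong shared "atmospheric" line
        oil = 0.05 * np.sin(wavenumbers / 40.0)                 # weak oil-like feature

        # 40 repeated measurements in which the oil contribution varies,
        # as the abstract says it does during continuous measurement:
        spectra = np.array([atmos + a * oil
                            + 0.01 * rng.standard_normal(wavenumbers.size)
                            for a in np.linspace(0.2, 1.0, 40)])

        pca = PCA(n_components=2)
        pca.fit(spectra)
        # One component tracks the shared atmospheric signal; the loading
        # vector of the component capturing the varying part approximates
        # the oil spectrum (sign and ordering are not guaranteed).
        oil_estimate = pca.components_[1]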

  4. Design and application of a pre-processing system for Shenzhen minute precipitation data

    Institute of Scientific and Technical Information of China (English)

    许沛华; 陈正洪; 李磊; 郑慧

    2012-01-01

    The running of"Shenzhen partition storm intensity formula projection software"needs to get more types of maximum precipitation samples in the direfrent duration year by year from all the automatic weather stations in Shenzhen.As calculated volume and timeliness demanding,the pre-processing system on Shenzhen minute precipitation data was designed and developed,which has a "seamless connection" with the software.In this paper the system is introduced in detail about partition principle,the quality control on minute rainfall data,different duration sliding precipitation calculation,automatic determination of the different precipitation processes,sample selection,as well as function,process and design technique.Meanwhile,taking daily minute precipitation data at Shenzhen local station from 1961 to 2010 for example,the software was employed to calculate 8 maximum rainfall sequences including 9 durations for 5 minutes,10 minutes,15 minutes,20 minutes,30 minutes,45 minutes,60 minutes,90 minutes,120 minutes for every year.The results show that the system can meet the demands of Shenzhen partition storm intensity formula projections software.%"深圳分区暴雨强度公式推算软件"运行,需要获取其境内所有自动气象站逐年不同历时多个样本最大降水量资料,基于其计算量大、时效性要求高,设计开发了与之配套的"深圳分钟降水数据预处理系统",并实现与该软件的"无缝衔接"。从分区原则确定、分钟资料质量控制、不同历时滑动雨量(强)计算、不同降水过程自动判断、样本挑选,以及功能、流程、设计技术等方面,对该系统作详细介绍。同时,以深圳本站1961—2010年共50年逐日分钟降水资料为例,采用该软件计算出5、10、15、20、30、45、60、90、120 min共9个历时每年8个最大降水量序列,结果表明,该系统能满足"深圳分区暴雨强度公式推算软件"的要求。

  5. Honeywell R-150 modular automation system and its application to the preprocessing shop of a water works

    Institute of Scientific and Technical Information of China (English)

    赵英凯

    2001-01-01

    This paper presents the structure, functions and features of the R-150 modular automation system and its application to the preprocessing shop of a water works. The configuration software of this system runs under Windows 3.1. Any process chart can be edited with it, and dynamic display images and dynamic data exchange (DDE) are available. Various control algorithm configurations can be performed, and the generated application software can be downloaded to the controllers. The system has operated well in the preprocessing workshop.

  6. Proximate composition, sodium content and energy intake of pre-processed products

    Directory of Open Access Journals (Sweden)

    Margarita Olivera Carrión

    2012-03-01

    … assistance programs. We selected 16 samples produced by five companies, some fortified with vitamins and minerals. They were classified as pre-processed products for complete meals like stews (rice, noodles, lentils, peas), pasta (different types of noodles), premixes for polenta, and others (sauces, puree, fillings, etc.). We calculated the Daily Value (DV) percentage following the portions established by the manufacturers or MERCOSUR. Protein levels were 10-19% for stews, and only 5 samples exceeded 20% of the DV. For fats, the values were 6-12% in stews and 2-4% in pasta (except a polenta with 12%), and only 2 samples showed higher values (16-27%); the main source of fat was hydrogenated vegetable oils in 5 samples (4 stews, 1 filling). The EI was between 9-15% of the DV for both stew and pasta dishes. The sodium remained at the levels reported by the manufacturers, and the DV% was particularly high in stews (39-54%) and two premixes (42-63%). The contribution of sauces and puree was significant only in sodium: 12-15% of the DV. Considering the importance these products may have in the diet of the target population, the nutritional quality of the ingredients used in their formulations, and the DV percentages of protein, fat and sodium, the content and quality of protein and fat should be increased and the salt eliminated or reduced, regardless of fortification. The study was conducted in 2007 and its publication was considered important because most products are still being marketed and their composition is contained in the SARA database of the Ministry of Health of the Nation.

  7. THE STUDY OF EXTRACTION MODIS LEVEL 1B IMAGE DATA BASED ON A HDF4 FILE

    Institute of Scientific and Technical Information of China (English)

    史磊; 张柯; 洪俊光

    2008-01-01

    Starting from the HDF4 file format and the implementation principles of the HDF software library, this paper introduces MODIS 1B image data and its two main types of data objects, SDS (scientific data sets) and Vdata (virtual data). Reading of MODIS 1B data is implemented by accessing the SD interface for SDS and the VS interface for Vdata, in combination with the HDF software library.
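
    A minimal Python equivalent of the SD/VS access pattern just described, using the pyhdf bindings to the HDF4 library (the paper itself works against the HDF library directly); the SDS name is an assumption that varies by product.

        from pyhdf.SD import SD, SDC
        from pyhdf.HDF import HDF
        import pyhdf.VS                      # must be imported to enable vstart()

        def read_modis_1b(path, sds_name):
            """Read one scientific data set (plus its attributes) through the
            SD interface and list the Vdata tables through the VS interface."""
            sd = SD(path, SDC.READ)
            sds = sd.select(sds_name)        # e.g. 'EV_1KM_RefSB'; name varies
            data = sds.get()
            attrs = sds.attributes()         # calibration scales/offsets, if any
            sds.endaccess()
            sd.end()

            hdf = HDF(path)
            vs = hdf.vstart()
            vdata_tables = vs.vdatainfo()    # [(name, class, refnum, ...), ...]
            vs.end()
            hdf.close()
            return data, attrs, vdata_tables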

  8. THE METHOD AND APPLICATION OF EXTRACTION MODIS LEVEL 1B DATA BASED ON HDF FILE FORMAT WITH MATLAB

    Institute of Scientific and Technical Information of China (English)

    陈林; 牛生杰; 仲凌志

    2006-01-01

    Applications of MODIS data in the HDF file format are increasingly widespread, and extracting the MODIS 1B data is the prerequisite for developing MODIS applications. This paper describes in detail the process of reading and writing HDF files with Matlab, gives a flowchart for extracting MODIS 1B data on this basis, implements the extraction of MODIS 1B data, and lays the foundation for the development of MODIS Level 2 products.

  9. Optimizing Community Detection Using Pre-processing of Edge Weights Based on Random Walks in Networks

    Institute of Scientific and Technical Information of China (English)

    刘阳; 季新生; 刘彩霞

    2013-01-01

    As social networks become ever more complicated and huge, it is difficult to improve the accuracy and performance of existing community detection algorithms by relying on network topological features alone. Based on Markov random walk theory, this paper proposes an edge-weight pre-processing method for optimizing community detection, modeling how community structure influences the behavior of complex networks. According to how often multiple random walks traverse each network link, the edge weights are reset and then used as effective supplementary information to the network topology, de-fuzzifying the community structure and thereby improving the performance of existing community detection algorithms. For a set of typical benchmark computer-generated networks and real-world network data sets, experimental results show that the pre-processing method can effectively improve the accuracy and efficiency of several existing community detection algorithms.
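
    A toy Python/NetworkX sketch of the random-walk edge-weighting idea: weights count how often independent walks traverse each edge before a standard community detection algorithm is run. The walk counts and lengths are arbitrary choices, not the paper's settings.

        import random
        import networkx as nx

        def random_walk_edge_weights(G, n_walks=200, walk_len=50, seed=0):
            """Re-weight edges by how often independent random walks traverse
            them; intra-community edges tend to accumulate larger counts."""
            rng = random.Random(seed)
            counts = {frozenset(e): 0 for e in G.edges()}
            nodes = list(G.nodes())
            for _ in range(n_walks):
                u = rng.choice(nodes)
                for _ in range(walk_len):
                    nbrs = list(G.neighbors(u))
                    if not nbrs:
                        break
                    v = rng.choice(nbrs)
                    counts[frozenset((u, v))] += 1
                    u = v
            for u, v in G.edges():
                G[u][v]['weight'] = 1 + counts[frozenset((u, v))]
            return G

        # G = random_walk_edge_weights(nx.karate_club_graph())
        # communities = nx.community.louvain_communities(G, weight='weight')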

  10. Looking age-appropriate while growing old gracefully: A qualitative study of ageing and body image among older adults.

    Science.gov (United States)

    Jankowski, Glen S; Diedrichs, Phillippa C; Williamson, Heidi; Christopher, Gary; Harcourt, Diana

    2016-04-01

    Body dissatisfaction can be significantly detrimental to wellbeing. Little is known about older adults' body image, despite the fact that ageing causes unique bodily changes and that sociocultural pressures to resist these changes abound. We conducted six focus groups with a UK community sample of White British and South Asian older adults aged 65-92 years. Thematic analysis highlighted four themes: appearance indicates capability and identity; physical ability trumps appearance; felt pressures to age 'gracefully' while resisting appearance changes; and gender and cultural differences. These findings suggest that older adults' body image can have important implications for their wellbeing and merits researchers' attention.

  11. Seasonal changes in the European gravity field from GRACE: A comparison with superconducting gravimeters and hydrology model predictions

    DEFF Research Database (Denmark)

    Hinderer, J.; Andersen, Ole Baltazar; Lemoine, F.

    2006-01-01

    This paper is devoted to the investigation of seasonal changes of the Earth's gravity field from the GRACE satellites and the comparison with surface gravity measurements in Europe from the Global Geodynamics Project (GGP) sub-network, as well as with recent hydrology models for continental soil moisture. Satellite-derived and ground gravity changes due to continental hydrology are studied, and we also compute the theoretical ratio of gravity versus radial displacement (in μGal/mm) involved in the hydrological loading process. The 'mean' value (averaged in time and in space over Europe) from hydrologic forward modeling is found to be close to -1.0 μGal/mm, and we show that this value can be explained by a strong low-degree (n = 5-6) peak in the hydrology amplitude spectrum. The dominant time-variable signal from GRACE is found to be annual, with an amplitude and a phase both of which are in fair agreement …

  12. Surface Subsidence Analysis by Multi-Temporal InSAR and GRACE: A Case Study in Beijing

    Directory of Open Access Journals (Sweden)

    Jiming Guo

    2016-09-01

    The aim of this study was to investigate the relationship between surface subsidence and groundwater changes. To investigate this relationship, we first analyzed surface subsidence, presenting the results of a case study in Beijing from 1 August 2007 to 29 September 2010. The multi-temporal Interferometric Synthetic Aperture Radar (InSAR) technique, which can simultaneously detect point-like stable reflectors (PSs) and distributed scatterers (DSs), was used to retrieve the subsidence magnitude and distribution in Beijing from 18 ENVISAT ASAR images. The multi-temporal InSAR-derived subsidence was verified by leveling at an accuracy better than 5 mm/year. Based on the verified multi-temporal InSAR results, prominent uneven subsidence was identified in Beijing: most subsidence velocities in the downtown area were within 10 mm/year, while the largest subsidence was detected in Tongzhou, with velocities exceeding 140 mm/year. Furthermore, Gravity Recovery and Climate Experiment (GRACE) data were used to derive the groundwater change series and trend. Compared with the multi-temporal InSAR-derived subsidence results, the long-term decreasing trends of groundwater and surface elevation showed relatively high consistency, and a significant impact of groundwater changes on surface subsidence was identified. Additionally, the spatial distribution of the subsidence funnel was partially consistent with that of the groundwater depression, the former covering a wider area than the latter. Finally, the relationship between surface subsidence and groundwater changes was determined.

  13. Confronting the Reporting Systems and Financial Communication of Listed Companies Through a Measure of Their Quality

    OpenAIRE

    Cavelius, Florence

    2009-01-01

    Listed companies disclose information to their institutional investors through their internal reporting. This article proposes to confront, through the use of quality indexes, the practices of financial communication and the reporting systems of listed companies. Differences in quality emerge within the studied sample, which lead to a typology of practices. Depending on the case, a different usefulness of information for the in…

  14. Global Mass Flux Solutions from GRACE: A Comparison of Parameter Estimation Strategies - Mass Concentrations Versus Stokes Coefficients

    Science.gov (United States)

    Rowlands, D. D.; Luthcke, S. B.; McCarthy J. J.; Klosko, S. M.; Chinn, D. S.; Lemoine, F. G.; Boy, J.-P.; Sabaka, T. J.

    2010-01-01

    The differences between mass concentration (mascon) parameters and standard Stokes coefficient parameters in the recovery of gravity information from Gravity Recovery and Climate Experiment (GRACE) intersatellite K-band range rate data are investigated. First, mascons are decomposed into their Stokes coefficient representations to gauge the range of solutions available using each of the two types of parameters. Next, a direct comparison is made between two time series of unconstrained gravity solutions, one based on a set of global equal-area mascon parameters (equivalent to 4° x 4° at the equator), and the other based on standard Stokes coefficients, with both time series using the same fundamental processing of the GRACE tracking data. It is shown that in unconstrained solutions, the type of gravity parameter being estimated does not qualitatively affect the estimated gravity field. It is also shown that many of the differences in mass flux derivations from GRACE gravity solutions arise from the type of smoothing being used, and that the type of smoothing that can be embedded in mascon solutions has distinct advantages over post-solution smoothing. Finally, a 1-year time series based on global 2° equal-area mascons estimated every 10 days is presented.

  15. Improving Human Effectiveness Through Embedded Virtual Simulation (Amelioration de l’efficacite humaine grace a la simulation virtuelle integree)

    Science.gov (United States)

    2014-01-01

    … environment and direct their activity towards achieving goals (Russell & Norvig, 2003). In computer-based tutors, learning agents observe and act upon … Norvig, P. (2003). Artificial Intelligence: A Modern Approach (2nd ed.). Upper Saddle River, New Jersey: Prentice Hall, ISBN 0-13-790395-2.

  16. Experimental variability and data pre-processing as factors affecting the discrimination power of some chemometric approaches (PCA, CA and a new algorithm based on linear regression) applied to (+/-)ESI/MS and RPLC/UV data: Application on green tea extracts.

    Science.gov (United States)

    Iorgulescu, E; Voicu, V A; Sârbu, C; Tache, F; Albu, F; Medvedovici, A

    2016-08-01

    The influence of experimental variability (instrumental repeatability, instrumental intermediate precision and sample preparation variability) and data pre-processing (normalization, peak alignment, background subtraction) on the discrimination power of multivariate data analysis methods (Principal Component Analysis (PCA) and Cluster Analysis (CA)), as well as of a new algorithm based on linear regression, was studied. Data used in the study were obtained through positive or negative ion monitoring electrospray mass spectrometry (+/-ESI/MS) and reversed-phase liquid chromatography with UV spectrometric detection (RPLC/UV) applied to green tea extracts. Extraction in ethanol and heated water infusion were used as sample preparation procedures. The multivariate methods were applied directly to the mass spectra and chromatograms, involving strictly a holistic comparison of shapes, without assigning structural identity to compounds. An alternative data interpretation based on linear regression analysis mutually applied to data series is also discussed. Slopes, intercepts and correlation coefficients produced by linear regression analysis applied to pairs of very large experimental data series successfully retain information resulting from high-frequency instrumental acquisition rates, better defining the profiles being compared. Consequently, each type of sample, or comparison between samples, produces in Cartesian space an ellipsoidal volume defined by the normal variation intervals of the slope, intercept and correlation coefficient. Distances between volumes graphically illustrate (dis)similarities between the compared data. The instrumental intermediate precision had the major effect on the discrimination power of the multivariate data analysis methods. Mass spectra produced through ionization from the liquid state at atmospheric pressure of bulk complex mixtures resulting from extracted materials of natural origin provided an excellent data …

  17. Data Security by Preprocessing the Text with Secret Hiding

    Directory of Open Access Journals (Sweden)

    Ajit Singh

    2012-06-01

    With the advent of the Internet, an open forum, the massive increase in data traveling across networks makes secure transmission an issue. Cryptography covers many encryption methods for making data secure, but transmitting the secured data remains an intricate task; steganography complements it by transmitting data without revealing that secure data is present. This paper provides a mechanism that enhances data security by using a crypto+stegano combination to raise the security level without revealing that secret data is being shared across networks. In the first phase, the data is encrypted by manipulating the text using ASCII codes and some randomly generated strings for the codes, derived from a few parameters. Steganography built on top of cryptography forms the basis for many data hiding techniques: the data is encrypted using the proposed approach and the message is then hidden in N random images with the help of a perfect hashing scheme, which increases the security of the message before it is sent across the medium. Thus the sending and receiving of the message will be safe and secure, with increased confidentiality.

  18. High speed preprocessing in real time telemetry systems

    Science.gov (United States)

    Strock, O. J.; O'Brien, Michael

    A versatile high-speed preprocessor, the EMR 8715, is described, which is used as a close-coupled input device for the host computer in a telemetry system. Much of the data and time merging, number conversion, floating-point processing, and data distribution is performed by the system, reducing the host load. The EMR 8715 allows a choice of serial processing, parallel processing, or a combination of the two, on a measurement-by-measurement basis.

  19. Pre-process desilication of wheat straw with citrate

    DEFF Research Database (Denmark)

    Le, Duy Michael; Sorensen, Hanne R.; Meyer, Anne S.

    2017-01-01

    Effects of treatment time, citrate concentration, temperature, and pH on Si extraction from wheat straw prior to hydrothermal pretreatment were investigated for maximising Si removal and biomass recovery before biomass refining. With citrate, an almost linear negative correlation between Si content … the enzymatic cellulose hydrolysis, neither negatively nor positively.

  1. Preprocessing: A Step in Automating Early Detection of Cervical Cancer

    CERN Document Server

    Das, Abhishek; Bhattacharyya, Debasis

    2011-01-01

    Uterine cervical cancer is one of the most common forms of cancer in women worldwide. Most cases of cervical cancer can be prevented through screening programs aimed at detecting precancerous lesions. During digital colposcopy, colposcopic images or cervigrams are acquired in raw form. They contain specular reflections (SR), which appear as bright spots heavily saturated with white light and occur due to the presence of moisture on the uneven cervix surface. The cervix region occupies about half of the raw cervigram image; other parts of the image contain irrelevant information, such as equipment, frames, text and non-cervix tissues, which can confuse automatic identification of the tissues within the cervix. We therefore focus on the cervical borders, so that we have a geometric boundary on the relevant image area. Our novel technique eliminates the SR, identifies the region of interest and makes the cervigram ready for segmentation algorithms.

  2. Preprocessing for Automating Early Detection of Cervical Cancer

    CERN Document Server

    Das, Abhishek; Bhattacharyya, Debasis

    2011-01-01

    Uterine cervical cancer is one of the most common forms of cancer in women worldwide. Most cases of cervical cancer can be prevented through screening programs aimed at detecting precancerous lesions. During digital colposcopy, colposcopic images or cervigrams are acquired in raw form. They contain specular reflections (SR), which appear as bright spots heavily saturated with white light and occur due to the presence of moisture on the uneven cervix surface. The cervix region occupies about half of the raw cervigram image; other parts of the image contain irrelevant information, such as equipment, frames, text and non-cervix tissues, which can confuse automatic identification of the tissues within the cervix. We therefore focus on the cervical borders, so that we have a geometric boundary on the relevant image area. Our novel technique eliminates the SR, identifies the region of interest and makes the cervigram ready for segmentation algorithms.

  3. Architecture and performances of the AGILE Telemetry Preprocessing System (TMPPS)

    Science.gov (United States)

    Trifoglio, M.; Bulgarelli, A.; Gianotti, F.; Lazzarotto, F.; Di Cocco, G.; Fuschino, F.; Tavani, M.

    2008-07-01

    AGILE is an Italian Space Agency (ASI) satellite dedicated to high-energy astrophysics. It was launched successfully on 23 April 2007, and it is operated by the AGILE Ground Segment, consisting of the Ground Station located in Malindi (Kenya), the Mission Operations Centre (MOC) at Telespazio in Fucino, and the AGILE Data Centre (ADC) at the ASI Science Data Centre (ASDC) in Frascati. Due to the low equatorial orbit at ~530 km with an inclination angle of ~2.5°, the satellite passes over the Ground Station every ~100 minutes. During the visibility period of ~12 minutes, the telemetry (TM) is downlinked through two separate virtual channels, VC0 and VC1. The former is devoted to the real-time TM generated during the pass, at an average rate of 50 kbit/s, and is directly relayed to the Control Centre. The latter is used to downlink the TM data collected in the satellite's on-board mass memory during the non-visibility period, generating at the Ground Station a raw TM file of up to 37 MB. Within 20 minutes after the end of the contact, both the real-time and the mass-memory TM arrive at the ADC through the dedicated VPN ASINet. Here they are automatically detected and ingested by the TMPPS pipeline in less than 5 minutes. The TMPPS archives each TM file and sorts its packets into one stream per TM layout. Each stream is processed in parallel in order to unpack the various telemetry fields and archive them into suitable FITS files. Each operation is tracked in a MySQL database which interfaces the TMPPS pipeline to the rest of the scientific pipeline running at the ADC. In this paper the architecture and the performance of the TMPPS are described and discussed.

  4. Parallelized LEDAPS method for Remote Sensing Preprocessing Based on MPI

    Institute of Scientific and Technical Information of China (English)

    Xionghua CHEN; Xu ZHANG; Ying GUO; Yong MA; Yanchen YANG

    2013-01-01

    Based on Landsat images, the Landsat Ecosystem Disturbance Adaptive Processing System (LEDAPS) uses a radiation change detection method for image processing and offers surface reflectance products for ecosystem carbon sequestration and carbon reserve studies. With the accumulation of massive remote sensing data, especially Landsat imagery, traditional serial LEDAPS processing has a long cycle, which causes many difficulties in practical application. To address this problem, this paper designs a high-performance parallel LEDAPS processing method based on MPI. The design not only aims to improve calculation speed and save computing time, but also considers load balance between flexibly extended computing nodes. Results show that the highest speedup of the parallelized LEDAPS reached 7.37 with 8 MPI processes. The method effectively improves the ability of LEDAPS to handle massive remote sensing data and shortens the forest carbon stock calculation cycle based on remote sensing images.
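
    A skeletal mpi4py version of the scene farming implied above; process_scene() is a hypothetical stand-in for the actual LEDAPS correction chain, and the round-robin split is one simple way to address the load balancing the authors mention.

        from mpi4py import MPI

        def process_scene(path):
            """Stand-in for the LEDAPS-style correction of one Landsat scene."""
            return path  # real code would run the radiometric/atmospheric chain

        comm = MPI.COMM_WORLD
        rank, size = comm.Get_rank(), comm.Get_size()

        chunks = None
        if rank == 0:
            scenes = ['scene_%03d.tif' % i for i in range(64)]  # hypothetical list
            chunks = [scenes[r::size] for r in range(size)]     # round-robin balance
        my_scenes = comm.scatter(chunks, root=0)

        my_results = [process_scene(s) for s in my_scenes]
        all_results = comm.gather(my_results, root=0)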

  5. Object localization based on smoothing preprocessing and cascade classifier

    Science.gov (United States)

    Zhang, Xingfu; Liu, Lei; Zhao, Feng

    2017-01-01

    An improved algorithm for image localization is proposed in this paper. First, the image is smoothed and partial noise is removed. A template is then trained with a cascade classifier, and the template is used to detect the related images. The advantages of the algorithm are that it is robust to noise and insensitive to changes in image proportions, and that it is computationally fast. A real picture of a truck underframe is chosen as the experimental object, with images of both normal and faulty components included in the image sample. Experimental results show that the accuracy rate exceeds 90 percent when the grade is more than 40, so the algorithm proposed in this paper can be applied to practical image localization projects.
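
    The smoothing-then-cascade pipeline can be sketched with OpenCV in Python; the file names and the trained cascade are hypothetical, and the paper's exact smoothing filter and training settings are not specified here.

        import cv2

        img = cv2.imread('truck_bottom.jpg')            # hypothetical input image
        smoothed = cv2.GaussianBlur(img, (5, 5), 0)     # remove partial noise first
        gray = cv2.cvtColor(smoothed, cv2.COLOR_BGR2GRAY)

        # 'component_cascade.xml' is a hypothetical cascade trained on
        # templates of the truck components, as the abstract describes.
        cascade = cv2.CascadeClassifier('component_cascade.xml')
        boxes = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
        for (x, y, w, h) in boxes:
            cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)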

  6. Preprocessing Techniques for Image Mining on Biopsy Images

    Directory of Open Access Journals (Sweden)

    Ms. Nikita Ramrakhiani

    2015-08-01

    Biomedical imaging has been undergoing rapid technological advancement over the last several decades and has seen the development of many new applications. A single image can give all the details about an organ, from the cellular level to the whole-organ level. Biomedical imaging is becoming increasingly important as an approach to synthesize, extract and translate useful information from the large multidimensional databases accumulated in research frontiers such as functional genomics, proteomics, and functional imaging. Image mining can be used to fulfill this approach: it bridges the gap by extracting and translating semantically meaningful information from biomedical images and applying it to test for and detect anomalies in the target organ. The essential component in image mining is identifying similar objects in different images and finding correlations between them. Integration of image mining and the biomedical field can result in many real-world applications.

  7. COMPARATIVE ANALYSIS OF SATELLITE IMAGE PRE-PROCESSING TECHNIQUES

    Directory of Open Access Journals (Sweden)

    T. Sree Sharmila

    2013-01-01

    Satellite images are corrupted by noise during acquisition and transmission. Removing the noise by attenuating high-frequency image components also removes some important details. In order to retain useful information and improve the visual appearance, effective denoising and resolution enhancement techniques are required. In this research, a Hybrid Directional Lifting (HDL) technique is proposed to retain the important details of the image and improve its visual appearance, and a Discrete Wavelet Transform (DWT) based interpolation technique is developed for enhancing the resolution of the denoised image. The performance of the proposed techniques is tested on Land Remote-Sensing Satellite (LANDSAT) images using the quantitative performance measure Peak Signal to Noise Ratio (PSNR) and computation time. The PSNR of the HDL technique increases by 1.02 dB compared to the standard denoising technique, and the DWT-based interpolation technique increases it by 3.94 dB. The experimental results reveal that the newly developed image denoising and resolution enhancement techniques improve the visual quality of images with rich textures.
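
    A hedged Python sketch of two evaluation ingredients: the PSNR metric and one common form of DWT-based interpolation (treating the image as the LL band and inverting the transform with zeroed detail bands). The paper's HDL denoising itself is not reproduced, and the wavelet choice here is arbitrary.

        import numpy as np
        import pywt

        def psnr(ref, test, peak=255.0):
            """Peak signal-to-noise ratio in dB."""
            mse = np.mean((np.asarray(ref, float) - np.asarray(test, float)) ** 2)
            return 10.0 * np.log10(peak ** 2 / mse)

        def dwt_upscale(img):
            """2x enlargement by treating the image as the LL band and
            inverting the DWT with zeroed detail bands."""
            ll = np.asarray(img, float)
            zeros = np.zeros_like(ll)
            return pywt.idwt2((ll, (zeros, zeros, zeros)), 'haar')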

  8. Digital preprocessing and classification of multispectral earth observation data

    Science.gov (United States)

    Anuta, P. E.

    1976-01-01

    The development of airborne and satellite multispectral image scanning sensors has generated widespread interest in applying these sensors to earth resource mapping. These point-scanning sensors permit scenes to be imaged in a large number of electromagnetic energy bands between 0.3 and 15 micrometers. The energy sensed in each band can be used as a feature in a computer-based multi-dimensional pattern recognition process to aid in interpreting the nature of elements in the scene. Images from each band can also be interpreted visually. Visual interpretation of five or ten multispectral images simultaneously becomes impractical, especially as the area studied increases; hence, great emphasis has been placed on machine (computer) techniques for aiding the interpretation process. This paper describes a computer software system concept called LARSYS for the analysis of multivariate image data and presents some examples of its application.

  9. Preprocessing of side-looking airborne radar data.

    NARCIS (Netherlands)

    Hoogeboom, P.

    1983-01-01

    Studies on microwave surface scattering in The Netherlands have indicated the need for accurate radar systems for applications in remote sensing. An SLAR system with digital recording was developed and is now being used for several programmes. This system was designed with special attention to speckle …

  10. Data screening and preprocessing for Landsat MSS data

    Science.gov (United States)

    Lambeck, P. F.; Kauth, R.; Thomas, G. S.

    1978-01-01

    Two computer algorithms are presented. The first, called SCREEN, is used to automatically identify pixels representing clouds, cloud shadows, snow, water, or anomalous signals in Landsat-2 data. The second, called XSTAR, compensates Landsat-2 data for the effects of atmospheric haze, without requiring ground measurements or ground references. The presentation of these algorithms includes their theoretical background, algebraic details, and performance characteristics. Verification of the algorithms has for the present been limited to Landsat agricultural data. Plans for further development of the XSTAR technique are also presented.

  11. A technique for real-time data preprocessing

    Science.gov (United States)

    Schaffner, M. R.

    1977-01-01

    A processing system is presented that implements simultaneously the efficiency of the special-purpose processor and the total applicability of the general-purpose computer - characteristics commonly thought of as being mutually exclusive. The solution adopted is that of specializing the machine by programming the hardware structure, rather than by adding software systems to it. Data are organized in circulating pages which form a plurality of local dynamic memories for each process. Programs are made up of modules, each describing a transient special-purpose machine. Applications to real-time processing of radar signals are referred to.

  12. An Approach to Fingerprint Image Pre-Processing

    Directory of Open Access Journals (Sweden)

    Om Preeti Chaurasia

    2012-07-01

    In this paper we have used all existing algorithms: when a fingerprint image is captured, it is passed through all the algorithms arranged in a particular order. We found that if we process a fingerprint in this particular order, the final output is good enough for minutiae detection and feature extraction. Many experiments on fingerprint images showed that this particular order of processing produces better results, under the assumption that the quality of the captured image is good enough; we have not worked on image quality enhancement. So if the input image is good, our method will produce good output. Of course this is a limitation of the proposed method, but if the image is captured using a good-quality device, our method will produce output of quality equal to other existing techniques.

  13. Improved Mainlobe Interference Suppression Based on Blocking Matrix Preprocess

    National Research Council Canada - National Science Library

    Yang, Jie; Liu, Congfeng

    2015-01-01

    ... on the combination of diagonal loading and linear constraints. Therein, the reason for mainlobe direction shifting is analyzed and found to be that a covariance matrix mismatch exists in the realization of the adaptive beamforming...

  14. Preprocessing and exploratory analysis of chromatographic profiles of plant extracts

    NARCIS (Netherlands)

    Hendriks, M.M.W.B.; Cruz-Juarez, L.; de Bont, D.; Hall, R.D.

    2005-01-01

    The characterization of herbal extracts to compare samples from different origins is important for robust production and quality control strategies. This characterization is now mainly performed by analysis of selected marker compounds. Metabolic fingerprinting of full metabolite profiles of plant extracts …

  15. Krylov subspace method based on data preprocessing technology

    Institute of Scientific and Technical Information of China (English)

    2008-01-01

    The performance of adaptive beamforming techniques is limited by nonhomogeneous clutter scenarios. An augmented Krylov subspace method is proposed which utilizes only a single snapshot of the data for adaptive processing. The algorithm combines a data preprocessor with an adaptive Krylov subspace algorithm, where the data preprocessor suppresses discrete interference and the adaptive Krylov subspace algorithm suppresses homogeneous clutter. The method uses a single snapshot of the data received by the array antenna to generate a cancellation matrix that does not contain the signal of interest (SOI) component; thus, it mitigates the problem of a highly nonstationary clutter environment and helps the system operate in real time. The benefit of not requiring training data comes at the cost of a reduced degree of freedom (DOF) of the system. Simulations illustrate the effectiveness in clutter suppression and adaptive beamforming, and the numerical results show good agreement with the proposed theory.

  16. Saliency-Based Fidelity Adaptation Preprocessing for Video Coding

    Institute of Scientific and Technical Information of China (English)

    Shao-Ping Lu; Song-Hai Zhang

    2011-01-01

    In this paper, we present a video coding scheme which applies the technique of visual saliency computation to adjust image fidelity before compression. To extract visually salient features, we construct a spatio-temporal saliency map by analyzing the video using a combined bottom-up and top-down visual saliency model. We then use an extended bilateral filter, in which the local intensity and spatial scales are adjusted according to visual saliency, to adaptively alter the image fidelity. Our implementation is based on the H.264 video encoder JM12.0. Besides evaluating our scheme with the H.264 reference software, we also compare it to a more traditional foreground-background segmentation-based method and a foveation-based approach which employs Gaussian blurring. Our results show that the proposed algorithm can improve the compression ratio significantly while effectively preserving perceptual visual quality.

  17. Texture classification using wavelet preprocessing and vector quantization

    Science.gov (United States)

    Lam, Eric P.

    2007-04-01

    In this paper, we discuss a technique for texture image classification using wavelet decomposition with selective wavelet packet node decomposition. This approach uses a two-channel wavelet decomposition extended to two dimensions. Selective wavelet decomposition is controlled using a strength metric, which either allows further decomposition or terminates the recursion; the decision to continue decomposing is based on each subband's strength with respect to the strengths of the other subbands at the same wavelet decomposition level. Once the decompositions stop, the structure of the packet is stored in a data structure, from which dominating channels are extracted; these are defined as paths from the root of the packet to the leaves with the highest strengths. The list of dominating channels is used to train a learning vector quantization neural network.
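
    A simplified Python/PyWavelets rendering of the selective packet decomposition: variance serves as a stand-in strength metric, and the split criterion is a plausible reading of the abstract rather than the authors' exact rule.

        import numpy as np
        import pywt

        def selective_decompose(img, level=0, max_level=3, path='', out=None):
            """Recursively split subbands with dwt2, continuing only where a
            subband's strength (variance, as a stand-in metric) reaches the
            mean strength of its siblings; leaves are recorded as channels."""
            if out is None:
                out = {}
            cA, (cH, cV, cD) = pywt.dwt2(np.asarray(img, float), 'db2')
            bands = {'a': cA, 'h': cH, 'v': cV, 'd': cD}
            mean_s = np.mean([b.var() for b in bands.values()])
            for key, band in bands.items():
                if level + 1 < max_level and band.var() >= mean_s:
                    selective_decompose(band, level + 1, max_level,
                                        path + key, out)
                else:
                    out[path + key] = band.var()
            return out

        # channels = selective_decompose(texture_image)   # hypothetical input
        # dominating = sorted(channels.items(), key=lambda kv: -kv[1])[:8]
        # the top paths ("dominating channels") would feed the LVQ classifier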

  18. Prediction of speech intelligibility based on an auditory preprocessing model

    DEFF Research Database (Denmark)

    Christiansen, Claus Forup Corlin; Pedersen, Michael Syskind; Dau, Torsten

    2010-01-01

    … in noise experiment was used for training and an ideal binary mask experiment was used for evaluation. All three models were able to capture the trends in the speech-in-noise training data well, but the proposed model provides a better prediction of the binary mask test data, particularly when the binary masks degenerate to a noise vocoder.

  19. Ash Reduction of Corn Stover by Mild Hydrothermal Preprocessing

    Energy Technology Data Exchange (ETDEWEB)

    M. Toufiq Reza; Rachel Emerson; M. Helal Uddin; Garold Gresham; Charles J. Coronella

    2014-04-22

    Lignocellulosic biomass such as corn stover can contain high ash content, which may act as an inhibitor in downstream conversion processes. Most of the structural ash in biomass is located in the cross-linked structure of lignin, which is mildly reactive in basic solutions. Four organic acids (formic, oxalic, tartaric, and citric) were evaluated for effectiveness in ash reduction, with limited success. Because of its chelating and basic characteristics, sodium citrate is effective in ash removal: more than 75% of structural and 85% of whole ash was removed from the biomass by treatment with 0.1 g of sodium citrate per gram of biomass at 130 °C and 2.7 bar. FTIR, fiber analysis, and chemical analyses show that cellulose and hemicellulose were unaffected by the treatment. ICP-AES showed that all measured inorganics within the biomass feedstock were reduced, except sodium, due to the addition of Na through the treatment. Sodium citrate addition to the preconversion processing of corn stover is an effective way to reduce the physiological ash content of the feedstock without negatively impacting carbohydrate and lignin content.

  20. SPPAM - Statistical PreProcessing AlgorithM

    CERN Document Server

    Silva, Tiago

    2011-01-01

    Most machine learning tools work with a single table where each row is an instance and each column is an attribute, and each cell contains an attribute value for an instance. This representation prevents one important form of learning, namely classification based on groups of correlated records, such as multiple exams of a single patient, internet customer preferences, weather forecasts, or prediction of sea conditions for a given day. To some extent, relational learning methods, such as inductive logic programming, can capture this correlation through intensional predicates added to the background knowledge. In this work, we propose SPPAM, an algorithm that aggregates past observations into one single record. We show that applying SPPAM to the original correlated data, before the learning task, can produce classifiers that are better than ones trained using all records.
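
    The aggregation idea, collapsing correlated records into a single row before conventional learning, can be illustrated with pandas; the columns and summary statistics below are hypothetical, not SPPAM's actual aggregation operators.

        import pandas as pd

        # Many correlated rows per entity (e.g. repeated exams of one
        # patient) are collapsed into one record of summary attributes.
        exams = pd.DataFrame({
            'patient':  [1, 1, 1, 2, 2],
            'glucose':  [95, 110, 140, 88, 90],
            'pressure': [120, 135, 150, 118, 121],
        })
        aggregated = exams.groupby('patient').agg(['mean', 'max', 'last'])
        aggregated.columns = ['_'.join(c) for c in aggregated.columns]
        # 'aggregated' now has one row per patient and can be fed to any
        # single-table classifier.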

  1. Integrated Analytical Evaluation and Optimization of Model Parameters against Preprocessed Measurement Data

    Science.gov (United States)

    1989-06-23

    … language plotting library distributed by Scientific Endeavors Corporation was used to create a three-dimensional plotting program (3DSURF) suitable for the … process are S-Cubed and GL/PHK. 3.1.1 System Implementation and Operation at GL: In 1984 a Ridge 32C computer was acquired by PHK for dedicated POLAR and … acquired by PHK, we developed a program called IRMA (for IRis MAnipulater) to provide three-dimensional color display of modeling results. Such a program is …

  2. THREE PRE-PROCESSING STEPS TO INCREASE THE QUALITY OF KINECT RANGE DATA

    Directory of Open Access Journals (Sweden)

    M. Davoodianidaliki

    2013-09-01

    With technology developing at the current rate and the increasing use of active sensors in close-range photogrammetry and computer vision, range images are the main new data type added to the existing collection. Though the main product of these data is the point cloud, range images can themselves be considered important pieces of information: being a bridge between 2D and 3D data gives them unique and important attributes. Three such properties are exploited in this study. The first is the "neighborhood of null pixels", which adds a new field describing parameter accuracy to the point cloud. This field can later be used for data registration and integration: when points from different stations conflict, those with the lower accuracy field can be discarded. Next, polynomial fitting to known planar regions is applied. This step can smooth the final point cloud and applies only to some applications; classification and region tracking in a series of images is needed for this process to be applicable. Finally, there are break-lines created by errors in the data transfer software. A break-line is caused by the loss of some pixels during data transfer and storage, and the image shifts along the break-line. This error usually occurs when the camera moves fast and the processor cannot handle the transfer process entirely. The proposed method is based on edge detection, where horizontal lines are used to recognize the break-line and near-vertical lines are used to determine the shift value.
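
    The first property, scoring each pixel by the null pixels in its neighbourhood, reduces to a box filter over the invalid-depth mask. A small NumPy/SciPy sketch follows; the window size and the zero-means-invalid convention are assumptions.

        import numpy as np
        from scipy.ndimage import uniform_filter

        def null_neighbourhood_field(depth, win=5):
            """Fraction of invalid (zero) depth pixels in a win x win
            neighbourhood of each pixel; attached to the point cloud it
            gives a per-point accuracy field like the one described."""
            null_mask = (depth == 0).astype(float)
            return uniform_filter(null_mask, size=win)

        # depth = np.load('kinect_frame.npy')      # hypothetical range image
        # accuracy = 1.0 - null_neighbourhood_field(depth)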

  3. Performance Evaluation of Database Partitioning with a Classification Method in Data Mining Preprocessing

    Directory of Open Access Journals (Sweden)

    Dedi Gunawan

    2016-06-01

    A transaction database is a record of the transactions made by consumers when they buy products in shops, malls or other places. These transaction data are very useful when the business owner wants to analyze consumer transactions. In general, transaction data are analyzed using data mining techniques such as classification, clustering or prediction in order to give the data owner information of higher value. Analyzing transaction data is not easy when the data set is very large, so data pre-processing must be performed first. One pre-processing technique for coping with a large database is to divide it into several parts, which speeds up data scanning when a data mining algorithm is applied. The database can be partitioned based on a classification of the item types in the consumers' transactions, or partitioned automatically into several parts without regard to the items inside. In this study, we compare the performance of these two database partitioning models; performance is measured by the computation time and the amount of memory used in the partitioning process.
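
    The two partitioning models can be contrasted with a small Python sketch; the toy transactions, the class assignment and the timing harness are illustrative stand-ins for the study's setup, and memory accounting is omitted.

        import time
        from collections import defaultdict

        def partition_by_class(transactions, item_class, n_parts):
            """Class-based split: group transactions by the class of their
            first item (a simplification of the classification scheme)."""
            parts = defaultdict(list)
            for t in transactions:
                parts[hash(item_class[t[0]]) % n_parts].append(t)
            return list(parts.values())

        def partition_blind(transactions, n_parts):
            """Automatic split: equal-sized chunks, ignoring item semantics."""
            k = -(-len(transactions) // n_parts)   # ceiling division
            return [transactions[i:i + k]
                    for i in range(0, len(transactions), k)]

        transactions = [('milk', 'bread'), ('beer', 'chips')] * 50000  # toy data
        item_class = {'milk': 'dairy', 'beer': 'drinks'}
        for fn, args in [(partition_by_class, (transactions, item_class, 4)),
                         (partition_blind, (transactions, 4))]:
            t0 = time.perf_counter()
            fn(*args)
            print(fn.__name__, time.perf_counter() - t0, 's')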

  4. Development of integrated superconducting devices for signal preprocessing. Final report; Entwicklung supraleitender Bausteine der Signalvorverarbeitung. Abschlussbericht

    Energy Technology Data Exchange (ETDEWEB)

    Biehl, M.; Koch, R.; Neuhaus, M.; Scherer, T.; Jutzi, W.

    1998-02-01

    SPICE- and CADENCE-based tools for designing, simulating and optimizing SFQ and RSFQ circuits have been developed, as well as a standard cell library corresponding to the fabrication technology established at the IEGI. A 12-bit flux shuttle shift register using Nb/Al Josephson junctions with a new write and readout gate has been fabricated and tested successfully; the power dissipation is 9 nW/bit/GHz. A pseudo-random pulse generator was developed correspondingly. Simulations of RSFQ toggle flip-flops over a large number of clock cycles demonstrated that the digital performance of counters is limited to clock frequencies below 100 GHz by dynamic effects, especially parasitic inductances. Therefore, dc measurements based on the voltage-frequency Josephson relationship must be followed by real-time measurements of single SFQ word pulses. A four-stage Nb-based RSFQ counter in a coplanar waveguide test jig was tested up to a frequency of 2 GHz, limited by the available 32-bit pattern generator and the bandwidth of the sampling oscilloscope, yielding bit error rates of about 10^-12. Using YBCO technology, a 4-bit SFQ shift register (T = 40 K) as well as miniaturized coplanar microwave devices for satellite and communication applications at 10 GHz (T = 77 K) have been designed and fabricated. A 4-bit instantaneous real-time frequency meter (IFM) and a microwave filter with a 3-dB bandwidth of only 1.8% have been mounted on the cold head of a split-cycle Stirling cooler (AEG, 1.5 W at 80 K) and tested successfully. Hybrid devices, e.g. amplifiers and oscillators, combining active semiconductor components and low-loss coplanar YBCO transmission lines operated at 77 K, seem very promising.

  5. A VISUAL BASIC program to pre-process MRI data for finite element modeling.

    Science.gov (United States)

    Todd, B A; Wang, H

    1996-11-01

    Investigators use non-invasive imaging to collect geometric data for finite element models. A preprocessor is described that facilitates model generation of anatomical regions from serial Magnetic Resonance Imaging (MRI) data stored in a bitmap format. The MRI Data Transfer System is a stand-alone Windows-based program developed in VISUAL BASIC 3.0 which generates a NASTRAN input file; the program can be modified to generate input files for other solvers. The software will execute on any IBM-compatible computer running Windows version 3.1 or higher. To demonstrate the software, model generation of a portion of a tibia is described.

  6. A novel passive microfluidic device for preprocessing whole blood for point of care diagnostics

    DEFF Research Database (Denmark)

    Shah, Pranjul Jaykumar; Dimaki, Maria; Svendsen, Winnie Edith

    2009-01-01

    A novel strategy is presented for sorting the cells of interest (white blood cells, leukocytes) by selectively lysing the red blood cells (erythrocytes) in a miniaturized microfluidic device. Various methods exist to lyse cells on a chip, i.e. electrical, mechanical, chemical and thermal, but they need the integration of electrodes, traps, reservoirs, heaters, etc., which is often difficult at the microscale [1-4]. The FACSlyse protocol, on the other hand, uses only osmotic pressure to lyse erythrocytes, allowing further isolation of leukocytes. This motivated us to develop a novel herringbone-based lyser which works …

  7. Preprocessing of gravity gradients at the GOCE high-level processing facility

    NARCIS (Netherlands)

    Bouman, J.; Rispens, S.; Gruber, T.; Koop, R.; Schrama, E.; Visser, P.; Tscherning, C.C.; Veicherts, M.

    2008-01-01

    The gravity gradients are one of the products derived from the gravity field and steady-state ocean circulation explorer (GOCE) observations. These gravity gradients are provided in the gradiometer reference frame (GRF) and are calibrated in-flight using satellite shaking and star sensor data. To …

  8. Viability assessment of regional biomass pre-processing center based bioethanol value chains

    Science.gov (United States)

    Carolan, Joseph E.

    Petroleum accounts for 94% of all liquid fuels and 36% of the total energy consumed in the United States. Petroleum dependence is problematic because global petroleum reserves are estimated to last only 40 to 60 years at current consumption rates, global supplies are often located in politically unstable or unfriendly regions, and fossil fuels have negative environmental footprints. Domestic policies have aimed at promoting alternative, renewable liquid fuels, specifically biofuels derived from organic matter. Cellulosic bioethanol is one promising alternative fuel that has featured prominently in federal biofuel mandates under the Energy Independence and Security Act of 2007. However, the cellulosic bioethanol industry faces several technical, physical and industrial organization challenges. This dissertation examines the concept of a network of regional biomass pre-treatment centers (RBPC) forming an extended biomass supply chain that feeds into a simplified biorefinery as a way to overcome these challenges. The analyses conducted address the structural and transactional issues facing bioethanol value chain establishment; the technical and financial feasibility of a stand-alone pre-treatment center (RBPC); the impact of distributed pre-treatment on biomass transport costs; and a comparative systems cost evaluation of the performance of the RBPC chain versus a fully integrated biorefinery, followed by application of the analytical framework to three case study regions.

  9. Research on Gaussian distribution preprocess method of infrared multispectral image background clutter

    Institute of Scientific and Technical Information of China (English)

    张伟; 武春风; 邓盼; 范宁

    2004-01-01

    This paper introduces a sliding-window mean removal high-pass filter by which the background clutter of an infrared multispectral image is obtained. The optimum size of the sliding window is selected based on the skewness-kurtosis test. Finally, a multivariate Gaussian mathematical expression for the background clutter image is given.
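
    A compact Python/SciPy sketch of both steps, the mean-removal high-pass filter and a skewness-kurtosis criterion for choosing the window size; the candidate window sizes and the combined score are assumptions, not the paper's exact test.

        import numpy as np
        from scipy.ndimage import uniform_filter
        from scipy.stats import kurtosis, skew

        def mean_removal_clutter(img, win):
            """High-pass clutter image: original minus its local win x win mean."""
            f = np.asarray(img, float)
            return f - uniform_filter(f, size=win)

        def best_window(img, candidates=(5, 9, 13, 17, 21)):
            """Choose the window whose clutter looks most Gaussian, scored
            by |skewness| + |excess kurtosis| (one way to apply the test)."""
            def score(w):
                c = mean_removal_clutter(img, w).ravel()
                return abs(skew(c)) + abs(kurtosis(c))
            return min(candidates, key=score)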

  10. Segmentation of Pre-processed Medical Images: An Approach Based on Range Filter

    Directory of Open Access Journals (Sweden)

    Amir Rajaei

    2012-09-01

    Medical image segmentation is a frequent processing step, and medical images suffer from unrelated artifacts and strong speckle noise. In this paper, we propose an approach to remove special markings, such as arrow symbols and printed text, along with medical image segmentation using a range filter. The special markings are extracted using the Sobel edge detection technique, and the intensity values of the detected markings are then substituted by the intensity values of their corresponding neighborhood pixels. Next, three different image enhancement techniques are utilized to remove strong speckle noise and enhance the weak boundaries of the medical images. Finally, a range filter is applied to segment the texture content of different modalities of medical images. Experiments conducted on the ImageCLEF2010 database show the efficacy of the proposed approach, which can lead to precise content-based medical image classification and retrieval systems.
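
    Two of the building blocks, marking removal seeded by Sobel edges and the range filter itself (local maximum minus local minimum), sketched with SciPy; the edge threshold and window sizes are hypothetical.

        import numpy as np
        from scipy import ndimage

        def range_filter(img, win=5):
            """Local maximum minus local minimum: responds strongly in
            textured regions, the segmentation cue used above."""
            f = np.asarray(img, float)
            return ndimage.maximum_filter(f, win) - ndimage.minimum_filter(f, win)

        def remove_markings(img, edge_thresh=200.0):
            """Flag bright overlays (arrows, printed text) via a Sobel edge
            map and replace them with the local median of their neighbourhood."""
            f = np.asarray(img, float)
            edges = np.hypot(ndimage.sobel(f, axis=0), ndimage.sobel(f, axis=1))
            mask = edges > edge_thresh
            return np.where(mask, ndimage.median_filter(f, size=7), f)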

  11. Improving Overhead Computation and pre-processing Time for Grid Scheduling System

    CERN Document Server

    Bouyer, Asgarali; Abdullah, Abdul Hanan

    2010-01-01

    Computational Grids are enormous environments with heterogeneous resources and stable infrastructures, among other Internet-based computing systems. However, managing the resources in such systems has its own special problems: scheduler systems need the latest information about participant nodes from information centers for the purpose of reliable job scheduling. In this paper, we focus on updating resource information centers online with processed data based on an assumed hierarchical model. A hybrid knowledge extraction method is used to classify grid nodes based on the prediction of job features. An affirmative point of this research is that scheduler systems do not waste extra time getting up-to-date information about grid nodes. The experimental results show the advantages of our approach compared to other conservative methods, especially its ability to predict the behavior of nodes based on comprehensive data tables maintained on each node.

  12. Interactions Between Pre-Processing and Classification Methods for Event-Related-Potential Classification

    NARCIS (Netherlands)

    Farquhar, J.D.R.; Hill, N.J.

    2013-01-01

    Detecting event-related potentials (ERPs) from single trials is critical to the operation of many stimulus-driven brain-computer interface (BCI) systems. The low strength of the ERP signal compared to the noise (due to artifacts and BCI-irrelevant brain processes) makes this a challenging signal detection …

  13. Artifact Removal from Biosignal using Fixed Point ICA Algorithm for Pre-processing in Biometric Recognition

    Science.gov (United States)

    Mishra, Puneet; Singla, Sunil Kumar

    2013-01-01

    In the modern world of automation, biological signals, especially the electroencephalogram (EEG) and electrocardiogram (ECG), are gaining wide attention as a source of biometric information. Earlier studies have shown that EEG and ECG vary across individuals, and every individual has a distinct EEG and ECG spectrum. EEG (which can be recorded from the scalp due to the activity of millions of neurons) may contain noise signals such as eye blinks, eye movement, muscular movement, line noise, etc. Similarly, ECG may contain artifacts like line noise, tremor artifacts, baseline wandering, etc. These noise signals must be separated from the EEG and ECG signals to obtain accurate results. This paper proposes a technique for the removal of the eye blink artifact from EEG and ECG signals using the fixed-point or FastICA algorithm of Independent Component Analysis (ICA). For validation, the FastICA algorithm has been applied to a synthetic signal prepared by adding random noise to an ECG signal. FastICA separates the signal into two independent components, i.e. the pure ECG and the artifact signal. Similarly, the same algorithm has been applied to remove artifacts (electrooculogram or eye blinks) from the EEG signal.
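
    A minimal sketch of the separate-and-remove idea with FastICA follows, assuming two synthetic observed mixtures (a stand-in biosignal plus a sparse blink-like artifact); the mixing matrix, signals, and correlation-based component selection are illustrative assumptions, not the paper's validation setup.

```python
# Minimal FastICA artifact-removal sketch on synthetic data.
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(0)
t = np.linspace(0, 10, 2000)
signal = np.sin(2 * np.pi * 1.0 * t)                                  # stand-in "clean" biosignal
blinks = (np.abs(np.sin(2 * np.pi * 0.2 * t)) > 0.99).astype(float)   # sparse blink-like artifact
sources = np.c_[signal, blinks]

mixing = np.array([[1.0, 0.6], [0.4, 1.0]])                           # hypothetical mixing matrix
observed = sources @ mixing.T + 0.01 * rng.standard_normal((len(t), 2))

ica = FastICA(n_components=2, random_state=0)
components = ica.fit_transform(observed)                              # columns ~ independent components

# Zero out the component most correlated with the artifact, then reconstruct.
artifact_idx = np.argmax(
    [abs(np.corrcoef(components[:, k], blinks)[0, 1]) for k in range(2)])
components[:, artifact_idx] = 0
cleaned = ica.inverse_transform(components)
```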

  14. Erosion risk assessment in the southern Amazon - Data Preprocessing, data base application and process based modelling

    Science.gov (United States)

    Schindewolf, Marcus; Herrmann, Marie-Kristin; Herrmann, Anne-Katrin; Schultze, Nico; Amorim, Ricardo S. S.; Schmidt, Jürgen

    2015-04-01

    The study region along the BR 16 highway belongs to the "Deforestation Arc" at the southern border of the Amazon rainforest. At the same time, it incorporates a land use gradient, as colonization started in 1975-1990 in central Mato Grosso, in 1990 in northern Mato Grosso, and most recently in 2004-2005 in southern Pará. Based on present knowledge, soil erosion is one of the key drivers of soil degradation. Hence, there is a strong need to implement soil erosion control measures in eroding landscapes. Planning and dimensioning such measures requires reliable and detailed information on the temporal and spatial distribution of soil loss, sediment transport and deposition. Soil erosion models are increasingly used in order to simulate the physical processes involved and to predict the effects of soil erosion control measures. The process-based EROSION 3D simulation model is used for surveying soil erosion and deposition in regional catchments. Although EROSION 3D is a widespread, extensively validated model, its application on a regional scale remains challenging due to the enormous data requirements and complex data processing operations. In this context, the study includes the compilation, validation and generalisation of existing land use and soil data in order to generate a consistent EROSION 3D input dataset. As part of this process, a GIS-linked database application allows the original soil and land use data to be transferred into model-specific parameter files. This combined methodology provides different risk assessment maps for certain demands on a regional scale. Besides soil loss and sediment transport, sediment pass-over points into surface water bodies and particle enrichment can be simulated using the EROSION 3D model. Thus the estimation of particle-bound nutrient and pollutant inputs into surface water bodies becomes possible. The study resulted in a user-friendly, time-saving and improved software package for the simulation of soil loss and deposition on a regional scale, providing essential information for the planning of soil and water conservation measures, particularly under consideration of expected land use and climate changes.

  15. Unions of Onions: Preprocessing Imprecise Points for Fast Onion Layer Decomposition

    NARCIS (Netherlands)

    Löffler, Maarten|info:eu-repo/dai/nl/304836710; Mulzer, Wolfgang

    2014-01-01

    Let D be a set of n pairwise disjoint unit disks in the plane. We describe how to build a data structure for D so that for any point set P containing exactly one point from each disk, we can quickly find the onion decomposition (convex layers) of P. Our data structure can be built in O(n log n) time.
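
    For orientation, the sketch below computes an onion decomposition directly by repeatedly peeling convex hulls, assuming points in general position; the paper's actual contribution, a preprocessing structure over the disks that makes this fast for any instantiation of the imprecise points, is not reproduced here.

```python
# Minimal onion (convex-layer) decomposition by repeated hull peeling.
import numpy as np
from scipy.spatial import ConvexHull

def onion_layers(points):
    """Return the convex layers of a 2-D point set, outermost first."""
    pts = np.asarray(points, dtype=float)
    layers = []
    while len(pts) >= 3:
        hull = ConvexHull(pts)
        layers.append(pts[hull.vertices])           # current outermost layer
        pts = np.delete(pts, hull.vertices, axis=0) # peel it off
    if len(pts):
        layers.append(pts)                          # degenerate final layer (< 3 points)
    return layers
```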

  16. Sabah snake grass extract pre-processing: Preliminary studies in drying and fermentation

    Science.gov (United States)

    Solibun, A.; Sivakumar, K.

    2016-06-01

    Clinacanthus nutans (Burm. F.) Lindau, which is also known as 'Sabah Snake Grass' among Malaysians, has been studied in Asian countries in terms of its medicinal and chemical properties, and is used to treat various diseases from cancer to viral diseases such as varicella-zoster virus lesions. Traditionally, this plant has been used by the locals to treat insect and snake bites, skin rashes, diabetes and dysentery. In Malaysia, the fresh leaves of this plant are usually boiled with water and consumed as herbal tea. The objectives of this study are to determine the key process parameters for Sabah Snake Grass fermentation which affect the chemical and biological constituent concentrations within the tea, the extraction kinetics of fermented and unfermented tea, and the optimal process parameters for the fermentation of this tea. Experimental methods such as drying, fermenting and extraction of C. nutans leaves were conducted before subjecting the samples to analysis of antioxidant capacity. Conventional oven-dried (40, 45 and 50°C) and fermented (6, 12 and 18 hours) whole C. nutans leaves were subjected to tea infusion extraction (water temperature of 80°C for 90 minutes) and the sample liquid was drawn at the 5th, 10th, 15th, 25th, 40th, 60th and 90th minute. Analyses of antioxidant capacity and total phenolic content (TPC) were conducted using 2,2-diphenyl-1-picryl-hydrazyl (DPPH) and Folin-Ciocalteu reagent, respectively. The 40°C dried leaves sample produced the highest phenolic content, with an absorbance value of 0.1344 at 15 minutes of extraction, while the 50°C dried leaves sample produced an absorbance value of 0.1298 at 10 minutes of extraction. The highest antioxidant content was produced by the 50°C dried leaves sample, with an absorbance value of 1.6299 at 5 minutes of extraction. For the 40°C dried leaves sample, the highest antioxidant content was observed at 25 minutes of extraction, with an absorbance value of 1.1456. In the 18-hour fermentation sample, the largest observed disc had expanded from a pile size of 3 cm to a diameter of 5.9 cm, indicating microbial growth.

  17. A System Design Tool for Automatically Generating Flowcharts and Preprocessing Pascal.

    Science.gov (United States)

    1979-12-01

    required for use in all programming courses as an aid in teaching program structure prior to developing code in any language. As a student, this author...hard copy device to print copies of the graphic display’s output. The shared printer is difficult to use. The procedure of unplugging the cable

  18. Gaia Data Release 1, Pre-processing and source list creation

    CERN Document Server

    Fabricius, C.; Portell, J.; et al.

    2016-01-01

    The first data release from the Gaia mission contains accurate positions and magnitudes for more than a billion sources, and proper motions and parallaxes for the majority of the 2.5 million Hipparcos and Tycho-2 stars. We describe three essential elements of the initial data treatment leading to this catalogue: the image analysis, the construction of a source list, and the near real-time monitoring of the payload health. We also discuss some weak points that set limitations for the attainable precision at the present stage of the mission. Image parameters for point sources are derived from one-dimensional scans, using a maximum likelihood method, under the assumption of a line spread function constant in time, and a complete modelling of bias and background. These conditions are, however, not completely fulfilled. The Gaia source list is built starting from a large ground-based catalogue, but even so a significant number of new entries have been added, and a large number have been removed. The autonomous onb...
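
    The maximum-likelihood step can be pictured with a toy one-dimensional fit: assume a line spread function and Poisson counts, and minimize the negative log-likelihood over flux and centroid. Everything below (the Gaussian LSF shape, the fixed background and width, the optimizer) is an illustrative assumption; Gaia's calibrated LSF model is far more involved.

```python
# Toy maximum-likelihood fit of a point source in a 1-D scan (Poisson counts).
import numpy as np
from scipy.optimize import minimize

pixels = np.arange(12)

def model(params, background=2.0, sigma=1.2):
    """Expected counts: flat background plus a Gaussian LSF scaled by flux."""
    flux, centre = params
    lsf = np.exp(-0.5 * ((pixels - centre) / sigma) ** 2)
    lsf /= lsf.sum()
    return background + flux * lsf

def neg_log_likelihood(params, counts):
    mu = np.maximum(model(params), 1e-9)       # guard against non-physical values
    return np.sum(mu - counts * np.log(mu))    # Poisson NLL up to a constant

rng = np.random.default_rng(1)
counts = rng.poisson(model((500.0, 5.3)))      # simulated scan with known truth
fit = minimize(neg_log_likelihood, x0=(counts.sum(), 6.0), args=(counts,),
               method="Nelder-Mead")
flux_hat, centre_hat = fit.x
```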

  19. MeteoIO 2.4.2: a preprocessing library for meteorological data

    Directory of Open Access Journals (Sweden)

    M. Bavay

    2014-06-01

    The MeteoIO library (i) provides a uniform interface to meteorological data in the models; (ii) hides the complexity of the processing taking place; and (iii) guarantees robust behaviour in case of format errors or erroneous or missing data. Moreover, in an operational context, this error handling should avoid unnecessary interruptions in the simulation process. A strong emphasis has been put on simplicity and modularity in order to make it extremely easy to support new data formats or protocols and to allow contributors with diverse backgrounds to participate. This library can also be used in the context of High Performance Computing in a parallel environment. Finally, it is released under an Open Source license and is available at http://models.slf.ch/p/meteoio. This paper gives an overview of the MeteoIO library from the point of view of conceptual design, architecture, features and computational performance. A scientific evaluation of the produced results is not given here, since the scientific algorithms that are used have already been published elsewhere.

  20. Image pre-processing research of coal level in underground coal pocket

    Institute of Scientific and Technical Information of China (English)

    WU Bing; GAO Na

    2008-01-01

    Mathematical morphology is widely applied in digital image processing. Various morphological constructions and algorithms have been developed for different digital image processing tasks. The basic idea of mathematical morphology is to use structuring elements to measure image morphology in order to solve image understanding problems. This article presents an advanced cellular neural network that forms a mathematical morphological cellular neural network (MMCNN) equation suited to mathematical morphology filtering. It gives theories on the MMCNN dynamic extent and stable state. It is shown that the mathematical morphology filter is realized through the steady state of the dynamic process under definite conditions.
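
    To make the morphological building blocks concrete, the sketch below applies a plain grayscale opening-then-closing filter with a flat structuring element; this is ordinary scipy morphology standing in for the filtering behaviour the MMCNN emulates, not the paper's neural network.

```python
# Minimal grayscale morphological filter with a flat 3x3 structuring element.
from scipy.ndimage import grey_closing, grey_opening

def morphological_smooth(image, size=(3, 3)):
    """Opening then closing: removes bright and dark speckle smaller than
    the window while preserving larger structures."""
    opened = grey_opening(image, size=size)
    return grey_closing(opened, size=size)
```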

  1. PepsNMR for the 1H-NMR metabolomic data pre-processing

    OpenAIRE

    Martin, Manon; Legat, Benoît; Leenders, Justine; Vanwinsberghe, Julien; Rousseau, Réjane; Boulanger, Bruno; Eilers, Paul H. C.; De Tullio, Pascal; Govaerts, Bernadette

    2017-01-01

    In the analysis of complex biological samples, control over experimental design and data acquisition procedures cannot alone ensure well-conditioned 1H-NMR spectra with maximal information recovery for data analysis. A third major element affects the accuracy and robustness of the results: the data pre-processing/pre-treatment, to which not enough attention is usually devoted, in particular in metabolomic studies. The usual approach is to use proprietary software provided by the analytica...

  2. Marine sediment sample pre-processing for macroinvertebrates metabarcoding: mechanical enrichment and homogenization

    Directory of Open Access Journals (Sweden)

    Eva Aylagas

    2016-10-01

    Metabarcoding is an accurate and cost-effective technique that allows for simultaneous taxonomic identification of multiple environmental samples. Application of this technique to marine benthic macroinvertebrate biodiversity assessment for biomonitoring purposes requires standardization of laboratory and data analysis procedures. In this context, protocols for the creation and sequencing of amplicon libraries and their related bioinformatics analysis have recently been published. However, a standardized protocol describing all previous steps (i.e., processing and manipulation of environmental samples for macroinvertebrate community characterization) is lacking. Here, we provide detailed procedures for benthic environmental sample collection, processing, enrichment for macroinvertebrates, homogenization, and subsequent DNA extraction for metabarcoding analysis. Since this is the first protocol of its kind, it should be of use to any researcher in this field and has the potential for further improvement.

  3. Multi-objective metaheuristics for preprocessing EEG data in brain-computer interfaces

    Science.gov (United States)

    Aler, Ricardo; Vega, Alicia; Galván, Inés M.; Nebro, Antonio J.

    2012-03-01

    In the field of brain-computer interfaces, one of the main issues is to classify the electroencephalogram (EEG) accurately. EEG signals have a good temporal resolution, but a low spatial one. In this article, metaheuristics are used to compute spatial filters to improve the spatial resolution. Additionally, from a physiological point of view, not all frequency bands are equally relevant. Both spatial filters and relevant frequency bands are user-dependent. In this article a multi-objective formulation for spatial filter optimization and frequency-band selection is proposed. Several multi-objective metaheuristics have been tested for this purpose. The experimental results show, in general, that multi-objective algorithms are able to select a subset of the available frequency bands, while maintaining or improving the accuracy obtained with the whole set. Also, among the different metaheuristics tested, GDE3, which is based on differential evolution, is the most useful algorithm in this context.
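
    As a loose illustration of the spatial-filter search, the sketch below optimizes one filter's channel weights with scipy's single-objective differential evolution against a CSP-style variance-ratio objective. The article's GDE3 is a multi-objective differential-evolution variant; the synthetic epochs, bounds, and objective here are all assumptions made for the sketch.

```python
# Minimal differential-evolution search for a discriminative spatial filter.
import numpy as np
from scipy.optimize import differential_evolution

def variance_ratio(w, epochs_a, epochs_b):
    """Negative share of class-A variance after spatial filtering;
    minimizing it favours filters that separate the two classes."""
    w = np.asarray(w)
    var_a = np.mean([np.var(w @ e) for e in epochs_a])
    var_b = np.mean([np.var(w @ e) for e in epochs_b])
    return -var_a / (var_a + var_b + 1e-12)

rng = np.random.default_rng(2)
n_channels, n_samples = 8, 256
epochs_a = rng.standard_normal((30, n_channels, n_samples)) * 1.5  # hypothetical class A
epochs_b = rng.standard_normal((30, n_channels, n_samples))        # hypothetical class B

result = differential_evolution(
    variance_ratio, bounds=[(-1, 1)] * n_channels,
    args=(epochs_a, epochs_b), seed=0, maxiter=50)
spatial_filter = result.x
```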

  5. Unmixing-Based Denoising as a Pre-Processing Step for Coral Reef Analysis

    Science.gov (United States)

    Cerra, D.; Traganos, D.; Gege, P.; Reinartz, P.

    2017-05-01

    Coral reefs, among the world's most biodiverse and productive submerged habitats, have faced several mass bleaching events due to climate change during the past 35 years. In the course of this century, global warming and ocean acidification are expected to cause corals to become increasingly rare on reef systems. This will result in a sharp decrease in the biodiversity of reef communities and carbonate reef structures. Coral reefs may be mapped, characterized and monitored through remote sensing. Hyperspectral images in particular excel in coral monitoring, being characterized by very rich spectral information, which results in a strong discrimination power to characterize a target of interest and separate healthy corals from bleached ones. Being submerged habitats, coral reef systems are difficult to analyse in airborne or satellite images, as relevant information is conveyed in bands in the blue range, which exhibit a lower signal-to-noise ratio (SNR) with respect to other spectral ranges; furthermore, water absorbs most of the incident solar radiation, further decreasing the SNR. Derivative features, which are important in coral analysis, are greatly affected by the noise present in the relevant spectral bands, justifying the need for new denoising techniques able to keep local spatial and spectral features. In this paper, Unmixing-based Denoising (UBD) is used to enable analysis of a hyperspectral image acquired over a coral reef system in the Red Sea based on derivative features. UBD reconstructs the dataset pixelwise with reduced noise effects, by forcing each spectrum to a linear combination of other reference spectra, exploiting the high dimensionality of hyperspectral datasets. Results show clear enhancements with respect to traditional denoising methods based on spatial and spectral smoothing, facilitating the coral detection task.
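
    The reconstruction idea can be sketched as pixelwise least squares against a set of reference spectra: projecting each pixel onto the span of the references suppresses per-band noise that is uncorrelated with them. How UBD actually selects its reference spectra is not reproduced here, and the non-negativity constraint below is an assumption of this sketch.

```python
# Minimal unmixing-style reconstruction of a hyperspectral cube via NNLS.
import numpy as np
from scipy.optimize import nnls

def ubd_reconstruct(cube, endmembers):
    """cube: (rows, cols, bands); endmembers: (n_refs, bands).
    Returns the pixelwise non-negative least-squares reconstruction."""
    rows, cols, bands = cube.shape
    flat = cube.reshape(-1, bands).astype(float)
    A = endmembers.T                      # (bands, n_refs) design matrix
    out = np.empty_like(flat)
    for i, spectrum in enumerate(flat):
        abundances, _ = nnls(A, spectrum) # non-negative mixing coefficients
        out[i] = A @ abundances           # reconstructed (denoised) spectrum
    return out.reshape(rows, cols, bands)
```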

  6. Research on Iris Image Preprocessing

    Institute of Scientific and Technical Information of China (English)

    孙树亮; 曾森森; 苏治红; 王伟杰; 林滟

    2014-01-01

    Morphology is first employed to locate the pupil; after the pupil is segmented, the inner and outer edges of the iris are detected. For the outer edge, a gradient operator is used to detect edges, and the Hough transform is then employed to locate the iris. Since the upper and lower eyelids partly occlude the top and bottom of the iris, the pupil is taken as the centre and only the iris regions within -π/4 to π/4 and 3π/4 to 5π/4 are used for analysis. Finally, a polar coordinate transformation is used to normalize the image. Experimental results demonstrate the validity of the method.
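
    As a rough sketch of the gradient-plus-Hough localization step, the snippet below uses OpenCV's HoughCircles, which applies Canny edge detection internally before voting for circles. All parameter values are illustrative assumptions, not the paper's settings.

```python
# Minimal circle-based iris localization sketch with OpenCV.
import cv2
import numpy as np

def locate_iris(gray):
    """Return (x, y, r) of the strongest circle found in a grayscale image, or None."""
    blurred = cv2.medianBlur(gray, 5)                 # suppress noise before gradients
    circles = cv2.HoughCircles(blurred, cv2.HOUGH_GRADIENT, dp=1, minDist=50,
                               param1=100,            # Canny high threshold
                               param2=30,             # accumulator threshold
                               minRadius=30, maxRadius=150)
    if circles is None:
        return None
    x, y, r = np.uint16(np.around(circles))[0, 0]     # strongest candidate
    return int(x), int(y), int(r)
```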

  7. Absorption of Arsenite on Several Iron (Hydro-)Oxides and Impact from Pre-processing Methods

    Institute of Scientific and Technical Information of China (English)

    YE Ying; JI Shanshan; WU Daidai; LI Jun; ZHANG Weirui

    2006-01-01

    The absorption reactions of arsenite on Fe (hydro-)oxides are studied. The three absorbent types are Fe(OH)3 gel and two Fe (hydro-)oxides, of which the Fe(OH)3 gel was dried in a microwave oven under vacuum at 80°C. It is found that the pH changes from 9.71 to 10.36 within 6 minutes after the Fe(OH)3 gel is mixed with NaAsO2 solution, as the arsenite replaces the OH- in goethite and Fe(OH)3. At the 40th minute after the start of the reaction, the pH decreases, most probably because the monodentate surface complex of absorbed arsenite has changed into a mononuclear bidentate complex and released a proton. The decline in pH indicates not the end of the absorption but a change in the reaction type. Temperature and dissolved gas have little effect on these two types of reactions. The total absorption of arsenite increases after the absorbent is irradiated with ultrasound, which also makes it difficult to separate the solids from the solution. The absorption capacity for arsenite of Fe(OH)3 gel dried in a microwave oven under vacuum is 53.18% and 17.22% higher than that of Fe(OH)3 gel and of gel dried at 80°C, respectively. A possible reason is that the water molecules in the gel vibrate at high frequency under microwave irradiation, producing higher porosity and improved surface activity.

  8. Atmospheric dispersion models and pre-processing of meteorological data for real-time application

    DEFF Research Database (Denmark)

    Mikkelsen, T.; Desiato, F.

    1993-01-01

    ... and selects a series of suitable local-scale atmospheric flow and dispersion models for RODOS, covering a variety of release types, terrain types and atmospheric stability conditions. The identification and ranking of suitable models is based on a discussion of principal modelling requirements, scale ... considerations, model performance and evaluation records, computational needs, user expertise, and type of sources to be modelled. Models suitable for a given accident scenario are chosen from this hierarchy in order to provide the dose assessments via the dispersion module. A forecasting feasibility ... The pre-processor provides the flow and dispersion models with on-site wind and atmospheric stability measures.

  9. Biomass Supply and Trade Opportunities of Preprocessed Biomass for Power Generation

    NARCIS (Netherlands)

    Batidzirai, B.; Junginger, M.; Klemm, M.; Schipfer, F.; Thrän, D.

    2016-01-01

    International trade of solid biomass is expected to increase significantly given the global distribution of biomass resources and the anticipated expansion of bioenergy deployment in key global power markets. Given the unique characteristics of biomass, its long-distance trade requires optimized logistics.

  10. Preprocessing of gravity gradients at the GOCE high-level processing facility

    NARCIS (Netherlands)

    Bouman, J.; Rispens, S.; Gruber, T.; Koop, R.; Schrama, E.; Visser, P.; Tscherning, C.C.; Veicherts, M.

    2008-01-01

    One of the products derived from the gravity field and steady-state ocean circulation explorer (GOCE) observations are the gravity gradients. These gravity gradients are provided in the gradiometer reference frame (GRF) and are calibrated in-flight using satellite shaking and star sensor data. To us

  11. Mining Sequential Access Pattern with Low Support From Large Pre-Processed Web Logs

    Directory of Open Access Journals (Sweden)

    S. Vijayalakshmi

    2010-01-01

    Problem statement: To find frequently occurring sequential patterns in a web log file on the basis of a provided minimum support. Web usage mining is the application of sequential pattern mining techniques to discover usage patterns from Web data, in order to understand and better serve the needs of Web-based applications; we introduce an efficient strategy for discovering such patterns. Approach: The approach adopts a divide-and-conquer pattern-growth principle. Our proposed method combines the tree projection and prefix growth features of the pattern-growth category with the position-coded feature of the early-pruning category; all of these features are key characteristics of their respective categories, so we consider our proposed method a pattern-growth, early-pruning hybrid algorithm. Results: Our proposed hybrid algorithm eliminates the need to store numerous intermediate WAP-trees during mining. Since only the original tree is stored, it drastically cuts huge memory access costs, which may include disk I/O cost in a virtual-memory environment, especially when mining very long sequences with millions of records. Conclusion: An attempt has been made to improve efficiency with our approach. Our proposed method totally eliminates reconstruction of intermediate WAP-trees during mining and considerably reduces execution time.
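
    For orientation, here is a minimal PrefixSpan-style pattern-growth miner over single-item events. The paper's hybrid adds tree projection and position coding to avoid intermediate structures, which this sketch does not reproduce; the example log and support threshold are invented.

```python
# Minimal PrefixSpan-style sequential pattern mining with a minimum support.
from collections import defaultdict

def prefixspan(sequences, min_support, prefix=()):
    """Return frequent sequential patterns (tuples) mapped to their support."""
    patterns = {}
    counts = defaultdict(int)
    for seq in sequences:                 # support = sequences containing the item
        for item in set(seq):
            counts[item] += 1
    for item, count in counts.items():
        if count < min_support:
            continue
        new_prefix = prefix + (item,)
        patterns[new_prefix] = count
        # Project each sequence onto the suffix after the first occurrence of item.
        projected = [seq[seq.index(item) + 1:] for seq in sequences if item in seq]
        patterns.update(prefixspan(projected, min_support, new_prefix))
    return patterns

# Example: page-visit sequences from a hypothetical pre-processed web log.
logs = [["home", "news", "cart"], ["home", "cart"], ["news", "cart", "home"]]
print(prefixspan(logs, min_support=2))
```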

  12. Impact of an innovated storage technology on the quality of preprocessed switchgrass bales

    Directory of Open Access Journals (Sweden)

    Christopher N. Boyer

    2016-03-01

    The purpose of this study was to determine the effects of three particle sizes of feedstock and two types of novel bale wraps on the quality of switchgrass by monitoring the chemical changes in cellulose, hemicellulose, lignin, extractives, and ash over a 225-day period. Using NIR (near-infrared) modeling to predict the chemical composition of the treated biomass, differences were found in cellulose, lignin, and ash content across switchgrass bales with different particle sizes. Enclosing bales in a net and film impacted the cellulose, lignin, and ash content. Cellulose, hemicellulose, lignin, extractives, and ash differed across the 225-day storage period. A quadratic response function better predicted the cellulose, lignin, and ash response to storage, while a linear response function best described the hemicellulose and extractives response to storage. This study yields valuable information regarding the quality of switchgrass at different intervals between the start and end date of storage, which is important to conversion facilities when determining optimal storage strategies to improve the quality of the biomass feedstock, based on the potential output yield of a bale over time.

  13. Evaluation of Two Absolute Radiometric Normalization Algorithms for Pre-processing of Landsat Imagery

    Institute of Scientific and Technical Information of China (English)

    Xu Hanqiu

    2006-01-01

    In order to evaluate radiometric normalization techniques, two image normalization algorithms for absolute radiometric correction of Landsat imagery were quantitatively compared in this paper: the Illumination Correction Model proposed by Markham and Irish and the Illumination and Atmospheric Correction Model developed by the Remote Sensing and GIS Laboratory of Utah State University. Relative noise, the correlation coefficient and the slope value were used as the criteria for the evaluation and comparison, derived from pseudo-invariant features identified in the multitemporal images. Differences between the normalized multitemporal images were significantly reduced when the seasons of the multitemporal images were different. However, there was no significant difference between the normalized and unnormalized images under similar seasonal conditions. Furthermore, the correction results of the two algorithms are similar when the images are relatively clear with a uniform atmospheric condition. Therefore, radiometric normalization procedures should be carried out if the multitemporal images have a significant seasonal difference.
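
    The evaluation criteria above (slope and correlation over pseudo-invariant features, PIFs) can be sketched alongside a simple regression-based normalization. Note that the two models compared in the paper are physically based absolute corrections; the stand-in below is a generic relative, regression-based normalization over hypothetical matched PIF samples.

```python
# Minimal PIF-based normalization and evaluation sketch.
import numpy as np

def normalize_band(subject_pifs, reference_pifs, subject_band):
    """Fit reference ~ gain * subject + offset over PIF pixels, then apply
    the fitted transform to the whole subject band."""
    gain, offset = np.polyfit(subject_pifs, reference_pifs, deg=1)
    return gain * subject_band + offset

def evaluation_stats(subject_pifs, reference_pifs):
    """Slope and correlation coefficient over PIFs, as used for comparison."""
    slope, _ = np.polyfit(subject_pifs, reference_pifs, deg=1)
    corr = np.corrcoef(subject_pifs, reference_pifs)[0, 1]
    return slope, corr
```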

  14. Data preprocessing for parameter estimation. An application to a reactive bimolecular transport model

    CERN Document Server

    Cuch, Daniel A; Hasi, Claudio D El

    2015-01-01

    In this work we are concerned with the inverse problem of estimating modeling parameters for reactive bimolecular transport, based on experimental data that are non-uniformly distributed along the interval where the process takes place. We propose a methodology that can help determine the intervals where most of the data should be taken in order to obtain a good estimation of the parameters. To reduce the cost of laboratory experiments, we propose to simulate data where they are needed but not available, a Pre-Processing Data Fitting (PPDF). We applied this strategy to the estimation of parameters for an advection-diffusion-reaction problem in a porous medium. Each step is explained in detail, and simulation results are shown and compared with previous ones.

  15. Comparing Three Preprocessing Strategies for Longitudinal Data: An Example in Functional Outcomes Research.

    Science.gov (United States)

    Yarnold, Paul R.; Feinglass, Joe; McCarthy, Walter J.; Martin, Gary J.

    1999-01-01

    Compared three methods for evaluating clinical outcomes for individual patients: (1) raw change score analysis, (2) normative statistical analysis, and (3) ipsative statistical analysis. Results with two samples of 39 and 20 patients show the ipsative method to be most consistent with a priori hypotheses evaluated for repeated-measures data. (SLD)

  16. Some current uses of array processors for preprocessing of remote sensing data

    Science.gov (United States)

    Fischel, D.

    1984-01-01

    The preparation of remotely sensed data sets into a form useful to the analyst is a significant computational task, involving the processing of spacecraft data (e.g., orbit, attitude, temperatures, etc.), decommutation of the video telemetry stream, radiometric correction and geometric correction. Many of these processes are extremely well suited for implementation on attached array processors. Currently, at Goddard Space Flight Center a number of computer systems provide such capability for earth observations or are under development as test beds for future ground segment support. Six such systems will be discussed.

  17. An investigation for the development of an integrated optical data preprocessor. [preprocessing remote sensor outputs

    Science.gov (United States)

    Verber, C. M.; Kenan, R. P.; Hartman, N. F.; Chapman, C. M.

    1980-01-01

    A laboratory model of a 16-channel integrated optical data preprocessor was fabricated and tested in response to a need for a device to evaluate the outputs of a set of remote sensors. It does this by accepting the outputs of these sensors, in parallel, as the components of a multidimensional vector descriptive of the data and comparing this vector to one or more reference vectors which are used to classify the data set. The comparison is performed by taking the difference between the signal and reference vectors. The preprocessor is wholly integrated on the surface of a LiNbO3 single crystal, with the exceptions of the source and the detector. He-Ne laser light is coupled into and out of the waveguide by prism couplers. The integrated optical circuit consists of a titanium-indiffused waveguide pattern, electrode structures and grating beam splitters. The waveguide and electrode patterns, by virtue of their complexity, make the vector subtraction device the most complex integrated optical structure fabricated to date.

  18. An efficient approach for preprocessing data from a large-scale chemical sensor array.

    Science.gov (United States)

    Leo, Marco; Distante, Cosimo; Bernabei, Mara; Persaud, Krishna

    2014-09-24

    In this paper, an artificial olfactory system (Electronic Nose) that mimics the biological olfactory system is introduced. The device consists of a Large-Scale Chemical Sensor Array (16,384 sensors, made of 24 different kinds of conducting polymer materials) that supplies data to software modules which perform advanced data processing. In particular, the paper concentrates on the software components, consisting first of a crucial step that normalizes the heterogeneous sensor data and reduces their inherent noise. The cleaned data are then supplied as input to a data reduction procedure that extracts the most informative and discriminant directions, in order to obtain an efficient representation in a lower-dimensional space where it is easier to find a robust mapping between the observed outputs and the characteristics of the odors presented to the device. Experimental qualitative proof of the validity of the procedure is given by analyzing data acquired for two different pure analytes and their binary mixtures. Moreover, a classification task is performed in order to explore the possibility of automatically recognizing pure compounds and predicting binary mixture concentrations.
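
    A minimal sketch of the normalize-then-reduce pipeline described above, assuming a (samples x sensors) response matrix and using per-channel z-scoring plus PCA as stand-ins for the device's actual normalization and projection steps.

```python
# Minimal normalize-then-reduce pipeline for a chemical sensor array.
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

def preprocess_array(responses, n_components=10):
    """Z-score each sensor channel, then project onto the leading principal
    components for a compact lower-dimensional representation."""
    scaled = StandardScaler().fit_transform(responses)
    pca = PCA(n_components=n_components)
    reduced = pca.fit_transform(scaled)
    return reduced, pca.explained_variance_ratio_
```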

  19. Preprocessing of airborne remote sensing data. Part 1: the present situation.

    NARCIS (Netherlands)

    Hoogeboom, P.

    1981-01-01

    Studies on microwave surface scattering in The Netherlands have indicated the need of accurate radar systems for applications in remote sensing. A SLAR system with digital recording was developed and is now being used for several programs. This system was designed with special attention for speckle

  20. Profiling of liquid crystal displays with Raman spectroscopy: Preprocessing of spectra.

    NARCIS (Netherlands)

    O. Stanimirovic; H.F.M. Boelens; A.J.G. Mank; H.C.J. Hoefsloot; A.K. Smilde

    2005-01-01

    Raman spectroscopy is applied for characterizing paintable displays. Few other options than Raman spectroscopy exist for doing so because of the liquid nature of functional materials. The challenge is to develop a method that can be used for estimating the composition of a single display cell on the