Murtazaev, Akai K.; Babaev, Albert B.; Magomedov, Magomed A.; Kassan-Ogly, Felix A.; Proshkin, Alexey I.
2016-11-01
Using Monte Carlo simulations, we investigated phase transitions and frustrations in the three-state Potts model on a triangular lattice with allowance for antiferromagnetic exchange interactions between nearest neighbors J1 and next-nearest neighbors J2. The ratio of the next-nearest-neighbor and nearest-neighbor exchange constants r=J2/J1 is chosen within the range 0≤r≤2. Based on the analysis of the entropy, specific heat, density of states of the system, and fourth-order Binder cumulants, phase transitions in the Potts model with interactions J1<0 and J2<0 are shown to occur in the ranges 0≤r<0.2 and 1.25≤r≤2.0. In the intermediate range 0.2≤r≤1.0, no phase transition occurs and frustration appears.
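The fourth-order Binder cumulant named in this abstract can be illustrated with a minimal sketch. It assumes the common definition U = 1 − ⟨x⁴⟩/(3⟨x²⟩²) applied to a series of sampled values x (an order parameter or an energy). The function name and the synthetic sample arrays are hypothetical and are not taken from the paper.

```python
import numpy as np

def binder_cumulant(samples):
    """Fourth-order Binder cumulant U = 1 - <x^4> / (3 <x^2>^2).

    `samples` is a 1D series of measurements (hypothetical input,
    e.g. order-parameter values recorded during a Monte Carlo run).
    """
    x = np.asarray(samples, dtype=float)
    m2 = np.mean(x ** 2)
    m4 = np.mean(x ** 4)
    return 1.0 - m4 / (3.0 * m2 ** 2)

# Perfectly ordered series: every sample identical, so U = 1 - 1/3.
u_ordered = binder_cumulant(np.ones(1000))

# Gaussian-distributed series (a stand-in for a disordered phase):
# <x^4> = 3 <x^2>^2, so U is close to zero up to sampling noise.
rng = np.random.default_rng(0)
u_disordered = binder_cumulant(rng.normal(size=200_000))
```

In practice the cumulant is evaluated for several lattice sizes and plotted against temperature. How the curves are then read depends on the order of the transition and is not specified in the abstract.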
Complex-Temperature Phase Diagrams of 1D Spin Models with Next-Nearest-Neighbor Couplings
1997-01-01
We study the dependence of complex-temperature phase diagrams on details of the Hamiltonian, focusing on the effect of non-nearest-neighbor spin-spin couplings. For this purpose, we consider a simple exactly solvable model, the 1D Ising model with nearest-neighbor (NN) and next-to-nearest-neighbor (NNN) couplings. We work out the exact phase diagrams for various values of $J_{nnn}/J_{nn}$ and compare these with the case of pure nearest-neighbor (NN) couplings. We also give some similar result...
Superconductivity in an attractive two-band Hubbard model with second nearest neighbors
Peraza-Salcedo, D. A.; Rodríguez-Núñez, J. J.; Bonalde, I.; Schmidt, A. A.
2017-04-01
This work extends the calculations performed by G. Litak, T. Örd, K. Rägo, and A. Vargunin, Physica C 483, 30 (2012), by including second nearest neighbors in an attractive two-orbital Hubbard model. We assumed that both the intra-orbital ($U_{i,i}$, with i = 1, 2) and the inter-orbital Hubbard correlations ($U_{i,j}$, with i ≠ j) are negative; namely, $U_{i,j}$ ≤ 0, ∀(i, j). We calculated the T-n phase diagram in the mean-field approximation. For a finite chemical potential ξ10 and a certain second nearest-neighbor parameter t2, superconductivity develops in two dome-like regions, each of which has its own energy gap. Notably, for t2/|t1| = 0.70 and ξ10/|t1| = 3, where t1 is the nearest-neighbor parameter, Tc becomes zero around n = 2.5.
Prediction of cavitation damage on spillway using K-nearest neighbor modeling.
Fadaei Kermani, E; Barani, G A; Ghaeini-Hessaroeyeh, M
2015-01-01
Cavitation is a common and destructive process on spillways that threatens the stability of the structure and causes damage. In this study, based on the nearest neighbor model, a method is presented to predict cavitation damage on spillways. The model was tested using data from the Shahid Abbaspour dam spillway in Iran. The level of spillway cavitation damage was predicted for eight different flow rates using the nearest neighbor model. Moreover, based on the cavitation index, five damage levels from no damage to major damage were defined. Results showed that the present model predicted damage locations and levels close to the damage observed during past floods. Finally, the efficiency and precision of the model were quantified by statistical coefficients. Appropriate values of the correlation coefficient, root mean square error, mean absolute error and coefficient of residual mass show that the present model is suitable and efficient.
Multidimensional k-nearest neighbor model based on EEMD for financial time series forecasting
Zhang, Ningning; Lin, Aijing; Shang, Pengjian
2017-07-01
In this paper, we propose a new two-stage methodology that combines ensemble empirical mode decomposition (EEMD) with a multidimensional k-nearest neighbor model (MKNN) in order to forecast the closing price and high price of stocks simultaneously. Modified k-nearest neighbors (KNN) algorithms are increasingly widely applied to prediction problems in all fields. Empirical mode decomposition (EMD) decomposes a nonlinear and non-stationary signal into a series of intrinsic mode functions (IMFs); however, because of mode mixing it cannot reveal the characteristic information of the signal with much accuracy. Ensemble empirical mode decomposition (EEMD), an improved version of EMD, resolves this weakness by adding white noise to the original data. With EEMD, components with true physical meaning can be extracted from the time series. Combining the advantages of EEMD and MKNN, the proposed EEMD-MKNN model has high predictive precision for short-term forecasting. Moreover, we extend this methodology to the two-dimensional case to forecast the closing price and high price of four stock indices (NAS, S&P500, DJI and STI) at the same time. The results indicate that the proposed EEMD-MKNN model has a higher forecast precision than EMD-KNN, the KNN method and ARIMA.
Manifestations of Isospin in Nearest Neighbor Spacing Distributions for the f-p Model Space
Quinonez, Michael; Zamick, Larry
2016-01-01
The strong interactions are charge independent. If we limit ourselves to the strong interactions, we have the isospin $T$ as a good quantum number. Here we consider the lack of level repulsion of states of different isospin and how this effect manifests in nearest neighbor spacing (NNS) histograms, which provide a visual and statistical context in which to study distributions of energy level spacings. In particular, we study nucleons in the f-p model space for the nucleus $^{44}$Ti. We also study the effect of the Coulomb interaction on the level spacing distribution.
THE AVALANCHE DYNAMICS IN RANDOM NEAREST NEIGHBOR MODELS OF EVOLUTION WITH INTERACTION STRENGTH
Institute of Scientific and Technical Information of China (English)
(no author listed)
2006-01-01
A generalized Bak-Sneppen model (BS model) of biological evolution with interaction strength θ is introduced in d-dimensional space, where the "nearest neighbors" are chosen among the 2d neighbors of the extremal site, with the probabilities related to the sizes of the fitnesses. Simulations of one- and two-dimensional models are given. For given θ > 0, the model can self-organize to a critical state, and the critical threshold fc(θ) decreases as θ increases. The exact gap equation depending on θ is presented, which reduces to the gap equation of the BS model as θ tends to infinity. An exact equation for the critical exponent γ(θ) is also obtained. Scaling relations are established among the six critical exponents of the avalanches of the model.
k-Nearest neighbor models for microarray gene expression analysis and clinical outcome prediction.
Parry, R M; Jones, W; Stokes, T H; Phan, J H; Moffitt, R A; Fang, H; Shi, L; Oberthuer, A; Fischer, M; Tong, W; Wang, M D
2010-08-01
In the clinical application of genomic data analysis and modeling, a number of factors contribute to the performance of disease classification and clinical outcome prediction. This study focuses on the k-nearest neighbor (KNN) modeling strategy and its clinical use. Although KNN is simple and clinically appealing, large performance variations were found among experienced data analysis teams in the MicroArray Quality Control Phase II (MAQC-II) project. For clinical end points and controls from breast cancer, neuroblastoma and multiple myeloma, we systematically generated 463,320 KNN models by varying feature ranking method, number of features, distance metric, number of neighbors, vote weighting and decision threshold. We identified factors that contribute to the MAQC-II project performance variation, and validated a KNN data analysis protocol using a newly generated clinical data set with 478 neuroblastoma patients. We interpreted the biological and practical significance of the derived KNN models, and compared their performance with existing clinical factors.
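The model settings varied in this study (distance metric, number of neighbors, vote weighting) can be made concrete with a minimal k-nearest-neighbor sketch. It is written in plain Python for illustration only. The function name, the parameter names and the toy data are hypothetical and do not reproduce the MAQC-II protocol, and feature ranking and the decision threshold are left out.

```python
from collections import defaultdict
import math

def knn_predict(train_X, train_y, query, k=3, metric="euclidean", weighted=False):
    """Classify `query` by a vote among its k nearest training points.

    metric   : "euclidean" or "manhattan" (two common choices)
    weighted : if True, each neighbor votes with weight 1/distance
    """
    def dist(a, b):
        if metric == "manhattan":
            return sum(abs(p - q) for p, q in zip(a, b))
        return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

    # Pair each training label with its distance to the query, keep the k closest.
    scored = sorted(((dist(x, query), y) for x, y in zip(train_X, train_y)),
                    key=lambda pair: pair[0])[:k]

    votes = defaultdict(float)
    for d, label in scored:
        votes[label] += 1.0 / (d + 1e-12) if weighted else 1.0
    return max(votes, key=votes.get)

# Toy one-feature data set: one point of class "A" and two of class "B".
X = [(0.0,), (2.0,), (2.1,)]
y = ["A", "B", "B"]
plain = knn_predict(X, y, (0.1,), k=3)                     # majority vote
by_distance = knn_predict(X, y, (0.1,), k=3, weighted=True)  # distance-weighted vote
```

With k = 3 the unweighted vote follows the two "B" points, while the distance-weighted vote follows the single nearby "A" point. This is a small-scale picture of why such settings can change the prediction.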
Beyond the nearest-neighbor Zimm-Bragg model for helix-coil transition in peptides.
Murza, Adrian; Kubelka, Jan
2009-02-01
The nearest-neighbor (μ = 1) variant of the Zimm and Bragg (ZB) model has been extensively used to describe the helix-coil transition in biopolymers. In this work, we investigate the helix-coil transition for a 21-residue alanine peptide (AP) with the ZB model up to fourth nearest neighbor (μ = 1, 2, 3, and 4). We use a matrix approach that takes into account combinations of any number of helical stretches of any length and therefore gives the exact statistical weight of the chain within the assumptions of the ZB model. The parameters of the model are determined by fitting the temperature-dependent circular dichroism and Fourier transform infrared experimental spectra of the AP. All variants of the model fit the experimental data, thus giving similar results in terms of the macroscopic observables, such as temperature-dependent fractional helicity. However, the resulting microscopic parameters, such as distributions of the individual residue helical probabilities and free energy surfaces, vary significantly depending on the variant of the model. Overall, the mean residue enthalpy and entropy (in absolute value) both increase with μ, but combined yield essentially the same "effective" value of the ZB propagation parameters for all μ. Greater helical probabilities for individual residues are predicted for larger μ, in particular near the center of the sequence. The ZB nucleation parameters increase with increasing μ, which results in a lower free energy barrier to helix nucleation and lower apparent "cooperativity" of the transition. The significance of the long-range interactions for the predictions of the ZB model for the helix-coil transition, the calculated model parameters and the limitations of the model are discussed.
A Two-Lane Cellular Automata Model with Influence of Next-Nearest Neighbor Vehicle
Institute of Scientific and Technical Information of China (English)
(no author listed)
2006-01-01
In this paper, we propose a new two-lane cellular automata model in which the influence of the next-nearest neighbor vehicle is considered. The attributes of a traffic system composed of a fast lane and a slow lane are investigated with the new traffic model. The simulation results show that the proposed two-lane traffic model can reproduce some traffic phenomena observed in real traffic, and that the maximum flux and critical density are close to the field measurements. Moreover, the initial density distribution of the fast lane and slow lane has a strong influence on the traffic flow states. As the ratio of slow-lane to fast-lane density increases, the lane-changing frequency increases, but the maximum flux decreases. Finally, the influence of the sensitivity coefficients is discussed.
Spiral versus modulated collinear phases in the quantum axial next-nearest-neighbor Heisenberg model
Oitmaa, J.; Singh, R. R. P.
2016-12-01
Motivated by the discovery of spiral and modulated collinear phases in several magnetic materials, we investigate the magnetic properties of Heisenberg spin S = 1/2 antiferromagnets in two and three dimensions, with frustration arising from second-neighbor couplings in one axial direction [the axial next-nearest-neighbor Heisenberg (ANNNH) model]. Our results clearly demonstrate the presence of an incommensurate spiral phase at T = 0 in two dimensions, extending to finite temperatures in three dimensions. The crossover between Néel and spiral order occurs at a value of the frustration parameter considerably above the classical value 0.25, a sign of substantial quantum fluctuations. We also investigate a possible modulated collinear phase with a wavelength of four lattice spacings and find that it has substantially higher energy and hence is not realized in the model.
Rivas, Elena; Lang, Raymond; Eddy, Sean R
2012-02-01
The standard approach for single-sequence RNA secondary structure prediction uses a nearest-neighbor thermodynamic model with several thousand experimentally determined energy parameters. An attractive alternative is to use statistical approaches with parameters estimated from growing databases of structural RNAs. Good results have been reported for discriminative statistical methods using complex nearest-neighbor models, including CONTRAfold, Simfold, and ContextFold. Little work has been reported on generative probabilistic models (stochastic context-free grammars [SCFGs]) of comparable complexity, although probabilistic models are generally easier to train and to use. To explore a range of probabilistic models of increasing complexity, and to directly compare probabilistic, thermodynamic, and discriminative approaches, we created TORNADO, a computational tool that can parse a wide spectrum of RNA grammar architectures (including the standard nearest-neighbor model and more) using a generalized super-grammar that can be parameterized with probabilities, energies, or arbitrary scores. By using TORNADO, we find that probabilistic nearest-neighbor models perform comparably to (but not significantly better than) discriminative methods. We find that complex statistical models are prone to overfitting RNA structure and that evaluations should use structurally nonhomologous training and test data sets. Overfitting has affected at least one published method (ContextFold). The most important barrier to improving statistical approaches for RNA secondary structure prediction is the lack of diversity of well-curated single-sequence RNA secondary structures in current RNA databases.
Chin, Wen Cheong; Lee, Min Cherng; Yap, Grace Lee Ching
2016-01-01
High-frequency financial data modelling has become one of the important research areas in financial econometrics. However, possible structural breaks in volatile financial time series often cause inconsistency in volatility estimation. In this study, we propose a structural-break heavy-tailed heterogeneous autoregressive (HAR) volatility econometric model enhanced with jump-robust estimators. The breakpoints in the volatility are captured by dummy variables after detection by the Bai-Perron sequential multiple-breakpoint procedure. To deal further with possible abrupt jumps in the volatility, the jump-robust volatility estimators are constructed using the nearest neighbor truncation approach, namely the minimum and median realized volatility. With the structural-break improvements in both the models and the volatility estimators, the empirical findings show that the modified HAR model gives the best in-sample and out-of-sample forecast evaluations compared with the standard HAR models. Accurate volatility forecasts directly influence applications in risk management and investment portfolio analysis.
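The nearest neighbor truncation estimators named in this abstract (minimum and median realized volatility) can be sketched as follows. The sketch assumes the standard MinRV and MedRV definitions of Andersen, Dobrev and Schaumburg; the abstract does not give the formulas, so whether the authors use exactly this scaling is an assumption. The function names and the toy return series are hypothetical.

```python
import numpy as np

def min_rv(returns):
    """MinRV: scaled sum of squared minima of adjacent absolute returns."""
    r = np.abs(np.asarray(returns, dtype=float))
    n = r.size
    scale = (np.pi / (np.pi - 2.0)) * (n / (n - 1.0))
    return scale * np.sum(np.minimum(r[:-1], r[1:]) ** 2)

def med_rv(returns):
    """MedRV: scaled sum of squared medians over windows of three absolute returns."""
    r = np.abs(np.asarray(returns, dtype=float))
    n = r.size
    scale = (np.pi / (6.0 - 4.0 * np.sqrt(3.0) + np.pi)) * (n / (n - 2.0))
    triples = np.stack([r[:-2], r[1:-1], r[2:]])  # shape (3, n - 2)
    return scale * np.sum(np.median(triples, axis=0) ** 2)

# Toy intraday returns: a quiet series and the same series with one large jump.
quiet = np.array([0.01, 0.01, 0.01, 0.01, 0.01])
jumpy = np.array([0.01, 0.01, 1.00, 0.01, 0.01])

rv_jumpy = np.sum(jumpy ** 2)  # plain realized variance, dominated by the jump
minrv_quiet, minrv_jumpy = min_rv(quiet), min_rv(jumpy)
medrv_quiet, medrv_jumpy = med_rv(quiet), med_rv(jumpy)
```

In this toy case the isolated jump never survives the minimum or median over neighboring returns, so both estimators return the same value for the quiet and the jumpy series, whereas the plain realized variance is inflated by the jump.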
One-dimensional t-J model with next-nearest-neighbor hopping : Breakdown of the Luttinger liquid
Eder, R; Ohta, Y.
1997-01-01
We investigate the effect of a next-nearest-neighbor hopping integral t' in the one-dimensional t-J model, using Lanczos diagonalization of finite chains. Even moderate values of t' have a dramatic effect on the dynamical correlation functions and Fermi-surface topology. The high-energy holon bands
Al-Shakran, Mohammad; Kibler, Ludwig A.; Jacob, Timo; Ibach, Harald; Beltramo, Guillermo L.; Giesen, Margret
2016-09-01
This is Part I of two closely related papers, where we show that the specific adsorption of anions leads to a failure of the nearest-neighbor Ising model to describe island perimeter curvatures on Au(100) electrodes in dilute KBr, HCl and H2SO4 electrolytes and the step diffusivity vs. step orientation derived from them. This result has major consequences for theoretical studies aiming at the understanding of growth, diffusion and degradation phenomena. Part I focuses on the experimental data. As shown theoretically in detail in Part II (doi:10.1016/j.susc.2016.03.022), a set of nearest-neighbor and next-nearest-neighbor interaction energies (ɛNN, ɛNNN) can uniquely be derived from the diffusivity of steps along and . We find a strong repulsive next-nearest-neighbor (NNN) interaction in KBr and HCl, whereas the NNN interaction is negligible for H2SO4. The NNN repulsive interaction energy ɛNNN therefore correlates positively with the Gibbs adsorption energy of the anions. We find furthermore that ɛNNN increases with increasing Br- and Cl- coverage. The results for ɛNN and ɛNNN are quantitatively consistent with the coverage dependence of the step line tension. We thereby establish a sound experimental base for theoretical studies on the energetics of steps in the presence of specific adsorption.
Energy Technology Data Exchange (ETDEWEB)
Gong, Longyan, E-mail: lygong@njupt.edu.cn [Information Physics Research Center and Department of Applied Physics, Nanjing University of Posts and Telecommunications, Nanjing, 210003 (China); Institute of Signal Processing and Transmission, Nanjing University of Posts and Telecommunications, Nanjing, 210003 (China); National Laboratory of Solid State Microstructures, Nanjing University, Nanjing 210093 (China); Feng, Yan; Ding, Yougen [Information Physics Research Center and Department of Applied Physics, Nanjing University of Posts and Telecommunications, Nanjing, 210003 (China); Institute of Signal Processing and Transmission, Nanjing University of Posts and Telecommunications, Nanjing, 210003 (China)
2017-02-12
Highlights: • Quasiperiodic lattice models with next-nearest-neighbor hopping are studied. • Shannon information entropies are used to reflect state localization properties. • Phase diagrams are obtained for the inverse bronze and golden means, respectively. • Our studies present a more complete picture than existing works. - Abstract: We explore the reduced relative Shannon information entropies SR for a quasiperiodic lattice model with nearest- and next-nearest-neighbor hopping, where an irrational number is in the mathematical expression of incommensurate on-site potentials. Based on SR, we respectively unveil the phase diagrams for two irrationalities, i.e., the inverse bronze mean and the inverse golden mean. The corresponding phase diagrams include regions of purely localized phase, purely delocalized phase, pure critical phase, and regions with mobility edges. The boundaries of different regions depend on the values of irrational number. These studies present a more complete picture than existing works.
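The way a Shannon information entropy reflects state localization can be shown with a minimal sketch. It computes the plain Shannon entropy S = −Σ p_i ln p_i of the site probabilities p_i = |ψ_i|² of a lattice state. The reduced relative entropy SR used in the paper is a normalized variant whose definition is not given in the abstract and is not reproduced here. The function name and the example states are hypothetical.

```python
import numpy as np

def shannon_entropy(psi):
    """Shannon entropy of the site probabilities |psi_i|^2 of a lattice state."""
    p = np.abs(np.asarray(psi, dtype=complex)) ** 2
    p = p / p.sum()          # normalize in case psi is not normalized
    nonzero = p[p > 0]       # 0 * ln 0 is taken as 0
    return float(-np.sum(nonzero * np.log(nonzero)))

n_sites = 8
localized = np.zeros(n_sites)
localized[3] = 1.0                               # all weight on a single site
extended = np.ones(n_sites) / np.sqrt(n_sites)   # equal weight on every site

s_localized = shannon_entropy(localized)   # 0 for a fully localized state
s_extended = shannon_entropy(extended)     # ln(n_sites) for a uniform state
```

The entropy runs from 0 (fully localized) to ln N (uniformly extended over N sites), which is why quantities derived from it can separate localized, delocalized and intermediate regimes.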
Directory of Open Access Journals (Sweden)
J. Faradmal
2016-01-01
Introduction & Objective: The Cox model is a common method for estimating survival, and the validity of its results depends on the proportional hazards assumption. The k-nearest neighbor method is a nonparametric method for estimating survival probability in heterogeneous communities. The purpose of this study was to compare the performance of the k-nearest neighbor method (k-NN) with the Cox model. Materials & Methods: This retrospective cohort study was conducted in Hamadan Province on 475 patients who had undergone kidney transplantation from 1994 to 2011. Data were extracted from patients' medical records using a checklist. The time between kidney transplantation and rejection was considered as the survival time. The Cox model and the k-nearest neighbor method were used for data modeling. The Brier score prediction error was used to compare the performance of the models. Results: Out of 475 transplantations, 55 episodes of rejection occurred. The 5-, 10- and 15-year survival rates of transplantation were 91.70%, 84.90% and 74.50%, respectively. The number of neighbors, optimized by cross-validation, was 45. The cumulative Brier scores of the k-NN algorithm for t = 5, 10 and 15 years were 0.003, 0.006 and 0.007, respectively. The cumulative Brier scores of the Cox model for t = 5, 10 and 15 years were 0.036, 0.058 and 0.058, respectively. The prediction error of the k-NN algorithm for t = 5, 10 and 15 years was lower than that of the Cox model, which shows that the k-NN method performs better. Conclusions: The results of this study show that the predictions of k-NN have higher accuracy than those of the Cox model when the sample size and the number of predictor variables are high. Sci J Hamadan Univ Med Sci. 2016; 22(4): 300-308
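The Brier score used here as the prediction error can be illustrated with a simplified sketch. It computes the score at one time point t as the mean squared difference between the observed status (still event-free at t, 1 or 0) and the predicted survival probability. It deliberately ignores censoring, which a real survival analysis would handle (for example with inverse-probability-of-censoring weights), and it does not compute the cumulative score reported in the abstract. The function name and the toy data are hypothetical.

```python
import numpy as np

def brier_score(event_times, predicted_survival, t):
    """Brier score at time t, assuming every event time is fully observed.

    event_times        : observed time to event for each subject
    predicted_survival : model's predicted probability of surviving past t
    """
    still_event_free = (np.asarray(event_times, dtype=float) > t).astype(float)
    s_hat = np.asarray(predicted_survival, dtype=float)
    return float(np.mean((still_event_free - s_hat) ** 2))

# Toy cohort: events at 2, 7 and 12 years, evaluated at t = 5 years.
times = [2.0, 7.0, 12.0]
perfect = brier_score(times, [0.0, 1.0, 1.0], t=5.0)        # exact predictions
uninformative = brier_score(times, [0.5, 0.5, 0.5], t=5.0)  # coin-flip predictions
```

Lower is better: exact predictions score 0 and a constant prediction of 0.5 scores 0.25.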
Salari, Nader; Shohaimi, Shamarina; Najafi, Farid; Nallappan, Meenakshii; Karishnarajah, Isthrinayagy
2014-01-01
Among numerous artificial intelligence approaches, k-Nearest Neighbor algorithms, genetic algorithms, and artificial neural networks are considered the most common and effective methods for classification problems in numerous studies. In the present study, the results of implementing a novel hybrid feature selection-classification model using the above-mentioned methods are presented. The purpose is to benefit from the synergies obtained by combining these technologies for the development of classification models. Such a combination creates an opportunity to exploit the strength of each algorithm and to make up for their deficiencies. To develop the proposed model, with the aim of obtaining the best array of features, first, feature ranking techniques such as Fisher's discriminant ratio and class separability criteria were used to prioritize features. Second, the obtained results, which included arrays of the top-ranked features, were used as the initial population of a genetic algorithm to produce optimum arrays of features. Third, using a modified k-Nearest Neighbor method as well as an improved method of backpropagation neural networks, the classification process was carried out based on the optimum arrays of features selected by the genetic algorithm. The performance of the proposed model was compared with thirteen well-known classification models based on seven datasets. Furthermore, the statistical analysis was performed using the Friedman test followed by post-hoc tests. The experimental findings indicated that the novel proposed hybrid model resulted in significantly better classification performance compared with all 13 classification methods. Finally, the performance results of the proposed model were benchmarked against the best ones reported as the state-of-the-art classifiers in terms of classification accuracy for the same data sets. The substantial findings of the comprehensive comparative study revealed that performance of the
Energy Technology Data Exchange (ETDEWEB)
Murtazaev, A. K.; Ramazanov, M. K., E-mail: sheikh77@mail.ru; Badiev, V. K. [Russian Academy of Sciences, Institute of Physics, Dagestan Scientific Center (Russian Federation)
2012-08-15
The critical behavior of the three-dimensional antiferromagnetic Heisenberg model with nearest-neighbor ($J$) and next-to-nearest-neighbor ($J_1$) interactions is studied by the replica Monte Carlo method. The first-order phase transition and pseudouniversal critical behavior of this model are established for a small lattice in the interval $R = |J_1/J|$ = 0-0.115. A complete set of the main static magnetic and chiral critical indices is calculated in this interval using the finite-dimensional scaling theory.
Pelizzola, Alessandro
1994-11-01
An explicit formula for the boundary magnetization of a two-dimensional Ising model with a strip of inhomogeneous interactions is obtained by means of a transfer matrix mean-field method introduced by Lipowski and Suzuki. There is clear numerical evidence that the formula is exact. By taking the limit where the width of the strip approaches infinity and the interactions have well-defined bulk limits, I arrive at the boundary magnetization for a model which includes the Hilhorst-van Leeuwen model. The rich critical behavior of the latter magnetization is thereby rederived with little effort.
Energy Technology Data Exchange (ETDEWEB)
Babaev, A. B., E-mail: b-albert78@mail.ru; Magomedov, M. A.; Murtazaev, A. K. [Russian Academy of Sciences, Amirkhanov Institute of Physics, Dagestan Scientific Center (Russian Federation); Kassan-Ogly, F. A.; Proshkin, A. I. [Russian Academy of Sciences, Institute of Metal Physics, Ural Branch (Russian Federation)
2016-02-15
Phase transitions (PTs) and frustrations in two-dimensional structures described by a three-vertex antiferromagnetic Potts model on a triangular lattice are investigated by the Monte Carlo method with regard to nearest and next-nearest neighbors with interaction constants $J_1$ and $J_2$, respectively. PTs in these models are analyzed for the ratio $r = J_2/J_1$ of next-nearest to nearest exchange interaction constants in the interval |r| = 0–1.0. On the basis of the analysis of the low-temperature entropy, the density of states function of the system, and the fourth-order Binder cumulants, it is shown that a Potts model with interaction constants $J_1 < 0$ and $J_2 < 0$ exhibits a first-order PT in the range 0 ⩽ r < 0.2, whereas, in the interval 0.2 ⩽ r ⩽ 1.0, frustrations arise in the system. At the same time, for $J_1 > 0$ and $J_2 < 0$, frustrations arise in the range 0.5 < |r| < 1.0, while, in the interval 0 ⩽ |r| ⩽ 1/3, the model exhibits a second-order PT.
Ver Hoef, Jay M; Temesgen, Hailemariam
2013-01-01
Forest surveys provide critical information for many diverse interests. Data are often collected from samples, and from these samples, maps of resources and estimates of areal totals or averages are required. In this paper, two approaches for mapping and estimating totals, the spatial linear model (SLM) and k-NN (k-Nearest Neighbor), are compared theoretically, through simulations, and as applied to real forestry data. While both methods have desirable properties, a review shows that the SLM has prediction optimality properties and can be quite robust. Simulations of artificial populations and resamplings of real forestry data show that the SLM has smaller empirical root-mean-squared prediction errors (RMSPE) for a wide variety of data types, with generally less bias and better interval coverage than k-NN. These patterns held for both point predictions and for population totals or averages, with the SLM reducing RMSPE by 9% to 67% relative to some popular k-NN methods, and with the SLM also more robust to spatially imbalanced sampling. Estimating prediction standard errors remains a problem for k-NN predictors, despite recent attempts using model-based methods. Our conclusions are that the SLM should generally be used rather than k-NN if the goal is accurate mapping or estimation of population totals or averages.
Shirakura, T.; Matsubara, F.; Suzuki, N.
2014-10-01
The spin structure of an axial next-nearest-neighbor Ising (ANNNI) model in two dimensions (2D) is a renewed problem because different Monte Carlo (MC) simulation methods predicted different spin orderings. The usual equilibrium simulation predicts the occurrence of a floating incommensurate (IC) Kosterlitz-Thouless (KT) type phase, which never emerges in non-equilibrium relaxation (NER) simulations. In this paper, we first examine previously published results of both methods, and then investigate a higher transition temperature Tc1 between the IC and paramagnetic phases. In the usual equilibrium simulation, we calculate the chain magnetization on larger lattices (up to 512×512 sites) and estimate Tc1 ≈ 1.16J with frustration ratio κ (≡ -J2/J1) = 0.6. We examine the nature of the phase transition in terms of the Binder ratio gL of spin overlap functions and the correlation-length ratio ξ/L. In the NER simulation, we observe the spin dynamics in equilibrium states by means of an autocorrelation function and also observe the chain magnetization relaxations from the ground and disordered states. These quantities exhibit an algebraic decay at T ≲ 1.17J. We conclude that the two-dimensional ANNNI model actually admits an IC phase transition of the KT type.
Como, F; Carnesecchi, E; Volani, S; Dorne, J L; Richardson, J; Bassan, A; Pavan, M; Benfenati, E
2017-01-01
Ecological risk assessment of plant protection products (PPPs) requires an understanding of both the toxicity and the extent of exposure to assess risks for a range of taxa of ecological importance including target and non-target species. Non-target species such as honey bees (Apis mellifera), solitary bees and bumble bees are of utmost importance because of their vital ecological services as pollinators of wild plants and crops. To improve risk assessment of PPPs in bee species, computational models predicting the acute and chronic toxicity of a range of PPPs and contaminants can play a major role in providing structural and physico-chemical properties for the prioritisation of compounds of concern and future risk assessments. Over the last three decades, scientific advisory bodies and the research community have developed toxicological databases and quantitative structure-activity relationship (QSAR) models that are proving invaluable to predict toxicity using historical data and reduce animal testing. This paper describes the development and validation of a k-Nearest Neighbor (k-NN) model using in-house software for the prediction of acute contact toxicity of pesticides on honey bees. Acute contact toxicity data were collected from different sources for 256 pesticides, which were divided into training and test sets. The k-NN models were validated with good prediction, with an accuracy of 70% for all compounds and of 65% for highly toxic compounds, suggesting that they might reliably predict the toxicity of structurally diverse pesticides and could be used to screen and prioritise new pesticides.
Approximate Nearest Neighbor Search through Comparisons
Tschopp, Dominique
2009-01-01
This paper addresses the problem of finding the nearest neighbor (or one of the R-nearest neighbors) of a query object q in a database of n objects. In contrast with most existing approaches, we can only access the "hidden" space in which the objects live through a similarity oracle. The oracle, given two reference objects and a query object, returns the reference object closest to the query object. The oracle attempts to model the behavior of human users, capable of making statements about similarity, but not of assigning meaningful numerical values to distances between objects.
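The similarity oracle described in this abstract can be made concrete with a minimal sketch of a naive baseline: a linear scan that keeps the current best candidate and asks the oracle to compare it with each remaining object. This is not the paper's algorithm. It only shows that the search needs comparison answers, never numerical distances. All function names are hypothetical, and the oracle below is simulated with a hidden distance only so that the example can run.

```python
def nearest_by_comparisons(database, query, oracle):
    """Find the nearest neighbor of `query` using only oracle comparisons.

    oracle(a, b, q) must return whichever of a, b is closer to q.
    Uses exactly len(database) - 1 oracle calls.
    """
    best = database[0]
    for candidate in database[1:]:
        best = oracle(best, candidate, query)
    return best

def make_hidden_oracle(distance, counter):
    """Simulate the oracle from a hidden distance the search code never sees."""
    def oracle(a, b, q):
        counter["calls"] += 1
        return a if distance(a, q) <= distance(b, q) else b
    return oracle

# Toy database of numbers; the hidden similarity is the absolute difference.
calls = {"calls": 0}
oracle = make_hidden_oracle(lambda x, q: abs(x - q), calls)
objects = [10, 3, 7, 22]
nearest = nearest_by_comparisons(objects, 8, oracle)
```

The scan needs n − 1 oracle questions for a database of n objects. Each question stands for a query to a human user, so the number of questions is the cost that matters in this setting.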
Yin, Junqi; Landau, David
2010-03-01
Using the parallel tempering algorithm and GPU accelerated techniques, we have performed large-scale Monte Carlo simulations of the Ising (lattice gas) model on a square lattice with antiferromagnetic (repulsive) nearest-neighbor and next-nearest-neighbor interactions of the same strength and subject to a uniform magnetic field. Possibility of the XY-like transition is examined and both transitions from the (2x1) and row-shifted (2x2) ordered phases to the paramagnetic phase turn out to be continuous. From our data analysis, reentrance behavior of the (2x1) critical line and a bicritical point which separates the two ordered phases at T=0 are confirmed. Based on the non-universal critical exponents we obtained along the phase boundary, Suzuki's weak universality seems to hold.
Boykin, Timothy B.; Luisier, Mathieu; Klimeck, Gerhard; Jiang, Xueping; Kharche, Neerav; Zhou, Yu; Nayak, Saroj K
2011-01-01
Accurate modeling of the pi-bands of armchair graphene nanoribbons (AGNRs) requires correctly reproducing asymmetries in the bulk graphene bands as well as providing a realistic model for hydrogen passivation of the edge atoms. The commonly used single-pz orbital approach fails on both these counts. To overcome these failures we introduce a nearest-neighbor, three orbital per atom p/d tight-binding model for graphene. The parameters of the model are fit to first-principles density-functional ...
Frog sound identification using extended k-nearest neighbor classifier
Mukahar, Nordiana; Affendi Rosdi, Bakhtiar; Athiar Ramli, Dzati; Jaafar, Haryati
2017-09-01
Frog sound identification based on vocalization has become important for biological research and environmental monitoring. As a result, different types of feature extraction and classifiers have been employed to evaluate the accuracy of frog sound identification. This paper presents frog sound identification with an Extended k-Nearest Neighbor (EKNN) classifier. The EKNN classifier integrates the nearest-neighbor and mutual-neighborhood-sharing concepts, with the aim of improving classification performance. It makes a prediction based on which samples are the nearest neighbors of the testing sample and which samples consider the testing sample to be one of their nearest neighbors. To evaluate the classification performance in frog sound identification, the EKNN classifier is compared with the competing classifiers k-Nearest Neighbor (KNN), Fuzzy k-Nearest Neighbor (FKNN), k-General Nearest Neighbor (KGNN) and Mutual k-Nearest Neighbor (MKNN) on the recorded sounds of 15 frog species obtained in Malaysia forest. The recorded sounds have been segmented using Short Time Energy and Short Time Average Zero Crossing Rate (STE+STAZCR), sinusoidal modeling (SM), manual segmentation, and the combination of Energy (E) and Zero Crossing Rate (ZCR) (E+ZCR), while the features are extracted by Mel Frequency Cepstrum Coefficient (MFCC). The experimental results show that the EKNN classifier exhibits the best accuracy compared to the competing classifiers KNN, FKNN, KGNN and MKNN in all cases.
Randomized approximate nearest neighbors algorithm.
Jones, Peter Wilcox; Osipov, Andrei; Rokhlin, Vladimir
2011-09-20
We present a randomized algorithm for the approximate nearest neighbor problem in d-dimensional Euclidean space. Given N points {x_j} in R^d, the algorithm attempts to find k nearest neighbors for each of x_j, where k is a user-specified integer parameter. The algorithm is iterative, and its running time requirements are proportional to T·N·(d·(log d) + k·(d + log k)·(log N)) + N·k^2·(d + log k), with T the number of iterations performed. The memory requirements of the procedure are of the order N·(d + k). A by-product of the scheme is a data structure, permitting a rapid search for the k nearest neighbors among {x_j} for an arbitrary point x ∈ R^d. The cost of each such query is proportional to T·(d·(log d) + log(N/k)·k·(d + log k)), and the memory requirements for the requisite data structure are of the order N·(d + k) + T·(d + N). The algorithm utilizes random rotations and a basic divide-and-conquer scheme, followed by a local graph search. We analyze the scheme's behavior for certain types of distributions of {x_j} and illustrate its performance via several numerical examples.
Vladimirov, Igor; Jak, Eugene
2007-04-28
We study an interacting particle system on the simple cubic lattice satisfying the nearest neighbor exclusion (NNE) which forbids any two nearest sites to be simultaneously occupied. Under the constraint, we develop an edge-to-site reduction of the Bethe-Peierls entropy approximation of the cluster variation method. The resulting NNE-corrected Bragg-Williams approximation is applied to statistical mechanical modeling of a liquid silicate formed by silica and a univalent network modifier, for which we derive the molar Gibbs energy of mixing and enthalpy of mixing and compare the predictions with available thermodynamic data.
Dimensionality reduction with unsupervised nearest neighbors
Kramer, Oliver
2013-01-01
This book is devoted to a novel approach for dimensionality reduction based on the famous nearest neighbor method, a powerful classification and regression approach. It starts with an introduction to machine learning concepts and a real-world application from the energy domain. Then, unsupervised nearest neighbors (UNN) is introduced as an efficient iterative method for dimensionality reduction. Various UNN models are developed step by step, ranging from a simple iterative strategy for discrete latent spaces to a stochastic kernel-based algorithm for learning submanifolds with independent parameterizations. Extensions that allow the embedding of incomplete and noisy patterns are introduced. Various optimization approaches are compared, from evolutionary to swarm-based heuristics. Experimental comparisons to related methodologies, taking into account artificial test data sets as well as real-world data, demonstrate the behavior of UNN in practical scenarios. The book contains numerous color figures to illustr...
Boykin, Timothy; Luisier, Mathieu; Klimeck, Gerhard; Jiang, Xueping; Kharche, Neerav; Zhou, Yu; Nayak, Saroj
2012-02-01
The commonly used single-pz orbital first nearest-neighbor tight-binding model faces two main problems: (i) it fails to reproduce asymmetries in the bulk graphene bands; (ii) it cannot provide a realistic model for hydrogen passivation of the edge atoms. As a result, some armchair graphene nanoribbons (AGNRs) are incorrectly predicted to be metallic. A new nearest-neighbor, three orbital per atom p/d tight-binding model [1] is built to address these issues. The parameters of the model are fit to band structures obtained from first-principles density-functional theory and many-body perturbation theory within the GW approximation, giving excellent agreement with the ab initio AGNR bands. This model is employed to calculate the current-voltage characteristics of an AGNR MOSFET and the conductance of rough-edge AGNRs, finding significant differences versus the single-pz model. Taken together, these results demonstrate the importance of an accurate and computationally efficient band structure model for predicting the performance of graphene-based nanodevices. [1] T. B. Boykin, M. Luisier, G. Klimeck, X. Jiang, N. Kharche, Y. Zhou and S. Nayak, J. Appl. Phys. 109, 104304 (2011)
Approximate Nearest Neighbor Queries among Parallel Segments
DEFF Research Database (Denmark)
Emiris, Ioannis Z.; Malamatos, Theocharis; Tsigaridas, Elias
2010-01-01
We develop a data structure for answering efficiently approximate nearest neighbor queries over a set of parallel segments in three dimensions. We connect this problem to approximate nearest neighbor searching under weight constraints and approximate nearest neighbor searching on historical data...
Pruning nearest neighbor cluster trees
Kpotufe, Samory
2011-01-01
Nearest neighbor (k-NN) graphs are widely used in machine learning and data mining applications, and our aim is to better understand what they reveal about the cluster structure of the unknown underlying distribution of points. Moreover, is it possible to identify spurious structures that might arise due to sampling variability? Our first contribution is a statistical analysis that reveals how certain subgraphs of a k-NN graph form a consistent estimator of the cluster tree of the underlying distribution of points. Our second and perhaps most important contribution is the following finite sample guarantee. We carefully work out the tradeoff between aggressive and conservative pruning and are able to guarantee the removal of all spurious cluster structures at all levels of the tree while at the same time guaranteeing the recovery of salient clusters. This is the first such finite sample result in the context of clustering.
Scalable Nearest Neighbor Algorithms for High Dimensional Data.
Muja, Marius; Lowe, David G
2014-11-01
For many computer vision and machine learning problems, large training sets are key for good performance. However, the most computationally expensive part of many computer vision and machine learning algorithms consists of finding nearest neighbor matches to high dimensional vectors that represent the training data. We propose new algorithms for approximate nearest neighbor matching and evaluate and compare them with previous algorithms. For matching high dimensional features, we find two algorithms to be the most efficient: the randomized k-d forest and a new algorithm proposed in this paper, the priority search k-means tree. We also propose a new algorithm for matching binary features by searching multiple hierarchical clustering trees and show it outperforms methods typically used in the literature. We show that the optimal nearest neighbor algorithm and its parameters depend on the data set characteristics and describe an automated configuration procedure for finding the best algorithm to search a particular data set. In order to scale to very large data sets that would otherwise not fit in the memory of a single machine, we propose a distributed nearest neighbor matching framework that can be used with any of the algorithms described in the paper. All this research has been released as an open source library called fast library for approximate nearest neighbors (FLANN), which has been incorporated into OpenCV and is now one of the most popular libraries for nearest neighbor matching.
DEFF Research Database (Denmark)
Ladefoged, Claes N.; Andersen, Flemming L.; Keller, Sune H.
2014-01-01
In combined PET/MR, attenuation correction (AC) is performed indirectly based on the available MR image information. Metal implant-induced susceptibility artifacts and subsequent signal voids challenge MR-based AC. Several papers acknowledge the problem in PET attenuation correction when dental...... artifacts are ignored, but none of them attempts to solve the problem. We propose a clinically feasible correction method which combines Active Shape Models (ASM) and k-Nearest-Neighbors (kNN) into a simple approach which finds and corrects the dental artifacts within the surface boundaries of the patient...... anatomy. ASM is used to locate a number of landmarks in the T1-weighted MR-image of a new patient. We calculate a vector of offsets from each voxel within a signal void to each of the landmarks. We then use kNN to classify each voxel as belonging to an artifact or an actual signal void using this offset...
DEFF Research Database (Denmark)
Ladefoged, Claes N.; Andersen, Flemming L.; Keller, Sune H.;
2014-01-01
In combined PET/MR, attenuation correction (AC) is performed indirectly based on the available MR image information. Metal implant-induced susceptibility artifacts and subsequent signal voids challenge MR-based AC. Several papers acknowledge the problem in PET attenuation correction when dental...... artifacts are ignored, but none of them attempts to solve the problem. We propose a clinically feasible correction method which combines Active Shape Models (ASM) and k-Nearest-Neighbors (kNN) into a simple approach which finds and corrects the dental artifacts within the surface boundaries of the patient...... vector, and fill the artifact voxels with a value representing soft tissue. We tested the method using fourteen patients without artifacts, and eighteen patients with dental artifacts of varying sizes within the anatomical surface of the head/neck region. Though the method wrongly filled a small volume...
Hybrid k-Nearest Neighbor Classifier.
Yu, Zhiwen; Chen, Hantao; Liuxs, Jiming; You, Jane; Leung, Hareton; Han, Guoqiang
2016-06-01
Conventional k-nearest neighbor (KNN) classification approaches have several limitations when dealing with problems caused by special datasets, such as the sparse problem, the imbalance problem, and the noise problem. In this paper, we first perform a brief survey of recent progress in KNN classification approaches. Then, the hybrid KNN (HBKNN) classification approach, which takes into account the local and global information of the query sample, is designed to address the problems raised by these special datasets. Next, a random subspace ensemble framework based on the HBKNN (RS-HBKNN) classifier is proposed to perform classification on datasets with noisy attributes in high-dimensional space. Finally, nonparametric tests are adopted to compare the proposed method with other classification approaches over multiple datasets. The experiments on real-world datasets from the Knowledge Extraction based on Evolutionary Learning dataset repository demonstrate that RS-HBKNN works well on real datasets and outperforms most of the state-of-the-art classification approaches.
Boykin, Timothy B.; Luisier, Mathieu; Klimeck, Gerhard; Jiang, Xueping; Kharche, Neerav; Zhou, Yu; Nayak, Saroj K.
2011-05-01
Accurate modeling of the π-bands of armchair graphene nanoribbons (AGNRs) requires correctly reproducing asymmetries in the bulk graphene bands, as well as providing a realistic model for hydrogen passivation of the edge atoms. The commonly used single-pz orbital approach fails on both these counts. To overcome these failures we introduce a nearest-neighbor, three orbital per atom p/d tight-binding model for graphene. The parameters of the model are fit to first-principles density-functional theory -based calculations as well as to those based on the many-body Green's function and screened-exchange formalism, giving excellent agreement with the ab initio AGNR bands. We employ this model to calculate the current-voltage characteristics of an AGNR MOSFET and the conductance of rough-edge AGNRs, finding significant differences versus the single-pz model. These results show that an accurate band structure model is essential for predicting the performance of graphene-based nanodevices.
Khanfar, Mohammad A; Taha, Mutasem O
2013-10-28
The mammalian target of rapamycin (mTOR) has an important role in cell growth, proliferation, and survival. mTOR is frequently hyperactivated in cancer, and therefore, it is a clinically validated target for cancer therapy. In this study, we combined exhaustive pharmacophore modeling and quantitative structure-activity relationship (QSAR) analysis to explore the structural requirements for potent mTOR inhibitors employing 210 known mTOR ligands. Genetic function algorithm (GFA) coupled with k nearest neighbor (kNN) and multiple linear regression (MLR) analyses were employed to build self-consistent and predictive QSAR models based on optimal combinations of pharmacophores and physicochemical descriptors. Successful pharmacophores were complemented with exclusion spheres to optimize their receiver operating characteristic curve (ROC) profiles. Optimal QSAR models and their associated pharmacophore hypotheses were validated by identification and experimental evaluation of several new promising mTOR inhibitory leads retrieved from the National Cancer Institute (NCI) structural database. The most potent hit illustrated an IC50 value of 48 nM.
Murtazaev, A. K.; Ramazanov, M. K.; Kurbanova, D. R.; Badiev, M. K.; Abuev, Ya. K.
2017-06-01
The replica Monte Carlo method has been used to investigate the critical behavior of a three-dimensional antiferromagnetic Ising model on a body-centered cubic lattice, taking into account next-nearest-neighbor interactions. Investigations are carried out for ratios of the next-nearest-neighbor to nearest-neighbor exchange interactions k = J2/J1 in the range k ∈ [0.0, 1.0] with a step Δk = 0.1. Within the framework of finite-size scaling theory, the static critical exponents of the heat capacity α, the susceptibility γ, the order parameter β, and the correlation radius ν, as well as the Fisher exponent η, are calculated. It is shown that the universality class of the critical behavior of this model is preserved in the interval k ∈ [0.0, 0.6]. Nonuniversal critical behavior is observed in the range k ∈ [0.8, 1.0].
Nearest Neighbor Queries in Road Networks
DEFF Research Database (Denmark)
Jensen, Christian Søndergaard; Kolar, Jan; Pedersen, Torben Bach
2003-01-01
With wireless communications and geo-positioning being widely available, it becomes possible to offer new e-services that provide mobile users with information about other mobile objects. This paper concerns active, ordered k-nearest neighbor queries for query and data objects that are moving...... for the nearest neighbor search in the prototype is presented in detail. In addition, the paper reports on results from experiments with the prototype system....
Efficient nearest neighbor searches in N-ABLE.
Energy Technology Data Exchange (ETDEWEB)
Mackey, Greg Edward
2010-07-01
The nearest neighbor search is a significant problem in transportation modeling and simulation. This paper describes how the nearest neighbor search is implemented efficiently with respect to running time in the NISAC Agent-Based Laboratory for Economics. The paper shows two methods to optimize running time of the nearest neighbor search. The first optimization uses a different distance metric that is more computationally efficient. The concept of a magnitude-comparable distance is described, and the paper gives a specific magnitude-comparable distance that is more computationally efficient than the actual distance function. The paper also shows how the given magnitude-comparable distance can be used to speed up the actual distance calculation. The second optimization reduces the number of points the search examines by using a spatial data structure. The paper concludes with testing of the different techniques discussed and the results.
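The first optimization above can be sketched as follows. The abstract does not say which magnitude-comparable distance N-ABLE uses, so this example assumes the common choice of squared Euclidean distance: it orders points exactly as the true distance does but skips the square root, which is then taken once for the winner only. The function names are hypothetical.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, ...]


def squared_distance(p: Point, q: Point) -> float:
    """Magnitude-comparable stand-in for Euclidean distance (no sqrt).

    Because sqrt is monotonic, comparing squared distances gives the same
    ordering as comparing true distances.
    """
    return sum((a - b) ** 2 for a, b in zip(p, q))


def nearest(query: Point, points: Sequence[Point]) -> Tuple[Point, float]:
    """Return the nearest point and its true distance to the query."""
    best = min(points, key=lambda p: squared_distance(query, p))
    # The actual distance is recovered with a single sqrt at the end.
    return best, math.sqrt(squared_distance(query, best))


point, dist = nearest((2.0, 2.0), [(0.0, 0.0), (3.0, 4.0), (1.0, 1.0)])
```

The second optimization in the abstract, a spatial data structure that prunes the candidate set, is independent of this choice and is not shown here.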
Institute of Scientific and Technical Information of China (English)
Kailei Liu; Zhijia Li; Cheng Yao; Ji Chen; Ke Zhang; Muhammad Saifullah
2016-01-01
The Kalman filter (KF) updating method has been widely used as an efficient measure to assimilate real-time hydrological variables for reducing forecast uncertainty and providing improved forecasts. However, the accuracy of the KF relies heavily on the estimates of the state transition matrix and is limited by the errors inherited from the parameters and variables of the flood forecasting models. A new real-time updating approach (named KN2K) is developed by coupling the k-nearest neighbor (KNN) procedure with the KF for flood forecasting models. The nonparametric KNN algorithm, which can be utilized to predict the response of a system on the basis of the k most representative predictors, remains efficient when the descriptions of the input-output mapping are insufficient. In this study, the KNN procedure is used to provide more accurate estimates of the state transition matrix to extend the applicability of the KF. The updating performance of KN2K is investigated in the middle reach of the Huai River based on a one-dimensional hydraulic model with lead times ranging from 2 to 12 h. The forecasts from the KN2K are compared with the observations, the original forecasts and the KF-updated forecasts. The results indicate that the KN2K method, with a Nash-Sutcliffe efficiency larger than 0.85 in the 12-h-ahead forecasts, has a significant advantage in accuracy and robustness compared to the KF method. It is demonstrated that improved updating results can be obtained through the use of the KNN procedure. The tests show that the KN2K method can be used as an effective tool for real-time flood forecasting.
Ladefoged, Claes N.; Andersen, Flemming L.; Keller, Sune H.; Beyer, Thomas; Højgaard, Liselotte; Lauze, François
2014-03-01
In combined PET/MR, attenuation correction (AC) is performed indirectly based on the available MR image information. Metal implant-induced susceptibility artifacts and subsequent signal voids challenge MR-based AC. Several papers acknowledge the problem in PET attenuation correction when dental artifacts are ignored, but none of them attempts to solve the problem. We propose a clinically feasible correction method which combines Active Shape Models (ASM) and k-Nearest-Neighbors (kNN) into a simple approach which finds and corrects the dental artifacts within the surface boundaries of the patient anatomy. ASM is used to locate a number of landmarks in the T1-weighted MR-image of a new patient. We calculate a vector of offsets from each voxel within a signal void to each of the landmarks. We then use kNN to classify each voxel as belonging to an artifact or an actual signal void using this offset vector, and fill the artifact voxels with a value representing soft tissue. We tested the method using fourteen patients without artifacts, and eighteen patients with dental artifacts of varying sizes within the anatomical surface of the head/neck region. Though the method wrongly filled a small volume in the bottom part of a maxillary sinus in two patients without any artifacts, due to their abnormal location, it succeeded in filling all dental artifact regions in all patients. In conclusion, we propose a method which combines ASM and kNN into a simple approach which, as the results show, succeeds in finding and correcting the dental artifacts within the anatomical surface.
Sznajd, J.
2016-12-01
The linear perturbation renormalization group (LPRG) is used to study the phase transition of the weakly coupled Ising chains with intrachain (J ) and interchain nearest-neighbor (J1) and next-nearest-neighbor (J2) interactions forming the triangular and rectangular lattices in a field. The phase diagrams with the frustration point at J2=-J1/2 for a rectangular lattice and J2=-J1 for a triangular lattice have been found. The LPRG calculations support the idea that the phase transition is always continuous except for the frustration point and is accompanied by a divergence of the specific heat. For the antiferromagnetic chains, the external field does not change substantially the shape of the phase diagram. The critical temperature is suppressed to zero according to the power law when approaching the frustration point with an exponent dependent on the value of the field.
Approximate nearest neighbors via dictionary learning
Cherian, Anoop; Morellas, Vassilios; Papanikolopoulos, Nikolaos
2011-06-01
Approximate Nearest Neighbors (ANN) in high dimensional vector spaces is a fundamental, yet challenging problem in many areas of computer science, including computer vision, data mining and robotics. In this work, we investigate this problem from the perspective of compressive sensing, especially the dictionary learning aspect. High dimensional feature vectors are seldom seen to be sparse in the feature domain; examples include, but are not limited to, Scale Invariant Feature Transform (SIFT) descriptors, Histogram Of Gradients, Shape Contexts, etc. Compressive sensing advocates that if a given vector has a dense support in a feature space, then there should exist an alternative high dimensional subspace where the features are sparse. This idea is leveraged by dictionary learning techniques through learning an overcomplete projection from the feature space so that the vectors are sparse in the new space. The learned dictionary aids in refining the search for the nearest neighbors to a query feature vector into the most likely subspace combination indexed by its non-zero active basis elements. Since the size of the dictionary is generally very large, distinct feature vectors are most likely to have distinct non-zero basis elements. Utilizing this observation, we propose a novel representation of the feature vectors as tuples of non-zero dictionary indices, which then reduces the ANN search problem to hashing the tuples to an index table, thereby dramatically improving the speed of the search. A drawback of this naive approach is that it is very sensitive to feature perturbations. This can be due to two possibilities: (i) the feature vectors are corrupted by noise, (ii) the true data vectors undergo perturbations themselves. Existing dictionary learning methods address the first possibility. In this work we investigate the second possibility and approach it from a robust optimization perspective. This boils down to the problem of learning a dictionary robust to feature
Stoyanova-Slavova, Iva B; Slavov, Svetoslav H; Pearce, Bruce; Buzatu, Dan A; Beger, Richard D; Wilkes, Jon G
2014-06-01
A diverse set of 154 chemicals that included US Food and Drug Administration-regulated compounds tested for their aquatic toxicity in Daphnia magna were modeled by a 3-dimensional quantitative spectral data-activity relationship (3D-QSDAR). Two distinct algorithms, partial least squares (PLS) and Tanimoto similarity-based k-nearest neighbors (KNN), were used to process bin occupancy descriptor matrices obtained after tessellation of the 3D-QSDAR space into regularly sized bins. The performance of models utilizing bins ranging in size from 2 ppm × 2 ppm × 0.5 Å to 20 ppm × 20 ppm × 2.5 Å was explored. Rigorous quality-control criteria were imposed: 1) 100 randomized 20% hold-out test sets were generated and the average R^2_test of the respective models was used as a measure of their performance, and 2) a Y-scrambling procedure was used to identify chance correlations. A consensus between the best-performing composite PLS model using 0.5 Å × 14 ppm × 14 ppm bins and 10 latent variables (average R^2_test = 0.770) and the best composite KNN model using 0.5 Å × 8 ppm × 8 ppm and 2 neighbors (average R^2_test = 0.801) offered an improvement of about 7.5% (consensus R^2_test = 0.845). Projection of the most frequently occurring bins on the standard coordinate space indicated that the presence of a primary or secondary amino group-substituted aromatic systems-would result in an increased toxic effect in Daphnia. The presence of a second aromatic ring with highly electronegative substituents 5 Å to 7 Å apart from the first ring would lead to a further increase in toxicity. © 2014 SETAC.
Unsupervised K-Nearest Neighbor Regression
Kramer, Oliver
2011-01-01
In many scientific disciplines structures in high-dimensional data have to be found, e.g., in stellar spectra, in genome data, or for face recognition tasks. In this work we present a novel approach to non-linear dimensionality reduction. It is based on fitting K-nearest neighbor regression to the unsupervised regression framework for learning of low-dimensional manifolds. Similar to related approaches that are mostly based on kernel methods, unsupervised K-nearest neighbor (UKNN) regression optimizes latent variables w.r.t. the data space reconstruction error employing the K-nearest neighbor heuristic. The problem of optimizing latent neighborhoods is difficult to solve, but the UKNN formulation allows an efficient strategy of iteratively embedding latent points to fixed neighborhood topologies. The approaches will be tested experimentally.
Statistical downscaling using K-nearest neighbors
Gangopadhyay, Subhrendu; Clark, Martyn; Rajagopalan, Balaji
2005-02-01
Statistical downscaling provides a technique for deriving local-scale information of precipitation and temperature from numerical weather prediction model output. The K-nearest neighbor (K-nn) is a new analog-type approach that is used in this paper to downscale the National Centers for Environmental Prediction 1998 medium-range forecast model output. The K-nn algorithm queries days similar to a given feature vector in this archive and using empirical orthogonal function analysis identifies a subset of days (K) similar to the feature day. These K days are then weighted using a bisquare weight function and randomly sampled to generate ensembles. A set of 15 medium-range forecast runs was used, and seven ensemble members were generated from each run. The ensemble of 105 members was then used to select the local-scale precipitation and temperature values in four diverse basins across the contiguous United States. These downscaled precipitation and temperature estimates were subsequently analyzed to test the performance of this downscaling approach. The downscaled ensembles were evaluated in terms of bias, the ranked probability skill score as a measure of forecast skill, spatial covariability between stations, temporal persistence, consistency between variables, and conditional bias and to develop spread-skill relationships. Though this approach does not explicitly model the space-time variability of the weather fields at each individual station, the above statistics were extremely well captured. The K-nn method was also compared with a multiple-linear-regression-based downscaling model.
Lectures on the nearest neighbor method
Biau, Gérard
2015-01-01
This text presents a wide-ranging and rigorous overview of nearest neighbor methods, one of the most important paradigms in machine learning. Now in one self-contained volume, this book systematically covers key statistical, probabilistic, combinatorial and geometric ideas for understanding, analyzing and developing nearest neighbor methods. Gérard Biau is a professor at Université Pierre et Marie Curie (Paris). Luc Devroye is a professor at the School of Computer Science at McGill University (Montreal).
Sawamura, Akitaka; Otsuka, Jun; Kato, Takashi; Kotani, Takao
2017-06-01
We report the determination of parameters for the nearest-neighbor sp3s* tight-binding (TB) model for GaP, GaAs, GaSb, InP, InAs, and InSb at 0, 77, and 300 K based on the hybrid quasi-particle self-consistent GW (QSGW) calculation and their application to a type II (InAs)/(GaSb) superlattice. The effects of finite temperature have been incorporated empirically by adjusting the parameter for blending the exchange-correlation terms of the pure QSGW method and local density approximation, in addition to the usage of experimental lattice parameters. As expected, the TB band gap shrinks with temperature and asymptotically with superlattice period when it is large. In addition, a bell curve in the band gap in the case of small superlattice period and slight and remarkable anisotropy in effective masses of electron and hole, both predicted by the hybrid QSGW method, respectively, are reproduced.
Nonparametric k-nearest-neighbor entropy estimator.
Lombardi, Damiano; Pant, Sanjay
2016-01-01
A nonparametric k-nearest-neighbor-based entropy estimator is proposed. It improves on the classical Kozachenko-Leonenko estimator by considering nonuniform probability densities in the region of k-nearest neighbors around each sample point. It aims to improve the classical estimators in three situations: first, when the dimensionality of the random variable is large; second, when near-functional relationships leading to high correlation between components of the random variable are present; and third, when the marginal variances of random variable components vary significantly with respect to each other. Heuristics on the error of the proposed and classical estimators are presented. Finally, the proposed estimator is tested for a variety of distributions in successively increasing dimensions and in the presence of a near-functional relationship. Its performance is compared with a classical estimator, and a significant improvement is demonstrated.
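For orientation, the classical Kozachenko-Leonenko estimator that this abstract takes as its baseline can be sketched in a few lines. This is the textbook form, H ≈ ψ(N) − ψ(k) + log V_d + (d/N) Σ log ε_i, where ε_i is the Euclidean distance from sample i to its k-th nearest neighbor and V_d is the volume of the d-dimensional unit ball; it is not the improved estimator proposed in the paper. The brute-force neighbor search and the function names are assumptions made for brevity, and the result is in nats.

```python
import math
import random
from typing import Sequence, Tuple

EULER_GAMMA = 0.5772156649015329


def digamma_int(n: int) -> float:
    """Digamma at a positive integer: psi(n) = -gamma + sum_{i=1}^{n-1} 1/i."""
    return -EULER_GAMMA + sum(1.0 / i for i in range(1, n))


def unit_ball_volume(d: int) -> float:
    """Volume of the d-dimensional Euclidean unit ball."""
    return math.pi ** (d / 2) / math.gamma(d / 2 + 1)


def kl_entropy(samples: Sequence[Tuple[float, ...]], k: int = 3) -> float:
    """Classical Kozachenko-Leonenko entropy estimate (brute force, O(N^2))."""
    n, d = len(samples), len(samples[0])
    log_eps_sum = 0.0
    for i, x in enumerate(samples):
        # Distances from x to every other sample; the k-th smallest is eps_i.
        dists = sorted(math.dist(x, y) for j, y in enumerate(samples) if j != i)
        log_eps_sum += math.log(dists[k - 1])
    return (digamma_int(n) - digamma_int(k)
            + math.log(unit_ball_volume(d)) + d * log_eps_sum / n)


rng = random.Random(0)
uniform_samples = [(rng.random(),) for _ in range(400)]
estimate = kl_entropy(uniform_samples, k=3)  # true entropy of U(0, 1) is 0 nats
```

The three situations listed in the abstract (high dimension, strongly correlated components, very unequal marginal variances) are exactly where the locally uniform density assumption behind this classical form is weakest.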
Nearest Neighbor Algorithms for Pattern Classification
Barrios, J. O.
1972-01-01
A solution of the discrimination problem is considered by means of the minimum distance classifier, commonly referred to as the nearest neighbor (NN) rule. The NN rule is nonparametric, or distribution free, in the sense that it does not depend on any assumptions about the underlying statistics for its application. The k-NN rule is a procedure that assigns an observation vector z to a category F if most of the k nearby observations $x_i$ are elements of F. The condensed nearest neighbor (CNN) rule may be used to reduce the size of the training set required to categorize observations. The Bayes risk serves merely as a reference: the limit of excellence beyond which it is not possible to go. The risk of the NN rule is bounded below by the Bayes risk and above by twice the Bayes risk.
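The k-NN rule described above can be sketched in a few lines. This is a generic majority-vote illustration with Euclidean distance, not code from the report; the function name `knn_classify` and the toy data are assumptions.

```python
import math
from collections import Counter


def knn_classify(train, z, k=3):
    """Assign z to the most common label among its k nearest training points.

    `train` is a list of (vector, label) pairs; distance is Euclidean.
    """
    nearest = sorted(train, key=lambda pair: math.dist(pair[0], z))[:k]
    votes = Counter(label for _, label in nearest)
    # Ties are resolved in favor of the label encountered first among the neighbors.
    return votes.most_common(1)[0][0]


train = [
    ((0.0, 0.0), "A"), ((0.0, 1.0), "A"), ((1.0, 0.0), "A"),
    ((5.0, 5.0), "B"), ((5.0, 6.0), "B"), ((6.0, 5.0), "B"),
]
print(knn_classify(train, (0.5, 0.5), k=3))
```

Setting `k=1` gives the plain NN rule.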
Modified nearest neighbor phase unwrapping algorithm
Institute of Scientific and Technical Information of China (English)
CHEN Jia-feng; CHEN Hai-qing; YANG Zhen-gang
2006-01-01
Phase unwrapping is critical in interferometry because it determines the accuracy of the absolute phase value. Goldstein's branch-cut algorithm is a path-independent algorithm that first identifies the residues and then uses a nearest neighbor heuristic to link and balance them. A modified nearest neighbor algorithm is presented, based on the principle and mathematical formulation of Goldstein's algorithm and on an in-depth analysis of the key problem of phase unwrapping. It retains the advantages of Goldstein's algorithm while also solving the problem that Goldstein's algorithm cannot be used at high residue densities. It therefore extends the applicability of Goldstein's algorithm and improves the precision of phase unwrapping.
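The residue-identification step that branch-cut methods start from can be sketched as below. This shows only the standard 2x2 loop test for residues, under the assumption that the wrapped phase is given as a list of rows in radians; it is not the modified linking algorithm of the paper, and the function names are illustrative.

```python
import math

TWO_PI = 2.0 * math.pi


def wrap(delta):
    """Wrap a phase difference into the interval [-pi, pi)."""
    return (delta + math.pi) % TWO_PI - math.pi


def residues(phase):
    """Return the residue charge (-1, 0, or +1) of every 2x2 pixel loop.

    `phase` is a list of rows of wrapped phase values in radians.
    """
    rows, cols = len(phase), len(phase[0])
    out = []
    for r in range(rows - 1):
        row = []
        for c in range(cols - 1):
            # Sum wrapped differences around the closed loop
            # (r,c) -> (r,c+1) -> (r+1,c+1) -> (r+1,c) -> (r,c).
            total = (
                wrap(phase[r][c + 1] - phase[r][c])
                + wrap(phase[r + 1][c + 1] - phase[r][c + 1])
                + wrap(phase[r + 1][c] - phase[r + 1][c + 1])
                + wrap(phase[r][c] - phase[r + 1][c])
            )
            row.append(round(total / TWO_PI))
        out.append(row)
    return out
```

A nonzero entry marks a loop around which the phase cannot be unwrapped consistently; a branch-cut method then links such residues so that integration paths avoid them.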
Weighting of the k-Nearest-Neighbors
DEFF Research Database (Denmark)
Chernoff, Konstantin; Nielsen, Mads
2010-01-01
This paper presents two distribution independent weighting schemes for k-Nearest-Neighbors (kNN). Applying the first scheme in a Leave-One-Out (LOO) setting corresponds to performing complete b-fold cross validation (b-CCV), while applying the second scheme corresponds to performing bootstrapping in the limit of infinite iterations. We demonstrate that the soft kNN errors obtained through b-CCV can be obtained by applying the weighted kNN in a LOO setting, and that the proposed weighting schemes can decrease the variance and improve the generalization of kNN in a CV setting.
Evolving edited k-nearest neighbor classifiers.
Gil-Pita, Roberto; Yao, Xin
2008-12-01
The k-nearest neighbor method is a classifier based on the evaluation of the distances to each pattern in the training set. The edited version of this method consists of the application of this classifier with a subset of the complete training set in which some of the training patterns are excluded, in order to reduce the classification error rate. In recent works, genetic algorithms have been successfully applied to determine which patterns must be included in the edited subset. In this paper we propose a novel implementation of a genetic algorithm for designing edited k-nearest neighbor classifiers. It includes the definition of a novel mean square error based fitness function, a novel clustered crossover technique, and the proposal of a fast smart mutation scheme. In order to evaluate the performance of the proposed method, results using the breast cancer database, the diabetes database and the letter recognition database from the UCI machine learning benchmark repository have been included. Both error rate and computational cost have been considered in the analysis. Obtained results show the improvement achieved by the proposed editing method.
Taherkhani, Farid; Abroshan, Hadi; Akbarzadeh, Hamed; Fortunelli, Alessandro
2012-07-01
The effects of second-neighbor spin coupling interactions and a magnetic field on the free energies of a finite-size 1-D Ising model are investigated. When both the nearest neighbor (NN) and next-nearest neighbor (NNN) spin coupling interactions are ferromagnetic, the finite-size free energy first increases and then approaches a constant value for any size of the spin chain. In contrast, when the NNN and NN spin coupling interactions are antiferromagnetic and ferromagnetic, respectively, the finite-size free energy gradually decreases as the competition factor increases and eventually vanishes for large values of it. When a magnetic field is applied, the finite-size free energy decreases relative to the zero-field case for both ferromagnetic and antiferromagnetic spin coupling interactions. The deviation of the free energy per size of finite-size systems from that of the infinite system increases when the spin coupling interactions and the f parameter (the ratio of the magnetic field to the NN spin coupling interaction) increase.
Anisotropic k-Nearest Neighbor Search Using Covariance Quadtree
Marinho, Eraldo Pereira
2011-01-01
We present a variant of the hyper-quadtree that divides a multidimensional space according to the hyperplanes associated to the principal components of the data in each hyperquadrant. Each of the $2^\lambda$ hyper-quadrants is a data partition in a $\lambda$-dimension subspace, whose intrinsic dimensionality $\lambda\leq d$ is reduced from the root dimensionality $d$ by the principal components analysis, which discards the irrelevant eigenvalues of the local covariance matrix. In the present method a component is irrelevant if its length is smaller than, or comparable to, the local inter-data spacing. Thus, the covariance hyper-quadtree is fully adaptive to the local dimensionality. The proposed data-structure is used to compute the anisotropic K nearest neighbors (kNN), supported by the Mahalanobis metric. As an application, we used the present k nearest neighbors method to perform density estimation over a noisy data distribution. Such estimation method can be further incorporated to the smoothed particle h...
Frequency and Correlation of Nearest Neighboring Nucleotides in Human Genome
Jin, Neng-zhi; Liu, Zi-xian; Qiu, Wen-yuan
2009-02-01
Zipf's approach in linguistics is utilized to analyze the statistical features of frequency and correlation of 16 nearest neighboring nucleotides (AA, AC, AG, ..., TT) in 12 human chromosomes (Y, 22, 21, 20, 19, 18, 17, 16, 15, 14, 13, and 12). It is found that the statistical features of nearest neighboring nucleotides in the human genome are as follows: (i) the frequency distribution is a linear function, and (ii) the correlation distribution is an inverse function. The coefficients of the linear function and inverse function depend on the GC content. The study proposes the correlation distribution of nearest neighboring nucleotides for the first time and extends the descriptors of nearest neighboring nucleotides.
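The frequency part of such an analysis starts from counting the 16 overlapping neighbor pairs and ranking them. The sketch below illustrates only that counting step on a plain sequence string; the function names are assumptions and the correlation analysis of the paper is not reproduced.

```python
from collections import Counter
from itertools import product


def dinucleotide_frequencies(seq):
    """Relative frequency of each of the 16 overlapping neighbor pairs in seq."""
    seq = seq.upper()
    pairs = ["".join(p) for p in product("ACGT", repeat=2)]
    counts = Counter(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts[p] for p in pairs)  # pairs with other symbols (e.g. N) are ignored
    return {p: (counts[p] / total if total else 0.0) for p in pairs}


def ranked(freqs):
    """Pairs sorted from most to least frequent, as in a Zipf-style rank plot."""
    return sorted(freqs.items(), key=lambda kv: kv[1], reverse=True)


print(ranked(dinucleotide_frequencies("AAAACGT"))[:3])
```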
Efficient nearest neighbors via robust sparse hashing.
Cherian, Anoop; Sra, Suvrit; Morellas, Vassilios; Papanikolopoulos, Nikolaos
2014-08-01
This paper presents a new nearest neighbor (NN) retrieval framework: robust sparse hashing (RSH). Our approach is inspired by the success of dictionary learning for sparse coding. Our key idea is to sparse code the data using a learned dictionary, and then to generate hash codes out of these sparse codes for accurate and fast NN retrieval. But, direct application of sparse coding to NN retrieval poses a technical difficulty: when data are noisy or uncertain (which is the case with most real-world data sets), for a query point, an exact match of the hash code generated from the sparse code seldom happens, thereby breaking the NN retrieval. Borrowing ideas from robust optimization theory, we circumvent this difficulty via our novel robust dictionary learning and sparse coding framework called RSH, by learning dictionaries on the robustified counterparts of the perturbed data points. The algorithm is applied to NN retrieval on both simulated and real-world data. Our results demonstrate that RSH holds significant promise for efficient NN retrieval against the state of the art.
Monitoring nearest neighbor queries with cache strategies
Institute of Scientific and Technical Information of China (English)
PAN Peng; LU Yan-sheng
2007-01-01
The problem of continuously monitoring multiple K-nearest neighbor (K-NN) queries with dynamic object and query datasets is valuable for many location-based applications. A practical method is to partition the data space into grid cells, with both the object table and the query table indexed by this grid structure, and to solve the problem by periodically joining cells of objects with queries whose influence regions intersect the cells. In the worst case, all cells of objects will be accessed once. Object and query cache strategies are proposed to further reduce the I/O cost. With the object cache strategy, queries that remain static in the current processing cycle seldom incur I/O cost and can be returned quickly. The main I/O cost comes from moving queries; the query cache strategy restricts their search regions by using the current results of queries in the main memory buffer. The queries can share not only the accessing of object pages, but also their influence regions. A theoretical analysis of the expected I/O cost is presented, and in the experimental results the I/O cost is about 40% of that of the SEA-CNN method.
Dimensional testing for reverse k-nearest neighbor search
DEFF Research Database (Denmark)
Casanova, Guillaume; Englmeier, Elias; Houle, Michael E.
2017-01-01
Given a query object q, reverse k-nearest neighbor (RkNN) search aims to locate those objects of the database that have q among their k-nearest neighbors. In this paper, we propose an approximation method for solving RkNN queries, where the pruning operations and termination tests are guided by a...
An Affine Invariant $k$-Nearest Neighbor Regression Estimate
Biau, Gérard; Dujmovic, Vida; Krzyzak, Adam
2012-01-01
We design a data-dependent metric in $\mathbb R^d$ and use it to define the $k$-nearest neighbors of a given point. Our metric is invariant under all affine transformations. We show that, with this metric, the standard $k$-nearest neighbor regression estimate is asymptotically consistent under the usual conditions on $k$, and minimal requirements on the input data.
Blel, Sonia; Hamouda, Ajmi BH.; Mahjoub, B.; Einstein, T. L.
2017-02-01
In this paper we explore the meandering instability of vicinal steps with a kinetic Monte Carlo (kMC) simulation model that includes attractive next-nearest-neighbor (NNN) interactions. kMC simulations show that increasing the NNN interaction strength leads to a considerable reduction of the meandering wavelength and to a weaker dependence of the wavelength on the deposition rate F. The dependences of the meandering wavelength on the temperature and the deposition rate obtained from simulations are in good quantitative agreement with the experimental results on the meandering instability of Cu(0 2 24) [T. Maroutian et al., Phys. Rev. B 64, 165401 (2001), 10.1103/PhysRevB.64.165401]. The effective step stiffness is found to depend not only on the strength of NNN interactions and the Ehrlich-Schwoebel barrier, but also on F. We argue that attractive NNN interactions intensify the incorporation of adatoms at step edges and enhance step roughening. Competition between NNN and nearest-neighbor interactions results in an alternative form of meandering instability which we call "roughening-limited" growth, rather than the attachment-detachment-limited growth that governs the Bales-Zangwill instability. The computed effective wavelength and effective stiffness behave as $\lambda_{\mathrm{eff}} \sim F^{-q}$ and $\tilde{\beta}_{\mathrm{eff}} \sim F^{-p}$, respectively, with $q \approx p/2$.
Institute of Scientific and Technical Information of China (English)
尚晓丽; 包向辉
2015-01-01
The design of an intelligent transportation scheduling model is key to keeping a traffic network flowing smoothly. Traditional intelligent transportation scheduling models based on the PID control law suffer from poor survivability and robustness. An intelligent transportation scheduling model based on an improved reverse nearest neighbor query method is proposed. An onlooker-bee nectar-source search operator is introduced and, based on the improved reverse nearest neighbor query method, a feed-forward compensation dynamic game model based on rough set theory is established. Information features that constrain traffic congestion, such as the vehicle density and the weighted average vehicle speed in different lanes, are extracted and used as the output of the PID road network system, which overcomes the low scheduling control accuracy caused by the irregular growth of entities. Traffic information features are extracted with the improved reverse nearest neighbor query method based on the artificial bee colony algorithm, improving the intelligent scheduling algorithm and the control model. Simulation experiments show that the intelligent transportation scheduling model can effectively improve traffic throughput and shorten road impedance time, ensuring the smooth operation of vehicles.
Institute of Scientific and Technical Information of China (English)
李修云; 周桐; 杨智勇
2015-01-01
To enhance the stability of traffic flow, an improved car-following model containing an acceleration term is derived from the optimal velocity difference (OVD) model by considering the effect of the acceleration of the nearest-neighbor leading car on the following car. The model introduces a parameter p, which expresses how much attention the driver pays to the acceleration of the nearest-neighbor leading car. The linear stability criterion is obtained by linear stability analysis. Simulation results are compared with those of the OVD model. They show that the negative velocities that occur in the OVD model at low sensitivity can be avoided by adjusting the parameter p, and that the stability of traffic flow is enhanced. The model can therefore not only suppress traffic jams more effectively but also describe actual traffic phenomena more precisely, and it provides a theoretical basis for cooperative driving.
Statistical analysis of $k$-nearest neighbor collaborative recommendation
Biau, Gérard; Rouvière, Laurent; 10.1214/09-AOS759
2010-01-01
Collaborative recommendation is an information-filtering technique that attempts to present information items that are likely of interest to an Internet user. Traditionally, collaborative systems deal with situations with two types of variables, users and items. In its most common form, the problem is framed as trying to estimate ratings for items that have not yet been consumed by a user. Despite wide-ranging literature, little is known about the statistical properties of recommendation systems. In fact, no clear probabilistic model even exists which would allow us to precisely describe the mathematical forces driving collaborative filtering. To provide an initial contribution to this, we propose to set out a general sequential stochastic model for collaborative recommendation. We offer an in-depth analysis of the so-called cosine-type nearest neighbor collaborative method, which is one of the most widely used algorithms in collaborative filtering, and analyze its asymptotic performance as the number of user...
Frequency and Correlation of Nearest Neighboring Nucleotides in Human Genome
Institute of Scientific and Technical Information of China (English)
Neng-zhi Jin; Zi-xian Liu; Wen-yuan Qiu
2009-01-01
Zipf's approach in linguistics is utilized to analyze the statistical features of frequency and correlation of 16 nearest neighboring nucleotides (AA, AC, AG, ..., TT) in 12 human chromosomes (Y, 22, 21, 20, 19, 18, 17, 16, 15, 14, 13, and 12). It is found that the statistical features of nearest neighboring nucleotides in the human genome are as follows: (i) the frequency distribution is a linear function, and (ii) the correlation distribution is an inverse function. The coefficients of the linear function and inverse function depend on the GC content. The study proposes the correlation distribution of nearest neighboring nucleotides for the first time and extends the descriptors of nearest neighboring nucleotides.
K-nearest neighbor finding using MaxNearestDist.
Samet, Hanan
2008-02-01
Similarity searching often reduces to finding the k nearest neighbors to a query object. Finding the k nearest neighbors is achieved by applying either a depth-first or a best-first algorithm to the search hierarchy containing the data. These algorithms are generally applicable to any index based on hierarchical clustering. The idea is that the data is partitioned into clusters which are aggregated to form other clusters, with the total aggregation being represented as a tree. These algorithms have traditionally used a lower bound corresponding to the minimum distance at which a nearest neighbor can be found (termed MinDist) to prune the search process by avoiding the processing of some of the clusters as well as individual objects when they can be shown to be farther from the query object q than all of the current k nearest neighbors of q. An alternative pruning technique that uses an upper bound corresponding to the maximum possible distance at which a nearest neighbor is guaranteed to be found (termed MaxNearestDist) is described. The MaxNearestDist upper bound is adapted to enable its use for finding the k nearest neighbors instead of just the nearest neighbor (i.e., k=1) as in its previous uses. Both the depth-first and best-first k-nearest neighbor algorithms are modified to use MaxNearestDist, which is shown to enhance both algorithms by overcoming their shortcomings. In particular, for the depth-first algorithm, the number of clusters in the search hierarchy that must be examined is not increased, thereby potentially lowering its execution time, while for the best-first algorithm, the number of clusters in the search hierarchy that must be retained in the priority queue used to control the ordering of processing of the clusters is also not increased, thereby potentially lowering its storage requirements.
Ntalaperas, D
2016-01-01
We propose an architecture based on quantum cellular automata that allows the use of only one type of quantum gate per computational step in order to perform nearest neighbor interactions. The model is built in partial steps, each of them analyzed using nearest neighbor interactions, starting with single-qubit operations and continuing with two-qubit ones. The effectiveness of the model is tested and validated by developing a quantum circuit implementing the Quantum Fourier Transform. The important outcome of this validation is that the operations are performed in a local and controlled manner, thus reducing the error rate of each computational step.
Nearest-neighbor interactions, habitat fragmentation, and the persistence of host-pathogen systems.
Wodarz, Dominik; Sun, Zhiying; Lau, John W; Komarova, Natalia L
2013-09-01
Spatial interactions are known to promote stability and persistence in enemy-victim interactions if instability and extinction occur in well-mixed settings. We investigate the effect of spatial interactions in the opposite case, where populations can persist in well-mixed systems. A stochastic agent-based model of host-pathogen dynamics is considered that describes nearest-neighbor interactions in an undivided habitat. Contrary to previous notions, we find that in this setting, spatial interactions in fact promote extinction. The reason is that, in contrast to the mass-action system, the outcome of the nearest-neighbor model is governed by dynamics in small "local neighborhoods." This is an abstraction that describes interactions in a minimal grid consisting of an individual plus its nearest neighbors. The small size of this characteristic scale accounts for the higher extinction probabilities. Hence, nearest-neighbor interactions in a continuous habitat lead to outcomes reminiscent of a fragmented habitat, which is underlined further with a metapopulation model that explicitly assumes habitat fragmentation. Beyond host-pathogen dynamics, axiomatic modeling shows that our results hold for generic enemy-victim interactions under specified assumptions. These results are used to interpret a set of published experiments that provide a first step toward model testing and are discussed in the context of the literature.
Recursive nearest neighbor search in a sparse and multiscale domain for comparing audio signals
DEFF Research Database (Denmark)
Sturm, Bob L.; Daudet, Laurent
2011-01-01
We investigate recursive nearest neighbor search in a sparse domain at the scale of audio signals. Essentially, to approximate the cosine distance between the signals we make pairwise comparisons between the elements of localized sparse models built from large and redundant multiscale dictionaries...
k-Nearest Neighbors for automated classification of celestial objects
Institute of Scientific and Technical Information of China (English)
LI LiLi; ZHANG YanXia; ZHAO YongHeng
2008-01-01
The nearest neighbor (NN) classifiers, especially the k-Nearest Neighbors (kNN) algorithm, are among the simplest and yet most efficient classification rules and are widely used in practice. kNN is a nonparametric method of pattern recognition. In this paper, kNN, one of the most commonly used machine learning methods, is applied to the automatic classification of multi-wavelength astronomical objects. Through the experiment, we conclude that the running speed of the kNN classifier is rather fast and the classification accuracy is up to 97.73%. As a result, it is efficient and applicable to discriminate active objects from stars and normal galaxies with this method. The classifiers trained by the kNN method can be used to solve the automated classification problem faced by astronomy and the virtual observatory (VO).
Multiple k Nearest Neighbor Query Processing in Spatial Network Databases
DEFF Research Database (Denmark)
Xuegang, Huang; Jensen, Christian Søndergaard; Saltenis, Simonas
2006-01-01
This paper concerns the efficient processing of multiple k nearest neighbor queries in a road-network setting. The assumed setting covers a range of scenarios such as the one where a large population of mobile service users that are constrained to a road network issue nearest-neighbor queries for points of interest that are accessible via the road network. Given multiple k nearest neighbor queries, the paper proposes progressive techniques that selectively cache query results in main memory and subsequently reuse these for query processing. The paper initially proposes techniques for the case where an upper bound on k is known a priori and then extends the techniques to the case where this is not so. Based on empirical studies with real-world data, the paper offers insight into the circumstances under which the different proposed techniques can be used with advantage for multiple k nearest...
Nearest neighbor interaction in the Path Integral Renormalization Group method
de Silva, Wasanthi; Clay, R. Torsten
2014-03-01
The Path Integral Renormalization Group (PIRG) method is an efficient numerical algorithm for studying ground state properties of strongly correlated electron systems. The many-body ground state wave function is approximated by an optimized linear combination of Slater determinants which satisfies the variational principle. A major advantage of PIRG is that it does not suffer from the fermion sign problem of quantum Monte Carlo. Results are exact in the noninteracting limit and can be enhanced using space and spin symmetries. Many observables can be calculated using Wick's theorem. PIRG has been used predominantly for the Hubbard model with a single on-site Coulomb interaction U. We describe an extension of PIRG to the extended Hubbard model (EHM) including U and a nearest-neighbor interaction V. The EHM is particularly important in models of charge-transfer solids (organic superconductors) and at 1/4-filling drives a charge-ordered state. The presence of lattice frustration also makes studying these systems difficult. We test the method with comparisons to small clusters and long one dimensional chains, and show preliminary results for a coupled-chain model for the (TMTTF)2X materials. This work was supported by DOE grant DE-FG02-06ER46315.
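For reference, the extended Hubbard model is commonly written in the following textbook form. The uniform hopping amplitude $t$ and the uniform $V$ are assumptions made here for illustration; the abstract does not state the exact Hamiltonian used, and the coupled-chain model may use bond-dependent parameters.

```latex
H = -t \sum_{\langle ij \rangle, \sigma}
      \left( c^{\dagger}_{i\sigma} c_{j\sigma} + \mathrm{h.c.} \right)
    + U \sum_{i} n_{i\uparrow} n_{i\downarrow}
    + V \sum_{\langle ij \rangle} n_{i} n_{j},
\qquad n_{i} = n_{i\uparrow} + n_{i\downarrow}
```

Here $\langle ij \rangle$ runs over nearest-neighbor pairs, so setting $V = 0$ recovers the Hubbard model with only the on-site interaction $U$.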
River Flow Prediction Using the Nearest Neighbor Probabilistic Ensemble Method
Directory of Open Access Journals (Sweden)
H. Sanikhani
2016-02-01
Introduction: In recent years, researchers have become interested in probabilistic forecasting of hydrologic variables such as river flow. A probabilistic approach aims at quantifying the prediction reliability through a probability distribution function or a prediction interval for the unknown future value. The evaluation of the uncertainty associated with the forecast is seen as fundamental information, not only to correctly assess the prediction, but also to compare forecasts from different methods and to evaluate actions and decisions conditionally on the expected values. Several probabilistic approaches have been proposed in the literature, including (1) methods that use resampling techniques to assess parameter and model uncertainty, such as the Metropolis algorithm or the Generalized Likelihood Uncertainty Estimation (GLUE) methodology for an application to runoff prediction, (2) methods based on processing the forecast errors of past data to produce the probability distributions of future values and (3) methods that evaluate how the uncertainty propagates from the rainfall forecast to the river discharge prediction, such as the Bayesian forecasting system. Materials and Methods: In this study, two different probabilistic methods are used for river flow prediction, and the uncertainty related to the forecast is then quantified. One approach is based on linear predictors and the other on nearest neighbors. The nonlinear probabilistic ensemble can be used for nonlinear time series analysis using locally linear predictors, while NNPE utilizes a method adapted for one-step-ahead nearest neighbor methods. In this regard, daily river discharge (twelve years) of the Dizaj and Mashin Stations on the Baranduz-Chay basin in West Azerbaijan and the Zard-River basin in Khouzestan provinces were used, respectively. The first six years of data were used for fitting the model. The next three years were used for calibration and the remaining three years for testing the models
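A one-step-ahead nearest neighbor ensemble of the general kind referred to above can be sketched as follows. This is a generic illustration, not the NNPE procedure of the paper; the function name, the window length `m` and the neighbor count `k` are assumptions.

```python
import math


def nn_ensemble_forecast(series, m=3, k=5):
    """One-step-ahead ensemble from the k past windows most similar to the latest one.

    Returns the sorted successors of those windows; their spread gives a rough
    indication of forecast uncertainty.
    """
    if len(series) < m + 1:
        raise ValueError("series too short for the chosen window length m")
    query = series[-m:]
    candidates = []
    # Each past window series[s:s+m] is followed by the known value series[s+m].
    for s in range(len(series) - m):
        window = series[s:s + m]
        candidates.append((math.dist(window, query), series[s + m]))
    candidates.sort(key=lambda pair: pair[0])
    return sorted(value for _, value in candidates[:k])


flows = [0.0, 1.0, 2.0, 3.0] * 10  # toy periodic series standing in for daily discharge
print(nn_ensemble_forecast(flows, m=2, k=3))
```

Empirical quantiles of the returned list can serve as a simple prediction interval, with the median as a point forecast.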
Fast agglomerative clustering using a k-nearest neighbor graph.
Fränti, Pasi; Virmajoki, Olli; Hautamäki, Ville
2006-11-01
We propose a fast agglomerative clustering method using an approximate nearest neighbor graph for reducing the number of distance calculations. The time complexity of the algorithm is improved from $O(\tau N^2)$ to $O(\tau N \log N)$ at the cost of a slight increase in distortion; here, $\tau$ denotes the number of nearest neighbor updates required at each iteration. According to the experiments, a relatively small neighborhood size is sufficient to maintain the quality close to that of the full search.
k-Nearest Neighbors Algorithm in Profiling Power Analysis Attacks
Directory of Open Access Journals (Sweden)
Z. Martinasek
2016-06-01
Power analysis presents the typical example of successful attacks against trusted cryptographic devices such as RFID (Radio-Frequency IDentification) and contact smart cards. In recent years, the cryptographic community has explored new approaches in power analysis based on machine learning models such as Support Vector Machine (SVM), Random Forest (RF) and Multi-Layer Perceptron (MLP). In this paper, we made an extensive comparison of machine learning algorithms in power analysis. For this purpose, we implemented a verification program that always chooses the optimal settings of individual machine learning models in order to obtain the best classification accuracy. In our research, we used three datasets, the first containing the power traces of an unprotected AES (Advanced Encryption Standard) implementation. The second and third datasets were created independently from publicly available power traces corresponding to a masked AES implementation (DPA Contest v4). The obtained results revealed some interesting facts, namely, that an elementary k-NN (k-Nearest Neighbors) algorithm, which has not been commonly used in power analysis yet, shows great application potential in practice.
Latching chains in K-nearest-neighbor and modular small-world networks.
Song, Sanming; Yao, Hongxun; Simonov, Alexander Yurievich
2015-01-01
Latching dynamics retrieve pattern sequences successively by neural adaption and pattern correlation. We have previously proposed a modular latching chain model in Song et al. (2014) to better accommodate the structured transitions in the brain. Different cortical areas have different network structures. To explore how structural parameters like rewiring probability, threshold, noise and feedback connections affect the latching dynamics, two different connection schemes, K-nearest-neighbor network and modular network both having modular structure are considered. Latching chains are measured using two proposed measures characterizing length of intra-modular latching chains and sequential inter-modular association transitions. Our main findings include: (1) With decreasing threshold coefficient and rewiring probability, both the K-nearest-neighbor network and the modular network experience quantitatively similar phase change processes. (2) The modular network exhibits selectively enhanced latching in the small-world range of connectivity. (3) The K-nearest-neighbor network is more robust to changes in rewiring probability, while the modular network is more robust to the presence of noise pattern pairs and to changes in the strength of feedback connections. According to our findings, the relationships between latching chains in K-nearest-neighbor and modular networks and different forms of cognition and information processing emerging in the brain are discussed.
Impact of the Next-Nearest-Neighbor Interaction on Traffic Flow of Highway with Slopes
Institute of Scientific and Technical Information of China (English)
李志鹏; 周盈
2012-01-01
In this paper, we study the motion of traffic flow on the slopes of a highway by applying a microscopic traffic model that takes into account the next-nearest-neighbor interaction in an intelligent transportation system environment. Three common gradients of the highway, namely sag terrain, uphill terrain, and downhill terrain on a single-lane roadway, are selected to clarify the impact of the next-nearest-neighbor interaction in relative velocity on the traffic flow. We obtain the current-density relation for traffic flow on the sag, the uphill and the downhill under the next-nearest-neighbor interaction strategy. It is observed that the current saturates when the density is greater than a critical value and that the current decreases when the density is greater than another critical value. When the density falls into the intermediate range between the two critical densities, it is also found that the oscillatory jam, which easily leads to traffic accidents, often appears in the downhill stage, and that the next-nearest-neighbor interaction in relative velocity has a strong suppressing effect on this kind of dangerous congestion. A theoretical analysis is also presented to explain this important conclusion.
Nearest Neighbor Algorithm in Handwritten Character
Directory of Open Access Journals (Sweden)
P. R. Deshmukh
2014-07-01
The proposed system extracts the geometric features of the character contour and gives a feature vector as its output. The feature vectors generated from a training set are then used to train a pattern recognition engine based on neural networks so that the system can be benchmarked. An attempt was made to develop a system that uses the methods that humans use to perceive handwritten characters. Hence a system that recognizes handwritten characters using pattern recognition was developed. The data generated by comparing two images was stored in Excel format, and that data was then used as an individual input for generating the Simulink diagram. Pattern recognition can be used to model human perception. The mathematics that pattern recognition requires is extremely fundamental. Any algorithm developed using pattern recognition requires relatively simple and short calculations. Because the calculations are simple, they can be implemented on any hardware or software platform without concern for computing power. The first part of this paper introduces character recognition. The second part gives a short introduction to neural network implementation for image processing using MATLAB.
Secure Nearest Neighbor Query on Crowd-Sensing Data.
Cheng, Ke; Wang, Liangmin; Zhong, Hong
2016-09-22
Nearest neighbor queries are fundamental in location-based services, and secure nearest neighbor queries mainly focus on how to securely and quickly retrieve the nearest neighbor from an outsourced cloud server. However, the previous big data system structure has changed because of crowd-sensing data. On the one hand, the sensing data terminals that act as data owners are numerous and mistrustful; on the other hand, in most cases, the terminals find it difficult to complete many security operations because of computation and storage capability constraints. In light of the Multi Owners and Multi Users (MOMU) situation in the crowd-sensing data cloud environment, this paper presents a secure nearest neighbor query scheme based on a proxy server architecture, which is constructed from secure two-party computation protocols and a secure Voronoi diagram algorithm. It not only preserves data confidentiality and query privacy but also effectively resists collusion between the cloud server and the data owners or users. Finally, extensive theoretical and experimental evaluations are presented to show that our proposed scheme achieves a superior balance between security and query performance compared to other schemes.
Secure Nearest Neighbor Query on Crowd-Sensing Data
Directory of Open Access Journals (Sweden)
Ke Cheng
2016-09-01
Nearest neighbor queries are fundamental in location-based services, and secure nearest neighbor queries mainly focus on how to securely and quickly retrieve the nearest neighbor from an outsourced cloud server. However, the previous big data system structure has changed because of crowd-sensing data. On the one hand, the sensing data terminals that act as data owners are numerous and mistrustful; on the other hand, in most cases, the terminals find it difficult to complete many security operations because of computation and storage capability constraints. In light of the Multi Owners and Multi Users (MOMU) situation in the crowd-sensing data cloud environment, this paper presents a secure nearest neighbor query scheme based on a proxy server architecture, which is constructed from secure two-party computation protocols and a secure Voronoi diagram algorithm. It not only preserves data confidentiality and query privacy but also effectively resists collusion between the cloud server and the data owners or users. Finally, extensive theoretical and experimental evaluations are presented to show that our proposed scheme achieves a superior balance between security and query performance compared to other schemes.
Fully Retroactive Approximate Range and Nearest Neighbor Searching
Goodrich, Michael T
2011-01-01
We describe fully retroactive dynamic data structures for approximate range reporting and approximate nearest neighbor reporting. We show how to maintain, for any positive constant $d$, a set of $n$ points in $\R^d$ indexed by time such that we can perform insertions or deletions at any point in the timeline in $O(\log n)$ amortized time. We support, for any small constant $\epsilon>0$, $(1+\epsilon)$-approximate range reporting queries at any point in the timeline in $O(\log n + k)$ time, where $k$ is the output size. We also show how to answer $(1+\epsilon)$-approximate nearest neighbor queries for any point in the past or present in $O(\log n)$ time.
Nearest-neighbor Entropy Estimators with Weak Metrics
Timofeev, Evgeniy
2012-01-01
A problem of improving the accuracy of nonparametric entropy estimation for a stationary ergodic process is considered. New weak metrics are introduced and relations between metrics, measures, and entropy are discussed. Based on weak metrics, a new nearest-neighbor entropy estimator is constructed and has a parameter with which the estimator is optimized to reduce its bias. It is shown that estimator's variance is upper-bounded by a nearly optimal Cramer-Rao lower bound.
Stabilization and enhancement of traffic flow by the next-nearest-neighbor interaction.
Nagatani, T
1999-12-01
The car-following model of traffic is extended to take into account the car interaction before the next car ahead (the next-nearest-neighbor interaction). The traffic behavior of the extended car-following model is investigated numerically and analytically. It is shown that the next-nearest-neighbor interaction stabilizes the traffic flow. The jamming transition between the freely moving and jammed phases occurs at a higher density than the threshold of the original car-following model. By increasing the maximal velocity, the traffic current is enhanced without jam by the stabilization effect. The jamming transition is analyzed with the use of the linear stability and nonlinear perturbation methods. The traffic jam is described by the kink solution of the modified Korteweg-de Vries equation. The theoretical coexisting curve is in good agreement with the simulation result.
Fast and accurate hashing via iterative nearest neighbors expansion.
Jin, Zhongming; Zhang, Debing; Hu, Yao; Lin, Shiding; Cai, Deng; He, Xiaofei
2014-11-01
Recently, hashing techniques have been widely applied to the approximate nearest neighbor search problem in many real applications. The basic idea of these approaches is to generate binary codes for data points that preserve the similarity between any two of them. Given a query, instead of performing a linear scan of the entire database, the hashing method can perform a linear scan of only the points whose Hamming distance to the query is not greater than $r_h$, where $r_h$ is a constant. However, in order to find the true nearest neighbors, both the locating time and the linear scan time are proportional to $O(\sum_{i=0}^{r_h} \binom{c}{i})$ ($c$ is the code length), which increases exponentially as $r_h$ increases. To address this limitation, we propose a novel algorithm named iterative expanding hashing in this paper, which builds an auxiliary index based on an offline constructed nearest neighbor table to avoid large $r_h$. This auxiliary index can be easily combined with all the traditional hashing methods. Extensive experimental results over various real large-scale datasets demonstrate the superiority of the proposed approach.
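To make the growth of the scan cost concrete, the following minimal sketch counts the binary codes that lie within a given Hamming radius of a query code, which is the sum of binomial coefficients quoted in the abstract. The function name `hamming_ball_size` and the 64-bit code length are illustrative assumptions, not taken from the paper.

```python
from math import comb


def hamming_ball_size(code_length: int, radius: int) -> int:
    """Count binary codes of the given length within `radius` bit flips of a query."""
    # One term per exact distance i: choose which i of the c bits differ.
    return sum(comb(code_length, i) for i in range(radius + 1))


if __name__ == "__main__":
    # Hypothetical 64-bit codes: the number of buckets to probe grows quickly.
    for radius in range(5):
        print(radius, hamming_ball_size(64, radius))
```

Even a small increase in the radius multiplies the number of buckets to probe, which is the limitation the auxiliary index is meant to avoid.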
[Galaxy/quasar classification based on nearest neighbor method].
Li, Xiang-Ru; Lu, Yu; Zhou, Jian-Ming; Wang, Yong-Jun
2011-09-01
With the wide application of high-quality CCDs in celestial spectrum imagery and the implementation of many large sky survey programs (e.g., the Sloan Digital Sky Survey (SDSS), the Two-degree-Field Galaxy Redshift Survey (2dF), the Spectroscopic Survey Telescope (SST), the Large Sky Area Multi-Object Fiber Spectroscopic Telescope (LAMOST) program, and the Large Synoptic Survey Telescope (LSST) program), celestial observational data are arriving like torrential rain. Therefore, to utilize them effectively and fully, research on automated processing methods for celestial data is imperative. In the present work, we investigated how to recognize galaxies and quasars from spectra based on the nearest neighbor method. Galaxies and quasars are extragalactic objects; they are far away from Earth, and their spectra are usually contaminated by various kinds of noise. Therefore, recognizing these two types of spectra is a typical problem in automatic spectra classification. Furthermore, the method used, nearest neighbor, is one of the most typical, classic, and mature algorithms in pattern recognition and data mining, and is often used as a benchmark in developing novel algorithms. For applicability in practice, it is shown that the recognition ratio of the nearest neighbor method (NN) is comparable to the best results reported in the literature based on more complicated methods, and the advantage of NN is that it does not need to be trained, which is useful in incremental learning and parallel computation in mass spectral data processing. In conclusion, the results of this work are helpful for studying the classification of galaxy and quasar spectra.
Rainfall forecast in northeast of Thailand using modified k-nearest neighbor
Directory of Open Access Journals (Sweden)
Uruya Weesakul
2014-06-01
Since damage from natural disasters has increased due to anomalous global climate, scientists and engineers are interested in studying the occurrence of natural disasters. Thailand faces floods in the wet season and droughts in the dry season every year. The Northeast of Thailand is a region where damage from disasters is especially common. This study developed a statistical model for forecasting rainfall in the Chi River Basin using large-scale atmospheric variables (LAV) as the independent variables of the modified k-nearest neighbor model. The significant LAV were identified over both the Indian and Pacific Oceans. The model performance was evaluated using box plots of 3-month rainfall, to show how well the model can capture the historical data, and the likelihood skill score (LLH). In both model evaluations, approximately 62% of the historical rainfall data was captured by the forecasting model. The LLH of rainfall ensembles in the Chi River Basin is quite good, and better LLH is found after 2000, especially for June-August and July-September rainfall.
Ito, T; Tanimoto, M; Ito, Toshiaki; Okamura, Naotoshi; Tanimoto, Morimitsu
1998-01-01
We propose the Fritzsch-Branco-Silva-Marcos type fermion mass matrix, which is a typical texture in the nearest-neighbor interaction form, in SU(5) GUT. By evolution of the mass matrices with SU(5) GUT relations in the minimal SUSY standard model, we obtain predictions for the unitarity triangle of CP violation as well as the quark flavor mixing angles, which are consistent with experimental data, in the case of $\tan\beta \simeq 3$.
Classification of EEG Signals using adaptive weighted distance nearest neighbor algorithm
Directory of Open Access Journals (Sweden)
E. Parvinnia
2014-01-01
Electroencephalogram (EEG) signals are often used to diagnose conditions such as seizures, Alzheimer's disease, and schizophrenia. One main problem with recorded EEG samples is that they are not equally reliable because of artifacts at the time of recording. EEG signal classification algorithms should have a mechanism to handle this issue. Adaptive classifiers appear to be useful for biological signals such as EEG. In this paper, a general adaptive method named weighted distance nearest neighbor (WDNN) is applied to EEG signal classification to tackle this problem. This classification algorithm assigns a weight to each training sample to control its influence in classifying test samples. The weights of the training samples are used to find the nearest neighbor of an input query pattern. To assess the performance of this scheme, EEG signals of thirteen schizophrenic patients and eighteen normal subjects are analyzed for the classification of these two groups. Several features, including fractal dimension, band power, and autoregressive (AR) model coefficients, are extracted from the EEG signals. The classification results are evaluated using leave-one-(subject)-out cross-validation for reliable estimation. The results indicate that the combination of WDNN and the selected features can significantly outperform the basic nearest-neighbor method and the other methods proposed in the past for the classification of these two groups. Therefore, this method can be a complementary tool for specialists in distinguishing schizophrenia.
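The idea of letting a per-sample weight control each training sample's influence can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes that a sample's weight divides its Euclidean distance to the query (so a larger weight makes a sample more likely to win) and that a zero weight removes the sample. The abstract does not say how the weights are learned, so they are simply passed in here, and the name `wdnn_predict` is hypothetical.

```python
import math


def wdnn_predict(query, samples, labels, weights):
    """Return the label of the training sample with the smallest weighted distance."""
    best_label, best_dist = None, math.inf
    for x, y, w in zip(samples, labels, weights):
        if w <= 0:
            continue  # assumption: a non-positive weight means the sample is ignored
        d = math.dist(query, x) / w  # assumption: the weight scales the distance down
        if d < best_dist:
            best_label, best_dist = y, d
    return best_label
```

With all weights equal to 1 this reduces to the basic nearest-neighbor rule; lowering the weight of an unreliable recording pushes it out of contention without deleting it from the training set.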
Boundary effect correction in k-nearest-neighbor estimation.
Alizad Rahvar, A R; Ardakani, M
2011-05-01
The problem of the boundary effect for the k-nearest-neighbor (kNN) estimation is addressed, and a correction method is suggested. The correction is proposed for bounded distributions, but it can be used for any set of bounded samples. We apply the proposed correction to entropy estimation of multidimensional distributions and time series, and this correction reduces considerably the bias and statistical errors in the estimation. For a small sample size or high-dimensional data, the corrected estimator outperforms the uncorrected estimator significantly. This advantage makes the kNN method applicable to more real-life situations, e.g., the analysis of biological and molecular data.
Nearest and reverse nearest neighbor queries for moving objects
DEFF Research Database (Denmark)
Benetis, R.; Jensen, Christian Søndergaard; Karciauskas, G.
2006-01-01
With the continued proliferation of wireless communications and advances in positioning technologies, algorithms for efficiently answering queries about large populations of moving objects are gaining in interest. This paper proposes algorithms for k nearest and reverse k nearest neighbor queries...... on the current and anticipated future positions of points moving continuously in the plane. The former type of query returns k objects nearest to a query object for each time point during a time interval, while the latter returns the objects that have a specified query object as one of their k closest neighbors...
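The two query types can be illustrated with a brute-force sketch evaluated at a single time instant. This is only meant to pin down the definitions; it is not the indexed algorithm the paper proposes. It assumes each object moves linearly, stored as a hypothetical `((x, y), (vx, vy))` pair, and all function names are illustrative.

```python
import math


def positions_at(objects, t):
    """Positions at time t of objects given as ((x, y), (vx, vy)) with linear motion."""
    return [(x + vx * t, y + vy * t) for (x, y), (vx, vy) in objects]


def k_nearest(points, q_idx, k):
    """Indices of the k points nearest to points[q_idx], excluding the query itself."""
    others = [i for i in range(len(points)) if i != q_idx]
    others.sort(key=lambda i: math.dist(points[i], points[q_idx]))
    return others[:k]


def reverse_k_nearest(points, q_idx, k):
    """Indices of the points that count points[q_idx] among their own k nearest."""
    return [i for i in range(len(points))
            if i != q_idx and q_idx in k_nearest(points, i, k)]
```

Answering a query over a time interval would repeat this at every instant where the ranking can change, which is the cost an index on anticipated positions is designed to avoid.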
Mapping of second-nearest-neighbor fluoride ions of orthorhombic Gd$^{3+}$-Ag$^{+}$ complexes in CaF$_2$
Nakata, R.; Den Hartog, H. W.
The ENDOR technique is applied to determine the positions of 24 second-nearest-neighbor F$^-$ ions around an orthorhombic Gd$^{3+}$-Ag$^{+}$ complex in CaF$_2$ crystals. Experimental ENDOR data of the second-nearest-neighbor F$^-$ ions are analyzed by using the usual spin Hamiltonian and a least-squares fitting method. The best fits of the experimental results give superhyperfine (shf) constants and the F$^-$ directions ($K$, $L$, $M$) with respect to the Gd$^{3+}$ ion, from which the distance between the second-nearest-neighbor F$^-$ ion and the Gd$^{3+}$ ion is determined by assuming that the hyperfine interaction is due to the classical dipole-dipole interaction. The displacements of the F$^-$ ions are estimated and compared with the theoretical values calculated by Bijvank and den Hartog on the basis of a polarizable point charge model.
Using K-Nearest Neighbor in Optical Character Recognition
Directory of Open Access Journals (Sweden)
Veronica Ong
2016-03-01
The growth in computer vision technology has aided society with various kinds of tasks. One of these tasks is recognizing text contained in an image, usually referred to as Optical Character Recognition (OCR). There are many kinds of algorithms that can be implemented in an OCR system. K-Nearest Neighbor, one of the most influential machine learning algorithms, is one of them. This research aims to examine the process behind the OCR mechanism when using the K-Nearest Neighbor algorithm. It also aims to find out how precise the algorithm is in an OCR program. To do that, a simple OCR program that classifies capital letters of the alphabet was built to produce and compare real results. The research yielded a maximum accuracy of 76.9% with 200 training samples per letter. A set of reasons is also given as to why the program is able to reach this level of accuracy.
Directory of Open Access Journals (Sweden)
Cobaugh Christian W
2004-08-01
Background: A detailed understanding of an RNA's correct secondary and tertiary structure is crucial to understanding its function and mechanism in the cell. Free energy minimization with energy parameters based on the nearest-neighbor model and comparative analysis are the primary methods for predicting an RNA's secondary structure from its sequence. Version 3.1 of Mfold has been available since 1999. This version contains an expanded sequence dependence of energy parameters and the ability to incorporate coaxial stacking into free energy calculations. We test Mfold 3.1 by performing the largest and most phylogenetically diverse comparison of rRNA and tRNA structures predicted by comparative analysis and Mfold, and we use the results of our tests on 16S and 23S rRNA sequences to assess the improvement between Mfold 2.3 and Mfold 3.1. Results: The average prediction accuracy for a 16S or 23S rRNA sequence with Mfold 3.1 is 41%, while the prediction accuracies for the majority of 16S and 23S rRNA structures tested are between 20% and 60%, with some having less than 20% prediction accuracy. The average prediction accuracy was 71% for 5S rRNA and 69% for tRNA. The majority of the 5S rRNA and tRNA sequences have prediction accuracies greater than 60%. The prediction accuracy of 16S rRNA base-pairs decreases exponentially as the number of nucleotides intervening between the 5' and 3' halves of the base-pair increases. Conclusion: Our analysis indicates that the current set of nearest-neighbor energy parameters in conjunction with the Mfold folding algorithm is unable to consistently and reliably predict an RNA's correct secondary structure. For 16S or 23S rRNA structure prediction, Mfold 3.1 offers little improvement over Mfold 2.3. However, the nearest-neighbor energy parameters do work well for shorter RNA sequences such as tRNA or 5S rRNA, or for larger rRNAs when the contact distance between the base-pairs is less than 100 nucleotides.
Designing lattice structures with maximal nearest-neighbor entanglement
Energy Technology Data Exchange (ETDEWEB)
Navarro-Munoz, J C; Lopez-Sandoval, R [Instituto Potosino de Investigacion Cientifica y Tecnologica, Camino a la presa San Jose 2055, 78216 San Luis Potosi (Mexico)]; Garcia, M E [Theoretische Physik, FB 18, Universitaet Kassel and Center for Interdisciplinary Nanostructure Science and Technology (CINSaT), Heinrich-Plett-Str. 40, 34132 Kassel (Germany)]
2009-08-07
In this paper, we study the numerical optimization of nearest-neighbor concurrence of bipartite one- and two-dimensional lattices, as well as non-bipartite two-dimensional lattices. These systems are described in the framework of a tight-binding Hamiltonian while the optimization of concurrence was performed using genetic algorithms. Our results show that the concurrence of the optimized lattice structures is considerably higher than that of non-optimized systems. In the case of one-dimensional chains, the concurrence increases dramatically when the system begins to dimerize, i.e., it undergoes a structural phase transition (Peierls distortion). This result is consistent with the idea that entanglement is maximal or shows a singularity near quantum phase transitions. Moreover, the optimization of concurrence in two-dimensional bipartite and non-bipartite lattices is achieved when the structures break into smaller subsystems, which are arranged in geometrically distinguishable configurations.
IPADE: Iterative prototype adjustment for nearest neighbor classification.
Triguero, Isaac; Garcia, Salvador; Herrera, Francisco
2010-12-01
Nearest prototype methods are a successful trend of many pattern classification tasks. However, they present several shortcomings such as time response, noise sensitivity, and storage requirements. Data reduction techniques are suitable to alleviate these drawbacks. Prototype generation is an appropriate process for data reduction, which allows the fitting of a dataset for nearest neighbor (NN) classification. This brief presents a methodology to learn iteratively the positioning of prototypes using real parameter optimization procedures. Concretely, we propose an iterative prototype adjustment technique based on differential evolution. The results obtained are contrasted with nonparametric statistical tests and show that our proposal consistently outperforms previously proposed methods, thus becoming a suitable tool in the task of enhancing the performance of the NN classifier.
Approximate Nearest Neighbor Search for a Dataset of Normalized Vectors
Terasawa, Kengo; Tanaka, Yuzuru
This paper describes a novel algorithm for approximate nearest neighbor searching. For solving this problem, especially in high-dimensional spaces, one of the best-known algorithms is Locality-Sensitive Hashing (LSH). This paper presents a variant of the LSH algorithm that outperforms previously proposed methods when the dataset consists of vectors normalized to unit length, which is often the case in pattern recognition. The LSH scheme is based on a family of hash functions that preserves the locality of points. This paper points out that for this special case we can design efficient hash functions that map a point on the hypersphere to the closest vertex of a randomly rotated regular polytope. The computational analysis confirmed that the proposed method improves the exponent ρ, the main indicator of the performance of the LSH algorithm. The practical experiments also supported the efficiency of our algorithm in both time and space.
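Mapping a unit vector to the closest vertex of a rotated polytope can be sketched for one specific choice of polytope, the orthoplex (cross-polytope), whose vertices are the signed coordinate axes. The sketch rests on two assumptions that are not from the paper: a matrix of independent Gaussian entries stands in for a true random rotation, and the function names are hypothetical.

```python
import random


def make_rotation_like_matrix(dim, seed=0):
    """Assumption: an i.i.d. Gaussian matrix used as a stand-in for a random rotation."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(dim)]


def orthoplex_hash(vector, matrix):
    """Hash = (axis index, sign) of the orthoplex vertex closest to the transformed vector."""
    projected = [sum(m * v for m, v in zip(row, vector)) for row in matrix]
    # The closest vertex among +/- e_j is the axis with the largest absolute coordinate.
    j = max(range(len(projected)), key=lambda i: abs(projected[i]))
    return (j, 1 if projected[j] >= 0 else -1)
```

Nearby points on the hypersphere tend to share the same dominant axis after the transform and therefore collide in the same bucket, while each hash needs only one matrix-vector product.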
Nearest Neighbor Estimates of Entropy for Multivariate Circular Distributions
Directory of Open Access Journals (Sweden)
Neeraj Misra
2010-05-01
In molecular sciences, the estimation of entropies of molecules is important for the understanding of many chemical and biological processes. Motivated by these applications, we consider the problem of estimating the entropies of circular random vectors and introduce non-parametric estimators based on circular distances between n sample points and their k-th nearest neighbors (NN), where k (≤ n – 1) is a fixed positive integer. The proposed NN estimators are based on two different circular distances, and are proven to be asymptotically unbiased and consistent. The performance of one of the circular-distance estimators is investigated and compared with that of the already established Euclidean-distance NN estimator using Monte Carlo samples from an analytic distribution of six circular variables with an exactly known entropy and a large sample of seven internal-rotation angles in the molecule of tartaric acid, obtained by a realistic molecular-dynamics simulation.
Enhanced Approximate Nearest Neighbor via Local Area Focused Search.
Energy Technology Data Exchange (ETDEWEB)
Gonzales, Antonio [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Blazier, Nicholas Paul [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)
2017-02-01
Approximate Nearest Neighbor (ANN) algorithms are increasingly important in machine learning, data mining, and image processing applications. There is a large family of space-partitioning ANN algorithms, such as randomized KD-Trees, that work well in practice but are limited by an exponential increase in similarity comparisons required to optimize recall. Additionally, they only support a small set of similarity metrics. We present Local Area Focused Search (LAFS), a method that enhances the way queries are performed using an existing ANN index. Instead of a single query, LAFS performs a number of smaller (fewer similarity comparisons) queries and focuses on a local neighborhood which is refined as candidates are identified. We show that our technique improves performance on several well-known datasets and is easily extended to general similarity metrics using kernel projection techniques.
Anomaly Detection with Score functions based on Nearest Neighbor Graphs
Zhao, Manqi
2009-01-01
We propose a novel non-parametric adaptive anomaly detection algorithm for high dimensional data based on score functions derived from nearest neighbor graphs on $n$-point nominal data. Anomalies are declared whenever the score of a test sample falls below $\alpha$, which is supposed to be the desired false alarm level. The resulting anomaly detector is shown to be asymptotically optimal in that it is uniformly most powerful for the specified false alarm level, $\alpha$, for the case when the anomaly density is a mixture of the nominal and a known density. Our algorithm is computationally efficient, being linear in dimension and quadratic in data size. It does not require choosing complicated tuning parameters or function approximation classes and it can adapt to local structure such as local change in dimensionality. We demonstrate the algorithm on both artificial and real data sets in high dimensional feature spaces.
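A score of this kind can be sketched as an empirical rank: the fraction of nominal points whose own k-th nearest-neighbor distance is at least as large as that of the test sample. This is an illustrative reading of the abstract rather than the authors' exact statistic. The use of the k-th neighbor distance, the leave-one-out treatment of nominal points, and all names below are assumptions.

```python
import math


def knn_distance(x, data, k):
    """Distance from x to its k-th nearest point in data."""
    return sorted(math.dist(x, p) for p in data)[k - 1]


def anomaly_score(test, nominal, k):
    """Fraction of nominal points at least as isolated as the test sample."""
    g_test = knn_distance(test, nominal, k)
    count = 0
    for i, x in enumerate(nominal):
        rest = nominal[:i] + nominal[i + 1:]  # leave the point itself out
        if knn_distance(x, rest, k) >= g_test:
            count += 1
    return count / len(nominal)


def is_anomaly(test, nominal, k, alpha):
    """Declare an anomaly when the score falls below the false alarm level alpha."""
    return anomaly_score(test, nominal, k) < alpha
```

Each score compares the test sample against all n nominal points, each of which needs its own neighbor search, which matches the quadratic dependence on data size mentioned in the abstract.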
Approximate aggregate nearest neighbor search on moving objects trajectories
Institute of Scientific and Technical Information of China (English)
Mohammad Reza Abbasifard; Hassan Naderi; Zohreh Fallahnejad; Omid Isfahani Alamdari
2015-01-01
Aggregate nearest neighbor (ANN) search retrieves, for two spatial datasets T and Q, the segment(s) of one or more trajectories from the set T that have the minimum aggregate distance to the points in Q. When a large number of trajectories is involved, this process can be very time-consuming because of consecutive page loads. An approximate method for finding segments with minimum aggregate distance is proposed, which can improve the response time. In order to index large volumes of trajectories, the scalable and efficient trajectory index (SETI) structure is used, with some refinements to the temporal index of SETI to improve the performance of the proposed method. The experiments were performed with different numbers of query points and percentages of the dataset. It is shown that the proposed method, besides having an acceptable precision, can reduce the computation time significantly. It is also shown that, among load time, ANN, and computing the convex hull and centroid, the main fraction of the search time is related to ANN.
Nearest neighbor search algorithm for GBD tree spatial data structure
Institute of Scientific and Technical Information of China (English)
Yutaka Ohsawa; Takanobu Kurihara; Ayaka Ohki
2007-01-01
This paper describes the nearest neighbor (NN) search algorithm on the GBD (generalized BD) tree. The GBD tree is a spatial data structure suitable for two- or three-dimensional data and has good performance characteristics with respect to the dynamic data environment. On GIS and CAD systems, the R-tree and its successors have been used. In addition, the NN search algorithm is also proposed in an attempt to obtain good performance from the R-tree. On the other hand, the GBD tree is superior to the R-tree with respect to exact match retrieval, because the GBD tree has auxiliary data that uniquely determines the position of the object in the structure. The proposed NN search algorithm depends on the property of the GBD tree described above. The NN search algorithm on the GBD tree was studied and the performance thereof was evaluated through experiments.
ENN: Extended Nearest Neighbor Method for Pattern Recognition [Research Frontier
2015-07-16
dimensional data (i.e., the curse of dimensionality problem). Feature normalization and dimensionality reduction methods are able to remedy this issue...2.77 KNOWLEDGE 23.93 ± 4.69 27.11 ± 4.45 12.66 ± 2.45 6.97 ± 2.53 14.42 ± 3.86 VERTEBRAL 35.13 ± 4.83 37.64 ± 5.06 47.93 ± 3.41 36.88 ± 4.83 45.11...Workshops, 2008, pp. 1–6. [29] P. Indyk and R. Motwani, “Approximate nearest neighbors: Towards removing the curse of dimensionality,” in Proc. 30th
FINDING A RESIDENCE WITH ALL FACILITIES USING NEAREST NEIGHBOR SEARCH
Directory of Open Access Journals (Sweden)
K. Padmapriya
2014-01-01
Nearest neighbor search is one of the most widely used techniques, with applications including mobile communication, geographic information systems, bioinformatics, computer vision, and marketing. For example, four friends want to rent an apartment that should be near their workplaces. This paper discusses the problem of finding the most appropriate location among a set of available places. The problem is defined as a top-k query, which outputs k points from a set of available places P along with their conveniences. We propose algorithms based on R-trees to answer the query exactly. The efficiency of the proposed algorithms is verified through various experiments using large-scale real datasets, and they are found to perform better than existing algorithms.
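The top-k query in the apartment example can be sketched by brute force: rank every candidate place by its total distance to the group's workplaces and keep the k best. This is a minimal illustration under stated assumptions, not the R-tree algorithm from the paper. It assumes the aggregate is the sum of Euclidean distances and leaves out the "conveniences" attributes, and the name `top_k_places` is hypothetical.

```python
import heapq
import math


def top_k_places(places, query_points, k):
    """Return the k places with the smallest summed distance to all query points."""
    def total_distance(place):
        return sum(math.dist(place, q) for q in query_points)

    # nsmallest returns the places ordered from best to worst.
    return heapq.nsmallest(k, places, key=total_distance)
```

This scans every place for every query; a spatial index such as an R-tree prunes most candidates without computing their distances.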
Query-Adaptive Reciprocal Hash Tables for Nearest Neighbor Search.
Liu, Xianglong; Deng, Cheng; Lang, Bo; Tao, Dacheng; Li, Xuelong
2016-02-01
Recent years have witnessed the success of binary hashing techniques in approximate nearest neighbor search. In practice, multiple hash tables are usually built using hashing to cover more desired results in the hit buckets of each table. However, little work has studied a unified approach to constructing multiple informative hash tables using any type of hashing algorithm. Meanwhile, multiple table search also lacks a generic query-adaptive and fine-grained ranking scheme that can alleviate the binary quantization loss suffered in the standard hashing techniques. To solve the above problems, in this paper, we first regard the table construction as a selection problem over a set of candidate hash functions. With the graph representation of the function set, we propose an efficient solution that sequentially applies the normalized dominant set to find the most informative and independent hash functions for each table. To further reduce the redundancy between tables, we explore the reciprocal hash tables in a boosting manner, where the hash function graph is updated with high weights emphasized on the misclassified neighbor pairs of previous hash tables. To refine the ranking of the retrieved buckets within a certain Hamming radius from the query, we propose a query-adaptive bitwise weighting scheme to enable fine-grained bucket ranking in each hash table, exploiting the discriminative power of its hash functions and their complement for nearest neighbor search. Moreover, we integrate this scheme into the multiple table search using a fast, yet reciprocal table lookup algorithm within the adaptive weighted Hamming radius. In this paper, both the construction method and the query-adaptive search method are general and compatible with different types of hashing algorithms using different feature spaces and/or parameter settings. Our extensive experiments on several large-scale benchmarks demonstrate that the proposed techniques can significantly outperform both
Sann: solvent accessibility prediction of proteins by nearest neighbor method.
Joo, Keehyoung; Lee, Sung Jong; Lee, Jooyoung
2012-07-01
We present a method to predict the solvent accessibility of proteins which is based on a nearest neighbor method applied to the sequence profiles. Using the method, continuous real-value prediction as well as two-state and three-state discrete predictions can be obtained. The method utilizes the z-score value of the distance measure in the feature vector space to estimate the relative contribution among the k-nearest neighbors for prediction of the discrete and continuous solvent accessibility. The solvent accessibility database is constructed from 5717 proteins extracted from the PISCES culling server with a cutoff of 25% sequence identity. Using optimal parameters, prediction accuracies (for discrete predictions) of 78.38% (two-state prediction with the threshold of 25%) and 65.1% (three-state prediction with the thresholds of 9 and 36%), and a Pearson correlation coefficient (between the predicted and true RSA's for continuous prediction) of 0.676 are achieved. An independent benchmark test was performed with the CASP8 targets, where we find that the proposed method outperforms existing methods. The prediction accuracies are 80.89% (for two-state prediction with the threshold of 25%) and 67.58% (three-state prediction), and the Pearson correlation coefficient is 0.727 (for continuous prediction) with a mean absolute error of 0.148. We have also investigated the effect of increasing database sizes on the prediction accuracy, where additional improvement in the accuracy is observed as the database size increases. The SANN web server is available at http://lee.kias.re.kr/~newton/sann/.
Zandvliet, Henricus J.W.
2015-01-01
We have derived within the framework of a solid-on-solid model with anisotropic nearest-neighbor interactions an exact expression for the free energy of an arbitrarily oriented step edge or boundary on a rectangular two-dimensional lattice. The full angular dependence of the step free energy allows
Melting point prediction employing k-nearest neighbor algorithms and genetic parameter optimization.
Nigsch, Florian; Bender, Andreas; van Buuren, Bernd; Tissen, Jos; Nigsch, Eduard; Mitchell, John B O
2006-01-01
We have applied the k-nearest neighbor (kNN) modeling technique to the prediction of melting points. A data set of 4119 diverse organic molecules (data set 1) and an additional set of 277 drugs (data set 2) were used to compare performance in different regions of chemical space, and we investigated the influence of the number of nearest neighbors using different types of molecular descriptors. To compute the prediction on the basis of the melting temperatures of the nearest neighbors, we used four different methods (arithmetic and geometric average, inverse distance weighting, and exponential weighting), of which the exponential weighting scheme yielded the best results. We assessed our model via a 25-fold Monte Carlo cross-validation (with approximately 30% of the total data as a test set) and optimized it using a genetic algorithm. Predictions for drugs based on drugs (separate training and test sets each taken from data set 2) were found to be considerably better [root-mean-squared error (RMSE)=46.3 degrees C, r2=0.30] than those based on nondrugs (prediction of data set 2 based on the training set from data set 1, RMSE=50.3 degrees C, r2=0.20). The optimized model yields an average RMSE as low as 46.2 degrees C (r2=0.49) for data set 1, and an average RMSE of 42.2 degrees C (r2=0.42) for data set 2. It is shown that the kNN method inherently introduces a systematic error in melting point prediction. Much of the remaining error can be attributed to the lack of information about interactions in the liquid state, which are not well-captured by molecular descriptors.
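The exponentially weighted kNN prediction mentioned in the abstract above can be sketched as follows. This is a minimal illustration only: the abstract does not give the exact weighting formula, so the form `w_i = exp(-d_i / scale)`, the Euclidean distance, and the names `knn_predict` and `scale` are assumptions made for this example, not the authors' implementation.

```python
import math

def knn_predict(query, train_x, train_y, k=3, scale=1.0):
    """Predict a value for `query` as an exponentially weighted mean
    of the target values of its k nearest training points."""
    # Euclidean distance from the query to every training point.
    dists = [(math.dist(query, x), y) for x, y in zip(train_x, train_y)]
    nearest = sorted(dists, key=lambda pair: pair[0])[:k]
    # Assumed weighting (not from the paper): w_i = exp(-d_i / scale),
    # so closer neighbors contribute more to the prediction.
    weights = [math.exp(-d / scale) for d, _ in nearest]
    return sum(w * y for w, (_, y) in zip(weights, nearest)) / sum(weights)

# Hypothetical descriptor vectors and melting points (degrees C).
train_x = [(0.0, 0.0), (2.0, 0.0), (10.0, 0.0)]
train_y = [10.0, 20.0, 100.0]
prediction = knn_predict((1.0, 0.0), train_x, train_y, k=2)
```

Replacing the weight expression with `1.0` or `1.0 / d` would give the arithmetic-average and inverse-distance variants that the abstract also lists.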
Ozone Monitoring Using Support Vector Machine and K-Nearest Neighbors Methods
Directory of Open Access Journals (Sweden)
FALEH Rabeb
2017-05-01
Due to the health impacts of pollutant gases, monitoring and controlling air quality is an important field of interest. This paper deals with ozone monitoring at four air-quality measuring stations located in Tunisian cities, using numerous measuring instruments and polluting gas analyzers. Prediction of ozone concentrations in two Tunisian cities, Tunis and Sfax, is examined based on supervised classification models. The K-nearest neighbors results reached a 98.7% success rate in ozone recognition and identification. Support Vector Machines (SVM) with linear, polynomial, and RBF kernels were applied to build a classifier, and full accuracy (100%) was achieved with the RBF kernel.
Hudson, Graham A; Bloomingdale, Richard J; Znosko, Brent M
2013-11-01
Pseudouridine (Ψ) is the most common noncanonical nucleotide present in naturally occurring RNA and serves a variety of roles in the cell, typically appearing where structural stability is crucial to function. Ψ residues are isomerized from native uridine residues by a class of highly conserved enzymes known as pseudouridine synthases. In order to quantify the thermodynamic impact of pseudouridylation on U-A base pairs, 24 oligoribonucleotides, 16 internal and eight terminal Ψ-A oligoribonucleotides, were thermodynamically characterized via optical melting experiments. The thermodynamic parameters derived from two-state fits were used to generate linearly independent parameters for use in secondary structure prediction algorithms using the nearest-neighbor model. On average, internally pseudouridylated duplexes were 1.7 kcal/mol more stable than their U-A counterparts, and terminally pseudouridylated duplexes were 1.0 kcal/mol more stable than their U-A equivalents. Due to the fact that Ψ-A pairs maintain the same Watson-Crick hydrogen bonding capabilities as the parent U-A pair in A-form RNA, the difference in stability due to pseudouridylation was attributed to two possible sources: the novel hydrogen bonding capabilities of the newly relocated imino group as well as the novel stacking interactions afforded by the electronic configuration of the Ψ residue. The newly derived nearest-neighbor parameters for Ψ-A base pairs may be used in conjunction with other nearest-neighbor parameters for accurately predicting the most likely secondary structure of A-form RNA containing Ψ-A base pairs.
[Prediction of protein subcellular locations by ensemble of improved K-nearest neighbor].
Xue, Wei; Wang, Xiongfei; Zhao, Nan; Yang, Rongli; Hong, Xiaoyu
2017-04-25
Adaboost algorithm with improved K-nearest neighbor classifiers is proposed to predict protein subcellular locations. Improved K-nearest neighbor classifier uses three sequence feature vectors including amino acid composition, dipeptide and pseudo amino acid composition of protein sequence. K-nearest neighbor uses Blast in classification stage. The overall success rates by the jackknife test on two data sets of CH317 and Gram1253 are 92.4% and 93.1%. Adaboost algorithm with the novel K-nearest neighbor improved by Blast is an effective method for predicting subcellular locations of proteins.
A Heterogeneous High Dimensional Approximate Nearest Neighbor Algorithm
Dubiner, Moshe
2008-01-01
We consider the problem of finding high dimensional approximate nearest neighbors. Suppose there are d independent rare features, each having its own independent statistics. A point x will have $x_{i}=0$ denote the absence of feature i, and $x_{i}=1$ its existence. Sparsity means that usually $x_{i}=0$. Distance between points is a variant of the Hamming distance. Dimensional reduction converts the sparse heterogeneous problem into a lower dimensional full homogeneous problem. However, we will see that the converted problem can be much harder to solve than the original problem. Instead we suggest a direct approach. It consists of T tries. In try t we rearrange the coordinates in decreasing order of $(1-r_{t,i})\frac{p_{i,11}}{p_{i,01}+p_{i,10}} \ln\frac{1}{p_{i,1*}}$ where 0
Local Naive Bayes Nearest Neighbor for Image Classification
McCann, Sancho
2011-01-01
We present Local Naive Bayes Nearest Neighbor, an improvement to the NBNN image classification algorithm that increases classification accuracy and improves its ability to scale to large numbers of object classes. The key observation is that only the classes represented in the local neighborhood of a descriptor contribute significantly and reliably to their posterior probability estimates. Instead of maintaining a separate search structure for each class, we merge all of the reference data together into one search structure, allowing quick identification of a descriptor's local neighborhood. We show an increase in classification accuracy when we ignore adjustments to the more distant classes and show that the run time grows with the log of the number of classes rather than linearly in the number of classes as did the original. This gives a 100 times speed-up over the original method on the Caltech 256 dataset. We also provide the first head-to-head comparison of NBNN against spatial pyramid methods using a co...
Distributed Adaptive Binary Quantization for Fast Nearest Neighbor Search.
Liu, Xianglong; Li, Zhujin; Deng, Cheng; Tao, Dacheng
2017-11-01
Hashing has been proved an attractive technique for fast nearest neighbor search over big data. Compared with the projection based hashing methods, prototype-based ones own stronger power to generate discriminative binary codes for the data with complex intrinsic structure. However, existing prototype-based methods, such as spherical hashing and K-means hashing, still suffer from the ineffective coding that utilizes the complete binary codes in a hypercube. To address this problem, we propose an adaptive binary quantization (ABQ) method that learns a discriminative hash function with prototypes associated with small unique binary codes. Our alternating optimization adaptively discovers the prototype set and the code set of a varying size in an efficient way, which together robustly approximate the data relations. Our method can be naturally generalized to the product space for long hash codes, and enjoys the fast training linear to the number of the training data. We further devise a distributed framework for the large-scale learning, which can significantly speed up the training of ABQ in the distributed environment that has been widely deployed in many areas nowadays. The extensive experiments on four large-scale (up to 80 million) data sets demonstrate that our method significantly outperforms state-of-the-art hashing methods, with up to 58.84% performance gains relatively.
Approximate aggregate nearest neighbor search on moving objects trajectories
Institute of Scientific and Technical Information of China (English)
Mohammad Reza Abbasifard; Hassan Naderi; Zohreh Fallahnejad; Omid Isfahani Alamdari
2015-01-01
Aggregate nearest neighbor (ANN) search retrieves, for two spatial datasets T and Q, segment(s) of one or more trajectories from the set T having minimum aggregate distance to points in Q. When interacting with large amounts of trajectories, this process would be very time-consuming due to consecutive page loads. An approximate method for finding segments with minimum aggregate distance is proposed which can improve the response time. In order to index large volumes of trajectories, the scalable and efficient trajectory index (SETI) structure is used, with some refinements to the temporal index of SETI to improve the performance of the proposed method. The experiments were performed with different numbers of query points and percentages of the dataset. It is shown that the proposed method, besides having an acceptable precision, can reduce the computation time significantly. It is also shown that the main fraction of search time among load time, ANN, and computing the convex hull and centroid is related to ANN.
Nearest Neighbor Classifier Method for Making Loan Decision in Commercial Bank
Directory of Open Access Journals (Sweden)
Md.Mahbubur Rahman
2014-07-01
Banks play a central role in economic development worldwide. The failure or success of the banking sector depends on the ability to evaluate credit risk properly. Evaluating the credit risk of any potential credit application remains a challenge for banks all over the world. Artificial neural networks play a major role in finance in making critical, enigmatic, and sensitive decisions that are sometimes impossible for human beings. Like other critical decisions in finance, the decision to sanction a loan to a customer is also an enigmatic problem. The objective of this paper is to design a neural network that can help loan officers make correct decisions when providing loans to suitable clients. This paper checks the applicability of a new integrated model with a nearest neighbor classifier on sample data taken from a Bangladeshi bank, Brac Bank. The neural network considers several factors of the bank's client and informs the loan officer about the client's eligibility for a loan. Several effective neural network methods can be used for making this decision, such as back-propagation learning, regression models, gradient descent, and the nearest neighbor classifier.
Predicting Audience Location on the Basis of the k-Nearest Neighbor Multilabel Classification
Directory of Open Access Journals (Sweden)
Haitao Wu
2014-01-01
Understanding audience location information in online social networks is important in designing recommendation systems, improving information dissemination, and so on. In this paper, we focus on predicting the location distribution of audiences on YouTube. We transform this problem into a multilabel classification problem, and we find that three problems arise when the classical k-nearest neighbor based algorithm for multilabel classification (ML-kNN) is used to predict location distribution. Firstly, the feature weights are not considered in measuring the similarity degree. Secondly, it consumes considerable computing time in finding similar items by traversing the entire training set. Thirdly, the goal of ML-kNN is to find relevant labels for every sample, which is different from audience location prediction. To solve these problems, we propose methods for measuring similarity based on weight, quickly finding similar items, and ranking a specific number of labels. On the basis of these methods and ML-kNN, the k-nearest neighbor based model for audience location prediction (AL-kNN) is proposed. Experiments based on massive YouTube data show that the proposed model can predict the location of YouTube video audiences more accurately than the ML-kNN, MLNB, and Rank-SVM methods.
Quality and efficiency in high dimensional Nearest neighbor search
Tao, Yufei
2009-01-01
Nearest neighbor (NN) search in high dimensional space is an important problem in many applications. Ideally, a practical solution (i) should be implementable in a relational database, and (ii) its query cost should grow sub-linearly with the dataset size, regardless of the data and query distributions. Despite the bulk of NN literature, no solution fulfills both requirements, except locality sensitive hashing (LSH). The existing LSH implementations are either rigorous or adhoc. Rigorous-LSH ensures good quality of query results, but requires expensive space and query cost. Although adhoc-LSH is more efficient, it abandons quality control, i.e., the neighbor it outputs can be arbitrarily bad. As a result, currently no method is able to ensure both quality and efficiency simultaneously in practice. Motivated by this, we propose a new access method called the locality sensitive B-tree (LSB-tree) that enables fast highdimensional NN search with excellent quality. The combination of several LSB-trees leads to a structure called the LSB-forest that ensures the same result quality as rigorous-LSH, but reduces its space and query cost dramatically. The LSB-forest also outperforms adhoc-LSH, even though the latter has no quality guarantee. Besides its appealing theoretical properties, the LSB-tree itself also serves as an effective index that consumes linear space, and supports efficient updates. Our extensive experiments confirm that the LSB-tree is faster than (i) the state of the art of exact NN search by two orders of magnitude, and (ii) the best (linear-space) method of approximate retrieval by an order of magnitude, and at the same time, returns neighbors with much better quality. © 2009 ACM.
Topological phase transitions driven by next-nearest-neighbor hopping in two-dimensional lattices
Beugeling, W.; Everts, J.C.; de Morais Smith, C.
2012-01-01
For two-dimensional lattices in a tight-binding description, the intrinsic spin-orbit coupling, acting as a complex next-nearest-neighbor hopping, opens gaps that exhibit the quantum spin Hall effect. In this paper, we study the effect of a real next-nearest-neighbor hopping term on the band structu
Lattice gas with nearest- and next-to-nearest-neighbor exclusion
X. Feng; H.W.J. Blöte; B. Nienhuis
2011-01-01
We investigate a hard-square lattice gas on the square lattice by means of transfer-matrix and Monte Carlo methods. The size of the hard squares is equal to two lattice constants, so the simultaneous occupation of nearest-neighbor sites as well as of next-to-nearest-neighbor sites is excluded. Near
Density functional theory for nearest-neighbor exclusion lattice gases in two and three dimensions
Lafuente, Luis; Cuesta, José A.
2003-12-01
To speak about fundamental measure theory obliges us to mention dimensional crossover. This feature, inherent to the systems themselves, was incorporated in the theory almost from the beginning. Although at first it was thought to be a consistency check for the theory, it rapidly became its fundamental pillar, thus becoming the only density functional theory which possesses such a property. It is straightforward that dimensional crossover connects, for instance, the parallel hard cube system (three dimensional) with that of squares (two dimensional) and rods (one dimensional). We show here that there are many more connections which can be established in this way. Through them we deduce from the functional for parallel hard (hyper)cubes in the simple (hyper)cubic lattice the corresponding functionals for the nearest-neighbor exclusion lattice gases in the square, triangular, simple cubic, face-centered-cubic, and body-centered-cubic lattices. As an application, the bulk phase diagram for all these systems is obtained.
Energy Technology Data Exchange (ETDEWEB)
Jahnel, Benedikt, E-mail: Benedikt.Jahnel@ruhr-uni-bochum.de; Külske, Christof, E-mail: Christof.Kuelske@ruhr-uni-bochum.de [Ruhr-Universität Bochum, Fakultät für Mathematik (Germany); Botirov, Golibjon I., E-mail: botirovg@yandex.ru [Bukhara State University, Faculty of Physics and Mathematics (Uzbekistan)
2014-12-15
We consider a ferromagnetic nearest-neighbor model on a Cayley tree of degree k ≥ 2 with uncountable local state space [0,1], where the energy function depends on a parameter θ ∈ [0, 1). We show that for 0 ≤ θ ≤ 5/(3k) the model has a unique translation-invariant Gibbs measure. If 5/(3k) < θ < 1, there is a phase transition; in particular, there are three translation-invariant Gibbs measures.
Lee, Jae Hwan; Kim, Seung-Yeon; Lee, Julian
2013-05-01
We study distributions of the partition function zeros in the complex temperature plane for a square-lattice homopolymer with nearest-neighbor (NN) and next-nearest-neighbor (NNN) interactions. The dependence of distributions on the ratio of NN and NNN interaction strengths R is examined. The finite-size scaling of the zeros is performed to obtain the crossover exponent, which is shown to be independent of R within error bars, suggesting that all of these models belong to the same universality class. The transition temperatures are also computed by the zeros to obtain the phase diagram, and the results confirm that the model with stronger NNN interaction exhibits stronger effects of cooperativity.
Xiong, Daxing; Zhang, Yong; Zhao, Hong
2014-08-01
We show numerically that introducing the next-nearest-neighbor interactions (of appropriate strength) into the one-dimensional (1D) Fermi-Pasta-Ulam-β (FPU-β) lattice can result in an unusual, nonmonotonic temperature dependent divergence behavior in a wide temperature range, which is in clear contrast to the universal divergence manner independent of temperature as suggested previously in the conventional 1D FPU-β models with nearest-neighbor (NN) coupling only. We also discuss the underlying mechanism of this finding by analyzing the temperature variations of the properties of discrete breathers, especially that with frequencies having the intraband components. The results may provide useful information for establishing the connection between the macroscopic heat transport properties and the underlying dynamics in general 1D systems with interactions beyond NN couplings.
Nearest Neighbor Search in the Metric Space of a Complex Network for Community Detection
Directory of Open Access Journals (Sweden)
Suman Saha
2016-03-01
The objective of this article is to bridge the gap between two important research directions: (1) nearest neighbor search, which is a fundamental computational tool for large data analysis; and (2) complex network analysis, which deals with large real graphs but is generally studied via graph theoretic analysis or spectral analysis. In this article, we study the nearest neighbor search problem in a complex network by developing a suitable notion of nearness. Efficient nearest neighbor search among the nodes of a complex network using the metric tree and locality sensitive hashing (LSH) is also studied and tested experimentally. To evaluate the proposed nearest neighbor search in a complex network, we apply it to a network community detection problem. Experiments are performed to verify the usefulness of nearness measures for complex networks, the role of the metric tree and LSH in computing fast and approximate node nearness, and the efficiency of community detection using nearest neighbor search. We observe that nearest neighbor search between network nodes is a very efficient tool for exploring the community structure of real networks. Several efficient approximation schemes are very useful for large networks: they hardly degrade the results while saving a lot of computation time, and the nearest neighbor based community detection approach is very competitive in terms of efficiency and time.
Dubiner, Moshe
2008-01-01
Consider the problem of finding high dimensional approximate nearest neighbors, where the data is generated by some known probabilistic model. We will investigate a large natural class of algorithms which we call bucketing codes. We will define bucketing information, prove that it bounds the performance of all bucketing codes, and that the bucketing information bound can be asymptotically attained by randomly constructed bucketing codes. For example, suppose we have n Bernoulli(1/2) very long (length d → ∞) sequences of bits. Let n-2m sequences be completely independent, while the remaining 2m sequences are composed of m independent pairs. The interdependence within each pair is that their bits agree with probability 1/20. Moreover if one sequence out of each pair belongs to a known set of $n^{(2p-1)^{2}-\epsilon}$ sequences, then pairing can be done using order n comparisons!
Conductivity in the Heisenberg chain with next-to-nearest-neighbor interaction.
Mastropietro, Vieri
2013-04-01
We consider a spin chain given by the XXZ model with a weak next-to-nearest-neighbor perturbation that breaks its exact integrability. We prove that such a system has an ideal metallic behavior (infinite conductivity), by rigorously establishing strict lower bounds on the zero-temperature Drude weight, which are strictly positive. The proof is based on exact renormalization group methods allowing us to prove the convergence of the expansions and to fully take into account the irrelevant terms, which play an essential role in ensuring the correct lattice symmetries. We also prove that the Drude weight verifies the same parameter-free relations as in the absence of the integrability-breaking perturbation.
Non-nearest-neighbor dependence of stability for group III RNA single nucleotide bulge loops.
Kent, Jessica L; McCann, Michael D; Phillips, Daniel; Panaro, Brandon L; Lim, Geoffrey F S; Serra, Martin J
2014-06-01
Thirty-five RNA duplexes containing single nucleotide bulge loops were optically melted and the thermodynamic parameters for each duplex determined. The bulge loops were of the group III variety, where the bulged nucleotide is either a AG/U or CU/G, leading to ambiguity to the exact position and identity of the bulge. All possible group III bulge loops with Watson-Crick nearest-neighbors were examined. The data were used to develop a model to predict the free energy of an RNA duplex containing a group III single nucleotide bulge loop. The destabilization of the duplex by the group III bulge could be modeled so that the bulge nucleotide leads to the formation of the Watson-Crick base pair rather than the wobble base pair. The destabilization of an RNA duplex caused by the insertion of a group III bulge is primarily dependent upon non-nearest-neighbor interactions and was shown to be dependent upon the stability of second least stable stem of the duplex. In-line structure probing of group III bulge loops embedded in a hairpin indicated that the bulged nucleotide is the one positioned further from the hairpin loop irrespective of whether the resulting stem formed a Watson-Crick or wobble base pair. Fourteen RNA hairpins containing group III bulge loops, either 3' or 5' of the hairpin loop, were optically melted and the thermodynamic parameters determined. The model developed to predict the influence of group III bulge loops on the stability of duplex formation was extended to predict the influence of bulge loops on hairpin stability.
Fast construction of k-nearest neighbor graphs for point clouds.
Connor, Michael; Kumar, Piyush
2010-01-01
We present a parallel algorithm for k-nearest neighbor graph construction that uses Morton ordering. Experiments show that our approach has the following advantages over existing methods: 1) faster construction of k-nearest neighbor graphs in practice on multicore machines, 2) less space usage, 3) better cache efficiency, 4) ability to handle large data sets, and 5) ease of parallelization and implementation. If the point set has a bounded expansion constant, our algorithm requires one-comparison-based parallel sort of points, according to Morton order plus near-linear additional steps to output the k-nearest neighbor graph.
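As a rough illustration of the Morton (Z-order) ordering that the abstract above relies on, the sketch below interleaves the bits of two integer coordinates and sorts points by the resulting key. It only shows the ordering step under the assumption of non-negative integer 2D coordinates; the function name `morton_key` and the sample points are made up for this example, and the paper's parallel kNN-graph construction is not reproduced here.

```python
def morton_key(x, y, bits=16):
    """Interleave the low `bits` bits of non-negative integers x and y
    into a single Z-order (Morton) key."""
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (2 * i)      # x bits go to even positions
        key |= ((y >> i) & 1) << (2 * i + 1)  # y bits go to odd positions
    return key

# Hypothetical 2D points; sorting by Morton key tends to place
# spatially close points near each other in the sorted sequence.
points = [(3, 1), (0, 0), (1, 1), (2, 2), (0, 1)]
ordered = sorted(points, key=lambda p: morton_key(*p))
```

After such a sort, candidate neighbors of a point can be sought among its predecessors and successors in the sequence, which is what makes a single comparison-based sort a useful first step.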
Reymbaut, A.; Charlebois, M.; Asiani, M. Fellous; Fratino, L.; Sémon, P.; Sordi, G.; Tremblay, A.-M. S.
2016-10-01
The nearest-neighbor superexchange-mediated mechanism for $d_{x^2-y^2}$ superconductivity in the one-band Hubbard model faces the challenge that nearest-neighbor Coulomb repulsion can be larger than superexchange. To answer this question, we use cellular dynamical mean-field theory (CDMFT) with a continuous-time quantum Monte Carlo solver to determine the superconducting phase diagram as a function of temperature and doping for on-site repulsion $U = 9t$ and nearest-neighbor repulsion $V = 0, 2t, 4t$. In the underdoped regime, V increases the CDMFT superconducting transition temperature $T_c^d$ even though it decreases the superconducting order parameter at low temperature for all dopings. However, in the overdoped regime V decreases $T_c^d$. We gain insight into these paradoxical results through a detailed study of the frequency dependence of the anomalous spectral function, extracted at finite temperature via the MaxEntAux method for analytic continuation. A systematic study of dynamical positive and negative contributions to pairing reveals that even though V has a high-frequency depairing contribution, it also has a low-frequency pairing contribution since it can reinforce superexchange through $J = 4t^2/(U - V)$. Retardation is thus crucial to understanding pairing in doped Mott insulators, as suggested by previous zero-temperature studies. We also comment on the tendency to charge order for large V and on the persistence of d-wave superconductivity over extended-s or s+d wave.
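To make the reinforcement of superexchange concrete, the relation quoted in the abstract above can be evaluated at the stated parameter values (on-site repulsion U = 9t and nearest-neighbor repulsion V = 0, 2t, 4t). This is plain arithmetic on the quoted formula, not a result taken from the paper:

```latex
\[
J = \frac{4t^{2}}{U - V}, \qquad U = 9t:
\qquad
\begin{aligned}
V &= 0  &&\Rightarrow\; J = \tfrac{4}{9}\,t \approx 0.44\,t,\\
V &= 2t &&\Rightarrow\; J = \tfrac{4}{7}\,t \approx 0.57\,t,\\
V &= 4t &&\Rightarrow\; J = \tfrac{4}{5}\,t = 0.80\,t.
\end{aligned}
\]
```

So within this expression J grows monotonically with V, nearly doubling between V = 0 and V = 4t.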
K-Nearest Neighbors Relevance Annotation Model for Distance Education
Ke, Xiao; Li, Shaozi; Cao, Donglin
2011-01-01
With the rapid development of Internet technologies, distance education has become a popular educational mode. In this paper, the authors propose an online image automatic annotation distance education system, which could effectively help children learn interrelations between image content and corresponding keywords. Image automatic annotation is…
Verevkin, Sergey P; Emel'yanenko, Vladimir N; Nagrimanov, Ruslan N
2016-12-15
Standard molar enthalpies of formation of 2- and 4-hydroxybenzamides were measured by combustion calorimetry. Vapor pressures of benzamide and 2-hydroxybenzamide were derived by the transpiration method. Standard molar enthalpies of sublimation or vaporization of these compounds at 298 K were obtained from vapor pressure temperature dependence. Thermochemical data on benzamides with hydroxyl, methyl, methoxy, amino, and amide substituents were collected, evaluated, and tested for internal consistency. The high-level G4 quantum-chemical method was used for mutual validation of the experimental and theoretical gas-phase enthalpies of formation. Sets of nearest-neighbor and non-nearest-neighbor interactions between substituents in the benzene ring have been evaluated. A simple incremental procedure has been suggested for a quick appraisal of the vaporization and gas-phase formation enthalpies of the substituted benzamides.
Improved locality-sensitive hashing method for the approximate nearest neighbor problem
Lu, Ying-Hua; Ma, Ting-Huai; Zhong, Shui-Ming; Cao, Jie; Wang, Xin; Abdullah, Al-Dhelaan
2014-08-01
In recent years, the nearest neighbor search (NNS) problem has been widely used in various interesting applications. Locality-sensitive hashing (LSH), a popular algorithm for the approximate nearest neighbor problem, has proved to be an efficient method for solving the NNS problem in high-dimensional and large-scale databases. Based on the scheme of p-stable LSH, this paper introduces a novel improved algorithm called randomness-based locality-sensitive hashing (RLSH). The proposed algorithm modifies the query strategy: during the nearest neighbor query it randomly selects a certain hash table to project the query point, instead of mapping the query point into all hash tables, and reconstructs the candidate points for finding the nearest neighbors. This strategy ensures that RLSH spends less time searching for the nearest neighbors than the p-stable LSH algorithm while keeping a high recall. Besides, this strategy is shown to promote the diversity of the candidate points even with fewer hash tables. Experiments are carried out on a synthetic dataset and an open dataset. The results show that our method requires less time and less space than p-stable LSH while maintaining the same recall.
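For readers unfamiliar with the p-stable LSH scheme that RLSH builds on, a single hash function of that family has the standard form h(v) = floor((a·v + b) / w), with a drawn from a p-stable distribution (Gaussian for p = 2) and b uniform in [0, w). The sketch below shows only this baseline hash; the bucket width `w`, the dimension, the number of functions, and the helper names are illustrative assumptions, and the random table-selection strategy of RLSH itself is not implemented.

```python
import math
import random

def make_hash(dim, w=4.0, rng=None):
    """Return one 2-stable LSH function h(v) = floor((a . v + b) / w)."""
    rng = rng or random.Random()
    a = [rng.gauss(0.0, 1.0) for _ in range(dim)]  # Gaussian (2-stable) projection
    b = rng.uniform(0.0, w)                        # random offset in [0, w)

    def h(v):
        projection = sum(ai * vi for ai, vi in zip(a, v))
        return math.floor((projection + b) / w)

    return h

# Four hash functions concatenated into one bucket key (values are illustrative).
rng = random.Random(0)
hash_functions = [make_hash(3, rng=rng) for _ in range(4)]

def bucket(v):
    """Concatenate the individual hash values into a bucket identifier."""
    return tuple(h(v) for h in hash_functions)

key = bucket((1.0, 2.0, 3.0))
```

Nearby points fall into the same bucket with higher probability than distant ones, which is the property that any query strategy over the tables exploits.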
Earthquake Declustering via a Nearest-Neighbor Approach in Space-Time-Magnitude Domain
Zaliapin, I. V.; Ben-Zion, Y.
2016-12-01
We propose a new method for earthquake declustering based on nearest-neighbor analysis of earthquakes in space-time-magnitude domain. The nearest-neighbor approach was recently applied to a variety of seismological problems that validate the general utility of the technique and reveal the existence of several different robust types of earthquake clusters. Notably, it was demonstrated that clustering associated with the largest earthquakes is statistically different from that of small-to-medium events. In particular, the characteristic bimodality of the nearest-neighbor distances that helps separating clustered and background events is often violated after the largest earthquakes in their vicinity, which is dominated by triggered events. This prevents using a simple threshold between the two modes of the nearest-neighbor distance distribution for declustering. The current study resolves this problem hence extending the nearest-neighbor approach to the problem of earthquake declustering. The proposed technique is applied to seismicity of different areas in California (San Jacinto, Coso, Salton Sea, Parkfield, Ventura, Mojave, etc.), as well as to the global seismicity, to demonstrate its stability and efficiency in treating various clustering types. The results are compared with those of alternative declustering methods.
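The abstract above does not spell out the distance used, so the following sketch assumes one commonly used space-time-magnitude proximity, η = t · r^d · 10^(−b·m), where t is the time from the earlier (parent) event to the later one, r their epicentral distance, and m the parent's magnitude. The parameter values `d=1.6` and `b=1.0`, the flat (x, y) coordinates, and all function names are assumptions for illustration rather than the authors' settings.

```python
import math

def proximity(parent, child, d=1.6, b=1.0):
    """Assumed space-time-magnitude proximity between two events.
    Each event is (time, x, y, magnitude); returns inf if `parent`
    does not occur strictly before `child`."""
    t = child[0] - parent[0]
    if t <= 0:
        return math.inf
    r = math.hypot(child[1] - parent[1], child[2] - parent[2])
    return t * r ** d * 10 ** (-b * parent[3])

def nearest_parent(events, j):
    """Index of the earlier event closest to events[j], or None if there is none."""
    best, best_eta = None, math.inf
    for i, event in enumerate(events):
        eta = proximity(event, events[j])
        if eta < best_eta:
            best, best_eta = i, eta
    return best

# Hypothetical catalog: (time, x, y, magnitude).
events = [(0.0, 0.0, 0.0, 5.0), (1.0, 1.0, 0.0, 3.0), (2.0, 1.0, 1.0, 2.0)]
parent_of_last = nearest_parent(events, 2)
```

In this toy catalog the last event is linked to the magnitude-5 event rather than to the more recent magnitude-3 one, because a larger parent magnitude shrinks the proximity value.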
A k-nearest neighbor classification of hERG K(+) channel blockers.
Chavan, Swapnil; Abdelaziz, Ahmed; Wiklander, Jesper G; Nicholls, Ian A
2016-03-01
A series of 172 molecular structures that block the hERG K(+) channel were used to develop a classification model where, initially, eight types of PaDEL fingerprints were used for k-nearest neighbor model development. A consensus model constructed using Extended-CDK, PubChem and Substructure count fingerprint-based models was found to be a robust predictor of hERG activity. This consensus model demonstrated sensitivity and specificity values of 0.78 and 0.61 for the internal dataset compounds and 0.63 and 0.54 for the external (PubChem) dataset compounds, respectively. This model has identified the highest number of true positives (i.e. 140) from the PubChem dataset so far, as compared to other published models, and can potentially serve as a basis for the prediction of hERG active compounds. Validating this model against FDA-withdrawn substances indicated that it may even be useful for differentiating between mechanisms underlying QT prolongation.
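The sensitivity and specificity figures reported above follow the usual confusion-matrix definitions, which can be sketched as below. The counts used in the example are hypothetical and are not taken from the paper's datasets.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # fraction of actual blockers correctly flagged
    specificity = tn / (tn + fp)  # fraction of non-blockers correctly cleared
    return sensitivity, specificity

# Hypothetical counts for illustration only.
sens, spec = sensitivity_specificity(tp=7, fn=3, tn=6, fp=4)
```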
Zhang, Zhongzhi; Sheng, Yibin
2015-01-01
Random walks including non-nearest-neighbor jumps appear in many real situations, such as the diffusion of adatoms, and have found numerous applications, including the PageRank search algorithm; however, theoretical results for this dynamical process are much scarcer. In this paper, we present a study of mixed random walks in a family of fractal scale-free networks, where both nearest-neighbor and next-nearest-neighbor jumps are included. We focus on the trapping problem in the network family, which is a particular case of random walks with a perfect trap fixed at the central high-degree node. We derive analytical expressions for the average trapping time (ATT), a quantitative indicator measuring the efficiency of the trapping process, by using two different methods, the results of which are consistent with each other. Furthermore, we analytically determine all the eigenvalues and their multiplicities for the fundamental matrix characterizing the dynamical process. Our results show that although next-nearest-neighb...
Efficient kNN Classification With Different Numbers of Nearest Neighbors.
Zhang, Shichao; Li, Xuelong; Zong, Ming; Zhu, Xiaofeng; Wang, Ruili
2017-04-12
The k nearest neighbor (kNN) method is a popular classification method in data mining and statistics because of its simple implementation and strong classification performance. However, it is impractical for traditional kNN methods to assign a fixed k value (even one set by experts) to all test samples. Previous solutions assign different k values to different test samples by cross validation but are usually time-consuming. This paper proposes a kTree method that learns different optimal k values for different test/new samples by adding a training stage to kNN classification. Specifically, in the training stage, the kTree method first learns optimal k values for all training samples with a new sparse reconstruction model, and then constructs a decision tree (namely, the kTree) using the training samples and the learned optimal k values. In the test stage, the kTree quickly outputs the optimal k value for each test sample, and kNN classification can then be conducted using the learned optimal k value and all training samples. As a result, the proposed kTree method has a similar running cost but higher classification accuracy compared with traditional kNN methods, which assign a fixed k value to all test samples. Moreover, the proposed kTree method has a lower running cost but achieves similar classification accuracy compared with recent kNN methods, which assign different k values to different test samples. This paper further proposes an improved version of the kTree method (namely, the k*Tree method) that speeds up the test stage by additionally storing information about the training samples in the leaf nodes of the kTree, such as the training samples located in the leaf nodes, their kNNs, and the nearest neighbors of these kNNs. We call the resulting decision tree the k*Tree; it enables kNN classification using a subset of the training samples in the leaf nodes rather than all training samples, as used in the recent kNN methods. This actually reduces running cost of
A fast approximate nearest neighbor search algorithm in the Hamming space.
Esmaeili, Mani Malek; Ward, Rabab Kreidieh; Fatourechi, Mehrdad
2012-12-01
A fast approximate nearest neighbor search algorithm for the (binary) Hamming space is proposed. The proposed Error Weighted Hashing (EWH) algorithm is up to 20 times faster than the popular locality sensitive hashing (LSH) algorithm and works well even for large nearest neighbor distances where LSH fails. EWH significantly reduces the number of candidate nearest neighbors by weighing them based on the difference between their hash vectors. EWH can be used for multimedia retrieval and copy detection systems that are based on binary fingerprinting. On a fingerprint database with more than 1,000 videos, for a specific detection accuracy, we demonstrate that EWH is more than 10 times faster than LSH. For the same retrieval time, we show that EWH has a significantly better detection accuracy with a 15 times lower error rate.
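For orientation, the exact search that approximate methods such as EWH and LSH try to avoid can be sketched as a brute-force scan over binary fingerprints. This is only a baseline under the assumption that fingerprints are stored as Python integers; it is not the EWH algorithm, and the function names are hypothetical.

```python
def hamming(a: int, b: int) -> int:
    """Hamming distance between two fingerprints stored as integers."""
    # XOR leaves a 1 bit wherever the fingerprints differ; count those bits.
    return bin(a ^ b).count("1")


def nearest_hamming(query: int, database: list) -> tuple:
    """Exact nearest neighbor by linear scan: returns (index, distance).

    Ties are resolved in favor of the lowest index.
    """
    best = min(range(len(database)), key=lambda i: hamming(query, database[i]))
    return best, hamming(query, database[best])
```

The scan costs one distance computation per database entry, which is the cost that hashing-based candidate reduction is designed to cut down.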
A Novel Preferential Diffusion Recommendation Algorithm Based on User’s Nearest Neighbors
Directory of Open Access Journals (Sweden)
Fuguo Zhang
2017-01-01
Recommender systems are a very efficient way to deal with the problem of information overload for online users. In recent years, network-based recommendation algorithms have demonstrated much better performance than the standard collaborative filtering methods. However, most network-based algorithms do not give a high enough weight to the influence of the target user’s nearest neighbors in the resource diffusion process, while a user or an object with high degree obtains larger influence in the standard mass diffusion algorithm. In this paper, we propose a novel preferential diffusion recommendation algorithm that considers the significance of the target user’s nearest neighbors and evaluate it on three real-world data sets: MovieLens 100k, MovieLens 1M, and Epinions. Experimental results demonstrate that the novel preferential diffusion recommendation algorithm based on the user’s nearest neighbors can significantly improve recommendation accuracy and diversity.
Qi, F; Ma, Q Y; Qi, A Y; Xu, P; Zhu, S N; Zheng, W H
2016-01-01
The next-nearest-neighbor Heisenberg chain plays important roles in solid state physics, such as predicting exotic electric properties of two-dimensional materials or magnetic properties of organic compounds. Direct experimental studies of the many-body electron or spin systems associated with these materials are challenging, while optical simulation provides an effective and economical way for immediate observation. Compared with bulk optics, integrated optics is more attractive for stable, large-scale, and long-time evolution simulations. A photonic crystal is an artificial microstructured material with multiple methods to tune the propagation properties, which are essential for various simulation tasks. Here we report for the first time an experimental simulation of a next-nearest-neighbor Heisenberg chain with an integrated optical chip of a photonic crystal waveguide array. The use of photonic crystals enhances the evanescent field and thus allows coupling between next-nearest-neighbor waveguides in such a...
A nearest neighbor search algorithm of high-dimensional data based on sequential NPsim matrix
Institute of Scientific and Technical Information of China (English)
李文法
2016-01-01
Problems exist in similarity measurement and index tree construction that affect the performance of nearest neighbor search of high-dimensional data. The equidistance problem is solved by using the NPsim function to calculate similarity, and a sequential NPsim matrix is built to improve indexing performance. Combining these innovations, a nearest neighbor search algorithm for high-dimensional data based on the sequential NPsim matrix is proposed and compared with nearest neighbor search algorithms based on the KD-tree or SR-tree on the Munsell spectral data set. Experimental results show that the similarity achieved by the proposed algorithm is better than that of the other algorithms and that its search speed is more than a thousand times that of the others. In addition, the slow construction of the sequential NPsim matrix can be accelerated by using parallel computing.
Nearest neighbor density ratio estimation for large-scale applications in astronomy
Kremer, J.; Gieseke, F.; Steenstrup Pedersen, K.; Igel, C.
2015-09-01
In astronomical applications of machine learning, the distribution of objects used for building a model is often different from the distribution of the objects the model is later applied to. This is known as sample selection bias, which is a major challenge for statistical inference as one can no longer assume that the labeled training data are representative. To address this issue, one can re-weight the labeled training patterns to match the distribution of unlabeled data that are available already in the training phase. There are many examples in practice where this strategy yielded good results, but estimating the weights reliably from a finite sample is challenging. We consider an efficient nearest neighbor density ratio estimator that can exploit large samples to increase the accuracy of the weight estimates. To solve the problem of choosing the right neighborhood size, we propose to use cross-validation on a model selection criterion that is unbiased under covariate shift. The resulting algorithm is our method of choice for density ratio estimation when the feature space dimensionality is small and sample sizes are large. The approach is simple and, because of the model selection, robust. We empirically find that it is on a par with established kernel-based methods on relatively small regression benchmark datasets. However, when applied to large-scale photometric redshift estimation, our approach outperforms the state-of-the-art.
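One common form of a nearest-neighbor density ratio estimate can be sketched as follows: around each labeled training point, take the ball that reaches its K-th nearest training neighbor, count the unlabeled test points that fall inside it, and scale by the sample sizes. The sketch below is an illustration under that assumption; whether it matches the estimator of the paper in every detail (for example, how the point itself is counted) is not confirmed by the abstract, and the cross-validated choice of K is not shown.

```python
import math


def nn_density_ratio(train, test, k):
    """Estimate w(x) ~ p_test(x) / p_train(x) at every training point.

    train, test: lists of equal-length coordinate tuples.
    k: neighborhood size (assumed 1 <= k <= len(train) - 1).
    """
    n_train, n_test = len(train), len(test)
    weights = []
    for i, x in enumerate(train):
        # Distances from x to all other training points, sorted ascending.
        d_train = sorted(math.dist(x, y) for j, y in enumerate(train) if j != i)
        radius = d_train[k - 1]  # distance to the k-th nearest training neighbor
        # Number of test points inside the same ball around x.
        m = sum(1 for z in test if math.dist(x, z) <= radius)
        # Ratio of local counts, corrected for the different sample sizes.
        weights.append((m / k) * (n_train / n_test))
    return weights
```

This brute-force version costs O(n_train * (n_train + n_test)) distance evaluations; for the large samples mentioned above a spatial index would be needed in practice.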
Estimating the posterior probabilities using the k-nearest neighbor rule.
Atiya, Amir F
2005-03-01
In many pattern classification problems, an estimate of the posterior probabilities (rather than only a classification) is required. This is usually the case when some confidence measure in the classification is needed. In this article, we propose a new posterior probability estimator. The proposed estimator considers the K-nearest neighbors. It attaches a weight to each neighbor that contributes in an additive fashion to the posterior probability estimate. The weights corresponding to the K-nearest-neighbors (which add to 1) are estimated from the data using a maximum likelihood approach. Simulation studies confirm the effectiveness of the proposed estimator.
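The additive weighting described above can be illustrated with a short sketch in which the j-th nearest neighbor contributes a fixed weight to the posterior of its own class. The weights here are simply passed in and assumed to sum to 1; the maximum likelihood step that the paper uses to estimate them is not reproduced, and the function name is hypothetical.

```python
import math


def knn_posterior(query, points, labels, weights):
    """Posterior estimate from rank-weighted neighbors.

    weights[j] is the weight of the (j+1)-th nearest neighbor; the weights
    are assumed to be non-negative and to sum to 1, so K = len(weights).
    """
    k = len(weights)
    # Indices of the K training points closest to the query.
    order = sorted(range(len(points)), key=lambda i: math.dist(query, points[i]))[:k]
    posterior = {}
    for w, i in zip(weights, order):
        # Each neighbor adds its weight to the posterior of its class.
        posterior[labels[i]] = posterior.get(labels[i], 0.0) + w
    return posterior
```

With uniform weights 1/K this reduces to the usual vote-fraction estimate, so the learned weights can be read as a generalization of plain K-nearest-neighbor voting.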
Support Vector Machine combined with K-Nearest Neighbors for Solar Flare Forecasting
Institute of Scientific and Technical Information of China (English)
Rong Li; Hua-Ning Wang; Han He; Yan Mei; Zhan-Le Du
2007-01-01
A method combining the support vector machine (SVM) and the K-Nearest Neighbors (KNN), labelled the SVM-KNN method, is used to construct a solar flare forecasting model. Based on a proven relationship between SVM and KNN, the SVM-KNN method improves the SVM algorithm of classification by taking advantage of the KNN algorithm according to the distribution of test samples in a feature space. In our flare forecast study, sunspots and 10 cm radio flux data observed during Solar Cycle 23 are taken as predictors, and whether an M class flare will occur for each active region within two days is predicted. The SVM-KNN method is compared with the SVM and neural network-based methods. The test results indicate that the rate of correct predictions from the SVM-KNN method is higher than that from the other two methods. This method shows promise as a practicable future forecasting model.
Greedy Multiple Instance Learning via Codebook Learning and Nearest Neighbor Voting
Chen, Gang
2012-01-01
Multiple instance learning (MIL) has attracted great attention recently in the machine learning community. However, most MIL algorithms are very slow and cannot be applied to large datasets. In this paper, we propose a greedy strategy to speed up the multiple instance learning process. Our contribution is twofold. First, we propose a density ratio model, and show that maximizing a density ratio function is the lower bound of the DD model under certain conditions. Second, we make use of a histogram ratio between positive bags and negative bags to represent the density ratio function and find codebooks separately for positive bags and negative bags by a greedy strategy. For testing, we make use of a nearest neighbor strategy to classify new bags. We test our method on both small benchmark datasets and the large TRECVID MED11 dataset. The experimental results show that our method yields comparable accuracy to the current state of the art, while being up to at least one order of magnitude faster.
Zhu, Mu; Chen, Wenhong; Hirdes, John P; Stolee, Paul
2007-10-01
There may be great potential for using computer-modeling techniques and machine-learning algorithms in clinical decision making, if these can be shown to produce results superior to clinical protocols currently in use. We aim to explore the potential to use an automatic, data-driven, machine-learning algorithm in clinical decision making. Using a database containing comprehensive health assessment information (the interRAI-HC) on home care clients (N=24,724) from eight community-care regions in Ontario, Canada, we compare the performance of the K-nearest neighbor (KNN) algorithm and a Clinical Assessment Protocol (the "ADLCAP") currently used to predict rehabilitation potential. For our purposes, we define a patient as having rehabilitation potential if the patient had functional improvement or remained at home over a follow-up period of approximately 1 year. The KNN algorithm has a lower false positive rate in all but one of the eight regions in the sample, and lower false negative rates in all regions. Compared using likelihood ratio statistics, KNN is uniformly more informative than the ADLCAP. This article illustrates the potential for a machine-learning algorithm to enhance clinical decision making.
Optimal Control of Vehicular Formations with Nearest Neighbor Interactions
Lin, Fu; Jovanović, Mihailo R
2011-01-01
We consider the design of optimal localized feedback gains for one-dimensional formations in which vehicles only use information from their immediate neighbors. The control objective is to enhance coherence of the formation by making it behave like a rigid lattice. For the single-integrator model with symmetric gains, we establish convexity, implying that the globally optimal controller can be computed efficiently. We also identify a class of convex problems for double-integrators by restricting the controller to symmetric position and uniform diagonal velocity gains. To obtain the optimal non-symmetric gains for both the single- and the double-integrator models, we solve a parameterized family of optimal control problems ranging from an easily solvable problem to the problem of interest as the underlying parameter increases. When this parameter is kept small, we employ perturbation analysis to decouple the matrix equations that result from the optimality conditions, thereby rendering the unique optimal feedb...
Efficient Parallel Computation of Nearest Neighbor Interchange Distances
Gast, Mikael
2012-01-01
The nni-distance is a well-known distance measure for phylogenetic trees. We construct an efficient parallel approximation algorithm for the nni-distance in the CRCW-PRAM model running in O(log n) time on O(n) processors. Given two phylogenetic trees T1 and T2 on the same set of taxa and with the same multi-set of edge-weights, the algorithm constructs a sequence of nni-operations of weight at most O(log n) · opt, where opt denotes the minimum weight of a sequence of nni-operations transforming T1 into T2. This algorithm is based on the sequential approximation algorithm for the nni-distance given by DasGupta et al. (2000). Furthermore, we show that the problem of identifying so-called good edge-pairs between two weighted phylogenies can be computed in O(log n) time on O(n log n) processors.
Cash Currencies Recognition Using k-Nearest Neighbor Classifier
Directory of Open Access Journals (Sweden)
Ghazi Ibrahim Raho
2015-10-01
The appearance of currency is part of this development and is directly affected by it: currency can be exploited improperly by copying it in a form close to the genuine article. It has therefore become necessary to implement a solution, suited to different cultures, times, and places, that reduces the risk posed by the difficulty of distinguishing real from counterfeit currency. One such measure is adding watermarks to the currency, which are difficult to copy. These watermarks may be visible to the naked eye, and thus easily identified, or invisible. However, high-resolution imaging devices can copy these additions. In this research, we propose a system that distinguishes currencies with a program that detects the watermark and uses feature extraction to determine the type of currency and whether it is genuine. In addition, the k-NN algorithm determines the category of the currency. The benefit is to reduce the spread of counterfeit currency as far as possible, and the system can be used by anyone who wants to verify that a banknote is genuine. The proposed model was applied to 100 banknotes; the success rate was 91% and the failure rate was 9%.
Mapping change of older forest with nearest-neighbor imputation and Landsat time-series
Janet L. Ohmann; Matthew J. Gregory; Heather M. Roberts; Warren B. Cohen; Robert E. Kennedy; Zhiqiang. Yang
2012-01-01
The Northwest Forest Plan (NWFP), which aims to conserve late-successional and old-growth forests (older forests) and associated species, established new policies on federal lands in the Pacific Northwest USA. As part of monitoring for the NWFP, we tested nearest-neighbor imputation for mapping change in older forest, defined by threshold values for forest attributes...
Text Categorization Based on K-Nearest Neighbor Approach for Web Site Classification.
Kwon, Oh-Woog; Lee, Jong-Hyeok
2003-01-01
Discusses text categorization and Web site classification and proposes a three-step classification system that includes the use of Web pages linked with the home page. Highlights include the k-nearest neighbor (k-NN) approach; improving performance with a feature selection method and a term weighting scheme using HTML tags; and similarity…
van Dam, Herman T.; Seifert, Stefan; Vinke, Ruud; Dendooven, Peter; Lohner, Herbert; Beekman, Freek J.; Schaart, Dennis R.
2011-01-01
Monolithic scintillator detectors have been shown to provide good performance and to have various practical advantages for use in PET systems. Excellent results for the gamma photon interaction position determination in these detectors have been obtained by means of the k-nearest neighbor (k-NN)
An adaptable k-nearest neighbors algorithm for MMSE image interpolation.
Ni, Karl S; Nguyen, Truong Q
2009-09-01
We propose an image interpolation algorithm that is nonparametric and learning-based, primarily using an adaptive k-nearest neighbor algorithm with global considerations through Markov random fields. The empirical nature of the proposed algorithm ensures image results that are data-driven and, hence, reflect "real-world" images well, given enough training data. The proposed algorithm operates on a local window using a dynamic k -nearest neighbor algorithm, where k differs from pixel to pixel: small for test points with highly relevant neighbors and large otherwise. Based on the neighbors that the adaptable k provides and their corresponding relevance measures, a weighted minimum mean squared error solution determines implicitly defined filters specific to low-resolution image content without yielding to the limitations of insufficient training. Additionally, global optimization via single pass Markov approximations, similar to cited nearest neighbor algorithms, provides additional weighting for filter generation. The approach is justified in using a sufficient quantity of training per test point and takes advantage of image properties. For in-depth analysis, we compare to existing methods and draw parallels between intuitive concepts including classification and ideas introduced by other nearest neighbor algorithms by explaining manifolds in low and high dimensions.
Kenneth B. Jr. Pierce; C. Kenneth Brewer; Janet L. Ohmann
2010-01-01
This study was designed to test the feasibility of combining a method designed to populate pixels with inventory plot data at the 30-m scale with a new national predictor data set. The new national predictor data set was developed by the USDA Forest Service Remote Sensing Applications Center (hereafter RSAC) at the 250-m scale. Gradient Nearest Neighbor (GNN)...
Bianca N. I. Eskelson; Hailemariam Temesgen; Valerie Lemay; Tara M. Barrett; Nicholas L. Crookston; Andrew T. Hudak
2009-01-01
Almost universally, forest inventory and monitoring databases are incomplete, ranging from missing data for only a few records and a few variables, common for small land areas, to missing data for many observations and many variables, common for large land areas. For a wide variety of applications, nearest neighbor (NN) imputation methods have been developed to fill in...
A Regression-based K nearest neighbor algorithm for gene function prediction from heterogeneous data
Directory of Open Access Journals (Sweden)
Ruzzo Walter L
2006-03-01
Background: As a variety of functional genomic and proteomic techniques become available, there is an increasing need for functional analysis methodologies that integrate heterogeneous data sources. Methods: In this paper, we address this issue by proposing a general framework for gene function prediction based on the k-nearest-neighbor (KNN) algorithm. The choice of KNN is motivated by its simplicity, flexibility to incorporate different data types and adaptability to irregular feature spaces. A weakness of traditional KNN methods, especially when handling heterogeneous data, is that performance is subject to the often ad hoc choice of similarity metric. To address this weakness, we apply regression methods to infer a similarity metric as a weighted combination of a set of base similarity measures, which helps to locate the neighbors that are most likely to be in the same class as the target gene. We also suggest a novel voting scheme to generate confidence scores that estimate the accuracy of predictions. The method gracefully extends to multi-way classification problems. Results: We apply this technique to gene function prediction according to three well-known Escherichia coli classification schemes suggested by biologists, using information derived from microarray and genome sequencing data. We demonstrate that our algorithm dramatically outperforms the naive KNN methods and is competitive with support vector machine (SVM) algorithms for integrating heterogeneous data. We also show that by combining different data sources, prediction accuracy can improve significantly. Conclusion: Our extension of KNN with automatic feature weighting, multi-class prediction, and probabilistic inference enhances prediction accuracy significantly while remaining efficient, intuitive and flexible. This general framework can also be applied to similar classification problems involving heterogeneous datasets.
Yao, Zizhen; Ruzzo, Walter L
2006-03-20
As a variety of functional genomic and proteomic techniques become available, there is an increasing need for functional analysis methodologies that integrate heterogeneous data sources. In this paper, we address this issue by proposing a general framework for gene function prediction based on the k-nearest-neighbor (KNN) algorithm. The choice of KNN is motivated by its simplicity, flexibility to incorporate different data types and adaptability to irregular feature spaces. A weakness of traditional KNN methods, especially when handling heterogeneous data, is that performance is subject to the often ad hoc choice of similarity metric. To address this weakness, we apply regression methods to infer a similarity metric as a weighted combination of a set of base similarity measures, which helps to locate the neighbors that are most likely to be in the same class as the target gene. We also suggest a novel voting scheme to generate confidence scores that estimate the accuracy of predictions. The method gracefully extends to multi-way classification problems. We apply this technique to gene function prediction according to three well-known Escherichia coli classification schemes suggested by biologists, using information derived from microarray and genome sequencing data. We demonstrate that our algorithm dramatically outperforms the naive KNN methods and is competitive with support vector machine (SVM) algorithms for integrating heterogeneous data. We also show that by combining different data sources, prediction accuracy can improve significantly. Our extension of KNN with automatic feature weighting, multi-class prediction, and probabilistic inference enhances prediction accuracy significantly while remaining efficient, intuitive and flexible. This general framework can also be applied to similar classification problems involving heterogeneous datasets.
Directory of Open Access Journals (Sweden)
K. Duraiswamy
2012-01-01
Problem statement: A database that is optimized to store and query data related to objects in space, including points, lines and polygons, is called a spatial database. Nearest neighbor object search is a vital part of spatial databases. Many nearest neighbor search techniques are available, such as Authenticated Multi-step NN (AMNN), Superseding Nearest Neighbor (SNN) search, Bayesian Nearest Neighbor (BNN) and so on, but they have some difficulties when performing NN search in uncertain spatial databases. AMNN does not process queries from distributed servers and accesses queries only from a single server. In SNN, high dimensional data structures cannot be used in NN search, and only low dimensional data is accessed. Approach: Previous work described NN search using SNN with marginal object weight ranking. The downside of that work is that its performance is poor compared to other work that performed NN search using BNN. To improve NN search in spatial databases using BNN, we present a new technique: BNN search using marginal object weight ranking. Based on events occurring in the nearest object, BNN starts its search using MOW. The MOW is computed by determining the weight of each NN object and ranking each object based on its frequency and distance, for an efficient NN search in spatial databases. Results: Marginal Object Weight (MOW) is applied to all nearest neighbor objects identified using BNN for any relevant query point. It processes queries from distributed servers using MOW. Conclusion: The proposed BNN using MOW framework is evaluated on real data sets to show the performance improvement over the previous MOW using SNN in terms of execution time, memory consumption and query result accuracy.
Li, Guohui; Fan, Ping; Yuan, Ling
2014-03-01
Recent research has focused on Continuous K-Nearest Neighbor (CKNN) query over moving objects in road networks. A CKNN query is to find among all moving objects the K-Nearest Neighbors (KNNs) of a moving query point within a given time interval. As the data objects move frequently and arbitrarily in road networks, the frequent updates of object locations make it complicated to process CKNN accurately and efficiently. In this paper, according to the relative moving situation between the moving objects and the query point, a Moving State of Object (MSO) model is presented to indicate the relative moving state of the object to the query point. With the help of this model, we propose a novel Object Candidate Processing (OCP) algorithm to highly reduce the repetitive query cost with pruning phase and refining phase. In the pruning phase, the data objects which cannot be the KNN query results are excluded within the given time interval. In the refining phase, the time subintervals of the given time interval are determined where the certain KNN query results are obtained. Comprehensive experiments are conducted and the results verify the effectiveness of the proposed methods.
Gangopadhyay, S.; Clark, M. P.; Rajagopalan, B.
2002-12-01
The success of short term (days to fortnight) streamflow forecasting largely depends on the skill of surface climate (e.g., precipitation and temperature) forecasts at local scales in the individual river basins. The surface climate forecasts are used to drive the hydrologic models for streamflow forecasting. Typically, Medium Range Forecast (MRF) models provide forecasts of large scale circulation variables (e.g. pressures, wind speed, relative humidity etc.) at different levels in the atmosphere on a regular grid - which are then used to "downscale" to the surface climate at locations within the model grid box. Several statistical and dynamical methods are available for downscaling. This paper compares the utility of two statistical downscaling methodologies: (1) multiple linear regression (MLR) and (2) a nonparametric approach based on k-nearest neighbor (k-NN) bootstrap method, in providing local-scale information of precipitation and temperature at a network of stations in the Upper Colorado River Basin. Downscaling to the stations is based on output of large scale circulation variables (i.e. predictors) from the NCEP Medium Range Forecast (MRF) database. Fourteen-day six hourly forecasts are developed using these two approaches, and their forecast skill evaluated. A stepwise regression is performed at each location to select the predictors for the MLR. The k-NN bootstrap technique resamples historical data based on their "nearness" to the current pattern in the predictor space. Prior to resampling a Principal Component Analysis (PCA) is performed on the predictor set to identify a small subset of predictors. Preliminary results using the MLR technique indicate a significant value in the downscaled MRF output in predicting runoff in the Upper Colorado Basin. It is expected that the k-NN approach will match the skill of the MLR approach at individual stations, and will have the added advantage of preserving the spatial co-variability between stations, capturing
Bindewald, Eckart; Shapiro, Bruce A
2006-03-01
We present a machine learning method (a hierarchical network of k-nearest neighbor classifiers) that uses an RNA sequence alignment in order to predict a consensus RNA secondary structure. The input to the network is the mutual information, the fraction of complementary nucleotides, and a novel consensus RNAfold secondary structure prediction of a pair of alignment columns and its nearest neighbors. Given this input, the network computes a prediction as to whether a particular pair of alignment columns corresponds to a base pair. By using a comprehensive test set of 49 RFAM alignments, the program KNetFold achieves an average Matthews correlation coefficient of 0.81. This is a significant improvement compared with the secondary structure prediction methods PFOLD and RNAalifold. By using the example of archaeal RNase P, we show that the program can also predict pseudoknot interactions.
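The Matthews correlation coefficient used as the quality measure above is computed from the four confusion-matrix counts of predicted versus true base pairs. A minimal sketch of the standard formula follows; returning 0.0 when a marginal count is zero is a common convention adopted here as an assumption, not something stated in the abstract.

```python
import math


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from confusion-matrix counts."""
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Assumed convention: report 0.0 when the coefficient is undefined.
    return numerator / denominator if denominator else 0.0
```

The coefficient ranges from -1 (total disagreement) through 0 (no better than chance) to +1 (perfect prediction), which makes the reported average of 0.81 directly interpretable.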
A novel template reduction approach for the K-nearest neighbor method.
Fayed, Hatem A; Atiya, Amir F
2009-05-01
The K-nearest neighbor (KNN) rule is one of the most widely used pattern classification algorithms. For large data sets, the computational demands for classifying patterns using KNN can be prohibitive. A way to alleviate this problem is through the condensing approach. This means we remove patterns that are more of a computational burden but do not contribute to better classification accuracy. In this brief, we propose a new condensing algorithm. The proposed idea is based on defining the so-called chain. This is a sequence of nearest neighbors from alternating classes. We make the point that patterns further down the chain are close to the classification boundary and based on that we set a cutoff for the patterns we keep in the training set. Experiments show that the proposed approach effectively reduces the number of prototypes while maintaining the same level of classification accuracy as the traditional KNN. Moreover, it is a simple and a fast condensing algorithm.
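The chain idea can be sketched as follows: starting from one pattern, repeatedly jump to the nearest pattern of a different class, so that consecutive members alternate between classes in the two-class case. The stopping rule used here (stop when the next pattern has already been visited) and the function names are assumptions for illustration; the cutoff that decides which patterns to keep is not reproduced from the paper.

```python
import math


def nearest_of_other_class(i, points, labels):
    """Index of the pattern nearest to points[i] among those with a different label.

    Assumes at least two classes are present; otherwise min() raises ValueError.
    """
    candidates = [j for j in range(len(points)) if labels[j] != labels[i]]
    return min(candidates, key=lambda j: math.dist(points[i], points[j]))


def build_chain(start, points, labels):
    """Follow nearest other-class neighbors from `start` until a pattern repeats."""
    chain = [start]
    seen = {start}
    while True:
        nxt = nearest_of_other_class(chain[-1], points, labels)
        if nxt in seen:  # the chain has started to cycle, so stop
            break
        chain.append(nxt)
        seen.add(nxt)
    return chain
```

In the small example used to check this sketch, a pattern far from the boundary starts the chain and later members sit close to the opposite class, which is the property the abstract relies on when it sets a cutoff along the chain.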
Collective coherence in nearest neighbor coupled metamaterials: A metasurface ruler equation
Energy Technology Data Exchange (ETDEWEB)
Xu, Ningning; Zhang, Weili, E-mail: weili.zhang@okstate.edu [School of Electrical and Computer Engineering, Oklahoma State University, Stillwater, Oklahoma 74078 (United States); Singh, Ranjan, E-mail: ranjans@ntu.edu.sg [Center for Disruptive Photonic Technologies, Division of Physics and Applied Physics, School of Physical and Mathematical Sciences, Nanyang Technological University, 21 Nanyang Link, Singapore 637371 (Singapore)
2015-10-28
The collective coherent interactions in a meta-atom lattice are the key to myriad applications and functionalities offered by metasurfaces. We demonstrate a collective coherent response of the nearest neighbor coupled split-ring resonators whose resonance shift decays exponentially in the strong near-field coupled regime. This occurs due to the dominant magnetic coupling between the nearest neighbors which leads to the decay of the electromagnetic near fields. Based on the size scaling behavior of the different periodicity metasurfaces, we identified a collective coherent metasurface ruler equation. From the coherent behavior, we also show that the near-field coupling in a metasurface lattice exists even when the periodicity exceeds the resonator size. The identification of a universal coherence in metasurfaces and their scaling behavior would enable the design of novel metadevices whose spectral tuning response based on near-field effects could be calibrated across microwave, terahertz, infrared, and the optical parts of the electromagnetic spectrum.
Combining Nearest Neighbor Search with Tabu Search for Large-Scale Vehicle Routing Problem
Du, Lingling; He, Ruhan
The vehicle routing problem is a classical problem in operations research, where the objective is to design least-cost routes for a fleet of identical capacitated vehicles to service geographically scattered customers. In this paper, we present a new and effective hybrid metaheuristic algorithm for the large-scale vehicle routing problem. The algorithm combines the strengths of the well-known Nearest Neighbor Search and Tabu Search into a two-stage procedure. More precisely, Nearest Neighbor Search is used to construct initial routes in the first stage, and Tabu Search is used to perform intra-route and inter-route optimization in the second stage. The presented algorithm is specifically designed for large-scale problems. The computational experiments were carried out on a standard benchmark and a real dataset with 6772 tobacco customers. The results demonstrate that the suggested method is highly competitive.
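The first-stage construction described above can be sketched as a greedy nearest-neighbor heuristic. This is only an illustration under assumptions: the function name, the Euclidean distance, and the rule for opening a new route when no customer fits are not specified in the abstract, and the Tabu Search stage is omitted.

```python
import math

def nearest_neighbor_routes(depot, customers, demands, capacity):
    """Greedy construction: repeatedly visit the closest unserved
    customer that still fits in the current vehicle."""
    unserved = set(range(len(customers)))
    routes = []
    while unserved:
        route, load, pos = [], 0, depot
        while True:
            # Customers whose demand still fits in the remaining capacity
            feasible = [c for c in sorted(unserved)
                        if load + demands[c] <= capacity]
            if not feasible:
                break
            nxt = min(feasible, key=lambda c: math.dist(pos, customers[c]))
            route.append(nxt)
            load += demands[nxt]
            pos = customers[nxt]
            unserved.remove(nxt)
        if not route:
            raise ValueError("a customer demand exceeds the vehicle capacity")
        routes.append(route)
    return routes

# Hypothetical instance: four unit-demand customers, vehicles of capacity 2
routes = nearest_neighbor_routes(
    depot=(0, 0),
    customers=[(1, 0), (2, 0), (0, 5), (0, 6)],
    demands=[1, 1, 1, 1],
    capacity=2,
)
print(routes)
```

Each inner list is one vehicle's visiting order; every route implicitly starts and ends at the depot.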
Domain adaptation of image classification based on collective target nearest-neighbor representation
Tang, Song; Ye, Mao; Liu, Qihe; Li, Fan
2016-05-01
In many practical applications, we frequently face the awkward problem that an image classifier trained in one scenario is difficult to use in a new scenario. Traditionally, probability inference-based methods are used to solve this problem. From the viewpoint of image representation, we propose an approach for domain adaptation of image classification. First, all source samples are supposed to form the dictionary. Then, we encode the target sample by combining this dictionary and the local geometric information. Based on this new representation, called target nearest-neighbor representation, image classification can obtain good performance in the target domain. Our core contribution is that the nearest-neighbor information of the target sample is exploited to form a more robust representation. Experimental results confirm the effectiveness of our method.
Quantum Simulation of Pairing Hamiltonians with Nearest-Neighbor Interacting Qubits
Wang, Zhixin; Gu, Xiu; Wu, Lian-Ao; Liu, Yu-xi
2014-01-01
Although a universal quantum computer is still far from reach, the tremendous advances in controllable quantum devices, in particular with solid-state systems, make it possible to physically implement "quantum simulators". Quantum simulators are physical setups able to simulate other quantum systems efficiently that are intractable on classical computers. Based on solid-state qubit systems with various types of nearest-neighbor interactions, we propose a complete set of algorithms for simulat...
Recovery of delay time from time series based on the nearest neighbor method
Energy Technology Data Exchange (ETDEWEB)
Prokhorov, M.D., E-mail: mdprokhorov@yandex.ru [Saratov Branch of Kotel' nikov Institute of Radio Engineering and Electronics of Russian Academy of Sciences, Zelyonaya Street, 38, Saratov 410019 (Russian Federation); Ponomarenko, V.I. [Saratov Branch of Kotel' nikov Institute of Radio Engineering and Electronics of Russian Academy of Sciences, Zelyonaya Street, 38, Saratov 410019 (Russian Federation); Department of Nano- and Biomedical Technologies, Saratov State University, Astrakhanskaya Street, 83, Saratov 410012 (Russian Federation); Khorev, V.S. [Department of Nano- and Biomedical Technologies, Saratov State University, Astrakhanskaya Street, 83, Saratov 410012 (Russian Federation)
2013-12-09
We propose a method for the recovery of delay time from time series of time-delay systems. The method is based on the nearest neighbor analysis. The method allows one to reconstruct delays in various classes of time-delay systems including systems of high order, systems with several coexisting delays, and nonscalar time-delay systems. It can be applied to time series heavily corrupted by additive and dynamical noise.
Liew, Sing
2012-01-01
The author proposes a simple yet effective method, based on convex layers, nearest neighbors, and the triangle inequality, to approach the Traveling Salesman Problem (TSP). No computer is needed in this method. The method is designed for ordinary people who face the TSP every day but do not have sophisticated knowledge of computer science, programming languages, or applied mathematics. The author also hopes that it will give some insights to researchers who are interested in the TSP.
Recovery of delay time from time series based on the nearest neighbor method
Prokhorov, M. D.; Ponomarenko, V. I.; Khorev, V. S.
2013-12-01
We propose a method for the recovery of delay time from time series of time-delay systems. The method is based on the nearest neighbor analysis. The method allows one to reconstruct delays in various classes of time-delay systems including systems of high order, systems with several coexisting delays, and nonscalar time-delay systems. It can be applied to time series heavily corrupted by additive and dynamical noise.
Testing spatial symmetry using contingency tables based on nearest neighbor relations
Elvan Ceyhan
2014-01-01
Research Article Testing Spatial Symmetry Using Contingency Tables Based on Nearest Neighbor Relations Elvan Ceyhan Department of Mathematics, Koc¸ University, Sarıyer, 34450 Istanbul, Turkey Correspondence should be addressed to Elvan Ceyhan; Received 23 August 2013; Accepted 22 October 2013; Published 19 January 2014 Academic Editors: A. Barra, S. Casado, and J. Pacheco Copyright © 2014 Elvan Ceyhan. This is an open access article distributed under...
A gamma dose distribution evaluation technique using the k-d tree for nearest neighbor searching.
Yuan, Jiankui; Chen, Weimin
2010-09-01
The authors propose an algorithm based on the k-d tree for nearest neighbor searching to improve the gamma calculation time for 2D and 3D dose distributions. The gamma calculation method has been widely used for comparisons of dose distributions in clinical treatment plans and quality assurances. By specifying the acceptable dose and distance-to-agreement criteria, the method provides quantitative measurement of the agreement between the reference and evaluation dose distributions. The gamma value indicates the acceptability. In regions where gamma ≤ 1, the predefined criterion is satisfied and thus the agreement is acceptable; otherwise, the agreement fails. Motivated by the fact that the least searching time for finding a nearest neighbor can be an O(log N) operation with a k-d tree, where N is the total number of the dose points, the authors propose an algorithm based on the k-d tree for the gamma evaluation in this work. In the experiment, the authors found that the average k-d tree construction time per reference point is O(log N), while the nearest neighbor searching time per evaluation point is proportional to O(N^(1/k)), where k is between 2 and 3 for two-dimensional and three-dimensional dose distributions, respectively. Compared with other algorithms such as exhaustive search and sorted list, which are O(N), the k-d tree algorithm for gamma evaluation is much more efficient.
Accurate prediction of enzyme subfamily class using an adaptive fuzzy k-nearest neighbor method.
Huang, Wen-Lin; Chen, Hung-Ming; Hwang, Shiow-Fen; Ho, Shinn-Ying
2007-01-01
Amphiphilic pseudo-amino acid composition (Am-Pse-AAC) with extra sequence-order information is a useful feature for representing enzymes. This study first utilizes the k-nearest neighbor (k-NN) rule to analyze the distribution of enzymes in the Am-Pse-AAC feature space. This analysis indicates that the distributions of multiple classes of enzymes are highly overlapped. To cope with the overlap problem, this study proposes an efficient non-parametric classifier for predicting enzyme subfamily class using an adaptive fuzzy k-nearest neighbor (AFK-NN) method, where k and a fuzzy strength parameter m are adaptively specified. The fuzzy membership values of a query sample Q are dynamically determined according to the position of Q and its weighted distances to the k nearest neighbors. Using the same enzymes of the oxidoreductases family for comparisons, the prediction accuracy of AFK-NN is 76.6%, which is better than those of Support Vector Machine (73.6%), the decision tree method C5.0 (75.4%) and the existing covariant-discriminate algorithm (70.6%) using a jackknife test. To evaluate the generalization ability of AFK-NN, the datasets for all six families of entirely sequenced enzymes are established from the newly updated SWISS-PROT and ENZYME database. The accuracy of AFK-NN on the new large-scale dataset of oxidoreductases family is 83.3%, and the mean accuracy of the six families is 92.1%.
Optimal Selection of Reference Set for the Nearest Neighbor Classification by Tabu Search
Institute of Scientific and Technical Information of China (English)
张鸿宾; 孙广煜
2001-01-01
In this paper, a new approach is presented to find the reference set for the nearest neighbor classifier. The optimal reference set, which has minimum sample size and satisfies a certain error rate threshold, is obtained through a Tabu search algorithm. When the error rate threshold is set to zero, the algorithm obtains a near minimal consistent subset of a given training set. While the threshold is set to a small appropriate value, the obtained reference set may compensate the bias of the nearest neighbor estimate. An aspiration criterion for Tabu search is introduced, which aims to prevent the search process from the inefficient wandering between the feasible and infeasible regions in the search space and speed up the convergence. Experimental results based on a number of typical data sets are presented and analyzed to illustrate the benefits of the proposed method. Compared to conventional methods, such as CNN and Dasarathy's algorithm, the size of the reduced reference sets is much smaller, and the nearest neighbor classification performance is better, especially when the error rate thresholds are set to appropriate nonzero values. The experimental results also illustrate that the MCS (minimal consistent set) of Dasarathy's algorithm is not minimal, and its candidate consistent set is not always ensured to reduce monotonically. A counter example is also given to confirm this claim.
Directory of Open Access Journals (Sweden)
Sumarlin Sumarlin
2016-04-01
In line with the growth of the academic field, especially at the college level, scholarships are an interesting problem to study. Several computing studies on the screening or classification of scholarships have been carried out for academic authorities to minimize errors in awarding scholarships. This study discusses the classification of PPA and BBM scholarships based on predetermined variables by applying the k-nearest neighbor algorithm. The process of selecting PPA and BBM awardees requires a decision support system (DSS) to help provide alternative solutions. The results of the classification system will be used as a decision in awarding scholarships to students who apply. The performance of the k-nearest neighbor algorithm was measured using cross validation, the confusion matrix, and the Receiver Operating Characteristic (ROC) curve. The accuracy obtained for PPA scholarships reached 88.33% with an area under the curve (AUC) of 0.925 on a dataset of 227 records, while the accuracy obtained for BBM scholarships reached 90% with an AUC of 0.937 on a dataset of 183 records; accuracy for PPA and BBM scholarships reached 85.56% with an AUC of 0.958. Because the AUC values were in the range of 0.9 to 1.0, the method falls within the category of very good (excellent). Keywords: Decision Support System; K-nearest neighbor; Classification; Scholarship
A Hybrid Instance Selection Using Nearest-Neighbor for Cross-Project Defect Prediction
Institute of Scientific and Technical Information of China (English)
Duksan Ryu; Jong-In Jang; Jongmoon Baik; Member; ACM; IEEE
2015-01-01
Software defect prediction (SDP) is an active research field in software engineering to identify defect-prone modules. Thanks to SDP, limited testing resources can be effectively allocated to defect-prone modules. Although SDP requires sufficient local data within a company, there are cases where local data are not available, e.g., pilot projects. Companies without local data can employ cross-project defect prediction (CPDP) using external data to build classifiers. The major challenge of CPDP is the different distributions of training and test data. To tackle this, instances of source data similar to target data are selected to build classifiers. Software datasets have a class imbalance problem, meaning that the ratio of the defective class to the clean class is very low. This usually lowers the performance of classifiers. We propose a Hybrid Instance Selection Using Nearest-Neighbor (HISNN) method that performs a hybrid classification selectively learning local knowledge (via k-nearest neighbor) and global knowledge (via naïve Bayes). Instances having strong local knowledge are identified via nearest neighbors with the same class label. Previous studies showed low PD (probability of detection) or high PF (probability of false alarm), which is impractical. The experimental results show that HISNN produces high overall performance as well as high PD and low PF.
Lu, Zhi John; Turner, Douglas H; Mathews, David H
2006-01-01
A complete set of nearest neighbor parameters to predict the enthalpy change of RNA secondary structure formation was derived. These parameters can be used with available free energy nearest neighbor parameters to extend the secondary structure prediction of RNA sequences to temperatures other than 37 degrees C. The parameters were tested by predicting the secondary structures of sequences with known secondary structure that are from organisms with known optimal growth temperatures. Compared with the previous set of enthalpy nearest neighbor parameters, the sensitivity of base pair prediction improved from 65.2 to 68.9% at optimal growth temperatures ranging from 10 to 60 degrees C. Base pair probabilities were predicted with a partition function and the positive predictive value of structure prediction is 90.4% when considering the base pairs in the lowest free energy structure with pairing probability of 0.99 or above. Moreover, a strong correlation is found between the predicted melting temperatures of RNA sequences and the optimal growth temperatures of the host organism. This indicates that organisms that live at higher temperatures have evolved RNA sequences with higher melting temperatures.
Role of nearest-neighbor drops in the kinetics of homogeneous nucleation in a supersaturated vapor
Grinin, A. P.; Zhuvikina, I. A.; Kuni, F. M.; Reiss, H.
2004-12-01
A theory of simultaneous nucleation and drop growth in a supersaturated vapor is developed. The theory makes use of the concept of "nearest-neighbor" drops. The effect of vapor heterogeneity caused by vapor diffusion to a growing drop, formed previously, is accounted for by considering the nucleation of the nearest-neighbor drop. The diffusional boundary value problem is solved through the application of a recent theory that maintains material balance between the vapor and the drop, even though the drop boundary is a moving one. This is fundamental to the use of the proper time and space dependent vapor supersaturation in the application of nucleation theory. The conditions are formulated under which the mean distance to the nearest-neighbor drop and the mean time to its appearance can be determined reliably. Under these conditions, the mean time provides an estimate of the duration of the nucleation stage, while the mean distance provides an estimate of the number of drops formed per unit volume during the nucleation stage. It turns out, surprisingly, that these estimates agree fairly well with the predictions of the simpler and more standard approach based on the approximation that the density of the vapor phase remains uniform during the nucleation stage. Thus, as a practical matter, in many situations, the use of the simpler and less rigorous method is justified by the predictions of the more rigorous, but more complicated theory.
Institute of Scientific and Technical Information of China (English)
郭昌辉; 刘贵全; 张磊
2012-01-01
Storage device performance prediction is a significant element of self-managed storage systems and application planning tasks such as data assignment. Traditional methods for storage device performance prediction, such as accurate simulations and analytic models, need substantial expertise about storage devices. As storage devices become more high-end and complex, accurate simulations and analytic models are no longer available. By contrast, machine learning methods treat storage devices as black boxes and need no information about their internal components or scheduling algorithms, so they are better suited to the current trend in storage device development, although their prediction accuracy is not yet high enough. The classification and regression tree (CART) method for modelling storage devices is simple. This work explores an interactive model based on the regression tree and the K-nearest neighbor algorithm to improve the machine learning method. Experiments show that the proposed model has higher prediction precision and better stability than either the regression tree or KNN alone. In our experiments, we found that the caching effect is very important. We improved the method of workload characterization to account for the caching effect, which makes a substantial difference in prediction accuracy.
Efficient and accurate nearest neighbor and closest pair search in high-dimensional space
Tao, Yufei
2010-07-01
Nearest Neighbor (NN) search in high-dimensional space is an important problem in many applications. From the database perspective, a good solution needs to have two properties: (i) it can be easily incorporated in a relational database, and (ii) its query cost should increase sublinearly with the dataset size, regardless of the data and query distributions. Locality-Sensitive Hashing (LSH) is a well-known methodology fulfilling both requirements, but its current implementations either incur expensive space and query cost, or abandon its theoretical guarantee on the quality of query results. Motivated by this, we improve LSH by proposing an access method called the Locality-Sensitive B-tree (LSB-tree) to enable fast, accurate, high-dimensional NN search in relational databases. The combination of several LSB-trees forms an LSB-forest that has strong quality guarantees, but dramatically improves the efficiency of the previous LSH implementation having the same guarantees. In practice, the LSB-tree itself is also an effective index which consumes linear space, supports efficient updates, and provides accurate query results. In our experiments, the LSB-tree was faster than: (i) iDistance (a famous technique for exact NN search) by two orders of magnitude, and (ii) MedRank (a recent approximate method with nontrivial quality guarantees) by one order of magnitude, and meanwhile returned much better results. As a second step, we extend our LSB technique to solve another classic problem, called Closest Pair (CP) search, in high-dimensional space. The long-term challenge for this problem has been to achieve subquadratic running time at very high dimensionalities, at which most of the existing solutions fail. We show that, using an LSB-forest, CP search can be accomplished in (worst-case) time significantly lower than the quadratic complexity, yet still ensuring very good quality. In practice, accurate answers can be found using just two LSB-trees, thus giving a substantial
Bi, Jiang-lin; Wang, Wei; Li, Qi
2017-07-01
In this paper, the effects of the next-nearest-neighbor exchange couplings on the magnetic and thermal properties of the ferrimagnetic mixed-spin (2, 5/2) Ising model on a 3D honeycomb lattice have been investigated by Monte Carlo simulation. In particular, the influences of the exchange couplings (Ja, Jb, Jan) and the single-ion anisotropy (Da) on the phase diagrams, the total magnetization, the sublattice magnetization, the total susceptibility, the internal energy and the specific heat have been discussed in detail. The results clearly show that the system can exhibit critical and compensation behavior in the presence of the next-nearest-neighbor exchange coupling. A great variety of M curves, such as N-, Q-, P- and L-types, have been discovered, owing to the competition between the exchange coupling and the temperature. Our results are in excellent agreement with other theoretical and experimental works.
Schmalz, M.; Key, G.
Accurate spectral signature classification is a crucial step in the nonimaging detection and recognition of spaceborne objects. In classical hyperspectral recognition applications, especially where linear mixing models are employed, signature classification accuracy depends on accurate spectral endmember discrimination. In selected target recognition (ATR) applications, previous non-adaptive techniques for signature classification have yielded class separation and classifier refinement results that tend to be suboptimal. In practice, the number of signatures accurately classified often depends linearly on the number of inputs. This can lead to potentially severe classification errors in the presence of noise or densely interleaved signatures. In this paper, we present an enhancement of an emerging technology for nonimaging spectral signature classification based on a highly accurate, efficient search engine called Tabular Nearest Neighbor Encoding (TNE). Adaptive TNE can optimize its classifier performance to track input nonergodicities and yield measures of confidence or caution for evaluation of classification results. Unlike neural networks, TNE does not have a hidden intermediate data structure (e.g., a neural net weight matrix). Instead, TNE generates and exploits a user-accessible data structure called the agreement map (AM), which can be manipulated by Boolean logic operations to effect accurate classifier refinement through programmable algorithms. The open architecture and programmability of TNE's pattern-space (AM) processing allows a TNE developer to determine the qualitative and quantitative reasons for classification accuracy, as well as characterize in detail the signatures for which TNE does not obtain classification matches, and why such mis-matches occur. In this study AM-based classification has been modified to partially compensate for input statistical changes, in response to performance metrics such as probability of correct classification (Pd
A γ dose distribution evaluation technique using the k-d tree for nearest neighbor searching.
Yuan, Jiankui; Chen, Weimin
2010-09-01
The authors propose an algorithm based on the k-d tree for nearest neighbor searching to improve the γ calculation time for 2D and 3D dose distributions. The γ calculation method has been widely used for comparisons of dose distributions in clinical treatment plans and quality assurances. By specifying the acceptable dose and distance-to-agreement criteria, the method provides quantitative measurement of the agreement between the reference and evaluation dose distributions. The γ value indicates the acceptability. In regions where γ ≤ 1, the predefined criterion is satisfied and thus the agreement is acceptable; otherwise, the agreement fails. Although the concept of the method is not complicated and a quick naïve implementation is straightforward, an efficient and robust implementation is not trivial. Recent algorithms based on exhaustive searching within a maximum radius, the geometric Euclidean distance, and the table lookup method have been proposed to improve the computational time for multidimensional dose distributions. Motivated by the fact that the least searching time for finding a nearest neighbor can be an O(log N) operation with a k-d tree, where N is the total number of the dose points, the authors propose an algorithm based on the k-d tree for the γ evaluation in this work. In the experiment, the authors found that the average k-d tree construction time per reference point is O(log N), while the nearest neighbor searching time per evaluation point is proportional to O(N^(1/k)), where k is between 2 and 3 for two-dimensional and three-dimensional dose distributions, respectively. Compared with other algorithms such as exhaustive search and sorted list, which are O(N), the k-d tree algorithm for γ evaluation is much more efficient. © 2010 American Association of Physicists in Medicine.
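The idea above can be illustrated by treating each dose sample as a point whose position is scaled by the distance-to-agreement and whose dose is scaled by the dose tolerance, so that γ at an evaluation point becomes the nearest-neighbor distance to the reference points. The sketch below is a simplified pure-Python illustration, not the authors' implementation: the function names, the 1D profile, the absolute dose tolerance, and the absence of interpolation between reference samples are all assumptions.

```python
import math

def build_kdtree(points, depth=0):
    """Recursively build a k-d tree as nested tuples (point, left, right)."""
    if not points:
        return None
    axis = depth % len(points[0])
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return (points[mid],
            build_kdtree(points[:mid], depth + 1),
            build_kdtree(points[mid + 1:], depth + 1))

def nearest_dist(node, query, depth=0, best=math.inf):
    """Distance from query to its nearest neighbor stored in the tree."""
    if node is None:
        return best
    point, left, right = node
    best = min(best, math.dist(point, query))
    axis = depth % len(query)
    diff = query[axis] - point[axis]
    near, far = (left, right) if diff < 0 else (right, left)
    best = nearest_dist(near, query, depth + 1, best)
    if abs(diff) < best:  # the far side may still hold a closer point
        best = nearest_dist(far, query, depth + 1, best)
    return best

def gamma_values(reference, evaluation, dta, dose_tol):
    """reference and evaluation are lists of (position, dose) samples."""
    def scale(sample):
        return (sample[0] / dta, sample[1] / dose_tol)
    tree = build_kdtree([scale(s) for s in reference])
    return [nearest_dist(tree, scale(s)) for s in evaluation]

# Hypothetical 1D profile: positions in mm, doses in Gy
reference = [(0.0, 1.00), (1.0, 2.00), (2.0, 3.00)]
evaluation = [(0.0, 1.00), (1.0, 2.03), (2.0, 3.50)]
gammas = gamma_values(reference, evaluation, dta=3.0, dose_tol=0.03)
print([g <= 1.0 + 1e-9 for g in gammas])  # pass/fail with a small float tolerance
```

A production implementation would typically interpolate the reference distribution and handle 2D or 3D grids; the tree code itself works unchanged for any number of dimensions.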
Dula, J.; Zare, A.; Ho, Dominic; Gader, P.
2013-06-01
A possibilistic K-Nearest Neighbors classifier is presented to classify mine and non-mine objects using data collected from a wideband electromagnetic induction (WEMI) sensor. The proposed classifier is motivated by the observation that buried objects often have consistent signatures depending on their metal content, size, shape, and depth. Given a joint orthogonal matching pursuits (JOMP) sparse representation, particular target types consistently selected the same dictionary elements. The proposed classifier distinguishes between target types using the frequency of dictionary elements selected by potential landmine alarms. Results are shown on data containing sixteen landmine types and several non-mine examples.
Efficient K-Nearest Neighbor Join Algorithms for High Dimensional Sparse Data
Wang, Jijie; Huang, Ting; Wang, Jingjing; He, Zengyou
2010-01-01
The K-Nearest Neighbor (KNN) join is an expensive but important operation in many data mining algorithms. Several recent applications need to perform KNN join for high dimensional sparse data. Unfortunately, all existing KNN join algorithms are designed for low dimensional data. To fill this void, we investigate the KNN join problem for high dimensional sparse data. In this paper, we propose three KNN join algorithms: a brute force (BF) algorithm, an inverted index-based (IIB) algorithm and an improved inverted index-based (IIIB) algorithm. Extensive experiments on both synthetic and real-world datasets were conducted to demonstrate the effectiveness of our algorithms for high dimensional sparse data.
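The general idea of an inverted-index KNN join on sparse data can be sketched as follows. This is not the paper's IIB or IIIB algorithm: the abstract does not state the similarity measure, so the sketch assumes dot-product similarity over non-negative sparse vectors stored as dictionaries, and all function names are hypothetical.

```python
import heapq
from collections import defaultdict

def build_inverted_index(S):
    """Map each dimension to the (vector id, value) pairs that use it."""
    index = defaultdict(list)
    for sid, vec in enumerate(S):
        for dim, val in vec.items():
            index[dim].append((sid, val))
    return index

def knn_join(R, S, k):
    """For each vector in R, return ids of its k most similar vectors in S."""
    index = build_inverted_index(S)
    result = []
    for vec in R:
        scores = defaultdict(float)
        # Only vectors sharing at least one non-zero dimension are touched
        for dim, val in vec.items():
            for sid, sval in index.get(dim, ()):
                scores[sid] += val * sval
        # Highest score first; ties go to the lower id
        top = heapq.nlargest(k, scores.items(),
                             key=lambda kv: (kv[1], -kv[0]))
        result.append([sid for sid, _ in top])
    return result

# Hypothetical sparse vectors: {dimension: weight}
R = [{0: 1.0, 2: 2.0}]
S = [{0: 1.0}, {2: 1.0}, {1: 5.0}, {0: 1.0, 2: 1.0}]
print(knn_join(R, S, k=2))
```

Vectors in S that share no dimension with the query have similarity zero and are never scored, so a query may receive fewer than k neighbors; that is the source of the savings over a brute-force scan.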
Face Recognition Based on Support Vector Machine and Nearest Neighbor Classifier
Institute of Scientific and Technical Information of China (English)
张燕昆; 刘重庆
2003-01-01
Support vector machine (SVM), as a novel approach in pattern recognition, has demonstrated success in face detection and face recognition. In this paper, a face recognition approach based on the SVM classifier with the nearest neighbor classifier (NNC) is proposed. Principal component analysis (PCA) is used to reduce the dimension and extract features. Then a one-against-all strategy is used to train the SVM classifiers. At the testing stage, we propose an algorithm that combines the SVM classifier with NNC to improve the correct recognition rate. We conduct the experiment on the Cambridge ORL face database. The result shows that our approach outperforms the standard eigenface approach and some other approaches.
Weighted K-Nearest Neighbor Classification Algorithm Based on Genetic Algorithm
Directory of Open Access Journals (Sweden)
Xuesong Yan
2013-10-01
K-Nearest Neighbor (KNN) is one of the most popular algorithms for data classification. Many researchers have found that the KNN algorithm achieves very good performance in their experiments on different datasets. The traditional KNN text classification algorithm has limitations: computational complexity, performance that depends solely on the training set, and so on. To overcome these limitations, an improved version of KNN is proposed in this paper: we use a genetic algorithm combined with weighted KNN to improve its classification performance. The experimental results show that our proposed algorithm outperforms KNN with greater accuracy.
An, Fengwei; Mihara, Keisuke; Yamasaki, Shogo; Chen, Lei; Jürgen Mattausch, Hans
2016-04-01
VLSI implementations are often applied to solve the high computational cost of pattern matching but usually have low flexibility for satisfying different target applications. In this paper, a digital word-parallel associative memory architecture for k nearest neighbor (KNN) search, which is one of the most basic algorithms in pattern recognition, is reported, applying the squared Euclidean distance measure. The reported architecture features reconfigurable parallelism, dual-storage space to achieve a flexible number of reference vectors, and a dedicated majority vote circuit. Programmable switching circuits, located between vector components, enable scalability of the searching parallelism by configuring the reference feature-vector dimensionality. A pipelined storage with dual static-random-access-memory (SRAM) cells for each unit and an intermediate winner control circuit are designed to extend the applicability by improving the flexibility of the reference storage. A test chip in 180 nm CMOS technology, which has 32 rows, 4 elements in each row and 2-parallel 8-bit dual-components in each element, consumes altogether 61.4 mW and in particular only 11.9 mW during the reconfigurable KNN classification (at 45.58 MHz and 1.8 V).
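As a software analogue of the operation the hardware accelerates, KNN classification with the squared Euclidean distance and a majority vote can be sketched as below. The function names and example vectors are illustrative assumptions and say nothing about the chip's internal design.

```python
from collections import Counter

def squared_euclidean(a, b):
    """Squared Euclidean distance; no square root is needed for ranking."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def knn_classify(query, references, labels, k):
    """Label the query by majority vote among its k nearest references."""
    order = sorted(range(len(references)),
                   key=lambda i: squared_euclidean(query, references[i]))
    votes = Counter(labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]

# Hypothetical 8-bit feature vectors with two classes
references = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10)]
labels = ["a", "a", "a", "b", "b"]
print(knn_classify((9, 9), references, labels, k=3))
```

Skipping the square root preserves the neighbor ordering, which is one reason the squared distance is attractive for integer hardware.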
Directory of Open Access Journals (Sweden)
D.A. Adeniyi
2016-01-01
The major problem of many online websites is that they present many choices to the client at a time; this usually results in a strenuous and time-consuming task of finding the right product or information on the site. In this work, we present a study of automatic web usage data mining and a recommendation system based on current user behavior through his/her click stream data on the newly developed Really Simple Syndication (RSS) reader website, in order to provide relevant information to the individual without explicitly asking for it. The K-Nearest-Neighbor (KNN) classification method has been trained to be used online and in real time to identify clients'/visitors' click stream data, match it to a particular user group, and recommend a tailored browsing option that meets the needs of the specific user at a particular time. To achieve this, the web users' RSS address file was extracted, cleansed, formatted, and grouped into meaningful sessions, and a data mart was developed. Our results show that the K-Nearest Neighbor classifier is transparent, consistent, straightforward, simple to understand, tends to possess desirable qualities, and is easier to implement than most other machine learning techniques, specifically when there is little or no prior knowledge about the data distribution.
k-Nearest Neighbor Query Processing Algorithms for a Query Region in Road Networks
Institute of Scientific and Technical Information of China (English)
Hyeong-Il Kim; Jae-Woo Chang
2013-01-01
Recent developments in wireless communication technologies and the popularity of smartphones are making location-based services (LBS) popular. However, sending queries to LBS servers with users' exact locations may threaten the privacy of users. Therefore, there has been much research on generating a cloaked query region for user privacy protection. Consequently, an efficient query processing algorithm for a query region is required. In this paper, we propose k-nearest neighbor (k-NN) query processing algorithms for a query region in road networks. To efficiently retrieve the k-NN points of interest (POIs), we make use of the Island index. We also propose a method that generates an adaptive Island index to improve query processing performance and storage usage. Finally, we show by our performance analysis that our k-NN query processing algorithms outperform the existing k-Range Nearest Neighbor (kRNN) algorithm in terms of network expansion cost and query processing time.
Enhancing the Jacquez k nearest neighbor test for space-time interaction.
Malizia, Nicholas; Mack, Elizabeth A
2012-09-20
The Jacquez k nearest neighbor test, originally developed to improve upon shortcomings of existing tests for space-time interaction, has been shown to be a robust and powerful method of detecting interaction. Despite its flexibility and power, however, the test has three main shortcomings: (i) it discards important information regarding the spatial and temporal scales at which the detected interaction takes place; (ii) the results of the test have not been visualized; and (iii) recent research demonstrates the test to be susceptible to population shift bias. This study presents enhancements to the Jacquez k nearest neighbors test with the goal of addressing each of these three shortcomings and of improving the utility of the test. Data on Burkitt's lymphoma cases in Uganda between 1961 and 1975 are used to illustrate the modifications and enhanced visual output of the test. Output from the enhanced test is compared with that provided by alternative tests of space-time interaction. Results show the enhancements presented in this study transform the Jacquez test into a complete, descriptive, and informative metric that can be used as a stand-alone measure of global space-time interaction. Copyright © 2012 John Wiley & Sons, Ltd.
Batik Motif Recognition Using Canny Edge Detection and K-Nearest Neighbor
Directory of Open Access Journals (Sweden)
Johanes Widagdho Yodha
2014-11-01
Batik is one of Indonesia's distinctive cultural products known worldwide. This study aims to recognize six types of batik motif from the book by H. Santosa Doellah titled "Batik: Pengaruh Zaman dan Lingkungan". The classification process goes through three stages: preprocessing, feature extraction, and classification. Preprocessing converts the color batik image to a grayscale image. In the feature extraction stage, the contrast of the grayscale image is enhanced with histogram equalization, and Canny edge detection is then used to separate the batik motif from its background and to obtain the pattern of the motif. The extraction results are then grouped and labeled according to their respective motifs and classified using k-Nearest Neighbor with the Manhattan distance. In the experiments, the highest accuracy reached 100% when the testing data were the same as the training data (a dataset of 300 images). When the training data differed from the testing data, the highest accuracy was 66.67%. Both accuracies were obtained with lower threshold = 0.010, upper threshold = 0.115, and k = 1. Keywords: Batik, Edge Detection, Canny, k-Nearest Neighbor, Manhattan distance
Suratanee, Apichat; Plaimas, Kitiporn
2014-08-01
Inflammatory bowel disease (IBD) is a chronic disease whose incidence and prevalence increase every year; however, the pathogenesis of IBD is still unclear. Thus, identifying IBD-related proteins is important for understanding its complex disease mechanism. Here, we propose a new and simple network-based approach using a reverse k-nearest neighbor (RkNN) search to identify novel IBD-related proteins. Protein-protein interactions (PPI) and Genome-Wide Association Studies (GWAS) were used in this study. After constructing the PPI network, the RkNN search was applied to all of the proteins to identify sets of influenced proteins among their k-nearest neighbors (RkNNs). An observed protein whose influenced proteins were mostly known IBD-related proteins was statistically identified as a novel IBD-related protein. Our method outperformed a random aspect, kNN search, and centrality measures based on the network topology. A total of 39 proteins were identified as IBD-related proteins. Of these proteins, 71% were reported at least once in the literature as related to IBD. Additionally, these proteins were found over-represented in the IBD pathway and enriched in importantly functional pathways in IBD. In conclusion, the RkNN search with the statistical enrichment test is a great tool to identify IBD-related proteins to better understand its complex disease mechanism.
Prediction of protein solvent accessibility using fuzzy k-nearest neighbor method.
Sim, Jaehyun; Kim, Seung-Yeon; Lee, Julian
2005-06-15
The solvent accessibility of amino acid residues plays an important role in tertiary structure prediction, especially in the absence of significant sequence similarity of a query protein to those with known structures. The prediction of solvent accessibility is less accurate than secondary structure prediction in spite of improvements in recent research. The k-nearest neighbor method, a simple but powerful classification algorithm, has never been applied to the prediction of solvent accessibility, although it has been used frequently for the classification of biological and medical data. We applied the fuzzy k-nearest neighbor method to the solvent accessibility prediction, using PSI-BLAST profiles as feature vectors, and achieved high prediction accuracies. With leave-one-out cross-validation on the ASTRAL SCOP reference dataset constructed by sequence clustering, our method achieved 64.1% accuracy for a 3-state (buried/intermediate/exposed) prediction (thresholds of 9% for buried/intermediate and 36% for intermediate/exposed) and 86.7, 82.0, 79.0 and 78.5% accuracies for 2-state (buried/exposed) predictions (thresholds of 0, 5, 16 and 25% for buried/exposed, respectively). Our method also showed slightly better accuracies than other methods, by about 2-5%, on the RS126 dataset and a benchmarking dataset with 229 proteins. Program and datasets are available at http://biocom1.ssu.ac.kr/FKNNacc/. Contact: jul@ssu.ac.kr.
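To make the idea of fuzzy k-nearest neighbor classification concrete, here is a minimal Python sketch of one widely used formulation, in which each of the k nearest training samples votes for its class with weight 1/d^(2/(m-1)). The fuzzifier m, the crisp training labels, the Euclidean metric, and the toy one-dimensional data are all assumptions for illustration; the abstract does not state the exact membership function or distance used with the PSI-BLAST profiles.

```python
import math


def fuzzy_knn(query, samples, labels, k=3, m=2.0, eps=1e-12):
    """Return class memberships (summing to 1) for query from its k nearest samples.

    Assumed weighting: each neighbor contributes 1 / d**(2/(m-1)) to its class.
    """
    # Keep the k nearest (distance, label) pairs.
    nearest = sorted((math.dist(query, s), lab)
                     for s, lab in zip(samples, labels))[:k]
    # eps avoids division by zero when the query coincides with a sample.
    weights = [(1.0 / (d ** (2.0 / (m - 1.0)) + eps), lab) for d, lab in nearest]
    total = sum(w for w, _ in weights)
    memberships = {}
    for w, lab in weights:
        memberships[lab] = memberships.get(lab, 0.0) + w / total
    return memberships


# Hypothetical one-dimensional feature vectors with two accessibility states.
samples = [(0.0,), (1.0,), (4.0,)]
labels = ["buried", "buried", "exposed"]
print(fuzzy_knn((2.0,), samples, labels, k=3))
```

Unlike a plain majority vote, the output is a graded membership per class, so a threshold on the membership can trade sensitivity against specificity.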
Prediction of carbamylated lysine sites based on the one-class k-nearest neighbor method.
Huang, Guohua; Zhou, You; Zhang, Yuchao; Li, Bi-Qing; Zhang, Ning; Cai, Yu-Dong
2013-11-01
Protein carbamylation is one of the important post-translational modifications, which plays a pivotal role in a number of biological conditions, such as diseases, chronic renal failure and atherosclerosis. Therefore, recognition and identification of protein carbamylated sites are essential for disease treatment and prevention. Yet the mechanism of action of carbamylated lysine sites is still not realized. Thus it remains a largely unsolved challenge to uncover it, whether experimentally or theoretically. To address this problem, we have presented a computational framework for theoretically predicting and analyzing carbamylated lysine sites based on both the one-class k-nearest neighbor method and two-stage feature selection. The one-class k-nearest neighbor method requires no negative samples in training. Experimental results showed that by using 280 optimal features the presented method achieved promising performances of SN=82.50% for the jackknife test on the training set, and SN=66.67%, SP=100.00% and MCC=0.8097 for the independent test on the testing set, respectively. Further analysis of the optimal features provided insights into the mechanism of action of carbamylated lysine sites. It is anticipated that our method could be a potentially useful and essential tool for biologists to theoretically investigate carbamylated lysine sites.
Efficient Metric All-k-Nearest-Neighbor Search on Datasets Without Any Index
Institute of Scientific and Technical Information of China (English)
Hai-Da Zhang; Zhi-Hao Xing; Lu Chen; Yun-Jun Gao
2016-01-01
An all-k-nearest-neighbor (AkNN) query finds the k nearest neighbors for each query object. This problem arises naturally in many areas, such as GIS (geographic information systems), multimedia retrieval, and recommender systems. To support the various data types and flexible distance metrics involved in real applications, we study AkNN retrieval in metric spaces, namely, metric AkNN (MAkNN) search. We consider the case where underlying indexes on the query set and the object set do not exist, which is natural in many scenarios. For example, the query set and the object set could be the results of other queries, and thus the underlying indexes cannot be built in advance. To support MAkNN search on datasets without any underlying index, we propose an efficient disk-based algorithm, termed the Partition-Based MAkNN Algorithm (PMA), which follows a partition-search framework and employs a series of pruning rules to accelerate the search. In addition, we extend our techniques to tackle an interesting variant of MAkNN queries, i.e., metric self-AkNN (MSAkNN) search, where the query set is identical to the object set. Extensive experiments using both real and synthetic datasets demonstrate the effectiveness of our pruning rules and the efficiency of the proposed algorithms, compared with state-of-the-art MAkNN and MSAkNN algorithms.
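For orientation, the problem statement itself can be written as a brute-force baseline. The Python sketch below is not PMA: it performs no partitioning or pruning and simply evaluates an arbitrary user-supplied metric for every query-object pair. The function name and the toy integer data are assumptions for illustration.

```python
import heapq


def all_knn(queries, objects, k, metric):
    """For every query object, return its k nearest objects under metric.

    Brute-force reference: O(|queries| * |objects|) distance evaluations,
    with no index, partitioning, or pruning.
    """
    return [heapq.nsmallest(k, objects, key=lambda o: metric(q, o))
            for q in queries]


# Hypothetical metric space: integers under absolute difference.
print(all_knn([0, 10], [1, 2, 9, 12], k=2, metric=lambda a, b: abs(a - b)))
```

Because only the metric is consulted, the same function works for strings under edit distance or any other metric space; the cost of the exhaustive scan is what pruning rules aim to avoid.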
Simulating ensembles of source water quality using a K-nearest neighbor resampling approach.
Towler, Erin; Rajagopalan, Balaji; Seidel, Chad; Summers, R Scott
2009-03-01
Climatological, geological, and water management factors can cause significant variability in surface water quality. As drinking water quality standards become more stringent, the ability to quantify the variability of source water quality becomes more important for decision-making and planning in water treatment for regulatory compliance. However, paucity of long-term water quality data makes it challenging to apply traditional simulation techniques. To overcome this limitation, we have developed and applied a robust nonparametric K-nearest neighbor (K-nn) bootstrap approach utilizing the United States Environmental Protection Agency's Information Collection Rule (ICR) data. In this technique, first an appropriate "feature vector" is formed from the best available explanatory variables. The nearest neighbors to the feature vector are identified from the ICR data and are resampled using a weight function. Repetition of this results in water quality ensembles, and consequently the distribution and the quantification of the variability. The main strengths of the approach are its flexibility, simplicity, and the ability to use a large amount of spatial data with limited temporal extent to provide water quality ensembles for any given location. We demonstrate this approach by applying it to simulate monthly ensembles of total organic carbon for two utilities in the U.S. with very different watersheds and to alkalinity and bromide at two other U.S. utilities.
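The resampling step described above can be sketched as follows. This is a minimal Python illustration, assuming a Euclidean feature distance and a weight function proportional to 1/rank (a common choice in K-nn bootstrap work, but not stated in the abstract); the record layout, function name, and numbers are hypothetical.

```python
import random


def knn_resample(feature, records, k, n_draws, rng=None):
    """Draw an ensemble of values from the k records nearest to feature.

    records is a list of (feature_vector, value) pairs.
    Assumed weight function: the j-th nearest neighbor has weight 1/j.
    """
    rng = rng or random.Random(0)
    # Rank records by squared Euclidean distance in feature space.
    ranked = sorted(
        records,
        key=lambda r: sum((a - b) ** 2 for a, b in zip(feature, r[0])),
    )[:k]
    weights = [1.0 / (j + 1) for j in range(len(ranked))]
    values = [value for _, value in ranked]
    # Weighted sampling with replacement yields the ensemble.
    return rng.choices(values, weights=weights, k=n_draws)


# Hypothetical records: (explanatory variables, observed water quality value).
records = [((0.0,), 1.0), ((1.0,), 2.0), ((5.0,), 9.0), ((6.0,), 10.0)]
ensemble = knn_resample((0.2,), records, k=2, n_draws=50, rng=random.Random(1))
print(min(ensemble), max(ensemble))
```

The spread of the returned ensemble is what quantifies variability at the location described by the feature vector.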
Institute of Scientific and Technical Information of China (English)
王玉丹; 南卓铜; 陈浩; 吴小波
2016-01-01
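Precipitation data for the Tibetan Plateau are mainly produced by merging remote sensing products with multi-source observations. Because observation stations on the plateau are sparse and unevenly distributed and remote sensing data carry large errors, commonly used precipitation datasets such as CMORPH (Climate Prediction Center Morphing Technique) have limited accuracy. With a K-Nearest Neighbor (KNN) model, the relationship between environmental factors (elevation, slope, aspect, vegetation), meteorological factors (air temperature, humidity, wind speed), and daily precipitation can be established, so that the CMORPH daily precipitation dataset for the Tibetan Plateau can be corrected and its accuracy improved. An error analysis of the CMORPH daily precipitation data shows that the CMORPH data corrected with the KNN model are better than both the original data and the CMORPH data corrected with the PDF (Probability Density Function Matching Method) approach, and that their spatial distribution agrees well with the precipitation distribution characteristics of the Tibetan Plateau.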
Seismic clusters analysis in North-Eastern Italy by the nearest-neighbor approach
Peresan, Antonella; Gentili, Stefania
2016-04-01
The main features of earthquake clusters in the Friuli Venezia Giulia Region (North Eastern Italy) are explored, with the aim to get some new insights on local scale patterns of seismicity in the area. The study is based on a systematic analysis of robustly and uniformly detected seismic clusters of small-to-medium magnitude events, as opposed to selected clusters analyzed in earlier studies. To characterize the features of seismicity for FVG, we take advantage of updated information from local OGS bulletins, compiled at the National Institute of Oceanography and Experimental Geophysics, Centre of Seismological Research, since 1977. A preliminary reappraisal of the earthquake bulletins is carried out, in order to identify possible missing events and to remove spurious records (e.g. duplicates and explosions). The area of sufficient completeness is outlined; for this purpose, different techniques are applied, including a comparative analysis with global ISC data, which are available in the region for large and moderate size earthquakes. Various techniques are considered to estimate the average parameters that characterize the earthquake occurrence in the region, including the b-value and the fractal dimension of epicenters distribution. Specifically, besides the classical Gutenberg-Richter Law, the Unified Scaling Law for Earthquakes, USLE, is applied. Using the updated and revised OGS data, a new formal method for detection of earthquake clusters, based on nearest-neighbor distances of events in space-time-energy domain, is applied. The bimodality of the distribution, which characterizes the earthquake nearest-neighbor distances, is used to decompose the seismic catalog into sequences of individual clusters and background seismicity. Accordingly, the method allows for a data-driven identification of main shocks (first event with the largest magnitude in the cluster), foreshocks and aftershocks. Average robust estimates of the USLE parameters (particularly, b
Discrimination of outer membrane proteins using a K-nearest neighbor method.
Yan, C; Hu, J; Wang, Y
2008-06-01
Identification of outer membrane proteins (OMPs) from genome is an important task. This paper presents a k-nearest neighbor (K-NN) method for discriminating outer membrane proteins (OMPs). The method makes predictions based on a weighted Euclidean distance that is computed from residue composition. The method achieves 89.1% accuracy with 0.668 MCC (Matthews correlation coefficient) in discriminating OMPs and non-OMPs. The performance of the method is improved by including homologous information into the calculation of residue composition. The final method achieves an accuracy of 96.1%, with 0.873 MCC, 87.5% sensitivity, and 98.2% specificity. Comparisons with multiple recently published methods show that the method proposed in this study outperforms the others.
Enhancing Patient Safety Event Reporting by K-nearest Neighbor Classifier.
Liang, Chen; Gong, Yang
2015-01-01
Poor data quality has been identified as a major reason for the low utility of patient safety event reporting systems. A pressing need to improve data quality has focused recent research on data entry and its associated human factors. The debate over structured versus unstructured data entry reveals not only a trade-off among data accuracy, completeness, and timeliness, but also a technical gap in text mining. The present study proposes a text classification method, k-nearest neighbor (KNN), for predicting subject categories in our proposed reporting system. Our results demonstrate the feasibility of a KNN classifier for text classification and indicate the advantage of such an application for raising data quality and supporting clinical decisions in reporting patient safety events.
Quantum Algorithm for K-Nearest Neighbors Classification Based on the Metric of Hamming Distance
Ruan, Yue; Xue, Xiling; Liu, Heng; Tan, Jianing; Li, Xi
2017-08-01
The K-nearest neighbors (KNN) algorithm is a common classification algorithm and also a subroutine in various more complicated machine learning tasks. In this paper, we present a quantum algorithm (QKNN) that implements it based on the Hamming distance metric. We put forward a quantum circuit for computing the Hamming distance between a testing sample and each feature vector in the training set. Taking advantage of this method, we realize a good analog of the classical KNN algorithm by setting a distance threshold value t to select the k nearest neighbors. As a result, QKNN achieves O(n^3) performance, which depends only on the dimension of the feature vectors, and high classification accuracy, outperforming Lloyd's algorithm (Lloyd et al. 2013) and Wiebe's algorithm (Wiebe et al. 2014).
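The threshold-based selection has a simple classical counterpart, sketched here in Python. This is only a classical illustration of choosing neighbors by a Hamming-distance threshold t and voting among them; it does not model the quantum circuit, and the function names and bit strings are assumptions.

```python
from collections import Counter


def hamming(a, b):
    """Number of positions at which two equal-length bit strings differ."""
    return sum(x != y for x, y in zip(a, b))


def threshold_classify(query, training, labels, t):
    """Vote among all training vectors within Hamming distance t of query.

    Returns None when no training vector falls inside the threshold.
    """
    votes = Counter(lab for vec, lab in zip(training, labels)
                    if hamming(query, vec) <= t)
    return votes.most_common(1)[0][0] if votes else None


# Hypothetical 4-bit feature vectors with binary class labels.
training = ["0000", "0001", "1111", "1110"]
labels = [0, 0, 1, 1]
print(threshold_classify("1011", training, labels, t=1))
```

With a threshold, the number of selected neighbors varies with t rather than being fixed at k, so t has to be tuned to the data.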
Quantum simulation of pairing Hamiltonians with nearest-neighbor-interacting qubits
Wang, Zhixin; Gu, Xiu; Wu, Lian-Ao; Liu, Yu-xi
2016-06-01
Although a universal quantum computer is still far from reach, the tremendous advances in controllable quantum devices, in particular with solid-state systems, make it possible to physically implement "quantum simulators." Quantum simulators are physical setups able to simulate other quantum systems efficiently that are intractable on classical computers. Based on solid-state qubit systems with various types of nearest-neighbor interactions, we propose a complete set of algorithms for simulating pairing Hamiltonians. The fidelity of the target states corresponding to each algorithm is numerically studied. We also compare algorithms designed for different types of experimentally available Hamiltonians and analyze their complexity. Furthermore, we design a measurement scheme to extract energy spectra from the simulators. Our simulation algorithms might be feasible with state-of-the-art technology in solid-state quantum devices.
Categorizing document by fuzzy C-Means and K-nearest neighbors approach
Priandini, Novita; Zaman, Badrus; Purwanti, Endah
2017-08-01
Advances in technology and the growing number of documents have made document categorization important. Managing documents by categorizing them is an application of information retrieval, because the process involves text mining. Categorization can be done with both the Fuzzy C-Means (FCM) and K-Nearest Neighbors (KNN) methods, and this experiment combines the two with the aim of improving document categorization performance. First, FCM is used to cluster the training documents. Second, KNN is used to categorize the testing documents until the categorization output is shown. In the experiment, 14 of 20 testing documents were retrieved into the relevant category, while 6 of 20 were retrieved into an irrelevant one. The system evaluation shows that precision and recall are both 0.7.
K-Nearest Neighbor Intervals Based AP Clustering Algorithm for Large Incomplete Data
Directory of Open Access Journals (Sweden)
Cheng Lu
2015-01-01
The Affinity Propagation (AP) algorithm is an effective algorithm for clustering analysis, but it is not directly applicable to incomplete data. In view of the prevalence of missing data and the uncertainty of missing attributes, we put forward a modified AP clustering algorithm based on K-nearest neighbor intervals (KNNI) for incomplete data. Based on an Improved Partial Data Strategy, the proposed algorithm estimates the KNNI representation of missing attributes by using the attribute distribution information of the available data. The similarity function is then modified to handle the interval data, so that the improved AP algorithm becomes applicable to incomplete data. Experiments on several UCI datasets show that the proposed algorithm achieves impressive clustering results.
Sdika, Michaël
2010-04-01
In this paper, different methods to improve atlas based segmentation are presented. The first technique is a new mapping of the labels of an atlas consistent with a given intensity classification segmentation. This new mapping combines the two segmentations using the nearest neighbor transform and is especially effective for complex and folded regions like the cortex where the registration is difficult. Then, in a multi atlas context, an original weighting is introduced to combine the segmentation of several atlases using a voting procedure. This weighting is derived from statistical classification theory and is computed offline using the atlases as a training dataset. Concretely, the accuracy map of each atlas is computed and the vote is weighted by the accuracy of the atlases. Numerical experiments have been performed on publicly available in vivo datasets and show that, when used together, the two techniques provide an important improvement of the segmentation accuracy.
Using the joint transform correlator as the feature extractor for the nearest neighbor classifier
Soon, Boon Y.; Karim, Mohammad A.; Alam, Mohammad S.
1999-01-01
Financial transactions using credit cards have gained popularity but the growing number of counterfeits and frauds may defeat the purpose of the cards. The search for a superior method to curb the criminal acts has become urgent especially in the brilliant information age. Currently, neural-network-based pattern recognition techniques are employed for security verification. However, it has been a time consuming experience, as some techniques require a long period of training time. Here, a faster and more efficient method is proposed to perform security verification that verifies the fingerprint images using the joint transform correlator as a feature extractor for nearest neighbor classifier. The uniqueness comparison scheme is proposed to improve the accuracy of the system verification. The performance of the system under noise corruption, variable contrast, and rotation of the input image is verified with a computer simulation.
Contribution to Transfer Entropy Estimation via the k-Nearest-Neighbors Approach
Directory of Open Access Journals (Sweden)
Jie Zhu
2015-06-01
This paper deals with the estimation of transfer entropy based on the k-nearest neighbors (k-NN) method. To this end, we first investigate the estimation of Shannon entropy involving a rectangular neighboring region, as suggested in the existing literature, and develop two kinds of entropy estimators. Then, applying the widely used error cancellation approach to these entropy estimators, we propose two novel transfer entropy estimators that incur no extra computational cost compared to existing similar k-NN algorithms. Experimental simulations allow the comparison of the new estimators with the transfer entropy estimators available in free toolboxes, which correspond to two different extensions of the Kraskov–Stögbauer–Grassberger (KSG) mutual information estimator to transfer entropy estimation, and demonstrate the effectiveness of the new estimators.
Supporting K nearest neighbors query on high-dimensional data in P2P systems
Institute of Scientific and Technical Information of China (English)
Mei LI; Wang-Chien LEE; Anand SIVASUBRAMANIAM; Jizhong ZHAO
2008-01-01
Peer-to-peer systems have been widely used for sharing and exchanging data and resources among numerous computer nodes. Various data objects identifiable with high dimensional feature vectors, such as text, images, and genome sequences, are starting to leverage P2P technology. Most existing works have focused on queries on data objects with one or a few attributes and thus are not applicable to high dimensional data objects. In this study, we investigate K nearest neighbors (KNN) queries on high dimensional data objects in P2P systems. An efficient query algorithm and solutions that address various technical challenges raised by high dimensionality, such as search space resolution and incremental search space refinement, are proposed. An extensive simulation using both synthetic and real data sets demonstrates that our proposal efficiently supports KNN queries on high dimensional data in P2P systems.
Distance-Constraint k-Nearest Neighbor Searching in Mobile Sensor Networks.
Han, Yongkoo; Park, Kisung; Hong, Jihye; Ulamin, Noor; Lee, Young-Koo
2015-07-27
The κ-Nearest Neighbors (κNN) query is an important spatial query in mobile sensor networks. In this work we extend κNN to include a distance constraint, calling it an l-distant κ-nearest-neighbors (l-κNN) query, which finds the κ sensor nodes nearest to a query point that are also at distance l or greater from each other. The query results indicate the objects nearest to the area of interest that are scattered from each other by at least distance l. The l-κNN query can be used in most κNN applications when well distributed query results are needed. To process an l-κNN query, we must discover all sets of κNN sensor nodes and then find all pairs of sensor nodes in each set that are separated by at least a distance l. Given the limited battery and computing power of sensor nodes, this l-κNN query processing is prohibitively expensive in terms of energy consumption. In this paper, we propose a greedy approach for l-κNN query processing in mobile sensor networks. The key idea of the proposed approach is to divide the search space into subspaces whose sides all have length l. By selecting κ sensor nodes from the other subspaces near the query point, we guarantee accurate query results for l-κNN. In our experiments, we show that the proposed method exhibits superior performance compared with a post-processing based method using the κNN query in terms of energy efficiency, query latency, and accuracy.
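The constraint behind an l-κNN query can be illustrated with a simple centralized greedy filter. The Python sketch below scans nodes in order of distance to the query point and keeps a node only if it lies at least l away from every node already kept. It is a simplification for illustration and is not the subspace-partitioning scheme proposed in the paper; the function name and coordinates are assumptions.

```python
import math


def greedy_l_knn(query, nodes, k, l):
    """Greedily pick up to k nodes nearest to query that are pairwise >= l apart."""
    chosen = []
    # Visit candidates from nearest to farthest.
    for node in sorted(nodes, key=lambda p: math.dist(query, p)):
        # Accept only if the separation constraint holds against all chosen nodes.
        if all(math.dist(node, c) >= l for c in chosen):
            chosen.append(node)
            if len(chosen) == k:
                break
    return chosen


# Hypothetical sensor positions in the plane.
nodes = [(1, 0), (1.5, 0), (0, 3), (4, 0), (0, -5)]
print(greedy_l_knn((0, 0), nodes, k=3, l=2))
```

A greedy scan like this needs every node's position at one place, which is exactly the communication cost that an in-network method tries to avoid.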
Schmalz, M.; Ritter, G.; Key, R.
Accurate and computationally efficient spectral signature classification is a crucial step in the nonimaging detection and recognition of spaceborne objects. In classical hyperspectral recognition applications using linear mixing models, signature classification accuracy depends on accurate spectral endmember discrimination [1]. If the endmembers cannot be classified correctly, then the signatures cannot be classified correctly, and object recognition from hyperspectral data will be inaccurate. In practice, the number of endmembers accurately classified often depends linearly on the number of inputs. This can lead to potentially severe classification errors in the presence of noise or densely interleaved signatures. In this paper, we present a comparison of emerging technologies for nonimaging spectral signature classification based on a highly accurate, efficient search engine called Tabular Nearest Neighbor Encoding (TNE) [3,4] and a neural network technology called Morphological Neural Networks (MNNs) [5]. Based on prior results, TNE can optimize its classifier performance to track input nonergodicities, as well as yield measures of confidence or caution for evaluation of classification results. Unlike neural networks, TNE does not have a hidden intermediate data structure (e.g., the neural net weight matrix). Instead, TNE generates and exploits a user-accessible data structure called the agreement map (AM), which can be manipulated by Boolean logic operations to effect accurate classifier refinement algorithms. The open architecture and programmability of TNE's agreement map processing allow a TNE programmer or user to determine classification accuracy, as well as characterize in detail the signatures for which TNE did not obtain classification matches, and why such mismatches occurred. In this study, we will compare TNE and MNN based endmember classification, using performance metrics such as probability of correct classification (Pd) and rate of false
Chavan, Swapnil; Friedman, Ran; Nicholls, Ian A
2015-05-21
A k-nearest neighbor (k-NN) classification model was constructed for 118 RDT NEDO (Repeated Dose Toxicity New Energy and industrial technology Development Organization; currently known as the Hazard Evaluation Support System (HESS)) database chemicals, employing two acute toxicity (LD50)-based classes as a response and using a series of eight PaDEL software-derived fingerprints as predictor variables. A model developed using Estate type fingerprints correctly predicted the LD50 classes for 70 of 94 training set chemicals and 19 of 24 test set chemicals. An individual category was formed for each of the chemicals by extracting its corresponding k-analogs that were identified by k-NN classification. These categories were used to perform the read-across study for prediction of the chronic toxicity, i.e., Lowest Observed Effect Levels (LOEL). We have successfully predicted the LOELs of 54 of 70 training set chemicals (77%) and 14 of 19 test set chemicals (74%) to within an order of magnitude from their experimental LOEL values. Given the success thus far, we conclude that if the k-NN model predicts LD50 classes correctly for a certain chemical, then the k-analogs of such a chemical can be successfully used for data gap filling for the LOEL. This model should support the in silico prediction of repeated dose toxicity.
Kumar, Mukesh; Rath, Nitish Kumar; Rath, Santanu Kumar
2016-04-01
Microarray-based gene expression profiling has emerged as an efficient technique for classification, prognosis, diagnosis, and treatment of cancer. Frequent changes in the behavior of this disease generates an enormous volume of data. Microarray data satisfies both the veracity and velocity properties of big data, as it keeps changing with time. Therefore, the analysis of microarray datasets in a small amount of time is essential. They often contain a large amount of expression, but only a fraction of it comprises genes that are significantly expressed. The precise identification of genes of interest that are responsible for causing cancer are imperative in microarray data analysis. Most existing schemes employ a two-phase process such as feature selection/extraction followed by classification. In this paper, various statistical methods (tests) based on MapReduce are proposed for selecting relevant features. After feature selection, a MapReduce-based K-nearest neighbor (mrKNN) classifier is also employed to classify microarray data. These algorithms are successfully implemented in a Hadoop framework. A comparative analysis is done on these MapReduce-based models using microarray datasets of various dimensions. From the obtained results, it is observed that these models consume much less execution time than conventional models in processing big data. Copyright © 2016 Elsevier Inc. All rights reserved.
Directory of Open Access Journals (Sweden)
E. E. Miandoab
2016-06-01
The inherent uncertainty of factors such as technology and creativity in evolving software development is a major challenge for the management of software projects. To address these challenges the project manager, in addition to examining the project progress, may cope with problems such as increased operating costs, lack of resources, and lack of implementation of key activities to better plan the project. Software Cost Estimation (SCE) models do not fully cover new approaches, and this lack of coverage is causing problems at both the consumer and producer ends. In order to avoid these problems, many methods have already been proposed. Model-based methods are the most familiar solving technique. But it should be noted that model-based methods use a single formula and constant values, and these methods are not responsive to the increasing developments in the field of software engineering. Accordingly, researchers have tried to solve the problem of SCE using machine learning algorithms, data mining algorithms, and artificial neural networks. In this paper, a hybrid algorithm that combines COA-Cuckoo optimization and K-Nearest Neighbors (KNN) algorithms is used. The so-called composition algorithm runs on six different data sets and is evaluated based on eight evaluation criteria. The results show an improved accuracy of estimated cost.
Pires, A. S. T.
2017-01-01
I present in detail the SU(N) Schwinger boson formalism, also known as flavor wave theory, that has been used several times in the literature. I use the method to study the ferroquadrupolar phase of a quantum biquadratic Heisenberg model with spin S=1 on the triangular lattice with third-nearest-neighbor interactions. Results for the phase diagram at zero temperature and the static and dynamical quadrupolar structure factors are presented. In principle, the results could be applied to NiGa2S4.
Alex, K.; Mclellan, R. B.
1971-01-01
A previous calculation of the thermodynamic properties of interstitial solid solutions based on the technique of Kirkwood expansions has been extended to include the effects of second nearest neighbor solute atom mutual interactions. The error inherent in the first order (or quasi-chemical) counting of the degeneracy of the solution crystal is avoided. It is shown that, at high temperatures, even strong second nearest neighbor solute mutual interactions have a negligible effect on the entropy of the solution and a small, temperature-dependent effect on the solute partial enthalpy.
Kamath, Sudha D; Mahato, K K
2007-01-01
The spectral analysis and classification for discrimination of pulsed laser-induced autofluorescence spectra of pathologically certified normal, premalignant, and malignant oral tissues recorded at a 325-nm excitation are carried out using MATLAB@R6-based principal component analysis (PCA) and k-means nearest neighbor (k-NN) analysis separately on the same set of spectral data. Six features such as mean, median, maximum intensity, energy, spectral residuals, and standard deviation are extracted from each spectrum of the 60 training samples (spectra) belonging to the normal, premalignant, and malignant groups and they are used to perform PCA on the reference database. Standard calibration models of normal, premalignant, and malignant samples are made using cluster analysis. We show that a feature vector of length 6 could be reduced to three components using the PCA technique. After performing PCA on the feature space, the first three principal component (PC) scores, which contain all the diagnostic information, are retained and the remaining scores containing only noise are discarded. The new feature space is thus constructed using three PC scores only and is used as input database for the k-NN classification. Using this transformed feature space, the centroids for normal, premalignant, and malignant samples are computed and the efficient classification for different classes of oral samples is achieved. A performance evaluation of k-NN classification results is made by calculating the statistical parameters specificity, sensitivity, and accuracy and they are found to be 100, 94.5, and 96.17%, respectively.
Tissue classification of large-scale multi-site MR data using fuzzy k-nearest neighbor method
Ghayoor, Ali; Paulsen, Jane S.; Kim, Regina E. Y.; Johnson, Hans J.
2016-03-01
This paper describes enhancements to automate classification of brain tissues for multi-site degenerative magnetic resonance imaging (MRI) data analysis. Processing of large collections of MR images is a key research technique to advance our understanding of the human brain. Previous studies have developed a robust multi-modal tool for automated tissue classification of large-scale data based on expectation maximization (EM) method initialized by group-wise prior probability distributions. This work aims to augment the EM-based classification using a non-parametric fuzzy k-Nearest Neighbor (k-NN) classifier that can model the unique anatomical states of each subject in the study of degenerative diseases. The presented method is applicable to multi-center heterogeneous data analysis and is quantitatively validated on a set of 18 synthetic multi-modal MR datasets having six different levels of noise and three degrees of bias-field provided with known ground truth. Dice index and average Hausdorff distance are used to compare the accuracy and robustness of the proposed method to a state-of-the-art classification method implemented based on EM algorithm. Both evaluation measurements show that presented enhancements produce superior results as compared to the EM only classification.
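The fuzzy k-NN idea mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the classic inverse-distance fuzzy membership rule with fuzzifier m, which may differ from the variant the authors actually use; the function name, the one-dimensional toy intensities, and the tissue labels are all hypothetical.

```python
import math
from collections import defaultdict

def fuzzy_knn_memberships(train, query, k=3, m=2.0):
    """Return class memberships in [0, 1] that sum to 1 (illustrative sketch).

    train: list of (feature_tuple, label) pairs with crisp labels.
    m:     fuzzifier (> 1); m = 2 gives inverse squared-distance weights.
    """
    # Rank training samples by Euclidean distance to the query.
    ranked = sorted(
        ((math.dist(features, query), label) for features, label in train),
        key=lambda pair: pair[0],
    )
    exponent = 2.0 / (m - 1.0)
    scores = defaultdict(float)
    for distance, label in ranked[:k]:
        # Guard against division by zero when the query equals a sample.
        scores[label] += 1.0 / max(distance, 1e-12) ** exponent
    total = sum(scores.values())
    return {label: score / total for label, score in scores.items()}

# Toy example with hypothetical one-dimensional intensities.
toy_train = [((0.0,), "csf"), ((1.0,), "gm"), ((3.0,), "wm")]
toy_memberships = fuzzy_knn_memberships(toy_train, (0.25,), k=2)
```

Unlike a majority vote, the result is a soft assignment, which is what allows such a classifier to feed partial tissue memberships into a later processing stage.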
Directory of Open Access Journals (Sweden)
ShaoPeng Wang
2016-01-01
The development of biochemistry and molecular biology has revealed an increasingly important role of compounds in several biological processes. Like the aptamer-protein interaction, aptamer-compound interaction attracts increasing attention. However, it is time-consuming to select proper aptamers against compounds using traditional methods, such as exponential enrichment. Thus, there is an urgent need to design effective computational methods for searching effective aptamers against compounds. This study attempted to extract important features for aptamer-compound interactions using feature selection methods, such as Maximum Relevance Minimum Redundancy, as well as incremental feature selection. Each aptamer-compound pair was represented by properties derived from the aptamer and compound, including frequencies of single nucleotides and dinucleotides for the aptamer, as well as the constitutional, electrostatic, quantum-chemical, and space conformational descriptors of the compounds. As a result, some important features were obtained. To confirm the importance of the obtained features, we further discussed the associations between them and aptamer-compound interactions. Simultaneously, an optimal prediction model based on the nearest neighbor algorithm was built to identify aptamer-compound interactions, which has the potential to be a useful tool for the identification of novel aptamer-compound interactions. The program is available upon request.
QRS detection using K-Nearest Neighbor algorithm (KNN) and evaluation on standard ECG databases.
Saini, Indu; Singh, Dilbag; Khosla, Arun
2013-07-01
The performance of computer-aided ECG analysis depends on the precise and accurate delineation of QRS complexes. This paper presents an application of the K-Nearest Neighbor (KNN) algorithm as a classifier for detection of the QRS complex in ECG. The proposed algorithm is evaluated on two manually annotated standard databases, the CSE and MIT-BIH Arrhythmia databases. In this work, a digital band-pass filter is used to reduce false detection caused by interference present in the ECG signal, and the gradient of the signal is then used as a feature for QRS detection. In addition, the accuracy of a KNN-based classifier is largely dependent on the value of K and the type of distance metric. The value K = 3 and the Euclidean distance metric have been proposed for the KNN classifier, using fivefold cross-validation. Detection rates of 99.89% and 99.81% are achieved for the CSE and MIT-BIH databases, respectively. The QRS detector obtained a sensitivity Se = 99.86% and specificity Sp = 99.86% for the CSE database, and Se = 99.81% and Sp = 99.86% for the MIT-BIH Arrhythmia database. A comparison is also made between the proposed algorithm and other published work using the CSE and MIT-BIH Arrhythmia databases. These results clearly establish the KNN algorithm for reliable and accurate QRS detection.
A New Nearest Neighbor Classification Algorithm Based on Local Probability Centers
Directory of Open Access Journals (Sweden)
I-Jing Li
2014-01-01
The nearest neighbor is one of the most popular classifiers, and it has been successfully used in pattern recognition and machine learning. One drawback of kNN is that it performs poorly when class distributions are overlapping. Recently, the local probability center (LPC) algorithm was proposed to solve this problem; its main idea is giving weight to samples according to their posterior probability. However, LPC performs poorly when the value of k is very small and when higher-dimensional datasets are used. To deal with this problem, this paper suggests that the gradient of the posterior probability function can be estimated under sufficient assumption. The theoretic property is beneficial to faithfully calculate the inner product of two vectors. To increase the performance in high-dimensional datasets, the multidimensional Parzen window and Euler-Richardson method are utilized, and a new classifier based on local probability centers is developed in this paper. Experimental results show that the proposed method yields stable performance with a wide range of k for usage, robust performance to overlapping issue, and good performance to dimensionality. The proposed theorem can be applied to mathematical problems and other applications. Furthermore, the proposed method is an attractive classifier because of its simplicity.
The distance function effect on k-nearest neighbor classification for medical datasets.
Hu, Li-Yu; Huang, Min-Wei; Ke, Shih-Wen; Tsai, Chih-Fong
2016-01-01
K-nearest neighbor (k-NN) classification is a conventional non-parametric classifier, which has been used as the baseline classifier in many pattern classification problems. It is based on measuring the distances between the test data and each of the training data to decide the final classification output. Since the Euclidean distance function is the most widely used distance metric in k-NN, no study has examined the classification performance of k-NN with different distance functions, especially for various medical domain problems. Therefore, the aim of this paper is to investigate whether the distance function can affect the k-NN performance over different medical datasets. Our experiments are based on three different types of medical datasets containing categorical, numerical, and mixed types of data, and four different distance functions (Euclidean, cosine, Chi square, and Minkowski) are used individually during k-NN classification. The experimental results show that using the Chi square distance function is the best choice for the three different types of datasets. However, the cosine and Euclidean (and Minkowski) distance functions perform the worst over the mixed type of datasets. In this paper, we demonstrate that the chosen distance function can affect the classification accuracy of the k-NN classifier. For the medical domain datasets including the categorical, numerical, and mixed types of data, k-NN based on the Chi square distance function performs the best.
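To make the role of the distance function concrete, here is a minimal sketch of a k-NN classifier whose metric is a plug-in argument. The four functions use common textbook definitions, which are an assumption here and may differ in detail from those in the study; the Minkowski order p = 3, the helper names, and the toy data are hypothetical.

```python
import math
from collections import Counter

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def minkowski(a, b, p=3):
    # p = 3 is an arbitrary illustrative order; p = 2 reduces to Euclidean.
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)

def cosine_distance(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm if norm else 1.0

def chi_square(a, b):
    # Assumes non-negative feature values; terms with x + y == 0 are skipped.
    return sum((x - y) ** 2 / (x + y) for x, y in zip(a, b) if x + y > 0)

def knn_predict(train, query, k, distance):
    """Majority vote among the k training samples closest to the query."""
    neighbors = sorted(train, key=lambda item: distance(item[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

toy_train = [((1.0, 1.0), "a"), ((1.2, 0.8), "a"),
             ((4.0, 4.2), "b"), ((3.8, 4.0), "b")]
toy_predictions = {
    d.__name__: knn_predict(toy_train, (1.1, 0.9), 3, d)
    for d in (euclidean, minkowski, cosine_distance, chi_square)
}
```

On this toy data all four metrics agree; differences of the kind reported in the abstract would only show up on real datasets with mixed or categorical features.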
Jiang, Yuning; Kang, Jinfeng; Wang, Xinan
2017-03-01
Resistive switching memory (RRAM) is considered as one of the most promising devices for parallel computing solutions that may overcome the von Neumann bottleneck of today’s electronic systems. However, the existing RRAM-based parallel computing architectures suffer from practical problems such as device variations and extra computing circuits. In this work, we propose a novel parallel computing architecture for pattern recognition by implementing k-nearest neighbor classification on metal-oxide RRAM crossbar arrays. Metal-oxide RRAM with gradual RESET behaviors is chosen as both the storage and computing components. The proposed architecture is tested by the MNIST database. High speed (~100 ns per example) and high recognition accuracy (97.05%) are obtained. The influence of several non-ideal device properties is also discussed, and it turns out that the proposed architecture shows great tolerance to device variations. This work paves a new way to achieve RRAM-based parallel computing hardware systems with high performance.
A Proposal for Local $k$ Values for $k$-Nearest Neighbor Rule.
Garcia-Pedrajas, Nicolas; Romero Del Castillo, Juan A; Cerruela-Garcia, Gonzalo
2017-02-01
The k-nearest neighbor (k-NN) classifier is one of the most widely used methods of classification due to several interesting features, including good generalization and easy implementation. Although simple, it is usually able to match and even outperform more sophisticated and complex methods. One of the problems with this approach is fixing the appropriate value of k. Although a good value might be obtained using cross validation, it is unlikely that the same value could be optimal for the whole space spanned by the training set. It is evident that different regions of the feature space would require different values of k due to the different distributions of prototypes. The situation of a query instance in the center of a class is very different from the situation of a query instance near the boundary between two classes. In this brief, we present a simple yet powerful approach to setting a local value of k. We associate a potentially different k to every prototype and obtain the best value of k by optimizing a criterion consisting of the local and global effects of the different k values in the neighborhood of the prototype. The proposed method has a fast training stage and the same complexity as the standard k-NN approach at the testing stage. The experiments show that this simple approach can significantly outperform the standard k-NN rule for both standard and class-imbalanced problems in a large set of different problems.
Directory of Open Access Journals (Sweden)
Hyung-Ju Cho
2012-01-01
Given two positive parameters k and r, a constrained k-nearest neighbor (CkNN) query returns the k closest objects within a network distance r of the query location in road networks. In terms of the scalability of monitoring these CkNN queries, existing solutions based on central processing at a server suffer from a sudden and sharp rise in server load as well as messaging cost as the number of queries increases. In this paper, we propose a distributed and scalable scheme called DAEMON for the continuous monitoring of CkNN queries in road networks. Our query processing is distributed among clients (query objects) and server. Specifically, the server evaluates CkNN queries issued at intersections of road segments, retrieves the objects on the road segments between neighboring intersections, and sends responses to the query objects. Finally, each client makes its own query result using this server response. As a result, our distributed scheme achieves close-to-optimal communication costs and scales well to large numbers of monitoring queries. Exhaustive experimental results demonstrate that our scheme substantially outperforms its competitor in terms of query processing time and messaging cost.
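The CkNN query definition above (the k closest objects within network distance r) can be sketched as a snapshot query on a small road graph. This is only an illustration of the query semantics, not the DAEMON scheme: it assumes objects sit exactly on intersections, uses a plain Dijkstra expansion bounded by r, and every name and the toy network are hypothetical.

```python
import heapq

def cknn(graph, source, objects, k, r):
    """Return up to k (object_id, network_distance) pairs within distance r.

    graph:   dict mapping node -> list of (neighbor, edge_length >= 0).
    objects: dict mapping object_id -> node where the object is located
             (a simplification; real objects lie along road segments).
    """
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry
        for nxt, length in graph.get(node, []):
            nd = d + length
            # Never expand beyond r: such nodes cannot satisfy the query.
            if nd <= r and nd < dist.get(nxt, float("inf")):
                dist[nxt] = nd
                heapq.heappush(heap, (nd, nxt))
    hits = sorted((dist[n], oid) for oid, n in objects.items() if n in dist)
    return [(oid, d) for d, oid in hits[:k]]

toy_graph = {
    "q": [("a", 2.0), ("b", 5.0)],
    "a": [("q", 2.0), ("c", 2.0)],
    "b": [("q", 5.0)],
    "c": [("a", 2.0), ("d", 4.0)],
    "d": [("c", 4.0)],
}
toy_objects = {"taxi1": "a", "taxi2": "b", "taxi3": "c", "taxi4": "d"}
toy_result = cknn(toy_graph, "q", toy_objects, k=2, r=6.0)
```

Here `taxi4` is 8.0 away and is excluded by r even when k is raised, which is the difference between a constrained and an unconstrained k-NN query.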
Van de Wiele, Ben; Fin, Samuele; Pancaldi, Matteo; Vavassori, Paolo; Sarella, Anandakumar; Bisero, Diego
2016-05-01
Various proposals for future magnetic memories, data processing devices, and sensors rely on a precise control of the magnetization ground state and magnetization reversal process in periodically patterned media. In finite dot arrays, such control is hampered by the magnetostatic interactions between the nanomagnets, leading to the non-uniform magnetization state distributions throughout the sample while reversing. In this paper, we evidence how during reversal typical geometric arrangements of dots in an identical magnetization state appear that originate in the dominance of either Global Configurational Anisotropy or Nearest-Neighbor Magnetostatic interactions, which depends on the fields at which the magnetization reversal sets in. Based on our findings, we propose design rules to obtain the uniform magnetization state distributions throughout the array, and also suggest future research directions to achieve non-uniform state distributions of interest, e.g., when aiming at guiding spin wave edge-modes through dot arrays. Our insights are based on the Magneto-Optical Kerr Effect and Magnetic Force Microscopy measurements as well as the extensive micromagnetic simulations.
Efficient k-Nearest-Neighbor Search Algorithms for Historical Moving Object Trajectories
Institute of Scientific and Technical Information of China (English)
Yun-Jun Gao; Chun Li; Gen-Cai Chen; Ling Chen; Xian-Ta Jiang; Chun Chen
2007-01-01
k Nearest Neighbor (kNN) search is one of the most important operations in spatial and spatio-temporal databases. Although it has received considerable attention in the database literature, there is little prior work on kNN retrieval for moving object trajectories. Motivated by this observation, this paper studies the problem of efficiently processing kNN (k≥1) search on R-tree-like structures storing historical information about moving object trajectories. Two algorithms are developed based on best-first traversal paradigm, called BFPkNN and BFTkNN, which handle the kNN retrieval with respect to the static query point and the moving query trajectory, respectively. Both algorithms minimize the number of node access, that is, they perform a single access only to those qualifying nodes that may contain the final result. Aiming at saving main-memory consumption and reducing CPU cost further, several effective pruning heuristics are also presented. Extensive experiments with synthetic and real datasets confirm that the proposed algorithms in this paper outperform their competitors significantly in both efficiency and scalability.
An RFID Indoor Positioning Algorithm Based on Bayesian Probability and K-Nearest Neighbor
Xu, He; Ding, Ye; Li, Peng; Wang, Ruchuan; Li, Yizhu
2017-01-01
The Global Positioning System (GPS) is widely used in outdoor environmental positioning. However, GPS cannot support indoor positioning because there is no signal for positioning in an indoor environment. Nowadays, there are many situations which require indoor positioning, such as searching for a book in a library, looking for luggage in an airport, emergency navigation for fire alarms, robot location, etc. Many technologies, such as ultrasonic, sensors, Bluetooth, WiFi, magnetic field, Radio Frequency Identification (RFID), etc., are used to perform indoor positioning. Compared with other technologies, RFID used in indoor positioning is more cost and energy efficient. The traditional RFID indoor positioning algorithm LANDMARC utilizes a Received Signal Strength (RSS) indicator to track objects. However, the RSS value is easily affected by environmental noise and other interference. In this paper, our purpose is to reduce the location fluctuation and error caused by multipath and environmental interference in LANDMARC. We propose a novel indoor positioning algorithm based on Bayesian probability and K-Nearest Neighbor (BKNN). The experimental results show that the Gaussian filter can filter some abnormal RSS values. The proposed BKNN algorithm has the smallest location error compared with the Gaussian-based algorithm, LANDMARC and an improved KNN algorithm. The average error in location estimation is about 15 cm using our method. PMID: 28783073
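For orientation, the LANDMARC baseline named in the abstract is usually described as a weighted k-nearest-neighbor estimate over reference tags in RSS space. The sketch below follows that common description (inverse squared RSS distance as weight) as an assumption; it is not the BKNN algorithm, it contains no Bayesian or Gaussian filtering step, and the function name and toy readings are hypothetical.

```python
import math

def landmarc_estimate(reference_tags, target_rss, k=3):
    """Estimate (x, y) from the k reference tags closest in RSS space.

    reference_tags: list of ((x, y), rss_vector), one RSS value per reader.
    target_rss:     RSS vector of the tracked tag, same reader order.
    """
    scored = []
    for position, rss in reference_tags:
        # Euclidean distance between RSS vectors, not between locations.
        e = math.sqrt(sum((t - s) ** 2 for t, s in zip(target_rss, rss)))
        if e == 0.0:
            return position  # identical readings: reuse the tag position
        scored.append((e, position))
    scored.sort(key=lambda item: item[0])
    nearest = scored[:k]
    weights = [1.0 / (e * e) for e, _ in nearest]
    total = sum(weights)
    x = sum(w * pos[0] for w, (_, pos) in zip(weights, nearest)) / total
    y = sum(w * pos[1] for w, (_, pos) in zip(weights, nearest)) / total
    return (x, y)

# Toy readings from a single hypothetical reader.
toy_tags = [((0.0, 0.0), (-50.0,)), ((2.0, 0.0), (-54.0,))]
toy_position = landmarc_estimate(toy_tags, (-52.0,), k=2)
```

Because the weights depend directly on raw RSS differences, a single noisy reading shifts the estimate, which is the sensitivity to interference that the abstract sets out to reduce.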
SLIM-Decomposition: Nearest-Neighbor Interaction Systems in the Tensor Train Format
Gelß, Patrick; Matera, Sebastian; Schütte, Christof
2016-01-01
Low-rank tensor approximation approaches have become an important tool in the scientific computing community. The aim is to enable the simulation and analysis of high-dimensional problems which cannot be solved using conventional methods anymore due to the so-called curse of dimensionality. This requires techniques to handle linear operators defined on extremely large state spaces and to solve the resulting systems of linear equations or eigenvalue problems. In this paper, we present a systematic tensor train decomposition for nearest neighbor interaction systems which is applicable to a host of different problems. With the aid of this decomposition, it is possible to reduce the memory consumption as well as the computational costs significantly. Furthermore, it can be shown that in some cases the rank of the tensor decomposition does not depend on the network size. The format is thus feasible even for high-dimensional systems. We will illustrate the results with several guiding examples such as the Ising mod...
Institute of Scientific and Technical Information of China (English)
Yi Zhuang; Yue-Ting Zhuang; Fei Wu
2007-01-01
Due to the famous dimensionality curse problem, search in a high-dimensional space is considered a "hard" problem. In this paper, a novel composite distance transformation method, which is called CDT, is proposed to support a fast k-nearest-neighbor (k-NN) search in high-dimensional spaces. In CDT, all (n) data points are first grouped into some clusters by a k-Means clustering algorithm. Then a composite distance key of each data point is computed. Finally, these index keys of such n data points are inserted by a partition-based B+-tree. Thus, given a query point, its k-NN search in high-dimensional spaces is transformed into the search in the single dimensional space with the aid of CDT index. Extensive performance studies are conducted to evaluate the effectiveness and efficiency of the proposed scheme. Our results show that this method outperforms the state-of-the-art high-dimensional search techniques, such as the X-Tree, VA-file, iDistance and NB-Tree.
Institute of Scientific and Technical Information of China (English)
Jian-Hua Xu
2005-01-01
G-protein coupled receptors (GPCRs) are a class of seven-helix transmembrane proteins that have been used in bioinformatics as the targets to facilitate drug discovery for human diseases. Although thousands of GPCR sequences have been collected, the ligand specificity of many GPCRs is still unknown and only one crystal structure of the rhodopsin-like family has been solved. Therefore, identifying GPCR types only from sequence data has become an important research issue. In this study, a novel technique for identifying GPCR types based on the weighted Levenshtein distance between two receptor sequences and the nearest neighbor method (NNM) is introduced, which can deal with receptor sequences with different lengths directly. In our experiments for classifying four classes (acetylcholine, adrenoceptor, dopamine, and serotonin) of the rhodopsin-like family of GPCRs, the error rates from the leave-one-out procedure and the leave-half-out procedure were 0.62% and 1.24%, respectively. These results are superior to those of the covariant discriminant algorithm, the support vector machine method, and the NNM with Euclidean distance.
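The combination described above, a weighted Levenshtein distance feeding a nearest neighbor rule, can be sketched as follows. The substitution weights, the residue groups, and the toy sequences are hypothetical placeholders; the abstract does not state the weighting the author used.

```python
def weighted_levenshtein(a, b, sub_cost, indel_cost=1.0):
    """Edit distance with a caller-supplied substitution cost function."""
    prev = [j * indel_cost for j in range(len(b) + 1)]
    for i, ca in enumerate(a, start=1):
        cur = [i * indel_cost]
        for j, cb in enumerate(b, start=1):
            cur.append(min(
                prev[j] + indel_cost,           # delete ca
                cur[j - 1] + indel_cost,        # insert cb
                prev[j - 1] + sub_cost(ca, cb), # substitute or match
            ))
        prev = cur
    return prev[-1]

# Hypothetical groups of residues treated as "similar" for illustration.
SIMILAR_GROUPS = [set("ILVM"), set("KRH"), set("DE")]

def toy_sub_cost(x, y):
    if x == y:
        return 0.0
    similar = any(x in g and y in g for g in SIMILAR_GROUPS)
    return 0.5 if similar else 1.0

def nearest_neighbor_label(train, query, sub_cost=toy_sub_cost):
    """1-NN: return the label of the training sequence closest to query."""
    best = min(train, key=lambda item: weighted_levenshtein(item[0], query, sub_cost))
    return best[1]

toy_train = [("MILKDE", "dopamine"), ("GGSSTT", "serotonin")]
toy_label = nearest_neighbor_label(toy_train, "MVLRDE")
```

Because the distance is defined on strings of any length, no fixed-length feature vector is needed, which matches the abstract's point about handling receptor sequences of different lengths directly.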
Directory of Open Access Journals (Sweden)
U Ravi Babu
2014-02-01
This paper presents a new approach to off-line handwritten numeral recognition based on structural and statistical features. Five different types of skeleton features (horizontal and vertical crossings, end, branch, and cross points), the number of contours in the image, the width-to-height ratio, and distribution features are used for the recognition of numerals. We create two vectors: the Sample Feature Vector (SFV), which contains structural and statistical features of the MNIST sample database of handwritten numerals, and the Test Feature Vector (TFV), which contains structural and statistical features of the MNIST test database of handwritten numerals. The performance of a digit recognition system depends mainly on what kind of features are being used. The objective of this paper is to provide efficient and reliable techniques for recognition of handwritten numerals. A Euclidean minimum distance criterion is used to find minimum distances, and a k-nearest neighbor classifier is used to classify the numerals. The MNIST database is used for both training and testing the system. A total of 5000 numeral images are tested, and the overall accuracy is found to be 98.42%.
Yanxia, Zhang; Nanbo, Peng; Yongheng, Zhao; Xue-bing, Wu
2013-01-01
We apply a lazy learning method, the k-nearest neighbor algorithm (kNN), to estimate the photometric redshifts of quasars, based on various datasets from the Sloan Digital Sky Survey (SDSS), UKIRT Infrared Deep Sky Survey (UKIDSS) and Wide-field Infrared Survey Explorer (WISE) (the SDSS sample, the SDSS-UKIDSS sample, the SDSS-WISE sample and the SDSS-UKIDSS-WISE sample). The influence of the k value and different input patterns on the performance of kNN is discussed. The k value that gives the best performance differs with the input pattern and the dataset. The best result belongs to the SDSS-UKIDSS-WISE sample. The experimental results show that, generally, the more information available from more bands, the better the performance of photometric redshift estimation with kNN. The results also demonstrate that kNN using multiband data can effectively solve the catastrophic failure of photometric redshift estimation, which is met by many machine learning methods. By comparing the performance of various m...
Angle Tree: Nearest Neighbor Search in High Dimensions with Low Intrinsic Dimensionality
Zvedeniouk, Ilia
2010-01-01
We propose an extension of tree-based space-partitioning indexing structures for data with low intrinsic dimensionality embedded in a high dimensional space. We call this extension an Angle Tree. Our extension can be applied to both classical kd-trees as well as the more recent rp-trees. The key idea of our approach is to store the angle (the "dihedral angle") between the data region (which is a low dimensional manifold) and the random hyperplane that splits the region (the "splitter"). We show that the dihedral angle can be used to obtain a tight lower bound on the distance between the query point and any point on the opposite side of the splitter. This in turn can be used to efficiently prune the search space. We introduce a novel randomized strategy to efficiently calculate the dihedral angle with a high degree of accuracy. Experiments and analysis on real and synthetic data sets show that the Angle Tree is the most efficient known indexing structure for nearest neighbor queries in terms of preprocessing ...
Sarwono, A. A.; Ai, T. J.; Wigati, S. S.
2017-01-01
Vehicle Routing Problem (VRP) is a method for determining the optimal route of vehicles in order to serve customers starting from depot. A combination of the two most important problems in distribution logistics, which is called the two dimensional loading vehicle routing problem, is considered in this paper. This problem combines the loading of the freight into the vehicles and the successive routing of the vehicles along the route. Moreover, an additional feature of last-in-first-out loading sequences is also considered. In the sequential two dimensional loading capacitated vehicle routing problem (sequential 2L-CVRP), the loading must be compatible with the trip sequence: when the vehicle arrives at a customer i, there must be no obstacle (items for other customers) between the items of i and the loading door (rear part) of the vehicle. In other words, it is not necessary to move items of other customers while unloading the items of i. In accordance with the aforementioned conditions, a program to solve sequential 2L-CVRP is required. A nearest neighbor algorithm for solving the routing problem is presented, in which the loading component of the problem is solved through a collection of 5 packing heuristics.
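The routing half of the approach above rests on the nearest neighbor construction heuristic, which can be sketched in a few lines. This sketch assumes a single vehicle, Euclidean travel distances, and no capacity or two-dimensional loading checks, so it covers only the route-building step; the function names and coordinates are hypothetical.

```python
import math

def nearest_neighbor_route(depot, customers):
    """Greedy tour: always drive to the closest unvisited customer."""
    route = [depot]
    remaining = list(customers)
    current = depot
    while remaining:
        nxt = min(remaining, key=lambda c: math.dist(current, c))
        remaining.remove(nxt)
        route.append(nxt)
        current = nxt
    route.append(depot)  # return to the depot at the end of the trip
    return route

def route_length(route):
    return sum(math.dist(a, b) for a, b in zip(route, route[1:]))

toy_route = nearest_neighbor_route((0.0, 0.0), [(5.0, 0.0), (1.0, 0.0), (2.0, 0.0)])
toy_length = route_length(toy_route)
```

In a sequential 2L-CVRP setting, each candidate customer would additionally have to pass a packing feasibility check before being appended, which is where the packing heuristics mentioned in the abstract would come in.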
Generative local metric learning in k nearest neighbors
Institute of Scientific and Technical Information of China (English)
赵传钢
2011-01-01
Previous work on metric learning for k-nearest neighbor (kNN) classification has focused on purely discriminative approaches. An approach is proposed that learns a metric by analyzing the probability distribution of a sample's k nearest neighbors, provided that the underlying generative model is known. Experiments show that the local metric learned under a simple class-conditional generative model can improve the performance of the discriminative kNN approach.
Noolvi, Malleshappa N; Patel, Harun M
2010-06-01
Epidermal growth factor receptor (EGFR) protein tyrosine kinases (PTKs) are known for their role in cancer. Quinazolines have been reported to be molecules of interest with potent anticancer activity; they act by binding to the ATP site of protein kinases. The ATP binding site of protein kinases provides an extensive opportunity to design newer analogs. With this background, we report an attempt to discern the structural and physicochemical requirements for inhibition of EGFR tyrosine kinase. The k-Nearest Neighbor Molecular Field Analysis (kNN-MFA), a three-dimensional quantitative structure-activity relationship (3D-QSAR) method, has been used in the present case to study the correlation between the molecular properties and the EGFR tyrosine kinase inhibitory activities of a series of quinazoline derivatives. kNN-MFA calculations for both electrostatic and steric fields were carried out. The master grid maps derived from the best model have been used to display the contribution of the electrostatic potential and steric field. The statistical results showed a significant correlation coefficient r² (q²) of 0.846, an r² for the external test set (pred_r²) of 0.8029, a coefficient of correlation of the predicted data set (pred_r²se) of 0.6658, 89 degrees of freedom, and 2 nearest neighbors. Therefore, this study not only casts light on the binding mechanism between EGFR and its inhibitors, but also provides hints for the design of new EGFR inhibitors with observable structural diversity.
Using K-Nearest Neighbor Classification to Diagnose Abnormal Lung Sounds.
Chen, Chin-Hsing; Huang, Wen-Tzeng; Tan, Tan-Hsu; Chang, Cheng-Chun; Chang, Yuan-Jen
2015-06-04
A reported 30% of people worldwide have abnormal lung sounds, including crackles, rhonchi, and wheezes. To date, the traditional stethoscope remains the most popular tool used by physicians to diagnose such abnormal lung sounds; however, many problems arise with the use of a stethoscope, including the effects of environmental noise, the inability to record and store lung sounds for follow-up or tracking, and the physician's subjective diagnostic experience. This study has developed a digital stethoscope to help physicians overcome these problems when diagnosing abnormal lung sounds. In this digital system, mel-frequency cepstral coefficients (MFCCs) were used to extract the features of lung sounds, and then the K-means algorithm was used for feature clustering to reduce the amount of data for computation. Finally, the K-nearest neighbor method was used to classify the lung sounds. The proposed system can also be used for home care: if the percentage of abnormal lung sound frames is > 30% of the whole test signal, the system can automatically warn the user to visit a physician for diagnosis. We also used bend sensors together with an amplification circuit, Bluetooth, and a microcontroller to implement a respiration detector. The respiratory signal extracted by the bend sensors can be transmitted to the computer via Bluetooth to calculate the respiratory cycle for real-time assessment. If an abnormal status is detected, the device will warn the user automatically. Experimental results indicated that the error in respiratory cycles between measured and actual values was only 6.8%, illustrating the potential of our detector for home care applications.
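The last two stages of the pipeline above (K-nearest neighbor classification of frames, followed by the 30% warning rule) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the feature vectors below are made-up two-dimensional stand-ins for MFCC-derived features, the labels "normal" and "abnormal" and the function names `knn_predict` and `should_warn` are hypothetical, and the MFCC extraction and K-means reduction steps are omitted.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Return the majority label among the k training items closest to `query`.

    train -- list of (feature_vector, label) pairs
    """
    neighbors = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


def should_warn(frame_labels, threshold=0.30):
    """Warn when the fraction of abnormal frames is strictly above the threshold."""
    abnormal = sum(1 for label in frame_labels if label == "abnormal")
    return abnormal / len(frame_labels) > threshold


# Toy training set: two well-separated clusters (illustrative values only).
train = [
    ((0.0, 0.0), "normal"), ((0.0, 1.0), "normal"), ((1.0, 0.0), "normal"),
    ((5.0, 5.0), "abnormal"), ((5.0, 6.0), "abnormal"), ((6.0, 5.0), "abnormal"),
]
frames = [(0.2, 0.2), (5.5, 5.5), (0.5, 0.1), (5.2, 5.9), (0.1, 0.9)]
labels = [knn_predict(train, f) for f in frames]
print(labels, should_warn(labels))
```

With an odd k and two classes, the majority vote cannot tie; with more classes, a tie-breaking rule would need to be chosen.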
Detection and localization of myocardial infarction using K-nearest neighbor classifier.
Arif, Muhammad; Malagore, Ijaz A; Afsar, Fayyaz A
2012-02-01
This paper presents automatic detection and localization of myocardial infarction (MI) using a K-nearest neighbor (KNN) classifier. Time domain features of each beat in the ECG signal, such as T wave amplitude, Q wave and ST level deviation, which are indicative of MI, are extracted from the 12-lead ECG. Detection of MI aims to distinguish normal subjects without myocardial infarction from subjects suffering from myocardial infarction. For further investigation, localization of MI is done to specify the region of infarction of the heart. A total of 20,160 ECG beats from the PTB database available on PhysioBank are used to investigate the performance of the extracted features with the KNN classifier. In the case of MI detection, the sensitivity and specificity of KNN are found to be 99.9% using half of the randomly selected beats as the training set and the rest of the beats for testing. Moreover, the Arif-Fayyaz pruning algorithm is used to prune the data, which reduces the storage requirement and computational cost of the search. After pruning, sensitivity and specificity drop to 97% and 99.6% respectively, but the training set is reduced by 93%. Myocardial infarction beats are divided into ten classes based on the location of the infarction, along with one class of normal subjects. Sensitivity and specificity of above 90% are achieved for all eleven classes, with an overall classification accuracy of 98.8%. Some of the ECG beats are misclassified, but interestingly these are misclassified into classes whose location of infarction is near that of the true class. Pruning is also done on the training set for the eleven classes; the training set is reduced by 70% and an overall classification accuracy of 98.3% is achieved. Owing to its simplicity and high accuracy on the PTB database, the proposed method can be very helpful in the correct diagnosis of MI in a practical scenario.
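For readers less familiar with the two metrics reported above, the following Python sketch shows how sensitivity and specificity are computed from true and predicted labels. The label strings and the function name `sensitivity_specificity` are assumptions for illustration, and the toy data are invented; they are not taken from the PTB database.

```python
def sensitivity_specificity(y_true, y_pred, positive="MI"):
    """Return (sensitivity, specificity) for a binary labeling.

    sensitivity = TP / (TP + FN), specificity = TN / (TN + FP).
    Assumes both classes occur at least once in y_true.
    """
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1  # infarction beat correctly detected
            else:
                fn += 1  # infarction beat missed
        else:
            if pred == positive:
                fp += 1  # normal beat flagged as infarction
            else:
                tn += 1  # normal beat correctly passed
    return tp / (tp + fn), tn / (tn + fp)


y_true = ["MI", "MI", "MI", "normal", "normal"]
y_pred = ["MI", "MI", "normal", "normal", "MI"]
sens, spec = sensitivity_specificity(y_true, y_pred)
print(sens, spec)
```

For the eleven-class localization task, the same computation would be applied per class in a one-versus-rest fashion.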
Vrooman, Henri A; Cocosco, Chris A; van der Lijn, Fedde; Stokking, Rik; Ikram, M Arfan; Vernooij, Meike W; Breteler, Monique M B; Niessen, Wiro J
2007-08-01
Conventional k-Nearest-Neighbor (kNN) classification, which has been successfully applied to classify brain tissue in MR data, requires training on manually labeled subjects. This manual labeling is a laborious and time-consuming procedure. In this work, a new fully automated brain tissue classification procedure is presented, in which kNN training is automated. This is achieved by non-rigidly registering the MR data with a tissue probability atlas to automatically select training samples, followed by a post-processing step to keep the most reliable samples. The accuracy of the new method was compared to rigid registration-based training and to conventional kNN-based segmentation using training on manually labeled subjects for segmenting gray matter (GM), white matter (WM) and cerebrospinal fluid (CSF) in 12 data sets. Furthermore, for all classification methods, the performance was assessed when varying the free parameters. Finally, the robustness of the fully automated procedure was evaluated on 59 subjects. The automated training method using non-rigid registration with a tissue probability atlas was significantly more accurate than rigid registration. For both automated training using non-rigid registration and for the manually trained kNN classifier, the difference with the manual labeling by observers was not significantly larger than inter-observer variability for all tissue types. From the robustness study, it was clear that, given an appropriate brain atlas and optimal parameters, our new fully automated, non-rigid registration-based method gives accurate and robust segmentation results. A similarity index was used for comparison with manually trained kNN. The similarity indices were 0.93, 0.92 and 0.92, for CSF, GM and WM, respectively. It can be concluded that our fully automated method using non-rigid registration may replace manual segmentation, and thus that automated brain tissue segmentation without laborious manual training is feasible.
Institute of Scientific and Technical Information of China (English)
李一良; 李玉芝; 等
1998-01-01
It is well known that the pyroxene structure contains two metal sites, M1 and M2. Generally speaking, ferrous iron in each of these sites would normally be expected to give rise to a doublet. However, anomalies have been found in the relative areas of the peaks in the room temperature spectra of some clinopyroxenes (CPX) when this assignment is followed. By calculating the next-nearest-neighbor configurations of divalent cations around M1, we found that the four configurations of M1 can be divided into two groups. One group is the 3Ca configuration, which increases with the Ca content (per formula unit); the other group is made up of three non-3Ca configurations, which decrease with the Ca content. Both groups contribute to the spectral structure of M1, so in this study we fit two doublets for ferrous iron in M1. Although there have been several previous reports of Fe3+ in the tetrahedral site, it was not certain that Fe3+ occupancy of the T site is universal in CPX regardless of the Al content. We found that the Fe3+ in the T site fitted by Moessbauer spectroscopy is negatively correlated with the Si content of the T site and positively correlated with the T-site Fe3+ estimated on the assumption that Fe3+ and Al occupy the T site randomly. If this is true, it is important for the modeling of ion-exchange geobarometers and geothermometers.
Zhang, He-Hua; Yang, Liuyang; Liu, Yuchuan; Wang, Pin; Yin, Jun; Li, Yongming; Qiu, Mingguo; Zhu, Xueru; Yan, Fang
2016-11-16
The use of speech-based data in the classification of Parkinson disease (PD) has been shown in recent years to provide an effective, non-invasive mode of classification. Thus, there has been increased interest in speech pattern analysis methods applicable to Parkinsonism for building predictive tele-diagnosis and tele-monitoring models. One of the obstacles in optimizing classification is reducing noise within the collected speech samples, thus ensuring better classification accuracy and stability. While the currently used methods are effective, the ability to invoke instance selection has seldom been examined. In this study, a PD classification algorithm was proposed and examined that combines a multi-edit-nearest-neighbor (MENN) algorithm and an ensemble learning algorithm. First, the MENN algorithm is applied for selecting optimal training speech samples iteratively, thereby obtaining samples with high separability. Next, an ensemble learning algorithm, random forest (RF) or decorrelated neural network ensembles (DNNE), is used to generate trained samples from the collected training samples. Lastly, the trained ensemble learning algorithms are applied to the test samples for PD classification. The proposed method was examined using a recently deposited public dataset and compared against other currently used algorithms for validation. Experimental results showed that the proposed algorithm obtained the highest degree of improved classification accuracy (29.44%) compared with the other algorithm that was examined. Furthermore, the MENN algorithm alone was found to improve classification accuracy by as much as 45.72%. Moreover, the proposed algorithm was found to exhibit higher stability, particularly when combining the MENN and RF algorithms. This study showed that the proposed method could improve PD classification when using speech data and can be applied to future studies seeking to improve PD classification methods.
Djoufack, Z. I.; Tala-Tebue, E.; Nguenang, J. P.; Kenfack-Jiotsa, A.
2016-10-01
We report an analytical study of quantum solitons in 1D Heisenberg spin chains with Dzyaloshinsky-Moriya Interaction (DMI) and Next-Nearest-Neighbor Interactions (NNNI). By means of the time-dependent Hartree approximation and the semi-discrete multiple-scale method, the equation of motion for the single-boson wave function is reduced to the nonlinear Schrödinger equation. The study shows that, in the presence of NNNI, the frequency spectrum broadens and its periodicity changes. The antisymmetric feature of the DMI was probed from the dispersion curve by changing the sign of the parameter controlling it. Five regions were identified in the dispersion spectrum when the NNNI are taken into account, instead of three in the opposite case. In each of these regions, the quantum model can exhibit stationary, localized, and stable bright or dark quantum soliton solutions. In each region, we could set up quantum localized n-boson Hartree states as well as the analytical expression of their energy level. The accuracy of the analytical studies is confirmed by the excellent agreement with the numerical calculations, which also certifies the stability of the stationary quantum localized soliton solutions exhibited in each region. In addition, we found that the intensity of the localization of quantum localized n-boson Hartree states increases when the NNNI are considered. We also found that the intensity of Hartree n-boson states corresponding to quantum discrete soliton states depends on the wave vector.
Chikh, Mohamed Amine; Saidi, Meryem; Settouti, Nesma
2012-10-01
The use of expert systems and artificial intelligence techniques in disease diagnosis has been increasing gradually. The Artificial Immune Recognition System (AIRS) is one of the methods used in medical classification problems. AIRS2 is a more efficient version of the AIRS algorithm. In this paper, we used a modified AIRS2 called MAIRS2, in which the K-nearest neighbors algorithm is replaced with the fuzzy K-nearest neighbors algorithm to improve the diagnostic accuracy for diabetes. The diabetes dataset used in our work was retrieved from the UCI machine learning repository. The performance of AIRS2 and MAIRS2 is evaluated in terms of classification accuracy, sensitivity and specificity. The highest classification accuracies obtained with AIRS2 and MAIRS2 using 10-fold cross-validation were 82.69% and 89.10%, respectively.
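To make the difference between crisp and fuzzy K-nearest neighbors concrete, here is a minimal Python sketch of a fuzzy K-NN membership computation. The abstract does not say which fuzzy K-NN variant is used; the sketch assumes the widely used inverse-distance formulation with fuzzifier m and crisp training labels, and the function name `fuzzy_knn`, the parameter `eps`, and the toy data are hypothetical.

```python
import math


def fuzzy_knn(train, query, k=3, m=2.0, eps=1e-12):
    """Return class memberships of `query` as a dict label -> value in [0, 1].

    Each of the k nearest neighbors votes with weight 1 / d**(2 / (m - 1)),
    and the weights are normalized so that the memberships sum to 1.
    `eps` guards against division by zero when the query equals a training point.
    """
    neighbors = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    exponent = 2.0 / (m - 1.0)
    weights = [1.0 / (math.dist(x, query) ** exponent + eps) for x, _ in neighbors]
    total = sum(weights)
    memberships = {}
    for (_, label), weight in zip(neighbors, weights):
        memberships[label] = memberships.get(label, 0.0) + weight / total
    return memberships


# One-dimensional toy data: two "neg" samples and one "pos" sample.
train = [((0.0,), "neg"), ((1.0,), "neg"), ((3.0,), "pos")]
memberships = fuzzy_knn(train, (2.0,), k=3, m=2.0)
print(memberships)
```

Unlike a crisp majority vote, the result expresses how strongly the query belongs to each class, so a borderline case (here roughly 0.56 versus 0.44) is visible rather than hidden behind a single label.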
Institute of Scientific and Technical Information of China (English)
LEE Tien-hsu; WANG Jong-tzy; CHEN Jhih-bin; CHANG Pao-chi
2006-01-01
Although the H.264 video coding standard provides several error resilience tools, the damage caused by error propagation may still be tremendous. This work is aimed at developing a robust and standard-compliant error resilient coding scheme for H.264 and uses techniques of mode decision, data hiding, and error concealment to reduce the damage from error propagation. This paper proposes a system with two error resilience techniques that can improve the robustness of H.264 in noisy channels. The first technique is Nearest Neighbor motion compensated Error Concealment (NNEC), which chooses the nearest neighbors in the reference frames for error concealment. The second technique is Distortion Estimated Mode Decision (DEMD), which selects an optimal mode based on stochastically distorted frames. Simulation results showed that the rate-distortion performance of the proposed algorithms is better than that of the compared algorithms.
Directory of Open Access Journals (Sweden)
Ahmed A. Mehdawi
2013-08-01
This study reports results on the use of the K-Nearest Neighbor method for forest classification. The major focus is on the data and techniques that can be used to identify changes in forest features. The study concentrates on identifying forest encroachment in tropical forests such as the forests of Malaysia. It aims to establish a robust mechanism that can be used by different sectors such as forestry, local administration, surveying and agriculture. The main contribution of this study is that it uses the K-Nearest Neighbor method with remote sensing data to detect forest encroachment. The study is intended to serve as a reference for future research on using K-Nearest Neighbor classification as a tool to identify tropical forest encroachment.
Idrissi, A; Vyalov, I; Damay, P; Frolov, A; Oparin, R; Kiselev, M
2009-12-03
The nearest neighbor approach was used to characterize the local structure of CO2 fluid along its coexistence curve (CC) and along the critical isochore (CI). The distributions of the distances, orientations, and interaction energies between a reference CO2 molecule and its subsequent nearest neighbors were calculated. Our results show that the local structure may be resolved into two components or subshells: one is characterized by small radial fluctuations, a parallel orientation, and a dominance of the attractive part of both the electrostatic (EL) and Lennard-Jones (LJ) contributions to the total interaction energy. Conversely, the second subshell is characterized by large radial fluctuations, a perpendicular orientation, a concomitant increase of the repulsive contribution of the EL interaction, and a shift to a less attractive character of the LJ contribution. When the temperature increases along the liquid-gas CC, the first subshell undergoes large changes, which are characterized by an obvious increase of the radial fluctuations, by an increase of the random character of the orientation distribution (except for the first nearest neighbor, which maintains its parallel orientation), and by a drastic decrease of the EL interaction contribution to the total interaction energy. When the temperature is close to the critical isochore, the local structure is no longer resolved into two subshells. Starting from the idea that the profile of vibration modes is sensitive to the local structure as revealed by the nearest neighbor approach, the hypothesis that the CO2 vibration profile may be deconvoluted into two contributions is discussed in a qualitative manner.
Chou, Kuo-Chen; Shen, Hong-Bin
2006-08-01
Facing the explosion of newly generated protein sequences in the post-genomic era, we are challenged to develop an automated method for quickly and reliably annotating their subcellular locations. Knowledge of subcellular locations of proteins can provide useful hints for revealing their functions and understanding how they interact with each other in cellular networking. Unfortunately, it is both expensive and time-consuming to determine the localization of an uncharacterized protein in a living cell purely based on experiments. To tackle the challenge, a novel hybridization classifier was developed by fusing many basic individual classifiers through a voting system. The "engine" of these basic classifiers was operated by the OET-KNN (Optimized Evidence-Theoretic K-Nearest Neighbor) rule. As a demonstration, predictions were performed with the fusion classifier for proteins among the following 16 localizations: (1) cell wall, (2) centriole, (3) chloroplast, (4) cyanelle, (5) cytoplasm, (6) cytoskeleton, (7) endoplasmic reticulum, (8) extracell, (9) Golgi apparatus, (10) lysosome, (11) mitochondria, (12) nucleus, (13) peroxisome, (14) plasma membrane, (15) plastid, and (16) vacuole. To get rid of redundancy and homology bias, none of the proteins investigated here had ≥25% sequence identity to any other in the same subcellular location. The overall success rates thus obtained via the jack-knife cross-validation test and independent dataset test were 81.6% and 83.7%, respectively, which were 46% to 63% higher than those achieved by the other existing methods on the same benchmark datasets. Also, it is clearly elucidated that the overwhelmingly high success rates obtained by the fusion classifier are by no means a trivial utilization of the GO annotations, as is prone to be misinterpreted, because there is a huge number of proteins with given accession numbers and the corresponding GO numbers, but their subcellular locations are still unknown, and that the
Schmalz, M.; Key, G.
Accurate spectral signature classification is key to the nonimaging detection and recognition of spaceborne objects. In classical hyperspectral recognition applications, signature classification accuracy depends on accurate spectral endmember determination [1]. However, in selected target recognition (ATR) applications, it is possible to circumvent the endmember detection problem by employing a Bayesian classifier. Previous approaches to Bayesian classification of spectral signatures have been rule-based, or predicated on a priori parameterized information obtained from offline training, as in the case of neural networks [1,2]. Unfortunately, class separation and classifier refinement results in these methods tend to be suboptimal, and the number of signatures that can be accurately classified often depends linearly on the number of inputs. This can lead to potentially significant classification errors in the presence of noise or densely interleaved signatures. In this paper, we present an emerging technology for nonimaging spectral signature classification based on a highly accurate but computationally efficient search engine called Tabular Nearest Neighbor Encoding (TNE) [3]. Based on prior results, TNE can optimize its classifier performance to track input nonergodicities, as well as yield measures of confidence or caution for evaluation of classification results. Unlike neural networks, TNE does not have a hidden intermediate data structure (e.g., the neural net weight matrix). Instead, TNE generates and exploits a user-accessible data structure called the agreement map (AM), which can be manipulated by Boolean logic operations to effect accurate classifier refinement algorithms. This allows the TNE programmer or user to determine parameters for classification accuracy, and to mathematically analyze the signatures for which TNE did not obtain classification matches. This dual approach to analysis (i.e., correct vs. incorrect classification) has been shown to
Nearest neighbor imputation using spatial-temporal correlations in wireless sensor networks.
Li, YuanYuan; Parker, Lynne E
2014-01-01
Missing data is common in Wireless Sensor Networks (WSNs), especially with multi-hop communications. There are many reasons for this phenomenon, such as unstable wireless communications, synchronization issues, and unreliable sensors. Unfortunately, missing data creates a number of problems for WSNs. First, since most sensor nodes in the network are battery-powered, it is too expensive to have the nodes retransmit missing data across the network. Data re-transmission may also cause time delays when detecting abnormal changes in an environment. Furthermore, localized reasoning techniques on sensor nodes (such as machine learning algorithms to classify states of the environment) are generally not robust enough to handle missing data. Since sensor data collected by a WSN is generally correlated in time and space, we illustrate how replacing missing sensor values with spatially and temporally correlated sensor values can significantly improve the network's performance. However, our studies show that it is important to determine which nodes are spatially and temporally correlated with each other. Simple techniques based on Euclidean distance are not sufficient for complex environmental deployments. Thus, we have developed a novel Nearest Neighbor (NN) imputation method that estimates missing data in WSNs by learning spatial and temporal correlations between sensor nodes. To improve the search time, we utilize a kd-tree data structure, which is a non-parametric, data-driven binary search tree. Instead of using traditional mean and variance of each dimension for kd-tree construction, and Euclidean distance for kd-tree search, we use weighted variances and weighted Euclidean distances based on measured percentages of missing data. We have evaluated this approach through experiments on sensor data from a volcano dataset collected by a network of Crossbow motes, as well as experiments using sensor data from a highway traffic monitoring application. Our experimental results
Directory of Open Access Journals (Sweden)
Y. Erfanifard
2014-03-01
The ecological relationship between trees is important in the sustainable management of forests. To study this relationship in spatial ecology, different indices based on the distance to the nearest neighbor are applied. The aim of this research was to introduce important indices based on nearest neighbor analysis and to apply them in investigating the ecological relationship between Persian oak coppice trees in the Zagros forests. A completely homogeneous 9 ha plot of these forests in Kohgilouye - BoyerAhmad province was selected. This plot was covered with Persian oak coppice trees, whose point map was obtained after registering their spatial locations. Five nearest neighbor indices, G(r), F(r), J(r), GF(r) and CE, were then applied to study the spatial pattern and relationship of these trees. The results showed that Persian oak coppice trees were located regularly in the homogeneous plot and were not ecologically dependent on each other. These trees were independent and did not affect the establishment of each other.
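Of the five indices listed above, CE (the Clark-Evans index) is the simplest to compute, and a short Python sketch may help. It assumes the standard definition, the observed mean nearest-neighbor distance divided by the value expected under complete spatial randomness, 1 / (2 * sqrt(n / A)), and it applies no edge correction; the abstract does not say whether the authors used one. The function name `clark_evans` and the toy point maps are illustrative only.

```python
import math


def clark_evans(points, area):
    """Clark-Evans aggregation index without edge correction.

    Values near 1 suggest a random pattern, values above 1 a regular pattern,
    and values below 1 a clustered pattern.
    """
    n = len(points)
    nearest = []
    for i, p in enumerate(points):
        # Distance from tree i to its closest other tree.
        nearest.append(min(math.dist(p, q) for j, q in enumerate(points) if j != i))
    observed = sum(nearest) / n
    expected = 0.5 / math.sqrt(n / area)  # mean NN distance under randomness
    return observed / expected


# A perfectly regular 10 x 10 grid with unit spacing on a 100-unit-area plot.
grid = [(float(x), float(y)) for x in range(10) for y in range(10)]
# Four trees packed tightly in one corner of the same plot.
clump = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1)]
ce_grid = clark_evans(grid, area=100.0)
ce_clump = clark_evans(clump, area=100.0)
print(ce_grid, ce_clump)
```

The regular grid yields an index well above 1, which is the kind of value that corresponds to the regular arrangement reported for the oak trees, whereas the tight clump yields a value far below 1.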
Energy Technology Data Exchange (ETDEWEB)
Shimo-Oka, T.; Miwa, S.; Suzuki, Y.; Mizuochi, N., E-mail: mizuochi@mp.es.osaka-u.ac.jp [Graduate School of Engineering Science, Osaka University, Toyonaka, Osaka 560-8531 (Japan); Kato, H.; Yamasaki, S. [Energy Technology Research Institute, National Institute of Advanced Industrial Science and Technology (AIST), Tsukuba, Ibaraki 305-8568 (Japan); Jelezko, F. [Institut für Quantenoptik, Universität Ulm, Albert-Einstein-Allee 11, 89081 Ulm (Germany)
2015-04-13
Individual nuclear spins in diamond can be optically detected through hyperfine couplings with the electron spin of a single nitrogen-vacancy (NV) center; such nuclear spins have outstandingly long coherence times. Among the hyperfine couplings in the NV center, the nearest neighbor ¹³C nuclear spins have the largest coupling strength. Nearest neighbor ¹³C nuclear spins have the potential to perform the fastest gate operations, providing the highest fidelity in quantum computing. Herein, we report on the control of coherences in the NV center where all three nearest neighbor carbons are of the ¹³C isotope. Coherences among the three and four qubits are generated and analyzed at room temperature.
Xiao, Xuan; Qiu, Wang-Ren
2010-06-01
G-Protein-Coupled Receptors (GPCRs) are the largest family of cell surface receptors, accounting for >1% of the human genome. They play a key role in cellular signaling networks that regulate various physiological processes. The functions of many GPCRs are unknown, because they are difficult to crystallize and most of them will not dissolve in normal solvents. This difficulty has motivated and challenged the development of a computational method that can predict the families and subfamilies of GPCRs based on their primary sequence, so as to help us classify drugs. In this paper, the adaptive K-nearest neighbor algorithm and the protein cellular automata image (CAI) are introduced. Based on the CAI, the complexity measure factors derived from each of the protein sequences concerned are adopted for its pseudo amino acid composition. GPCRs were categorized into nine subtypes. The overall success rate in identifying GPCRs among their nine family classes was about 83.5%. The high success rate suggests that the adaptive K-nearest neighbor algorithm and protein CAI hold very high potential to become a useful tool for understanding the actions of drugs that target GPCRs and designing new medications with fewer side effects and greater efficacy.
Suratanee, Apichat; Plaimas, Kitiporn
2017-01-01
The associations between proteins and diseases are crucial information for investigating pathological mechanisms. However, the number of known and reliable protein-disease associations is quite small. In this study, an analysis framework to infer associations between proteins and diseases was developed based on a large data set of a human protein-protein interaction network integrating an effective network search, namely, the reverse k-nearest neighbor (RkNN) search. The RkNN search was used to identify an impact of a protein on other proteins. Then, associations between proteins and diseases were inferred statistically. The method using the RkNN search yielded a much higher precision than a random selection, standard nearest neighbor search, or when applying the method to a random protein-protein interaction network. All protein-disease pair candidates were verified by a literature search. Supporting evidence for 596 pairs was identified. In addition, cluster analysis of these candidates revealed 10 promising groups of diseases to be further investigated experimentally. This method can be used to identify novel associations to better understand complex relationships between proteins and diseases.
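The reverse k-nearest neighbor query mentioned above can be illustrated with a brute-force Python sketch: the RkNN set of a query item contains every other item that counts the query among its own k nearest neighbors. The abstract applies the idea to a protein-protein interaction network; the sketch below instead uses points with Euclidean distance as a stand-in, and the function names `knn_indices` and `reverse_knn` are hypothetical.

```python
import math


def knn_indices(points, idx, k):
    """Indices of the k points closest to points[idx], excluding idx itself."""
    others = [j for j in range(len(points)) if j != idx]
    # Ties in distance are broken by index so the result is deterministic.
    others.sort(key=lambda j: (math.dist(points[idx], points[j]), j))
    return set(others[:k])


def reverse_knn(points, query_idx, k):
    """Indices of all points that have points[query_idx] among their k nearest neighbors."""
    return {
        j for j in range(len(points))
        if j != query_idx and query_idx in knn_indices(points, j, k)
    }


points = [(0.0,), (1.0,), (2.5,), (10.0,)]
for q in range(len(points)):
    print(q, sorted(knn_indices(points, q, 1)), sorted(reverse_knn(points, q, 1)))
```

The relation is not symmetric: the isolated point at 10.0 has a nearest neighbor, yet no point selects it, so its RkNN set is empty, while the point at 1.0 is selected by two others. The size of the RkNN set can therefore be read as a rough measure of how much influence an item has on the rest of the data set.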
Zu, Baokai; Xia, Kewen; Pan, Yongke; Niu, Wenjia
2017-01-01
Semisupervised Discriminant Analysis (SDA) is a semisupervised dimensionality reduction algorithm that can easily resolve the out-of-sample problem. Related works usually focus on the geometric relationships of data points, which are not obvious, to enhance the performance of SDA. In contrast to these related works, regularized graph construction, which is important in graph-based semisupervised learning methods, is investigated here. In this paper, we propose a novel graph for Semisupervised Discriminant Analysis, called the combined low-rank and k-nearest neighbor (LRKNN) graph. In our LRKNN graph, we map the data to the LR feature space, and then kNN is adopted to satisfy the algorithmic requirements of SDA. Since the low-rank representation can capture the global structure and the k-nearest neighbor algorithm can maximally preserve the local geometrical structure of the data, the LRKNN graph can significantly improve the performance of SDA. Extensive experiments on several real-world databases show that the proposed LRKNN graph is an efficient graph constructor, which can largely outperform other commonly used baselines.
Directory of Open Access Journals (Sweden)
Baokai Zu
2017-01-01
Semisupervised Discriminant Analysis (SDA) is a semisupervised dimensionality reduction algorithm that can easily resolve the out-of-sample problem. Related works usually focus on the geometric relationships of data points, which are not obvious, to enhance the performance of SDA. In contrast to these related works, regularized graph construction, which is important in graph-based semisupervised learning methods, is investigated here. In this paper, we propose a novel graph for Semisupervised Discriminant Analysis, called the combined low-rank and k-nearest neighbor (LRKNN) graph. In our LRKNN graph, we map the data to the LR feature space, and then kNN is adopted to satisfy the algorithmic requirements of SDA. Since the low-rank representation can capture the global structure and the k-nearest neighbor algorithm can maximally preserve the local geometrical structure of the data, the LRKNN graph can significantly improve the performance of SDA. Extensive experiments on several real-world databases show that the proposed LRKNN graph is an efficient graph constructor, which can largely outperform other commonly used baselines.
Shahabul Alam, Md.; Nazemi, Alireza; Elshorbagy, Amin
2014-05-01
Intensity-Duration-Frequency (IDF) curves are among the standard design criteria for various engineering applications, such as storm water management systems. A warming climate, however, changes the extreme rainfall quantiles represented by the IDF curves. This study attempts to construct future IDF curves under possible climate change scenarios. For this purpose, a stochastic rainfall generator is used to spatially downscale the daily projections of Global Climate Models (GCMs) from coarse grid resolution to the point scale. The stochastically downscaled daily rainfall realizations can be further disaggregated to hourly and sub-hourly rainfall series using a deterministic disaggregation scheme based on the K-Nearest Neighbor (K-NN) method. We applied this framework to construct future IDF curves for the city of Saskatoon, Canada. As a model development step, the sensitivity of the K-NN disaggregation model to the number of nearest neighbors (i.e. window size) is evaluated during the baseline periods. The optimum window size is assigned based on the performance in reproducing the historical IDF curves. The optimum windows identified for the 1-hour and 5-min temporal resolutions are then used to produce the future hourly and, consequently, 5-min resolution rainfall based on the K-NN simulations. Using the simulated hourly and sub-hourly rainfall series and the Generalized Extreme Value (GEV) distribution, future changes in IDF curves and the associated uncertainties are quantified using a large ensemble of projections obtained for CGCM3.1 and HadCM3 based on the A1B, A2 and B1 emission scenarios in the case of CMIP3 and RCP2.6, RCP4.5, and RCP8.5 in the case of the CMIP5 datasets. The constructed IDF curves for the city of Saskatoon are then compared with the corresponding historical relationships at various durations and/or return periods and are discussed based on different models, emission scenarios and/or simulation release (i.e. CMIP3 vs. CMIP5).
Li, Ting-Ting; Li, Cheng-Ren; Wang, Chen; He, Fang-Jun; Zhou, Guang-Ye; Sun, Jing-Chang; Han, Fei
2016-12-01
A new synchronization technique based on inner and outer couplings is proposed in this work to investigate the synchronization of a network group. Chaotic Haken-Lorenz lasers are taken as the nodes to construct several nearest-neighbor complex networks, and these sub-networks are in turn connected to form a network group. Effective node controllers are designed based on a Lyapunov function, and complete synchronization among the sub-networks is realized under inner and outer couplings. The work has potential applications in the cooperative output of lasers and in communication networks. Project supported by the National Natural Science Foundation of China (Grant No. 11004092), the Natural Science Foundation of Liaoning Province, China (Grant Nos. 2015020079 and 201602455), and the Foundation of Education Department of Liaoning Province, China (Grant No. L201683665).
SubPatch: random kd-tree on a sub-sampled patch set for nearest neighbor field estimation
Pedersoli, Fabrizio; Benini, Sergio; Adami, Nicola; Okuda, Masahiro; Leonardi, Riccardo
2015-02-01
We propose a new method to compute the approximate nearest-neighbor field (ANNF) between image pairs using random kd-trees and patch set sub-sampling. By exploiting image coherence, we demonstrate that it is possible to reduce the number of patches on which the ANNF is computed while maintaining high overall accuracy in the final result. Information on missing patches is then recovered by interpolation and propagation of good matches. The sub-sampling factor on patch sets also allows the desired trade-off between accuracy and speed to be set, providing a flexibility that is lacking in state-of-the-art methods. Tests conducted on a public database show that our algorithm achieves superior performance with respect to the PatchMatch (PM) and Coherence Sensitivity Hashing (CSH) algorithms in a comparable computational time.
NC Machine Tools Fault Diagnosis Based on Kernel PCA and k-Nearest Neighbor Using Vibration Signals
Directory of Open Access Journals (Sweden)
Zhou Yuqing
2015-01-01
This paper focuses on fault diagnosis for NC machine tools and puts forward a fault diagnosis method based on kernel principal component analysis (KPCA) and k-nearest neighbor (kNN). A data-dependent KPCA based on the covariance matrix of the sample data is designed to overcome the subjectivity in selecting kernel function parameters and is used to transform the original high-dimensional data into a low-dimensional manifold feature space with the intrinsic dimensionality. The kNN method is modified for tool fault diagnosis so that it can determine thresholds for multiple fault classes, and it is applied to detect potential faults. An experimental analysis on NC milling machine tools is carried out; the test results show that the proposed method outperforms the other two methods in tool fault diagnosis.
Directory of Open Access Journals (Sweden)
Jianbin Xiong
2015-01-01
It is difficult to distinguish the dimensionless indexes of normal petrochemical rotating machinery from those of machinery with complex faults. When the conflict of evidence is too large, the diagnosis becomes uncertain. This paper presents a fault diagnosis method for rotating machinery based on dimensionless indexes combined with the K-nearest neighbor (KNN) algorithm. The method uses a KNN algorithm and an evidence fusion formula to process fuzzy data, incomplete data, and accurate data. It converts the signals from the petrochemical rotating machinery sensors into reliability measures using dimensionless indexes and the KNN algorithm. The input information is further integrated by an evidence synthesis formula to obtain the final data, and the type of fault is decided based on these data. The experimental results show that the proposed method can integrate data to provide a more reliable and reasonable result, thereby reducing the decision risk.
Porta, A; Castiglioni, P; Bari, V; Bassani, T; Marchi, A; Cividjian, A; Quintin, L; Di Rienzo, M
2013-01-01
Complexity analysis of short-term cardiovascular control is traditionally performed using entropy-based approaches including corrective terms or strategies to cope with the loss of reliability of conditional distributions with pattern length. This study proposes a new approach aiming at the estimation of conditional entropy (CE) from short data segments (about 250 samples) based on the k-nearest-neighbor technique. The main advantages are: (i) the control of the loss of reliability of the conditional distributions with the pattern length without introducing a priori information; (ii) the assessment of complexity indexes without fixing the pattern length to an arbitrary low value. The approach, referred to as k-nearest-neighbor conditional entropy (KNNCE), was contrasted with corrected approximate entropy (CApEn), sample entropy (SampEn) and corrected CE (CCE), being the most frequently exploited approaches for entropy-based complexity analysis of short cardiovascular series. Complexity indexes were evaluated during the selective pharmacological blockade of the vagal and/or sympathetic branches of the autonomic nervous system. We found that KNNCE was more powerful than CCE in detecting the decrease of complexity of heart period variability imposed by double autonomic blockade. In addition, KNNCE provides indexes indistinguishable from those derived from CApEn and SampEn. Since this result was obtained without using strategies to correct the CE estimate and without fixing the embedding dimension to an arbitrary low value, KNNCE is potentially more valuable than CCE, CApEn and SampEn when the number of past samples most useful to reduce the uncertainty of future behaviors is high and/or variable among conditions and/or groups.
Institute of Scientific and Technical Information of China (English)
祝继华; 尹俊; 邗汶锌; 杜少毅
2014-01-01
To improve the efficiency of point set registration, an efficient nearest neighbor search approach for 2D/3D point sets is proposed. First, the model points are sorted along the dimension selected according to the variance of each dimension of the model point set. Using binary search, each data point is inserted into the sorted model points, and the distance to the first model point on its left gives an upper bound on the search range. During the search, the range is further reduced using the current nearest neighbor, so that the final nearest neighbor of each point is found quickly. Finally, both the complexity analysis and a comparison of experimental results demonstrate the effectiveness of the approach.
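The search strategy described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names build_sorted_index and nearest_neighbor are hypothetical, the sort dimension is assumed to be the one with the largest variance, and Euclidean distance is assumed.

```python
import bisect
import math

def build_sorted_index(model_points):
    """Sort the model points along the dimension with the largest variance (assumed rule)."""
    dims = len(model_points[0])
    n = len(model_points)
    means = [sum(p[d] for p in model_points) / n for d in range(dims)]
    variances = [sum((p[d] - means[d]) ** 2 for p in model_points) / n
                 for d in range(dims)]
    axis = max(range(dims), key=lambda d: variances[d])
    ordered = sorted(model_points, key=lambda p: p[axis])
    keys = [p[axis] for p in ordered]
    return axis, ordered, keys

def nearest_neighbor(query, index):
    """Return (point, distance) of the model point closest to query."""
    axis, ordered, keys = index
    # Binary search: position where the query would be inserted.
    pos = bisect.bisect_left(keys, query[axis])
    # The first point on the left (or the first point overall) gives the initial bound.
    start = pos - 1 if pos > 0 else 0
    best = ordered[start]
    best_d = math.dist(query, best)
    # Scan left while the gap along the sort axis can still beat the bound.
    i = pos - 1
    while i >= 0 and query[axis] - keys[i] <= best_d:
        d = math.dist(query, ordered[i])
        if d < best_d:
            best, best_d = ordered[i], d
        i -= 1
    # Scan right under the same shrinking bound.
    j = pos
    while j < len(ordered) and keys[j] - query[axis] <= best_d:
        d = math.dist(query, ordered[j])
        if d < best_d:
            best, best_d = ordered[j], d
        j += 1
    return best, best_d

model = [(0.0, 0.0), (5.0, 1.0), (2.0, 2.0), (9.0, -1.0)]
index = build_sorted_index(model)
print(nearest_neighbor((4.6, 0.5), index)[0])
```

The scan stops in each direction as soon as the gap along the sort axis exceeds the current best distance, because no point beyond that gap can be closer.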
Self-consistent-field calculations of proteinlike incorporations in polyelectrolyte complex micelles
Lindhoud, S.; Cohen Stuart, M.A.; Norde, W.; Leermakers, F.A.M.
2009-01-01
Self-consistent field theory is applied to model the structure and stability of polyelectrolyte complex micelles with incorporated protein (molten globule) molecules in the core. The electrostatic interactions that drive the micelle formation are mimicked by nearest-neighbor interactions using
Manganaro, Alberto; Pizzo, Fabiola; Lombardo, Anna; Pogliaghi, Alberto; Benfenati, Emilio
2016-02-01
The ability of a substance to resist degradation and persist in the environment needs to be readily identified in order to protect the environment and human health. Many regulations require the assessment of persistence for substances commonly manufactured and marketed. Besides laboratory-based testing methods, in silico tools may be used to obtain a computational prediction of persistence. We present a new program to develop k-Nearest Neighbor (k-NN) models. The k-NN algorithm is a similarity-based approach that predicts the property of a substance in relation to the experimental data for its most similar compounds. We employed this software to identify persistence in the sediment compartment. Data on half-life (HL) in sediment were obtained from different sources and, after careful data pruning, the final dataset, containing 297 organic compounds, was divided into four experimental classes. We developed several models with satisfactory performance: both the training and test set accuracy ranged between 0.90 and 0.96. We finally selected one model, which will be made available in the near future in the freely available software platform VEGA. This model offers a valuable in silico tool that may be useful for fast and inexpensive screening. Copyright © 2015 Elsevier Ltd. All rights reserved.
Combining Fourier and lagged k-nearest neighbor imputation for biomedical time series data.
Rahman, Shah Atiqur; Huang, Yuxiao; Claassen, Jan; Heintzman, Nathaniel; Kleinberg, Samantha
2015-12-01
Most clinical and biomedical data contain missing values. A patient's record may be split across multiple institutions, devices may fail, and sensors may not be worn at all times. While these missing values are often ignored, this can lead to bias and error when the data are mined. Further, the data are not simply missing at random. Instead the measurement of a variable such as blood glucose may depend on its prior values as well as that of other variables. These dependencies exist across time as well, but current methods have yet to incorporate these temporal relationships as well as multiple types of missingness. To address this, we propose an imputation method (FLk-NN) that incorporates time lagged correlations both within and across variables by combining two imputation methods, based on an extension to k-NN and the Fourier transform. This enables imputation of missing values even when all data at a time point is missing and when there are different types of missingness both within and across variables. In comparison to other approaches on three biological datasets (simulated and actual Type 1 diabetes datasets, and multi-modality neurological ICU monitoring) the proposed method has the highest imputation accuracy. This was true for up to half the data being missing and when consecutive missing values are a significant fraction of the overall time series length. Copyright © 2015 Elsevier Inc. All rights reserved.
Jaradat, Nour Jamal; Khanfar, Mohammad A; Habash, Maha; Taha, Mutasem Omar
2015-06-01
Checkpoint kinase 1 (Chk1) is an important protein in G2 phase checkpoint arrest required by cancer cells to maintain the cell cycle and to prevent cell death. Therefore, Chk1 inhibitors should have potential as anti-cancer therapeutics. Docking-based comparative intermolecular contacts analysis (dbCICA) is a new three-dimensional quantitative structure-activity relationship method that depends on the quality and number of contact points between docked ligands and binding pocket amino acid residues. In this work we implemented a novel combination of k-nearest neighbor/genetic function algorithm modeling coupled with dbCICA to select critical ligand-Chk1 contacts capable of explaining anti-Chk1 bioactivity among a long list of inhibitors. The best set of contacts was translated into two valid pharmacophore hypotheses that were used as 3D search queries to screen the National Cancer Institute's structural database for new Chk1 inhibitors. Three potent Chk1 inhibitors were discovered with IC50 values ranging from 2.4 to 69.7 µM.
Institute of Scientific and Technical Information of China (English)
刘芬芬; 张勇; 袁峰; 夏临华
2012-01-01
The two-dimensional hole-doped t-t'-J-U model was studied based on the Gutzwiller approach and the renormalized mean-field theory. The phase diagrams of gossamer superconductors and the effects of the next-nearest-neighbor hopping (t') on superconductivity and antiferromagnetism in the t-t'-J-U model were investigated. The results show that the qualitative features of the phase diagrams in the t-t'-J-U model are the same as in the case of the t-J-U model. The antiferromagnetic (AF) order coexists with the d-wave superconductivity (dSC) in the underdoped region below the doping δ≈0.1 and is enhanced by t'. The dSC order is slightly suppressed by t' in the underdoped region and greatly enhanced in the overdoped region. The dSC order is pushed to a larger doping region and the coexistence region of the AF and dSC orders extends to higher doping.
Kügler, S D; Hoecker, M
2014-01-01
Context: In astronomy, new approaches to process and analyze the exponentially increasing amount of data are indispensable. While classical approaches (e.g. template fitting) are fine for objects of well-known classes, alternative techniques have to be developed to identify those that do not fit. A classification scheme should therefore be based on individual properties instead of fitting to a global model and thereby losing valuable information. An important issue when dealing with large data sets is outlier detection, which at the moment is often treated in a problem-specific manner. Aims: In this paper we present a method to statistically estimate the redshift z based on a similarity approach. This allows us to determine redshifts in spectra in emission as well as in absorption without using any predefined model. Additionally, we show how an estimate of the redshift based on single features is possible. As a consequence we are, for example, able to filter objects which show multiple redshift components. We propose to apply ...
DEFF Research Database (Denmark)
Lefmann, K.; Rischel, C.
1996-01-01
We present a numerical diagonalization study of two one-dimensional S=1/2 antiferromagnetic Heisenberg chains, having nearest-neighbor and Haldane-Shastry ($1/r^2$) interactions, respectively. We have obtained the T=0 dynamical correlation function, $S^{\alpha\alpha}(q,\omega)$, for chains of length N=8...
Blom, Hans; Rönnlund, Daniel; Scott, Lena; Spicarova, Zuzana; Rantanen, Ville; Widengren, Jerker; Aperia, Anita; Brismar, Hjalmar
2012-02-01
Protein localization in dendritic spines is the focus of intense investigation within neuroscience. Applications of super-resolution microscopy to dissect nanoscale protein distributions, as shown in this work with dual-color STED, generate spatial correlation coefficients with quite small values. This means that colocalization analysis to some extent loses part of its correlative impact. In this study we therefore introduced nearest neighbor analysis to quantify the spatial relations between two important proteins in neurons, the dopamine D1 receptor and Na+,K+-ATPase. The analysis gave new information on how densely the nanoclusters formed by the D1 receptor and Na+,K+-ATPase are located, with respect to both the homogeneous (self to same) and the heterogeneous (same to other) topology. The STED-dissected nanoscale topologies provide evidence for both a joint and a separated confinement of the D1 receptor and the Na+,K+-ATPase in the postsynaptic areas of dendritic spines. This confined topology may have implications for the generation of local sodium gradients and for structural and functional interactions modulating slow synaptic transmission processes. Copyright © 2011 Wiley Periodicals, Inc.
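As a rough illustration of the nearest neighbor analysis mentioned in this abstract, the Python sketch below computes nearest-neighbor distances within one set of cluster centroids ("self to same") and from one set to another ("same to other"). The function name nn_distances and the coordinates are hypothetical; the study's actual pipeline for detecting nanoclusters in STED images is not described here and is not reproduced.

```python
import math

def nn_distances(points, others=None):
    """Nearest-neighbor distance for each point.

    With others=None the neighbor is searched within the same set
    (excluding the point itself); otherwise it is searched in others.
    """
    same_set = others is None
    targets = points if same_set else others
    result = []
    for i, p in enumerate(points):
        candidates = (math.dist(p, q) for j, q in enumerate(targets)
                      if not (same_set and i == j))
        # math.inf marks a point that has no possible neighbor.
        result.append(min(candidates, default=math.inf))
    return result

# Hypothetical centroid coordinates for two protein species.
species_a = [(0.0, 0.0), (3.0, 4.0), (10.0, 0.0)]
species_b = [(0.0, 1.0), (9.0, 0.0)]

print(nn_distances(species_a))             # self to same
print(nn_distances(species_a, species_b))  # same to other
```

Comparing the two distance distributions is one simple way to express whether clusters of one species sit closer to their own kind or to the other species.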
Liu, Da-You; Chen, Hui-Ling; Yang, Bo; Lv, Xin-En; Li, Li-Na; Liu, Jie
2012-10-01
In this paper, we present an enhanced fuzzy k-nearest neighbor (FKNN) classifier-based computer-aided diagnostic (CAD) system for thyroid disease. The neighborhood size k and the fuzzy strength parameter m of the FKNN classifier are adaptively specified by the particle swarm optimization (PSO) approach. Adaptive control parameters, including time-varying acceleration coefficients (TVAC) and time-varying inertia weight (TVIW), are employed to efficiently control the local and global search ability of the PSO algorithm. In addition, we have validated the effectiveness of principal component analysis (PCA) in constructing a more discriminative subspace for classification. The effectiveness of the resultant CAD system, termed PCA-PSO-FKNN, has been rigorously evaluated on the thyroid disease dataset, which is commonly used among researchers who apply machine learning methods to thyroid disease diagnosis. Compared to the existing methods in previous studies, the proposed system has achieved the highest classification accuracy reported so far via 10-fold cross-validation (CV) analysis, with a mean accuracy of 98.82% and a maximum accuracy of 99.09%. The proposed CAD system might thus serve as a new candidate among powerful tools for diagnosing thyroid disease.
Kamath, Sudha D; Mahato, Krishna K
2009-08-01
The objective of this study was to verify the suitability of principal component analysis (PCA)-based k-nearest neighbor (k-NN) analysis for discriminating normal and malignant autofluorescence spectra of colonic mucosal tissues. Autofluorescence spectroscopy, a noninvasive technique, has high specificity and sensitivity for discrimination of diseased and nondiseased colonic tissues. Previously, we assessed the efficacy of the technique on colonic data using PCA Match/No match and Artificial Neural Networks (ANNs) analyses. To improve the classification reliability, the present work was conducted using PCA-based k-NN analysis and was compared with previously obtained results. A total of 115 fluorescence spectra (69 normal and 46 malignant) were recorded from 13 normal and 10 malignant colonic tissues with 325 nm pulsed laser excitation in the spectral region 350-600 nm in vitro. We applied PCA to extract the relevant information from the spectra and used a nonparametric k-NN analysis for classification. The normal and malignant spectra showed large variations in shape and intensity. Statistically significant differences were found between normal and malignant classes. The performance of the analysis was evaluated by calculating the statistical parameters specificity and sensitivity, which were found to be 100% and 91.3%, respectively. The results obtained in this study showed good discrimination between normal and malignant conditions using PCA-based k-NN analysis.
Li, Chao; Zhang, Shuheng; Zhang, Huan; Pang, Lifang; Lam, Kinman; Hui, Chun; Zhang, Su
2012-01-01
Accurate tumor, node, and metastasis (TNM) staging, especially N staging in gastric cancer or the metastasis on lymph node diagnosis, is a popular issue in clinical medical image analysis in which gemstone spectral imaging (GSI) can provide more information to doctors than conventional computed tomography (CT) does. In this paper, we apply machine learning methods on the GSI analysis of lymph node metastasis in gastric cancer. First, we use some feature selection or metric learning methods to reduce data dimension and feature space. We then employ the K-nearest neighbor classifier to distinguish lymph node metastasis from nonlymph node metastasis. The experiment involved 38 lymph node samples in gastric cancer, showing an overall accuracy of 96.33%. Compared with that of traditional diagnostic methods, such as helical CT (sensitivity 75.2% and specificity 41.8%) and multidetector computed tomography (82.09%), the diagnostic accuracy of lymph node metastasis is high. GSI-CT can then be the optimal choice for the preoperative diagnosis of patients with gastric cancer in the N staging.
Directory of Open Access Journals (Sweden)
Jaime Vitola
2017-02-01
Civil and military structures are susceptible and vulnerable to damage due to the environmental and operational conditions. Therefore, the implementation of technology to provide robust solutions in damage identification (by using signals acquired directly from the structure) is a requirement to reduce operational and maintenance costs. In this sense, the use of sensors permanently attached to the structures has demonstrated great versatility and benefit, since the inspection system can be automated. This automation is carried out with signal processing tasks with the aim of a pattern recognition analysis. This work presents the detailed description of a structural health monitoring (SHM) system based on the use of a piezoelectric (PZT) active system. The SHM system includes: (i) the use of a piezoelectric sensor network to excite the structure and collect the measured dynamic response, in several actuation phases; (ii) data organization; (iii) advanced signal processing techniques to define the feature vectors; and finally, (iv) the nearest neighbor algorithm as a machine learning approach to classify different kinds of damage. A description of the experimental setup, the experimental validation and a discussion of the results from two different structures are included and analyzed.
Vitola, Jaime; Pozo, Francesc; Tibaduiza, Diego A.; Anaya, Maribel
2017-02-21
Civil and military structures are susceptible and vulnerable to damage due to the environmental and operational conditions. Therefore, the implementation of technology to provide robust solutions in damage identification (by using signals acquired directly from the structure) is a requirement to reduce operational and maintenance costs. In this sense, the use of sensors permanently attached to the structures has demonstrated a great versatility and benefit since the inspection system can be automated. This automation is carried out with signal processing tasks with the aim of a pattern recognition analysis. This work presents the detailed description of a structural health monitoring (SHM) system based on the use of a piezoelectric (PZT) active system. The SHM system includes: (i) the use of a piezoelectric sensor network to excite the structure and collect the measured dynamic response, in several actuation phases; (ii) data organization; (iii) advanced signal processing techniques to define the feature vectors; and finally, (iv) the nearest neighbor algorithm as a machine learning approach to classify different kinds of damage. A description of the experimental setup, the experimental validation and a discussion of the results from two different structures are included and analyzed. PMID:28230796
Moldovanu, Simona; Bibicu, Dorin; Moraru, Luminita; Nicolae, Mariana Carmen
2011-12-01
The co-occurrence matrix has been applied successfully to the characterization of echographic images because it contains information about the spatial distribution of grey-scale levels in an image. The paper deals with the analysis of pixels in selected regions of interest of an ultrasound (US) image of the liver. The useful information obtained consists of texture features such as entropy, contrast, dissimilarity and correlation, extracted with the co-occurrence matrix. The analyzed US images were grouped into two distinct sets: healthy liver and steatosis (or fatty) liver. These two sets of echographic images of the liver form a database that includes only histologically confirmed cases: 10 images of healthy liver and 10 images of steatosis liver. The healthy subjects are used to compute the four textural indices and also serve as the control dataset. We chose to study this disease because steatosis is the abnormal retention of lipids in cells. The texture features are statistical measures and can be used to characterize the irregularity of tissues. The goal is to extract the information using the Nearest Neighbor classification algorithm. The K-NN algorithm is a powerful tool for classifying texture features by grouping them into a training set using healthy liver, on the one hand, and a holdout set using the texture features of steatosis liver, on the other hand. The results could be used to quantify the texture information and will allow a clear distinction between healthy and steatosis liver.
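The four texture features named in this abstract can be illustrated with a small pure-Python sketch of a grey-level co-occurrence matrix. The pixel offset (one step to the right), the number of grey levels, and the function names glcm and texture_features are assumptions made for illustration; the paper's actual settings are not given in the abstract.

```python
import math

def glcm(image, levels, dr=0, dc=1):
    """Normalized co-occurrence matrix for the pixel offset (dr, dc)."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    total = 0
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
                total += 1
    return [[v / total for v in row] for row in counts]

def texture_features(p):
    """Entropy, contrast, dissimilarity and correlation of a normalized matrix p."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    contrast = sum(v * (i - j) ** 2 for i, j, v in cells)
    dissimilarity = sum(v * abs(i - j) for i, j, v in cells)
    entropy = -sum(v * math.log2(v) for _, _, v in cells if v > 0)
    mu_i = sum(i * v for i, _, v in cells)
    mu_j = sum(j * v for _, j, v in cells)
    var_i = sum(v * (i - mu_i) ** 2 for i, _, v in cells)
    var_j = sum(v * (j - mu_j) ** 2 for _, j, v in cells)
    cov = sum(v * (i - mu_i) * (j - mu_j) for i, j, v in cells)
    # For a constant image the correlation is undefined; 1.0 is used by convention here.
    correlation = cov / math.sqrt(var_i * var_j) if var_i > 0 and var_j > 0 else 1.0
    return {"entropy": entropy, "contrast": contrast,
            "dissimilarity": dissimilarity, "correlation": correlation}

# Tiny two-level example region of interest (hypothetical values).
roi = [[0, 0, 1],
       [0, 1, 1]]
print(texture_features(glcm(roi, levels=2)))
```

Feature vectors of this kind, one per region of interest, are what a nearest-neighbor classifier would then compare.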
Shah, Jasmit S; Rai, Shesh N; DeFilippis, Andrew P; Hill, Bradford G; Bhatnagar, Aruni; Brock, Guy N
2017-02-20
High-throughput metabolomics makes it possible to measure the relative abundances of numerous metabolites in biological samples, which is useful in many areas of biomedical research. However, missing values (MVs) in metabolomics datasets are common and can arise for both technical and biological reasons. Typically, such MVs are substituted by a minimum value, which may lead to different results in downstream analyses. Here we present a modified version of the K-nearest neighbor (KNN) approach which accounts for truncation at the minimum value, i.e., KNN truncation (KNN-TN). We compare imputation results based on KNN-TN with results from other KNN approaches such as KNN based on correlation (KNN-CR) and KNN based on Euclidean distance (KNN-EU). Our approach assumes that the data follow a truncated normal distribution with the truncation point at the limit of detection (LOD). The effectiveness of each approach was analyzed by the root mean square error (RMSE) measure as well as the metabolite list concordance index (MLCI) for influence on downstream statistical testing. Through extensive simulation studies and application to three real data sets, we show that KNN-TN has lower RMSE values compared to the other two KNN procedures as well as simpler imputation methods based on substituting missing values with the metabolite mean, zero values, or the LOD. MLCI values for KNN-TN and KNN-EU were roughly equivalent, and superior to those of the other four methods in most cases. Our findings demonstrate that KNN-TN generally has improved performance in imputing the missing values of the different datasets compared to KNN-CR and KNN-EU when values are missing at random in combination with an LOD. The results shown in this study are in the field of metabolomics, but the method could be applied to any high-throughput technology with values missing due to an LOD.
Steenwijk, Martijn D; Pouwels, Petra J W; Daams, Marita; van Dalen, Jan Willem; Caan, Matthan W A; Richard, Edo; Barkhof, Frederik; Vrenken, Hugo
2013-01-01
The segmentation and volumetric quantification of white matter (WM) lesions play an important role in monitoring and studying neurological diseases such as multiple sclerosis (MS) or cerebrovascular disease. This is often interactively done using 2D magnetic resonance images. Recent developments in acquisition techniques allow for 3D imaging with much thinner sections, but the large number of images per subject makes manual lesion outlining infeasible. This warrants the need for a reliable automated approach. Here we aimed to improve k nearest neighbor (kNN) classification of WM lesions by optimizing intensity normalization and using spatial tissue type priors (TTPs). The kNN-TTP method used kNN classification with 3.0 T 3DFLAIR and 3DT1 intensities as well as MNI-normalized spatial coordinates as features. Additionally, TTPs were computed by nonlinear registration of data from healthy controls. Intensity features were normalized using variance scaling, robust range normalization or histogram matching. The algorithm was then trained and evaluated using a leave-one-out experiment among 20 patients with MS against a reference segmentation that was created completely manually. The performance of each normalization method was evaluated both with and without TTPs in the feature set. Volumetric agreement was evaluated using intra-class coefficient (ICC), and voxelwise spatial agreement was evaluated using Dice similarity index (SI). Finally, the robustness of the method across different scanners and patient populations was evaluated using an independent sample of elderly subjects with hypertension. The intensity normalization method had a large influence on the segmentation performance, with average SI values ranging from 0.66 to 0.72 when no TTPs were used. Independent of the normalization method, the inclusion of TTPs as features increased performance particularly by reducing the lesion detection error. Best performance was achieved using variance scaled intensity
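The Dice similarity index used above for voxelwise agreement is defined as SI = 2|A∩B| / (|A| + |B|), where A and B are the sets of voxels labelled as lesion in the two segmentations. A minimal Python sketch follows; the function name dice_index and the voxel coordinates are hypothetical, and returning 1.0 for two empty segmentations is a convention chosen here, not taken from the study.

```python
def dice_index(seg_a, seg_b):
    """Dice similarity index between two collections of lesion voxel coordinates."""
    a, b = set(seg_a), set(seg_b)
    if not a and not b:
        # Two empty segmentations agree perfectly (convention assumed here).
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))

# Hypothetical voxel coordinates for an automated and a manual segmentation.
automated = {(0, 0, 0), (0, 1, 0), (1, 1, 0)}
manual = {(0, 1, 0), (1, 1, 0), (2, 2, 0)}
print(dice_index(automated, manual))
```

The index is 1 for identical segmentations and 0 when they share no voxel.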
Continuous Nearest-Neighbor Query under Location Privacy Preservation
Institute of Scientific and Technical Information of China (English)
王勇; 董一鸿; 钱江波; 陈华辉
2016-01-01
Mobile subscribers who use location-based services (LBS) risk disclosing their location privacy. To protect location privacy, an effective approach is to cloak the user's exact coordinates into a spatial region, turning the location-based query into a region-based query. Existing privacy-aware continuous nearest-neighbor query algorithms are snapshot-based, which incurs a high central processing unit (CPU) cost. This work studies continuous nearest-neighbor queries under location privacy and proposes an algorithm, the reusing-based location privacy-preserving continuous nearest-neighbor query (RLPCNN), built on a reuse technique for query updating. The algorithm reduces the computational cost by exploiting the similarity between the query result sets of adjacent time instants, so that the answer set is updated quickly and the system response time is markedly shortened. The experiments show that the algorithm is effective and efficient.
Research on optimization of dynamic nearest neighbor clustering algorithm
Institute of Scientific and Technical Information of China (English)
储岳中; 徐波
2011-01-01
To address the sensitivity of the nearest neighbor clustering algorithm to the clustering radius and the difficulty of obtaining an optimal solution, an optimization method based on the Bayesian information criterion (BIC) is proposed. First, the initial data set is preprocessed with the DBSCAN algorithm to remove noise data. Then, the clustering radius is adjusted step by step within the parameter space, the nearest neighbor clustering algorithm is applied at each radius, and the BIC value of each clustering result is calculated. Finally, the BIC values of the clustering results are compared, and the clustering with the maximum BIC value is taken as the optimal result. Experimental results show that the optimized nearest neighbor clustering algorithm effectively solves the problem of selecting a suitable clustering radius.
Cao, Jun-Zhe; Liu, Wen-Qi; Gu, Hong
2012-11-01
Machine learning is a reliable technology for automated subcellular localization of viral proteins within a host cell or virus-infected cell. One challenge is that viral protein samples not only have multiple location sites but are also class-imbalanced, and an imbalanced dataset often decreases prediction performance. To address this challenge, this paper proposes a novel approach named imbalance-weighted multi-label K-nearest neighbor to predict viral protein subcellular location with multiple sites. Experimental results from a jackknife test indicate that the presented algorithm performs better than the existing methods and has great potential in protein science.
E, Mingju; Gong, Ye; Yu, Jiangping; Zhang, Siyu; Fan, Qianxi; Jiang, Yunlei
2017-01-01
Extra-pair copulation is considered to be a means by which females can modify their initial mate choice, and females might obtain indirect benefits to offspring fitness by engaging in this behavior. Here, we examined the patterns of extra-pair paternity and female preferences in the yellow-rumped flycatcher (Ficedula zanthopygia). We found that female yellow-rumped flycatchers are more likely to choose larger and relatively highly heterozygous males than their social mates as extra-pair mates, that the genetic similarity of pairs that produced mixed-paternity offspring did not differ from the similarity of pairs producing only within-pair offspring, and that extra-pair offspring were more heterozygous than their half-siblings. These findings support the good genes hypothesis but do not exclude the compatibility hypothesis. Most female yellow-rumped flycatchers attained extra-pair paternity with distant males rather than their nearest accessible neighboring males, and no differences in genetic and phenotypic characteristics were detected between cuckolded males and their nearest neighbors. There was no evidence that extra-pair mating by female flycatchers reduced inbreeding. Moreover, breeding density, breeding synchrony and their interaction did not affect the occurrence of extra-pair paternity in this species. Our results suggest that the variation in extra-pair paternity distribution between nearest neighbors in some passerine species might result from female preference for highly heterozygous males. PMID:28257431
Shariq, Ahmed
2012-01-01
A next nearest neighbor evaluation procedure of atom probe tomography data provides distributions of the distances between atoms. The width of these distributions for the metallic glasses studied so far is a few Angstrom, reflecting the spatial resolution of the analytical technique. However, fitting Gaussian distributions to the distribution of atomic distances yields average distances with statistical uncertainties of 2 to 3 hundredths of an Angstrom. Fe40Ni40B20 metallic glass ribbons are characterized this way in the as-quenched state and in a state heat treated at 350 °C for 1 h, revealing a change in the structure on the sub-nanometer scale. By applying the statistical tool of the χ2 test, a slight deviation from a random distribution of B atoms is perceived in the as-quenched sample, whereas a pronounced elemental inhomogeneity of boron is detected for the annealed state. In addition, the distance distribution of the first fifteen atomic neighbors is determined with this algorithm for both the annealed and as-quenched states. The next neighbor evaluation algorithm evinces a steric periodicity of the atoms when the next neighbor distances are normalized by the first next neighbor distance. A comparison of the nearest neighbor atomic distribution for the as-quenched and annealed states shows accumulation of Ni and B. Moreover, it also reveals the tendency of Fe and B to move slightly away from each other, an incipient step to Ni-rich boride formation. © 2011 Elsevier B.V.
Datta, A.; Banerjee, S.; Finley, A.O.; Hamm, N.A.S.; Schaap, M.
2016-01-01
Particulate matter (PM) is a class of malicious environmental pollutants known to be detrimental to human health. Regulatory efforts aimed at curbing PM levels in different countries often require high resolution space–time maps that can identify red-flag regions exceeding statutory concentration
Energy Technology Data Exchange (ETDEWEB)
Fournier, Sean Donovan; Beall, Patrick S; Miller, Mark L
2014-08-01
Through the SNL New Mexico Small Business Assistance (NMSBA) program, several Sandia engineers worked with the Environmental Restoration Group (ERG) Inc. to verify and validate a novel algorithm used to determine the scanning Critical Level (L c ) and Minimum Detectable Concentration (MDC) (or Minimum Detectable Areal Activity) for the 102F scanning system. Through the use of Monte Carlo statistical simulations the algorithm mathematically demonstrates accuracy in determining the L c and MDC when a nearest-neighbor averaging (NNA) technique was used. To empirically validate this approach, SNL prepared several spiked sources and ran a test with the ERG 102F instrument on a bare concrete floor known to have no radiological contamination other than background naturally occurring radioactive material (NORM). The tests conclude that the NNA technique increases the sensitivity (decreases the L c and MDC) for high-density data maps that are obtained by scanning radiological survey instruments.
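The abstract above does not spell out how the nearest-neighbor averaging (NNA) step is computed. The following is a minimal sketch only, assuming the scan data form a rectangular grid of count readings and that each reading is replaced by the mean of itself and its adjacent cells (a 3×3 window, clipped at the edges). The function name and the window size are illustrative assumptions, not details taken from the report.

```python
def nearest_neighbor_average(grid):
    """Smooth a 2-D grid of readings with a clipped 3x3 mean (assumed window)."""
    rows, cols = len(grid), len(grid[0])
    smoothed = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Collect the cell itself and every in-bounds adjacent cell.
            window = [
                grid[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
            ]
            smoothed[r][c] = sum(window) / len(window)
    return smoothed
```

Averaging n independent background readings shrinks the standard deviation of the averaged value by roughly a factor of sqrt(n), which is one plausible reading of why a denser data map would lower the critical level and the MDC.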
Du, Pufeng; Cao, Shengjiao; Li, Yanda
2009-11-21
The chloroplast is a type of plant specific subcellular organelle. It is of central importance in several biological processes like photosynthesis and amino acid biosynthesis. Thus, understanding the function of chloroplast proteins is of significant value. Since the function of chloroplast proteins correlates with their subchloroplast locations, the knowledge of their subchloroplast locations can be very helpful in understanding their role in the biological processes. In the current paper, by introducing the evidence-theoretic K-nearest neighbor (ET-KNN) algorithm, we developed a method for predicting the protein subchloroplast locations. This is the first algorithm for predicting the protein subchloroplast locations. We have implemented our algorithm as an online service, SubChlo (http://bioinfo.au.tsinghua.edu.cn/subchlo). This service may be useful to the chloroplast proteome research.
Research of Nearest Neighbor Query of Point to General Polygons in the Plane
Institute of Scientific and Technical Information of China (English)
朱婧
2014-01-01
The nearest neighbor query of a point to general polygons in the plane is the problem of finding the nearest neighbor and the sequential nearest neighbors of a query point in a set of general polygons. In view of the particular nature of the query objects, an R-tree is used as the index structure and the spatial structure is organized by the convex hulls of the general polygons. The minimum distance from the query point to a convex hull is computed by determining the visible edges. A priority queue is used to sort the minimum distances from the query point to each convex hull, which finally yields the nearest neighbor and the sequential nearest neighbors of the query point.
Hayat, Maqsood; Khan, Asifullah
2012-04-01
Outer membrane proteins (OMPs) play important roles in cell biology. In addition, OMPs are targeted by multiple drugs. The identification of OMPs from genomic sequences and successful prediction of their secondary and tertiary structures is a challenging task due to short membrane-spanning regions with high variation in properties. Therefore, an effective and accurate in silico method for discriminating OMPs from their primary sequences is needed. In this paper, we have analyzed the performance of various machine learning mechanisms for discriminating OMPs, such as Genetic Programming, K-nearest Neighbor, and Fuzzy K-nearest Neighbor (Fuzzy K-NN), in conjunction with discrete methods such as amino acid composition, amphiphilic pseudo amino acid composition, split amino acid composition (SAAC), and hybrid versions of these methods. The performance of the classifiers is evaluated on two datasets using 5-fold cross-validation. After the simulation, we have observed that Fuzzy K-NN using SAAC-based features is quite effective in discriminating OMPs. On dataset1, Fuzzy K-NN achieves the highest success rates: 99.00% accuracy for discriminating OMPs from non-OMPs, and 98.77% and 98.28% accuracy for discriminating them from α-helix membrane and globular proteins, respectively. On dataset2, Fuzzy K-NN achieves 99.55%, 99.90%, and 99.81% accuracy for discriminating OMPs from non-OMPs, α-helix membrane, and globular proteins, respectively. It is observed that the classification performance of our proposed method is satisfactory and is better than the existing methods. Thus, it might be an effective tool for high throughput innovation of OMPs.
McKinney, Brett A; White, Bill C; Grill, Diane E; Li, Peter W; Kennedy, Richard B; Poland, Gregory A; Oberg, Ann L
2013-01-01
Relief-F is a nonparametric, nearest-neighbor machine learning method that has been successfully used to identify relevant variables that may interact in complex multivariate models to explain phenotypic variation. While several tools have been developed for assessing differential expression in sequence-based transcriptomics, the detection of statistical interactions between transcripts has received less attention in the area of RNA-seq analysis. We describe a new extension and assessment of Relief-F for feature selection in RNA-seq data. The ReliefSeq implementation adapts the number of nearest neighbors (k) for each gene to optimize the Relief-F test statistics (importance scores) for finding both main effects and interactions. We compare this gene-wise adaptive-k (gwak) Relief-F method with standard RNA-seq feature selection tools, such as DESeq and edgeR, and with the popular machine learning method Random Forests. We demonstrate performance on a panel of simulated data that have a range of distributional properties reflected in real mRNA-seq data including multiple transcripts with varying sizes of main effects and interaction effects. For simulated main effects, gwak-Relief-F feature selection performs comparably to standard tools DESeq and edgeR for ranking relevant transcripts. For gene-gene interactions, gwak-Relief-F outperforms all comparison methods at ranking relevant genes in all but the highest fold change/highest signal situations where it performs similarly. The gwak-Relief-F algorithm outperforms Random Forests for detecting relevant genes in all simulation experiments. In addition, Relief-F is comparable to the other methods based on computational time. We also apply ReliefSeq to an RNA-Seq study of smallpox vaccine to identify gene expression changes between vaccinia virus-stimulated and unstimulated samples. ReliefSeq is an attractive tool for inclusion in the suite of tools used for analysis of mRNA-Seq data; it has power to detect both main
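To make the nearest-neighbor scoring idea behind Relief-F concrete, here is a minimal sketch of the classic Relief-F weight update for a two-class problem with a fixed k. It is not the ReliefSeq implementation: the gene-wise adaptive-k selection described above is omitted, and the function name, the range-normalized difference, and the Manhattan neighbor distance are assumptions made for illustration.

```python
def relief_scores(X, y, k=1):
    """Return one importance score per feature (higher = more relevant)."""
    n, p = len(X), len(X[0])
    # Range of each feature, used to scale differences into [0, 1].
    spans = [(max(row[f] for row in X) - min(row[f] for row in X)) or 1.0
             for f in range(p)]

    def diff(f, a, b):
        return abs(a[f] - b[f]) / spans[f]

    def dist(a, b):
        return sum(diff(f, a, b) for f in range(p))

    weights = [0.0] * p
    for i in range(n):
        others = sorted((j for j in range(n) if j != i),
                        key=lambda j: dist(X[i], X[j]))
        hits = [j for j in others if y[j] == y[i]][:k]     # same class
        misses = [j for j in others if y[j] != y[i]][:k]   # other class
        for f in range(p):
            if hits:
                # Differing from same-class neighbors lowers the score.
                weights[f] -= sum(diff(f, X[i], X[j]) for j in hits) / (len(hits) * n)
            if misses:
                # Differing from other-class neighbors raises the score.
                weights[f] += sum(diff(f, X[i], X[j]) for j in misses) / (len(misses) * n)
    return weights
```

Because the neighbors are found in the full feature space, a feature can earn a high score through its joint behavior with other features, which is the property the abstract relies on for detecting interactions.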
Jensen, Berith F; Vind, Christian; Padkjaer, Søren B; Brockhoff, Per B; Refsgaard, Hanne H F
2007-02-08
Inhibition of cytochrome P450 (CYP) enzymes is unwanted because of the risk of severe side effects due to drug-drug interactions. We present two in silico Gaussian kernel weighted k-nearest neighbor models based on extended connectivity fingerprints that classify CYP2D6 and CYP3A4 inhibition. Data used for modeling consisted of diverse sets of 1153 and 1382 drug candidates tested for CYP2D6 and CYP3A4 inhibition in human liver microsomes. For CYP2D6, 82% of the classified test set compounds were predicted to the correct class. For CYP3A4, 88% of the classified compounds were correctly classified. CYP2D6 and CYP3A4 inhibition were additionally classified for an external test set on 14 drugs, and multidimensional scaling plots showed that the drugs in the external test set were in the periphery of the training sets. Furthermore, fragment analyses were performed and structural fragments frequent in CYP2D6 and CYP3A4 inhibitors and noninhibitors are presented.
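A Gaussian kernel weighted k-nearest neighbor classifier of the kind named above can be sketched as follows. This is an illustration under stated assumptions, not the authors' model: fingerprints are represented as Python sets of feature identifiers, the distance is one minus the Tanimoto similarity, and `k`, `sigma`, and both function names are hypothetical choices.

```python
import math

def tanimoto_distance(a, b):
    """1 - |a & b| / |a | b| for two fingerprint sets."""
    union = len(a | b)
    return 0.0 if union == 0 else 1.0 - len(a & b) / union

def kernel_knn_predict(query, training, k=3, sigma=0.3):
    """training: list of (fingerprint_set, label) pairs."""
    neighbors = sorted(
        ((tanimoto_distance(query, fp), label) for fp, label in training),
        key=lambda pair: pair[0],
    )[:k]
    votes = {}
    for distance, label in neighbors:
        # Gaussian kernel: close neighbors count far more than distant ones.
        weight = math.exp(-distance ** 2 / (2 * sigma ** 2))
        votes[label] = votes.get(label, 0.0) + weight
    return max(votes, key=votes.get)
```

The kernel weighting matters when the k nearest compounds are not equally close: a single near-identical neighbor can outvote two distant ones, which an unweighted majority vote would not allow.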
Institute of Scientific and Technical Information of China (English)
Carlos A AGUIRRE-SALADO; Liliana MIRANDA-ARAGÓN; Eduardo J TREVIÑO-GARZA; Oscar A AGUIRRE-CALDERÓN; Javier JIMÉNEZ-PÉREZ; Marco A GONZÁLEZ-TAGLE; José R VALDÉZ-LAZALDE; Guillermo SÁNCHEZ-DÍAZ; Reija HAAPANEN; Alejandro I AGUIRRE-SALADO
2014-01-01
As climate change negotiations progress, monitoring biomass and carbon stocks is becoming an important part of the current forest research. Therefore, national governments are interested in developing forest-monitoring strategies using geospatial technology. Among statistical methods for mapping biomass, there is a nonparametric approach called k-nearest neighbor (kNN). We compared four variations of distance metrics of the kNN for the spatially-explicit estimation of aboveground biomass in a portion of the Mexican north border of the intertropical zone. Satellite derived, climatic, and topographic predictor variables were combined with the Mexican National Forest Inventory (NFI) data to accomplish the purpose. Performance of distance metrics applied into the kNN algorithm was evaluated using a cross validation leave-one-out technique. The results indicate that the Most Similar Neighbor (MSN) approach maximizes the correlation between predictor and response variables (r=0.9). Our results are in agreement with those reported in the literature. These findings confirm the predictive potential of the MSN approach for mapping forest variables at pixel level under the policy of Reducing Emission from Deforestation and Forest Degradation (REDD+).
Yangyang Guo; Wei Li; Jiping He
2014-01-01
Neural decoding is a procedure that extracts intended movement information from neural activity and generates movement commands to control external devices such as intelligent prostheses. In this study, a monkey (Astra) was trained to perform a 3-D reach-to-grasp task, and we recorded neural signals from its primary motor cortex (M1) during the task. The task-related cells were divided into four classes based on their correlation with two movement parameters: movement direction and orientation. We adopted the simple k-nearest neighbor (KNN) algorithm as the classifier and chose cells from the appropriate cell classes for movement parameter decoding. Cell classification was shown to improve decoding accuracy with relatively fewer cells, even during the movement planning stage (CRT). High decoding accuracy before the movement is actually performed is of great significance for intelligent prosthesis control, and provides evidence that M1 does more than accept ready-made movement commands and also participates in movement planning. We also found that the population of task-related cells in M1 had a preference for specific directions and orientations, and this preference was more significant for the populations of direction-related cells and orientation-related cells.
Bergmann, Tommy; Heinke, Florian; Labudde, Dirk
2017-09-01
The age determination of blood traces provides important hints for the chronological assessment of criminal events and their reconstruction. Current methods are often expensive, involve significant experimental complexity and often fail to perform when being applied to aged blood samples taken from different substrates. In this work an absorption spectroscopy-based blood stain age estimation method is presented, which utilizes 400-640nm absorption spectra in computation. Spectral data from 72 differently aged pig blood stains (2h to three weeks) dried on three different substrate surfaces (cotton, polyester and glass) were acquired and the turnover-time correlations were utilized to develop a straightforward age estimation scheme. More precisely, data processing includes data dimensionality reduction, upon which classic k-nearest neighbor classifiers are employed. This strategy shows good agreement between observed and predicted blood stain age (r>0.9) in cross-validation. The presented estimation strategy utilizes spectral data from dissolved blood samples to bypass spectral artifacts which are well known to interfere with other spectral methods such as reflection spectroscopy. Results indicate that age estimations can be drawn from such absorbance spectroscopic data independent from substrate the blood dried on. Since data in this study was acquired under laboratory conditions, future work has to consider perturbing environmental conditions in order to assess real-life applicability. Copyright © 2017 Elsevier B.V. All rights reserved.
Zuo, Yong-Chun; Su, Wen-Xia; Zhang, Shi-Hua; Wang, Shan-Shan; Wu, Cheng-Yan; Yang, Lei; Li, Guang-Peng
2015-03-01
Membrane transporters play crucial roles in the fundamental cellular processes of living organisms. Computational techniques are very necessary to annotate the transporter functions. In this study, a multi-class K nearest neighbor classifier based on the increment of diversity (KNN-ID) was developed to discriminate the membrane transporter types when the increment of diversity (ID) was introduced as one of the novel similarity distances. Comparisons with multiple recently published methods showed that the proposed KNN-ID method outperformed the other methods, obtaining more than 20% improvement for overall accuracy. The overall prediction accuracy reached was 83.1%, when the K was selected as 2. The prediction sensitivity achieved 76.7%, 89.1%, 80.1% for channels/pores, electrochemical potential-driven transporters, primary active transporters, respectively. Discrimination and comparison between any two different classes of transporters further demonstrated that the proposed method is a potential classifier and will play a complementary role for facilitating the functional assignment of transporters.
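The abstract does not give the formula for the increment of diversity. The sketch below assumes the commonly used definition, in which the diversity of a count vector is D(X) = N log N − Σ n_i log n_i (N being the sum of the counts) and ID(X, Y) = D(X + Y) − D(X) − D(Y), and then uses ID as the dissimilarity in a K-nearest-neighbor vote. The feature vectors, the tie-breaking rule, and the function names are illustrative assumptions.

```python
import math
from collections import Counter

def diversity(counts):
    """D(X) = N log N - sum(n_i log n_i) for a vector of non-negative counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return total * math.log(total) - sum(n * math.log(n) for n in counts if n > 0)

def increment_of_diversity(x, y):
    """ID(X, Y) = D(X + Y) - D(X) - D(Y); zero for proportional vectors."""
    merged = [a + b for a, b in zip(x, y)]
    return diversity(merged) - diversity(x) - diversity(y)

def knn_id_predict(query, training, k=2):
    """training: list of (count_vector, label); smaller ID means more similar."""
    ranked = sorted(training, key=lambda item: increment_of_diversity(query, item[0]))
    top = ranked[:k]
    votes = Counter(label for _, label in top)
    best = max(votes.values())
    # Ties are broken in favor of the nearest neighbor among the tied labels.
    for _, label in top:
        if votes[label] == best:
            return label
```

With K = 2, as reported above, ties between the two neighbors are frequent, so some tie-breaking rule is needed; preferring the nearer neighbor is one simple option.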
Porta, Alberto; De Maria, Beatrice; Bari, Vlasta; Marchi, Andrea; Marinou, Kalliopi; Sideri, Riccardo; Mora, Gabriele; Dalla Vecchia, Laura
2016-08-01
The study evaluates the k-nearest-neighbor (KNN) strategy for the assessment of complexity of the cardiac neural control from spontaneous fluctuations of heart period (HP). Two different procedures were assessed: i) the KNN estimation of the conditional entropy (CE) proposed by Porta et al; ii) the KNN estimation of mutual information proposed by Kozachenko-Leonenko, refined by Kraskov-Stögbauer-Grassberger and here adapted for the CE estimation. The two procedures were compared over HP variability recordings obtained at rest in supine position and during head-up tilt (HUT) in amyotrophic lateral sclerosis patients and healthy subjects. We found that the indexes derived from the two procedures were significantly correlated and both methods were able to detect the effect of HUT on HP complexity within the same group and distinguish the two populations within the same experimental condition. We recommend the use of the KNN strategy to quantify the dynamical complexity of cardiac neural control in addition to more traditional approaches.
Directory of Open Access Journals (Sweden)
A. Moosavian
2013-01-01
Vibration analysis is an accepted method in condition monitoring of machines, since it can provide useful and reliable information about machine working condition. This paper surveys a new scheme for fault diagnosis of main journal-bearings of an internal combustion (IC) engine based on the power spectral density (PSD) technique and two classifiers, namely, K-nearest neighbor (KNN) and artificial neural network (ANN). Vibration signals for three different conditions of the journal-bearing (normal, oil starvation, and extreme wear fault) were acquired from an IC engine. PSD was applied to process the vibration signals. Thirty features were extracted from the PSD values of the signals as a feature source for fault diagnosis. KNN and ANN were trained on the training data set and then used as diagnostic classifiers. The K value for KNN and the hidden neuron count (N) for ANN were varied in the range of 1 to 20, with a step size of 1, to gain the best classification results. The roles of the PSD, KNN and ANN techniques were studied. The results show that the performance of ANN is better than that of KNN. The experimental results demonstrate that the proposed diagnostic method can reliably separate different fault conditions in main journal-bearings of an IC engine.
Directory of Open Access Journals (Sweden)
Leonhard Suchenwirth
2014-07-01
Among the machine learning tools being used in recent years for environmental applications such as forestry, self-organizing maps (SOM) and the k-nearest neighbor (kNN) algorithm have been used successfully. We applied both methods for the mapping of organic carbon (Corg) in riparian forests due to their considerably high carbon storage capacity. Despite the importance of floodplains for carbon sequestration, a sufficient scientific foundation for creating large-scale maps showing the spatial Corg distribution is still missing. We estimated organic carbon in a test site in the Danube Floodplain based on RapidEye remote sensing data and additional geodata. Accordingly, carbon distribution maps of vegetation, soil, and total Corg stocks were derived. Results were compared and statistically evaluated with terrestrial survey data for outcomes with pure remote sensing data and for the combination with additional geodata using bias and the Root Mean Square Error (RMSE). Results show that SOM and kNN approaches enable us to reproduce spatial patterns of riparian forest Corg stocks. While vegetation Corg has very high RMSEs, outcomes for soil and total Corg stocks are less biased with a lower RMSE, especially when remote sensing and additional geodata are conjointly applied. SOMs show similar percentages of RMSE to kNN estimations.
Institute of Scientific and Technical Information of China (English)
颜克胜; 李太福; 魏正元; 苏盈盈; 姚立忠
2012-01-01
In the classification of high-dimensional data, multicollinearity, redundant features and noise often lead to low recognition accuracy and high time and space overhead for the classifier. A feature selection method based on partial least squares (PLS) and false nearest neighbors (FNN) is proposed. First, partial least squares is used to extract the principal components of the high-dimensional data, which removes the multicollinearity among the original features and yields an independent principal component space that carries supervision information. Then, a similarity measure based on FNN is established by calculating the correlation in this space before and after each feature is selected, which gives a ranking of the original features by their ability to explain the dependent variable. Finally, the features with weak explanatory ability are removed in turn to construct a series of classification models, and the recognition rate of a Support Vector Machine (SVM) is used as the evaluation criterion to find the model that has the highest recognition rate while containing the fewest features; the features of that model form the best feature subset. A series of experiments on different data models has been conducted. The simulation results show that this method is well able to select the best feature subset, consistent with the nature of the classification features of the data set. The research therefore provides a new approach to feature selection for data classification.
Application of weighted nearest neighbor clustering in SOC
Institute of Scientific and Technical Information of China (English)
丛佩丽
2012-01-01
The alarm technology of the Security Operation Center (SOC) is studied, and a clustering method based on a weighted nearest neighbor algorithm is presented. Alarm information that has been initially filtered and normalized is clustered against the existing rules in the knowledge base, so that real attack events are obtained and the attack scenario is reconstructed, which provides a strong basis for further correlation analysis of the alarm information. Testing and application indicate that the method performs better than other methods in usability, flexibility, accuracy in identifying real attacks, and processing efficiency.
k-Nearest Neighbor Algorithm in Dynamic Road Network Based on Routing Mechanism
Institute of Scientific and Technical Information of China (English)
张栋良; 唐俊
2013-01-01
To address geographic information queries on real-world dynamic road networks, a k-nearest neighbor (kNN) query algorithm for dynamic road networks based on a routing mechanism is proposed. The guiding idea is to trade space for time: routing tables store the results of earlier queries, table lookups replace the traditional shortest-path computation, historical data reduce repeated computation and support route planning for vehicles, and updating the routing tables adapts the method to changing road conditions. Built around the routing table, the filtering and refinement steps of the corresponding kNN algorithm are improved. A small amount of preprocessing of the dynamic road network with the routing tables reduces the number of candidate points in the kNN search, narrows the query range, and improves search efficiency.
Research on analyzing sentiment of texts based on k-nearest neighbor algorithm
Institute of Scientific and Technical Information of China (English)
樊娜; 安毅生; 李慧贤
2012-01-01
To identify the sentiment polarity of web texts, a method based on the K-nearest neighbor algorithm is proposed after analyzing text structure and the characteristics of sentiment expression in texts. In this method, the sentiment of a text is divided into local sentiment and global sentiment. Local sentiment is determined by conditional random field models, and the K-nearest neighbor algorithm is used to compute the global sentiment of the text. Experimental results show that, compared with traditional machine learning methods, this method can analyze sentiment at multiple levels and at fine granularity, and can effectively improve the accuracy of sentiment analysis.
Ground-state diagrams for lattice-gas models of catalytic CO oxidation
Directory of Open Access Journals (Sweden)
I.S.Bzovska
2007-01-01
Full Text Available Based on simple lattice models of catalytic carbon dioxide synthesis from oxygen and carbon monoxide, phase diagrams are investigated at temperature T=0 by incorporating the nearest-neighbor interactions on a catalyst surface. The main types of ground-state phase diagrams of two lattice models are classified describing the cases of clean surface and surface containing impurities. Nonuniform phases are obtained and the conditions of their existence dependent on the interaction parameters are established.
Institute of Scientific and Technical Information of China (English)
魏书宁; 王耀南; 印峰; 杨易旻
2012-01-01
The flexible mechanical characteristics of power lines make line-grasping control very difficult for de-icing robots when they cross obstacles. To deal with this difficulty, we propose a line-grasping control approach for de-icing robots that combines the k-nearest neighbor (KNN) classification algorithm with reinforcement learning (RL). In each learning iteration, the state-perception mechanism based on the KNN algorithm selects the k states nearest to the robot's current state and weights them; the optimal action is then determined from the weighted result. By expressing a continuous state through its k nearest discrete states, the approach effectively addresses the convergence problems and the curse of dimensionality that arise in traditional generalization methods for continuous state spaces. The ability of RL to explore and adapt to the environment lets the line-grasping control tolerate errors in the robot model, errors in robot arm attitude, and interference from the environment. The implementation steps of the algorithm are presented in detail, and simulation results of line-grasping control for a de-icing robot based on this approach are given.
Institute of Scientific and Technical Information of China (English)
于攀; 叶俊勇
2011-01-01
Cancer gene expression data are typical high-dimensional, small-sample data. Identifying them directly runs into the curse of dimensionality, so dimensionality reduction is needed. This paper proposes a classification approach for gene expression data based on Spectral Regression (SR) analysis and a Kernel space K-Nearest Neighbor (KKNN) classifier. Spectral regression analysis yields a projection matrix that effectively extracts low-dimensional discriminative features; the projection matrix is used to reduce the dimensionality of the gene expression data, and the resulting low-dimensional data are then classified with the Kernel space K-Nearest Neighbor classifier. Experiments on the cancer datasets Prostate_Tumor and 4_Tumors demonstrate the effectiveness of the proposed algorithm and also show that the Kernel space K-Nearest Neighbor classifier gives better classification results than the K-Nearest Neighbor (KNN) classification approach.
Institute of Scientific and Technical Information of China (English)
李乡儒
2012-01-01
The nearest neighbor (NN) method is one of the most typical methods in spectral retrieval, automatic processing, and data mining. Its main problem in practice is low efficiency. This work therefore focuses on efficient implementation and introduces a novel algorithm, SHNN (sequential computation-based hash nearest neighbor algorithm). In SHNN, the spectral data are first orthogonally transformed with PCA so that the flux components are organized according to their hashing power. Second, the nearest neighbor is computed in PCA space using sequential computation; in this step the set of putative nearest spectra is reduced using the hash idea, and spectra that cannot be the nearest are rejected as early as possible. The contributions of this work are: 1) a novel algorithm, SHNN, is introduced, which significantly improves the efficiency of the nearest neighbor method, the most popular spectra mining method; 2) its application to the classification of star, normal galaxy, and QSO spectra is investigated. The efficiency of the proposed algorithm was evaluated experimentally on spectra released by the SDSS (Sloan Digital Sky Survey). The experimental results show that SHNN improves the efficiency of the nearest neighbor method by more than 96%. Because the nearest neighbor method is one of the most popular and typical methods in spectra mining, this work is useful in a wide range of automatic spectral analysis tasks, for example spectral classification, spectral parameter estimation, and redshift estimation based on spectra.
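The idea of rejecting non-nearest spectra as early as possible through sequential computation can be illustrated with a partial-distance search. The sketch below is not the SHNN algorithm itself: it omits the hashing step, and the function name and data layout are hypothetical. It assumes the vectors are already expressed in PCA coordinates, so the leading components carry most of the variance.

```python
def nearest_with_early_rejection(query, candidates):
    """Return (index, squared distance) of the nearest candidate.

    The squared distance is accumulated one component at a time, and a
    candidate is dropped as soon as its partial sum can no longer beat
    the best complete distance found so far.
    """
    best_idx, best_d2 = -1, float("inf")
    for idx, cand in enumerate(candidates):
        partial = 0.0
        for q, c in zip(query, cand):
            partial += (q - c) ** 2
            if partial >= best_d2:
                break  # cannot be the nearest neighbour: reject early
        else:
            # All components were summed without exceeding the current best.
            best_idx, best_d2 = idx, partial
    return best_idx, best_d2
```

Because most of the distance is contributed by the first few PCA components, many candidates are discarded after only a handful of additions, while the result is identical to an exhaustive search.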
Institute of Scientific and Technical Information of China (English)
殷龙; 衡红军
2016-01-01
window constraints according to the business of the airport refueling service is built in this paper. The nearest neighbor algorithm is then applied to solve the model, and the effectiveness of the model is verified using the actual flight data of a domestic airport as an example. Finally, the optimal refueling task allocation is obtained. Experimental results show that the algorithm can greatly reduce the service cost for special scheduling vehicles.
Burton, B P; Gopman, D B; Dogan, Gunay; Hood, Sarah
2016-01-01
In previous work, molecular dynamics simulations based on a first-principles-derived effective Hamiltonian for $Pb_{1-X}(Sc_{1/2}Nb_{1/2})O_{3-X}$ (PSN), with nearest-neighbor Pb-O divacancy pairs, were used to calculate $X_{\rm [Pb-O]}$ vs. T phase diagrams for PSN with: ideal rock-salt type chemical order; nanoscale chemical short-range order; and random chemical disorder. Here, we show that the phase diagrams should include additional regions in which a glassy relaxor phase (or state) is predicted. With respect to phase diagram topology, these results strongly support the analogy between relaxors and magnetic spin-glass systems.
Digital terrain model generalization incorporating scale, semantic and cognitive constraints
Partsinevelos, Panagiotis; Papadogiorgaki, Maria
2014-05-01
research scheme comprises the combination of SOM with variations of other widely used generalization algorithms. For instance, an adaptation of the Douglas-Peucker line simplification method to 3D data is used to reduce the initial nodes while maintaining their actual coordinates. Furthermore, additional methods are deployed to corroborate and verify the significance of each node, such as mathematical algorithms exploiting the pixel's nearest neighbors. Finally, besides the quantitative evaluation of error vs. information preservation in a DTM, cognitive inputs from geoscience experts are incorporated in order to test, fine-tune, and advance our algorithm. Under the described strategy, which incorporates mechanical, topological, semantic, and cognitive constraints, results demonstrate the necessity of integrating these characteristics in describing raster DTM surfaces. Acknowledgements: This work is partially supported under the framework of the "Cooperation 2011" project ATLANTAS (11_SYN_6_1937) funded from the Operational Program "Competitiveness and Entrepreneurship" (co-funded by the European Regional Development Fund (ERDF)) and managed by the Greek General Secretariat for Research and Technology.
Directory of Open Access Journals (Sweden)
Fuqian Shi
2012-01-01
Emotional cellular (EC), proposed in our previous works, is a kind of semantic cell that contains a kernel and a shell; the kernel is formalized by a triple L = ⟨P, d, δ⟩, where P denotes a typical set of positive examples relative to the word L, d is a pseudodistance measure on the two-dimensional emotional space (valence-arousal), and δ is a probability density function on the positive real numbers. The basic idea of the EC model is to assume that the neighborhood radius of each semantic concept is uncertain, and that this uncertainty is measured by the one-dimensional density function δ. In this paper, product form features were evaluated using ECs and, to establish the product style database, a fuzzy case-based reasoning (FCBR) model under a defined similarity measurement based on fuzzy nearest neighbors (FNN) incorporating EC was applied to extract product styles. A mathematically formalized inference system for product style was also proposed, which includes the emotional cellular as an uncertainty measurement tool. A case study of style acquisition for mobile phones illustrated the effectiveness of the proposed methodology.
Galaxy/Quasar Classification Based on Nearest Neighbor Method
Institute of Scientific and Technical Information of China (English)
李乡儒; 卢瑜; 周建明; 王永俊
2011-01-01
With the wide application of high-quality CCDs in celestial spectrum imaging and the implementation of many large sky survey programs (e.g., the Sloan Digital Sky Survey (SDSS), the Two-degree-Field Galaxy Redshift Survey (2dF), the Spectroscopic Survey Telescope (SST), the Large Sky Area Multi-Object Fiber Spectroscopic Telescope (LAMOST) program, and the Large Synoptic Survey Telescope (LSST) program), celestial observational data are arriving in enormous volumes. To use them effectively and fully, research on automated processing methods for celestial data is imperative. In the present work, we investigate how to recognize galaxies and quasars from their spectra using the nearest neighbor (NN) method in the original measurement space. Galaxies and quasars are extragalactic objects; they are generally far from Earth, and their spectra are usually contaminated by various kinds of noise, so distinguishing these two types of spectra is a representative problem in automatic spectral classification. Furthermore, the NN method is a benchmark method in pattern recognition and data mining, and its performance is often used as the baseline when new methods are evaluated. In practical terms, the study shows that the recognition rate of the NN method for quasar and normal galaxy spectra is comparable to the best results of more complex methods in the literature, while the NN method requires no classifier training and lends itself to real-time incremental learning and parallel implementation, which matters for the fast processing of massive spectral data. The study therefore has theoretical reference value and some practical value.
Institute of Scientific and Technical Information of China (English)
苏婷; 于洪
2016-01-01
Clustering is a common technique for data analysis and has been widely used in many practical areas. In many applications, however, limitations of data acquisition, misread data, random noise, and similar causes produce missing values, so that real data sets are incomplete, and most clustering methods cannot be applied directly to incomplete data sets. For numerical data, this paper proposes a three-way decision clustering algorithm for incomplete data based on q-nearest neighbors. First, the algorithm finds the q nearest neighbors of each object with missing values and fills each missing value with the average value of those q neighbors. Second, it applies a clustering method based on density peaks to the resulting complete data set to obtain the clusters. Each data object that carries uncertainty within a cluster is assigned to the boundary region of that cluster using three-way decision theory. The three-way decision clustering result is represented with interval sets: a cluster is naturally partitioned into a positive region, a negative region, and a boundary region, which describes soft clustering results better. Experimental results on UCI data sets and synthetic data sets demonstrate the effectiveness of the algorithm.
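The first step of this method, filling each missing value with the mean of the q nearest neighbors, can be sketched as below. The details are assumptions made for illustration only: missing entries are marked with `None`, neighbors are drawn from the fully observed rows, and distance is Euclidean over the attributes the incomplete row does have. The abstract does not specify these choices, and the function name is hypothetical.

```python
import math


def impute_with_q_neighbors(rows, q=2):
    """Fill None entries with the mean of the q nearest complete rows."""
    complete = [r for r in rows if None not in r]
    filled = []
    for row in rows:
        if None not in row:
            filled.append(list(row))
            continue
        # Compare only on the attributes that this row actually has.
        observed = [j for j, v in enumerate(row) if v is not None]

        def dist(other):
            return math.sqrt(sum((row[j] - other[j]) ** 2 for j in observed))

        neighbors = sorted(complete, key=dist)[:q]
        # Assumes at least one complete row exists; otherwise this divides by zero.
        filled.append([
            v if v is not None else sum(n[j] for n in neighbors) / len(neighbors)
            for j, v in enumerate(row)
        ])
    return filled
```

The completed rows could then be passed to any clustering routine; the density-peak clustering and three-way assignment steps are not shown here.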
2003-01-01
We study the anisotropic Heisenberg (XYZ) spin-1/2 chain placed in a magnetic field pointing along the x-axis. We use bosonization and a renormalization group analysis to show that the model has a non-trivial fixed point at a certain value of the XY anisotropy a and the magnetic field h. Hence, there is a line of critical points in the (a,h) plane on which the system is gapless, even though the Hamiltonian has no continuous symmetry. The quantum critical line corresponds to a spin-flop transi...
Liseau, R.; De la Luz, V.; O'Gorman, E.; Bertone, E.; Chavez, M.; Tapia, F.
2016-10-01
Context. The precise mechanisms that provide the nonradiative energy for heating the chromosphere and corona of the Sun and other stars are at the focus of intense contemporary research. Aims: Observations at submm and mm wavelengths are particularly useful to obtain information about the run of the temperature in the upper atmosphere of Sun-like stars. We used the Atacama Large Millimeter/submillimeter Array (ALMA) to study the chromospheric emission of the α Centauri binary system in all six available frequency bands during Cycle 2 in 2014-2015. Methods: Since ALMA is an interferometer, the multitelescope array is particularly suited for the observation of point sources. With its large collecting area, the sensitivity is high enough to allow the observation of nearby main-sequence stars at submm/mm wavelengths for the first time. The comparison of the observed spectral energy distributions with theoretical model computations provides the chromospheric structure in terms of temperature and density above the stellar photosphere and the quantitative understanding of the primary emission processes. Results: Both stars in the α Centauri binary system were detected and resolved at all ALMA frequencies. For both α Cen A and B, the existence and location of the temperature minima, first detected from space with Herschel, are well reproduced by the theoretical models of this paper. The temperature minimum for α Cen B is lower than for A and occurs at a lower height in the atmosphere, but for both stars, Tmin/Teff is consistently lower than what is derived from optical and UV data. In addition, and as a completely different matter, a third point source was detected in Band 8 (405 GHz, 740 μm) in 2015. With only one epoch and only one detection, we are left with little information regarding that object's nature, but we conjecture that it might be a distant solar system object. Conclusions: The submm/mm emission of the α Cen stars is indeed very well reproduced by
Directory of Open Access Journals (Sweden)
Juan Carlos Salcedo-Reyes
2008-09-01
Usually, semiconductor ternary alloys are studied via a pseudo-binary approach in which the semiconductor is described as a crystalline array where the cation/anion sub-lattice consists of a random distribution of the cationic/anionic atoms. However, in the case of reported III-V and II-VI artificial structures, in which an ordering of either the cations or the anions of the respective fcc sub-lattice is involved, a pseudo-binary approach can no longer be employed; an atomistic point of view, which takes into account the local structure, must be used to study the electronic and optical properties of these artificial semiconductor alloys. In particular, the ordered Zn0.5Cd0.5Se alloy has to be described as a crystal with the simple-tetragonal Bravais lattice with a composition equal to the zincblende random ternary alloy. The change of symmetry properties of the tetragonal alloy, in relation to the cubic alloy, results mainly in two effects: (i) reduction of the band gap, and (ii) crystal-field splitting of the valence band maximum. In this work, the electronic band structure of the ordered Zn0.5Cd0.5Se alloy is calculated using a second-nearest-neighbor semi-empirical tight-binding method. It is also compared with the electronic band structure obtained by the FP-LAPW (full-potential linearized augmented-plane wave) method.
Directory of Open Access Journals (Sweden)
Fen Wei
2016-01-01
In order to sufficiently capture the useful fault-related information available in the multiple vibration sensors used in rotation machinery, while concurrently avoiding the introduction of the limitation of dimensionality, a new fault diagnosis method for rotation machinery based on supervised second-order tensor locality preserving projection (SSTLPP) and a weighted k-nearest neighbor classifier (WKNNC) with an assembled matrix distance metric (AMDM) is presented. A second-order tensor representation of multisensor fused conditional features is employed to replace the prevailing vector description of features from a single sensor. Then, an SSTLPP algorithm under AMDM (SSTLPP-AMDM) is presented to realize dimensional reduction of the original high-dimensional feature tensor. Compared with classical second-order tensor locality preserving projection (STLPP), the SSTLPP-AMDM algorithm not only considers both local neighbor information and class label information but also replaces the existing Frobenius distance measure with AMDM for construction of the similarity weighting matrix. Finally, the obtained low-dimensional feature tensor is input into WKNNC with AMDM to implement the fault diagnosis of the rotation machinery. A fault diagnosis experiment performed on a gearbox demonstrates that the second-order tensor formed from multisensor fused fault data gives good results for multisensor fusion fault diagnosis and that the formulated fault diagnosis method can effectively improve diagnostic accuracy.
Ma, Yitao; Miura, Sadahiko; Honjo, Hiroaki; Ikeda, Shoji; Hanyu, Takahiro; Ohno, Hideo; Endoh, Tetsuo
2017-04-01
A high-density nonvolatile associative memory (NV-AM) based on spin transfer torque magnetoresistive random access memory (STT-MRAM), which achieves highly concurrent and ultralow-power nearest neighbor search with full adaptivity of the template data format, has been proposed and fabricated using the 90 nm CMOS/70 nm perpendicular-magnetic-tunnel-junction hybrid process. A truly compact current-mode circuitry is developed to realize flexibly controllable and highly parallel similarity evaluation, which makes the NV-AM adaptable to any dimensionality and component-bit of template data. A compact dual-stage time-domain minimum searching circuit is also developed, which can freely extend the system for more template data by connecting multiple NV-AM cores without additional circuits for integrated processing. Both the embedded STT-MRAM module and the computing circuit modules in this NV-AM chip are synchronously power-gated to completely eliminate standby power and maximally reduce operation power by only activating the currently accessed circuit blocks. The operations of a prototype chip at 40 MHz are demonstrated by measurement. The average operation power is only 130 µW, and the circuit density is less than 11 µm²/bit. Compared with the latest conventional works in both volatile and nonvolatile approaches, more than 31.3% circuit area reductions and 99.2% power improvements are achieved, respectively. Further power performance analyses are discussed, which verify the special superiority of the proposed NV-AM in low-power and large-memory-based VLSIs.
Institute of Scientific and Technical Information of China (English)
沈跃; 徐慧; 刘慧; 李宁
2016-01-01
neighbor. We also developed a novel method of image restoration to reduce the impact of background information, improve the accuracy of the color image segmentation, and enhance the accuracy of the depth data. Firstly, an RGB threshold segmentation algorithm was applied to the original RGB-formatted plant color images to extract plant target areas from backgrounds. The three components R, G, and B were separated from the RGB color space, and the difference between G and R or B was used for the primary extraction of the plant area information. Meanwhile, to account for the color characteristics of the environment, a K-means clustering segmentation algorithm was performed on the extracted plant target areas to remove background noise and enhance target contours. Secondly, to fix the errors of the depth data and meet the requirements of agricultural plant detection operations, the color image and the depth image were registered to restore the depth data of suspicious pixels based on the K-nearest neighbor algorithm. Then, a K-nearest neighbor algorithm was used to recover the black-hole pixels in depth images. Finally, we acquired the depth data of the target plant from the detected images. Compared with the conventional RGB threshold segmentation method and the K-means algorithm method, the proposed method can be used to solve the problem of color image noise. The experimental results showed that the segmentation error can be reduced by 12.12% with the RGB threshold segmentation method and by 41.48% with the K-means algorithm method. The average segmentation error can be up to 12.33% when using RGB threshold segmentation first and then the K-means algorithm. Furthermore, the proposed method can be used to restore the depth data and can significantly reduce the effect of the backgrounds. It thus considerably improved the edge sharpness of the depth data and the accuracy of the empty-point depth data of a single frame. The result of this study can be a reference for agricultural plant detection and 3D reconstruction
Institute of Scientific and Technical Information of China (English)
宋涛; 汤宝平; 李锋
2013-01-01
Considering the disadvantages of conventional fault diagnosis methods for rotating machinery, such as the need for manual intervention, low accuracy, and the difficulty of obtaining fault samples, a fault diagnosis model based on manifold learning and a K-nearest neighbor classifier (KNNC) is proposed. The multi-domain information entropy of the vibration signal is extracted to fully reflect the working status of the equipment and to construct a high-dimensional feature set. The secondary feature extraction property of the nonlinear manifold learning algorithm orthogonal neighborhood preserving embedding (ONPE) is then used for dimensionality reduction, giving the features better clustering properties. Finally, an improved KNNC that is better suited to small-sample classification is used for pattern recognition. A bearing fault diagnosis case demonstrates the effectiveness of the model.
Institute of Scientific and Technical Information of China (English)
倪巍伟; 陈萧; 马中希
2015-01-01
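As people pay increasing attention to personal privacy, privacy protection in location-based services has become an emerging research hotspot in the database field. In privacy-preserving k-nearest-neighbor queries over road networks, protecting location privacy makes it hard to maintain query quality, and queriers need to tune their preference between query efficiency and accuracy. To address these problems, the concept of a PoI (points of interest) probability distribution is introduced; the distribution is generated by analyzing the adjacency relations among PoIs on the server side. The server-side search for the k nearest PoIs is decomposed into a road-network expansion query phase and an iterative replacement phase, and a probability prediction mechanism for replaceable PoIs, based on the PoI probability distribution, is built for the iterative replacement phase. Based on this mechanism, a location-privacy-preserving k-nearest-neighbor query method that supports user preference tuning, AdPriQuery (Adjustable Privacy-preserving k nearest neighbor Query), is proposed; by adjusting a filtering probability threshold, the querier can tune the trade-off between query efficiency and accuracy while keeping location privacy secure. The proposed tuning mechanism is compatible with existing spatial-obfuscation-based location-privacy-preserving nearest-neighbor query methods for road networks. Theoretical analysis and experimental results show that the method effectively improves server-side query efficiency while protecting location privacy, and supports preference tuning between result accuracy and query efficiency.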
Phase transition of p-adic Ising λ-model
Energy Technology Data Exchange (ETDEWEB)
Dogan, Mutlay; Akın, Hasan [Department of Mathematics, Faculty of Education, Zirve University, Gaziantep, TR27260 (Turkey); Mukhamedov, Farrukh [Department of Computational & Theoretical Sciences Faculty of Science, International Islamic University Malaysia P.O. Box, 141, 25710, Kuantan Pahang (Malaysia)
2015-09-18
We consider nearest-neighbor and next-nearest-neighbor interactions for the mixed-type p-adic λ-model with spin values (−1, +1) on a Cayley tree of order two. In previous work we proved the existence of the p-adic Gibbs measure for the model. In this work we prove that a phase transition occurs for the model.
Hefner, B. Todd; Walker, James S.
1999-12-01
Position-space renormalization-group methods are used to derive exact results for an Ising model on a fractal lattice. The model incorporates both nearest-neighbor and long-range interactions. The long-range interactions, which span all length scales on the lattice, can be thought of as resulting from fractal periodic boundary conditions. We present exact phase diagrams and specific heats in terms of these two interactions, and show that a “hall of mirrors” fixed-point imaging mechanism leads to an infinite number of phase transitions.
Using New Approaches to obtain Gibbs Measures of Vannimenus model on a Cayley tree
2015-01-01
In this paper, we consider the Vannimenus model with competing nearest-neighbor and prolonged next-nearest-neighbor interactions on a Cayley tree. For this model we define Markov random fields with memory of length 2. By using a new approach, we obtain new sets of Gibbs measures of the Ising-Vannimenus model on a Cayley tree of order 2. We construct the recurrence equations corresponding to the Ising-Vannimenus model. We prove the Kolmogorov consistency condition. We investigate the translation-invariant an...
Takayama, Tomohiro; Matsumoto, Akiyo; Jackeli, George; Takagi, Hidenori
2016-12-01
We report the analysis of magnetic susceptibility χ (T ) of Sr2IrO4 single crystal in the paramagnetic phase. We formulate the theoretical susceptibility based on isotropic Heisenberg antiferromagnetism incorporating the Dzyaloshinsky-Moriya interaction exactly, and include the interlayer couplings in a mean-field approximation. χ (T ) above TN was found to be well described by the model, indicating the predominant Heisenberg exchange consistent with the microscopic theory. The analysis points to a competition of nearest and next-nearest-neighbor interlayer couplings, which results in the up-up-down-down configuration of the in-plane canting moments identified by the diffraction experiments.
Institute of Scientific and Technical Information of China (English)
周彤; 彭彦昆; 刘媛媛
2014-01-01
consisted of an image acquisition device, a light, a single-chip microcomputer, a detection control button, a computer, and an image processing algorithm built into the self-developed system software. A black background plate was placed behind the pig carcass to cope with the complexity of the environment. When a half carcass reached the camera view, the operator pressed the control button to acquire images of the carcass, and the collected images were automatically stored in the computer for further image processing. The algorithm consisted of two parts: detection of the backfat part and location of the measurement position. Methods such as image segmentation, feature point detection, and flood fill were adapted to extract the backfat part. The measurement position was determined as follows. First, the region of interest (ROI) was obtained; in this step, the rib area was extracted from the pig carcass. A floating window was then used to scan the whole ROI image. The size of the scanning window was 20×1, and the scanning direction was from top to bottom in each line of the image. The average gray value in each scanning window was calculated to obtain the distribution of the average gray value in each column. The feature points of the ribs were extracted from the characteristics of the average gray level line in each column of the ROI image. Next, points on the sixth and seventh ribs were clustered using a nearest neighbor clustering algorithm. The points of each column were averaged to give new feature points between the sixth and seventh ribs. The horizontal and vertical coordinates of the known point were the averages of the new feature points. Finally, we extracted the measuring line using the Passing a Known Point Hough Transform (PKPHT): the slope between two points belonging to the same line was calculated and voted into the slope accumulator, and the peak of the slope accumulator corresponded to the slope of the line to be detected.
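The slope-accumulator voting behind a Hough transform that passes through a known point can be sketched as follows. This is an illustrative reading of the description, not the authors' implementation: the function name is hypothetical, and angles binned in degrees are voted instead of raw slopes (an assumption made here so that vertical lines do not produce infinite slopes).

```python
import math
from collections import Counter


def line_angle_through_known_point(known, points, bin_deg=1.0):
    """Estimate the direction (degrees in [0, 180)) of the line through `known`.

    Each feature point votes for the direction of the segment joining it to
    the known point; the most-voted bin is taken as the line direction.
    """
    kx, ky = known
    n_bins = int(round(180.0 / bin_deg))
    votes = Counter()
    for x, y in points:
        if (x, y) == (kx, ky):
            continue  # the known point itself defines no direction
        angle = math.degrees(math.atan2(y - ky, x - kx)) % 180.0
        votes[round(angle / bin_deg) % n_bins] += 1
    # Assumes at least one feature point differs from the known point.
    peak_bin, _ = votes.most_common(1)[0]
    return peak_bin * bin_deg
```

Because one point of the line is fixed in advance, the accumulator is one-dimensional, which keeps the voting much cheaper than a full two-parameter Hough transform.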
Backfat thickness can
Institute of Scientific and Technical Information of China (English)
杨帆; 林琛; 周绮凤; 符长虹; 罗林开
2012-01-01
Random forests (RF) have been widely used in bioinformatics, especially in cancer diagnosis. This paper studies the classification mechanism of RF from the viewpoint of adaptive k-nearest neighbors, analyzes the information loss in RF, and on that basis proposes a new voting mechanism called the RF-based potential nearest neighbor algorithm (RF-PN), which makes full use of the information in the OOB samples of each decision tree and significantly improves the classification performance of RF. Comparative experiments on 6 cancer gene expression datasets show that RF-PN achieves better classification accuracy than the original RF algorithm.
Incorporating groundwater flow into the WEPP model
William Elliot; Erin Brooks; Tim Link; Sue Miller
2010-01-01
The water erosion prediction project (WEPP) model is a physically-based hydrology and erosion model. In recent years, the hydrology prediction within the model has been improved for forest watershed modeling by incorporating shallow lateral flow into watershed runoff prediction. This has greatly improved WEPP's hydrologic performance on small watersheds with...
Institute of Scientific and Technical Information of China (English)
刘治理; 马光文; 严秉忠
2005-01-01
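This paper introduces the basic idea and implementation algorithm of the NNB-RBFN model. The forecasting performance of the model is verified on a daily runoff example and compared with the forecasts of the nearest neighbor sampling regression model, with good results.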
Institute of Scientific and Technical Information of China (English)
郑启富; 张有正; 朱益民
2003-01-01
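Because time-series variables are difficult to predict accurately, this paper combines the nearest-neighbor idea with a radial basis function network to propose a new forecasting method, and applies it to the prediction of oil production, with good results.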
Improving Relevance Feedback in Image Retrieval by Incorporating Unlabelled Images
Directory of Open Access Journals (Sweden)
Guizhi Li
2013-07-01
In content-based image retrieval, relevance feedback (RF) schemes based on the support vector machine (SVM) have been widely used to narrow the semantic gap between low-level visual features and high-level human perception. However, the performance of image retrieval with SVM active learning is known to be poor when the training data are insufficient. In this paper, the problem is addressed by incorporating unlabelled images into the learning process. We propose a semi-supervised active learning algorithm that uses not only labelled training samples but also unlabelled ones to build better models. In relevance feedback, an active learning algorithm is often used to reduce the cost of labelling by selecting only the most informative data. In addition, we introduce a semi-supervised approach that employs a nearest-neighbor technique to label unlabelled samples with a certain degree of uncertainty in their class information. Using these samples, a fuzzy support vector machine (FSVM), which takes into account the fuzzy nature of some training samples, is trained. We compared our method with standard active SVM on a database of 10,000 images; the experimental results show that the efficiency of SVM active learning can be improved by incorporating unlabelled images, which improves the overall retrieval performance.
Incorporating immigrant flows into microsimulation models.
Duleep, Harriet Orcutt; Dowhan, Daniel J
2008-01-01
Building on the research on immigrant earnings reviewed in the first article of this series, "Research on Immigrant Earnings," the preceding article, "Adding Immigrants to Microsimulation Models," linked research results to various issues essential for incorporating immigrant earnings into microsimulation models. The discussions of that article were in terms of a closed system. That is, it examined a system in which immigrant earnings and emigration are forecast for a given population represented in the base sample in the microsimulation model. This article, the last in the series, addresses immigrant earnings projections for open systems--microsimulation models that include projections of future immigration. The article suggests a simple method to project future immigrants and their earnings. Including the future flow of immigrants in microsimulation models can dramatically affect the projected Social Security benefits of some groups.
Incorporating neurophysiological concepts in mathematical thermoregulation models
Kingma, Boris R. M.; Vosselman, M. J.; Frijns, A. J. H.; van Steenhoven, A. A.; van Marken Lichtenbelt, W. D.
2014-01-01
Skin blood flow (SBF) is a key player in human thermoregulation during mild thermal challenges. Various numerical models of SBF regulation exist. However, none explicitly incorporates the neurophysiology of thermal reception. This study tested a new SBF model that is in line with experimental data on thermal reception and the neurophysiological pathways involved in thermoregulatory SBF control. Additionally, a numerical thermoregulation model was used as a platform to test the function of the neurophysiological SBF model for skin temperature simulation. The prediction-error of the SBF-model was quantified by root-mean-squared-residual (RMSR) between simulations and experimental measurement data. Measurement data consisted of SBF (abdomen, forearm, hand), core and skin temperature recordings of young males during three transient thermal challenges (1 development and 2 validation). Additionally, ThermoSEM, a thermoregulation model, was used to simulate body temperatures using the new neurophysiological SBF-model. The RMSR between simulated and measured mean skin temperature was used to validate the model. The neurophysiological model predicted SBF with an accuracy of RMSR temperature. This study shows that (1) thermal reception and neurophysiological pathways involved in thermoregulatory SBF control can be captured in a mathematical model, and (2) human thermoregulation models can be equipped with SBF control functions that are based on neurophysiology without loss of performance. The neurophysiological approach in modelling thermoregulation is favourable over engineering approaches because it is more in line with the underlying physiology.
Institute of Scientific and Technical Information of China (English)
杨宇; 曾鸣; 程军圣
2013-01-01
A rolling bearing fault diagnosis approach is proposed based on local characteristic-scale decomposition (LCD) and the kernel nearest neighbor convex hull (KNNCH) classification algorithm. Using LCD, an original rolling bearing vibration signal is adaptively decomposed into a number of intrinsic scale components (ISCs), and an initial feature vector matrix is automatically formed from these components. Singular value decomposition is then applied to the initial feature vector matrix, and the resulting singular values are taken as the fault feature vector. Finally, the KNNCH classifier takes the fault feature vector as input, and the working condition and fault pattern of the rolling bearing are identified from the output of the classifier. LCD is a new adaptive time-frequency analysis method that is well suited to processing non-stationary signals. KNNCH is a kernel-based pattern recognition approach that combines convex hull estimation with the nearest neighbor classification rule. In contrast to the support vector machine (SVM) algorithm, KNNCH can be applied directly to multi-class tasks, and the only parameter that needs to be optimized is the kernel parameter. The analysis of rolling bearing vibration signals shows that the proposed approach can effectively extract the fault feature information and accurately classify the working conditions and fault patterns of rolling bearings even with small samples. Moreover, the comparative analysis demonstrates that KNNCH gives more stable classification performance than SVM.
Incorporation of RAM techniques into simulation modeling
Energy Technology Data Exchange (ETDEWEB)
Nelson, S.C. Jr.; Haire, M.J.; Schryver, J.C.
1995-07-01
This work concludes that reliability, availability, and maintainability (RAM) analytical techniques can be incorporated into computer network simulation modeling to yield an important new analytical tool. This paper describes the incorporation of failure and repair information into network simulation to build a stochastic computer model that represents the RAM performance of two vehicles being developed for the US Army: the Advanced Field Artillery System (AFAS) and the Future Armored Resupply Vehicle (FARV). The AFAS is the US Army's next generation self-propelled cannon artillery system. The FARV is a resupply vehicle for the AFAS. Both vehicles utilize automation technologies to improve the operational performance of the vehicles and reduce manpower. The network simulation model used in this work is task based. The model programmed in this application represents a typical battle mission and the failures and repairs that occur during that battle. Each task that the FARV performs--upload, travel to the AFAS, refuel, perform tactical/survivability moves, return to logistic resupply, etc.--is modeled. Such a model reproduces operational phenomena (e.g., failures and repairs) that are likely to occur in actual performance. Simulation tasks are modeled as discrete chronological steps; after the completion of each task, programmed decisions determine the next path to be followed. The result is a complex logic diagram or network. The network simulation model is developed within a hierarchy of vehicle systems, subsystems, and equipment and includes failure management subnetworks. RAM information and other performance measures that have an impact on design requirements are collected. Design changes are evaluated through "what if" questions, sensitivity studies, and battle scenario changes.
Schmitz, Christoph; Grolms, Norman; Hof, Patrick R; Boehringer, Robert; Glaser, Jacob; Korr, Hubert
2002-09-01
Prenatal X-irradiation, even at doses <1 Gy, can induce spatial disarray of neurons in the brains of offspring, possibly due to disturbed neuronal migration. Here we analyze the effects of prenatal low-dose X-irradiation using a novel stereological method designed to investigate the three-dimensional (3D) spatial arrangement of neurons in thick sections. Pregnant mice were X-irradiated with 50 cGy on embryonic day 13 or were sham-irradiated. The right brain halves of their 180-day-old offspring were dissected into entire series of 150 microm thick frontal cryostat sections and stained with gallocyanin. Approximately 700 layer V pyramidal cells per animal were sampled in a systematic-random manner in the middle of the section's thickness. The x-y-z coordinates of these 'parent neurons' were recorded, as well as of all neighboring (up to 10) 'offspring neurons' close to each 'parent neuron'. From these data, the nearest neighbor distance (NND) distributions for layer V pyramidal cells were calculated. Using this novel 3D analysis method, we found that, in comparison to controls, prenatal X-irradiation had no effect on the total neuron number, but did cause a reduction in the mean volume of layer V by 26.5% and a more dispersed spatial arrangement of these neurons. Considering the recent literature, it seems reasonable to consider abnormal neuronal migration as the potential basic cause of this finding.
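The nearest neighbor distance (NND) computation mentioned above can be illustrated with a brute-force sketch. It is an assumption-laden simplification: the study samples 'parent' and 'offspring' neurons stereologically, whereas this hypothetical `nearest_neighbor_distances` helper simply compares every recorded point with every other, and the coordinates are invented.

```python
import math


def nearest_neighbor_distances(points):
    """For each 3D point, return the Euclidean distance to its closest other point."""
    distances = []
    for i, p in enumerate(points):
        best = min(math.dist(p, q) for j, q in enumerate(points) if j != i)
        distances.append(best)
    return distances


# Hypothetical x-y-z coordinates in micrometres (illustrative only).
cells = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (3.0, 4.0, 12.0), (0.0, 0.0, 1.0)]
nnd = nearest_neighbor_distances(cells)
mean_nnd = sum(nnd) / len(nnd)
print(nnd, mean_nnd)
```

A larger mean NND, or a distribution shifted toward longer distances, corresponds to a more dispersed spatial arrangement.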
A Solvable Decorated Ising Lattice Model
Institute of Scientific and Technical Information of China (English)
无
2006-01-01
A decorated lattice is suggested and the Ising model on it with three kinds of interactions K1, K2, and K3 is studied. Using an equivalent transformation, the square decorated Ising lattice is transformed into a regular square Ising lattice with nearest-neighbor, next-nearest-neighbor, and four-spin interactions, and the critical fixed point is found at K1 = 0.5769, K2 = -0.0671, and K3 = 0.3428, which determines the critical temperature of the system. It is also found that this system, the regular square Ising lattice, and the eight-vertex model belong to the same universality class.
Incorporating infiltration modelling in urban flood management
Directory of Open Access Journals (Sweden)
A. S. Jumadar
2008-06-01
Increasing frequency and intensity of flood events in urban areas can be linked to increase in impervious area due to urbanization, exacerbated by climate change. The established approach of conveying storm water by conventional drainage systems has contributed to magnification of runoff volume and peak flows beyond those of undeveloped catchments. Furthermore, the continuous upgrading of such conventional systems is costly and unsustainable in the long term. Sustainable drainage systems (SuDS) aim at addressing the adverse effects associated with conventional systems by mimicking the natural drainage processes, encouraging infiltration and storage of storm water. In this study we model one of the key components of SuDS, the infiltration basins, in order to assess the benefits of the approach. Infiltration modelling was incorporated in the detention storage unit within the one-dimensional urban storm water management model, EPA-SWMM 5.0. By introduction of infiltration modelling in the storage, the flow attenuation performance of the unit was considerably improved. The study also examines the catchment scale impact of both source and regional control storage/infiltration systems. Based on the findings of two case study areas modelled with the proposed options, it was observed that source control systems have a greater and much more natural impact at a catchment level, with respect to flow attenuation, compared to regional control systems whose capacity is equivalent to the sum of source control capacity at the catchment.
The Islands Approach to Nearest Neighbor Querying in Spatial Networks
DEFF Research Database (Denmark)
Huang, Xuegang; Jensen, Christian Søndergaard; Saltenis, Simonas
2005-01-01
Much research has recently been devoted to the data management foundations of location-based mobile services. In one important scenario, the service users are constrained to a transportation network. As a result, query processing in spatial road networks is of interest. We propose a versatile app...
Clustered K nearest neighbor algorithm for daily inflow forecasting
Akbari, M.; Van Overloop, P.J.A.T.M.; Afshar, A.
2010-01-01
Instance based learning (IBL) algorithms are a common choice among data driven algorithms for inflow forecasting. They are based on the similarity principle and prediction is made by the finite number of similar neighbors. In this sense, the similarity of a query instance is estimated according to
Nearest Neighbor Classification Using a Density Sensitive Distance Measurement
2009-09-01
Standards and Technology (NIST) (MNIST Handwritten Digit Database, Yann LeCun and Corinna Cortes .). The MNIST database was constructed from NIST’s...the generalised distance in statistics Proceedings of the National Institute of Sciences of India , 2(1), 49–55. MNIST Handwritten Digit Database...Yann LeCun and Corinna Cortes . Retrieved 9/28/2009 from http://yann.lecun.com/exdb/mnist/ OpenCV 1.1 (2008). Open Computer Vision Library Downloads
MOST Observations of Our Nearest Neighbor: Flares on Proxima Centauri
Davenport, James R. A.; Kipping, David M.; Sasselov, Dimitar; Matthews, Jaymie M.; Cameron, Chris
2016-10-01
We present a study of white-light flares from the active M5.5 dwarf Proxima Centauri using the Canadian microsatellite Microvariability and Oscillations of STars. Using 37.6 days of monitoring data from 2014 to 2015, we have detected 66 individual flare events, the largest number of white-light flares observed to date on Proxima Cen. Flare energies in our sample range from $10^{29}$ to $10^{31.5}$ erg. The flare rate is lower than that of other classic flare stars of a similar spectral type, such as UV Ceti, which may indicate Proxima Cen had a higher flare rate in its youth. Proxima Cen does have an unusually high flare rate given its slow rotation period, however. Extending the observed power-law occurrence distribution down to $10^{28}$ erg, we show that flares with flux amplitudes of 0.5% occur 63 times per day, while superflares with energies of $10^{33}$ erg occur ∼8 times per year. Small flares may therefore pose a great difficulty in searches for transits from the recently announced 1.27 M⊕ Proxima b, while frequent large flares could have significant impact on the planetary atmosphere.
MOST Observations of our Nearest Neighbor: Flares on Proxima Centauri
Davenport, James R A; Sasselov, Dimitar; Matthews, Jaymie M; Cameron, Chris
2016-01-01
We present a study of white light flares from the active M5.5 dwarf Proxima Centauri using the Canadian microsatellite MOST. Using 37.6 days of monitoring data from 2014 and 2015, we have detected 66 individual flare events, the largest number of white light flares observed to date on Proxima Cen. Flare energies in our sample range from $10^{29}$-$10^{31.5}$ erg, with complex, multi-peaked structure found in 22% of these events. The flare rate is lower than that of other classic flare stars of similar spectral type, such as UV Ceti, which may indicate Proxima Cen had a higher flare rate in its youth. Proxima Cen does have an unusually high flare rate given the slow reported rotation period, however. Extending the observed power-law occurrence distribution down to $10^{28}$ erg, we show that flares with flux amplitudes of 0.5% occur 63 times per day, while superflares with energies of $10^{33}$ erg occur ~8 times per year. Small flares may therefore pose a great difficulty in searches for transits from the rec...
Incorporation of salinity in Water Availability Modeling
Wurbs, Ralph A.; Lee, Chihun
2011-10-01
Natural salt pollution from geologic formations in the upper watersheds of several large river basins in the Southwestern United States severely constrains the use of otherwise available major water supply sources. The Water Rights Analysis Package modeling system has been routinely applied in Texas since the late 1990s in regional and statewide planning studies and administration of the state's water rights permit system, but without consideration of water quality. The modeling system was recently expanded to incorporate salinity considerations in assessments of river/reservoir system capabilities for supplying water for environmental, municipal, agricultural, and industrial needs. Salinity loads and concentrations are tracked through systems of river reaches and reservoirs to develop concentration frequency statistics that augment flow frequency and water supply reliability metrics at pertinent locations for alternative water management strategies. Flexible generalized capabilities are developed for using limited observed salinity data to model highly variable concentrations imposed upon complex river regulation infrastructure and institutional water allocation/management practices.
Institute of Scientific and Technical Information of China (English)
GE Rong-Chun; LI Chuan-Feng; GUO Guang-Can
2012-01-01
We investigate the dynamics of entanglement, quantum correlation and classical correlation for the one-dimensional XY model in a transverse magnetic field. With the initial state polarized along the z axis, we find that the first maximum of the classical correlation between the nearest neighbor sites peaks around the critical point for large anisotropy parameter. It may indicate the quantum phase transition. For all kinds of correlation, we find that their behaviors between the nearest neighbor sites are significantly different from those of the next-nearest-neighbor sites.
Self-organized Criticality in an Earthquake Model on Random Network
Institute of Scientific and Technical Information of China (English)
无
2006-01-01
A simplified Olami-Feder-Christensen model on a random network has been studied. We propose a new toppling rule: when an unstable site topples, its energy is redistributed to its nearest neighbors in random rather than equal shares. The simulation results indicate that the model displays self-organized criticality when the system is conservative, and the avalanche size probability distribution of the system obeys finite size scaling. When the system is nonconservative, the model does not display scaling behavior. Simulation results of our model with different numbers of nearest neighbors q are also compared, which indicates that the spatial topology does not alter the critical behavior of the system.
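A single toppling step with random redistribution might look like the sketch below. It is only an illustration under stated assumptions: the abstract does not give the exact rule, so the `conservation` parameter (fraction of the toppled energy passed on, with 1.0 meaning conservative), the uniform random weights, and the small star-shaped network are all hypothetical choices.

```python
import random


def topple(energy, neighbors, site, rng, conservation=1.0):
    """Reset `site` to zero and hand its energy to its neighbors in random shares.

    The shares are random weights normalised to sum to one, so the neighbors
    receive `conservation` times the toppled energy in total (assumed rule).
    """
    released = energy[site] * conservation
    energy[site] = 0.0
    # 1.0 - rng.random() lies in (0, 1], so the weight total is never zero.
    weights = [1.0 - rng.random() for _ in neighbors[site]]
    total = sum(weights)
    for nb, w in zip(neighbors[site], weights):
        energy[nb] += released * w / total


# Hypothetical star-shaped network: site 0 is linked to sites 1, 2 and 3.
neighbors = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
energy = {0: 1.2, 1: 0.1, 2: 0.2, 3: 0.3}
before = sum(energy.values())
topple(energy, neighbors, 0, random.Random(0))
after = sum(energy.values())
print(before, after, energy)
```

With `conservation=1.0` the total energy is unchanged by a toppling; a value below 1.0 models the nonconservative case in which part of the energy is dissipated.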
A Dynamical Model with Next-Nearest Interaction in Relative Velocity
Li, Zhipeng; Liu, Yuncai; Liu, Fuqiang
By introducing the velocity difference between the preceding car and the car before the preceding one into the optimal velocity model (OVM), we present an extended dynamical model which takes into account the next-nearest-neighbor interaction in relative velocity. The stability condition of this model is derived by considering a small perturbation around the uniform flow solution and the validity of our theoretical analysis is also confirmed by direct simulations. The analytic and simulation results indicate that traffic congestion is suppressed efficiently by incorporating the effect of new consideration. Moreover, the effect of the new consideration is investigated by numerical simulation. In particular, the jamming flow, the current-density relation, and the propagation speed of small disturbance are examined in detail by varying various values of the parameter.
Liu, Guisen; Cheng, Xi; Wang, Jian; Chen, Kaiguo; Shen, Yao
2017-03-02
Prediction of Peierls stress associated with dislocation glide is of fundamental concern in understanding and designing the plasticity and mechanical properties of crystalline materials. Here, we develop a nonlocal semi-discrete variational Peierls-Nabarro (SVPN) model by incorporating the nonlocal atomic interactions into the semi-discrete variational Peierls framework. The nonlocal kernel is simplified by limiting the nonlocal atomic interaction in the nearest neighbor region, and the nonlocal coefficient is directly computed from the dislocation core structure. Our model is capable of accurately predicting the displacement profile, and the Peierls stress, of planar-extended core dislocations in face-centered cubic structures. Our model could be extended to study more complicated planar-extended core dislocations, such as {111} dislocations in Al-based and Ti-based intermetallic compounds.
Off-lattice model for the phase behavior of lipid-cholesterol bilayers
DEFF Research Database (Denmark)
Nielsen, Morten; Miao, Ling; Ipsen, John Hjorth
1999-01-01
Lipid bilayers exhibit a phase behavior that involves two distinct, but coupled, order-disorder processes, one in terms of lipid-chain crystalline packing (translational degrees of freedom) and the other in terms of lipid-chain conformational ordering (internal degrees of freedom). Experiments...... and previous approximate theories have suggested that cholesterol incorporated into lipid bilayers has different microscopic effects on lipid-chain packing and conformations and that cholesterol thereby leads to decoupling of the two ordering processes, manifested by a special equilibrium phase, "liquid......-lattice model based on a two-dimensional random triangulation algorithm and represents lipid and cholesterol molecules by hard-core particles with internal (spin-type) degrees of freedom that have nearest-neighbor interactions. The phase equilibria described by the model, specifically in terms of phase diagrams...
Emergent phases in the spin orbit coupled spin-1 Bose Hubbard model
Natu, Stefan; Pixley, Jedediah
2015-05-01
Motivated by recent experiments on spin orbit coupled, ultra-cold Bose gases, we theoretically study the spin-1 Bose Hubbard model in the presence and absence of spin orbit coupling (SOC). In the absence of SOC, using a spatially homogenous Gutzwiller mean field theory, we determine the phase diagram and excitation spectrum of the spin-1 Bose Hubbard model on a hyper-cubic lattice in both the polar and ferromagnetic phases. We focus on the evolution of various density, spin, and nematic order parameters across the phase diagram as a function of chemical potential and nearest neighbor hopping. We then generalize the Gutzwiller mean-field theory to incorporate spin-orbit coupling by allowing the mean-fields to be spatially inhomogeneous, which enable us to study spontaneous translational symmetry broken phases. To connect with ongoing experiments, we focus on the lattice generalization of the experimentally realized 1D spin-orbit coupling.
Incorporating direct marketing activity into latent attrition models
Schweidel, David A.; Knox, George
2013-01-01
When defection is unobserved, latent attrition models provide useful insights about customer behavior and accurate forecasts of customer value. Yet extant models ignore direct marketing efforts. Response models incorporate the effects of direct marketing, but because they ignore latent attrition,
Modeling the amide I bands of small peptides
Jansen, Thomas la Cour; Dijkstra, Arend G.; Watson, Tim M.; Hirst, Jonathan D.; Knoester, Jasper
2006-01-01
In this paper different floating oscillator models for describing the amide I band of peptides and proteins are compared with density functional theory (DFT) calculations. Models for the variation of the frequency shifts of the oscillators and the nearest-neighbor coupling between them with respect
Outdoor-indoor Space: Unified Modeling and Shortest Path Search
DEFF Research Database (Denmark)
Jensen, Søren Kejser; Nielsen, Jens Thomas Vejlby; Lu, Hua;
2016-01-01
Graph models are widely used for representing the topology of outdoor space (O-Space) and indoor space (I-Space). However, existing models neglect the intersection between O-Space and I-Space, only allowing for computations such as shortest path and nearest neighbor queries in either O-Space or I...
Random non-Hermitian tight-binding models
Marinello, G.; Pato, M. P.
2016-08-01
For a one-dimensional system, tight-binding models are described by sparse tridiagonal matrices that encode interactions between nearest neighbors. In this report, we construct open and closed random tight-binding models based on the tridiagonal matrices of the so-called β-ensembles of random matrix theory.
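The tridiagonal structure described above can be sketched in a few lines. This is a generic construction, not the β-ensemble matrices of the report: the helper name `tight_binding_matrix`, its arguments, and the numeric values are hypothetical. Separate `upper` and `lower` hopping lists are allowed so that the matrix need not be Hermitian, and `closed=True` adds the two corner entries that link the last site back to the first.

```python
def tight_binding_matrix(onsite, upper, lower, closed=False):
    """Build a nearest-neighbor tight-binding matrix as a list of rows.

    onsite[i] is the diagonal entry, upper[i] couples site i to site i + 1,
    and lower[i] couples site i + 1 back to site i. A closed chain needs one
    extra hopping entry for the bond between the last and the first site.
    """
    n = len(onsite)
    bonds = n if closed else n - 1
    if len(upper) != bonds or len(lower) != bonds:
        raise ValueError(f"expected {bonds} hopping entries")
    h = [[0.0] * n for _ in range(n)]
    for i in range(n):
        h[i][i] = onsite[i]
    for i in range(n - 1):
        h[i][i + 1] = upper[i]
        h[i + 1][i] = lower[i]
    if closed and n > 2:
        h[n - 1][0] = upper[n - 1]  # last site hops forward to the first
        h[0][n - 1] = lower[n - 1]  # first site hops back to the last
    return h


open_chain = tight_binding_matrix([0.0] * 4, [1.0] * 3, [2.0] * 3)
closed_chain = tight_binding_matrix([0.0] * 4, [1.0] * 4, [2.0] * 4, closed=True)
for row in closed_chain:
    print(row)
```

The open chain is strictly tridiagonal, while the closed chain differs only by the two corner elements.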
Multiplicative earthquake likelihood models incorporating strain rates
Rhoades, D. A.; Christophersen, A.; Gerstenberger, M. C.
2017-01-01
We examine the potential for strain-rate variables to improve long-term earthquake likelihood models. We derive a set of multiplicative hybrid earthquake likelihood models in which cell rates in a spatially uniform baseline model are scaled using combinations of covariates derived from earthquake catalogue data, fault data, and strain-rates for the New Zealand region. Three components of the strain rate estimated from GPS data over the period 1991-2011 are considered: the shear, rotational and dilatational strain rates. The hybrid model parameters are optimised for earthquakes of M 5 and greater over the period 1987-2006 and tested on earthquakes from the period 2012-2015, which is independent of the strain rate estimates. The shear strain rate is overall the most informative individual covariate, as indicated by Molchan error diagrams as well as multiplicative modelling. Most models including strain rates are significantly more informative than the best models excluding strain rates in both the fitting and testing period. A hybrid that combines the shear and dilatational strain rates with a smoothed seismicity covariate is the most informative model in the fitting period, and a simpler model without the dilatational strain rate is the most informative in the testing period. These results have implications for probabilistic seismic hazard analysis and can be used to improve the background model component of medium-term and short-term earthquake forecasting models.
Exact solutions of the high dimensional hard-core Fermi-Hubbard model
Institute of Scientific and Technical Information of China (English)
潘峰; 戴连荣
2001-01-01
A simple algebraic approach to exact solutions of the hard-core Fermi-Hubbard model is proposed. Excitation energies and the corresponding wavefunctions of the hard-core Fermi-Hubbard model with nearest neighbor hopping cases in high dimension are obtained by using this method, which manifests that the model is exactly solvable in any dimension.
A Financial Market Model Incorporating Herd Behaviour.
Wray, Christopher M; Bishop, Steven R
2016-01-01
Herd behaviour in financial markets is a recurring phenomenon that exacerbates asset price volatility, and is considered a possible contributor to market fragility. While numerous studies investigate herd behaviour in financial markets, it is often considered without reference to the pricing of financial instruments or other market dynamics. Here, a trader interaction model based upon informational cascades in the presence of information thresholds is used to construct a new model of asset price returns that allows for both quiescent and herd-like regimes. Agent interaction is modelled using a stochastic pulse-coupled network, parametrised by information thresholds and a network coupling probability. Agents may possess either one or two information thresholds that, in each case, determine the number of distinct states an agent may occupy before trading takes place. In the case where agents possess two thresholds (labelled as the finite state-space model, corresponding to agents' accumulating information over a bounded state-space), and where coupling strength is maximal, an asymptotic expression for the cascade-size probability is derived and shown to follow a power law when a critical value of network coupling probability is attained. For a range of model parameters, a mixture of negative binomial distributions is used to approximate the cascade-size distribution. This approximation is subsequently used to express the volatility of model price returns in terms of the model parameter which controls the network coupling probability. In the case where agents possess a single pulse-coupling threshold (labelled as the semi-infinite state-space model corresponding to agents' accumulating information over an unbounded state-space), numerical evidence is presented that demonstrates volatility clustering and long-memory patterns in the volatility of asset returns. Finally, output from the model is compared to both the distribution of historical stock returns and the market
Incorporating Resilience into Dynamic Social Models
2016-07-20
resiliency, computational modeling, computational social science/systems, modeling and simulation...system. The relationships between random variables are given as conditional probability rules. BKBs are represented as a directed graph with...and BKB inferencing methods can be found in Santos et al. [20]. 4.1. BKB Definition and Inferencing. A BKB is a directed, bipartite graph consisting
Incorporating evolutionary processes into population viability models.
Pierson, Jennifer C; Beissinger, Steven R; Bragg, Jason G; Coates, David J; Oostermeijer, J Gerard B; Sunnucks, Paul; Schumaker, Nathan H; Trotter, Meredith V; Young, Andrew G
2015-06-01
We examined how ecological and evolutionary (eco-evo) processes in population dynamics could be better integrated into population viability analysis (PVA). Complementary advances in computation and population genomics can be combined into an eco-evo PVA to offer powerful new approaches to understand the influence of evolutionary processes on population persistence. We developed the mechanistic basis of an eco-evo PVA using individual-based models with individual-level genotype tracking and dynamic genotype-phenotype mapping to model emergent population-level effects, such as local adaptation and genetic rescue. We then outline how genomics can allow or improve parameter estimation for PVA models by providing genotypic information at large numbers of loci for neutral and functional genome regions. As climate change and other threatening processes increase in rate and scale, eco-evo PVAs will become essential research tools to evaluate the effects of adaptive potential, evolutionary rescue, and locally adapted traits on persistence.
Incorporating 3-dimensional models in online articles
Cevidanes, Lucia H. S.; Ruellas, Antonio C. O.; Jomier, Julien; Nguyen, Tung; Pieper, Steve; Budin, Francois; Styner, Martin; Paniagua, Beatriz
2015-01-01
Introduction: The aims of this article were to introduce the capability to view and interact with 3-dimensional (3D) surface models in online publications, and to describe how to prepare surface models for such online 3D visualizations. Methods: Three-dimensional image analysis methods include image acquisition, construction of surface models, registration in a common coordinate system, visualization of overlays, and quantification of changes. Cone-beam computed tomography scans were acquired as volumetric images that can be visualized as 3D projected images or used to construct polygonal meshes or surfaces of specific anatomic structures of interest. The anatomic structures of interest in the scans can be labeled with color (3D volumetric label maps), and then the scans are registered in a common coordinate system using a target region as the reference. The registered 3D volumetric label maps can be saved in .obj, .ply, .stl, or .vtk file formats and used for overlays, quantification of differences in each of the 3 planes of space, or color-coded graphic displays of 3D surface distances. Results: All registered 3D surface models in this study were saved in .vtk file format and loaded in the Elsevier 3D viewer. In this study, we describe possible ways to visualize the surface models constructed from cone-beam computed tomography images using 2D and 3D figures. The 3D surface models are available in the article’s online version for viewing and downloading using the reader’s software of choice. These 3D graphic displays are represented in the print version as 2D snapshots. Overlays and color-coded distance maps can be displayed using the reader’s software of choice, allowing graphic assessment of the location and direction of changes or morphologic differences relative to the structure of reference. The interpretation of 3D overlays and quantitative color-coded maps requires basic knowledge of 3D image analysis. Conclusions: When submitting manuscripts, authors can
Dynamical phase transitions in the two-dimensional ANNNI model
Energy Technology Data Exchange (ETDEWEB)
Barber, M.N.; Derrida, B.
1988-06-01
We study the phase diagram of the two-dimensional anisotropic next-nearest neighbor Ising (ANNNI) model by comparing the time evolution of two distinct spin configurations submitted to the same thermal noise. We clearly see several dynamical transitions between ferromagnetic, paramagnetic, antiphase, and floating phases. These dynamical transitions seem to occur rather close to the transition lines determined previously in the literature.
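The idea of evolving two spin configurations under the same thermal noise can be sketched as below. This is a deliberately simplified illustration, not the authors' simulation: it assumes a plain nearest-neighbor Ising model with unit coupling on a small periodic square lattice and heat-bath updates, rather than the ANNNI model with competing next-nearest-neighbor couplings, and all function names and parameter values are hypothetical.

```python
import math
import random


def heat_bath_sweep(spins, size, temperature, noise):
    """Update every site of a size x size periodic lattice in place.

    One pre-drawn random number per site is consumed, so two replicas fed
    the same `noise` list experience identical thermal noise.
    """
    for idx in range(size * size):
        x, y = divmod(idx, size)
        field = (spins[((x + 1) % size) * size + y]
                 + spins[((x - 1) % size) * size + y]
                 + spins[x * size + (y + 1) % size]
                 + spins[x * size + (y - 1) % size])
        p_up = 1.0 / (1.0 + math.exp(-2.0 * field / temperature))
        spins[idx] = 1 if noise[idx] < p_up else -1


def damage(a, b):
    """Fraction of sites on which the two configurations differ."""
    return sum(1 for s, t in zip(a, b) if s != t) / len(a)


def damage_after(a, b, size, temperature, sweeps, seed):
    """Evolve copies of a and b with shared noise and return their final distance."""
    rng = random.Random(seed)
    a, b = list(a), list(b)
    for _ in range(sweeps):
        noise = [rng.random() for _ in range(size * size)]
        heat_bath_sweep(a, size, temperature, noise)
        heat_bath_sweep(b, size, temperature, noise)
    return damage(a, b)


SIZE = 8
all_up = [1] * (SIZE * SIZE)
all_down = [-1] * (SIZE * SIZE)
high_t = damage_after(all_up, all_down, SIZE, temperature=5.0, sweeps=50, seed=1)
print("distance after 50 sweeps at T = 5.0:", high_t)
```

Whether the distance between the two replicas decays to zero or stays finite as the temperature is varied is the kind of signal used to locate dynamical transitions.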
Incorporating territory compression into population models
Ridley, J; Komdeur, J; Sutherland, WJ; Sutherland, William J.
The ideal despotic distribution, whereby the lifetime reproductive success a territory's owner achieves is unaffected by population density, is a mainstay of behaviour-based population models. We show that the population dynamics of an island population of Seychelles warblers (Acrocephalus
Incorporating POS Tagging into Language Modeling
Heeman, P A; Heeman, Peter A.; Allen, James F.
1997-01-01
Language models for speech recognition tend to concentrate solely on recognizing the words that were spoken. In this paper, we redefine the speech recognition problem so that its goal is to find both the best sequence of words and their syntactic role (part-of-speech) in the utterance. This is a necessary first step towards tightening the interaction between speech recognition and natural language understanding.
New phases in an extended Hubbard model explicitly including atomic polarizabilities
Brink, van de J.; Meinders, M.B.J.; Lorenzana, J.; Eder, R.; Sawatzky, G.A.
1996-01-01
We consider the influence of a nearest-neighbor Coulomb interaction in an extended Hubbard model and introduce a new interaction term which simulates atomic polarizabilities. This has the effect of screening the on-site Coulomb interaction for charged excitations, unlike a neighbor Coulomb interacti
Monte Carlo renormalization: the triangular Ising model as a test case.
Guo, Wenan; Blöte, Henk W J; Ren, Zhiming
2005-04-01
We test the performance of the Monte Carlo renormalization method in the context of the Ising model on a triangular lattice. We apply a block-spin transformation which allows for an adjustable parameter so that the transformation can be optimized. This optimization purportedly brings the fixed point of the transformation to a location where the corrections to scaling vanish. To this purpose we determine corrections to scaling of the triangular Ising model with nearest- and next-nearest-neighbor interactions by means of transfer-matrix calculations and finite-size scaling. We find that the leading correction to scaling just vanishes for the nearest-neighbor model. However, the fixed point of the commonly used majority-rule block-spin transformation appears to lie well away from the nearest-neighbor critical point. This raises the question whether the majority rule is suitable as a renormalization transformation, because the standard assumptions of real-space renormalization imply that corrections to scaling vanish at the fixed point. We avoid this inconsistency by means of the optimized transformation which shifts the fixed point back to the vicinity of the nearest-neighbor critical Hamiltonian. The results of the optimized transformation in terms of the Ising critical exponents are more accurate than those obtained with the majority rule.
Emergent lattices with geometrical frustration in doped extended Hubbard models
Kaneko, Ryui; Tocchio, Luca F.; Valentí, Roser; Gros, Claudius
2016-11-01
Spontaneous charge ordering occurring in correlated systems may be considered as a possible route to generate effective lattice structures with unconventional couplings. For this purpose we investigate the phase diagram of doped extended Hubbard models on two lattices: (i) the honeycomb lattice with on-site U and nearest-neighbor V Coulomb interactions at 3/4 filling (n = 3/2) and (ii) the triangular lattice with on-site U, nearest-neighbor V, and next-nearest-neighbor V' Coulomb interactions at 3/8 filling (n = 3/4). We consider various approaches including mean-field approximations, perturbation theory, and variational Monte Carlo. For the honeycomb case (i), charge order induces an effective triangular lattice at large values of U/t and V/t, where t is the nearest-neighbor hopping integral. The nearest-neighbor spin exchange interactions on this effective triangular lattice are antiferromagnetic in most of the phase diagram, while they become ferromagnetic when U is much larger than V. At $U/t \sim (V/t)^3$, ferromagnetic and antiferromagnetic exchange interactions nearly cancel out, leading to a system with four-spin ring-exchange interactions. On the other hand, for the triangular case (ii) at large U and finite V', we find no charge order for small V, an effective kagome lattice for intermediate V, and one-dimensional charge order for large V. These results indicate that Coulomb interactions induce [case (i)] or enhance [case (ii)] emergent geometrical frustration of the spin degrees of freedom in the system, by forming charge order.
Incorporating RTI in a Hybrid Model of Reading Disability
Spencer, Mercedes; Wagner, Richard K.; Schatschneider, Christopher; Quinn, Jamie M.; Lopez, Danielle; Petscher, Yaacov
2014-01-01
The present study seeks to evaluate a hybrid model of identification that incorporates response to instruction and intervention (RTI) as one of the key symptoms of reading disability. The 1-year stability of alternative operational definitions of reading disability was examined in a large-scale sample of students who were followed longitudinally…
Violent Intent Modeling: Incorporating Cultural Knowledge into the Analytical Process
Energy Technology Data Exchange (ETDEWEB)
Sanfilippo, Antonio P.; Nibbs, Faith G.
2007-08-24
While culture has a significant effect on the appropriate interpretation of textual data, the incorporation of cultural considerations into data transformations has not been systematic. Recognizing that the successful prevention of terrorist activities could hinge on the knowledge of the subcultures, Anthropologist and DHS intern Faith Nibbs has been addressing the need to incorporate cultural knowledge into the analytical process. In this Brown Bag she will present how cultural ideology is being used to understand how the rhetoric of group leaders influences the likelihood of their constituents to engage in violent or radicalized behavior, and how violent intent modeling can benefit from understanding that process.
Incorporating RTI in a Hybrid Model of Reading Disability
2014-01-01
The present study seeks to evaluate a hybrid model of identification that incorporates response-to-intervention (RTI) as one of the key symptoms of reading disability. The one-year stability of alternative operational definitions of reading disability was examined in a large-scale sample of students who were followed longitudinally from first to second grade. The results confirmed previous findings of limited stability for single-criterion based operational definitions of reading disability...
Incorporating the Hayflick Limit into a model of Telomere Dynamics
Cyrenne, Benoit M
2013-01-01
A model of telomere dynamics is proposed and examined. Our model, which extends a previously introduced two-compartment model that incorporates stem cells as progenitors of new cells, imposes the Hayflick Limit, the maximum number of cell divisions that are possible. This new model leads to cell populations for which the average telomere length is not necessarily a monotonically decreasing function of time, in contrast to previously published models. We provide a phase diagram indicating where such results would be expected. In addition, qualitatively different results are obtained for the evolution of the total cell population. Last, in comparison to available leukocyte baboon data, this new model is shown to provide a better fit to biological data.
Incorporating Linguistic Structure into Maximum Entropy Language Models
Institute of Scientific and Technical Information of China (English)
FANG GaoLin(方高林); GAO Wen(高文); WANG ZhaoQi(王兆其)
2003-01-01
In statistical language models, how to integrate diverse linguistic knowledge in a general framework for long-distance dependencies is a challenging issue. In this paper, an improved language model incorporating linguistic structure into the maximum entropy framework is presented. The proposed model combines the trigram with the structure knowledge of base phrases, in which the trigram is used to capture the local relation between words, while the structure knowledge of base phrases is considered to represent the long-distance relations between syntactical structures. The knowledge of syntax, semantics and vocabulary is integrated into the maximum entropy framework. Experimental results show that the proposed model improves by 24% for language model perplexity and increases about 3% for sign language recognition rate compared with the trigram model.
Methods improvements incorporated into the SAPHIRE ASP models
Energy Technology Data Exchange (ETDEWEB)
Sattison, M.B.; Blackman, H.S.; Novack, S.D. [Idaho National Engineering Lab., Idaho Falls, ID (United States)] [and others
1995-04-01
The Office for Analysis and Evaluation of Operational Data (AEOD) has sought the assistance of the Idaho National Engineering Laboratory (INEL) to make some significant enhancements to the SAPHIRE-based Accident Sequence Precursor (ASP) models recently developed by the INEL. The challenge of this project is to provide the features of a full-scale PRA within the framework of the simplified ASP models. Some of these features include: (1) uncertainty analysis addressing the standard PRA uncertainties and the uncertainties unique to the ASP models and methods, (2) incorporation and proper quantification of individual human actions and the interaction among human actions, (3) enhanced treatment of common cause failures, and (4) extension of the ASP models to more closely mimic full-scale PRAs (inclusion of more initiators, explicitly modeling support system failures, etc.). This paper provides an overview of the methods being used to make the above improvements.
A novel fluence map optimization model incorporating leaf sequencing constraints.
Jin, Renchao; Min, Zhifang; Song, Enmin; Liu, Hong; Ye, Yinyu
2010-02-21
A novel fluence map optimization model incorporating leaf sequencing constraints is proposed to overcome the drawbacks of the current objective inside smoothing models. Instead of adding a smoothing item to the objective function, we add the total number of monitor unit (TNMU) requirement directly to the constraints which serves as an important factor to balance the fluence map optimization and leaf sequencing optimization process at the same time. Consequently, we formulate the fluence map optimization models for the trailing (left) leaf synchronized, leading (right) leaf synchronized and the interleaf motion constrained non-synchronized leaf sweeping schemes, respectively. In those schemes, the leaves are all swept unidirectionally from left to right. Each of those models is turned into a linear constrained quadratic programming model which can be solved effectively by the interior point method. Those new models are evaluated with two publicly available clinical treatment datasets including a head-neck case and a prostate case. As shown by the empirical results, our models perform much better in comparison with two recently emerged smoothing models (the total variance smoothing model and the quadratic smoothing model). For all three leaf sweeping schemes, our objective dose deviation functions increase much slower than those in the above two smoothing models with respect to the decreasing of the TNMU. While keeping plans in the similar conformity level, our new models gain much better performance on reducing TNMU.
Incorporating vegetation feedbacks in regional climate modeling over West Africa
Erfanian, A.; Wang, G.; Yu, M.; Ahmed, K. F.; Anyah, R. O.
2015-12-01
Despite major advancements in modeling of the climate system, incorporating vegetation dynamics into climate models is still at the initial stages, making it an ongoing research topic. Only a few of the GCMs participating in CMIP5 simulations included the vegetation dynamics component. Consideration of vegetation dynamics is even less common in RCMs. In this study, RegCM4.3.4-CLM4-CN-DV, a regional climate model synchronously coupled with a land surface component that includes both Carbon-Nitrogen (CN) and Dynamic-Vegetation (DV) processes, is used to simulate and project regional climate over West Africa. Due to its unique regional features, West Africa's climate is known for being susceptible to land-atmosphere interactions, enhancing the importance of including vegetation dynamics in modeling climate over this region. In this study the model is integrated for two scenarios (present-day and future) using outputs from four GCMs participating in CMIP5 (MIROC, CESM, GFDL and CCSM4) as lateral boundary conditions, which form the basis of a multi-model ensemble. Results of model validation indicate that the ensemble of all models outperforms each of the individual models in simulating present-day temperature and precipitation. Therefore, the ensemble set is used to analyze the impact of including vegetation dynamics in the RCM on future projection of West Africa's climate. Results from the ensemble analysis will be presented, together with comparison among individual models.
Ordering in Two-Dimensional Ising Models with Competing Interactions
2004-01-01
We study the 2D Ising model on a square lattice with additional non-equal diagonal next-nearest neighbor interactions. The cases of classical and quantum (transverse) models are considered. Possible phases and their locations in the space of three Ising couplings are analyzed. In particular, incommensurate phases occurring only at non-equal diagonal couplings, are predicted. We also analyze a spin-pseudospin model comprised of the quantum Ising model coupled to XY spin chains in a particular ...
A simple spatiotemporal chaotic Lotka-Volterra model
Energy Technology Data Exchange (ETDEWEB)
Sprott, J.C. [Department of Physics, University of Wisconsin, 1150 University Avenue, Madison, WI 53706 (United States)] e-mail: sprott@physics.wisc.edu; Wildenberg, J.C. [Department of Physics, University of Wisconsin, 1150 University Avenue, Madison, WI 53706 (United States)] e-mail: jcwildenberg@wisc.edu; Azizi, Yousef [Institute for Advanced Studies in Basic Sciences, Zanjan (Iran, Islamic Republic of)] e-mail: joseph_azizi@yahoo.com
2005-11-01
A mathematically simple example of a high-dimensional (many-species) Lotka-Volterra model that exhibits spatiotemporal chaos in one spatial dimension is described. The model consists of a closed ring of identical agents, each competing for fixed finite resources with two of its four nearest neighbors. The model is prototypical of more complicated models in its quasiperiodic route to chaos (including attracting 3-tori), bifurcations, spontaneous symmetry breaking, and spatial pattern formation.
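The ring structure described in the abstract above can be sketched as a small system of ordinary differential equations. The abstract does not give the equations, so the specific form used here is an assumption: each species `i` competes with itself, with the second neighbor on one side (`i-2`) and with the first neighbor on the other side (`i+1`), which is one way of choosing two of its four nearest neighbors. The function names and the explicit Euler step are likewise illustrative choices.

```python
def lv_ring_rhs(x):
    """Right-hand side of an assumed ring Lotka-Volterra system:
    dx_i/dt = x_i * (1 - x_{i-2} - x_i - x_{i+1}), indices taken modulo N."""
    n = len(x)
    return [
        x[i] * (1.0 - x[(i - 2) % n] - x[i] - x[(i + 1) % n])
        for i in range(n)
    ]

def euler_step(x, dt):
    """Advance the populations by one explicit Euler step of size dt.
    A higher-order integrator would be preferable for real studies."""
    dx = lv_ring_rhs(x)
    return [xi + dt * dxi for xi, dxi in zip(x, dx)]
```

Under this assumed form the uniform state with every population equal to 1/3 is a fixed point, since the three competition terms then sum to one.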
A mathematical model for incorporating biofeedback into human postural control
Directory of Open Access Journals (Sweden)
Ersal Tulga
2013-02-01
Background: Biofeedback of body motion can serve as a balance aid and rehabilitation tool. To date, mathematical models considering the integration of biofeedback into postural control have represented this integration as a sensory addition and limited their application to a single degree-of-freedom representation of the body. This study has two objectives: (1) to develop a scalable method for incorporating biofeedback into postural control that is independent of the model’s degrees of freedom, how it handles sensory integration, and the modeling of its postural controller; and (2) to validate this new model using multidirectional perturbation experimental results. Methods: Biofeedback was modeled as an additional torque to the postural controller torque. For validation, this biofeedback modeling approach was applied to a vibrotactile biofeedback device and incorporated into a two-link multibody model with full-state-feedback control that represents the dynamics of bipedal stance. Average response trajectories of body sway and center of pressure (COP) to multidirectional surface perturbations of subjects with vestibular deficits were used for model parameterization and validation in multiple perturbation directions and for multiple display resolutions. The quality of fit was quantified using average error and cross-correlation values. Results: The mean of the average errors across all tactor configurations and perturbations was 0.24° for body sway and 0.39 cm for COP. The mean of the cross-correlation value was 0.97 for both body sway and COP. Conclusions: The biofeedback model developed in this study is capable of capturing experimental response trajectory shapes with low average errors and high cross-correlation values in both the anterior-posterior and medial-lateral directions for all perturbation directions and spatial resolution display configurations considered. The results validate that biofeedback can be modeled as an additional
Safety models incorporating graph theory based transit indicators.
Quintero, Liliana; Sayed, Tarek; Wahba, Mohamed M
2013-01-01
There is a considerable need for tools to enable the evaluation of the safety of transit networks at the planning stage. One interesting approach for the planning of public transportation systems is the study of networks. Network techniques involve the analysis of systems by viewing them as a graph composed of a set of vertices (nodes) and edges (links). Once the transport system is visualized as a graph, various network properties can be evaluated based on the relationships between the network elements. Several indicators can be calculated including connectivity, coverage, directness and complexity, among others. The main objective of this study is to investigate the relationship between network-based transit indicators and safety. The study develops macro-level collision prediction models that explicitly incorporate transit physical and operational elements and transit network indicators as explanatory variables. Several macro-level (zonal) collision prediction models were developed using a generalized linear regression technique, assuming a negative binomial error structure. The models were grouped into four main themes: transit infrastructure, transit network topology, transit route design, and transit performance and operations. The safety models showed that collisions were significantly associated with transit network properties such as: connectivity, coverage, overlapping degree and the Local Index of Transit Availability. As well, the models showed a significant relationship between collisions and some transit physical and operational attributes such as the number of routes, frequency of routes, bus density, length of bus and 3+ priority lanes.
Tantalum strength model incorporating temperature, strain rate and pressure
Lim, Hojun; Battaile, Corbett; Brown, Justin; Lane, Matt
Tantalum is a body-centered-cubic (BCC) refractory metal that is widely used in many applications in high temperature, strain rate and pressure environments. In this work, we propose a physically-based strength model for tantalum that incorporates effects of temperature, strain rate and pressure. A constitutive model for single crystal tantalum is developed based on dislocation kink-pair theory, and calibrated to measurements on single crystal specimens. The model is then used to predict deformations of single- and polycrystalline tantalum. In addition, the proposed strength model is implemented into Sandia's ALEGRA solid dynamics code to predict plastic deformations of tantalum in engineering-scale applications at extreme conditions, e.g. Taylor impact tests and Z machine's high pressure ramp compression tests, and the results are compared with available experimental data. Sandia National Laboratories is a multi program laboratory managed and operated by Sandia Corporation, a wholly owned subsidiary of Lockheed Martin Corporation, for the U.S. Department of Energy's National Nuclear Security Administration under contract DE-AC04-94AL85000.
Incorporating Plant Phenology Dynamics in a Biophysical Canopy Model
Barata, Raquel A.; Drewry, Darren
2012-01-01
The Multi-Layer Canopy Model (MLCan) is a vegetation model created to capture plant responses to environmental change. The model vertically resolves carbon uptake, water vapor and energy exchange at each canopy level by coupling photosynthesis, stomatal conductance and leaf energy balance. The model is forced by incoming shortwave and longwave radiation, as well as near-surface meteorological conditions. The original formulation of MLCan utilized canopy structural traits derived from observations. This project aims to incorporate a plant phenology scheme within MLCan allowing these structural traits to vary dynamically. In the plant phenology scheme implemented here, plant growth is dependent on environmental conditions such as air temperature and soil moisture. The scheme includes functionality that models plant germination, growth, and senescence. These growth stages dictate the variation in six different vegetative carbon pools: storage, leaves, stem, coarse roots, fine roots, and reproductive. The magnitudes of these carbon pools determine land surface parameters such as leaf area index, canopy height, rooting depth and root water uptake capacity. Coupling this phenology scheme with MLCan allows for a more flexible representation of the structure and function of vegetation as it responds to changing environmental conditions.
A dengue model incorporating saturation incidence and human migration
Gakkhar, S.; Mishra, A.
2015-03-01
In this paper, a non-linear model has been proposed to investigate the effects of human migration on dengue dynamics. Human migration has been considered between two patches having different dengue strains. Due to migration secondary infection is possible. Further, the secondary infection is considered in patch-2 only as strain-2 in patch-2 is considered to be more severe than that of strain-1 in patch-1. The saturation incidence rate has been considered to incorporate the behavioral changes towards epidemic in human population. The basic reproduction number has been computed. Four Equilibrium states have been found and analyzed. Increasing saturation rate decreases the threshold thereby enhancing the stability of disease-free state in both the patches. Control on migration may lead to change in infection level of patches.
An SIRS Epidemic Model Incorporating Media Coverage with Time Delay
Lin, Yiping; Dai, Yunxian
2014-01-01
An SIRS epidemic model incorporating media coverage with time delay is proposed. The positivity and boundedness are studied first. The local asymptotic stability of the disease-free equilibrium and of the endemic equilibrium is then studied in succession, and the conditions under which periodic orbits bifurcate are given. Furthermore, we show that the local Hopf bifurcation implies the global Hopf bifurcation after the second critical value of the delay. The obtained results show that the time delay in media coverage cannot affect the stability of the disease-free equilibrium when the basic reproduction number R0 < 1. However, when R0 > 1, the stability of the endemic equilibrium will be affected by the time delay; there will be a family of periodic orbits bifurcating from the endemic equilibrium when the time delay increases through a critical value. Finally, some examples for numerical simulations are also included. PMID:24723967
A Latent Source Model for Patch-Based Image Segmentation.
Chen, George H; Shah, Devavrat; Golland, Polina
2015-10-01
Despite the popularity and empirical success of patch-based nearest-neighbor and weighted majority voting approaches to medical image segmentation, there has been no theoretical development on when, why, and how well these nonparametric methods work. We bridge this gap by providing a theoretical performance guarantee for nearest-neighbor and weighted majority voting segmentation under a new probabilistic model for patch-based image segmentation. Our analysis relies on a new local property for how similar nearby patches are, and fuses existing lines of work on modeling natural imagery patches and theory for nonparametric classification. We use the model to derive a new patch-based segmentation algorithm that iterates between inferring local label patches and merging these local segmentations to produce a globally consistent image segmentation. Many existing patch-based algorithms arise as special cases of the new algorithm.
Incorporating Phaeocystis into a Southern Ocean ecosystem model
Wang, Shanlin; Moore, J. Keith
2011-01-01
Phaeocystis antarctica is an important phytoplankton species in the Southern Ocean. We incorporated P. antarctica into the biogeochemical elemental cycling ocean model to study Southern Ocean ecosystem dynamics and biogeochemistry. The optimum values of ecological parameters for Phaeocystis were sought through synthesizing laboratory and field observations, and the model output was evaluated with observed chlorophyll a, carbon biomass, and nutrient distributions. Several factors have been proposed to control Southern Ocean ecosystem structure, including light adaptation, iron uptake capability, and loss processes. Optimum simulation results were obtained when P. antarctica had a relatively high α (P-I curve initial slope) value and a higher half-saturation constant for iron uptake than other phytoplankton. Simulation results suggested that P. antarctica had a competitive advantage under low irradiance levels, especially in the Ross Sea and Weddell Sea. However, the distributions of P. antarctica and diatoms were also strongly influenced by iron availability. Although grazing rates had an influence on total biomass, our simulations did not show a strong influence of grazing pressure in the competition between P. antarctica and diatoms. However, limited observations and the relative simplicity of zooplankton in our model suggest further research is needed. Overall, P. antarctica contributed ~13% of annual primary production and ~19% of sinking carbon export in the Southern Ocean (>40°S) in our best case simulation. At higher latitudes (>60°S) P. antarctica accounts for ~23% of annual primary production and ~30% of sinking carbon export.
Critical Behavior in a Cellular Automata Animal Disease Transmission Model
Morley, P D; Chang, Julius
2003-01-01
Using a cellular automata model, we simulate the British Government Policy (BGP) in the 2001 foot and mouth epidemic in Great Britain. When clinical symptoms of the disease appeared on a farm, there was mandatory slaughter (culling) of all livestock on an infected premise (IP). Those farms that neighbor an IP (contiguous premise, CP) are also culled, aka nearest neighbor interaction. Farms where the disease may be prevalent from animal, human, vehicle or airborne transmission (dangerous contact, DC) are additionally culled, aka next-to-nearest neighbor interactions and lightning factor. The resulting mathematical model possesses a phase transition, whereupon if the physical disease transmission kernel exceeds a critical value, catastrophic loss of animals ensues. The non-local disease transport probability can be as low as .01% per day and the disease can still be in the high mortality phase. We show that the fundamental equation for sustainable disease transport is the criticality equation for neutron fissio...
Realizing a lattice spin model with polar molecules
Yan, Bo; Gadway, Bryce; Covey, Jacob P; Hazzard, Kaden R A; Rey, Ana Maria; Jin, Deborah S; Ye, Jun
2013-01-01
With the recent production of polar molecules in the quantum regime, long-range dipolar interactions are expected to facilitate the understanding of strongly interacting many-body quantum systems and to realize lattice spin models for exploring quantum magnetism. In atomic systems, where interactions require wave function overlap, effective spin interactions on a lattice can be realized via superexchange; however, the coupling is weak and limited to nearest-neighbor interactions. In contrast, dipolar interactions exist in the absence of tunneling and extend beyond nearest neighbors. This allows coherent spin dynamics to persist even at high entropy and low lattice filling. Effects of dipolar interactions in ultracold molecular gases have so far been limited to the modification of chemical reactions. We now report the observation of dipolar interactions of polar molecules pinned in a 3D optical lattice. We realize a lattice spin model with spin encoded in rotational states, prepared and probed by microwaves. T...
Modified DM Models for Aging Networks Based on Neighborhood Connectivity
Institute of Scientific and Technical Information of China (English)
WEI Du-Qu; LIN Min; LUO Xiao-Shu; WANG Gang; ZOU Yan-Li; CHEN Tian-Lun
2008-01-01
Two modified Dorogovtsev-Mendes (DM) models of aging networks based on the dynamics of connecting nearest-neighbors are introduced. One edge of the new site is connected to the old site with probability ∝ k t^(-α), as in the DM model, where the degree and age of the old site are k and t, respectively. We consider two cases, i.e., the other edges of the new site attach to the nearest-neighbors of the old site with uniform and degree connectivity probability, respectively. The network structure changes with an increase of the aging exponent α. It is found that the networks can produce scale-free degree distributions with small-world properties, and that the different connectivity probabilities lead to different properties of the networks.
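The first attachment step in the abstract above, in which an old site of degree k and age t is chosen with probability proportional to k t^(-α), can be sketched as follows. The function name and the list-based representation of the network are hypothetical, and the neighbor-attachment step for the remaining edges is not covered.

```python
def attachment_probabilities(degrees, ages, alpha):
    """Normalized probabilities proportional to k * t**(-alpha) for each old site.

    degrees: degree k of every existing site (illustrative input format)
    ages:    age t > 0 of every existing site
    alpha:   aging exponent; alpha = 0 reduces to pure preferential attachment
    """
    weights = [k * t ** (-alpha) for k, t in zip(degrees, ages)]
    total = sum(weights)
    return [w / total for w in weights]
```

With `alpha = 0` the probabilities depend on degree alone, while a positive `alpha` penalizes older sites of equal degree, which is the aging effect the abstract refers to.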
The scaling limit of the energy correlations in non integrable Ising models
Giuliani, Alessandro; Mastropietro, Vieri
2012-01-01
We obtain an explicit expression for the multipoint energy correlations of a non-solvable two-dimensional Ising model with nearest neighbor ferromagnetic interactions plus a weak finite range interaction of strength $\lambda$, in a scaling limit in which we send the lattice spacing to zero and the temperature to the critical one. Our analysis is based on an exact mapping of the model into an interacting lattice fermionic theory, which generalizes the one originally used by Schultz, Mattis and Lieb for the nearest neighbor Ising model. The interacting model is then analyzed by a multiscale method first proposed by Pinson and Spencer. If the lattice spacing is finite, then the correlations cannot be computed in closed form: rather, they are expressed in terms of infinite, convergent, power series in $\lambda$. In the scaling limit, these infinite expansions radically simplify and reduce to the limiting energy correlations of the integrable Ising model, up to a finite renormalization of the parameters. Explicit...
Incorporating Context Dependency of Species Interactions in Species Distribution Models.
Lany, Nina K; Zarnetske, Phoebe L; Gouhier, Tarik C; Menge, Bruce A
2017-07-01
Species distribution models typically use correlative approaches that characterize the species-environment relationship using occurrence or abundance data for a single species. However, species distributions are determined by both abiotic conditions and biotic interactions with other species in the community. Therefore, climate change is expected to impact species through direct effects on their physiology and indirect effects propagated through their resources, predators, competitors, or mutualists. Furthermore, the sign and strength of species interactions can change according to abiotic conditions, resulting in context-dependent species interactions that may change across space or with climate change. Here, we incorporated the context dependency of species interactions into a dynamic species distribution model. We developed a multi-species model that uses a time-series of observational survey data to evaluate how abiotic conditions and species interactions affect the dynamics of three rocky intertidal species. The model further distinguishes between the direct effects of abiotic conditions on abundance and the indirect effects propagated through interactions with other species. We apply the model to keystone predation by the sea star Pisaster ochraceus on the mussel Mytilus californianus and the barnacle Balanus glandula in the rocky intertidal zone of the Pacific coast, USA. Our method indicated that biotic interactions between P. ochraceus and B. glandula affected B. glandula dynamics across >1000 km of coastline. Consistent with patterns from keystone predation, the growth rate of B. glandula varied according to the abundance of P. ochraceus in the previous year. The data and the model did not indicate that the strength of keystone predation by P. ochraceus varied with a mean annual upwelling index. Balanus glandula cover increased following years with high phytoplankton abundance measured as mean annual chlorophyll-a. M. californianus exhibited the same
A model for thermal annealing on forming In—N clusters in InGaNP
Institute of Scientific and Technical Information of China (English)
ZHAO ChuanZhen; CHEN Lei; LI NaNa; ZHANG HuanHuan; CHEN YaFei; WEI Tong; TANG ChunXiao; XIE ZiLi
2012-01-01
We develop a model for the effect of thermal annealing on forming In—N clusters in GaInNP according to thermodynamics. The average energy variation for forming an In—N bond in the model is estimated according to the theoretical calculation. Using the model, the added number of In—N bonds per mol of InGaNP, the added number of nearest-neighbor In atoms per N atom and the average number of nearest-neighbor In atoms per N atom after annealing are calculated. The different function of In—N clusters in InGaNP and InGaN is also discussed, which is due to the different environments around the In—N clusters.
DEFF Research Database (Denmark)
Fontenete, Sílvia; Guimarães, Nuno; Wengel, Jesper
2016-01-01
The thermodynamics and kinetics of DNA hybridization, i.e. the process of self-assembly of one, two or more complementary nucleic acid strands, has been studied for many years. The appearance of the nearest-neighbor model led to several theoretical and experimental papers on DNA thermodynamics that provide reasonably accurate thermodynamic information on nucleic acid duplexes and allow estimation of the melting temperature. Because there are no thermodynamic models specifically developed to predict the hybridization temperature of a probe used in a fluorescence in situ hybridization...
Statistical mechanics model for the dynamics of collective epigenetic histone modification.
Zhang, Hang; Tian, Xiao-Jun; Mukhopadhyay, Abhishek; Kim, K S; Xing, Jianhua
2014-02-14
Epigenetic histone modifications play an important role in the maintenance of different cell phenotypes. The exact molecular mechanism for inheritance of the modification patterns over cell generations remains elusive. We construct a Potts-type model based on experimentally observed nearest-neighbor enzyme lateral interactions and nucleosome covalent modification state biased enzyme recruitment. The model can lead to effective nonlocal interactions among nucleosomes suggested in previous theoretical studies, and epigenetic memory is robustly inheritable against stochastic cellular processes.
The scaling limit of the energy correlations in non integrable Ising models
2012-01-01
We obtain an explicit expression for the multipoint energy correlations of a non-solvable two-dimensional Ising model with nearest neighbor ferromagnetic interactions plus a weak finite range interaction of strength $\lambda$, in a scaling limit in which we send the lattice spacing to zero and the temperature to the critical one. Our analysis is based on an exact mapping of the model into an interacting lattice fermionic theory, which generalizes the one originally used by Schultz, Mattis an...
Nuclear Chaotic Behavior in a Two-j Shell Coupled with a Rotor Model
Institute of Scientific and Technical Information of China (English)
GUO Lu; ZHOU XianRong; MENG Jie; ZHAO EnGuang
2002-01-01
The chaotic properties for six particles interacting by a monopole pairing force in a two-j shell model coupled with a deformed core are studied in the frame of the particle-rotor model. The nearest-neighbor distribution of energy levels and spectral rigidity in the two-j shell are compared with those in the single-j case. The results show that the system is more regular in the two-j model than that in the single-j case.
Critical phenomena in the majority voter model on two-dimensional regular lattices.
Acuña-Lara, Ana L; Sastre, Francisco; Vargas-Arriola, José Raúl
2014-05-01
In this work we studied the critical behavior of the critical point as a function of the number of nearest neighbors on two-dimensional regular lattices. We performed numerical simulations on triangular, hexagonal, and bilayer square lattices. Using standard finite-size scaling theory we found that all cases fall in the two-dimensional Ising model universality class, but that the critical point value for the bilayer lattice does not follow the regular tendency that the Ising model shows.
Zigzag order and phase competition in expanded Kitaev-Heisenberg model on honeycomb lattice
Yao, Xiaoyan
2015-07-01
The Kitaev-Heisenberg model on the honeycomb lattice is investigated in two cases: (I) with the Kitaev interaction between the nearest neighbors, and (II) with the Kitaev interaction between the next nearest neighbors. In the full parameter range, the ground states are searched by Monte Carlo simulation and identified by evaluating the correlation functions. The energies of different phases are calculated and compared with the simulated result to show the phase competition. It is observed from both energy calculation and the density of states that the zigzag order shows a symmetric behavior to the stripy phase in the pure Kitaev-Heisenberg model. By considering more interactions in both cases, the energy of zigzag order can be reduced lower than the energies of other states. Thus the zigzag phase may be stabilized in more parameter region and even extended to the whole parameter range.
Modeling of the magnetic properties of nanomaterials with different crystalline structure
Kirienko, Yury
2012-01-01
We propose a method for modeling the magnetic properties of nanomaterials with different structures. The method is based on the Ising model and the approximation of the random field interaction. It is shown that in this approximation, the magnetization of the nanocrystal depends only on the number of nearest neighbors of the lattice atoms and the values of exchange integrals between them. This gives a good algorithmic problem of calculating the magnetization of any nano-object, whether it is ultrathin film or nanoparticle of any shape and structure, managing only a rule of selection of nearest neighbors. By setting different values of exchange integrals, it is easy to describe ferromagnets, antiferromagnets, and ferrimagnets in a unified formalism. Having obtained the magnetization curve of the sample it is possible to find the Curie temperature as a function of, for example, the thickness of ultrathin film. Afterwards one can obtain the numerical values for critical exponents of the phase transition "ferroma...
Modulated Phase of a Potts Model with Competing Binary Interactions on a Cayley Tree
Ganikhodjaev, N.; Temir, S.; Akin, H.
2009-11-01
We study the phase diagram for the Potts model on a Cayley tree with competing nearest-neighbor interactions J1, prolonged next-nearest-neighbor interactions Jp and one-level next-nearest-neighbor interactions Jo. Vannimenus proved that the phase diagram of the Ising model with Jo = 0 contains a modulated phase, as found for similar models on periodic lattices, but the multicritical Lifshitz point is at zero temperature. Later Mariz et al. generalized this result to the Ising model with Jo ≠ 0, and recently Ganikhodjaev et al. proved a similar result for the three-state Potts model with Jo = 0. We consider the Potts model with Jo ≠ 0 and show that for some values of Jo the multicritical Lifshitz point is at non-zero temperature. We also prove that as soon as the same-level interaction Jo is nonzero, the paramagnetic phase found at high temperatures for Jo = 0 disappears, while the Ising model does not have this property. To perform this study, an iterative scheme similar to that appearing in real space renormalization group frameworks is established; it recovers, as a particular case, previous work by Ganikhodjaev et al. for Jo = 0. At vanishing temperature, the phase diagram is fully determined for all values and signs of J1, Jp and Jo. At finite temperatures several interesting features are exhibited for typical values of Jo/J1.
Tamura, Ryo; Tanaka, Shu
2013-11-01
We study the phase transition behavior of a frustrated Heisenberg model on a stacked triangular lattice by Monte Carlo simulations. The model has three types of interactions: the ferromagnetic nearest-neighbor interaction J1 and antiferromagnetic third-nearest-neighbor interaction J3 in each triangular layer, and the ferromagnetic interlayer interaction J⊥. Frustration comes from the intralayer interactions J1 and J3. We focus on the case in which the order parameter space is SO(3)×C3. We find that the model exhibits a first-order phase transition with breaking of the SO(3) and C3 symmetries at finite temperature. We also discover that the transition temperature increases but the latent heat decreases as J⊥/J1 increases, which is opposite to the behavior observed in typical unfrustrated three-dimensional systems.
Babaev, A. B.; Murtazaev, A. K.; Suleimanov, E. M.; Rizvanova, T. R.
2016-10-01
Influence of disorder in the form of frustration on the thermodynamic behavior of a two-dimensional three-vertex Potts model has been studied by the Monte Carlo method, taking into account the nearest and next-nearest neighbors. Systems with linear sizes of L × L = N ( L = 9-48) on a triangular lattice have been considered. It has been shown that in the case of J 1 > 0 and J 2 model undergoes a phase transition outside this region.
Monte Carlo Tests of Nucleation Concepts in the Lattice Gas Model
Schmitz, Fabian; Virnau, Peter; Binder, Kurt
2013-01-01
The conventional theory of homogeneous and heterogeneous nucleation in a supersaturated vapor is tested by Monte Carlo simulations of the lattice gas (Ising) model with nearest-neighbor attractive interactions on the simple cubic lattice. The theory considers the nucleation process as a slow (quasi-static) cluster (droplet) growth over a free energy barrier $\Delta F^*$, constructed in terms of a balance of surface and bulk term of a "critical droplet" of radius $R^*$, implying that the rates...
Directory of Open Access Journals (Sweden)
N Rahimipour
2015-07-01
The classical J1-J2 Heisenberg model on a bipartite lattice exhibits Néel order. However, if the antiferromagnetic (AF) interactions between next-nearest neighbors (nnn) are increased relative to those between nearest neighbors (nn), frustration arises. In such situations, new phases can arise, such as ordered phases with coplanar or spiral ordering and disordered phases such as spin liquids. In this paper we use the self-consistent Gaussian approximation to study the J1-J2 Heisenberg model on honeycomb and diamond lattices. We find spin liquid phases such as the ring-liquid and pancake-liquid on the honeycomb lattice. For the diamond lattice, we show that the degeneracy of the ground state can be lifted by thermal fluctuations through the order-by-disorder mechanism.
Zigzag order and phase competition in expanded Kitaev–Heisenberg model on honeycomb lattice
Energy Technology Data Exchange (ETDEWEB)
Yao, Xiaoyan, E-mail: yaoxiaoyan@gmail.com
2015-07-17
Highlights: • Expanded Kitaev–Heisenberg model on honeycomb lattice is investigated. • Kitaev interactions between the first or second nearest neighbors are considered. • Phase competition is discussed by energy calculation and Monte Carlo simulation. • Zigzag phase shows a symmetric behavior to the stripy phase. • Zigzag order is extended to the whole parameter range by more interactions. - Abstract: The Kitaev–Heisenberg model on the honeycomb lattice is investigated in two cases: (I) with the Kitaev interaction between the nearest neighbors, and (II) with the Kitaev interaction between the next nearest neighbors. In the full parameter range, the ground states are searched by Monte Carlo simulation and identified by evaluating the correlation functions. The energies of different phases are calculated and compared with the simulated result to show the phase competition. It is observed from both energy calculation and the density of states that the zigzag order shows a symmetric behavior to the stripy phase in the pure Kitaev–Heisenberg model. By considering more interactions in both cases, the energy of zigzag order can be reduced lower than the energies of other states. Thus the zigzag phase may be stabilized in more parameter region and even extended to the whole parameter range.
Institute of Scientific and Technical Information of China (English)
JIANG Wei; Veng-Cheong Lo
2005-01-01
Ferroelectric phase diagrams and the temperature dependence of the polarization and dielectric properties of the three pseudo-spins in a ferroelectric or ferro-antiferroelectric system described by a transverse Ising model are investigated on the basis of effective-field theory with the differential operator technique. The effects of the transverse field and the coupling strength between nearest-neighboring pseudo-spins on the physical properties are discussed in detail.
Classification of EEG Signals using adaptive weighted distance nearest neighbor algorithm
E. Parvinnia; M. Sabeti; M. Zolghadri Jahromi; Boostani, R
2014-01-01
Electroencephalogram (EEG) signals are often used to diagnose diseases such as seizure, alzheimer, and schizophrenia. One main problem with the recorded EEG samples is that they are not equally reliable due to the artifacts at the time of recording. EEG signal classification algorithms should have a mechanism to handle this issue. It seems that using adaptive classifiers can be useful for the biological signals such as EEG. In this paper, a general adaptive method named weighted distance near...
A Coupled k-Nearest Neighbor Algorithm for Multi-Label Classification
2015-05-22
Although effective in some cases, ML-kNN has some defect due to the fact that it is a binary relevance classifier which only considers one label every time...informatics, a gene can belong to both metabolism and transcription classes; and in music categorization, a song may labeled as Mozart and sad. In the...previous research [4,6]. In [8,14], Can and Liu etc. analysis the coupling relationship on categorical data. These works all proved the effectiveness of
Comparison of K-Nearest Neighbor and Naive Bayes for Classifying Land Suitable for Planting Teak Trees
Directory of Open Access Journals (Sweden)
Didik Srianto
2016-10-01
Data mining is the process of analyzing data from different perspectives and summarizing it into important information that can be used to increase profit, reduce costs, or both. Technically, data mining can be described as the process of finding correlations or patterns among hundreds or thousands of fields in a large relational database. Perum Perhutani KPH Semarang currently still uses a manual method to determine the plant type (teak / non-teak). K-Nearest Neighbour (k-NN) is a data mining algorithm that can be used for classification and regression. The Naive Bayes classifier is a technique that can be used for classification. In this study, k-NN and Naive Bayes are used to classify teak tree data from Perum Perhutani KPH Semarang, and the classification results of the two methods are compared. Testing was carried out using the RapidMiner software. After testing, k-NN was judged better than Naive Bayes, with accuracies of 96.66% and 82.63, respectively. Keywords: k-NN, classification, Naive Bayes, teak tree planting
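The k-NN classification step named in this abstract can be illustrated with a minimal sketch. It is not the RapidMiner workflow used in the study: the function name `knn_predict`, the two numeric features, and the toy samples are all hypothetical, and Euclidean distance with a simple majority vote is assumed.

```python
from collections import Counter
import math


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Euclidean distance from the query to every training point.
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Keep the k closest neighbors and count their labels.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two made-up numeric features per sample (not from the study).
train_points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.8), (4.9, 5.1)]
train_labels = ["teak", "teak", "teak", "non-teak", "non-teak", "non-teak"]

print(knn_predict(train_points, train_labels, (1.1, 1.0)))
```

An odd `k` is a common choice for two-class problems because it avoids tied votes.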
Directory of Open Access Journals (Sweden)
Fachruddin Fachruddin
2017-07-01
Software effort estimation is the process of estimating software cost, an important step in carrying out a software project. Previous studies have estimated software effort with a variety of methods, both machine learning and non-machine-learning. This study conducts a set of attribute-selection experiments on project parameters, using the k-nearest neighbours technique as the estimator and selecting attributes with information gain and mutual information, and examines how to find the most representative project parameters for software effort estimation. The software effort estimation datasets used in the experiments are albrecht, china, kemerer and mizayaki94, which can be obtained from a data repository dedicated to software effort estimation at http://openscience.us/repo/effort/. The researchers then built an attribute-selection application to select project parameters. This system produces the selected dataset in ARFF format. The application was built in Java using the NetBeans IDE. The generated datasets contain the selected parameters, which are then compared when performing software effort estimation with the WEKA tool. Feature selection succeeded in lowering the estimation error (represented by the RAE and RMSE values); the lower the error (RAE and RMSE), the more accurate the resulting estimate. Estimation improved after feature selection with either information gain or mutual information. From the resulting error values it can be concluded that the datasets produced by feature selection with information gain are better than those from mutual information, although the difference between the two is not very significant.
Using K-Nearest Neighbor Classification to Diagnose Abnormal Lung Sounds
National Research Council Canada - National Science Library
Chen, Chin-Hsing; Huang, Wen-Tzeng; Tan, Tan-Hsu; Chang, Cheng-Chun; Chang, Yuan-Jen
2015-01-01
.... In this digital system, mel-frequency cepstral coefficients (MFCCs) were used to extract the features of lung sounds, and then the K-means algorithm was used for feature clustering, to reduce the amount of data...
Database Selection for Processing k Nearest Neighbors Queries in Distributed Environments.
Yu, Clement; Sharma, Prasoon; Meng, Weiyi; Qin, Yan
This paper considers the processing of digital library queries, consisting of a text component and a structured component in distributed environments. The paper concentrates on the processing of the structured component of a distributed query. A method is proposed to identify the databases that are likely to be useful for processing any given…
Ghinita, Gabriel
2010-12-15
Mobile devices with global positioning capabilities allow users to retrieve points of interest (POI) in their proximity. To protect user privacy, it is important not to disclose exact user coordinates to un-trusted entities that provide location-based services. Currently, there are two main approaches to protect the location privacy of users: (i) hiding locations inside cloaking regions (CRs) and (ii) encrypting location data using private information retrieval (PIR) protocols. Previous work focused on finding good trade-offs between privacy and performance of user protection techniques, but disregarded the important issue of protecting the POI dataset D. For instance, location cloaking requires large-sized CRs, leading to excessive disclosure of POIs (O({pipe}D{pipe}) in the worst case). PIR, on the other hand, reduces this bound to O(√{pipe}D{pipe}), but at the expense of high processing and communication overhead. We propose hybrid, two-step approaches for private location-based queries which provide protection for both the users and the database. In the first step, user locations are generalized to coarse-grained CRs which provide strong privacy. Next, a PIR protocol is applied with respect to the obtained query CR. To protect against excessive disclosure of POI locations, we devise two cryptographic protocols that privately evaluate whether a point is enclosed inside a rectangular region or a convex polygon. We also introduce algorithms to efficiently support PIR on dynamic POI sub-sets. We provide solutions for both approximate and exact NN queries. In the approximate case, our method discloses O(1) POI, orders of magnitude fewer than CR- or PIR-based techniques. For the exact case, we obtain optimal disclosure of a single POI, although with slightly higher computational overhead. Experimental results show that the hybrid approaches are scalable in practice, and outperform the pure-PIR approach in terms of computational and communication overhead. 
© 2010 Springer Science+Business Media, LLC.
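The geometric predicates that the abstract evaluates privately (whether a point lies inside a rectangular region or a convex polygon) can be sketched in plaintext as follows. This is only the underlying geometry, not the cryptographic protocols of the paper; the function names are hypothetical, and the polygon vertices are assumed to be listed in counter-clockwise order.

```python
def inside_rectangle(point, lower_left, upper_right):
    """Return True if `point` lies in the axis-aligned rectangle (boundary included)."""
    x, y = point
    return (lower_left[0] <= x <= upper_right[0]
            and lower_left[1] <= y <= upper_right[1])


def inside_convex_polygon(point, vertices):
    """Return True if `point` lies in a convex polygon (boundary included).

    Assumes `vertices` are given in counter-clockwise order.
    """
    px, py = point
    n = len(vertices)
    for i in range(n):
        ax, ay = vertices[i]
        bx, by = vertices[(i + 1) % n]
        # Cross product of edge (a -> b) with (a -> point):
        # negative means the point is to the right of the edge, i.e. outside.
        cross = (bx - ax) * (py - ay) - (by - ay) * (px - ax)
        if cross < 0:
            return False
    return True


# Hypothetical example region: a 2 x 2 square with counter-clockwise vertices.
square = [(0, 0), (2, 0), (2, 2), (0, 2)]
print(inside_convex_polygon((1, 1), square))
print(inside_rectangle((3, 1), (0, 0), (2, 2)))
```

The polygon test needs only one sign check per edge, which is the kind of simple arithmetic comparison a private protocol would have to evaluate without revealing the point.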
Multilayer Neural Networks and Nearest Neighbor Classifier Performances for Image Annotation
Directory of Open Access Journals (Sweden)
Mustapha OUJAOURA
2012-12-01
The explosive growth of image data has driven research and development of image content searching and indexing systems. Image annotation systems aim to automatically annotate an image with controlled keywords that can be used for indexing and retrieval. This paper presents a comparative evaluation of an image content annotation system using multilayer neural networks and the nearest-neighbour classifier. Region-growing segmentation is used to separate objects, and Hu moments, Legendre moments and Zernike moments are used as feature descriptors for image content characterization and annotation. The ETH-80 image database is used in the experiments. The best annotation rate is achieved by using Legendre moments as the feature extraction method and the multilayer neural network as the classifier.
Nearest-neighbor coordination and chemical ordering in multi-component bulk metallic glasses
Energy Technology Data Exchange (ETDEWEB)
Ma, Dong [ORNL; Stoica, Alexandru Dan [ORNL; Yang, Ling [ORNL; Wang, Xun-Li [ORNL; Lu, Zhao Ping [ORNL; Neuefeind, Joerg C [ORNL; Kramer, Matthew J [ORNL; Richardson, James W [Argonne National Laboratory (ANL); Proffen, Thomas E [ORNL
2007-01-01
We report complementary use of high energy x-ray and neutron diffraction to probe the local atomic structure in a Zr-based multi-component bulk metallic glass. By analyzing the partial coordination numbers, we demonstrate the presence of multiple types of solute-centered clusters (or the lack of solute-solute bonding) and efficient packing of the amorphous structure at the atomic scale. Our findings provide a basis for understanding how the local structures change during phase transformation and mechanical deformation.
Using Nearest Neighbor Information to Improve Cross-Language Text Classification
Escobar-Acevedo, Adelina; Montes-Y-Gómez, Manuel; Villaseñor-Pineda, Luis
Cross-language text classification (CLTC) aims to take advantage of existing training data from one language to construct a classifier for another language. In addition to the expected translation issues, CLTC is also complicated by the cultural distance between the two languages, which causes documents belonging to the same category to concern very different topics. This paper proposes a re-classification method whose purpose is to reduce the errors caused by this phenomenon by considering information from the target-language documents themselves. Experimental results on a news corpus considering three pairs of languages and four categories demonstrated the appropriateness of the proposed method, which could improve the initial classification accuracy by up to 11%.
A Comparison of Rule-Based, K-Nearest Neighbor, and Neural Net Classifiers for Automated
Tai-Hoon Cho; Richard W. Conners; Philip A. Araman
1991-01-01
Over the last few years the authors have been involved in research aimed at developing a machine vision system for locating and identifying surface defects on materials. The particular problem being studied involves locating surface defects on hardwood lumber in a species independent manner. Obviously, the accurate location and identification of defects is of paramount...
The APF Fifty: A Robotic Search for Earth’s Nearest Neighbors
Fulton, Benjamin; Howard, Andrew; Weiss, Lauren M.; Sinukoff, Evan; Marcy, Geoffrey W.; Isaacson, Howard T.; Alyse Hirsch, Lea
2015-12-01
With the Automated Planet Finder (APF) telescope, we are conducting a Doppler survey of a magnitude-limited sample of 51 nearby, chromospherically inactive, G and K dwarfs. This APF-50 survey is sensitive to planets with masses as low as 2-3 times the mass of the Earth and will measure small planet occurrence in the solar neighborhood. We expect to measure details of the planet mass function and to identify the nearby stars hosting low-mass planetary systems that will be the likely targets of follow-up measurements. We employ the robotic APF telescope to monitor the stars at high cadence for the duration of the survey. It builds on the Eta-Earth Survey at Keck Observatory, but with improved Doppler precision due to the high observing cadence and a larger number of measurements. We will measure the occurrence rate and mass function of small planets in our local neighborhood using the new planets discovered by the APF-50 survey and the set of planets already known to orbit stars in our sample. Combining the mass function from this survey with the size distribution from Kepler, we will probe the density and core mass properties of super-Earths to inform formation theories of the galaxy's most abundant planets.
Nearest Neighbor: The Low-Mass Milky Way Satellite Tucana III
Simon, J D; Drlica-Wagner, A; Bechtol, K; Marshall, J L; James, D J; Wang, M Y; Strigari, L; Balbinot, E; Kuehn, K; Walker, A R; Abbott, T M C; Allam, S; Annis, J; Benoit-Levy, A; Brooks, D; Buckley-Geer, E; Burke, D L; Rosell, A Carnero; Kind, M Carrasco; Carretero, J; Cunha, C E; D'Andrea, C B; da Costa, L N; DePoy, D L; Desai, S; Doel, P; Fernandez, E; Flaugher, B; Frieman, J; Garcia-Bellido, J; Gaztanaga, E; Goldstein, D A; Gruen, D; Gutierrez, G; Kuropatkin, N; Maia, M A G; Martini, P; Menanteau, F; Miller, C J; Miquel, R; Neilsen, E; Nord, B; Ogando, R; Plazas, A A; Romer, A K; Rykoff, E S; Sanchez, E; Santiago, B; Scarpine, V; Schubnell, M; Sevilla-Noarbe, I; Smith, R C; Sobreira, F; Suchyta, E; Swanson, M E C; Tarle, G; Whiteway, L; Yanny, B
2016-01-01
We present Magellan/IMACS spectroscopy of the recently discovered Milky Way satellite Tucana III (Tuc III). We identify 26 member stars in Tuc III, from which we measure a mean radial velocity of v_hel = -102.3 +/- 0.4 (stat.) +/- 2.0 (sys.) km/s, a velocity dispersion of 0.1^+0.7_-0.1 km/s, and a mean metallicity of [Fe/H] = -2.42^+0.07_-0.08. The upper limit on the velocity dispersion is sigma < 1.5 km/s at 95.5% confidence, and the corresponding upper limit on the mass within the half-light radius of Tuc III is 9.0 x 10^4 Msun. We cannot rule out mass-to-light ratios as large as 240 Msun/Lsun for Tuc III, but much lower mass-to-light ratios that would leave the system baryon-dominated are also allowed. We measure an upper limit on the metallicity spread of the stars in Tuc III of 0.19 dex at 95.5% confidence. Tuc III has a smaller metallicity dispersion and likely a smaller velocity dispersion than any known dwarf galaxy, but a larger size and lower surface brightness than any known globular cluster. It...
Tail estimates for one-dimensional non-nearest-neighbor random walk in random environment
Institute of Scientific and Technical Information of China (English)
无
2010-01-01
Suppose that the integers are assigned i.i.d. random variables $\{(\beta_{g,x}, \ldots, \beta_{1,x}, \alpha_x)\}$ (each taking values in the unit interval and the sum of them being 1), which serve as an environment. This environment defines a random walk $\{X_n\}$ (called RWRE) which, when at $x$, moves one step of length 1 to the right with probability $\alpha_x$ and one step of length $k$ to the left with probability $\beta_{k,x}$ for $1 \le k \le g$. For certain environment distributions, we determine the almost-sure asymptotic speed of the RWRE and show that the chance of the RWRE deviating below this speed has a polynomial rate of decay. This is the generalization of the results by Dembo, Peres and Zeitouni in 1996. In the proof we use a large deviation result for the product of random matrices and some tail estimates and moment estimates for the total population size in a multi-type branching process with random environment.
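The walk defined in this abstract can be simulated with a short sketch. The environment law here is an assumption made purely for illustration (g + 1 independent uniform draws per site, normalized to sum to 1); the abstract's results hold only for certain environment distributions, which it does not specify. The function name `simulate_rwre` is hypothetical.

```python
import random


def simulate_rwre(n_steps, g, seed=0):
    """Simulate a non-nearest-neighbor random walk in a random environment.

    At site x the walk steps +1 with probability alpha_x and -k with
    probability beta_{k,x} for 1 <= k <= g. The environment is sampled
    lazily, once per visited site, and then kept fixed.
    """
    rng = random.Random(seed)
    env = {}  # site -> [alpha_x, beta_{1,x}, ..., beta_{g,x}]
    x = 0
    path = [x]
    for _ in range(n_steps):
        if x not in env:
            # Assumed environment law: normalized uniform draws (illustration only).
            raw = [rng.random() for _ in range(g + 1)]
            total = sum(raw)
            env[x] = [r / total for r in raw]
        # Index 0 means a step of +1; index k >= 1 means a step of -k.
        move = rng.choices(range(g + 1), weights=env[x])[0]
        x += 1 if move == 0 else -move
        path.append(x)
    return path


path = simulate_rwre(n_steps=1000, g=2, seed=1)
# Empirical speed X_n / n for this single realization.
print(path[-1] / 1000)
```

A single run gives only a rough empirical speed; studying the deviation probabilities discussed in the abstract would require many independent environments and walks.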
Li, Peng; Su, Haibin; Dong, Hui-Ning; Shen, Shun-Qing
2009-08-12
We study a triangular frustrated antiferromagnetic Heisenberg model with nearest-neighbor interactions J1 and third-nearest-neighbor interactions J3 by means of Schwinger-boson mean-field theory. By setting an antiferromagnetic J3 and varying J1 from positive to negative values, we disclose the low-temperature features of its interesting incommensurate phase. The gapless dispersion of quasiparticles leads to the intrinsic T^2 law of specific heat. The magnetic susceptibility is linear in temperature. The local magnetization is significantly reduced by quantum fluctuations. We address possible relevance of these results to the low-temperature properties of NiGa2S4. From a careful analysis of the incommensurate spin wavevector, the interaction parameters are estimated as J1 ≈ -3.8755 K and J3 ≈ 14.0628 K, in order to account for the experimental data.
Bayesian Approach to Effective Model of NiGa2S4 Triangular Lattice with Boltzmann Factor
Takenaka, Hikaru; Nagata, Kenji; Mizokawa, Takashi; Okada, Masato
2016-12-01
We propose a method for inducting the Boltzmann factor to extract effective classical spin Hamiltonians from mean-field-type electronic structural calculations by means of the Bayesian inference. This method enables us to compare electronic structural calculations with experiments according to the classical model at a finite temperature. Application of this method to the unrestricted Hartree-Fock calculations for NiGa2S4 led to the estimation that the superexchange interaction between the nearest neighbor sites is ferromagnetic at low temperature, which is consistent with magnetic experiment results. This supports the theory that competition between the antiferromagnetic third neighbor interaction and ferromagnetic nearest neighbor interaction may lead to the quantum spin liquid in NiGa2S4.
Pineda, M.; Stamatakis, M.
2017-07-01
Modeling the kinetics of surface catalyzed reactions is essential for the design of reactors and chemical processes. The majority of microkinetic models employ mean-field approximations, which lead to an approximate description of catalytic kinetics by assuming spatially uncorrelated adsorbates. On the other hand, kinetic Monte Carlo (KMC) methods provide a discrete-space continuous-time stochastic formulation that enables an accurate treatment of spatial correlations in the adlayer, but at a significant computation cost. In this work, we use the so-called cluster mean-field approach to develop higher order approximations that systematically increase the accuracy of kinetic models by treating spatial correlations at a progressively higher level of detail. We further demonstrate our approach on a reduced model for NO oxidation incorporating first nearest-neighbor lateral interactions and construct a sequence of approximations of increasingly higher accuracy, which we compare with KMC and mean-field. The latter is found to perform rather poorly, overestimating the turnover frequency by several orders of magnitude for this system. On the other hand, our approximations, while more computationally intense than the traditional mean-field treatment, still achieve tremendous computational savings compared to KMC simulations, thereby opening the way for employing them in multiscale modeling frameworks.
The expanded triangular Kitaev–Heisenberg model in the full parameter space
Energy Technology Data Exchange (ETDEWEB)
Yao, Xiaoyan, E-mail: yaoxiaoyan@gmail.com
2014-06-13
The classical Kitaev–Heisenberg model on the triangular lattice is investigated by simulation in its full parameter space together with the next-nearest neighboring Heisenberg interaction or the single-ion anisotropy. The variation of the system is demonstrated directly by the joint density of states (DOS) depending on energy and magnetization obtained from Wang–Landau algorithm. The Metropolis Monte Carlo simulation and the zero-temperature Glauber dynamics are performed to show the internal energy, the correlation functions and spin configurations at zero temperature. It is revealed that two types of DOS (U and inverse U) divide the whole parameter range into two main parts with antiferromagnetic and ferromagnetic features respectively. In the parameter range of U type DOS, the mixed frustration from the triangular geometry and the Kitaev interaction produces rich phases, which are influenced in different ways by the next-nearest neighboring Heisenberg interaction and the single-ion anisotropy. - Highlights: • The expanded triangular Kitaev–Heisenberg model is investigated by simulation. • The density of states is shown in the full parameter space. • Rich low-temperature phases are induced by the mixed frustration. • The next nearest-neighboring Heisenberg interaction influences the phases. • The single-ion anisotropy modifies the shape of the density of states.
Mu, Yan; Gao, Yi Qin
2007-09-01
We studied the effects of hydrophobicity and dipole-dipole interactions between the nearest-neighbor amide planes on the secondary structures of a model polypeptide by calculating the free energy differences between different peptide structures. The free energy calculations were performed with low computational costs using the accelerated Monte Carlo simulation (umbrella sampling) method, with a bias-potential method used earlier in our accelerated molecular dynamics simulations. It was found that the hydrophobic interaction enhances the stability of α helices at both low and high temperatures but stabilizes β structures only at high temperatures at which α helices are not stable. The nearest-neighbor dipole-dipole interaction stabilizes β structures under all conditions, especially in the low temperature region where α helices are the stable structures. Our results indicate clearly that the dipole-dipole interaction between the nearest neighboring amide planes plays an important role in determining the peptide structures. Current research provides a more unified and quantitative picture for understanding the effects of different forms of interactions on polypeptide structures. In addition, the present model can be extended to describe DNA/RNA, polymer, copolymer, and other chain systems.
Robust Supersolidity in the V1- V2 Extended Bose-Hubbard Model
Greene, Nicole; Pixley, Jedediah
2016-05-01
Motivated by ultra-cold atomic gases with long-range interactions in an optical lattice we study the effects of the next-nearest neighbor interaction on the extended Bose-Hubbard model on a square lattice. Using the variational Gutzwiller approach with a four-site unit cell we determine the ground state phase diagrams as a function of the model parameters. We focus on the interplay of each interaction between the nearest neighbor (V1) , the next-nearest neighbor (V2) , and the onsite repulsion (U). We find various super-solid phases that can be described by one of the ordering wave-vectors (π, 0), (0, π) , and (π, π) . In the limits V1, V2 U we find phases reminiscent of the limit V2 = 0 but with a richer super solid structure. For V1
Hiebeler, David E; Millett, Nicholas E
2011-06-21
We investigate a spatial lattice model of a population employing dispersal to nearest and second-nearest neighbors, as well as long-distance dispersal across the landscape. The model is studied via stochastic spatial simulations, ordinary pair approximation, and triplet approximation. The latter method, which uses the probabilities of state configurations of contiguous blocks of three sites as its state variables, is demonstrated to be greatly superior to pair approximations for estimating spatial correlation information at various scales. Correlations between pairs of sites separated by arbitrary distances are estimated by constructing spatial Markov processes using the information from both approximations. These correlations demonstrate why pair approximation misses basic qualitative features of the model, such as decreasing population density as a large proportion of offspring are dropped on second-nearest neighbors, and why triplet approximation is able to include them. Analytical and numerical results show that, excluding long-distance dispersal, the initial growth rate of an invading population is maximized and the equilibrium population density is also roughly maximized when the population spreads its offspring evenly over nearest and second-nearest neighboring sites.
Incorporating Enterprise Risk Management in the Business Model Innovation Process
Yariv Taran; Harry Boer; Peter Lindgren
2013-01-01
Purpose: Relative to other types of innovations, little is known about business model innovation, let alone the process of managing the risks involved in that process. Using the emerging (enterprise) risk management literature, an approach is proposed through which risk management can be embedded in the business model innovation process. Design: The integrated business model innovation risk management model developed in this paper has been tested through an action research study in a Dani...
Incorporating inductances in tissue-scale models of cardiac electrophysiology
Rossi, Simone; Griffith, Boyce E.
2017-09-01
In standard models of cardiac electrophysiology, including the bidomain and monodomain models, local perturbations can propagate at infinite speed. We address this unrealistic property by developing a hyperbolic bidomain model that is based on a generalization of Ohm's law with a Cattaneo-type model for the fluxes. Further, we obtain a hyperbolic monodomain model in the case that the intracellular and extracellular conductivity tensors have the same anisotropy ratio. In one spatial dimension, the hyperbolic monodomain model is equivalent to a cable model that includes axial inductances, and the relaxation times of the Cattaneo fluxes are strictly related to these inductances. A purely linear analysis shows that the inductances are negligible, but models of cardiac electrophysiology are highly nonlinear, and linear predictions may not capture the fully nonlinear dynamics. In fact, contrary to the linear analysis, we show that for simple nonlinear ionic models, an increase in conduction velocity is obtained for small and moderate values of the relaxation time. A similar behavior is also demonstrated with biophysically detailed ionic models. Using the Fenton-Karma model along with a low-order finite element spatial discretization, we numerically analyze differences between the standard monodomain model and the hyperbolic monodomain model. In a simple benchmark test, we show that the propagation of the action potential is strongly influenced by the alignment of the fibers with respect to the mesh in both the parabolic and hyperbolic models when using relatively coarse spatial discretizations. Accurate predictions of the conduction velocity require computational mesh spacings on the order of a single cardiac cell. We also compare the two formulations in the case of spiral break up and atrial fibrillation in an anatomically detailed model of the left atrium, and we examine the effect of intracellular and extracellular inductances on the virtual electrode phenomenon.
Seasonal variation in survival and reproduction can be a large source of prediction uncertainty in models used for conservation and management. A seasonally varying matrix population model is developed that incorporates temperature-driven differences in mortality and reproduction...
Directory of Open Access Journals (Sweden)
F Keshavarz
2017-02-01
In this study, we consider the effect of four-spin exchanges between nearest and next-nearest-neighbor spins of the honeycomb lattice on the phase diagram of the S=3/2 antiferromagnetic Heisenberg model with two-spin exchanges between nearest and next-nearest-neighbor spins. First, the model is investigated using the classical phase diagram. In the classical phase diagram, in addition to Néel order, a classically degenerate phase is also seen. The existence of this phase in the phase diagram is important because a quantum spin liquid may exist in this region for such interaction strengths. To investigate the effect of quantum fluctuations on the stability of the classical phase diagram, linear spin wave theory is used. The results show that in the classically degenerate regime, quantum fluctuations cause order by disorder in the spin system and the ground state is ordered
Butera, P
2002-01-01
We present an on-line library of unprecedented extension for high-temperature expansions of basic observables in the Ising models of general spin S, with nearest-neighbor interactions. We have tabulated through order beta^{25} the series for the nearest-neighbor correlation function, the susceptibility and the second correlation moment in two dimensions on the square lattice, and, in three dimensions, on the simple-cubic and the body-centered cubic lattices. The expansion of the second field derivative of the susceptibility is also tabulated through beta^{23} for the same lattices. We have thus added several terms (from four up to thirteen) to the series already published for spin S=1/2,1,3/2,2,5/2,3,7/2,4,5,infinity.
Multiplicity Control in Structural Equation Modeling: Incorporating Parameter Dependencies
Smith, Carrie E.; Cribbie, Robert A.
2013-01-01
When structural equation modeling (SEM) analyses are conducted, significance tests for all important model relationships (parameters including factor loadings, covariances, etc.) are typically conducted at a specified nominal Type I error rate ([alpha]). Despite the fact that many significance tests are often conducted in SEM, rarely is…
Incorporating Enterprise Risk Management in the Business Model Innovation Process
Directory of Open Access Journals (Sweden)
Yariv Taran
2013-12-01
Purpose: Relative to other types of innovations, little is known about business model innovation, let alone the process of managing the risks involved in that process. Using the emerging (enterprise) risk management literature, an approach is proposed through which risk management can be embedded in the business model innovation process. Design: The integrated business model innovation risk management model developed in this paper has been tested through an action research study in a Danish company. Findings: The study supports our proposition that the implementation of risk management throughout the innovation process reduces the risks related to the uncertainty and complexity of developing and implementing a new business model. Originality: The study supports the proposition that the implementation of risk management throughout the innovation process reduces the risks related to the uncertainty and complexity of developing and implementing a new business model. The business model risk management model makes managers much more focused on identifying problematic issues and putting explicit plans and timetables into place for resolving/reducing risks, and assists companies in aligning the risk treatment choices made during the
A Constrained CA Model for Planning Simulation Incorporating Institutional Constraints
Institute of Scientific and Technical Information of China (English)
2010-01-01
In recent years, it has become common to simulate urban growth by means of cellular automata (CA) modeling, which is based on self-organizing theories and differs from system dynamics modeling. Since the urban system is definitely complex, the CA models applied in urban growth simulation should take into consideration not only the neighborhood influence but also other factors influencing urban development. We bring forward the term complex constrained CA (CC-CA) model, which integrates the constrained conditions of neighborhood, macro socio-economy, space and institution. In particular, constrained construction zoning, as one institutional constraint, is considered in the CC-CA modeling. In the paper, the conceptual CC-CA model is introduced together with the transition rules. Based on the CC-CA model for Beijing, we discuss the complex constraints on the city's urban development, and we show how to set institutional constraints in planning scenarios to control the urban growth pattern of Beijing.
Modelling of Permanent Magnet Synchronous Motor Incorporating Core-loss
Directory of Open Access Journals (Sweden)
K. Suthamno
2012-08-01
This study proposes a dq-axis model of a Permanent Magnet Synchronous Motor (PMSM) with copper loss and core loss taken into account. The proposed models can be applied to PMSM control and drive with simultaneous consideration of loss minimization. The study presents simulation results of direct drive of a PMSM under no-load and loaded conditions using the proposed models with MATLAB codes. The results are compared with those obtained from the PSIM and SIMULINK software packages. The comparison indicates very good agreement.
Energy Technology Data Exchange (ETDEWEB)
Rubin, P., E-mail: rubin@fi.tartu.ee; Sherman, A.
2014-11-07
The spin-1 Heisenberg model on a triangular lattice with the ferromagnetic nearest-neighbor and antiferromagnetic third-nearest-neighbor exchange interactions, J1=−(1−p)J and J2=pJ, J>0 (0≤p≤1), is studied with the use of the SPINPACK code. This model is applicable for the description of the magnetic properties of NiGa2S4. The ground, low-lying excited state energies and spin-spin correlation functions have been found for lattices with N=16 and N=20 sites with the periodic boundary conditions. These results are in qualitative agreement with earlier authors' results obtained with Mori's projection operator technique. - Highlights: • The S=1 J1–J3 Heisenberg model on a triangular lattice is studied. • The ferromagnetic nearest and AF 3rd-nearest-neighbor couplings are considered. • The exact diagonalization study of finite lattices was done. • The SPINPACK code using Lanczos' method is used for calculations. • The obtained results are in agreement with those obtained by Mori's approach.
Castin, N.; Messina, L.; Domain, C.; Pasianot, R. C.; Olsson, P.
2017-06-01
We significantly improve the physical models underlying atomistic Monte Carlo (MC) simulations, through the use of ab initio fitted high-dimensional neural network potentials (NNPs). In this way, we can incorporate energetics derived from density functional theory (DFT) in MC, and avoid using empirical potentials that are very challenging to design for complex alloys. We take significant steps forward from a recent work where artificial neural networks (ANNs), exclusively trained on DFT vacancy migration energies, were used to perform kinetic MC simulations of Cu precipitation in Fe. Here, a more extensive transfer of knowledge from DFT to our cohesive model is achieved via the fitting of NNPs, aimed at accurately mimicking the most important aspects of the ab initio predictions. Rigid-lattice potentials are designed to monitor the evolution during the simulation of the system energy, thus taking care of the thermodynamic aspects of the model. In addition, other ANNs are designed to evaluate the activation energies associated with the MC events (migration towards first-nearest-neighbor positions of single point defects), thereby providing an accurate kinetic modeling. Because our methodology inherently requires the calculation of a substantial amount of reference data, we design as well lattice-free potentials, aimed at replacing the very costly DFT method with an approximate, yet accurate and considerably more computationally efficient, potential. The binary FeCu and FeCr alloys are taken as sample applications considering the extensive literature covering these systems.
Chen, Yi-Ying; Chu, Chia-Ren; Li, Ming-Hsu
2012-10-01
In this paper we present a semi-parametric multivariate gap-filling model for tower-based measurement of latent heat flux (LE). Two statistical techniques, principal component analysis (PCA) and a nonlinear interpolation approach, were integrated into this LE gap-filling model. The PCA was first used to resolve the multicollinearity relationships among various environmental variables, including radiation, soil moisture deficit, leaf area index, wind speed, etc. Two nonlinear interpolation methods, multiple regressions (MRS) and K-nearest neighbors (KNN), were examined with randomly selected flux gaps for both clear-sky and nighttime/cloudy data for incorporation into this LE gap-filling model. Experimental results indicated that the KNN interpolation approach is able to provide consistent LE estimates, while MRS overestimates during nighttime/cloudy conditions. Rather than using empirical regression parameters, the KNN approach resolves the nonlinear relationship between the gap-filled LE flux and principal components with adaptive K values under different atmospheric states. The developed LE gap-filling model (PCA with KNN) works with an RMSE of 2.4 W m⁻² (~0.09 mm day⁻¹) at a weekly time scale when 40% artificial flux gaps are added to the original dataset. Annual evapotranspiration at this study site was estimated at 736 mm (1803 MJ) and 728 mm (1785 MJ) for 2008 and 2009, respectively.
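The PCA-then-KNN idea described in this abstract can be sketched in a few lines. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the function names `pca_scores` and `knn_fill` are hypothetical, missing flux values are assumed to be marked as NaN, and a fixed `k` is used, whereas the abstract describes adaptive K values.

```python
import numpy as np

def pca_scores(X, n_components):
    """Project standardized drivers onto their leading principal components."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    # Rows of vt are the principal axes of the standardized data.
    _, _, vt = np.linalg.svd(Z, full_matrices=False)
    return Z @ vt[:n_components].T

def knn_fill(scores, y, k=5):
    """Fill NaN entries of y with the mean of the k nearest observed samples.

    Distances are measured in principal-component space. A fixed k is an
    assumption of this sketch; the abstract describes adaptive K values.
    """
    y = np.asarray(y, dtype=float).copy()
    known = ~np.isnan(y)
    for i in np.flatnonzero(~known):
        dist = np.linalg.norm(scores[known] - scores[i], axis=1)
        nearest = np.argsort(dist)[:k]
        y[i] = y[known][nearest].mean()
    return y
```

Only observed samples serve as neighbors, so filled values never feed back into later estimates.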
Markov modulated Poisson process models incorporating covariates for rainfall intensity.
Thayakaran, R; Ramesh, N I
2013-01-01
Time series of rainfall bucket tip times at the Beaufort Park station, Bracknell, in the UK are modelled by a class of Markov modulated Poisson processes (MMPP) which may be thought of as a generalization of the Poisson process. Our main focus in this paper is to investigate the effects of including covariate information into the MMPP model framework on statistical properties. In particular, we look at three types of time-varying covariates namely temperature, sea level pressure, and relative humidity that are thought to be affecting the rainfall arrival process. Maximum likelihood estimation is used to obtain the parameter estimates, and likelihood ratio tests are employed in model comparison. Simulated data from the fitted model are used to make statistical inferences about the accumulated rainfall in the discrete time interval. Variability of the daily Poisson arrival rates is studied.
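To make the MMPP idea concrete, the following sketch simulates event times from a two-state Markov modulated Poisson process. It is a minimal illustration, not the fitted model from the paper: the function name `simulate_mmpp`, the two-state restriction, and the parameter values in the usage note are assumptions, and covariates are omitted entirely.

```python
import numpy as np

def simulate_mmpp(rates, switch_rates, t_end, seed=0):
    """Simulate arrival times of a two-state MMPP on [0, t_end].

    rates[s]        : Poisson arrival rate while the hidden chain is in state s
    switch_rates[s] : rate at which the hidden chain leaves state s
    """
    rng = np.random.default_rng(seed)
    state, t = 0, 0.0
    events = []
    while t < t_end:
        # Exponential sojourn time in the current hidden state.
        sojourn = rng.exponential(1.0 / switch_rates[state])
        seg_end = min(t + sojourn, t_end)
        # Within a sojourn the arrivals form a homogeneous Poisson process:
        # draw the count, then place the points uniformly in the segment.
        n = rng.poisson(rates[state] * (seg_end - t))
        events.extend(np.sort(rng.uniform(t, seg_end, size=n)))
        t = seg_end
        state = 1 - state  # two states only, so switching toggles the state
    return np.array(events)
```

For example, `simulate_mmpp((0.5, 5.0), (0.1, 0.2), 1000.0)` alternates between a quiet regime and a bursty regime, which is the qualitative behavior an MMPP adds over a plain Poisson process.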
Incorporating concern for relative wealth into economic models
1995-01-01
This article develops a simple model that captures a concern for relative standing, or status. This concern is instrumental, in the sense that individuals do not get utility directly from their relative standing, but, rather, the concern is induced because their relative standing affects their consumption of standard commodities. The article investigates the consequences of a concern for relative wealth in models in which individuals are making labor/leisure decisions. The analysis shows how ...
Zacharof, A I; Butler, A P
2004-01-01
A mathematical model simulating the hydrological and biochemical processes occurring in landfilled waste is presented and demonstrated. The model combines biochemical and hydrological models into an integrated representation of the landfill environment. Waste decomposition is modelled using traditional biochemical waste decomposition pathways combined with a simplified methodology for representing the rate of decomposition. Water flow through the waste is represented using a statistical velocity model capable of representing the effects of waste heterogeneity on leachate flow through the waste. Given the limitations in data capture from landfill sites, significant emphasis is placed on improving parameter identification and reducing parameter requirements. A sensitivity analysis is performed, highlighting the model's response to changes in input variables. A model test run is also presented, demonstrating the model capabilities. A parameter perturbation model sensitivity analysis was also performed. This has been able to show that although the model is sensitive to certain key parameters, its overall intuitive response provides a good basis for making reasonable predictions of the future state of the landfill system. Finally, due to the high uncertainty associated with landfill data, a tool for handling input data uncertainty is incorporated in the model's structure. It is concluded that the model can be used as a reasonable tool for modelling landfill processes and that further work should be undertaken to assess the model's performance.
The incorporation and validation of empirical crawling data into the buildingEXODUS model
Muhdi, Rani; Gwynne, Steve; Davis, Jerry
2009-01-01
The deterioration of environmental conditions can influence evacuee decisions and their subsequent behaviors. Simulating evacuee behaviors enhances the robustness of engineering procedural designs, improves the accuracy of egress models, and better evaluates the safety of evacuees. The purpose of this paper is to more accurately incorporate and validate evacuee crawling behavior into the buildingEXODUS egress model. Crawling data were incorporated into the model and tested for accurate repres...
Modelling toluene oxidation : Incorporation of mass transfer phenomena
Hoorn, J.A.A.; van Soolingen, J.; Versteeg, G. F.
2005-01-01
The kinetics of the oxidation of toluene have been studied in close interaction with the gas-liquid mass transfer occurring in the reactor. Kinetic parameters for a simple model have been estimated on basis of experimental observations performed under industrial conditions. The conclusions for the m
Incorporating Uncertainties in Satellite-Derived Chlorophyll into Model Forecasts
2012-10-01
radiances in the seven visible MODIS channels used in the estimation of the bio-optical products, such as chlorophyll, absorption and backscattering...grazers, nitrate, silicate, ammonium, and two detritus pools. Phytoplankton photosynthesis in the biochemical model is driven by Photosynthetically
Day-to-day route choice modeling incorporating inertial behavior
Essen, van M.A.; Rakha, H.; Vreeswijk, J.D.; Wismans, L.J.J.; Berkum, van E.C.
2015-01-01
Accurate route choice modeling is one of the most important aspects when predicting the effects of transport policy and dynamic traffic management. Moreover, the effectiveness of intervention measures to a large extent depends on travelers’ response to the changes these measures cause. As a compleme
Workforce scheduling: A new model incorporating human factors
Directory of Open Access Journals (Sweden)
Mohammed Othman
2012-12-01
Purpose: The majority of a company’s improvement comes when the right workers with the right skills, behaviors and capacities are deployed appropriately throughout a company. This paper considers a workforce scheduling model including human aspects such as skills, training, workers’ personalities, workers’ breaks and workers’ fatigue and recovery levels. This model helps to minimize the hiring, firing, training and overtime costs, minimize the number of fired workers with high performance, minimize the break time and minimize the average worker’s fatigue level. Design/methodology/approach: To achieve this objective, a multi-objective mixed integer programming model is developed to determine the amount of hiring, firing, training and overtime for each worker type. Findings: The results indicate that the worker differences should be considered in workforce scheduling to generate realistic plans with minimum costs. This paper also investigates the effects of human fatigue and recovery on the performance of the production systems. Research limitations/implications: In this research, there are some assumptions that might affect the accuracy of the model, such as the assumption of certainty of the demand in each period and the linearity of the fatigue accumulation and recovery curves. These assumptions can be relaxed in future work. Originality/value: In this research, a new model for integrating workers’ differences with workforce scheduling is proposed. To the authors' knowledge, this is the first study of the effects of different important human factors such as human personality, skills and fatigue and recovery in the workforce scheduling process. This research shows that considering both technical and human factors together can reduce the costs in manufacturing systems and ensure the safety of the workers.
Incorporating Satellite Time-Series Data into Modeling
Gregg, Watson
2008-01-01
In situ time series observations have provided a multi-decadal view of long-term changes in ocean biology. These observations are sufficiently reliable to enable discernment of even relatively small changes, and provide continuous information on a host of variables. Their key drawback is their limited domain. Satellite observations from ocean color sensors do not suffer the drawback of domain, and simultaneously view the global oceans. This attribute lends credence to their use in global and regional model validation and data assimilation. We focus on these applications using the NASA Ocean Biogeochemical Model. The enhancement of the satellite data using data assimilation is featured and the limitation of long-term satellite data sets is also discussed.
Aircraft conceptual design modelling incorporating reliability and maintainability predictions
Vaziry-Zanjany, Mohammad Ali (F)
1996-01-01
A computer assisted conceptual aircraft design program has been developed (CACAD). It has an optimisation capability, with extensive break-down in maintenance costs. CACAD's aim is to optimise the size, and configurations of turbofan-powered transport aircraft. A methodology was developed to enhance the reliability of current aircraft systems, and was applied to avionics systems. R&M models of thermal management were developed and linked with avionics failure rate and its ma...
Incorporating nucleosomes into thermodynamic models of transcription regulation.
Raveh-Sadka, Tali; Levo, Michal; Segal, Eran
2009-08-01
Transcriptional control is central to many cellular processes, and, consequently, much effort has been devoted to understanding its underlying mechanisms. The organization of nucleosomes along promoter regions is important for this process, since most transcription factors cannot bind nucleosomal sequences and thus compete with nucleosomes for DNA access. This competition is governed by the relative concentrations of nucleosomes and transcription factors and by their respective sequence binding preferences. However, despite its importance, a mechanistic understanding of the quantitative effects that the competition between nucleosomes and factors has on transcription is still missing. Here we use a thermodynamic framework based on fundamental principles of statistical mechanics to explore theoretically the effect that different nucleosome organizations along promoters have on the activation dynamics of promoters in response to varying concentrations of the regulating factors. We show that even simple landscapes of nucleosome organization reproduce experimental results regarding the effect of nucleosomes as general repressors and as generators of obligate binding cooperativity between factors. Our modeling framework also allows us to characterize the effects that various sequence elements of promoters have on the induction threshold and on the shape of the promoter activation curves. Finally, we show that using only sequence preferences for nucleosomes and transcription factors, our model can also predict expression behavior of real promoter sequences, thereby underscoring the importance of the interplay between nucleosomes and factors in determining expression kinetics.
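The claim that nucleosomes can generate binding cooperativity between factors follows from a simple statistical-weight argument, sketched below for a toy promoter. This is an assumption-laden illustration rather than the authors' model: the promoter has just two factor sites A and B, a single nucleosome configuration covers both sites, the factors do not interact directly, and the names `promoter_state_probabilities` and `cooperativity` are hypothetical. The weights `a`, `b`, and `w_nuc` stand for concentration times binding constant, relative to the empty nucleosome-free state.

```python
def promoter_state_probabilities(a, b, w_nuc):
    """Boltzmann probabilities of the five states of the toy promoter."""
    weights = {
        "nucleosome": w_nuc,  # nucleosome covers both sites, no factor bound
        "empty": 1.0,         # nucleosome-free, no factor bound (reference)
        "A": a,               # only factor A bound
        "B": b,               # only factor B bound
        "AB": a * b,          # both bound; no direct A-B interaction assumed
    }
    z = sum(weights.values())  # partition function
    return {state: w / z for state, w in weights.items()}

def cooperativity(a, b, w_nuc):
    """Return P(A and B) / (P(A) * P(B)); 1.0 means independent binding."""
    p = promoter_state_probabilities(a, b, w_nuc)
    p_a = p["A"] + p["AB"]
    p_b = p["B"] + p["AB"]
    return p["AB"] / (p_a * p_b)
```

In this toy setting the ratio works out to 1 + w_nuc / ((1 + a)(1 + b)), so any nonzero nucleosome weight makes the two factors appear cooperative even though they never interact directly.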
Models of microbiome evolution incorporating host and microbial selection.
Zeng, Qinglong; Wu, Steven; Sukumaran, Jeet; Rodrigo, Allen
2017-09-25
Numerous empirical studies suggest that hosts and microbes exert reciprocal selective effects on their ecological partners. Nonetheless, we still lack an explicit framework to model the dynamics of both hosts and microbes under selection. In a previous study, we developed an agent-based forward-time computational framework to simulate the neutral evolution of host-associated microbial communities in a constant-sized, unstructured population of hosts. These neutral models allowed offspring to sample microbes randomly from parents and/or from the environment. Additionally, the environmental pool of available microbes was constituted by fixed and persistent microbial OTUs and by contributions from host individuals in the preceding generation. In this paper, we extend our neutral models to allow selection to operate on both hosts and microbes. We do this by constructing a phenome for each microbial OTU consisting of a sample of traits that influence host and microbial fitnesses independently. Microbial traits can influence the fitness of hosts ("host selection") and the fitness of microbes ("trait-mediated microbial selection"). Additionally, the fitness effects of traits on microbes can be modified by their hosts ("host-mediated microbial selection"). We simulate the effects of these three types of selection, individually or in combination, on microbiome diversities and the fitnesses of hosts and microbes over several thousand generations of hosts. We show that microbiome diversity is strongly influenced by selection acting on microbes. Selection acting on hosts only influences microbiome diversity when there is near-complete direct or indirect parental contribution to the microbiomes of offspring. Unsurprisingly, microbial fitness increases under microbial selection. Interestingly, when host selection operates, host fitness only increases under two conditions: (1) when there is a strong parental contribution to microbial communities or (2) in the absence of a strong
Amphiphilic poly-N-vinylpyrrolidone nanocarriers with incorporated model proteins
Energy Technology Data Exchange (ETDEWEB)
Kuskov, A N [Department of Polymers, D I Mendeleyev University of Chemical Technology, 9 Miusskaya Square, Moscow 125047 (Russian Federation); Villemson, A L [Department of Chemistry, M V Lomonosov Moscow State University, 119992 Moscow (Russian Federation); Shtilman, M I [Department of Polymers, D I Mendeleyev University of Chemical Technology, 9 Miusskaya Square, Moscow 125047 (Russian Federation); Larionova, N I [Department of Chemistry, M V Lomonosov Moscow State University, 119992 Moscow (Russian Federation); Tsatsakis, A M [Medical School, University of Crete, Voutes, 71409 Heraklion, Crete (Greece); Tsikalas, I [Department of Chemistry and Foundation for Research and Technology-Hellas (FORTH), University of Crete, PO Box 2208, Heraklion 71003, Crete (Greece); Rizos, A K [Department of Chemistry and Foundation for Research and Technology-Hellas (FORTH), University of Crete, PO Box 2208, Heraklion 71003, Crete (Greece)
2007-05-23
New nanoscaled polymeric carriers have been prepared on the basis of different amphiphilic water-soluble derivatives of poly-N-vinylpyrrolidone (PVP). The polymer self-assembly and interaction with model proteins (Bowman-Birk soybean proteinase inhibitor (BBI) and its hydrophobized derivatives) were studied in aqueous media. The possibility of inclusion of both BBI and hydrophobized oleic acid derivatives of BBI in amphiphilic PVP aggregates was investigated. It was ascertained that polymeric particles of size 50-80 nm were formed in certain concentrations of amphiphilic PVP and poorly soluble dioleic acid derivatives of BBI. Such polymeric aggregates are capable of solubilization of dioleoyl BBI with a concomitant prevention of its inactivation at low pH values.
Incorporating flood event analyses and catchment structures into model development
Oppel, Henning; Schumann, Andreas
2016-04-01
The space-time variability in catchment response results from several hydrological processes which differ in their relevance in an event-specific way. One approach to characterising this variance consists of comparisons between flood events in a catchment and between flood responses of several sub-basins in such an event. In analytical frameworks, the impact of space and time variability of rainfall on runoff generation due to rainfall excess can be characterised. Moreover, the effect of hillslope and channel network routing on runoff timing can be specified. Hence, a modelling approach is needed to specify the runoff generation and formation. Knowing the space-time variability of rainfall and the (spatially averaged) response of a catchment, it seems worthwhile to develop new models based on event and catchment analyses. The consideration of spatial order and the distribution of catchment characteristics in their spatial variability and interaction with the space-time variability of rainfall provides additional knowledge about hydrological processes at the basin scale. For this purpose, a new procedure to characterise the spatial heterogeneity of catchment characteristics in their succession along the flow distance (differentiated between river network and hillslopes) was developed. It was applied to the study of flood responses at a set of nested catchments in a river basin in eastern Germany. In this study the highest observed rainfall-runoff events were analysed, beginning at the catchment outlet and moving upstream. With regard to the spatial heterogeneities of catchment characteristics, sub-basins were separated by new algorithms to attribute runoff-generation, hillslope and river network processes. With this procedure the cumulative runoff response at the outlet can be decomposed and individual runoff features can be assigned to individual aspects of the catchment. Through comparative analysis between the sub-catchments and the assigned effects on runoff dynamics new
Crowther, Michael J; Andersson, Therese M-L; Lambert, Paul C; Abrams, Keith R; Humphreys, Keith
2016-03-30
A now common goal in medical research is to investigate the inter-relationships between a repeatedly measured biomarker, measured with error, and the time to an event of interest. This form of question can be tackled with a joint longitudinal-survival model, with the most common approach combining a longitudinal mixed effects model with a proportional hazards survival model, where the models are linked through shared random effects. In this article, we look at incorporating delayed entry (left truncation), which has received relatively little attention. The extension to delayed entry requires a second set of numerical integration, beyond that required in a standard joint model. We therefore implement two sets of fully adaptive Gauss-Hermite quadrature with nested Gauss-Kronrod quadrature (to allow time-dependent association structures), conducted simultaneously, to evaluate the likelihood. We evaluate fully adaptive quadrature compared with previously proposed non-adaptive quadrature through a simulation study, showing substantial improvements, both in terms of minimising bias and reducing computation time. We further investigate, through simulation, the consequences of misspecifying the longitudinal trajectory and its impact on estimates of association. Our scenarios showed the current value association structure to be very robust, compared with the rate of change that we found to be highly sensitive showing that assuming a simpler trend when the truth is more complex can lead to substantial bias. With emphasis on flexible parametric approaches, we generalise previous models by proposing the use of polynomials or splines to capture the longitudinal trend and restricted cubic splines to model the baseline log hazard function. The methods are illustrated on a dataset of breast cancer patients, modelling mammographic density jointly with survival, where we show how to incorporate density measurements prior to the at-risk period, to make use of all the available
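The difference between non-adaptive and adaptive Gauss-Hermite quadrature mentioned in this abstract can be illustrated in one dimension. The sketch below is not the authors' implementation: it handles a single normally distributed random effect, omits the nested Gauss-Kronrod step, and the function names `gh_expectation` and `adaptive_gh_integral` are hypothetical. It relies only on `numpy.polynomial.hermite.hermgauss`, which returns nodes and weights for the weight function exp(-x^2).

```python
import numpy as np

def gh_expectation(g, sigma, n_nodes=15):
    """Non-adaptive rule: E[g(b)] for b ~ N(0, sigma^2).

    Nodes are centred at zero and scaled by the prior standard deviation.
    """
    x, w = np.polynomial.hermite.hermgauss(n_nodes)
    return np.sum(w * g(np.sqrt(2.0) * sigma * x)) / np.sqrt(np.pi)

def adaptive_gh_integral(f, center, scale, n_nodes=5):
    """Adaptive rule: integral of f(b) db with nodes moved to the integrand.

    center and scale would typically be the mode and curvature-based spread
    of f for one subject; in this sketch the caller supplies them.
    """
    x, w = np.polynomial.hermite.hermgauss(n_nodes)
    b = center + np.sqrt(2.0) * scale * x
    # exp(x**2) cancels the built-in Gauss-Hermite weight function.
    return np.sqrt(2.0) * scale * np.sum(w * np.exp(x ** 2) * f(b))
```

When the integrand is close to a Gaussian with the supplied center and scale, the adaptive rule needs very few nodes, which is one plausible reading of why adaptive quadrature can reduce both bias and computation time.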
Ahmad Fauzi, Mohammad Faizal; Gokozan, Hamza Numan; Elder, Brad; Puduvalli, Vinay K.; Otero, Jose J.; Gurcan, Metin N.
2014-03-01
Brain cancer surgery requires intraoperative consultation by neuropathology to guide surgical decisions regarding the extent to which the tumor undergoes gross total resection. In this context, the differential diagnosis between glioblastoma and metastatic cancer is challenging as the decision must be made during surgery in a short time-frame (typically 30 minutes). We propose a method to classify glioblastoma versus metastatic cancer based on extracting textural features from the non-nuclei region of cytologic preparations. For glioblastoma, these regions of interest are filled with glial processes between the nuclei, which appear as anisotropic thin linear structures. For metastasis, these regions have a more homogeneous appearance, so suitable texture features can be extracted from them to distinguish between the two tissue types. In our work, we use Discrete Wavelet Frames to characterize the underlying texture because of their multi-resolution capability. The textural characterization is carried out primarily in the non-nuclei regions after nuclei regions are segmented by adapting our visually meaningful decomposition segmentation algorithm to this problem. The k-nearest neighbor method was then used to classify the features into the glioblastoma or metastasis class. Experiments on 53 images (29 glioblastomas and 24 metastases) resulted in average accuracy as high as 89.7% for glioblastoma, 87.5% for metastasis and 88.7% overall. Further studies are underway to incorporate nuclei region features into classification on an expanded dataset, as well as expanding the classification to more types of cancers.
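The k-nearest neighbor step mentioned in the abstract above can be sketched as follows. This is only an illustration under stated assumptions: the two-component feature vectors, the choice k=3, the Euclidean distance, and the function name `knn_predict` are hypothetical and are not taken from the study, which used Discrete Wavelet Frame features.

```python
from collections import Counter
import math


def knn_predict(train_features, train_labels, query, k=3):
    """Return the majority label among the k training vectors closest to query."""
    # Euclidean distance from the query to every training vector.
    distances = [
        (math.dist(features, query), label)
        for features, label in zip(train_features, train_labels)
    ]
    # Keep the k nearest neighbors and take a majority vote over their labels.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical 2-D texture features; these are NOT data from the study.
train_features = [(0.90, 0.20), (0.80, 0.30), (0.85, 0.25),
                  (0.30, 0.70), (0.20, 0.80), (0.25, 0.75)]
train_labels = ["glioblastoma"] * 3 + ["metastasis"] * 3

print(knn_predict(train_features, train_labels, (0.82, 0.28)))
print(knn_predict(train_features, train_labels, (0.22, 0.78)))
```

With an odd k and two classes the vote cannot tie, which is one common reason to pick k=3 or k=5 in a two-class setting.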
Ground-state ordering of the J1-J2 model on the simple cubic and body-centered cubic lattices
Farnell, D. J. J.; Götze, O.; Richter, J.
2016-06-01
The J1-J2 Heisenberg model is a "canonical" model in the field of quantum magnetism in order to study the interplay between frustration and quantum fluctuations as well as quantum phase transitions driven by frustration. Here we apply the coupled cluster method (CCM) to study the spin-half J1-J2 model with antiferromagnetic nearest-neighbor bonds J1>0 and next-nearest-neighbor bonds J2>0 for the simple cubic (sc) and body-centered cubic (bcc) lattices. In particular, we wish to study the ground-state ordering of these systems as a function of the frustration parameter p = z2J2/z1J1, where z1 (z2) is the number of nearest (next-nearest) neighbors. We wish to determine the positions of the phase transitions using the CCM and we aim to resolve the nature of the phase transition points. We consider the ground-state energy, order parameters, spin-spin correlation functions, as well as the spin stiffness in order to determine the ground-state phase diagrams of these models. We find a direct first-order phase transition at a value of p = 0.528 from a state of nearest-neighbor Néel order to next-nearest-neighbor Néel order for the bcc lattice. For the sc lattice the situation is more subtle. CCM results for the energy, the order parameter, the spin-spin correlation functions, and the spin stiffness indicate that there is no direct first-order transition between ground-state phases with magnetic long-range order; rather, it is more likely that two phases with antiferromagnetic long-range order are separated by a narrow region of a spin-liquid-like quantum phase around p = 0.55. Thus the strong frustration present in the J1-J2 Heisenberg model on the sc lattice may open a window for an unconventional quantum ground state in this three-dimensional spin model.
Incorporating phosphorus cycling into global modeling efforts: a worthwhile, tractable endeavor
Reed, Sasha C.; Yang, Xiaojuan; Thornton, Peter E.
2015-01-01
Myriad field, laboratory, and modeling studies show that nutrient availability plays a fundamental role in regulating CO2 exchange between the Earth's biosphere and atmosphere, and in determining how carbon pools and fluxes respond to climatic change. Accordingly, global models that incorporate coupled climate–carbon cycle feedbacks made a significant advance with the introduction of a prognostic nitrogen cycle. Here we propose that incorporating phosphorus cycling represents an important next step in coupled climate–carbon cycling model development, particularly for lowland tropical forests where phosphorus availability is often presumed to limit primary production. We highlight challenges to including phosphorus in modeling efforts and provide suggestions for how to move forward.
Using Unlabeled Data to Improve Inductive Models by Incorporating Transductive Models
Directory of Open Access Journals (Sweden)
ShengJun Cheng
2014-02-01
This paper shows how to use labeled and unlabeled data to improve inductive models with the help of transductive models. We propose a solution for the self-training scenario. Self-training is an effective semi-supervised wrapper method which can generalize any type of supervised inductive model to the semi-supervised setting. It iteratively refines an inductive model by bootstrapping from unlabeled data. Standard self-training uses the classifier model (trained on labeled examples) to label and select candidates from the unlabeled training set, which may be problematic since the initial classifier may not be able to provide highly confident predictions, as labeled training data is always rare. As a result, it can suffer from introducing too many wrongly labeled candidates into the labeled training set, which may severely degrade performance. To tackle this problem, we propose a novel self-training style algorithm which incorporates a graph-based transductive model in the self-labeling process. Unlike standard self-training, our algorithm utilizes labeled and unlabeled data as a whole to label and select unlabeled examples for training set augmentation. A robust transductive model based on a graph Markov random walk is proposed, which exploits the manifold assumption to output reliable predictions on unlabeled data using noisy labeled examples. The proposed algorithm can greatly reduce the risk of performance degradation due to accumulated noise in the training set. Experiments show that the proposed algorithm can effectively utilize unlabeled data to improve classification performance.
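For orientation, the standard self-training loop that the abstract above takes as its starting point can be sketched as below. This is the baseline wrapper, not the authors' graph-based transductive algorithm. The nearest-centroid base classifier on 1-D features, the margin-based confidence score, the `threshold` value, and all function names are assumptions made for illustration.

```python
def centroids(labeled):
    """Mean feature value per class for 1-D labeled examples [(x, label), ...]."""
    sums, counts = {}, {}
    for x, label in labeled:
        sums[label] = sums.get(label, 0.0) + x
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}


def predict_with_confidence(model, x):
    """Return (label, margin); margin is how much closer the best centroid is
    than the runner-up. Assumes at least two classes."""
    ranked = sorted(model.items(), key=lambda item: abs(item[1] - x))
    best_label, best_center = ranked[0]
    margin = abs(ranked[1][1] - x) - abs(best_center - x)
    return best_label, margin


def self_train(labeled, unlabeled, threshold=1.0, max_rounds=10):
    """Repeatedly retrain, then move confidently predicted points into the labeled set."""
    labeled, unlabeled = list(labeled), list(unlabeled)
    for _ in range(max_rounds):
        model = centroids(labeled)
        confident, remaining = [], []
        for x in unlabeled:
            label, margin = predict_with_confidence(model, x)
            (confident if margin >= threshold else remaining).append((x, label))
        if not confident:
            break  # nothing passed the confidence threshold; stop early
        labeled.extend(confident)
        unlabeled = [x for x, _ in remaining]
    return centroids(labeled), labeled


# Hypothetical toy data: two labeled points and five unlabeled ones.
model, augmented = self_train([(0.0, "a"), (10.0, "b")], [1.0, 2.0, 9.0, 8.0, 5.2])
print(model, len(augmented))
```

The point 5.2 sits almost midway between the classes and never passes the threshold, so it is left unlabeled. Lowering the threshold would admit it with a possibly wrong label, which is the noise-accumulation risk the abstract describes.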
Acoustic modeling for emotion recognition
Anne, Koteswara Rao; Vankayalapati, Hima Deepthi
2015-01-01
This book presents state-of-the-art research in speech emotion recognition. Readers are first presented with basic research and applications; gradually more advanced information is provided, giving readers comprehensive guidance for classifying emotions through speech. Simulated databases are used and results extensively compared, with the features and the algorithms implemented using MATLAB. Various emotion recognition models such as Linear Discriminant Analysis (LDA), Regularized Discriminant Analysis (RDA), Support Vector Machines (SVM) and K-Nearest Neighbor (KNN) are explored in detail using prosody and spectral features, and feature fusion techniques.
Image quantization: statistics and modeling
Whiting, Bruce R.; Muka, Edward
1998-07-01
A method for analyzing the effects of quantization, developed for temporal one-dimensional signals, is extended to two-dimensional radiographic images. By calculating the probability density function for the second order statistics (the differences between nearest neighbor pixels) and utilizing its Fourier transform (the characteristic function), the effect of quantization on image statistics can be studied by the use of standard communication theory. The approach is demonstrated by characterizing the noise properties of a storage phosphor computed radiography system and the image statistics of a simple radiographic object (cylinder) and by comparing the model to experimental measurements. The role of quantization noise and the onset of contouring in image degradation are explained.
A new model for in situ nitrogen incorporation into 4H-SiC during epitaxy
Ferro, Gabriel; Chaussende, Didier
2017-02-01
Nitrogen doping of 4H-SiC during vapor phase epitaxy still lacks a general model explaining the apparently contradictory trends obtained by different teams. In this paper, the evolutions of nitrogen incorporation (on both polar Si and C faces) as a function of the main growth parameters (C/Si ratio, temperature, pressure and growth rate) are reviewed and explained using a model based on surface exchanges between the gas phase and the uppermost 4H-SiC atomic layers. In this model, N incorporation is driven mainly by the transient formation of C vacancies, due to H2 etching, at the surface or near the surface. It is shown that all the growth parameters influence the probability of C vacancy formation in the same manner as they influence N incorporation. The surface exchange model proposes a new framework for explaining the experimental results even beyond the commonly accepted reactor type dependency.
Calculation of Al-Zn diagram from central atoms model
Institute of Scientific and Technical Information of China (English)
(no author listed)
1999-01-01
A slightly modified central atoms model was proposed. The probabilities of various clusters with the central atoms and their nearest neighboring shells can be calculated neglecting the assumption of the parameter of energy in the central atoms model in proportion to the number of other atoms i (referred with the central atom). A parameter Pα, equal to the reciprocal of the activity coefficient of a component, is proposed in this model, so the new model can be understood easily. With this model, the Al-Zn phase diagram and its thermodynamic properties were calculated; the results coincide with the experimental data.
Peng, Guanghan; He, Hongdi; Lu, Wei-Zhen
2016-01-01
In this paper, a new car-following model is proposed that incorporates timid and aggressive driving behaviors on a single lane. The linear stability condition including the timid and aggressive behavior term is obtained. Numerical simulation indicates that, by incorporating the timid and aggressive behaviors, the new car-following model can estimate the proper delay time of car motion and the kinematic wave speed at jam density. The results also show that aggressive behavior can improve traffic flow while timid behavior deteriorates traffic stability, which means that aggressive behavior is better than timid behavior since the aggressive driver responds rapidly to variations in the velocity of the leading car. Snapshots of the velocities also show that the new model can approximate a wide moving jam.
Incorporation of the capillary hysteresis model HYSTR into the numerical code TOUGH
Energy Technology Data Exchange (ETDEWEB)
Niemi, A.; Bodvarsson, G.S.; Pruess, K.
1991-11-01
As part of the work performed to model flow in the unsaturated zone at Yucca Mountain, Nevada, a capillary hysteresis model has been developed. The computer program HYSTR has been developed to compute the hysteretic capillary pressure-liquid saturation relationship through interpolation of tabulated data. The code can be easily incorporated into any numerical unsaturated flow simulator. A complete description of HYSTR, including a brief summary of the previous hysteresis literature, detailed description of the program, and instructions for its incorporation into a numerical simulator are given in the HYSTR user's manual (Niemi and Bodvarsson, 1991a). This report describes the incorporation of HYSTR into the numerical code TOUGH (Transport of Unsaturated Groundwater and Heat; Pruess, 1986). The changes made and procedures for the use of TOUGH for hysteresis modeling are documented.
Bias associated with failing to incorporate dependence on event history in Markov models.
Bentley, Tanya G K; Kuntz, Karen M; Ringel, Jeanne S
2010-01-01
When using state-transition Markov models to simulate risk of recurrent events over time, incorporating dependence on higher numbers of prior episodes can increase model complexity, yet failing to capture this event history may bias model outcomes. This analysis assessed the tradeoffs between model bias and complexity when evaluating risks of recurrent events in Markov models. The authors developed a generic episode/relapse Markov cohort model, defining bias as the percentage change in events prevented with 2 hypothetical interventions (prevention and treatment) when incorporating 0 to 9 prior episodes in relapse risk versus a model with 10 such episodes. Magnitude and sign of bias were evaluated as a function of event and recovery risks, disease-specific mortality, and risk function. Bias was positive in the base case for a prevention strategy, indicating that failing to fully incorporate dependence on event history overestimated the prevention's predicted impact. For treatment, the bias was negative, indicating an underestimated benefit. Bias approached zero as the number of tracked prior episodes increased, and the average bias over 10 tracked episodes was greater with the exponential compared with linear functions of relapse risk and with treatment compared with prevention strategies. With linear and exponential risk functions, absolute bias reached 33% and 78%, respectively, in prevention and 52% and 85% in treatment. Failing to incorporate dependence on prior event history in subsequent relapse risk in Markov models can greatly affect model outcomes, overestimating the impact of prevention and treatment strategies by up to 85% and underestimating the impact in some treatment models by up to 20%. When at least 4 prior episodes are incorporated, bias does not exceed 26% in prevention or 11% in treatment.
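The bias measure defined in the abstract above (the percentage change in events prevented when a model tracks fewer prior episodes, relative to the 10-episode reference model) amounts to a one-line calculation. The sketch below only illustrates that definition: the function name `percent_bias` and the example numbers are hypothetical, and the sign convention (positive means the simpler model overestimates the intervention's impact) is inferred from the abstract's description of the base case.

```python
def percent_bias(events_prevented_reduced, events_prevented_reference):
    """Percentage change in events prevented versus the reference model.

    events_prevented_reduced: result from a model tracking 0 to 9 prior episodes.
    events_prevented_reference: result from the model tracking 10 prior episodes.
    Positive values mean the reduced model overestimates the intervention's impact.
    """
    difference = events_prevented_reduced - events_prevented_reference
    return 100.0 * difference / events_prevented_reference


# Hypothetical numbers, not results from the paper.
print(percent_bias(120.0, 100.0))  # reduced model predicts more events prevented
print(percent_bias(80.0, 100.0))   # reduced model predicts fewer events prevented
```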
Modeling fraud detection and the incorporation of forensic specialists in the audit process
DEFF Research Database (Denmark)
Sakalauskaite, Dominyka
Financial statement audits are still comparatively poor in fraud detection. Forensic specialists can play a significant role in increasing audit quality. In this paper, based on prior academic research, I develop a model of fraud detection and the incorporation of forensic specialists in the audit...... process. The intention of the model is to identify the reasons why the audit is weak in fraud detection and to provide the analytical framework to assess whether the incorporation of forensic specialists can help to improve it. The results show that such specialists can potentially improve the fraud...
Wang, Jian-Xun; Xiao, Heng
2015-01-01
Simulations based on Reynolds-Averaged Navier-Stokes (RANS) models have been used to support high-consequence decisions related to turbulent flows. Apart from the deterministic model predictions, the decision makers are often equally concerned about the confidence in those predictions. Among the uncertainties in RANS simulations, the model-form uncertainty is an important or even a dominant source. Therefore, quantifying and reducing the model-form uncertainties in RANS simulations are of critical importance to make risk-informed decisions. Researchers in statistics communities have made efforts on this issue by considering numerical models as black boxes. However, this physics-neutral approach is not the most efficient use of data, and is not practical for most engineering problems. Recently, we proposed an open-box, Bayesian framework for quantifying and reducing model-form uncertainties in RANS simulations by incorporating observation data and physics-prior knowledge. It can incorporate the information from the vast...
Brisbin, Abra; Fridley, Brooke L
2013-08-01
Pathway topology and relationships between genes have the potential to provide information for modeling effects of mRNA gene expression on complex traits. For example, researchers may wish to incorporate the prior belief that "hub" genes (genes with many neighbors) are more likely to influence the trait. In this paper, we propose and compare six Bayesian pathway-based prior models to incorporate pathway topology information into association analyses. Including prior information regarding the relationships among genes in a pathway was effective in somewhat improving detection rates for genes associated with complex traits. Through an extensive set of simulations, we found that when hub (central) effects are expected, the diagonal degree model is preferred; when spoke (edge) effects are expected, the spatial power model is preferred. When there is no prior knowledge about the location of the effect genes in the pathway (e.g., hub versus spoke model), it is worthwhile to apply multiple models, as the model with the best DIC is not always the one with the best detection rate. We also applied the models to pharmacogenomic studies for the drugs gemcitabine and 6-mercaptopurine and found that the diagonal degree model identified an association between 6-mercaptopurine response and expression of the gene SLC28A3, which was not detectable using the model including no pathway information. These results demonstrate the value of incorporating pathway information into association analyses.
Goldstein, Harvey; Leckie, George; Charlton, Christopher; Tilling, Kate; Browne, William J
2017-01-01
Aim: To present a flexible model for repeated measures longitudinal growth data within individuals that allows trends over time to incorporate individual-specific random effects. These may reflect the timing of growth events and characterise within-individual variability which can be modelled as a function of age. Subjects and methods: A Bayesian model is developed that includes random effects for the mean growth function, an individual age-alignment random effect and random effects for the within-individual variance function. This model is applied to data on boys' heights from the Edinburgh longitudinal growth study and to repeated weight measurements of a sample of pregnant women in the Avon Longitudinal Study of Parents and Children cohort. Results: The mean age at which the growth curves for individual boys are aligned is 11.4 years, corresponding to the mean 'take off' age for pubertal growth. The within-individual variance (standard deviation) is found to decrease from 0.24 cm² (0.50 cm) at 9 years for the 'average' boy to 0.07 cm² (0.25 cm) at 16 years. Change in weight during pregnancy can be characterised by regression splines with random effects that include a large woman-specific random effect for the within-individual variation, which is also correlated with overall weight and weight gain. Conclusions: The proposed model provides a useful extension to existing approaches, allowing considerable flexibility in describing within- and between-individual differences in growth patterns.
Chowdhury, Nadim; Azim, Zubair Al; Alam, Md Hasibul; Niaz, Iftikhar Ahmad; Khosru, Quazi D M
2014-01-01
We propose a physically based analytical compact model to calculate eigenenergies and wave functions which incorporates the penetration effect. The model is applicable to a quantum well structure that frequently appears in modern nano-scale devices. This model is equally applicable to both silicon and III-V devices. Unlike other models already available in the literature, our model can accurately predict all the eigenenergies without the inclusion of any fitting parameters. The validity of our model has been checked with numerical simulations and the results show significantly better agreement compared to the available methods.
Institute of Scientific and Technical Information of China (English)
(no author listed)
2001-01-01
An N-gram Chinese language model incorporating linguistic rules is presented. By constructing an element lattice, rule information is incorporated into the statistical framework. To facilitate the hybrid modeling, novel methods such as MI-based rule evaluation, weighted rule quantification and element-based n-gram probability approximation are presented. A dynamic Viterbi algorithm is adopted to search for the best path in the lattice. To strengthen the model, transformation-based error-driven rule learning is adopted. Applying the proposed model to Chinese Pinyin-to-character conversion, high performance has been achieved in accuracy, flexibility and robustness simultaneously. Tests show the correct rate reaches 94.81%, compared with 90.53% using the bi-gram Markov model alone. Many long-distance dependencies and recursions in language can be processed effectively.
A new experimental procedure for incorporation of model contaminants in polymer hosts
Papaspyrides, C.D.; Voultzatis, Y.; Pavlidou, S.; Tsenoglou, C.; Dole, P.; Feigenbaum, A.; Paseiro, P.; Pastorelli, S.; Cruz Garcia, C. de la; Hankemeier, T.; Aucejo, S.
2005-01-01
A new experimental procedure for incorporation of model contaminants in polymers was developed as part of a general scheme for testing the efficiency of functional barriers in food packaging. The aim was to progressively pollute polymers in a controlled fashion up to a high level in the range of 100
75 FR 56487 - Airworthiness Directives; Erickson Air-Crane Incorporated Model S-64F Helicopters
2010-09-16
... Federal Aviation Administration 14 CFR Part 39 RIN 2120-AA64 Airworthiness Directives; Erickson Air-Crane... rulemaking (NPRM). SUMMARY: This document proposes adopting a new airworthiness directive (AD) for Erickson Air-Crane Incorporated (Erickson Air-Crane) Model S- 64F helicopters. The AD would require, at...
Incorporating Eco-Evolutionary Processes into Population Models: Design and Applications
Eco-evolutionary population models are powerful new tools for exploring how evolutionary processes influence plant and animal population dynamics and vice-versa. The need to manage for climate change and other dynamic disturbance regimes is creating a demand for the incorporation of...
The Forced Choice Dilemma: A Model Incorporating Idiocentric/Allocentric Cultural Orientation
Jung, Jae Yup; McCormick, John; Gross, Miraca U. M.
2012-01-01
This study developed and tested a new model of the forced choice dilemma (i.e., the belief held by some intellectually gifted students that they must choose between academic achievement and peer acceptance) that incorporates individual-level cultural orientation variables (i.e., vertical allocentrism and vertical idiocentrism). A survey that had…
SPARC Groups: A Model for Incorporating Spiritual Psychoeducation into Group Work
Christmas, Christopher; Van Horn, Stacy M.
2012-01-01
The use of spirituality as a resource for clients within the counseling field is growing; however, the primary focus has been on individual therapy. The purpose of this article is to provide counseling practitioners, administrators, and researchers with an approach for incorporating spiritual psychoeducation into group work. The proposed model can…
Kok, de Jean-Luc; Titus, Milan; Wind, Herman G.
2000-01-01
Decision-support systems in the field of integrated water management could benefit considerably from social science knowledge, as many environmental changes are human-induced. Unfortunately the adequate incorporation of qualitative social science concepts in a quantitative modeling framework is not
Energy transfers in shell models for magnetohydrodynamics turbulence.
Lessinnes, Thomas; Carati, Daniele; Verma, Mahendra K
2009-06-01
A systematic procedure to derive shell models for magnetohydrodynamic turbulence is proposed. It takes into account the conservation of ideal quadratic invariants such as the total energy, the cross helicity, and the magnetic helicity, as well as the conservation of the magnetic energy by the advection term in the induction equation. This approach also leads to simple expressions for the energy exchanges as well as to unambiguous definitions for the energy fluxes. When applied to the existing shell models with nonlinear interactions limited to the nearest-neighbor shells, this procedure reproduces well-known models but suggests a reinterpretation of the energy fluxes.
Growing small-world networks based on a modified BA model
Xu, Xinping; Li, Wei
2006-01-01
We propose a simple growing model for the evolution of small-world networks. It is introduced as a modified BA model in which all the edges connected to the new nodes are made locally to the creator and its nearest neighbors. It is found that this model can produce small-world networks with power-law degree distributions. Properties of our model, including the degree distribution, clustering, and the average path length, are compared with those of the BA model. Since most real networks are both scale-free and small-world networks, our model may provide a satisfactory description for empirical characteristics of real networks.
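The local attachment rule described in the abstract above (each new node connects to a "creator" node and to the creator's nearest neighbors) can be sketched as below. The abstract does not specify the details, so several choices here are assumptions made for illustration: the creator is drawn uniformly at random, the new node links to the creator plus `m - 1` of its neighbors, and growth starts from a triangle. The name `grow_local_network` is hypothetical.

```python
import random


def grow_local_network(n_nodes, m=2, seed=0):
    """Grow an undirected graph in which each new node links to a creator
    and to (m - 1) of the creator's neighbors. Returns an adjacency dict."""
    rng = random.Random(seed)
    # Assumed initial graph: a triangle on nodes 0, 1, 2.
    adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
    for new in range(3, n_nodes):
        creator = rng.randrange(new)          # assumption: uniform choice of creator
        candidates = sorted(adj[creator])     # the creator's nearest neighbors
        extra = rng.sample(candidates, min(m - 1, len(candidates)))
        adj[new] = set()
        for target in {creator, *extra}:
            adj[new].add(target)
            adj[target].add(new)
    return adj


adj = grow_local_network(50, m=2, seed=1)
degrees = sorted((len(nbrs) for nbrs in adj.values()), reverse=True)
print(degrees[:5])
```

Because every new node closes a triangle with its creator and one of the creator's neighbors, this kind of rule tends to give high clustering. Choosing a neighbor of a random node also favors high-degree nodes, the usual intuition for why local rules can yield heavy-tailed degree distributions.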
Nine challenges in incorporating the dynamics of behaviour in infectious diseases models.
Funk, Sebastian; Bansal, Shweta; Bauch, Chris T; Eames, Ken T D; Edmunds, W John; Galvani, Alison P; Klepac, Petra
2015-03-01
Traditionally, the spread of infectious diseases in human populations has been modelled with static parameters. These parameters, however, can change when individuals change their behaviour. If these changes are themselves influenced by the disease dynamics, there is scope for mechanistic models of behaviour to improve our understanding of this interaction. Here, we present challenges in modelling changes in behaviour relating to disease dynamics, specifically: how to incorporate behavioural changes in models of infectious disease dynamics, how to inform measurement of relevant behaviour to parameterise such models, and how to determine the impact of behavioural changes on observed disease dynamics. Copyright © 2014 The Authors. Published by Elsevier B.V. All rights reserved.
Incorporation of composite defects from ultrasonic NDE into CAD and FE models
Bingol, Onur Rauf; Schiefelbein, Bryan; Grandin, Robert J.; Holland, Stephen D.; Krishnamurthy, Adarsh
2017-02-01
Fiber-reinforced composites are widely used in the aerospace industry due to their combined properties of high strength and low weight. However, owing to their complex structure, it is difficult to assess the impact of manufacturing defects and service damage on their residual life. While ultrasonic testing (UT) is the preferred NDE method to identify the presence of defects in composites, there are no reasonable ways to model the damage and evaluate the structural integrity of composites. We have developed an automated framework to incorporate flaws and known composite damage automatically into a finite element analysis (FEA) model of composites, ultimately aiding in assessing the residual life of composites and making informed decisions regarding repairs. The framework can be used to generate a layer-by-layer 3D structural CAD model of the composite laminates replicating their manufacturing process. Outlines of structural defects, such as delaminations, are automatically detected from UT of the laminate and are incorporated into the CAD model between the appropriate layers. In addition, the framework allows for direct structural analysis of the resulting 3D CAD models with defects by automatically applying the appropriate boundary conditions. In this paper, we show a working proof-of-concept for the composite model builder with capabilities of incorporating delaminations between laminate layers and automatically preparing the CAD model for structural analysis using FEA software.
Incorporating sorption/desorption of organic pollutants into river water quality model
Institute of Scientific and Technical Information of China (English)
LOU Bao-feng; ZHU Li-zhong; YANG Kun
2004-01-01
Preliminary research was conducted about how to incorporate sorption/desorption of organic pollutants with suspended solids and sediments into a single-chemical and one-dimensional water quality model of Jinghang Canal. Sedimentation-resuspension coefficient k3 was deduced; characteristics of organic pollutants, concentrations and components of suspended solids/sediments and hydrological and hydraulic conditions were integrated into k3 and further into the river water quality model; the impact of sorption/desorption of organic pollutants with suspended solids and sediments on the prediction function of the model was discussed. Results demonstrated that this impact is pronounced for organic pollutants with relatively large Koc and Kow, especially when they are also conservative and foc of river suspended solids/sediments is high, and that incorporation of sorption/desorption of organic pollutants into the river water quality model can improve its prediction accuracy.
Iglesias, Juan Eugenio; Sabuncu, Mert Rory; Van Leemput, Koen
2012-01-01
Many successful segmentation algorithms are based on Bayesian models in which prior anatomical knowledge is combined with the available image information. However, these methods typically have many free parameters that are estimated to obtain point estimates only, whereas a faithful Bayesian analysis would also consider all possible alternate values these parameters may take. In this paper, we propose to incorporate the uncertainty of the free parameters in Bayesian segmentation models more a...
Incorporating social role theory into topic models for social media content analysis
Zhao, Wayne Xin; Wang, Jinpeng; He, Yulan; Nie, Jian-Yun; Wen, Ji-Rong; Li, Xiaoming
2015-01-01
In this paper, we explore the idea of social role theory (SRT) and propose a novel regularized topic model which incorporates SRT into the generative process of social media content. We assume that a user can play multiple social roles, and each social role serves to fulfil different duties and is associated with a role-driven distribution over latent topics. In particular, we focus on social roles corresponding to the most common social activities on social networks. Our model is instantiate...
Incorporating preferential flow into a 3D model of a forested headwater catchment
Glaser, Barbara; Jackisch, Conrad; Hopp, Luisa; Pfister, Laurent; Klaus, Julian
2016-04-01
Preferential flow plays an important role for water flow and solute transport. The inclusion of preferential flow, for example with dual porosity or dual permeability approaches, is a common feature in transport simulations at the plot scale. But at hillslope and catchment scales, incorporation of macropore and fracture flow into distributed hydrologic 3D models is rare, often due to limited data availability for model parameterisation. In this study, we incorporated preferential flow into an existing 3D integrated surface subsurface hydrologic model (HydroGeoSphere) of a headwater region (6 ha) of the forested Weierbach catchment in western Luxembourg. Our model philosophy was a strong link between measured data and the model setup. The model setup we used previously had been parameterised and validated based on various field data. But existing macropores and fractures had not been considered in this initial model setup. The multi-criteria validation revealed a good model performance but also suggested potential for further improvement by incorporating preferential flow as additional process. In order to pursue the data driven model philosophy for the implementation of preferential flow, we analysed the results of plot scale bromide sprinkling and infiltration experiments carried out in the vicinity of the Weierbach catchment. Three 1 sqm plots were sprinkled for one hour and excavated one day later for bromide depth profile sampling. We simulated these sprinkling experiments at the soil column scale, using the parameterisation of the base headwater model extended by a second permeability domain. Representing the bromide depth profiles was successful without changing this initial parameterisation. Moreover, to explain the variability between the three bromide depth profiles it was sufficient to adapt the dual permeability properties, indicating the spatial heterogeneity of preferential flow. Subsequently, we incorporated the dual permeability simulation in the
Környei, László; Pleimling, Michel; Iglói, Ferenc
2008-01-01
The universality class, even the order of the transition, of the two-dimensional Ising model depends on the range and the symmetry of the interactions (Onsager model, Baxter-Wu model, Turban model, etc.), but the critical temperature is generally the same due to self-duality. Here we consider a sudden change in the form of the interaction and study the nonequilibrium critical dynamical properties of the nearest-neighbor model. The relaxation of the magnetization and the decay of the autocorrelation function are found to display a power law behavior with characteristic exponents that depend on the universality class of the initial state.