Manifold learning to interpret JET high-dimensional operational space
International Nuclear Information System (INIS)
Cannas, B; Fanni, A; Pau, A; Sias, G; Murari, A
2013-01-01
In this paper, the problem of visualization and exploration of JET high-dimensional operational space is considered. The data come from plasma discharges selected from JET campaigns from C15 (year 2005) up to C27 (year 2009). The aim is to learn the possible manifold structure embedded in the data and to create representations of the plasma parameters on low-dimensional maps that are understandable and preserve the essential properties of the original data. A crucial issue for the design of such mappings is the quality of the dataset. This paper reports the details of the criteria used to select suitable signals downloaded from JET databases in order to obtain a dataset of reliable observations. Moreover, a statistical analysis is performed to detect outliers. Finally, data reduction based on clustering methods is performed to select a limited and representative number of samples for the operational space mapping. The high-dimensional operational space of JET is mapped using a widely used manifold learning method, the self-organizing map. The results are compared with other data visualization methods. The obtained maps can be used to identify characteristic regions of the plasma scenario, making it possible to discriminate between regions with high risk of disruption and those with low risk of disruption. (paper)
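The mapping step described above can be illustrated with a minimal self-organizing map. This is only a sketch under stated assumptions, not the configuration used for the JET data: the grid size, learning-rate schedule, neighbourhood schedule and the function names `train_som` and `best_matching_unit` are all hypothetical choices made for illustration.

```python
import math
import random

def best_matching_unit(weights, x):
    """Grid coordinates of the node whose weight vector is closest to x."""
    return min(weights, key=lambda rc: sum((a - b) ** 2 for a, b in zip(weights[rc], x)))

def train_som(data, rows=4, cols=4, epochs=40, lr0=0.5, sigma0=2.0, seed=0):
    """Train a small rectangular SOM; returns {(row, col): weight vector}."""
    rng = random.Random(seed)
    dim = len(data[0])
    weights = {(r, c): [rng.random() for _ in range(dim)]
               for r in range(rows) for c in range(cols)}
    samples = list(data)  # copy so the caller's list is not shuffled
    total = epochs * len(samples)
    step = 0
    for _ in range(epochs):
        rng.shuffle(samples)
        for x in samples:
            frac = step / total
            lr = lr0 * (1.0 - frac)               # linearly decaying learning rate
            sigma = sigma0 * (1.0 - frac) + 0.3   # shrinking neighbourhood radius
            br, bc = best_matching_unit(weights, x)
            for (r, c), w in weights.items():
                grid_d2 = (r - br) ** 2 + (c - bc) ** 2
                h = math.exp(-grid_d2 / (2.0 * sigma ** 2))  # Gaussian neighbourhood
                for k in range(dim):
                    w[k] += lr * h * (x[k] - w[k])
            step += 1
    return weights
```

After training, each high-dimensional sample is displayed at the grid position of its best matching unit, so regions of the 2-D grid can be labelled by the kind of samples that land there (for example, by a risk label, if one is available).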
The literary uses of high-dimensional space
Directory of Open Access Journals (Sweden)
Ted Underwood
2015-12-01
Debates over “Big Data” shed more heat than light in the humanities, because the term ascribes new importance to statistical methods without explaining how those methods have changed. What we badly need instead is a conversation about the substantive innovations that have made statistical modeling useful for disciplines where, in the past, it truly wasn’t. These innovations are partly technical, but more fundamentally expressed in what Leo Breiman calls a new “culture” of statistical modeling. Where 20th-century methods often required humanists to squeeze our unstructured texts, sounds, or images into some special-purpose data model, new methods can handle unstructured evidence more directly by modeling it in a high-dimensional space. This opens a range of research opportunities that humanists have barely begun to discuss. To date, topic modeling has received most attention, but in the long run, supervised predictive models may be even more important. I sketch their potential by describing how Jordan Sellers and I have begun to model poetic distinction in the long 19th century—revealing an arc of gradual change much longer than received literary histories would lead us to expect.
Data analysis in high-dimensional sparse spaces
DEFF Research Database (Denmark)
Clemmensen, Line Katrine Harder
classification techniques for high-dimensional problems are presented: Sparse discriminant analysis, sparse mixture discriminant analysis and orthogonality constrained support vector machines. The first two introduce sparseness to the well-known linear and mixture discriminant analysis and thereby provide low...... are applied to classifications of fish species, ear canal impressions used in the hearing aid industry, microbiological fungi species, and various cancerous tissues and healthy tissues. In addition, novel applications of sparse regressions (also called the elastic net) to the medical, concrete, and food...
Efficient and accurate nearest neighbor and closest pair search in high-dimensional space
Tao, Yufei; Yi, Ke; Sheng, Cheng; Kalnis, Panos
2010-01-01
Nearest Neighbor (NN) search in high-dimensional space is an important problem in many applications. From the database perspective, a good solution needs to have two properties: (i) it can be easily incorporated in a relational database, and (ii
On High Dimensional Searching Spaces and Learning Methods
DEFF Research Database (Denmark)
Yazdani, Hossein; Ortiz-Arroyo, Daniel; Choros, Kazimierz
2017-01-01
, and similarity functions and discuss the pros and cons of using each of them. Conventional similarity functions evaluate objects in the vector space. Contrarily, Weighted Feature Distance (WFD) functions compare data objects in both feature and vector spaces, preventing the system from being affected by some...
Aspects of high-dimensional theories in embedding spaces
International Nuclear Information System (INIS)
Maia, M.D.; Mecklenburg, W.
1983-01-01
The question of whether physical meaning may be attributed to the extra dimensions provided by embedding procedures as applied to physical space-times is discussed. The similarities and differences of the present picture to that of conventional Kaluza-Klein pictures are commented on. (Author)
Distribution of high-dimensional entanglement via an intra-city free-space link.
Steinlechner, Fabian; Ecker, Sebastian; Fink, Matthias; Liu, Bo; Bavaresco, Jessica; Huber, Marcus; Scheidl, Thomas; Ursin, Rupert
2017-07-24
Quantum entanglement is a fundamental resource in quantum information processing and its distribution between distant parties is a key challenge in quantum communications. Increasing the dimensionality of entanglement has been shown to improve robustness and channel capacities in secure quantum communications. Here we report on the distribution of genuine high-dimensional entanglement via a 1.2-km-long free-space link across Vienna. We exploit hyperentanglement, that is, simultaneous entanglement in polarization and energy-time bases, to encode quantum information, and observe high-visibility interference for successive correlation measurements in each degree of freedom. These visibilities impose lower bounds on entanglement in each subspace individually and certify four-dimensional entanglement for the hyperentangled system. The high-fidelity transmission of high-dimensional entanglement under real-world atmospheric link conditions represents an important step towards long-distance quantum communications with more complex quantum systems and the implementation of advanced quantum experiments with satellite links.
Efficient and accurate nearest neighbor and closest pair search in high-dimensional space
Tao, Yufei
2010-07-01
Nearest Neighbor (NN) search in high-dimensional space is an important problem in many applications. From the database perspective, a good solution needs to have two properties: (i) it can be easily incorporated in a relational database, and (ii) its query cost should increase sublinearly with the dataset size, regardless of the data and query distributions. Locality-Sensitive Hashing (LSH) is a well-known methodology fulfilling both requirements, but its current implementations either incur expensive space and query cost, or abandon its theoretical guarantee on the quality of query results. Motivated by this, we improve LSH by proposing an access method called the Locality-Sensitive B-tree (LSB-tree) to enable fast, accurate, high-dimensional NN search in relational databases. The combination of several LSB-trees forms an LSB-forest that has strong quality guarantees, but dramatically improves the efficiency of the previous LSH implementation having the same guarantees. In practice, the LSB-tree itself is also an effective index which consumes linear space, supports efficient updates, and provides accurate query results. In our experiments, the LSB-tree was faster than: (i) iDistance (a famous technique for exact NN search) by two orders of magnitude, and (ii) MedRank (a recent approximate method with nontrivial quality guarantees) by one order of magnitude, and meanwhile returned much better results. As a second step, we extend our LSB technique to solve another classic problem, called Closest Pair (CP) search, in high-dimensional space. The long-term challenge for this problem has been to achieve subquadratic running time at very high dimensionalities, where most of the existing solutions fail. We show that, using an LSB-forest, CP search can be accomplished in (worst-case) time significantly lower than the quadratic complexity, while still ensuring very good quality. In practice, accurate answers can be found using just two LSB-trees, thus giving a substantial
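For readers unfamiliar with the LSH baseline that the abstract builds on, the following is a minimal sketch of generic random-projection LSH for Euclidean distance. It is not the LSB-tree or LSB-forest, which the abstract does not specify in enough detail to reproduce; the class name `EuclideanLSH` and all parameters (`n_tables`, `n_hashes`, `width`) are hypothetical.

```python
import math
import random
from collections import defaultdict

class EuclideanLSH:
    """Random-projection LSH index with several hash tables (illustrative only)."""

    def __init__(self, dim, n_tables=4, n_hashes=3, width=1.0, seed=0):
        rng = random.Random(seed)
        self.width = width
        self.points = []
        self.tables = []
        # Each table uses n_hashes functions h(v) = floor((a . v + b) / width).
        for _ in range(n_tables):
            funcs = [([rng.gauss(0.0, 1.0) for _ in range(dim)], rng.uniform(0.0, width))
                     for _ in range(n_hashes)]
            self.tables.append((funcs, defaultdict(list)))

    def _key(self, funcs, v):
        return tuple(
            math.floor((sum(a_i * v_i for a_i, v_i in zip(a, v)) + b) / self.width)
            for a, b in funcs)

    def add(self, v):
        idx = len(self.points)
        self.points.append(list(v))
        for funcs, buckets in self.tables:
            buckets[self._key(funcs, v)].append(idx)
        return idx

    def query(self, q):
        """Closest stored point among those sharing a bucket with q, or None."""
        candidates = set()
        for funcs, buckets in self.tables:
            candidates.update(buckets.get(self._key(funcs, q), []))
        if not candidates:
            return None
        best = min(candidates, key=lambda i: math.dist(self.points[i], q))
        return best, math.dist(self.points[best], q)
```

Only the points that collide with the query in at least one table are compared exactly, which is where the sublinear query cost comes from; the price is that a true nearest neighbour can be missed if it never collides.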
High-dimensional free-space optical communications based on orbital angular momentum coding
Zou, Li; Gu, Xiaofan; Wang, Le
2018-03-01
In this paper, we propose a high-dimensional free-space optical communication scheme using orbital angular momentum (OAM) coding. In the scheme, the transmitter encodes N bits of information by using a spatial light modulator to convert a Gaussian beam to a superposition mode of N OAM modes and a Gaussian mode; the receiver decodes the information through an OAM mode analyser which consists of a MZ interferometer with a rotating Dove prism, a photoelectric detector and a computer carrying out the fast Fourier transform. The scheme could realize high-dimensional free-space optical communication, and decodes the information quickly and accurately. We have verified the feasibility of the scheme by exploiting 8 (4) OAM modes and a Gaussian mode to implement a 256-ary (16-ary) coding free-space optical communication to transmit a 256-gray-scale (16-gray-scale) picture. The results show that a zero bit error rate performance has been achieved.
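The relation between the number of OAM modes and the coding order (8 modes for 256-ary, 4 modes for 16-ary) is consistent with each mode carrying one bit by being present or absent in the superposition. The sketch below illustrates that reading only; the mode indices 1..N and the function names are hypothetical, and the role of the accompanying Gaussian mode is not modelled.

```python
def encode_symbol(value, n_modes):
    """Map an integer in [0, 2**n_modes) to the set of OAM modes switched on."""
    if not 0 <= value < 2 ** n_modes:
        raise ValueError("value does not fit in n_modes bits")
    # Bit k of the value decides whether (hypothetical) mode index k + 1 is present.
    return {k + 1 for k in range(n_modes) if (value >> k) & 1}

def decode_symbol(modes):
    """Recover the integer from the set of detected mode indices."""
    return sum(1 << (mode - 1) for mode in modes)

# One pixel of a 256-gray-scale picture fits in a single 8-mode symbol.
gray_level = 200
symbol = encode_symbol(gray_level, n_modes=8)
recovered = decode_symbol(symbol)
```

With N modes there are 2^N distinct on/off patterns, so one symbol carries N bits.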
Compound Structure-Independent Activity Prediction in High-Dimensional Target Space.
Balfer, Jenny; Hu, Ye; Bajorath, Jürgen
2014-08-01
Profiling of compound libraries against arrays of targets has become an important approach in pharmaceutical research. The prediction of multi-target compound activities also represents an attractive task for machine learning with potential for drug discovery applications. Herein, we have explored activity prediction in high-dimensional target space. Different types of models were derived to predict multi-target activities. The models included naïve Bayesian (NB) and support vector machine (SVM) classifiers based upon compound structure information and NB models derived on the basis of activity profiles, without considering compound structure. Because the latter approach can be applied to incomplete training data and principally depends on the feature independence assumption, SVM modeling was not applicable in this case. Furthermore, iterative hybrid NB models making use of both activity profiles and compound structure information were built. In high-dimensional target space, NB models utilizing activity profile data were found to yield more accurate activity predictions than structure-based NB and SVM models or hybrid models. An in-depth analysis of activity profile-based models revealed the presence of correlation effects across different targets and rationalized prediction accuracy. Taken together, the results indicate that activity profile information can be effectively used to predict the activity of test compounds against novel targets. © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
A New Ensemble Method with Feature Space Partitioning for High-Dimensional Data Classification
Directory of Open Access Journals (Sweden)
Yongjun Piao
2015-01-01
Ensemble data mining methods, also known as classifier combination, are often used to improve the performance of classification. Various classifier combination methods such as bagging, boosting, and random forest have been devised and have received considerable attention in the past. However, data dimensionality increases rapidly day by day. Such a trend poses various challenges, as these methods are not suitable for direct application to high-dimensional datasets. In this paper, we propose an ensemble method for classification of high-dimensional data, with each classifier constructed from a different set of features determined by partitioning of redundant features. In our method, the redundancy of features is considered to divide the original feature space. Then, a support vector machine is trained on each generated feature subset, and the results of the classifiers are combined by majority voting. The efficiency and effectiveness of our method are demonstrated through comparisons with other ensemble techniques, and the results show that our method outperforms other methods.
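The overall structure (one classifier per feature subset, combined by majority voting) can be sketched as follows. Two things here are assumptions rather than the authors' method: the feature partition is supplied by the caller instead of being derived from feature redundancy, and a nearest-centroid classifier is swapped in for the support vector machine so that the example needs only the standard library.

```python
from collections import Counter

def fit_centroids(X, y, features):
    """Per-class mean of the selected feature columns."""
    sums, counts = {}, Counter(y)
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(features))
        for j, f in enumerate(features):
            acc[j] += row[f]
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}

def predict_centroid(centroids, row, features):
    """Label of the closest class centroid, using only the selected features."""
    sub = [row[f] for f in features]
    return min(centroids,
               key=lambda lab: sum((a - b) ** 2 for a, b in zip(sub, centroids[lab])))

def fit_ensemble(X, y, partition):
    """Train one base classifier per feature subset in the partition."""
    return [(features, fit_centroids(X, y, features)) for features in partition]

def predict_ensemble(ensemble, row):
    """Combine the base classifiers' predictions by majority vote."""
    votes = Counter(predict_centroid(c, row, f) for f, c in ensemble)
    return votes.most_common(1)[0][0]

# Toy data: features 0-2 separate the classes, feature 3 is noisy.
X = [[0.0, 0.0, 0.0, 0.0], [0.1, 0.2, 0.1, 0.9],
     [1.0, 1.0, 1.0, 1.0], [0.9, 0.8, 0.9, 0.1]]
y = [0, 0, 1, 1]
ensemble = fit_ensemble(X, y, partition=[[0, 1], [2], [3]])
```

An odd number of subsets avoids tied votes in the two-class case; the classifier trained on the noisy subset can be outvoted by the other two.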
Individual-based models for adaptive diversification in high-dimensional phenotype spaces.
Ispolatov, Iaroslav; Madhok, Vaibhav; Doebeli, Michael
2016-02-07
Most theories of evolutionary diversification are based on equilibrium assumptions: they are either based on optimality arguments involving static fitness landscapes, or they assume that populations first evolve to an equilibrium state before diversification occurs, as exemplified by the concept of evolutionary branching points in adaptive dynamics theory. Recent results indicate that adaptive dynamics may often not converge to equilibrium points and instead generate complicated trajectories if evolution takes place in high-dimensional phenotype spaces. Even though some analytical results on diversification in complex phenotype spaces are available, to study this problem in general we need to reconstruct individual-based models from the adaptive dynamics generating the non-equilibrium dynamics. Here we first provide a method to construct individual-based models such that they faithfully reproduce the given adaptive dynamics attractor without diversification. We then show that a propensity to diversify can be introduced by adding Gaussian competition terms that generate frequency dependence while still preserving the same adaptive dynamics. For sufficiently strong competition, the disruptive selection generated by frequency-dependence overcomes the directional evolution along the selection gradient and leads to diversification in phenotypic directions that are orthogonal to the selection gradient. Copyright © 2015 Elsevier Ltd. All rights reserved.
Nam, Julia EunJu; Mueller, Klaus
2013-02-01
Gaining a true appreciation of high-dimensional space remains difficult since all of the existing high-dimensional space exploration techniques serialize the space travel in some way. This is not so foreign to us since we, when traveling, also experience the world in a serial fashion. But we typically have access to a map to help with positioning, orientation, navigation, and trip planning. Here, we propose a multivariate data exploration tool that compares high-dimensional space navigation with a sightseeing trip. It decomposes this activity into five major tasks: 1) Identify the sights: use a map to identify the sights of interest and their location; 2) Plan the trip: connect the sights of interest along a specifiable path; 3) Go on the trip: travel along the route; 4) Hop off the bus: experience the location, look around, zoom into detail; and 5) Orient and localize: regain bearings in the map. We describe intuitive and interactive tools for all of these tasks, both global navigation within the map and local exploration of the data distributions. For the latter, we describe a polygonal touchpad interface which enables users to smoothly tilt the projection plane in high-dimensional space to produce multivariate scatterplots that best convey the data relationships under investigation. Motion parallax and illustrative motion trails aid in the perception of these transient patterns. We describe the use of our system within two applications: 1) the exploratory discovery of data configurations that best fit a personal preference in the presence of tradeoffs and 2) interactive cluster analysis via cluster sculpting in N-D.
Directory of Open Access Journals (Sweden)
L.V. Arun Shalin
2016-01-01
Clustering is the process of grouping elements so that the data points assigned to a cluster are more similar to each other than to data points in other clusters. During clustering, difficulties related to high-dimensional data are ubiquitous. Previous work on anonymization methods for high-dimensional data spaces failed to address dimensionality reduction when non-binary databases are included. In this work we study methods for dimensionality reduction on non-binary databases. Analyzing the behavior of dimensionality reduction for non-binary databases yields a performance improvement with the help of tag-based features. An effective multi-clustering anonymization approach called Discrete Component Task Specific Multi-Clustering (DCTSM) is presented for dimensionality reduction on non-binary databases. To start with, we present an analysis of the attributes in the non-binary database, and cluster projection identifies the degree of sparseness of the dimensions. Additionally, with the quantum distribution on the multi-cluster dimension, a solution for attribute relevancy and redundancy in non-binary data spaces is provided, resulting in a performance improvement on the basis of tag-based features. Multi-clustering tag-based feature reduction extracts individual features, which are then replaced by the equivalent feature clusters (i.e. tag clusters). During training, the DCTSM approach uses multi-clusters instead of individual tag features, and then, during decoding, individual features are replaced by the corresponding multi-clusters. To measure the effectiveness of the method, experiments are conducted on an existing anonymization method for high-dimensional data spaces and compared with the DCTSM approach using the Statlog German Credit Data Set. Improved tag feature extraction and minimum error rate compared to conventional anonymization
Extending the Generalised Pareto Distribution for Novelty Detection in High-Dimensional Spaces.
Clifton, David A; Clifton, Lei; Hugueny, Samuel; Tarassenko, Lionel
2014-01-01
Novelty detection involves the construction of a "model of normality", and then classifies test data as being either "normal" or "abnormal" with respect to that model. For this reason, it is often termed one-class classification. The approach is suitable for cases in which examples of "normal" behaviour are commonly available, but in which cases of "abnormal" data are comparatively rare. When performing novelty detection, we are typically most interested in the tails of the normal model, because it is in these tails that a decision boundary between "normal" and "abnormal" areas of data space usually lies. Extreme value statistics provides an appropriate theoretical framework for modelling the tails of univariate (or low-dimensional) distributions, using the generalised Pareto distribution (GPD), which can be demonstrated to be the limiting distribution for data occurring within the tails of most practically-encountered probability distributions. This paper provides an extension of the GPD, allowing the modelling of probability distributions of arbitrarily high dimension, such as occurs when using complex, multimodal, multivariate distributions for performing novelty detection in most real-life cases. We demonstrate our extension to the GPD using examples from patient physiological monitoring, in which we have acquired data from hospital patients in large clinical studies of high-acuity wards, and in which we wish to determine "abnormal" patient data, such that early warning of patient physiological deterioration may be provided.
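As background for the tail modelling mentioned above, the classical univariate GPD can be written down directly. The sketch below covers only that standard univariate form, not the high-dimensional extension the paper proposes; the function names and the `exceed_rate` parameter (the fraction of data above the threshold) are illustrative assumptions.

```python
import math

def gpd_cdf(y, xi, sigma):
    """CDF of the generalised Pareto distribution for an exceedance y >= 0.

    xi is the shape parameter and sigma > 0 the scale parameter.
    """
    if y < 0:
        return 0.0
    if abs(xi) < 1e-12:
        # Limit xi -> 0 is the exponential distribution.
        return 1.0 - math.exp(-y / sigma)
    base = 1.0 + xi * y / sigma
    if base <= 0.0:
        # Beyond the finite upper end point that exists when xi < 0.
        return 1.0
    return 1.0 - base ** (-1.0 / xi)

def tail_probability(x, threshold, xi, sigma, exceed_rate):
    """P(X > x) for x above the threshold: P(X > u) * (1 - G(x - u))."""
    return exceed_rate * (1.0 - gpd_cdf(x - threshold, xi, sigma))
```

A novelty detector built this way would flag a test value as "abnormal" when its tail probability under the model of normality falls below a chosen level.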
Du, Jing; Wang, Jian
2015-11-01
Bessel beams carrying orbital angular momentum (OAM) with helical phase fronts exp(ilφ) (l = 0, ±1, ±2, …), where φ is the azimuthal angle and l corresponds to the topological number, are orthogonal with each other. This feature of Bessel beams provides a new dimension to code/decode data information on the OAM state of light, and the theoretical infinity of topological number enables possible high-dimensional structured light coding/decoding for free-space optical communications. Moreover, Bessel beams are nondiffracting beams having the ability to recover by themselves in the face of obstructions, which is important for free-space optical communications relying on line-of-sight operation. By utilizing the OAM and nondiffracting characteristics of Bessel beams, we experimentally demonstrate 12 m distance obstruction-free optical m-ary coding/decoding using visible Bessel beams in a free-space optical communication system. We also study the bit error rate (BER) performance of hexadecimal and 32-ary coding/decoding based on Bessel beams with different topological numbers. After receiving 500 symbols at the receiver side, a zero BER of hexadecimal coding/decoding is observed when the obstruction is placed along the propagation path of light.
Chernozhukov, Victor; Hansen, Christian; Spindler, Martin
2016-01-01
In this article the package High-dimensional Metrics (hdm) is introduced. It is a collection of statistical methods for estimation and quantification of uncertainty in high-dimensional approximately sparse models. It focuses on providing confidence intervals and significance testing for (possibly many) low-dimensional subcomponents of the high-dimensional parameter vector. Efficient estimators and uniformly valid confidence intervals for regression coefficients on target variables (e...
Wang, Wei; Yang, Jiong
With the rapid growth of computational biology and e-commerce applications, high-dimensional data have become very common. Thus, mining high-dimensional data is an urgent problem of great practical importance. However, there are some unique challenges for mining data of high dimensions, including (1) the curse of dimensionality and, more crucially, (2) the meaningfulness of the similarity measure in the high-dimensional space. In this chapter, we present several state-of-the-art techniques for analyzing high-dimensional data, e.g., frequent pattern mining, clustering, and classification. We will discuss how these methods deal with the challenges of high dimensionality.
A Tool for Parameter-space Explorations
Murase, Yohsuke; Uchitane, Takeshi; Ito, Nobuyasu
Software for managing simulation jobs and results, named "OACIS", is presented. It controls a large number of simulation jobs executed on various remote servers, keeps the results in an organized way, and manages the analyses of these results. The software has a web browser front end, and users can easily submit various jobs to appropriate remote hosts from a web browser. After these jobs are finished, all the result files are automatically downloaded from the computational hosts and stored in a traceable way together with the logs of the date, host, and elapsed time of the jobs. Some visualization functions are also provided so that users can easily grasp an overview of the results distributed in a high-dimensional parameter space. Thus, OACIS is especially beneficial for complex simulation models that have many parameters and require many parameter searches. Using the OACIS API, it is easy to write code that automates parameter selection depending on the previous simulation results. A few examples of the automated parameter selection are also demonstrated.
Free flight in parameter space
DEFF Research Database (Denmark)
Dahlstedt, Palle; Nilsson, Per Anders
2008-01-01
The well-known difficulty of controlling many synthesis parameters in performance, for exploration and expression, is addressed. Inspired by interactive evolution, random vectors in parameter space are assigned to an array of pressure sensitive pads. Vectors are scaled with pressure and added to define the current point in parameter space. Vectors can be scaled globally, allowing exploration of the whole space or minute timbral expression. The vector origin can be shifted at any time, allowing exploration of subspaces. In essence, this amounts to mutation-based interactive evolution with continuous interpolation between population members. With a suitable sound engine, the system forms a surprisingly expressive performance instrument, used by the electronic free impro duo pantoMorf in concerts and recording sessions over the last year.
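The pad-to-parameter mapping described in this abstract amounts to a pressure-weighted vector sum. The sketch below is one plain reading of that description; the function name `current_point` and the argument names are hypothetical, and details such as pressure curves or clipping to the valid parameter range are not given in the abstract and are left out.

```python
def current_point(origin, pad_vectors, pressures, global_scale=1.0):
    """origin + global_scale * sum_i(pressure_i * vector_i), computed per dimension."""
    point = list(origin)
    for vec, pressure in zip(pad_vectors, pressures):
        for k, component in enumerate(vec):
            point[k] += global_scale * pressure * component
    return point

# Two pads in a two-parameter space; the second pad is pressed more lightly.
origin = [0.5, 0.5]
pads = [[1.0, 0.0], [0.0, -1.0]]
point = current_point(origin, pads, pressures=[0.5, 0.25], global_scale=0.5)

# Shifting the origin to the current point lets the same pads explore around it.
shifted = current_point(point, pads, pressures=[0.0, 0.0])
```

A small `global_scale` confines the pads to fine variations around the origin, while a large one lets them reach across the whole space.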
Determining frequentist confidence limits using a directed parameter space search
International Nuclear Information System (INIS)
Daniel, Scott F.; Connolly, Andrew J.; Schneider, Jeff
2014-01-01
We consider the problem of inferring constraints on a high-dimensional parameter space with a computationally expensive likelihood function. We propose a machine learning algorithm that maps out the frequentist confidence limit on parameter space by intelligently targeting likelihood evaluations so as to quickly and accurately characterize the likelihood surface in both low- and high-likelihood regions. We compare our algorithm to Bayesian credible limits derived by the well-tested Markov Chain Monte Carlo (MCMC) algorithm using both multi-modal toy likelihood functions and the seven-year Wilkinson Microwave Anisotropy Probe cosmic microwave background likelihood function. We find that our algorithm correctly identifies the location, general size, and general shape of high-likelihood regions in parameter space while being more robust against multi-modality than MCMC.
Legal Parameters of Space Tourism
Smith, Lesley Jane; Hörl, Kay-Uwe
2004-01-01
The commercial concept of space tourism raises important legal issues not specifically addressed by first generation rules of international space law. The principles established in the nineteen sixties and seventies were inspired by the philosophy that exploration of space was undertaken by and for the benefit of mankind. Technical developments since then have increased the potential for new space applications, with a corresponding increase in commercial interest in space. If space tourism is t...
Chernozhukov, Victor; Hansen, Chris; Spindler, Martin
2016-01-01
The package High-dimensional Metrics (hdm) is an evolving collection of statistical methods for estimation and quantification of uncertainty in high-dimensional approximately sparse models. It focuses on providing confidence intervals and significance testing for (possibly many) low-dimensional subcomponents of the high-dimensional parameter vector. Efficient estimators and uniformly valid confidence intervals for regression coefficients on target variables (e.g., treatment or poli...
Ma, Wei Ji; Zhou, Xiang; Ross, Lars A; Foxe, John J; Parra, Lucas C
2009-01-01
Watching a speaker's facial movements can dramatically enhance our ability to comprehend words, especially in noisy environments. From a general doctrine of combining information from different sensory modalities (the principle of inverse effectiveness), one would expect that the visual signals would be most effective at the highest levels of auditory noise. In contrast, we find, in accord with a recent paper, that visual information improves performance more at intermediate levels of auditory noise than at the highest levels, and we show that a novel visual stimulus containing only temporal information does the same. We present a Bayesian model of optimal cue integration that can explain these conflicts. In this model, words are regarded as points in a multidimensional space and word recognition is a probabilistic inference process. When the dimensionality of the feature space is low, the Bayesian model predicts inverse effectiveness; when the dimensionality is high, the enhancement is maximal at intermediate auditory noise levels. When the auditory and visual stimuli differ slightly in high noise, the model makes a counterintuitive prediction: as sound quality increases, the proportion of reported words corresponding to the visual stimulus should first increase and then decrease. We confirm this prediction in a behavioral experiment. We conclude that auditory-visual speech perception obeys the same notion of optimality previously observed only for simple multisensory stimuli.
Clustering high dimensional data
DEFF Research Database (Denmark)
Assent, Ira
2012-01-01
High-dimensional data, i.e., data described by a large number of attributes, pose specific challenges to clustering. The so-called ‘curse of dimensionality’, coined originally to describe the general increase in complexity of various computational problems as dimensionality increases, is known to render traditional clustering algorithms ineffective. The curse of dimensionality, among other effects, means that with increasing number of dimensions, a loss of meaningful differentiation between similar and dissimilar objects is observed. As high-dimensional objects appear almost alike, new approaches for clustering are required. Consequently, recent research has focused on developing techniques and clustering algorithms specifically for high-dimensional data. Still, open research issues remain. Clustering is a data mining task devoted to the automatic grouping of data based on mutual similarity. Each cluster...
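The loss of differentiation between similar and dissimilar objects can be observed numerically. The sketch below is an illustration under simple assumptions (points drawn uniformly from the unit hypercube, Euclidean distance, a hypothetical helper named `relative_contrast`); it measures how far apart the nearest and farthest neighbours of a query are, relative to the nearest one.

```python
import math
import random

def relative_contrast(dim, n_points=200, seed=0):
    """(d_max - d_min) / d_min for distances from one random query to random points."""
    rng = random.Random(seed)
    query = [rng.random() for _ in range(dim)]
    dists = [math.dist(query, [rng.random() for _ in range(dim)])
             for _ in range(n_points)]
    return (max(dists) - min(dists)) / min(dists)

low = relative_contrast(2)     # nearest and farthest points differ strongly
high = relative_contrast(500)  # all points are at nearly the same distance
```

In low dimension the farthest point is many times farther away than the nearest one; in high dimension the ratio shrinks towards zero, which is why distance-based clustering loses its discriminating power.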
Modeling high dimensional multichannel brain signals
Hu, Lechuan
2017-03-27
In this paper, our goal is to model functional and effective (directional) connectivity in a network of multichannel brain physiological signals (e.g., electroencephalograms, local field potentials). The primary challenges here are twofold: first, there are major statistical and computational difficulties for modeling and analyzing high dimensional multichannel brain signals; second, there is no set of universally-agreed measures for characterizing connectivity. To model multichannel brain signals, our approach is to fit a vector autoregressive (VAR) model with sufficiently high order so that complex lead-lag temporal dynamics between the channels can be accurately characterized. However, such a model contains a large number of parameters. Thus, we will estimate the high dimensional VAR parameter space by our proposed hybrid LASSLE method (LASSO+LSE), which imposes regularization in the first step (to control for sparsity) and constrained least squares estimation in the second step (to improve bias and mean-squared error of the estimator). Then to characterize connectivity between channels in a brain network, we will use various measures but put an emphasis on partial directed coherence (PDC) in order to capture directional connectivity between channels. PDC is a directed frequency-specific measure that explains the extent to which the present oscillatory activity in a sender channel influences the future oscillatory activity in a specific receiver channel relative to all possible receivers in the network. Using the proposed modeling approach, we have achieved some insights on learning in a rat engaged in a non-spatial memory task.
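The two-step idea (LASSO to select a sparse support, then least squares restricted to that support) can be sketched on a generic linear regression; a VAR model can be stacked into this form by regressing each channel on lagged values of all channels. This is an illustrative reading of the abstract, not the authors' LASSLE implementation: the coordinate-descent solver, the function names and the interpretation of "constrained" as "coefficients outside the support fixed at zero" are assumptions.

```python
def soft_threshold(z, lam):
    """Shrink z towards zero by lam (the LASSO proximal step)."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0

def coordinate_descent(X, y, lam, active=None, n_iter=200):
    """Minimise (1/2n)||y - Xb||^2 + lam * ||b||_1 over the active coefficients.

    Coefficients outside `active` stay fixed at zero. Columns of X are assumed
    to be non-zero. With lam = 0 this is ordinary least squares on the support.
    """
    n, p = len(X), len(X[0])
    active = range(p) if active is None else active
    b = [0.0] * p
    for _ in range(n_iter):
        for j in active:
            # Correlation of column j with the residual that excludes column j.
            rho = sum(X[i][j] * (y[i] - sum(X[i][k] * b[k] for k in range(p) if k != j))
                      for i in range(n)) / n
            z = sum(X[i][j] ** 2 for i in range(n)) / n
            b[j] = soft_threshold(rho, lam) / z
    return b

def lasso_then_lse(X, y, lam):
    """Step 1: LASSO selects the support. Step 2: least squares refit on it."""
    b_lasso = coordinate_descent(X, y, lam)
    support = [j for j, v in enumerate(b_lasso) if v != 0.0]
    b_refit = coordinate_descent(X, y, 0.0, active=support)
    return b_lasso, b_refit

# Toy example: y depends only on the first column, with true coefficient 2.
X = [[1.0, 1.0], [-1.0, 1.0], [1.0, -1.0], [-1.0, -1.0]]
y = [2.0, -2.0, 2.0, -2.0]
b_lasso, b_refit = lasso_then_lse(X, y, lam=0.5)
```

In this toy case the LASSO step shrinks the first coefficient below its true value and sets the second to zero; the refit keeps the zero and removes the shrinkage, which is the bias reduction the second step is meant to provide.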
Modeling high dimensional multichannel brain signals
Hu, Lechuan; Fortin, Norbert; Ombao, Hernando
2017-01-01
In this paper, our goal is to model functional and effective (directional) connectivity in a network of multichannel brain physiological signals (e.g., electroencephalograms, local field potentials). The primary challenges here are twofold: first, there are major statistical and computational difficulties in modeling and analyzing high dimensional multichannel brain signals; second, there is no set of universally agreed measures for characterizing connectivity. To model multichannel brain signals, our approach is to fit a vector autoregressive (VAR) model with sufficiently high order so that complex lead-lag temporal dynamics between the channels can be accurately characterized. However, such a model contains a large number of parameters. Thus, we will estimate the high dimensional VAR parameter space by our proposed hybrid LASSLE method (LASSO+LSE), which imposes regularization in the first step (to control for sparsity) and constrained least squares estimation in the second step (to improve bias and mean-squared error of the estimator). Then, to characterize connectivity between channels in a brain network, we will use various measures but put an emphasis on partial directed coherence (PDC) in order to capture directional connectivity between channels. PDC is a directed frequency-specific measure that explains the extent to which the present oscillatory activity in a sender channel influences the future oscillatory activity in a specific receiver channel relative to all possible receivers in the network. Using the proposed modeling approach, we have gained some insights into learning in a rat engaged in a non-spatial memory task.
CSIR Research Space (South Africa)
Mc
2012-07-01
High dimensional... entanglement. M. McLaren (1,2), F.S. Roux (1) & A. Forbes (1,2,3). 1. CSIR National Laser Centre, PO Box 395, Pretoria 0001. 2. School of Physics, University of Stellenbosch, Private Bag X1, 7602, Matieland. 3. School of Physics, University of Kwazulu...
MFV Reductions of MSSM Parameter Space
AbdusSalam, S.S.; Quevedo, F.
2015-01-01
The 100+ free parameters of the minimal supersymmetric standard model (MSSM) make it computationally difficult to compare systematically with data, motivating the study of specific parameter reductions such as the cMSSM and pMSSM. Here we instead study the reductions of parameter space implied by using minimal flavour violation (MFV) to organise the R-parity conserving MSSM, with a view towards systematically building in constraints on flavour-violating physics. Within this framework the space of parameters is reduced by expanding soft supersymmetry-breaking terms in powers of the Cabibbo angle, leading to a 24-, 30- or 42-parameter framework (which we call MSSM-24, MSSM-30, and MSSM-42 respectively), depending on the order kept in the expansion. We provide a Bayesian global fit to data of the MSSM-30 parameter set to show that this is manageable with current tools. We compare the MFV reductions to the 19-parameter pMSSM choice and show that the pMSSM is not contained as a subset. The MSSM-30 analysis favours...
Ye, Fei
2017-01-01
In this paper, we propose a new automatic hyperparameter selection approach for determining the optimal network configuration (network structure and hyperparameters) for deep neural networks using particle swarm optimization (PSO) in combination with a steepest gradient descent algorithm. In the proposed approach, network configurations are coded as real-valued m-dimensional vectors, which serve as the individuals of the PSO algorithm in the search procedure. During the search procedure, the PSO algorithm is employed to search for optimal network configurations via the particles moving in a finite search space, and the steepest gradient descent algorithm is used to train the DNN classifier with a few training epochs (to find a local optimal solution) during the population evaluation of PSO. After the optimization scheme, the steepest gradient descent algorithm is performed with more epochs and the final solutions (pbest and gbest) of the PSO algorithm to train a final ensemble model and individual DNN classifiers, respectively. The local search ability of the steepest gradient descent algorithm and the global search capabilities of the PSO algorithm are exploited to determine an optimal solution that is close to the global optimum. We conducted several experiments on hand-written characters and biological activity prediction datasets to show that the DNN classifiers trained by the network configurations expressed by the final solutions of the PSO algorithm, employed to construct an ensemble model and individual classifier, outperform the random approach in terms of the generalization performance. Therefore, the proposed approach can be regarded as an alternative tool for automatic network structure and parameter selection for deep neural networks. PMID:29236718
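The particle update at the core of PSO, with its personal best (pbest) and global best (gbest), can be sketched in a few lines. This is a generic textbook-style PSO on a toy objective, not the authors' hybrid: the objective `sphere` stands in for the validation loss of a briefly trained network, and the coefficients `inertia`, `c_personal`, and `c_global` are assumed values.

```python
import random


def pso_minimize(objective, dim, bounds, n_particles=30, n_iter=200, seed=0,
                 inertia=0.7, c_personal=1.5, c_global=1.5):
    """Minimise objective over a box using a basic global-best PSO."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity mixes momentum, pull to pbest, and pull to gbest.
                vel[i][d] = (inertia * vel[i][d]
                             + c_personal * r1 * (pbest[i][d] - pos[i][d])
                             + c_global * r2 * (gbest[d] - pos[i][d]))
                # Keep the particle inside the finite search space.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def sphere(x):
    """Toy objective with its minimum at (1, 1, ..., 1)."""
    return sum((xi - 1.0) ** 2 for xi in x)


if __name__ == "__main__":
    best, value = pso_minimize(sphere, dim=3, bounds=(-5.0, 5.0))
    print(best, value)
```

In the setting of the abstract, each position vector would encode a network configuration and each objective evaluation would involve a short gradient-descent training run, which is far more expensive than this toy function.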
Physics parameter space of tokamak ignition devices
International Nuclear Information System (INIS)
Selcow, E.C.; Peng, Y.K.M.; Uckan, N.A.; Houlberg, W.A.
1985-01-01
This paper describes the results of a study to explore the physics parameter space of tokamak ignition experiments. A new physics systems code has been developed to perform the study. This code performs a global plasma analysis using steady-state, two-fluid, energy-transport models. In this paper, we discuss the models used in the code and their application to the analysis of compact ignition experiments. 8 refs., 8 figs., 1 tab
Approximating high-dimensional dynamics by barycentric coordinates with linear programming
Energy Technology Data Exchange (ETDEWEB)
Hirata, Yoshito, E-mail: yoshito@sat.t.u-tokyo.ac.jp; Aihara, Kazuyuki; Suzuki, Hideyuki [Institute of Industrial Science, The University of Tokyo, 4-6-1 Komaba, Meguro-ku, Tokyo 153-8505 (Japan); Department of Mathematical Informatics, The University of Tokyo, Bunkyo-ku, Tokyo 113-8656 (Japan); CREST, JST, 4-1-8 Honcho, Kawaguchi, Saitama 332-0012 (Japan); Shiro, Masanori [Department of Mathematical Informatics, The University of Tokyo, Bunkyo-ku, Tokyo 113-8656 (Japan); Mathematical Neuroinformatics Group, Advanced Industrial Science and Technology, Tsukuba, Ibaraki 305-8568 (Japan); Takahashi, Nozomu; Mas, Paloma [Center for Research in Agricultural Genomics (CRAG), Consorci CSIC-IRTA-UAB-UB, Barcelona 08193 (Spain)
2015-01-15
The increasing development of novel methods and techniques facilitates the measurement of high-dimensional time series but challenges our ability to model and predict them accurately. A general mathematical model requires many parameters, which are difficult to fit to the relatively short high-dimensional time series that are observed. Here, we propose a novel method to accurately model a high-dimensional time series. Our method extends barycentric coordinates to high-dimensional phase space by employing linear programming and explicitly allowing for approximation errors. The extension helps to produce free-running time-series predictions that preserve typical topological, dynamical, and/or geometric characteristics of the underlying attractors more accurately than the widely used radial basis function model. The method can be broadly applied, from helping to improve weather forecasting, to creating electronic instruments that sound more natural, to comprehensively understanding complex biological data.
Ferdosi, Bilkis J.; Buddelmeijer, Hugo; Trager, Scott; Wilkinson, Michael H.F.; Roerdink, Jos B.T.M.
2010-01-01
Data sets in astronomy are growing to enormous sizes. Modern astronomical surveys provide not only image data but also catalogues of millions of objects (stars, galaxies), each object with hundreds of associated parameters. Exploration of this very high-dimensional data space poses a huge challenge.
High dimensional neurocomputing growth, appraisal and applications
Tripathi, Bipin Kumar
2015-01-01
The book presents a coherent understanding of computational intelligence from the perspective of what is known as "intelligent computing" with high-dimensional parameters. It critically discusses the central issues of high-dimensional neurocomputing, such as quantitative representation of signals, extending the dimensionality of the neuron, supervised and unsupervised learning, and the design of higher-order neurons. The strong point of the book is its clarity and the ability of the underlying theory to unify our understanding of high-dimensional computing where conventional methods fail. Plenty of application-oriented problems are presented for evaluating, monitoring and maintaining the stability of adaptive learning machines. The author has taken care to cover the breadth and depth of the subject, both qualitatively and quantitatively. The book is intended to enlighten the scientific community, ranging from advanced undergraduates to engineers, scientists and seasoned researchers in computational intelligenc...
HL-LHC parameter space and scenarios
International Nuclear Information System (INIS)
Bruning, O.S.
2012-01-01
The HL-LHC project aims at a total integrated luminosity of approximately 3000 fb⁻¹ over the lifetime of the HL-LHC. Assuming an exploitation period of ca. 10 years, this goal implies an annual integrated luminosity of approximately 200 fb⁻¹ to 300 fb⁻¹. This paper looks at potential beam parameters that are compatible with the HL-LHC performance goals and briefly discusses potential variation in the parameter space. It is shown that the design goal of the HL-LHC project can only be achieved with a full upgrade of the injector complex and operation with β* values close to 0.15 m. Significant margins for leveling can be achieved for β* values close to 0.15 m. However, these margins can only be harvested during HL-LHC operation if the required leveling techniques have been demonstrated in operation.
Reduced basis ANOVA methods for partial differential equations with high-dimensional random inputs
Energy Technology Data Exchange (ETDEWEB)
Liao, Qifeng, E-mail: liaoqf@shanghaitech.edu.cn [School of Information Science and Technology, ShanghaiTech University, Shanghai 200031 (China); Lin, Guang, E-mail: guanglin@purdue.edu [Department of Mathematics & School of Mechanical Engineering, Purdue University, West Lafayette, IN 47907 (United States)
2016-07-15
In this paper we present a reduced basis ANOVA approach for partial differential equations (PDEs) with random inputs. The ANOVA method combined with stochastic collocation methods provides model reduction in high-dimensional parameter space through decomposing high-dimensional inputs into unions of low-dimensional inputs. In this work, to further reduce the computational cost, we investigate spatial low-rank structures in the ANOVA-collocation method, and develop efficient spatial model reduction techniques using hierarchically generated reduced bases. We present a general mathematical framework of the methodology, validate its accuracy and demonstrate its efficiency with numerical experiments.
High-dimensional covariance estimation with high-dimensional data
Pourahmadi, Mohsen
2013-01-01
Methods for estimating sparse and large covariance matrices. Covariance and correlation matrices play fundamental roles in every aspect of the analysis of multivariate data collected from a variety of fields including business and economics, health care, engineering, and environmental and physical sciences. High-Dimensional Covariance Estimation provides accessible and comprehensive coverage of the classical and modern approaches for estimating covariance matrices as well as their applications to the rapidly developing areas lying at the intersection of statistics and mac
International Nuclear Information System (INIS)
Guerrieri, A.
2009-01-01
In this report the largest Lyapunov characteristic exponent of a high dimensional atmospheric global circulation model of intermediate complexity has been estimated numerically. A sensitivity analysis has been carried out by varying the equator-to-pole temperature difference, the space resolution and the value of some parameters employed by the model. Chaotic and non-chaotic regimes of circulation have been found.
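How a largest Lyapunov exponent separates chaotic from non-chaotic regimes as a parameter is varied can be shown on a much simpler system. The sketch below uses the one-dimensional logistic map as a stand-in; it is not the circulation model from the report, and the function name and parameter values are assumptions chosen for illustration.

```python
import math


def logistic_lyapunov(r, x0=0.2, n_transient=1000, n_iter=100_000):
    """Estimate the Lyapunov exponent of x -> r * x * (1 - x) as the
    orbit average of log|f'(x)|, where f'(x) = r * (1 - 2x)."""
    x = x0
    for _ in range(n_transient):  # discard the transient
        x = r * x * (1.0 - x)
    total = 0.0
    for _ in range(n_iter):
        x = r * x * (1.0 - x)
        # The tiny floor guards against log(0) if the orbit hits x = 0.5.
        total += math.log(max(abs(r * (1.0 - 2.0 * x)), 1e-300))
    return total / n_iter


if __name__ == "__main__":
    # A negative exponent indicates a regular regime, a positive one chaos.
    for r in (3.2, 3.5, 3.9):
        print(r, round(logistic_lyapunov(r), 3))
```

For a high-dimensional model the derivative is replaced by the tangent linear dynamics and the growth of a perturbation vector is renormalised periodically, but the sign criterion is the same.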
Multiplicity distributions in impact parameter space
International Nuclear Information System (INIS)
Wakano, Masami
1976-01-01
A definition for the average multiplicity of pions as a function of momentum transfer and total energy in the high energy proton-proton collisions is proposed by using the n-pion production differential cross section with the given momentum transfer from a proton to other final products and the given energy of the latter. Contributions from nondiffractive and diffractive processes are formulated in a multi-Regge model. We define a relationship between impact parameter and momentum transfer in the sense of classical theory for inelastic processes and we obtain the average multiplicity of pions as a function of impact parameter and total energy from the corresponding quantity afore-mentioned. By comparing this quantity with the square root of the opaqueness at given impact parameter, we conclude that the overlap of localized constituents is important in determining the opaqueness at given impact parameter in a collision of two hadrons. (auth.)
Parameter space of general gauge mediation
International Nuclear Information System (INIS)
Rajaraman, Arvind; Shirman, Yuri; Smidt, Joseph; Yu, Felix
2009-01-01
We study a subspace of General Gauge Mediation (GGM) models which generalize models of gauge mediation. We find superpartner spectra that are markedly different from those of typical gauge and gaugino mediation scenarios. While typical gauge mediation predictions of either a neutralino or stau next-to-lightest supersymmetric particle (NLSP) are easily reproducible with the GGM parameters, chargino and sneutrino NLSPs are generic for many reasonable choices of GGM parameters.
A qualitative numerical study of high dimensional dynamical systems
Albers, David James
Since Poincaré, the father of modern mathematical dynamical systems, much effort has been exerted to achieve a qualitative understanding of the physical world via a qualitative understanding of the functions we use to model the physical world. In this thesis, we construct a numerical framework suitable for a qualitative, statistical study of dynamical systems using the space of artificial neural networks. We analyze the dynamics along intervals in parameter space, separating the set of neural networks into roughly four regions: the fixed point to the first bifurcation; the route to chaos; the chaotic region; and a transition region between chaos and finite-state neural networks. The study is primarily concerned with high-dimensional dynamical systems. We make the following general conclusions as the dimension of the dynamical system is increased: the probability of the first bifurcation being of Neimark-Sacker type is greater than ninety percent; the most probable route to chaos is via a cascade of bifurcations of high-period periodic orbits, quasi-periodic orbits, and 2-tori; there exists an interval of parameter space such that hyperbolicity is violated on a countable, Lebesgue measure 0, "increasingly dense" subset; chaos is much more likely to persist with respect to parameter perturbation in the chaotic region of parameter space as the dimension is increased; moreover, as the number of positive Lyapunov exponents is increased, the likelihood that any significant portion of these positive exponents can be perturbed away decreases with increasing dimension. The maximum Kaplan-Yorke dimension and the maximum number of positive Lyapunov exponents increase linearly with dimension. The probability of a dynamical system being chaotic increases exponentially with dimension. The results with respect to the first bifurcation and the route to chaos comment on previous results of Newhouse, Ruelle, Takens, Broer, Chenciner, and Iooss.
Moreover, results regarding the high-dimensional
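The Kaplan-Yorke dimension mentioned in this abstract is computed from a Lyapunov spectrum by a standard formula: with exponents sorted in decreasing order and j the largest index whose partial sum is non-negative, D = j + (λ₁ + ... + λ_j) / |λ_{j+1}|. The following is a minimal sketch of that formula; the function name is hypothetical and the example spectrum is a commonly quoted approximation for the Lorenz system, not a result from the thesis.

```python
def kaplan_yorke_dimension(exponents):
    """Kaplan-Yorke (Lyapunov) dimension from a list of Lyapunov exponents."""
    lam = sorted(exponents, reverse=True)
    partial = 0.0
    for j, value in enumerate(lam):
        if partial + value < 0.0:
            # The first j exponents have a non-negative sum; interpolate
            # linearly into the next (negative) exponent.
            return j + partial / abs(value)
        partial += value
    # The sum never becomes negative: the dimension equals the full dimension.
    return float(len(lam))


if __name__ == "__main__":
    # Approximate Lorenz spectrum, used here only as an illustrative input.
    print(kaplan_yorke_dimension([0.906, 0.0, -14.572]))
```

A spectrum whose largest exponent is already negative returns 0, which corresponds to a stable fixed point.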
Charge distributions in transverse coordinate space and in impact parameter space
Energy Technology Data Exchange (ETDEWEB)
Hwang, Dae Sung [Department of Physics, Sejong University, Seoul 143-747 (Korea, Republic of)], E-mail: dshwang@slac.stanford.edu; Kim, Dong Soo [Department of Physics, Kangnung National University, Kangnung 210-702 (Korea, Republic of); Kim, Jonghyun [Department of Physics, Sejong University, Seoul 143-747 (Korea, Republic of)
2008-11-27
We study the charge distributions of the valence quarks inside the nucleon in the transverse coordinate space, which is conjugate to the transverse momentum space. We compare the results with the charge distributions in the impact parameter space.
Hierarchical low-rank approximation for high dimensional approximation
Nouy, Anthony
2016-01-07
Tensor methods are among the most prominent tools for the numerical solution of high-dimensional problems where functions of multiple variables have to be approximated. Such high-dimensional approximation problems naturally arise in stochastic analysis and uncertainty quantification. In many practical situations, the approximation of high-dimensional functions is made computationally tractable by using rank-structured approximations. In this talk, we present algorithms for the approximation in hierarchical tensor format using statistical methods. Sparse representations in a given tensor format are obtained with adaptive or convex relaxation methods, with a selection of parameters using crossvalidation methods.
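The benefit of rank-structured approximation is easiest to see in the two-variable case, where the best low-rank approximation is given by a truncated singular value decomposition. The sketch below covers only this matrix case, not the hierarchical tensor formats or the statistical learning algorithms of the talk; the sampled function and the function name `truncated_svd` are assumptions for illustration.

```python
import numpy as np


def truncated_svd(matrix, rank):
    """Return the best rank-`rank` approximation in the Frobenius norm and
    the size of the discarded part (root sum of squared singular values)."""
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    approx = (u[:, :rank] * s[:rank]) @ vt[:rank]
    discarded = float(np.sqrt(np.sum(s[rank:] ** 2)))
    return approx, discarded


if __name__ == "__main__":
    # Sample the smooth bivariate function f(x, y) = 1 / (1 + x + y) on a grid.
    grid = np.linspace(0.0, 1.0, 50)
    F = 1.0 / (1.0 + grid[:, None] + grid[None, :])
    for rank in (1, 2, 3):
        approx, _ = truncated_svd(F, rank)
        print(rank, np.linalg.norm(F - approx) / np.linalg.norm(F))
```

A rank-r approximation of an n-by-n sample stores about 2nr numbers instead of n²; tensor formats extend this saving to functions of many variables.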
Dynamics in the Parameter Space of a Neuron Model
Rech, Paulo C.
2012-06-01
Some two-dimensional parameter-space diagrams are numerically obtained by considering the largest Lyapunov exponent for a four-dimensional thirteen-parameter Hindmarsh-Rose neuron model. Several different parameter planes are considered, and it is shown that depending on the combination of parameters, a typical scenario can be preserved: for some choice of two parameters, the parameter plane presents a comb-shaped chaotic region embedded in a large periodic region. It is also shown that there exist regions close to these comb-shaped chaotic regions, separated by the comb teeth, organizing themselves in period-adding bifurcation cascades.
An Integrated Approach to Parameter Learning in Infinite-Dimensional Space
Energy Technology Data Exchange (ETDEWEB)
Boyd, Zachary M. [Los Alamos National Lab. (LANL), Los Alamos, NM (United States); Wendelberger, Joanne Roth [Los Alamos National Lab. (LANL), Los Alamos, NM (United States)
2017-09-14
The availability of sophisticated modern physics codes has greatly extended the ability of domain scientists to understand the processes underlying their observations of complicated processes, but it has also introduced the curse of dimensionality via the many user-set parameters available to tune. Many of these parameters are naturally expressed as functional data, such as initial temperature distributions, equations of state, and controls. Thus, when attempting to find parameters that match observed data, being able to navigate parameter space becomes highly non-trivial, especially considering that accurate simulations can be expensive both in terms of time and money. Existing solutions include batch-parallel simulations, high-dimensional, derivative-free optimization, and expert guessing, all of which make some contribution to solving the problem but do not completely resolve the issue. In this work, we explore the possibility of coupling together all three of the techniques just described by designing user-guided, batch-parallel optimization schemes. Our motivating example is a neutron diffusion partial differential equation where the time-varying multiplication factor serves as the unknown control parameter to be learned. We find that a simple, batch-parallelizable, random-walk scheme is able to make some progress on the problem but does not by itself produce satisfactory results. After reducing the dimensionality of the problem using functional principal component analysis (fPCA), we are able to track the progress of the solver in a visually simple way as well as to view the associated principal components. This allows a human to make reasonable guesses about which points in the state space the random walker should try next. Thus, by combining the random walker's ability to find descent directions with the human's understanding of the underlying physics, it is possible to use expensive simulations more efficiently and more quickly arrive at the
Replicate periodic windows in the parameter space of driven oscillators
Energy Technology Data Exchange (ETDEWEB)
Medeiros, E.S., E-mail: esm@if.usp.br [Instituto de Fisica, Universidade de Sao Paulo, Sao Paulo (Brazil); Souza, S.L.T. de [Universidade Federal de Sao Joao del-Rei, Campus Alto Paraopeba, Minas Gerais (Brazil); Medrano-T, R.O. [Departamento de Ciencias Exatas e da Terra, Universidade Federal de Sao Paulo, Diadema, Sao Paulo (Brazil); Caldas, I.L. [Instituto de Fisica, Universidade de Sao Paulo, Sao Paulo (Brazil)
2011-11-15
Highlights: We apply a weak harmonic perturbation to control chaos in two driven oscillators. We find replicate periodic windows in the driven oscillator parameter space. We find that the periodic window replication is associated with the chaos control. Abstract: In the two-dimensional parameter space of driven oscillators, shrimp-shaped periodic windows are immersed in chaotic regions. For two of these oscillators, namely the Duffing oscillator and the Josephson junction, we show that a weak harmonic perturbation replicates these periodic windows, giving rise to parameter regions corresponding to periodic orbits. The new windows are composed of parameters whose periodic orbits have the same periodicity and pattern of stable and unstable periodic orbits already existing for the unperturbed oscillator. Moreover, these unstable periodic orbits are embedded in chaotic attractors in phase space regions where the new stable orbits are identified. Thus, the observed periodic window replication is an effective oscillator control process, since chaotic orbits are replaced by regular ones.
Parameter and State Estimator for State Space Models
Directory of Open Access Journals (Sweden)
Ruifeng Ding
2014-01-01
This paper proposes a parameter and state estimator for canonical state space systems from measured input-output data. The key is to solve for the system state from the state equation and substitute it into the output equation, eliminating the state variables; the resulting equation contains only the system inputs and outputs, and from it a least squares parameter identification algorithm is derived. Furthermore, the system states are computed from the estimated parameters and the input-output data. Convergence analysis using the martingale convergence theorem indicates that the parameter estimates converge to their true values. Finally, an illustrative example is provided to show that the proposed algorithm is effective.
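Once the state has been eliminated, identification reduces to a linear regression of the output on past inputs and outputs. The sketch below assumes a second-order single-input single-output system written in input-output form with white equation noise; it uses batch least squares rather than the paper's algorithm, and the function names and coefficient values are hypothetical.

```python
import numpy as np


def simulate(a, b, u, noise_std, rng):
    """Generate y[t] = -a[0]*y[t-1] - a[1]*y[t-2]
                      + b[0]*u[t-1] + b[1]*u[t-2] + v[t]."""
    y = np.zeros(len(u))
    for t in range(2, len(u)):
        y[t] = (-a[0] * y[t - 1] - a[1] * y[t - 2]
                + b[0] * u[t - 1] + b[1] * u[t - 2]
                + noise_std * rng.standard_normal())
    return y


def least_squares_identify(u, y):
    """Stack regressors phi[t] = [-y[t-1], -y[t-2], u[t-1], u[t-2]] for
    t = 2..n-1 and solve the least squares problem for [a1, a2, b1, b2]."""
    phi = np.column_stack([-y[1:-1], -y[:-2], u[1:-1], u[:-2]])
    theta, *_ = np.linalg.lstsq(phi, y[2:], rcond=None)
    return theta


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    u = rng.standard_normal(5000)  # persistently exciting input
    y = simulate([-1.2, 0.5], [1.0, 0.5], u, 0.1, rng)
    print(np.round(least_squares_identify(u, y), 3))
```

With the parameters estimated, the states of a canonical realisation can be reconstructed from the same input-output records, which is the second half of the procedure the abstract describes.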
Held, Christian; Nattkemper, Tim; Palmisano, Ralf; Wittenberg, Thomas
2013-01-01
Research and diagnosis in medicine and biology often require the assessment of a large amount of microscopy image data. Although digital pathology and new bioimaging technologies are finding their way into clinical practice and pharmaceutical research, some general methodological issues in automated image analysis are still open. In this study, we address the problem of fitting the parameters in a microscopy image segmentation pipeline. We propose to fit the parameters of the pipeline's modules with optimization algorithms, such as genetic algorithms or coordinate descent, and show how visual exploration of the parameter space can help to identify sub-optimal parameter settings that need to be avoided. This is of significant help in the design of our automatic parameter fitting framework, which enables us to tune the pipeline for large sets of micrographs. The underlying parameter spaces pose a challenge for manual as well as automated parameter optimization, as the parameter spaces can show several local performance maxima. Hence, optimization strategies that are not able to jump out of local performance maxima, like the hill climbing algorithm, often end in a local maximum.
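The failure mode named at the end of this abstract, hill climbing stopping at a local performance maximum, can be reproduced on a one-parameter toy problem. The performance curve below is hypothetical and is not a segmentation quality measure from the study; random restarts are shown as one simple remedy, whereas the authors mention genetic algorithms and coordinate descent.

```python
import math
import random


def performance(x):
    """Hypothetical performance curve: a local maximum near x = 2 and a
    higher global maximum near x = 7."""
    return math.exp(-(x - 2.0) ** 2) + 2.0 * math.exp(-(x - 7.0) ** 2 / 2.0)


def hill_climb(f, x, step=0.05, max_iter=10_000):
    """Move to the better neighbour until neither neighbour improves f."""
    for _ in range(max_iter):
        best = max((x - step, x, x + step), key=f)
        if best == x:
            break
        x = best
    return x


def hill_climb_with_restarts(f, lo, hi, n_restarts=20, seed=0):
    """Run hill climbing from several random starts and keep the best result."""
    rng = random.Random(seed)
    starts = [rng.uniform(lo, hi) for _ in range(n_restarts)]
    return max((hill_climb(f, s) for s in starts), key=f)


if __name__ == "__main__":
    stuck = hill_climb(performance, 1.0)
    best = hill_climb_with_restarts(performance, 0.0, 10.0)
    print(stuck, performance(stuck))
    print(best, performance(best))
```

Started at x = 1, the single climb settles on the lower peak, while the restarted search reaches the higher one because at least one start falls in its basin.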
Directory of Open Access Journals (Sweden)
Christian Held
2013-01-01
Introduction: Research and diagnosis in medicine and biology often require the assessment of a large amount of microscopy image data. Although digital pathology and new bioimaging technologies are finding their way into clinical practice and pharmaceutical research, some general methodological issues in automated image analysis are still open. Methods: In this study, we address the problem of fitting the parameters in a microscopy image segmentation pipeline. We propose to fit the parameters of the pipeline's modules with optimization algorithms, such as genetic algorithms or coordinate descent, and show how visual exploration of the parameter space can help to identify sub-optimal parameter settings that need to be avoided. Results: This is of significant help in the design of our automatic parameter fitting framework, which enables us to tune the pipeline for large sets of micrographs. Conclusion: The underlying parameter spaces pose a challenge for manual as well as automated parameter optimization, as the parameter spaces can show several local performance maxima. Hence, optimization strategies that are not able to jump out of local performance maxima, like the hill climbing algorithm, often end in a local maximum.
Recovering a Probabilistic Knowledge Structure by Constraining Its Parameter Space
Stefanutti, Luca; Robusto, Egidio
2009-01-01
In the Basic Local Independence Model (BLIM) of Doignon and Falmagne ("Knowledge Spaces," Springer, Berlin, 1999), the probabilistic relationship between the latent knowledge states and the observable response patterns is established by the introduction of a pair of parameters for each of the problems: a lucky guess probability and a careless…
Harnessing high-dimensional hyperentanglement through a biphoton frequency comb
Xie, Zhenda; Zhong, Tian; Shrestha, Sajan; Xu, Xinan; Liang, Junlin; Gong, Yan-Xiao; Bienfang, Joshua C.; Restelli, Alessandro; Shapiro, Jeffrey H.; Wong, Franco N. C.; Wei Wong, Chee
2015-08-01
Quantum entanglement is a fundamental resource for secure information processing and communications, and hyperentanglement or high-dimensional entanglement has been separately proposed for its high data capacity and error resilience. The continuous-variable nature of the energy-time entanglement makes it an ideal candidate for efficient high-dimensional coding with minimal limitations. Here, we demonstrate the first simultaneous high-dimensional hyperentanglement using a biphoton frequency comb to harness the full potential in both the energy and time domain. Long-postulated Hong-Ou-Mandel quantum revival is exhibited, with up to 19 time-bins and 96.5% visibilities. We further witness the high-dimensional energy-time entanglement through Franson revivals, observed periodically at integer time-bins, with 97.8% visibility. This qudit state is observed to simultaneously violate the generalized Bell inequality by up to 10.95 standard deviations while observing recurrent Clauser-Horne-Shimony-Holt S-parameters up to 2.76. Our biphoton frequency comb provides a platform for photon-efficient quantum communications towards the ultimate channel capacity through energy-time-polarization high-dimensional encoding.
On the Consistency of Bootstrap Testing for a Parameter on the Boundary of the Parameter Space
DEFF Research Database (Denmark)
Cavaliere, Giuseppe; Nielsen, Heino Bohn; Rahbek, Anders
2017-01-01
It is well known that with a parameter on the boundary of the parameter space, such as in the classic cases of testing for a zero location parameter or no autoregressive conditional heteroskedasticity (ARCH) effects, the classic nonparametric bootstrap – based on unrestricted parameter estimates...... – leads to inconsistent testing. In contrast, we show here that for the two aforementioned cases, a nonparametric bootstrap test based on parameter estimates obtained under the null – referred to as ‘restricted bootstrap’ – is indeed consistent. While the restricted bootstrap is simple to implement...... in practice, novel theoretical arguments are required in order to establish consistency. In particular, since the bootstrap is analysed both under the null hypothesis and under the alternative, non-standard asymptotic expansions are required to deal with parameters on the boundary. Detailed proofs...
Analysis of chaos in high-dimensional wind power system.
Wang, Cong; Zhang, Hongli; Fan, Wenhui; Ma, Ping
2018-01-01
A comprehensive analysis of chaos in a high-dimensional wind power system is performed in this study. A high-dimensional wind power system is more complex than most power systems. An 11-dimensional wind power system proposed by Huang, which has not been analyzed in previous studies, is investigated. The chaotic dynamics of the system are analyzed, and the parameter ranges that produce chaos are obtained, for the cases where the system is affected by external disturbances (including single-parameter and periodic disturbances) or where its parameters change. The existence of chaos is confirmed by calculating and analyzing the Lyapunov exponents of all state variables and the state variable sequence diagrams. Theoretical analysis and numerical simulations show that chaos occurs in the wind power system when parameter variations and external disturbances reach a certain degree.
Modeling High-Dimensional Multichannel Brain Signals
Hu, Lechuan; Fortin, Norbert J.; Ombao, Hernando
2017-01-01
aspects: first, there are major statistical and computational challenges for modeling and analyzing high-dimensional multichannel brain signals; second, there is no set of universally agreed measures for characterizing connectivity. To model multichannel
Parameter choice in Banach space regularization under variational inequalities
International Nuclear Information System (INIS)
Hofmann, Bernd; Mathé, Peter
2012-01-01
The authors study parameter choice strategies for the Tikhonov regularization of nonlinear ill-posed problems in Banach spaces. The effectiveness of any parameter choice for obtaining convergence rates depends on the interplay of the solution smoothness and the nonlinearity structure, and it can be expressed concisely in terms of variational inequalities. Such inequalities are link conditions between the penalty term, the norm misfit and the corresponding error measure. The parameter choices under consideration include an a priori choice, the discrepancy principle as well as the Lepskii principle. For the convenience of the reader, the authors review in an appendix a few instances where the validity of a variational inequality can be established. (paper)
Tracking in Object Action Space
DEFF Research Database (Denmark)
Krüger, Volker; Herzog, Dennis
2013-01-01
the space of the object affordances, i.e., the space of possible actions that are applied on a given object. This way, 3D body tracking reduces to action tracking in the object (and context) primed parameter space of the object affordances. This reduces the high-dimensional joint-space to a low...
Determination of Geometric Parameters of Space Steel Constructions
Directory of Open Access Journals (Sweden)
Jitka Suchá
2005-06-01
The paper contains conclusions of the PhD thesis "Accuracy of determination of geometric parameters of space steel construction using geodetic methods". Generally it is a difficult task with high requirements for the accuracy and reliability of the results, i.e. the space coordinates of assessed points on a steel construction. The solution of this task is complicated by atmospheric influences, above all temperature, which strongly affects steel constructions. It is desirable to eliminate the influence of temperature when evaluating the geometric parameters. The choice of an efficient geodetic method that fulfils these demanding requirements is often limited by the constrained space in the immediate neighbourhood of the measured construction. These conditions prevent the choice of an efficient point configuration for a geodetic micro network, e.g. for forward intersection. In addition, points of a construction are often hardly accessible, and therefore marking is difficult. For these reasons the space polar method appears efficient, and its advantages were increased by the introduction of self-adhesive reflective targets for distance measurement, which enable permanent marking of measured points already in the course of placing the construction.
High dimensional model representation method for fuzzy structural dynamics
Adhikari, S.; Chowdhury, R.; Friswell, M. I.
2011-03-01
Uncertainty propagation in multi-parameter complex structures poses significant computational challenges. This paper investigates the possibility of using the High Dimensional Model Representation (HDMR) approach when uncertain system parameters are modeled using fuzzy variables. In particular, the application of HDMR is proposed for fuzzy finite element analysis of linear dynamical systems. The HDMR expansion is an efficient formulation for high-dimensional mapping in complex systems if the higher-order variable correlations are weak, thereby permitting the input-output relationship behavior to be captured by the low-order terms. The computational effort to determine the expansion functions using the α-cut method scales polynomially with the number of variables rather than exponentially. This logic is based on the fundamental assumption underlying the HDMR representation that only low-order correlations among the input variables are likely to have significant impacts upon the outputs for most high-dimensional complex systems. The proposed method is first illustrated for multi-parameter nonlinear mathematical test functions with fuzzy variables. The method is then integrated with commercial finite element software (ADINA). Modal analysis of a simplified aircraft wing with fuzzy parameters has been used to illustrate the generality of the proposed approach. In the numerical examples, triangular membership functions have been used and the results have been validated against direct Monte Carlo simulations. It is shown that using the proposed HDMR approach, the number of finite element function calls can be reduced without significantly compromising the accuracy.
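The low-order idea behind HDMR can be sketched with a first-order "cut" expansion around a reference point c, f(x) ≈ f(c) + Σ_i [f(c_1, …, x_i, …, c_n) − f(c)]. The sketch below is only an illustration of that truncation under stated assumptions: the function name `cut_hdmr_first_order` and both test functions are hypothetical, and the fuzzy α-cut treatment and finite element coupling described in the abstract are not modeled.

```python
from typing import Callable, List, Sequence

Model = Callable[[Sequence[float]], float]


def cut_hdmr_first_order(f: Model, cut_point: Sequence[float]) -> Model:
    """Build a first-order cut-HDMR surrogate of f around cut_point."""
    c: List[float] = list(cut_point)
    f0 = f(c)  # zeroth-order term: response at the reference point

    def approx(x: Sequence[float]) -> float:
        total = f0
        for i, xi in enumerate(x):
            point = c.copy()
            point[i] = xi            # vary one input, hold the others at the cut point
            total += f(point) - f0   # first-order component function for input i
        return total

    return approx


def additive(x: Sequence[float]) -> float:
    """Hypothetical test function without interactions between inputs."""
    return x[0] ** 2 + 3 * x[1] - x[2]


def coupled(x: Sequence[float]) -> float:
    """Hypothetical test function with a pure second-order interaction."""
    return x[0] * x[1]


if __name__ == "__main__":
    print(cut_hdmr_first_order(additive, [1, 1, 1])([2, 0, 4]), additive([2, 0, 4]))
    print(cut_hdmr_first_order(coupled, [1, 1])([2, 3]), coupled([2, 3]))
```

For the additive function the surrogate is exact, while for the coupled function the missing second-order term shows up as an error. Each surrogate evaluation costs one model call per input variable, which is the linear (rather than exponential) growth that makes low-order truncations attractive when interactions are weak.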
Parameter space of experimental chaotic circuits with high-precision control parameters
Energy Technology Data Exchange (ETDEWEB)
Sousa, Francisco F. G. de; Rubinger, Rero M. [Instituto de Física e Química, Universidade Federal de Itajubá, Itajubá, MG (Brazil); Sartorelli, José C., E-mail: sartorelli@if.usp.br [Universidade de São Paulo, São Paulo, SP (Brazil); Albuquerque, Holokx A. [Departamento de Física, Universidade do Estado de Santa Catarina, Joinville, SC (Brazil); Baptista, Murilo S. [Institute of Complex Systems and Mathematical Biology, SUPA, University of Aberdeen, Aberdeen (United Kingdom)
2016-08-15
We report high-resolution measurements that experimentally confirm a spiral cascade structure and a scaling relationship of shrimps in Chua's circuit. Circuits constructed using this component allow for a comprehensive characterization of the circuit behaviors through high-resolution parameter spaces. To illustrate the power of our technological development for the creation and the study of chaotic circuits, we constructed a Chua circuit and studied its high-resolution parameter space. The reliability and stability of the designed component allowed us to obtain data for long periods of time (∼21 weeks), a data set from which an accurate estimation of Lyapunov exponents for the circuit characterization was possible. Moreover, these data, rigorously characterized by the Lyapunov exponents, allow us to confirm experimentally that the shrimps, stable islands embedded in a domain of chaos in the parameter spaces, can be observed in the laboratory. Finally, we confirm that their sizes decay exponentially with the period of the attractor, a result expected to be found in maps of the quadratic family.
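How a Lyapunov exponent separates periodic from chaotic parameter regions can be sketched on the logistic map x → r·x·(1 − x), a standard member of the quadratic family mentioned above. This is a swapped-in toy model, not Chua's circuit or the authors' estimation procedure; the function name, the initial condition, and the iteration counts are assumptions made for illustration.

```python
import math


def lyapunov_logistic(r: float, x0: float = 0.4,
                      n_transient: int = 1000, n_iter: int = 20_000) -> float:
    """Estimate the Lyapunov exponent of x -> r*x*(1-x) as the orbit average of ln|f'(x)|."""
    x = x0
    for _ in range(n_transient):      # discard the transient before averaging
        x = r * x * (1 - x)
    total = 0.0
    for _ in range(n_iter):
        x = r * x * (1 - x)
        slope = abs(r * (1 - 2 * x))  # |f'(x)| for the logistic map
        total += math.log(max(slope, 1e-300))  # floor guards against log(0) at x = 0.5
    return total / n_iter


if __name__ == "__main__":
    for r in (3.2, 3.9):
        lam = lyapunov_logistic(r)
        print(f"r = {r}: lambda = {lam:.3f} ({'chaotic' if lam > 0 else 'periodic'})")
```

A negative exponent indicates a stable periodic orbit and a positive one indicates chaos; scanning two control parameters with such a test is how stable islands inside chaotic regions are typically mapped in a parameter plane.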
Space dependence of reactivity parameters on reactor dynamic perturbation measurements
International Nuclear Information System (INIS)
Maletti, R.; Ziegenbein, D.
1985-01-01
Practical application of reactor-dynamic perturbation measurements for on-power determination of the differential reactivity worth of control rods and of power coefficients of reactivity has shown a significant dependence of the parameters on the position of out-of-core detectors. The space dependence of the neutron flux signal in the core of a VVER-440-type reactor was measured by means of 60 self-powered neutron detectors. The greatest neutron flux alterations are located close to the moved control rods and at the height of the perturbation position. By means of computations, detector positions can be found in the core at which the one-point model is almost valid. (author)
Quantifying high dimensional entanglement with two mutually unbiased bases
Directory of Open Access Journals (Sweden)
Paul Erker
2017-07-01
We derive a framework for quantifying entanglement in multipartite and high dimensional systems using only correlations in two unbiased bases. We furthermore develop such bounds in cases where the second basis is not characterized beyond being unbiased, thus enabling entanglement quantification with minimal assumptions. Furthermore, we show that it is feasible to experimentally implement our method with readily available equipment and even conservative estimates of physical parameters.
Entropy considerations in constraining the mSUGRA parameter space
International Nuclear Information System (INIS)
Nunez, Dario; Sussman, Roberto A.; Zavala, Jesus; Nellen, Lukas; Cabral-Rosetti, Luis G.; Mondragon, Myriam
2006-01-01
We explore the use of two criteria to constrain the allowed parameter space in mSUGRA models. Both criteria are based on the calculation of the present density of neutralinos as dark matter in the Universe. The first one is the usual "abundance" criterion, which is used to calculate the relic density after the "freeze-out" era. To compute the relic density we used the public numerical code micrOMEGAs. The second criterion applies the microcanonical definition of entropy to a weakly interacting and self-gravitating gas, evaluating the change in the entropy per particle of this gas between the "freeze-out" era and present-day virialized structures (i.e., systems in virial equilibrium). An "entropy-consistency" criterion emerges by comparing theoretical and empirical estimates of this entropy. The main objective of our work is to determine for which regions of the parameter space in the mSUGRA model both criteria are consistent with the 2σ WMAP bounds on the relic density: 0.0945 < Ω_CDM h^2 < 0.1287. As a first result, we found that for A_0 = 0 and sgn μ = +, small values of tan β are not favored; only for tan β ≅ 50 are both criteria significantly consistent.
Asymptotically Honest Confidence Regions for High Dimensional
DEFF Research Database (Denmark)
Caner, Mehmet; Kock, Anders Bredahl
While variable selection and oracle inequalities for the estimation and prediction error have received considerable attention in the literature on high-dimensional models, very little work has been done in the area of testing and construction of confidence bands in high-dimensional models. However...... develop an oracle inequality for the conservative Lasso only assuming the existence of a certain number of moments. This is done by means of the Marcinkiewicz-Zygmund inequality which in our context provides sharper bounds than Nemirovski's inequality. As opposed to van de Geer et al. (2014) we allow...
Scoping the parameter space for demo and the engineering test
International Nuclear Information System (INIS)
Meier, W R.
1999-01-01
In our IFE development plan, we have set a goal of building an Engineering Test Facility (ETF) for a total cost of $2B and a Demo for $3B. In Mike Campbell's presentation at Madison, we included a viewgraph with an example Demo that had 80 to 250 MWe of net power and showed a plausible argument that it could cost less than $3B. In this memo, I examine the design space for the Demo and then briefly for the ETF. Instead of attempting to estimate the costs of the drivers, I pose the question in a way to define R&D goals: As a function of key design and performance parameters, how much can the driver cost if the total facility cost is limited to the specified goal? The design parameters examined for the Demo included target gain, driver energy, driver efficiency, and net power output. For the ETF, the design parameters are target gain, driver energy, and target yield. The resulting graphs of allowable driver cost determine the goals that the driver R&D programs must seek to meet.
Scanning the parameter space of collapsing rotating thin shells
Rocha, Jorge V.; Santarelli, Raphael
2018-06-01
We present results of a comprehensive study of collapsing and bouncing thin shells with rotation, framing it in the context of the weak cosmic censorship conjecture. The analysis is based on a formalism developed specifically for higher odd dimensions that is able to describe the dynamics of collapsing rotating shells exactly. We analyse and classify a plethora of shell trajectories in asymptotically flat spacetimes. The parameters varied include the shell’s mass and angular momentum, its radial velocity at infinity, the (linear) equation-of-state parameter and the spacetime dimensionality. We find that plunges of rotating shells into black holes never produce naked singularities, as long as the matter shell obeys the weak energy condition, and so respects cosmic censorship. This applies to collapses of dust shells starting from rest or with a finite velocity at infinity. Not even shells with a negative isotropic pressure component (i.e. tension) lead to the formation of naked singularities, as long as the weak energy condition is satisfied. Endowing the shells with a positive isotropic pressure component allows for the existence of bouncing trajectories satisfying the dominant energy condition and fully contained outside rotating black holes. Otherwise any turning point occurs always inside the horizon. These results are based on strong numerical evidence from scans of numerous sections in the large parameter space available to these collapsing shells. The generalisation of the radial equation of motion to a polytropic equation-of-state for the matter shell is also included in an appendix.
Evasive Maneuvers in Space Debris Environment and Technological Parameters
Directory of Open Access Journals (Sweden)
Antônio D. C. Jesus
2012-01-01
We present a study of collisional dynamics between space debris and an operational vehicle in LEO. We adopted an approach based on the relative dynamics between the objects on a collisional course and with a short warning time and established a semianalytical solution for the final trajectories of these objects. Our results show that there are angular ranges in 3D, in addition to the initial conditions, that favor the collisions. These results allowed the investigation of a range of technological parameters for the spacecraft (e.g., fuel reserve) that allow a safe evasive maneuver (e.g., time available for the maneuver). The numerical model was tested for different values of the impact velocity and relative distance between the approaching objects.
Application of parameters space analysis tools for empirical model validation
Energy Technology Data Exchange (ETDEWEB)
Paloma del Barrio, E. [LEPT-ENSAM UMR 8508, Talence (France); Guyon, G. [Electricite de France, Moret-sur-Loing (France)
2004-01-01
A new methodology for empirical model validation has been proposed in the framework of Task 22 (Building Energy Analysis Tools) of the International Energy Agency. It involves two main steps: checking model validity and diagnosis. Both steps, as well as the underlying methods, have been presented in the first part of the paper. In this part, they are applied to testing modelling hypotheses in the framework of the thermal analysis of an actual building. Sensitivity analysis tools were first used to identify the parts of the model that can really be tested on the available data. A preliminary diagnosis was then supplied by principal components analysis. Finally, useful information for improving model behaviour was obtained by optimisation techniques. This example application shows that parameter space analysis is a powerful tool for empirical validation. In particular, diagnosis possibilities are greatly increased in comparison with residuals analysis techniques. (author)
The MSSM Parameter Space with Non-Universal Higgs Masses
Ellis, Jonathan Richard; Santoso, Y; Ellis, John; Olive, Keith A.; Santoso, Yudi
2002-01-01
Without assuming that Higgs masses have the same values as other scalar masses at the input GUT scale, we combine constraints on the minimal supersymmetric extension of the Standard Model (MSSM) coming from the cold dark matter density with the limits from direct searches at accelerators such as LEP, indirect measurements such as b to s gamma decay and the anomalous magnetic moment of the muon. The requirement that Higgs masses-squared be positive at the GUT scale imposes important restrictions on the MSSM parameter space, as does the requirement that the LSP be neutral. We analyze the interplay of these constraints in the (mu, m_A), (mu, m_{1/2}), (m_{1/2}, m_0) and (m_A, tan beta) planes. These exhibit new features not seen in the corresponding planes in the constrained MSSM in which universality is extended to Higgs masses.
Parameter estimation in space systems using recurrent neural networks
Parlos, Alexander G.; Atiya, Amir F.; Sunkel, John W.
1991-01-01
The identification of time-varying parameters encountered in space systems is addressed, using artificial neural systems. A hybrid feedforward/feedback neural network, namely a recurrent multilayer perceptron, is used as the model structure in the nonlinear system identification. The feedforward portion of the network architecture provides its well-known interpolation property, while through recurrency and cross-talk, the local information feedback enables representation of temporal variations in the system nonlinearities. The standard back-propagation learning algorithm is modified and used for both the off-line and on-line supervised training of the proposed hybrid network. The performance of recurrent multilayer perceptron networks in identifying parameters of nonlinear dynamic systems is investigated by estimating the mass properties of a representative large spacecraft. The changes in the spacecraft inertia are predicted using a trained neural network, during two configurations corresponding to the early and late stages of the spacecraft on-orbit assembly sequence. The proposed on-line mass properties estimation capability offers encouraging results, though further research is warranted for training and testing the predictive capabilities of these networks beyond nominal spacecraft operations.
Dynamical quantum Hall effect in the parameter space.
Gritsev, V; Polkovnikov, A
2012-04-24
Geometric phases in quantum mechanics play an extraordinary role in broadening our understanding of fundamental significance of geometry in nature. One of the best known examples is the Berry phase [M.V. Berry (1984), Proc. Royal. Soc. London A, 392:45], which naturally emerges in quantum adiabatic evolution. So far the applicability and measurements of the Berry phase were mostly limited to systems of weakly interacting quasi-particles, where interference experiments are feasible. Here we show how one can go beyond this limitation and observe the Berry curvature, and hence the Berry phase, in generic systems as a nonadiabatic response of physical observables to the rate of change of an external parameter. These results can be interpreted as a dynamical quantum Hall effect in a parameter space. The conventional quantum Hall effect is a particular example of the general relation if one views the electric field as a rate of change of the vector potential. We illustrate our findings by analyzing the response of interacting spin chains to a rotating magnetic field. We observe the quantization of this response, which we term the rotational quantum Hall effect.
Frequentist analysis of the parameter space of minimal supergravity
Energy Technology Data Exchange (ETDEWEB)
Buchmueller, O.; Colling, D. [Imperial College, London (United Kingdom). High Energy Physics Group; Cavanaugh, R. [Fermi National Accelerator Laboratory, Batavia, IL (United States); Illinois Univ., Chicago, IL (US). Physics Dept.] (and others)
2010-12-15
We make a frequentist analysis of the parameter space of minimal supergravity (mSUGRA), in which, as well as the gaugino and scalar soft supersymmetry-breaking parameters being universal, there is a specific relation between the trilinear, bilinear and scalar supersymmetry-breaking parameters, A_0 = B_0 + m_0, and the gravitino mass is fixed by m_{3/2} = m_0. We also consider a more general model, in which the gravitino mass constraint is relaxed (the VCMSSM). We combine in the global likelihood function the experimental constraints from low-energy electroweak precision data, the anomalous magnetic moment of the muon, the lightest Higgs boson mass M_h, B physics and the astrophysical cold dark matter density, assuming that the lightest supersymmetric particle (LSP) is a neutralino. In the VCMSSM, we find a preference for values of m_{1/2} and m_0 similar to those found previously in frequentist analyses of the constrained MSSM (CMSSM) and a model with common non-universal Higgs masses (NUHM1). On the other hand, in mSUGRA we find two preferred regions: one with larger values of both m_{1/2} and m_0 than in the VCMSSM, and one with large m_0 but small m_{1/2}. We compare the probabilities of the frequentist fits in mSUGRA, the VCMSSM, the CMSSM and the NUHM1: the probability that mSUGRA is consistent with the present data is significantly less than in the other models. We also discuss the mSUGRA and VCMSSM predictions for sparticle masses and other observables, identifying potential signatures at the LHC and elsewhere. (orig.)
Effect of solar wind plasma parameters on space weather
International Nuclear Information System (INIS)
Rathore, Balveer S.; Gupta, Dinesh C.; Kaushik, Subhash C.
2015-01-01
Today's challenge for space weather research is to quantitatively predict the dynamics of the magnetosphere from measured solar wind and interplanetary magnetic field (IMF) conditions. Correlative studies between geomagnetic storms (GMSs) and the various interplanetary (IP) field/plasma parameters have been performed to search for the causes of geomagnetic activity and develop models for predicting the occurrence of GMSs, which are important for space weather predictions. We find a possible relation between GMSs and solar wind and IMF parameters in three different situations and also derived the linear relation for all parameters in three situations. On the basis of the present statistical study, we develop an empirical model. With the help of this model, we can predict all categories of GMSs. This model is based on the following fact: the total IMF B_total can be used to trigger an alarm for GMSs, when sudden changes in total magnetic field B_total occur. This is the first alarm condition for a storm's arrival. It is observed in the present study that the southward B_z component of the IMF is an important factor for describing GMSs. A result of the paper is that the magnitude of B_z is maximum neither during the initial phase (at the instant of the IP shock) nor during the main phase (at the instant of Disturbance storm time (Dst) minimum). It is seen in this study that there is a time delay between the maximum value of southward B_z and the Dst minimum, and this time delay can be used in the prediction of the intensity of a magnetic storm two to three hours before the main phase of a GMS. A linear relation has been derived between the maximum value of the southward component of B_z and the Dst, which is Dst = (−0.06) + (7.65) B_z + t. Some auxiliary conditions should be fulfilled with this, for example the speed of the solar wind should, on average, be 350 km s^−1 to 750 km s^−1, plasma β should be low and, most importantly, plasma temperature
High Dimensional Classification Using Features Annealed Independence Rules.
Fan, Jianqing; Fan, Yingying
2008-01-01
Classification using high-dimensional features arises frequently in many contemporary statistical studies such as tumor classification using microarray or other high-throughput data. The impact of dimensionality on classification is poorly understood. In a seminal paper, Bickel and Levina (2004) show that the Fisher discriminant performs poorly due to diverging spectra, and they propose to use the independence rule to overcome the problem. We first demonstrate that even for the independence classification rule, classification using all the features can be as bad as random guessing due to noise accumulation in estimating population centroids in high-dimensional feature space. In fact, we demonstrate further that almost all linear discriminants can perform as badly as random guessing. Thus, it is of paramount importance to select a subset of important features for high-dimensional classification, resulting in Features Annealed Independence Rules (FAIR). The conditions under which all the important features can be selected by the two-sample t-statistic are established. The choice of the optimal number of features, or equivalently the threshold value of the test statistic, is proposed based on an upper bound of the classification error. Simulation studies and real data analysis support our theoretical results and demonstrate convincingly the advantage of our new classification procedure.
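The two ingredients named in the abstract, ranking features by the two-sample t-statistic and then classifying with an independence (diagonal) rule on the selected features, can be sketched as follows. This is a simplified illustration, not the authors' FAIR procedure: the number of retained features `m` is fixed by hand instead of being chosen from the error bound, and all function names and the toy data are hypothetical.

```python
from statistics import mean, variance
from typing import List, Sequence, Tuple

Row = Sequence[float]
Model = List[Tuple[int, float, float, float]]  # (feature index, mean0, mean1, pooled variance)


def t_statistics(X0: Sequence[Row], X1: Sequence[Row]) -> List[float]:
    """Two-sample t-statistic (unequal variances) for every feature column."""
    n0, n1 = len(X0), len(X1)
    stats = []
    for col0, col1 in zip(zip(*X0), zip(*X1)):
        se = (variance(col0) / n0 + variance(col1) / n1) ** 0.5
        stats.append((mean(col1) - mean(col0)) / se)
    return stats


def fit_independence_rule(X0: Sequence[Row], X1: Sequence[Row], m: int) -> Model:
    """Keep the m features with the largest |t| and store their class means and variance."""
    t = t_statistics(X0, X1)
    keep = sorted(range(len(t)), key=lambda j: -abs(t[j]))[:m]
    n0, n1 = len(X0), len(X1)
    model: Model = []
    for j in keep:
        c0 = [row[j] for row in X0]
        c1 = [row[j] for row in X1]
        pooled = ((n0 - 1) * variance(c0) + (n1 - 1) * variance(c1)) / (n0 + n1 - 2)
        model.append((j, mean(c0), mean(c1), pooled))
    return model


def predict(model: Model, x: Row) -> int:
    """Independence rule: sum per-feature discriminants, ignoring feature correlations."""
    score = sum((x[j] - (m0 + m1) / 2) * (m1 - m0) / s2 for j, m0, m1, s2 in model)
    return 1 if score > 0 else 0


if __name__ == "__main__":
    # Toy data: only feature 0 separates the classes; features 1 and 2 are pure noise.
    X0 = [[0.0, 1.0, 5.0], [0.2, -1.0, 4.0], [-0.2, 0.5, 6.0], [0.0, -0.5, 5.0]]
    X1 = [[2.0, 1.0, 5.0], [2.2, -1.0, 4.0], [1.8, 0.5, 6.0], [2.0, -0.5, 5.0]]
    model = fit_independence_rule(X0, X1, m=1)
    print([j for j, *_ in model], predict(model, [1.9, 0.0, 5.0]), predict(model, [0.1, 0.0, 5.0]))
```

Dropping the uninformative features before applying the rule is what limits the noise accumulation described above; with thousands of noise features, their estimated mean differences would otherwise swamp the few informative ones.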
The additive hazards model with high-dimensional regressors
DEFF Research Database (Denmark)
Martinussen, Torben; Scheike, Thomas
2009-01-01
This paper considers estimation and prediction in the Aalen additive hazards model in the case where the covariate vector is high-dimensional such as gene expression measurements. Some form of dimension reduction of the covariate space is needed to obtain useful statistical analyses. We study...... model. A standard PLS algorithm can also be constructed, but it turns out that the resulting predictor can only be related to the original covariates via time-dependent coefficients. The methods are applied to a breast cancer data set with gene expression recordings and to the well known primary biliary...
Derivation of Delaware Bay tidal parameters from space shuttle photography
International Nuclear Information System (INIS)
Zheng, Quanan; Yan, Xiaohai; Klemas, V.
1993-01-01
The tide-related parameters of the Delaware Bay are derived from space shuttle time-series photographs. The water areas in the bay are measured from interpretation maps of the photographs with a CALCOMP 9100 digitizer and ERDAS Image Processing System. The corresponding tidal levels are calculated using the exposure time annotated on the photographs. From these data, an approximate function relating the water area to the tidal level at a reference point is determined. Based on the function, the water areas of the Delaware Bay at mean high water (MHW) and mean low water (MLW), below 0 m, and for the tidal zone are inferred. With MHW and MLW areas and the mean tidal range, the authors calculate the tidal influx of the Delaware Bay, which is 2.76 × 10^9 m^3. Furthermore, the velocity of flood tide at the bay mouth is determined using the tidal flux and an integral of the velocity distribution function at the cross section between Cape Henlopen and Cape May. The result is 132 cm/s, which compares well with the data on tidal current charts.
Charting the Parameter Space of the 21-cm Power Spectrum
Cohen, Aviad; Fialkov, Anastasia; Barkana, Rennan
2018-05-01
The high-redshift 21-cm signal of neutral hydrogen is expected to be observed within the next decade and will reveal epochs of cosmic evolution that have been previously inaccessible. Due to the lack of observations, many of the astrophysical processes that took place at early times are poorly constrained. In recent work we explored the astrophysical parameter space and the resulting large variety of possible global (sky-averaged) 21-cm signals. Here we extend our analysis to the fluctuations in the 21-cm signal, accounting for those introduced by density and velocity, Lyα radiation, X-ray heating, and ionization. While the radiation sources are usually highlighted, we find that in many cases the density fluctuations play a significant role at intermediate redshifts. Using both the power spectrum and its slope, we show that properties of high-redshift sources can be extracted from the observable features of the fluctuation pattern. For instance, the peak amplitude of ionization fluctuations can be used to estimate whether heating occurred early or late and, in the early case, to also deduce the cosmic mean ionized fraction at that time. The slope of the power spectrum has a more universal redshift evolution than the power spectrum itself and can thus be used more easily as a tracer of high-redshift astrophysics. Its peaks can be used, for example, to estimate the redshift of the Lyα coupling transition and the redshift of the heating transition (and the mean gas temperature at that time). We also show that a tight correlation is predicted between features of the power spectrum and of the global signal, potentially yielding important consistency checks.
Wang, Zhiping; Chen, Jinyu; Yu, Benli
2017-02-20
We investigate the two-dimensional (2D) and three-dimensional (3D) atom localization behaviors via spontaneously generated coherence in a microwave-driven four-level atomic system. Owing to the space-dependent atom-field interaction, it is found that the detecting probability and precision of 2D and 3D atom localization behaviors can be significantly improved via adjusting the system parameters, the phase, amplitude, and initial population distribution. Interestingly, the atom can be localized in volumes that are substantially smaller than a cubic optical wavelength. Our scheme opens a promising way to achieve high-precision and high-efficiency atom localization, which provides some potential applications in high-dimensional atom nanolithography.
Introduction to high-dimensional statistics
Giraud, Christophe
2015-01-01
Ever-greater computing technologies have given rise to an exponentially growing volume of data. Today massive data sets (with potentially thousands of variables) play an important role in almost every branch of modern human activity, including networks, finance, and genetics. However, analyzing such data has presented a challenge for statisticians and data analysts and has required the development of new statistical methods capable of separating the signal from the noise. Introduction to High-Dimensional Statistics is a concise guide to state-of-the-art models, techniques, and approaches for ha
Estimating High-Dimensional Time Series Models
DEFF Research Database (Denmark)
Medeiros, Marcelo C.; Mendes, Eduardo F.
We study the asymptotic properties of the Adaptive LASSO (adaLASSO) in sparse, high-dimensional, linear time-series models. We assume both the number of covariates in the model and candidate variables can increase with the number of observations and the number of candidate variables is, possibly, larger than the number of observations. We show the adaLASSO consistently chooses the relevant variables as the number of observations increases (model selection consistency), and has the oracle property, even when the errors are non-Gaussian and conditionally heteroskedastic. A simulation study shows...
High dimensional classifiers in the imbalanced case
DEFF Research Database (Denmark)
Bak, Britta Anker; Jensen, Jens Ledet
We consider the binary classification problem in the imbalanced case where the number of samples from the two groups differs. The classification problem is considered in the high dimensional case where the number of variables is much larger than the number of samples, and where the imbalance leads to a bias in the classification. A theoretical analysis of the independence classifier reveals the origin of the bias and based on this we suggest two new classifiers that can handle any imbalance ratio. The analytical results are supplemented by a simulation study, where the suggested classifiers in some...
Topology of high-dimensional manifolds
Energy Technology Data Exchange (ETDEWEB)
Farrell, F T [State University of New York, Binghamton (United States); Goettshe, L [Abdus Salam ICTP, Trieste (Italy); Lueck, W [Westfaelische Wilhelms-Universitaet Muenster, Muenster (Germany)
2002-08-15
The School on High-Dimensional Manifold Topology took place at the Abdus Salam ICTP, Trieste from 21 May 2001 to 8 June 2001. The focus of the school was on the classification of manifolds and related aspects of K-theory, geometry, and operator theory. The topics covered included: surgery theory, algebraic K- and L-theory, controlled topology, homology manifolds, exotic aspherical manifolds, homeomorphism and diffeomorphism groups, and scalar curvature. The school consisted of 2 weeks of lecture courses and one week of conference. The two-part lecture notes volume contains the notes of most of the lecture courses.
Exploitation of ISAR Imagery in Euler Parameter Space
National Research Council Canada - National Science Library
Baird, Christopher; Kersey, W. T; Giles, R; Nixon, W. E
2005-01-01
.... The Euler parameters have potential value in target classification but have historically met with limited success due to ambiguities that arise in decomposition as well as the parameters' sensitivity...
Elucidating high-dimensional cancer hallmark annotation via enriched ontology.
Yan, Shankai; Wong, Ka-Chun
2017-09-01
Cancer hallmark annotation is a promising technique that could discover novel knowledge about cancer from the biomedical literature. The automated annotation of cancer hallmarks could reveal relevant cancer transformation processes in the literature or extract the articles that correspond to the cancer hallmark of interest. It acts as a complementary approach that can retrieve knowledge from massive text information, advancing numerous focused studies in cancer research. Nonetheless, the high-dimensional nature of cancer hallmark annotation imposes a unique challenge. To address the curse of dimensionality, we compared multiple cancer hallmark annotation methods on 1580 PubMed abstracts. Based on the insights, a novel approach, UDT-RF, which makes use of ontological features is proposed. It expands the feature space via the Medical Subject Headings (MeSH) ontology graph and utilizes novel feature selections for elucidating the high-dimensional cancer hallmark annotation space. To demonstrate its effectiveness, state-of-the-art methods are compared and evaluated by a multitude of performance metrics, revealing the full performance spectrum on the full set of cancer hallmarks. Several case studies are conducted, demonstrating how the proposed approach could reveal novel insights into cancers. https://github.com/cskyan/chmannot.
Online State Space Model Parameter Estimation in Synchronous Machines
Directory of Open Access Journals (Sweden)
Z. Gallehdari
2014-06-01
The suggested approach is evaluated for a sample synchronous machine model. Estimated parameters are tested for different inputs at different operating conditions. The effect of noise is also considered in this study. Simulation results show that the proposed approach provides good accuracy for parameter estimation.
Clustering high dimensional data using RIA
Energy Technology Data Exchange (ETDEWEB)
Aziz, Nazrina [School of Quantitative Sciences, College of Arts and Sciences, Universiti Utara Malaysia, 06010 Sintok, Kedah (Malaysia)
2015-05-15
Clustering may simply represent a convenient method for organizing a large data set so that it can easily be understood and information can efficiently be retrieved. However, identifying clusters in high-dimensional data sets is a difficult task because of the curse of dimensionality. Another challenge in clustering is that some traditional functions cannot capture the pattern dissimilarity among objects. In this article, we use an alternative dissimilarity measure called the Robust Influence Angle (RIA) in the partitioning method. RIA is developed using the eigenstructure of the covariance matrix and robust principal component scores. We find that it can obtain clusters easily and hence avoid the curse of dimensionality. It also manages to cluster large data sets with mixed numeric and categorical values.
Scalable Nearest Neighbor Algorithms for High Dimensional Data.
Muja, Marius; Lowe, David G
2014-11-01
For many computer vision and machine learning problems, large training sets are key for good performance. However, the most computationally expensive part of many computer vision and machine learning algorithms consists of finding nearest neighbor matches to high dimensional vectors that represent the training data. We propose new algorithms for approximate nearest neighbor matching and evaluate and compare them with previous algorithms. For matching high dimensional features, we find two algorithms to be the most efficient: the randomized k-d forest and a new algorithm proposed in this paper, the priority search k-means tree. We also propose a new algorithm for matching binary features by searching multiple hierarchical clustering trees and show it outperforms methods typically used in the literature. We show that the optimal nearest neighbor algorithm and its parameters depend on the data set characteristics and describe an automated configuration procedure for finding the best algorithm to search a particular data set. In order to scale to very large data sets that would otherwise not fit in the memory of a single machine, we propose a distributed nearest neighbor matching framework that can be used with any of the algorithms described in the paper. All this research has been released as an open source library called fast library for approximate nearest neighbors (FLANN), which has been incorporated into OpenCV and is now one of the most popular libraries for nearest neighbor matching.
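As background for the k-d forest mentioned above, the following is a minimal sketch of exact nearest-neighbor search in a single k-d tree. It is not FLANN's randomized k-d forest or its priority search k-means tree, and the function names `build_kdtree` and `nearest` are illustrative assumptions rather than any library's API.

```python
import math


def build_kdtree(points, depth=0):
    """Recursively split the points at the median along one axis per level."""
    if not points:
        return None
    axis = depth % len(points[0])
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return {
        "point": points[mid],
        "axis": axis,
        "left": build_kdtree(points[:mid], depth + 1),
        "right": build_kdtree(points[mid + 1:], depth + 1),
    }


def nearest(node, query, best=None):
    """Return (distance, point) of the exact nearest neighbor of query."""
    if node is None:
        return best
    d = math.dist(node["point"], query)
    if best is None or d < best[0]:
        best = (d, node["point"])
    diff = query[node["axis"]] - node["point"][node["axis"]]
    near, far = ((node["left"], node["right"]) if diff < 0
                 else (node["right"], node["left"]))
    best = nearest(near, query, best)
    # Visit the far side only if the splitting plane is closer than the best hit.
    if abs(diff) < best[0]:
        best = nearest(far, query, best)
    return best
```

In high dimensions the pruning test rarely succeeds, so an exact tree degrades toward a linear scan; that is the usual motivation for approximate schemes of the kind the abstract evaluates.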
GRID-BASED EXPLORATION OF COSMOLOGICAL PARAMETER SPACE WITH SNAKE
International Nuclear Information System (INIS)
Mikkelsen, K.; Næss, S. K.; Eriksen, H. K.
2013-01-01
We present a fully parallelized grid-based parameter estimation algorithm for investigating multidimensional likelihoods called Snake, and apply it to cosmological parameter estimation. The basic idea is to map out the likelihood grid-cell by grid-cell according to decreasing likelihood, and stop when a certain threshold has been reached. This approach improves vastly on the 'curse of dimensionality' problem plaguing standard grid-based parameter estimation simply by disregarding grid cells with negligible likelihood. The main advantages of this method compared to standard Metropolis-Hastings Markov Chain Monte Carlo methods include (1) trivial extraction of arbitrary conditional distributions; (2) direct access to Bayesian evidences; (3) better sampling of the tails of the distribution; and (4) nearly perfect parallelization scaling. The main disadvantage is, as in the case of brute-force grid-based evaluation, a dependency on the number of parameters, N_par. One of the main goals of the present paper is to determine how large N_par can be, while still maintaining reasonable computational efficiency; we find that N_par = 12 is well within the capabilities of the method. The performance of the code is tested by comparing cosmological parameters estimated using Snake and the WMAP-7 data with those obtained using CosmoMC, the current standard code in the field. We find fully consistent results, with similar computational expenses, but shorter wall time due to the perfect parallelization scheme.
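The core idea of visiting grid cells in order of decreasing likelihood and stopping at a threshold can be sketched with a priority queue. This is a serial illustration under stated assumptions, not the authors' parallel Snake code: the name `explore_grid`, the integer index grid, the known starting cell near the peak, and the single-peaked likelihood are all assumptions made for the example.

```python
import heapq


def explore_grid(log_like, start, threshold):
    """Visit grid cells in order of decreasing log-likelihood.

    log_like : function mapping an integer index tuple to a log-likelihood
    start    : index tuple of a cell near the peak (assumed known)
    threshold: stop once the best remaining cell lies this far below the peak
    """
    best = log_like(start)
    frontier = [(-best, start)]   # max-heap emulated with negated values
    seen = {start}
    visited = {}                  # cell -> log-likelihood
    while frontier:
        neg_ll, cell = heapq.heappop(frontier)
        ll = -neg_ll
        best = max(best, ll)
        if ll < best - threshold:
            break                 # every cell still queued is lower than this one
        visited[cell] = ll
        # Queue the axis-aligned neighbors that have not been evaluated yet.
        for axis in range(len(cell)):
            for step in (-1, 1):
                nb = cell[:axis] + (cell[axis] + step,) + cell[axis + 1:]
                if nb not in seen:
                    seen.add(nb)
                    heapq.heappush(frontier, (-log_like(nb), nb))
    return visited
```

Only cells near the peak, plus a thin shell of rejected neighbors, are ever evaluated. The number of such cells still grows with the number of parameters, which is the dependence on N_par that the abstract names as the main disadvantage.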
GRID-BASED EXPLORATION OF COSMOLOGICAL PARAMETER SPACE WITH SNAKE
Energy Technology Data Exchange (ETDEWEB)
Mikkelsen, K.; Næss, S. K.; Eriksen, H. K., E-mail: kristin.mikkelsen@astro.uio.no [Institute of Theoretical Astrophysics, University of Oslo, P.O. Box 1029, Blindern, NO-0315 Oslo (Norway)
2013-11-10
We present a fully parallelized grid-based parameter estimation algorithm for investigating multidimensional likelihoods called Snake, and apply it to cosmological parameter estimation. The basic idea is to map out the likelihood grid-cell by grid-cell according to decreasing likelihood, and stop when a certain threshold has been reached. This approach improves vastly on the 'curse of dimensionality' problem plaguing standard grid-based parameter estimation simply by disregarding grid cells with negligible likelihood. The main advantages of this method compared to standard Metropolis-Hastings Markov Chain Monte Carlo methods include (1) trivial extraction of arbitrary conditional distributions; (2) direct access to Bayesian evidences; (3) better sampling of the tails of the distribution; and (4) nearly perfect parallelization scaling. The main disadvantage is, as in the case of brute-force grid-based evaluation, a dependency on the number of parameters, N_par. One of the main goals of the present paper is to determine how large N_par can be, while still maintaining reasonable computational efficiency; we find that N_par = 12 is well within the capabilities of the method. The performance of the code is tested by comparing cosmological parameters estimated using Snake and the WMAP-7 data with those obtained using CosmoMC, the current standard code in the field. We find fully consistent results, with similar computational expenses, but shorter wall time due to the perfect parallelization scheme.
Changes in Periodontal and Microbial Parameters after the Space ...
African Journals Online (AJOL)
Aim: This study aims to evaluate the clinical and microbiological changes accompanying the inflammatory process of periodontal tissues during treatment with space maintainers (SMs). Materials and Methods: The children were separated into fixed (Group 1, n = 20) and removable (Group 2, n = 20) appliance groups.
Dynamics of a neuron model in different two-dimensional parameter-spaces
Rech, Paulo C.
2011-03-01
We report some two-dimensional parameter-space diagrams numerically obtained for the multi-parameter Hindmarsh-Rose neuron model. Several different parameter planes are considered, and we show that regardless of the combination of parameters, a typical scenario is preserved: for every choice of two parameters, the parameter-space presents a comb-shaped chaotic region immersed in a large periodic region. We also show that there exist periodic regions close to this chaotic region, separated by the comb teeth, which organize themselves in period-adding bifurcation cascades.
Variance inflation in high dimensional Support Vector Machines
DEFF Research Database (Denmark)
Abrahamsen, Trine Julie; Hansen, Lars Kai
2013-01-01
Many important machine learning models, supervised and unsupervised, are based on simple Euclidean distance or orthogonal projection in a high dimensional feature space. When estimating such models from small training sets we face the problem that the span of the training data set input vectors...... follow a different probability law with less variance. While the problem and basic means to reconstruct and deflate are well understood in unsupervised learning, the case of supervised learning is less well understood. We here investigate the effect of variance inflation in supervised learning including the case of Support Vector Machines (SVMs) and we propose a non-parametric scheme to restore proper generalizability. We illustrate the algorithm and its ability to restore performance on a wide range of benchmark data sets.
Evaluating Clustering in Subspace Projections of High Dimensional Data
DEFF Research Database (Denmark)
Müller, Emmanuel; Günnemann, Stephan; Assent, Ira
2009-01-01
Clustering high dimensional data is an emerging research field. Subspace clustering or projected clustering group similar objects in subspaces, i.e. projections, of the full space. In the past decade, several clustering paradigms have been developed in parallel, without thorough evaluation and comparison between these paradigms on a common basis. Conclusive evaluation and comparison is challenged by three major issues. First, there is no ground truth that describes the "true" clusters in real world data. Second, a large variety of evaluation measures have been used that reflect different aspects of the clustering result. Finally, in typical publications authors have limited their analysis to their favored paradigm only, while paying other paradigms little or no attention. In this paper, we take a systematic approach to evaluate the major paradigms in a common framework. We study representative clustering...
Geometry on the parameter space of the belief propagation algorithm on Bayesian networks
Energy Technology Data Exchange (ETDEWEB)
Watanabe, Yodai [National Institute of Informatics, Research Organization of Information and Systems, 2-1-2 Hitotsubashi, Chiyoda-ku, Tokyo 101-8430 (Japan); Laboratory for Mathematical Neuroscience, RIKEN Brain Science Institute, 2-1 Hirosawa, Wako-shi, Saitama 351-0198 (Japan)
2006-01-30
This Letter considers a geometrical structure on the parameter space of the belief propagation algorithm on Bayesian networks. The statistical manifold of posterior distributions is introduced, and the expression for the information metric on the manifold is derived. The expression is used to construct a cost function which can be regarded as a measure of the distance in the parameter space.
Dynamics of a neuron model in different two-dimensional parameter-spaces
International Nuclear Information System (INIS)
Rech, Paulo C.
2011-01-01
We report some two-dimensional parameter-space diagrams numerically obtained for the multi-parameter Hindmarsh-Rose neuron model. Several different parameter planes are considered, and we show that regardless of the combination of parameters, a typical scenario is preserved: for every choice of two parameters, the parameter-space presents a comb-shaped chaotic region immersed in a large periodic region. We also show that there exist periodic regions close to this chaotic region, separated by the comb teeth, which organize themselves in period-adding bifurcation cascades. - Research highlights: → We report parameter-spaces obtained for the Hindmarsh-Rose neuron model. → Regardless of the combination of parameters, a typical scenario is preserved. → The scenario presents a comb-shaped chaotic region immersed in a periodic region. → Periodic regions near the chaotic region are in period-adding bifurcation cascades.
International Nuclear Information System (INIS)
Zhang, Wuhong; Su, Ming; Wu, Ziwen; Lu, Meng; Huang, Bingwei; Chen, Lixiang
2013-01-01
Twisted photons enable the definition of a Hilbert space beyond two dimensions by orbital angular momentum (OAM) eigenstates. Here we propose a feasible entanglement concentration experiment to enhance the quality of high-dimensional entanglement shared by twisted photon pairs. Our approach starts from the full characterization of the entangled spiral bandwidth, and is then based on the careful selection of the Laguerre–Gaussian (LG) modes with specific radial and azimuthal indices p and ℓ. In particular, we demonstrate the possibility of high-dimensional entanglement concentration residing in the OAM subspace of up to 21 dimensions. By means of LabVIEW simulations with spatial light modulators, we show that the Shannon dimensionality could be employed to quantify the quality of the present concentration. Our scheme holds promise in quantum information applications defined in high-dimensional Hilbert space. (letter)
Detection of Subtle Context-Dependent Model Inaccuracies in High-Dimensional Robot Domains.
Mendoza, Juan Pablo; Simmons, Reid; Veloso, Manuela
2016-12-01
Autonomous robots often rely on models of their sensing and actions for intelligent decision making. However, when operating in unconstrained environments, the complexity of the world makes it infeasible to create models that are accurate in every situation. This article addresses the problem of using potentially large and high-dimensional sets of robot execution data to detect situations in which a robot model is inaccurate, that is, detecting context-dependent model inaccuracies in a high-dimensional context space. To find inaccuracies tractably, the robot conducts an informed search through low-dimensional projections of execution data to find parametric Regions of Inaccurate Modeling (RIMs). Empirical evidence from two robot domains shows that this approach significantly enhances the detection power of existing RIM-detection algorithms in high-dimensional spaces.
Modeling High-Dimensional Multichannel Brain Signals
Hu, Lechuan
2017-12-12
Our goal is to model and measure functional and effective (directional) connectivity in multichannel brain physiological signals (e.g., electroencephalograms, local field potentials). The difficulties from analyzing these data mainly come from two aspects: first, there are major statistical and computational challenges for modeling and analyzing high-dimensional multichannel brain signals; second, there is no set of universally agreed measures for characterizing connectivity. To model multichannel brain signals, our approach is to fit a vector autoregressive (VAR) model with potentially high lag order so that complex lead-lag temporal dynamics between the channels can be captured. Estimates of the VAR model will be obtained by our proposed hybrid LASSLE (LASSO + LSE) method which combines regularization (to control for sparsity) and least squares estimation (to improve bias and mean-squared error). Then we employ some measures of connectivity but put an emphasis on partial directed coherence (PDC) which can capture the directional connectivity between channels. PDC is a frequency-specific measure that explains the extent to which the present oscillatory activity in a sender channel influences the future oscillatory activity in a specific receiver channel relative to all possible receivers in the network. The proposed modeling approach provided key insights into potential functional relationships among simultaneously recorded sites during performance of a complex memory task. Specifically, this novel method was successful in quantifying patterns of effective connectivity across electrode locations, and in capturing how these patterns varied across trial epochs and trial types.
Network Reconstruction From High-Dimensional Ordinary Differential Equations.
Chen, Shizhe; Shojaie, Ali; Witten, Daniela M
2017-01-01
We consider the task of learning a dynamical system from high-dimensional time-course data. For instance, we might wish to estimate a gene regulatory network from gene expression data measured at discrete time points. We model the dynamical system nonparametrically as a system of additive ordinary differential equations. Most existing methods for parameter estimation in ordinary differential equations estimate the derivatives from noisy observations. This is known to be challenging and inefficient. We propose a novel approach that does not involve derivative estimation. We show that the proposed method can consistently recover the true network structure even in high dimensions, and we demonstrate empirical improvement over competing approaches. Supplementary materials for this article are available online.
Quantum correlation of high dimensional system in a dephasing environment
Ji, Yinghua; Ke, Qiang; Hu, Juju
2018-05-01
For a high dimensional spin-S system embedded in a dephasing environment, we theoretically analyze the time evolutions of quantum correlation and entanglement via Frobenius norm and negativity. The quantum correlation dynamics can be considered as a function of the decoherence parameters, including the ratio between the system oscillator frequency ω_0 and the reservoir cutoff frequency ω_c, and the environment temperature. It is shown that the quantum correlation can not only measure the nonclassical correlation of the considered system, but also shows better robustness against dissipation. In addition, the decoherence presents non-Markovian features and the quantum correlation freeze phenomenon. The former is much weaker than that in the sub-Ohmic or Ohmic thermal reservoir environment.
Parameter spaces for linear and nonlinear whistler-mode waves
International Nuclear Information System (INIS)
Summers, Danny; Tang, Rongxin; Omura, Yoshiharu; Lee, Dong-Hun
2013-01-01
We examine the growth of magnetospheric whistler-mode waves which comprises a linear growth phase followed by a nonlinear growth phase. We construct time-profiles for the wave amplitude that smoothly match at the transition between linear and nonlinear wave growth. This matching procedure can only take place over a limited “matching region” in (N_h/N_0, A_T)-space, where A_T is the electron thermal anisotropy, N_h is the hot (energetic) electron number density, and N_0 is the cold (background) electron number density. We construct this matching region and determine how the matching wave amplitude varies throughout the region. Further, we specify a boundary in (N_h/N_0, A_T)-space that separates a region where only linear chorus wave growth can occur from the region in which fully nonlinear chorus growth is possible. We expect that this boundary should prove of practical use in performing computationally expensive full-scale particle simulations, and in interpreting experimental wave data.
Constraining the loop quantum gravity parameter space from phenomenology
Brahma, Suddhasattwa; Ronco, Michele
2018-03-01
Development of quantum gravity theories rarely takes inputs from experimental physics. In this letter, we take a small step towards correcting this by establishing a paradigm for incorporating putative quantum corrections, arising from canonical quantum gravity (QG) theories, in deriving falsifiable modified dispersion relations (MDRs) for particles on a deformed Minkowski space-time. This allows us to differentiate and, hopefully, pick between several quantization choices via testable, state-of-the-art phenomenological predictions. Although a few explicit examples from loop quantum gravity (LQG) (such as the regularization scheme used or the representation of the gauge group) are shown here to establish the claim, our framework is more general and is capable of addressing other quantization ambiguities within LQG and also those arising from other similar QG approaches.
Variable kernel density estimation in high-dimensional feature spaces
CSIR Research Space (South Africa)
Van der Walt, Christiaan M
2017-02-01
Estimating the joint probability density function of a dataset is a central task in many machine learning applications. In this work we address the fundamental problem of kernel bandwidth estimation for variable kernel density estimation in high...
High-Dimensional Quantum Information Processing with Linear Optics
Fitzpatrick, Casey A.
Quantum information processing (QIP) is an interdisciplinary field concerned with the development of computers and information processing systems that utilize quantum mechanical properties of nature to carry out their function. QIP systems have become vastly more practical since the turn of the century. Today, QIP applications span imaging, cryptographic security, computation, and simulation (quantum systems that mimic other quantum systems). Many important strategies improve quantum versions of classical information system hardware, such as single photon detectors and quantum repeaters. Another more abstract strategy engineers high-dimensional quantum state spaces, so that each successful event carries more information than traditional two-level systems allow. Photonic states in particular bring the added advantages of weak environmental coupling and data transmission near the speed of light, allowing for simpler control and lower system design complexity. In this dissertation, numerous novel, scalable designs for practical high-dimensional linear-optical QIP systems are presented. First, a correlated photon imaging scheme using orbital angular momentum (OAM) states to detect rotational symmetries in objects using measurements, as well as building images out of those interactions is reported. Then, a statistical detection method using chains of OAM superpositions distributed according to the Fibonacci sequence is established and expanded upon. It is shown that the approach gives rise to schemes for sorting, detecting, and generating the recursively defined high-dimensional states on which some quantum cryptographic protocols depend. Finally, an ongoing study based on a generalization of the standard optical multiport for applications in quantum computation and simulation is reported upon. The architecture allows photons to reverse momentum inside the device. This in turn enables realistic implementation of controllable linear-optical scattering vertices for
High-Dimensional Intrinsic Interpolation Using Gaussian Process Regression and Diffusion Maps
International Nuclear Information System (INIS)
Thimmisetty, Charanraj A.; Ghanem, Roger G.; White, Joshua A.; Chen, Xiao
2017-01-01
This article considers the challenging task of estimating geologic properties of interest using a suite of proxy measurements. The current work recasts this task as a manifold learning problem. In this process, this article introduces a novel regression procedure for intrinsic variables constrained onto a manifold embedded in an ambient space. The procedure is meant to sharpen high-dimensional interpolation by inferring non-linear correlations from the data being interpolated. The proposed approach augments manifold learning procedures with a Gaussian process regression. It first identifies, using diffusion maps, a low-dimensional manifold embedded in an ambient high-dimensional space associated with the data. It relies on the diffusion distance associated with this construction to define a distance function with which the data model is equipped. This distance function is then used to compute the correlation structure of a Gaussian process that describes the statistical dependence of quantities of interest in the high-dimensional ambient space. The proposed method is applicable to arbitrarily high-dimensional data sets. Here, it is applied to subsurface characterization using a suite of well log measurements. The predictions obtained in original, principal component, and diffusion space are compared using both qualitative and quantitative metrics. Considerable improvement in the prediction of the geological structural properties is observed with the proposed method.
Efficient Smoothed Concomitant Lasso Estimation for High Dimensional Regression
Ndiaye, Eugene; Fercoq, Olivier; Gramfort, Alexandre; Leclère, Vincent; Salmon, Joseph
2017-10-01
In high dimensional settings, sparse structures are crucial for efficiency in terms of memory, computation, and performance. It is customary to consider an ℓ1 penalty to enforce sparsity in such scenarios. Sparsity enforcing methods, the Lasso being a canonical example, are popular candidates to address high dimension. For efficiency, they rely on tuning a parameter trading data fitting versus sparsity. For the Lasso theory to hold, this tuning parameter should be proportional to the noise level, yet the latter is often unknown in practice. A possible remedy is to jointly optimize over the regression parameter as well as over the noise level. This has been considered under several names in the literature, for instance Scaled-Lasso, Square-root Lasso, and Concomitant Lasso estimation, and could be of interest for uncertainty quantification. In this work, after illustrating numerical difficulties for the Concomitant Lasso formulation, we propose a modification that we call the Smoothed Concomitant Lasso, aimed at increasing numerical stability. We propose an efficient and accurate solver leading to a computational cost no more expensive than the one for the Lasso. We leverage the standard ingredients behind the success of fast Lasso solvers: a coordinate descent algorithm, combined with safe screening rules to achieve speed efficiency by eliminating irrelevant features early.
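The coordinate descent ingredient mentioned above can be illustrated on the plain Lasso. The sketch below minimizes (1/2n)·||y − Xβ||² + λ·||β||₁ by cyclic coordinate descent with soft-thresholding. It is not the Smoothed Concomitant Lasso solver, it omits safe screening rules, and the function names are illustrative assumptions.

```python
def soft_threshold(value, lam):
    """Shrink value toward zero by lam, returning 0.0 inside [-lam, lam]."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_sweeps=100):
    """Cyclic coordinate descent for (1/2n)||y - X b||^2 + lam * ||b||_1.

    X is a list of n rows with p entries each; y is a list of n targets.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)  # equals y - X beta while beta is all zeros
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_sweeps):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of column j with the residual that excludes feature j.
            rho = (sum(X[i][j] * residual[i] for i in range(n)) / n
                   + col_sq[j] * beta[j])
            new_bj = soft_threshold(rho, lam) / col_sq[j]
            delta = new_bj - beta[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                beta[j] = new_bj
    return beta
```

Here `lam` is the tuning parameter that, according to the abstract, should scale with the noise level; the concomitant formulations avoid fixing that scale in advance by estimating the noise level jointly with the coefficients.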
Quality and efficiency in high dimensional Nearest neighbor search
Tao, Yufei; Yi, Ke; Sheng, Cheng; Kalnis, Panos
2009-01-01
Nearest neighbor (NN) search in high dimensional space is an important problem in many applications. Ideally, a practical solution (i) should be implementable in a relational database, and (ii) its query cost should grow sub-linearly with the dataset size, regardless of the data and query distributions. Despite the bulk of NN literature, no solution fulfills both requirements, except locality sensitive hashing (LSH). The existing LSH implementations are either rigorous or adhoc. Rigorous-LSH ensures good quality of query results, but requires expensive space and query cost. Although adhoc-LSH is more efficient, it abandons quality control, i.e., the neighbor it outputs can be arbitrarily bad. As a result, currently no method is able to ensure both quality and efficiency simultaneously in practice. Motivated by this, we propose a new access method called the locality sensitive B-tree (LSB-tree) that enables fast high-dimensional NN search with excellent quality. The combination of several LSB-trees leads to a structure called the LSB-forest that ensures the same result quality as rigorous-LSH, but reduces its space and query cost dramatically. The LSB-forest also outperforms adhoc-LSH, even though the latter has no quality guarantee. Besides its appealing theoretical properties, the LSB-tree itself also serves as an effective index that consumes linear space, and supports efficient updates. Our extensive experiments confirm that the LSB-tree is faster than (i) the state of the art of exact NN search by two orders of magnitude, and (ii) the best (linear-space) method of approximate retrieval by an order of magnitude, and at the same time, returns neighbors with much better quality. © 2009 ACM.
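For readers unfamiliar with LSH, here is a minimal sketch of the basic Euclidean scheme, in which each hash is floor((a·x + b) / w) for a random Gaussian vector a and a random offset b. It corresponds roughly to the heuristic, no-guarantee style the abstract calls adhoc-LSH and is not the LSB-tree. The class name `EuclideanLSH` and all default parameters are assumptions chosen for illustration.

```python
import math
import random
from collections import defaultdict


class EuclideanLSH:
    """Several hash tables, each keyed by a tuple of random-projection hashes."""

    def __init__(self, dim, n_hashes=4, n_tables=8, width=4.0, seed=0):
        rng = random.Random(seed)
        self.width = width
        self.points = []
        self.tables = []
        for _ in range(n_tables):
            funcs = [([rng.gauss(0.0, 1.0) for _ in range(dim)],
                      rng.uniform(0.0, width))
                     for _ in range(n_hashes)]
            self.tables.append((funcs, defaultdict(list)))

    def _key(self, funcs, x):
        # Nearby points tend to fall into the same interval of each projection.
        return tuple(
            math.floor((sum(ai * xi for ai, xi in zip(a, x)) + b) / self.width)
            for a, b in funcs)

    def add(self, x):
        idx = len(self.points)
        self.points.append(x)
        for funcs, buckets in self.tables:
            buckets[self._key(funcs, x)].append(idx)

    def query(self, q):
        """Return the index of the closest colliding point, or None."""
        candidates = set()
        for funcs, buckets in self.tables:
            candidates.update(buckets.get(self._key(funcs, q), ()))
        if not candidates:
            return None
        return min(candidates, key=lambda i: math.dist(self.points[i], q))
```

A query inspects only the points that share a bucket with it in at least one table, which keeps it fast, but it can miss the true neighbor or return nothing at all. That missing quality control is the weakness the abstract attributes to adhoc-LSH.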
Model-based Clustering of High-Dimensional Data in Astrophysics
Bouveyron, C.
2016-05-01
The nature of data in Astrophysics has changed, as in other scientific fields, in the past decades owing to increased measurement capabilities. As a consequence, data are nowadays frequently of high dimensionality and available in mass or stream. Model-based techniques for clustering are popular tools which are renowned for their probabilistic foundations and their flexibility. However, classical model-based techniques show a disappointing behavior in high-dimensional spaces, which is mainly due to their dramatic over-parametrization. The recent developments in model-based classification overcome these drawbacks and allow efficient classification of high-dimensional data, even in the "small n / large p" situation. This work presents a comprehensive review of these recent approaches, including regularization-based techniques, parsimonious modeling, subspace classification methods and classification methods based on variable selection. The use of these model-based methods is also illustrated on real-world classification problems in Astrophysics using R packages.
A Hybrid Semi-Supervised Anomaly Detection Model for High-Dimensional Data
Directory of Open Access Journals (Sweden)
Hongchao Song
2017-01-01
Anomaly detection, which aims to identify observations that deviate from a nominal sample, is a challenging task for high-dimensional data. Traditional distance-based anomaly detection methods compute the neighborhood distance between each observation and suffer from the curse of dimensionality in high-dimensional space; for example, the distances between any pair of samples are similar and each sample may perform like an outlier. In this paper, we propose a hybrid semi-supervised anomaly detection model for high-dimensional data that consists of two parts: a deep autoencoder (DAE) and an ensemble k-nearest neighbor graph (K-NNG) based anomaly detector. Benefiting from the ability of nonlinear mapping, the DAE is first trained to learn the intrinsic features of a high-dimensional dataset to represent the high-dimensional data in a more compact subspace. Several nonparametric KNN-based anomaly detectors are then built from different subsets that are randomly sampled from the whole dataset. The final prediction is made by all the anomaly detectors. The performance of the proposed method is evaluated on several real-life datasets, and the results confirm that the proposed hybrid model improves the detection accuracy and reduces the computational complexity.
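The second stage described above, several KNN-based detectors built on random subsets and then combined, can be sketched as follows. This is only an illustration under stated assumptions: the deep autoencoder stage is omitted (the input is taken to be already compact), the score is the mean distance to the k nearest subset points, the combination rule is a plain average, and the function names are hypothetical.

```python
import math
import random


def knn_score(x, sample, k):
    """Mean distance from x to its k nearest neighbours in sample."""
    dists = sorted(math.dist(x, s) for s in sample)
    return sum(dists[:k]) / k


def ensemble_scores(data, n_detectors=10, subset_size=8, k=3, seed=0):
    """Average the KNN scores of detectors built on random subsets of data.

    A point that happens to be in a subset contributes a zero distance to
    itself; this sketch does not correct for that.
    """
    rng = random.Random(seed)
    scores = [0.0] * len(data)
    for _ in range(n_detectors):
        subset = rng.sample(data, subset_size)
        for i, x in enumerate(data):
            scores[i] += knn_score(x, subset, k)
    return [s / n_detectors for s in scores]
```

Larger scores indicate points that lie far from every sampled neighbourhood; a threshold or a ranking on the averaged score would then flag anomalies.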
Directory of Open Access Journals (Sweden)
Dimitrios V Vavoulis
Traditional approaches to the problem of parameter estimation in biophysical models of neurons and neural networks usually adopt a global search algorithm (for example, an evolutionary algorithm), often in combination with a local search method (such as gradient descent), in order to minimize the value of a cost function, which measures the discrepancy between various features of the available experimental data and model output. In this study, we approach the problem of parameter estimation in conductance-based models of single neurons from a different perspective. By adopting a hidden-dynamical-systems formalism, we expressed parameter estimation as an inference problem in these systems, which can then be tackled using a range of well-established statistical inference methods. The particular method we used was Kitagawa's self-organizing state-space model, which was applied to a number of Hodgkin-Huxley-type models using simulated or actual electrophysiological data. We showed that the algorithm can be used to estimate a large number of parameters, including maximal conductances, reversal potentials, kinetics of ionic currents, measurement and intrinsic noise, based on low-dimensional experimental data and sufficiently informative priors in the form of pre-defined constraints imposed on model parameters. The algorithm remained operational even when very noisy experimental data were used. Importantly, by combining the self-organizing state-space model with an adaptive sampling algorithm akin to the Covariance Matrix Adaptation Evolution Strategy, we achieved a significant reduction in the variance of parameter estimates. The algorithm did not require the explicit formulation of a cost function and it was straightforward to apply to compartmental models and multiple data sets. Overall, the proposed methodology is particularly suitable for resolving high-dimensional inference problems based on noisy electrophysiological data and, therefore, a
Review of the different methods to derive average spacing from resolved resonance parameters sets
International Nuclear Information System (INIS)
Fort, E.; Derrien, H.; Lafond, D.
1979-12-01
The average spacing of resonances is an important parameter for statistical model calculations, especially concerning non-fissile nuclei. The different methods to derive this average value from resonance parameter sets have been reviewed and analyzed in order to tentatively detect their respective weaknesses and propose recommendations. Possible improvements are suggested.
Oracle Inequalities for High Dimensional Vector Autoregressions
DEFF Research Database (Denmark)
Callot, Laurent; Kock, Anders Bredahl
This paper establishes non-asymptotic oracle inequalities for the prediction error and estimation accuracy of the LASSO in stationary vector autoregressive models. These inequalities are used to establish consistency of the LASSO even when the number of parameters is of a much larger order...
Forecasts of non-Gaussian parameter spaces using Box-Cox transformations
Joachimi, B.; Taylor, A. N.
2011-09-01
Forecasts of statistical constraints on model parameters using the Fisher matrix abound in many fields of astrophysics. The Fisher matrix formalism involves the assumption of Gaussianity in parameter space and hence fails to predict complex features of posterior probability distributions. Combining the standard Fisher matrix with Box-Cox transformations, we propose a novel method that accurately predicts arbitrary posterior shapes. The Box-Cox transformations are applied to parameter space to render it approximately multivariate Gaussian, performing the Fisher matrix calculation on the transformed parameters. We demonstrate that, after the Box-Cox parameters have been determined from an initial likelihood evaluation, the method correctly predicts changes in the posterior when varying various parameters of the experimental setup and the data analysis, with marginally higher computational cost than a standard Fisher matrix calculation. We apply the Box-Cox-Fisher formalism to forecast cosmological parameter constraints by future weak gravitational lensing surveys. The characteristic non-linear degeneracy between matter density parameter and normalization of matter density fluctuations is reproduced for several cases, and the capabilities of breaking this degeneracy by weak-lensing three-point statistics are investigated. Possible applications of Box-Cox transformations of posterior distributions are discussed, including the prospects for performing statistical data analysis steps in the transformed Gaussianized parameter space.
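As a reminder of the transformation involved, the textbook one-parameter Box-Cox transform and its inverse are sketched below. The abstract does not state which variant the authors use, so the one-parameter form and the function names are assumptions of this sketch.

```python
import math


def box_cox(x, lam):
    """One-parameter Box-Cox transform of a positive value x."""
    if x <= 0:
        raise ValueError("Box-Cox requires x > 0")
    if lam == 0:
        return math.log(x)  # limit of (x**lam - 1) / lam as lam -> 0
    return (x ** lam - 1.0) / lam


def inverse_box_cox(y, lam):
    """Map a transformed value back to the original parameter."""
    if lam == 0:
        return math.exp(y)
    return (lam * y + 1.0) ** (1.0 / lam)
```

A Gaussian approximation made in the transformed variable can be mapped back through `inverse_box_cox`, which is how a non-Gaussian shape in the original parameter can be represented.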
High-Dimensional Function Approximation With Neural Networks for Large Volumes of Data.
Andras, Peter
2018-02-01
Approximation of high-dimensional functions is a challenge for neural networks due to the curse of dimensionality. Often the data for which the approximated function is defined resides on a low-dimensional manifold and in principle the approximation of the function over this manifold should improve the approximation performance. It has been shown that projecting the data manifold into a lower dimensional space, followed by the neural network approximation of the function over this space, provides a more precise approximation of the function than the approximation of the function with neural networks in the original data space. However, if the data volume is very large, the projection into the low-dimensional space has to be based on a limited sample of the data. Here, we investigate the nature of the approximation error of neural networks trained over the projection space. We show that such neural networks should have better approximation performance than neural networks trained on high-dimensional data even if the projection is based on a relatively sparse sample of the data manifold. We also find that it is preferable to use a uniformly distributed sparse sample of the data for the purpose of the generation of the low-dimensional projection. We illustrate these results considering the practical neural network approximation of a set of functions defined on high-dimensional data including real world data as well.
Multivariate statistics high-dimensional and large-sample approximations
Fujikoshi, Yasunori; Shimizu, Ryoichi
2010-01-01
A comprehensive examination of high-dimensional analysis of multivariate methods and their real-world applications Multivariate Statistics: High-Dimensional and Large-Sample Approximations is the first book of its kind to explore how classical multivariate methods can be revised and used in place of conventional statistical tools. Written by prominent researchers in the field, the book focuses on high-dimensional and large-scale approximations and details the many basic multivariate methods used to achieve high levels of accuracy. The authors begin with a fundamental presentation of the basic
Wells, J. R.; Kim, J. B.
2011-12-01
Parameters in dynamic global vegetation models (DGVMs) are thought to be weakly constrained and can be a significant source of errors and uncertainties. DGVMs use between 5 and 26 plant functional types (PFTs) to represent the average plant life form in each simulated plot, and each PFT typically has a dozen or more parameters that define the way it uses resources and responds to the simulated growing environment. Sensitivity analysis explores how varying parameters affects the output, but does not do a full exploration of the parameter solution space. The solution space for DGVM parameter values is thought to be complex and non-linear, and multiple sets of acceptable parameters may exist. In published studies, PFT parameters are estimated from published literature, and often a parameter value is estimated from a single published value. Further, the parameters are "tuned" using somewhat arbitrary, "trial-and-error" methods. BIOMAP is a new DGVM created by fusing the MAPSS biogeography model with Biome-BGC. It represents the vegetation of North America using 26 PFTs. We are using simulated annealing, a global search method, to systematically and objectively explore the solution space for the BIOMAP PFTs and system parameters important for plant water use. We defined the boundaries of the solution space by obtaining maximum and minimum values from published literature, and where those were not available, using +/-20% of current values. We used stratified random sampling to select a set of grid cells representing the vegetation of the conterminous USA. The simulated annealing algorithm is applied to the parameters for spin-up and a transient run during the historical period 1961-1990. A set of parameter values is considered acceptable if the associated simulation run produces a modern potential vegetation distribution map that is as accurate as one produced by trial-and-error calibration. We expect to confirm that the solution space is non-linear and complex, and that
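A bounded simulated annealing search of the kind described can be sketched generically as follows. Nothing here is taken from BIOMAP: the cost function, the linear cooling schedule, the proposal width and the name `simulated_annealing` are all assumptions chosen for illustration.

```python
import math
import random


def simulated_annealing(cost, lower, upper, n_steps=5000, t0=1.0, seed=0):
    """Minimize cost(params) inside the box [lower, upper]."""
    rng = random.Random(seed)
    current = [rng.uniform(lo, hi) for lo, hi in zip(lower, upper)]
    current_cost = cost(current)
    best, best_cost = list(current), current_cost
    for step in range(n_steps):
        # Linear cooling; the small offset avoids division by zero at the end.
        temp = t0 * (1.0 - step / n_steps) + 1e-9
        j = rng.randrange(len(current))
        span = upper[j] - lower[j]
        candidate = list(current)
        # Perturb one parameter and clip it back into its allowed range.
        candidate[j] = min(upper[j], max(lower[j],
                           candidate[j] + rng.gauss(0.0, 0.1 * span)))
        candidate_cost = cost(candidate)
        delta = candidate_cost - current_cost
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current, current_cost = candidate, candidate_cost
            if current_cost < best_cost:
                best, best_cost = list(current), current_cost
    return best, best_cost
```

In a calibration setting, `lower` and `upper` would hold the literature minimum and maximum (or +/-20% of the current value) for each parameter, and `cost` would measure the mismatch between a simulation run and the reference map.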
International Nuclear Information System (INIS)
Passos, E.J.V. de; Toledo Piza, A.F.R. de.
The properties of the subspaces of the many-body Hilbert space which are associated with the use of the Generator Coordinate Method (GCM) in connection with one parameter, and with two-conjugate parameter families of generator states are examined in detail. It is shown that natural orthonormal base vectors in each case are immediately related to Peierls-Yoccoz and Peierls-Thouless projections respectively. Through the formal consideration of a canonical transformation to collective, P and Q, and intrinsic degrees of freedom, the properties of the GCM subspaces with respect to the kinematical separation of these degrees of freedom are discussed in detail. An application is made, using the ideas developed in this paper, a) to translation; b) to illustrate the qualitative understanding of the content of existing GCM calculations of giant resonances in light nuclei and c) to the definition of appropriate asymptotic states in current GCM descriptions of scattering [pt]
Research on Geometric Positioning Algorithm of License Plate in Multidimensional Parameter Space
Directory of Open Access Journals (Sweden)
Yinhua Huan
2014-05-01
To locate license plates consistently against reference images with license plate features in a multidimensional parameter space, and building on the features of commonly used vehicle license plate location methods, a new geometric location algorithm is proposed. The algorithm mainly comprises model training and real-time search. It adapts to both linear and non-linear gray-scale changes and also supports changes of scale and angle. Numerical results show that, under the same test conditions as mainstream locating software, the position deviation of the geometric positioning algorithm is less than 0.5 pixel when the multidimensional parameter space is not taken into account; when it is taken into account, the position deviation is less than 1.0 pixel and the angle deviation is less than 1.0 degree. The algorithm is robust, simple and practical, and performs better than the traditional method.
Naden, Levi N; Shirts, Michael R
2016-04-12
We show how thermodynamic properties of molecular models can be computed over a large, multidimensional parameter space by combining multistate reweighting analysis with a linear basis function approach. This approach reduces the computational cost to estimate thermodynamic properties from molecular simulations for over 130,000 tested parameter combinations from over 1000 CPU years to tens of CPU days. This speed increase is achieved primarily by computing the potential energy as a linear combination of basis functions, computed from either modified simulation code or as the difference of energy between two reference states, which can be done without any simulation code modification. The thermodynamic properties are then estimated with the Multistate Bennett Acceptance Ratio (MBAR) as a function of multiple model parameters without the need to define a priori how the states are connected by a pathway. Instead, we adaptively sample a set of points in parameter space to create mutual configuration space overlap. The existence of regions of poor configuration space overlap is detected by analyzing the eigenvalues of the sampled states' overlap matrix. The configuration space overlap to sampled states is monitored alongside the mean and maximum uncertainty to determine convergence, as neither the uncertainty nor the configuration space overlap alone is a sufficient metric of convergence. This adaptive sampling scheme is demonstrated by estimating with high precision the solvation free energies of charged particles of Lennard-Jones plus Coulomb functional form with charges between -2 and +2 and generally physical values of σij and ϵij in TIP3P water. We also compute entropy, enthalpy, and radial distribution functions of arbitrary unsampled parameter combinations using only the data from these sampled states and use the estimates of free energies over the entire space to examine the deviation of atomistic simulations from the Born approximation to the solvation free
International Nuclear Information System (INIS)
Lell, R.M.; Hanan, N.A.
1987-01-01
Effects of multigroup neutron cross section generation procedures on core physics parameters for compact fast spectrum reactors have been examined. Homogeneous and space-dependent multigroup cross section sets were generated in 11 and 27 groups for a representative fast reactor core. These cross sections were used to compute various reactor physics parameters for the reference core. Coarse group structure and neglect of space-dependence in the generation procedure resulted in inaccurate computations of reactor flux and power distributions and in significant errors regarding estimates of core reactivity and control system worth. Delayed neutron fraction was insensitive to cross section treatment, and computed reactivity coefficients were only slightly sensitive. However, neutron lifetime was found to be very sensitive to cross section treatment. Deficiencies in multigroup cross sections are reflected in core nuclear design and, consequently, in system mechanical design
International Nuclear Information System (INIS)
Oraevskij, V.N.; Golyshev, S.A.; Levitin, A.E.; Breus, T.K.; Ivanova, S.V.; Komarov, F.I.; Rapoport, S.I.
1995-01-01
The space and time distribution of the electric and magnetic fields and current systems in near-terrestrial space (electromagnetic weather) was studied in connection with ambulance calls in Moscow, Russia, related to cardiovascular diseases. Some examples of the correlations between the solar activity parameters and geomagnetic variations and the events with an extreme number of ambulance calls are presented. 4 refs., 5 figs., 2 tabs
Tuning a space-time scalable PI controller using thermal parameters
Energy Technology Data Exchange (ETDEWEB)
Riverol, C. [University of West Indies, Chemical Engineering Department, St. Augustine, Trinidad (Trinidad and Tobago); Pilipovik, M.V. [Armach Engineers, Urb. Los Palos Grandes, Project Engineering Department, Caracas (Venezuela)
2005-03-01
The paper outlines the successful empirical design and validation of a space-time PI controller based on a study of the controlled variable output as a function of time and space. The developed control was implemented on two heat exchanger systems (a falling film evaporator and a milk pasteurizer). The strategy required adding a new term to the classical PI controller, so that one new parameter has to be tuned. Measurements made on commercial installations have confirmed the validity of the new controller. (orig.)
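For reference, the classical PI controller that the paper extends can be sketched as a discrete-time loop. The abstract does not give the form of the additional space-dependent term, so it is not represented here; the first-order plant model, the gains and the name `simulate_pi` are assumptions made for illustration.

```python
def simulate_pi(kp, ki, setpoint=1.0, tau=1.0, dt=0.01, n_steps=2000):
    """Classical PI control of an assumed first-order plant tau * dy/dt = u - y."""
    y = 0.0
    integral = 0.0
    history = []
    for _ in range(n_steps):
        error = setpoint - y
        integral += error * dt          # accumulate the integral of the error
        u = kp * error + ki * integral  # proportional plus integral action
        y += dt * (u - y) / tau         # explicit Euler step of the plant
        history.append(y)
    return history
```

With `kp=2.0` and `ki=1.0` the simulated output settles at the setpoint, the integral term removing the steady-state offset that proportional action alone would leave.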
B→τν: Opening up the charged Higgs parameter space with R-parity violation
International Nuclear Information System (INIS)
Bose, Roshni; Kundu, Anirban
2012-01-01
The theoretically clean channel B⁺ → τ⁺ν shows a close to 3σ discrepancy between the Standard Model prediction and the data. This in turn puts a strong constraint on the parameter space of a two-Higgs doublet model, including R-parity conserving supersymmetry. The constraint is so strong that it almost smells of fine-tuning. We show how the parameter space opens up with the introduction of suitable R-parity violating interactions, releasing the tension between data and theory.
The magnetically driven imploding liner parameter space of the ATLAS capacitor bank
Lindemuth, I R; Faehl, R J; Reinovsky, R E
2001-01-01
Summary form only given, as follows. The Atlas capacitor bank (23 MJ, 30 MA) is now operational at Los Alamos. Atlas was designed primarily to magnetically drive imploding liners for use as impactors in shock and hydrodynamic experiments. We have conducted a computational "mapping" of the high-performance imploding liner parameter space accessible to Atlas. The effect of charge voltage, transmission inductance, liner thickness, liner initial radius, and liner length has been investigated. One conclusion is that Atlas is ideally suited to be a liner driver for liner-on-plasma experiments in a magnetized target fusion (MTF) context. The parameter space of possible Atlas reconfigurations has also been investigated.
An open-source job management framework for parameter-space exploration: OACIS
Murase, Y.; Uchitane, T.; Ito, N.
2017-11-01
We present an open-source software framework for parameter-space exploration, named OACIS, which is useful for managing large numbers of simulation jobs and results in a systematic way. Recent development of high-performance computers has enabled us to explore parameter spaces comprehensively; in such cases, however, manual management of the workflow is practically impossible. OACIS is developed with the aim of reducing the cost of these repetitive tasks when conducting simulations by automating job submissions and data management. In this article, an overview of OACIS as well as a getting started guide are presented.
Parameter-space metric of semicoherent searches for continuous gravitational waves
International Nuclear Information System (INIS)
Pletsch, Holger J.
2010-01-01
Continuous gravitational-wave (CW) signals such as those emitted by spinning neutron stars are an important target class for current detectors. However, the enormous computational demand prohibits fully coherent broadband all-sky searches for previously unknown CW sources over wide ranges of parameter space and for yearlong observation times. More efficient hierarchical "semicoherent" search strategies divide the data into segments much shorter than one year, which are analyzed coherently; then detection statistics from different segments are combined incoherently. To optimally perform the incoherent combination, understanding of the underlying parameter-space structure is requisite. This problem is addressed here by using new coordinates on the parameter space, which yield the first analytical parameter-space metric for the incoherent combination step. This semicoherent metric applies to broadband all-sky surveys (also embedding directed searches at fixed sky position) for isolated CW sources. Furthermore, the additional metric resolution attained through the combination of segments is studied. From the search parameters (sky position, frequency, and frequency derivatives), solely the metric resolution in the frequency derivatives is found to significantly increase with the number of segments.
Characterization of discontinuities in high-dimensional stochastic problems on adaptive sparse grids
International Nuclear Information System (INIS)
Jakeman, John D.; Archibald, Richard; Xiu Dongbin
2011-01-01
In this paper we present a set of efficient algorithms for detection and identification of discontinuities in high-dimensional space. The method is based on an extension of polynomial annihilation for discontinuity detection in low dimensions. Compared to the earlier work, the present method offers significant improvements for high-dimensional problems. The core of the algorithms relies on adaptive refinement of sparse grids. It is demonstrated that in the commonly encountered cases where a discontinuity resides on a small subset of the dimensions, the present method becomes 'optimal', in the sense that the total number of points required for function evaluations depends linearly on the dimensionality of the space. The details of the algorithms will be presented and various numerical examples are utilized to demonstrate the efficacy of the method.
Can We Train Machine Learning Methods to Outperform the High-dimensional Propensity Score Algorithm?
Karim, Mohammad Ehsanul; Pang, Menglan; Platt, Robert W
2018-03-01
The use of retrospective health care claims datasets is frequently criticized for the lack of complete information on potential confounders. Utilizing patients' health status-related information from claims datasets as surrogates or proxies for mismeasured and unobserved confounders, the high-dimensional propensity score algorithm enables us to reduce bias. Using a previously published cohort study of postmyocardial infarction statin use (1998-2012), we compare the performance of the algorithm with a number of popular machine learning approaches for confounder selection in high-dimensional covariate spaces: random forest, least absolute shrinkage and selection operator, and elastic net. Our results suggest that, when the data analysis is done with epidemiologic principles in mind, machine learning methods perform as well as the high-dimensional propensity score algorithm. Using a plasmode framework that mimicked the empirical data, we also showed that a hybrid of machine learning and high-dimensional propensity score algorithms generally performs slightly better than both in terms of mean squared error, when a bias-based analysis is used.
An Unbiased Distance-based Outlier Detection Approach for High-dimensional Data
DEFF Research Database (Denmark)
Nguyen, Hoang Vu; Gopalkrishnan, Vivekanand; Assent, Ira
2011-01-01
than a global property. Different from existing approaches, it is not grid-based and dimensionality unbiased. Thus, its performance is impervious to grid resolution as well as the curse of dimensionality. In addition, our approach ranks the outliers, allowing users to select the number of desired...... outliers, thus mitigating the issue of high false alarm rate. Extensive empirical studies on real datasets show that our approach efficiently and effectively detects outliers, even in high-dimensional spaces....
DEFF Research Database (Denmark)
Kaniecki, M.; Saenz, E.; Rolo, L.
2014-01-01
This paper demonstrates a method for material characterization (permittivity, permeability, loss tangent) based on the scattering parameters. The performance of the extraction algorithm will be shown for modelled and measured data. The measurements were carried out at the European Space Agency...
The Legion Support for Advanced Parameter-Space Studies on a Grid
National Research Council Canada - National Science Library
Natrajan, Anand; Humphrey, Marty A; Grimshaw, Andrew S
2006-01-01
.... Legion provides tools and services that support advanced parameter-space studies, i.e., studies that make complex demands such as transparent access to distributed files, fault-tolerance and security. We demonstrate these benefits with a protein-folding experiment in which a molecular simulation package was run over a grid managed by Legion.
Miksovsky, J.; Raidl, A.
Phase space reconstruction by time delays is one of the useful tools of nonlinear time series analysis, enabling a number of applications. Its use requires the value of the time delay to be known, as well as the value of the embedding dimension. There are several methods for estimating both of these parameters. Typically, the time delay is computed first, followed by the embedding dimension. Our approach is slightly different: we reconstructed the phase space for various combinations of these parameters and used it for prediction by means of the nearest neighbours in the phase space. Then some measure of the prediction's success was computed (e.g. correlation or RMSE). The position of its global maximum (minimum) should indicate a suitable combination of time delay and embedding dimension. Several meteorological (particularly climatological) time series were used for the computations. We have also created an MS-Windows based program to implement this approach; its basic features will be presented as well.
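The selection procedure described above can be sketched as a small grid search. This is an illustration only, assuming a one-step-ahead forecast that copies the successor of the single nearest neighbour and RMSE as the success measure; the function names and the train/test split are hypothetical and not taken from the authors' program.

```python
import math


def delay_embed(series, dim, delay):
    """Delay vectors [x_t, x_(t-delay), ..., x_(t-(dim-1)*delay)] for all valid t."""
    start = (dim - 1) * delay
    return [[series[t - k * delay] for k in range(dim)]
            for t in range(start, len(series))]


def nn_prediction_rmse(series, dim, delay, n_test):
    """RMSE of one-step nearest-neighbour forecasts over the last n_test points."""
    start = (dim - 1) * delay
    vecs = delay_embed(series, dim, delay)  # vecs[t - start] is the state at time t
    n = len(series)
    sq_err = 0.0
    for t in range(n - n_test - 1, n - 1):  # forecast series[t + 1]
        query = vecs[t - start]
        # Search only states whose successor lies in the training part.
        nearest = min(range(start, n - n_test - 1),
                      key=lambda s: math.dist(vecs[s - start], query))
        sq_err += (series[nearest + 1] - series[t + 1]) ** 2
    return math.sqrt(sq_err / n_test)


def best_embedding(series, dims, delays, n_test):
    """Return (dim, delay, rmse) with the smallest forecast RMSE on the grid."""
    scored = [(nn_prediction_rmse(series, m, tau, n_test), m, tau)
              for m in dims for tau in delays]
    rmse, m, tau = min(scored)
    return m, tau, rmse
```

For a correlation-based success measure the same grid would be scanned for the maximum instead of the minimum.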
Controlling chaos in low and high dimensional systems with periodic parametric perturbations
International Nuclear Information System (INIS)
Mirus, K.A.; Sprott, J.C.
1998-06-01
The effect of applying a periodic perturbation to an accessible parameter of various chaotic systems is examined. Numerical results indicate that perturbation frequencies near the natural frequencies of the unstable periodic orbits of the chaotic systems can result in limit cycles for relatively small perturbations. Such perturbations can also control or significantly reduce the dimension of high-dimensional systems. Initial application to the control of fluctuations in a prototypical magnetic fusion plasma device will be reviewed
GAMLSS for high-dimensional data – a flexible approach based on boosting
Mayr, Andreas; Fenske, Nora; Hofner, Benjamin; Kneib, Thomas; Schmid, Matthias
2010-01-01
Generalized additive models for location, scale and shape (GAMLSS) are a popular semi-parametric modelling approach that, in contrast to conventional GAMs, regress not only the expected mean but every distribution parameter (e.g. location, scale and shape) to a set of covariates. Current fitting procedures for GAMLSS are infeasible for high-dimensional data setups and require variable selection based on (potentially problematic) information criteria. The present work describes a boosting algo...
Determination of charged particle beam parameters with taking into account of space charge
International Nuclear Information System (INIS)
Ishkhanov, B.S.; Poseryaev, A.V.; Shvedunov, V.I.
2005-01-01
A procedure is described for determining the basic parameters of a paraxial, axially symmetric beam of charged particles, taking into account the space charge contribution. The procedure is based on the general equation for the beam envelope. The paper presents data on its convergence and its robustness to measurement errors. The error in determining the crossover (waist) position and the beam radius at the crossover is at most 15%, while the error in determining the emittance depends on the relation between emittance and space charge. The procedure was used to determine the parameters of the 20 keV, 0.64 A beam of an available electron gun. The results agreed closely with the design parameters [ru]
Approximation of High-Dimensional Rank One Tensors
Bachmayr, Markus; Dahmen, Wolfgang; DeVore, Ronald; Grasedyck, Lars
2013-11-12
Many real world problems are high-dimensional in that their solution is a function which depends on many variables or parameters. This presents a computational challenge since traditional numerical techniques are built on model classes for functions based solely on smoothness. It is known that the approximation of smoothness classes of functions suffers from the so-called 'curse of dimensionality'. Avoiding this curse requires new model classes for real world functions that match applications. This has led to the introduction of notions such as sparsity, variable reduction, and reduced modeling. One theme that is particularly common is to assume a tensor structure for the target function. This paper investigates how well a rank one function f(x_1,...,x_d) = f_1(x_1)⋯f_d(x_d), defined on Ω = [0,1]^d, can be captured through point queries. It is shown that such a rank one function with component functions f_j in W^r_∞([0,1]) can be captured (in L_∞) to accuracy O(C(d,r) N^{-r}) from N well-chosen point evaluations. The constant C(d,r) scales like d^{dr}. The queries in our algorithms have two ingredients, a set of points built on the results from discrepancy theory and a second adaptive set of queries dependent on the information drawn from the first set. Under the assumption that a point z∈Ω with nonvanishing f(z) is known, the accuracy improves to O(d N^{-r}). © 2013 Springer Science+Business Media New York.
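Why a known point z with f(z) ≠ 0 helps can be seen from an exact identity for rank one functions: varying one coordinate at a time around z gives f(z_1,...,x_j,...,z_d) = f_j(x_j) f(z) / f_j(z_j), so the product of these d values equals f(x) f(z)^(d-1). The sketch below uses only this identity with d point queries per evaluation; it is not the sampling algorithm of the paper, and the name `rank_one_interpolant` is hypothetical.

```python
import math


def rank_one_interpolant(f, z):
    """Rebuild a rank one function from queries on the lines through z."""
    d = len(z)
    fz = f(z)
    if fz == 0:
        raise ValueError("f(z) must be nonzero")

    def approx(x):
        prod = 1.0
        for j in range(d):
            probe = list(z)
            probe[j] = x[j]  # move only coordinate j away from z
            prod *= f(probe)
        return prod / fz ** (d - 1)

    return approx


def example(x):
    """A rank one test function of three variables (chosen for illustration)."""
    return (1.0 + x[0]) * (2.0 + x[1] ** 2) * math.cos(x[2])
```

A practical scheme would replace the exact line queries by univariate approximations built from finitely many samples along each line.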
Analysing spatially extended high-dimensional dynamics by recurrence plots
Energy Technology Data Exchange (ETDEWEB)
Marwan, Norbert, E-mail: marwan@pik-potsdam.de [Potsdam Institute for Climate Impact Research, 14412 Potsdam (Germany); Kurths, Jürgen [Potsdam Institute for Climate Impact Research, 14412 Potsdam (Germany); Humboldt Universität zu Berlin, Institut für Physik (Germany); Nizhny Novgorod State University, Department of Control Theory, Nizhny Novgorod (Russian Federation); Foerster, Saskia [GFZ German Research Centre for Geosciences, Section 1.4 Remote Sensing, Telegrafenberg, 14473 Potsdam (Germany)
2015-05-08
Recurrence plot based measures of complexity are capable tools for characterizing complex dynamics. In this letter we show the potential of selected recurrence plot measures for the investigation of even high-dimensional dynamics. We apply this method on spatially extended chaos, such as derived from the Lorenz96 model and show that the recurrence plot based measures can qualitatively characterize typical dynamical properties such as chaotic or periodic dynamics. Moreover, we demonstrate its power by analysing satellite image time series of vegetation cover with contrasting dynamics as a spatially extended and potentially high-dimensional example from the real world. - Highlights: • We use recurrence plots for analysing partially extended dynamics. • We investigate the high-dimensional chaos of the Lorenz96 model. • The approach distinguishes different spatio-temporal dynamics. • We use the method for studying vegetation cover time series.
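As a rough illustration of what a recurrence plot measure computes, the sketch below builds a recurrence matrix from a sequence of state vectors and derives two common measures from it, the recurrence rate and the determinism (the fraction of recurrence points lying on diagonal lines). It is a minimal toy under our own assumptions, not the implementation used in the letter: the fixed threshold `eps`, the Euclidean norm, and the function names are hypothetical choices.

```python
import math


def recurrence_matrix(states, eps):
    """R[i][j] = 1 if states i and j are within eps of each other (Euclidean norm)."""
    n = len(states)
    return [
        [1 if math.dist(states[i], states[j]) <= eps else 0 for j in range(n)]
        for i in range(n)
    ]


def recurrence_rate(R):
    """Density of recurrence points in the whole matrix."""
    n = len(R)
    return sum(map(sum, R)) / (n * n)


def determinism(R, lmin=2):
    """Fraction of off-diagonal recurrence points on diagonal lines of length >= lmin.

    Only the upper triangle is scanned; R is symmetric, so the ratio is unchanged.
    """
    n = len(R)
    on_lines = 0
    total = 0
    for k in range(1, n):  # k-th diagonal above the main diagonal
        run = 0
        for i in range(n - k):
            if R[i][i + k]:
                run += 1
                total += 1
            else:
                if run >= lmin:
                    on_lines += run
                run = 0
        if run >= lmin:
            on_lines += run
    return on_lines / total if total else 0.0
```

For a strictly periodic trajectory the recurrence points form unbroken diagonals, so the determinism is 1, whereas irregular dynamics break the diagonals into short segments and lower the value.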
SP_Ace: a new code to derive stellar parameters and elemental abundances
Boeche, C.; Grebel, E. K.
2016-03-01
Context. Ongoing and future massive spectroscopic surveys will collect large numbers ($10^6$-$10^7$) of stellar spectra that need to be analyzed. Highly automated software is needed to derive stellar parameters and chemical abundances from these spectra. Aims: We developed a new method of estimating the stellar parameters Teff, log g, [M/H], and elemental abundances. This method was implemented in a new code, SP_Ace (Stellar Parameters And Chemical abundances Estimator). This is a highly automated code suitable for analyzing the spectra of large spectroscopic surveys with low or medium spectral resolution (R = 2000-20 000). Methods: After the astrophysical calibration of the oscillator strengths of 4643 absorption lines covering the wavelength ranges 5212-6860 Å and 8400-8924 Å, we constructed a library that contains the equivalent widths (EW) of these lines for a grid of stellar parameters. The EWs of each line are fit by a polynomial function that describes the EW of the line as a function of the stellar parameters. The coefficients of these polynomial functions are stored in a library called the "GCOG library". SP_Ace, a code written in FORTRAN95, uses the GCOG library to compute the EWs of the lines, constructs models of spectra as a function of the stellar parameters and abundances, and searches for the model that minimizes the $\chi^2$ deviation when compared to the observed spectrum. The code has been tested on synthetic and real spectra for a wide range of signal-to-noise and spectral resolutions. Results: SP_Ace derives stellar parameters such as Teff, log g, [M/H], and chemical abundances of up to ten elements for low to medium resolution spectra of FGK-type stars with precision comparable to the one usually obtained with spectra of higher resolution. Systematic errors in stellar parameters and chemical abundances are presented and identified with tests on synthetic and real spectra. Stochastic errors are automatically estimated by the code for all the parameters
Arif, Muhammad
2012-06-01
In pattern classification problems, feature extraction is an important step. The quality of the features in discriminating between classes plays an important role in pattern classification. In real life, pattern classification may require a high dimensional feature space, and it is impossible to visualize the feature space if its dimension is greater than four. In this paper, we propose a Similarity-Dissimilarity plot which can project a high dimensional space onto a two dimensional space while retaining the important characteristics required to assess the discrimination quality of the features. The similarity-dissimilarity plot can reveal information about the amount of overlap between the features of different classes. Separable data points of different classes are also visible on the plot and can be classified correctly using an appropriate classifier. Hence, approximate classification accuracy can be predicted. Moreover, it is possible to identify the class with which misclassified data points will be confused by the classifier. Outlier data points can also be located on the similarity-dissimilarity plot. Various examples of synthetic data are used to highlight important characteristics of the proposed plot. Some real life examples from biomedical data are also used for the analysis. The proposed plot is independent of the number of dimensions of the feature space.
High-dimensional model estimation and model selection
CERN. Geneva
2015-01-01
I will review concepts and algorithms from high-dimensional statistics for linear model estimation and model selection. I will particularly focus on the so-called p>>n setting where the number of variables p is much larger than the number of samples n. I will focus mostly on regularized statistical estimators that produce sparse models. Important examples include the LASSO and its matrix extension, the Graphical LASSO, and more recent non-convex methods such as the TREX. I will show the applicability of these estimators in a diverse range of scientific applications, such as sparse interaction graph recovery and high-dimensional classification and regression problems in genomics.
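To make the idea of a sparsity-inducing regularized estimator concrete, the following sketch solves the LASSO problem, minimizing (1/(2n))·||y − Xβ||² + λ·||β||₁, by cyclic coordinate descent with soft-thresholding. This is a generic textbook scheme written in plain Python for illustration; it is not taken from the talk, and the function names, the fixed iteration count, and the list-of-lists data layout are assumptions of ours.

```python
def soft_threshold(rho, lam):
    """Shrink rho toward zero by lam; values inside [-lam, lam] become exactly 0."""
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0


def lasso_cd(X, y, lam, n_iter=200):
    """Cyclic coordinate descent for (1/(2n)) * ||y - X b||^2 + lam * ||b||_1."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - X beta for the initial beta = 0
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # an all-zero column carries no information
            # correlation of column j with the residual that excludes feature j
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam) / col_sq[j]
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new
    return beta
```

The soft-thresholding step is what sets small coefficients exactly to zero, which is why the estimator performs model selection and estimation at the same time; larger values of `lam` yield sparser models.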
Quantum sensing of the phase-space-displacement parameters using a single trapped ion
Ivanov, Peter A.; Vitanov, Nikolay V.
2018-03-01
We introduce a quantum sensing protocol for detecting the parameters characterizing the phase-space displacement by using a single trapped ion as a quantum probe. We show that, thanks to the laser-induced coupling between the ion's internal states and the motion mode, the estimation of the two conjugated parameters describing the displacement can be efficiently performed by a set of measurements of the atomic state populations. Furthermore, we introduce a three-parameter protocol capable of detecting the magnitude, the transverse direction, and the phase of the displacement. We characterize the uncertainty of the two- and three-parameter problems in terms of the Fisher information and show that state projective measurement saturates the fundamental quantum Cramér-Rao bound.
Exploring the triplet parameters space to optimise the final focus of the FCC-hh
AUTHOR|(CDS)2141109; Abelleira, Jose; Seryi, Andrei; Cruz Alaniz, Emilia
2017-01-01
One of the main challenges when designing final focus systems of particle accelerators is maximising the beam stay clear in the strong quadrupole magnets of the inner triplet. Moreover, it is desirable to keep the quadrupoles in the triplet as short as possible for space and cost reasons, but also to reduce chromaticity and simplify correction schemes. An algorithm that explores the triplet parameter space to optimise both these aspects was written. It uses thin lenses as a first approximation and MADX for more precise calculations. In cooperation with radiation studies, this algorithm was then applied to design an alternative triplet for the final focus of the Future Circular Collider (FCC-hh).
Moving to continuous facial expression space using the MPEG-4 facial definition parameter (FDP) set
Karpouzis, Kostas; Tsapatsoulis, Nicolas; Kollias, Stefanos D.
2000-06-01
Research in facial expression has concluded that at least six emotions, conveyed by human faces, are universally associated with distinct expressions. Sadness, anger, joy, fear, disgust and surprise are categories of expressions that are recognizable across cultures. In this work we form a relation between the description of the universal expressions and the MPEG-4 Facial Definition Parameter Set (FDP). We also investigate the relation between the movement of basic FDPs and the parameters that describe emotion-related words according to some classical psychological studies. In particular Whissel suggested that emotions are points in a space, which seem to occupy two dimensions: activation and evaluation. We show that some of the MPEG-4 Facial Animation Parameters (FAPs), approximated by the motion of the corresponding FDPs, can be combined by means of a fuzzy rule system to estimate the activation parameter. In this way variations of the six archetypal emotions can be achieved. Moreover, Plutchik concluded that emotion terms are unevenly distributed through the space defined by dimensions like Whissel's; instead they tend to form an approximately circular pattern, called 'emotion wheel,' modeled using an angular measure. The 'emotion wheel' can be defined as a reference for creating intermediate expressions from the universal ones, by interpolating the movement of dominant FDP points between neighboring basic expressions. By exploiting the relation between the movement of the basic FDP point and the activation and angular parameters we can model more emotions than the primary ones and achieve efficient recognition in video sequences.
Reinforcement learning on slow features of high-dimensional input streams.
Directory of Open Access Journals (Sweden)
Robert Legenstein
Humans and animals are able to learn complex behaviors based on a massive stream of sensory information from different modalities. Early animal studies have identified learning mechanisms that are based on reward and punishment such that animals tend to avoid actions that lead to punishment whereas rewarded actions are reinforced. However, most algorithms for reward-based learning are only applicable if the dimensionality of the state-space is sufficiently small or its structure is sufficiently simple. Therefore, the question arises how the problem of learning on high-dimensional data is solved in the brain. In this article, we propose a biologically plausible generic two-stage learning system that can directly be applied to raw high-dimensional input streams. The system is composed of a hierarchical slow feature analysis (SFA) network for preprocessing and a simple neural network on top that is trained based on rewards. We demonstrate by computer simulations that this generic architecture is able to learn quite demanding reinforcement learning tasks on high-dimensional visual input streams in a time that is comparable to the time needed when an explicit highly informative low-dimensional state-space representation is given instead of the high-dimensional visual input. The learning speed of the proposed architecture in a task similar to the Morris water maze task is comparable to that found in experimental studies with rats. This study thus supports the hypothesis that slowness learning is one important unsupervised learning principle utilized in the brain to form efficient state representations for behavioral learning.
LAMOST DR1: Stellar Parameters and Chemical Abundances with SP_Ace
Boeche, C.; Smith, M. C.; Grebel, E. K.; Zhong, J.; Hou, J. L.; Chen, L.; Stello, D.
2018-04-01
We present a new analysis of the LAMOST DR1 survey spectral database performed with the code SP_Ace, which provides the derived stellar parameters Teff, log g, [Fe/H], and [α/H] for 1,097,231 stellar objects. We tested the reliability of our results by comparing them to reference results from high spectral resolution surveys. The expected errors can be summarized as ∼120 K in Teff, ∼0.2 in log g, ∼0.15 dex in [Fe/H], and ∼0.1 dex in [α/Fe] for spectra with S/N > 40, with some differences between dwarf and giant stars. SP_Ace provides error estimations consistent with the discrepancies observed between derived and reference parameters. Some systematic errors are identified and discussed. The resulting catalog is publicly available at the LAMOST and CDS websites.
Efficiently enclosing the compact binary parameter space by singular-value decomposition
International Nuclear Information System (INIS)
Cannon, Kipp; Hanna, Chad; Keppel, Drew
2011-01-01
Gravitational-wave searches for the merger of compact binaries use matched filtering as the method of detecting signals and estimating parameters. Such searches construct a fine mesh of filters covering a signal parameter space at high density. Previously it has been shown that singular-value decomposition can reduce the effective number of filters required to search the data. Here we study how the basis provided by the singular-value decomposition changes dimension as a function of template-bank density. We will demonstrate that it is sufficient to use the basis provided by the singular-value decomposition of a low-density bank to accurately reconstruct arbitrary points within the boundaries of the template bank. Since this technique is purely numerical, it may have applications to interpolating the space of numerical relativity waveforms.
Non-intrusive low-rank separated approximation of high-dimensional stochastic models
Doostan, Alireza; Validi, AbdoulAhad; Iaccarino, Gianluca
2013-01-01
This work proposes a sampling-based (non-intrusive) approach within the context of low-rank separated representations to tackle the curse of dimensionality associated with the solution of models, e.g., PDEs/ODEs, with high-dimensional random inputs. Under some conditions discussed in detail, the number of random realizations of the solution required for a successful approximation grows linearly with respect to the number of random inputs. The construction of the separated representation is achieved via a regularized alternating least-squares regression, together with an error indicator to estimate model parameters. The computational complexity of such a construction is quadratic in the number of random inputs. The performance of the method is investigated through its application to three numerical examples including two ODE problems with high-dimensional random inputs. © 2013 Elsevier B.V.
Non-intrusive low-rank separated approximation of high-dimensional stochastic models
Doostan, Alireza
2013-08-01
This work proposes a sampling-based (non-intrusive) approach within the context of low-rank separated representations to tackle the curse of dimensionality associated with the solution of models, e.g., PDEs/ODEs, with high-dimensional random inputs. Under some conditions discussed in detail, the number of random realizations of the solution required for a successful approximation grows linearly with respect to the number of random inputs. The construction of the separated representation is achieved via a regularized alternating least-squares regression, together with an error indicator to estimate model parameters. The computational complexity of such a construction is quadratic in the number of random inputs. The performance of the method is investigated through its application to three numerical examples including two ODE problems with high-dimensional random inputs. © 2013 Elsevier B.V.
Hadronic total cross-sections through soft gluon summation in impact parameter space
International Nuclear Information System (INIS)
Grau, A.
1999-01-01
The Bloch-Nordsieck model for the parton distribution of hadrons in impact parameter space, constructed using soft gluon summation, is investigated in detail. Its dependence upon the infrared structure of the strong coupling constant $\alpha_s$ is discussed, both for finite as well as singular, but integrable, $\alpha_s$. The formalism is applied to the prediction of total proton-proton and proton-antiproton cross-sections, where screening, due to soft gluon emission from the initial valence quarks, becomes evident.
Supporting Dynamic Quantization for High-Dimensional Data Analytics.
Guzun, Gheorghi; Canahuate, Guadalupe
2017-05-01
Similarity searches are at the heart of exploratory data analysis tasks. Distance metrics are typically used to characterize the similarity between data objects represented as feature vectors. However, when the dimensionality of the data increases and the number of features is large, traditional distance metrics fail to distinguish between the closest and furthest data points. Localized distance functions have been proposed as an alternative to traditional distance metrics. These functions only consider dimensions close to the query to compute the distance/similarity. Furthermore, in order to enable interactive explorations of high-dimensional data, indexing support for ad-hoc queries is needed. In this work we set out to investigate whether bit-sliced indices can be used for exploratory analytics such as similarity searches and data clustering for high-dimensional big data. We also propose a novel dynamic quantization called Query dependent Equi-Depth (QED) quantization and show its effectiveness in characterizing high-dimensional similarity. When applying QED we observe improvements in kNN classification accuracy over traditional distance functions. Gheorghi Guzun and Guadalupe Canahuate. 2017. Supporting Dynamic Quantization for High-Dimensional Data Analytics. In Proceedings of ExploreDB'17, Chicago, IL, USA, May 14-19, 2017, 6 pages. https://doi.org/10.1145/3077331.3077336.
A hybridized K-means clustering approach for high dimensional ...
African Journals Online (AJOL)
International Journal of Engineering, Science and Technology ... Due to incredible growth of high dimensional dataset, conventional data base querying methods are inadequate to extract useful information, so researchers nowadays ... Recently cluster analysis is a popularly used data analysis method in number of areas.
On Robust Information Extraction from High-Dimensional Data
Czech Academy of Sciences Publication Activity Database
Kalina, Jan
2014-01-01
Vol. 9, No. 1 (2014), pp. 131-144. ISSN 1452-4864. Grant - others: GA ČR (CZ) GA13-01930S. Institutional support: RVO:67985807. Keywords: data mining * high-dimensional data * robust econometrics * outliers * machine learning. Subject RIV: IN - Informatics, Computer Science
Inference in High-dimensional Dynamic Panel Data Models
DEFF Research Database (Denmark)
Kock, Anders Bredahl; Tang, Haihan
We establish oracle inequalities for a version of the Lasso in high-dimensional fixed effects dynamic panel data models. The inequalities are valid for the coefficients of the dynamic and exogenous regressors. Separate oracle inequalities are derived for the fixed effects. Next, we show how one can...
Pricing High-Dimensional American Options Using Local Consistency Conditions
Berridge, S.J.; Schumacher, J.M.
2004-01-01
We investigate a new method for pricing high-dimensional American options. The method is of finite difference type but is also related to Monte Carlo techniques in that it involves a representative sampling of the underlying variables. An approximating Markov chain is built using this sampling and
Irregular grid methods for pricing high-dimensional American options
Berridge, S.J.
2004-01-01
This thesis proposes and studies numerical methods for pricing high-dimensional American options; important examples being basket options, Bermudan swaptions and real options. Four new methods are presented and analysed, both in terms of their application to various test problems, and in terms of
On equivalent parameter learning in simplified feature space based on Bayesian asymptotic analysis.
Yamazaki, Keisuke
2012-07-01
Parametric models for sequential data, such as hidden Markov models, stochastic context-free grammars, and linear dynamical systems, are widely used in time-series analysis and structural data analysis. Computation of the likelihood function is one of primary considerations in many learning methods. Iterative calculation of the likelihood such as the model selection is still time-consuming though there are effective algorithms based on dynamic programming. The present paper studies parameter learning in a simplified feature space to reduce the computational cost. Simplifying data is a common technique seen in feature selection and dimension reduction though an oversimplified space causes adverse learning results. Therefore, we mathematically investigate a condition of the feature map to have an asymptotically equivalent convergence point of estimated parameters, referred to as the vicarious map. As a demonstration to find vicarious maps, we consider the feature space, which limits the length of data, and derive a necessary length for parameter learning in hidden Markov models. Copyright © 2012 Elsevier Ltd. All rights reserved.
Saleem, M.; Resmi, L.; Misra, Kuntal; Pai, Archana; Arun, K. G.
2018-03-01
Short duration Gamma Ray Bursts (SGRB) and their afterglows are among the most promising electromagnetic (EM) counterparts of Neutron Star (NS) mergers. The afterglow emission is broad-band, visible across the entire electromagnetic window from γ-ray to radio frequencies. The flux evolution in these frequencies is sensitive to the multidimensional afterglow physical parameter space. Observations of gravitational waves (GW) from binary neutron star (BNS) mergers in spatial and temporal coincidence with SGRB and associated afterglows can provide valuable constraints on afterglow physics. We run simulations of GW-detected BNS events and, assuming that all of them are associated with a GRB jet which also produces an afterglow, investigate how detections or non-detections in X-ray, optical and radio frequencies can be influenced by the parameter space. We narrow down the regions of afterglow parameter space for a uniform top-hat jet model, which would result in different detection scenarios. We list inferences which can be drawn on the physics of GRB afterglows from multimessenger astronomy with coincident GW-EM observations.
Myers, J. G.; Feola, A.; Werner, C.; Nelson, E. S.; Raykin, J.; Samuels, B.; Ethier, C. R.
2016-01-01
The earliest manifestations of Visual Impairment and Intracranial Pressure (VIIP) syndrome become evident after months of spaceflight and include a variety of ophthalmic changes, including posterior globe flattening and distension of the optic nerve sheath. Prevailing evidence links the occurrence of VIIP to the cephalic fluid shift induced by microgravity and the subsequent pressure changes around the optic nerve and eye. Deducing the etiology of VIIP is challenging due to the wide range of physiological parameters that may be influenced by spaceflight and are required to address a realistic spectrum of physiological responses. Here, we report on the application of an efficient approach to interrogating physiological parameter space through computational modeling. Specifically, we assess the influence of uncertainty in input parameters for two models of VIIP syndrome: a lumped-parameter model (LPM) of the cardiovascular and central nervous systems, and a finite-element model (FEM) of the posterior eye, optic nerve head (ONH) and optic nerve sheath. Methods: To investigate the parameter space in each model, we employed Latin hypercube sampling/partial rank correlation coefficient (LHS/PRCC) strategies. LHS techniques outperform Monte Carlo approaches by enforcing efficient sampling across the entire range of all parameters. The PRCC method estimates the sensitivity of model outputs to these parameters while adjusting for the linear effects of all other inputs. The LPM analysis addressed uncertainties in 42 physiological parameters, such as initial compartmental volume and nominal compartment percentage of total cardiac output in the supine state, while the FEM evaluated the effects on biomechanical strain from uncertainties in 23 material and pressure parameters for the ocular anatomy. Results and Conclusion: The LPM analysis identified several key factors including high sensitivity to the initial fluid distribution. The FEM study found that intraocular pressure and
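The sampling half of the approach described above can be sketched in a few lines. In Latin hypercube sampling each parameter range is cut into as many equal strata as there are samples, and every stratum is used exactly once per parameter, which is what enforces coverage of the full range. The function below is a minimal illustration under our own assumptions (uniform parameter ranges, the hypothetical name `latin_hypercube`, and an optional `rng` argument for reproducibility); it does not reproduce the study's implementation, and the PRCC step is not shown.

```python
import random


def latin_hypercube(n_samples, bounds, rng=None):
    """Draw n_samples points; bounds is a list of (low, high) pairs, one per parameter.

    Each parameter range is split into n_samples equal strata and each stratum
    is sampled exactly once, in an independently shuffled order per parameter.
    """
    rng = rng or random.Random()
    columns = []
    for low, high in bounds:
        strata = list(range(n_samples))
        rng.shuffle(strata)  # random pairing of strata across parameters
        columns.append(
            [low + (s + rng.random()) / n_samples * (high - low) for s in strata]
        )
    # transpose: one tuple of parameter values per sample
    return [tuple(col[i] for col in columns) for i in range(n_samples)]
```

Each returned tuple would then be fed to the model as one input parameter set, and the resulting outputs ranked against the inputs to estimate sensitivities.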
Directory of Open Access Journals (Sweden)
Haiwen Li
2018-01-01
The estimation speed of positioning parameters determines the effectiveness of the positioning system. The time of arrival (TOA) and direction of arrival (DOA) parameters can be estimated by the space-time two-dimensional multiple signal classification (2D-MUSIC) algorithm for an array antenna. However, this algorithm needs much time to complete the two-dimensional pseudo-spectral peak search, which makes it difficult to apply in practice. To solve this problem, a fast estimation method for space-time two-dimensional positioning parameters based on the Hadamard product is proposed for an orthogonal frequency division multiplexing (OFDM) system, and the Cramer-Rao bound (CRB) is also presented. Firstly, according to the channel frequency domain response vector of each array, the channel frequency domain estimation vector is constructed using the Hadamard product form containing location information. Then, the autocorrelation matrix of the channel response vector for the extended array element in the frequency domain and the noise subspace are calculated successively. Finally, by combining the closed-form solution and parameter pairing, the fast joint estimation of time delay and arrival direction is accomplished. The theoretical analysis and simulation results show that the proposed algorithm can significantly reduce the computational complexity and that its estimation accuracy is not only better than that of the estimating signal parameters via rotational invariance techniques (ESPRIT) algorithm and the 2D matrix pencil (MP) algorithm but also close to that of the 2D-MUSIC algorithm. Moreover, the proposed algorithm also has a certain adaptability to multipath environments and effectively improves the ability to acquire location parameters quickly.
The dynamics of blood biochemical parameters in cosmonauts during long-term space flights
Markin, Andrei; Strogonova, Lubov; Balashov, Oleg; Polyakov, Valery; Tigner, Timoty
Most of the previously obtained data on cosmonauts' metabolic state concerned certain stages of the postflight period. Consequently, all conclusions about the peculiarities of metabolism during space flight were to a large extent probabilistic. The purpose of this work was to study metabolism characteristics in cosmonauts directly during long-term space flights. In capillary blood samples taken from a finger, the activity of GOT, GPT, CK, gamma-GT, and total and pancreatic amylase, as well as the concentration of hemoglobin, glucose, total bilirubin, uric acid, urea, creatinine, total, HDL- and LDL-cholesterol, and triglycerides, were determined with a "Reflotron IV" biochemical analyzer ("Boehringer Mannheim" GmbH, Germany) adapted to weightless conditions. The HDL/LDL-cholesterol ratio was also computed. The crewmembers of 6 main missions to the "Mir" orbital station, a total of 17 cosmonauts, were examined. Biochemical tests were carried out 30-60 days before launch and at different stages of the flights, between the 25th and the 423rd days. During space flight, the cosmonauts showed a tendency, compared with the basal level, toward increased GOT, GPT, and total amylase activity and increased glucose and total cholesterol concentration, and toward decreased CK activity, hemoglobin and HDL-cholesterol concentration, and HDL/LDL-cholesterol ratio. No definite trends were found in the other biochemical parameters. Because the same trends in the parameters mentioned were observed in the majority of the cosmonauts tested, a connection between these metabolic alterations and the influence of space flight conditions on the cosmonaut's body can be supposed. Variations in the other blood biochemical parameters studied probably depend on purely individual causes.
Hypergraph-based anomaly detection of high-dimensional co-occurrences.
Silva, Jorge; Willett, Rebecca
2009-03-01
This paper addresses the problem of detecting anomalous multivariate co-occurrences using a limited number of unlabeled training observations. A novel method based on using a hypergraph representation of the data is proposed to deal with this very high-dimensional problem. Hypergraphs constitute an important extension of graphs which allow edges to connect more than two vertices simultaneously. A variational Expectation-Maximization algorithm for detecting anomalies directly on the hypergraph domain without any feature selection or dimensionality reduction is presented. The resulting estimate can be used to calculate a measure of anomalousness based on the False Discovery Rate. The algorithm has O(np) computational complexity, where n is the number of training observations and p is the number of potential participants in each co-occurrence event. This efficiency makes the method ideally suited for very high-dimensional settings, and requires no tuning, bandwidth or regularization parameters. The proposed approach is validated on both high-dimensional synthetic data and the Enron email database, where p > 75,000, and it is shown that it can outperform other state-of-the-art methods.
Shape, size, and robustness: feasible regions in the parameter space of biochemical networks.
Directory of Open Access Journals (Sweden)
Adel Dayarian
2009-01-01
The concept of robustness of regulatory networks has received much attention in the last decade. One measure of robustness has been associated with the volume of the feasible region, namely, the region in the parameter space in which the system is functional. In this paper, we show that, in addition to volume, the geometry of this region has important consequences for the robustness and the fragility of a network. We develop an approximation within which we could algebraically specify the feasible region. We analyze the segment polarity gene network to illustrate our approach. The study of random walks in the parameter space and how they exit the feasible region provide us with a rich perspective on the different modes of failure of this network model. In particular, we found that, between two alternative ways of activating Wingless, one is more robust than the other. Our method provides a more complete measure of robustness to parameter variation. As a general modeling strategy, our approach is an interesting alternative to Boolean representation of biochemical networks.
International Nuclear Information System (INIS)
Núñez, Darío; Zavala, Jesús; Nellen, Lukas; Sussman, Roberto A; Cabral-Rosetti, Luis G; Mondragón, Myriam
2008-01-01
We derive an expression for the entropy of a dark matter halo described using a Navarro-Frenk-White model with a core. The comparison of this entropy with that of dark matter in the freeze-out era allows us to constrain the parameter space in mSUGRA models. Moreover, combining these constraints with the ones obtained from the usual abundance criterion and demanding that these criteria be consistent with the $2\sigma$ bounds for the abundance of dark matter, $0.112 \le \Omega_{\mathrm{DM}} h^2 \le 0.122$, we are able to clearly identify validity regions among the values of $\tan\beta$, which is one of the parameters of the mSUGRA model. We found that for the regions of the parameter space explored, small values of $\tan\beta$ are not favored; only for $\tan\beta \simeq 50$ are the two criteria significantly consistent. In the region where the two criteria are consistent we also found a lower bound for the neutralino mass, $m_\chi \ge 141$ GeV.
Energy Technology Data Exchange (ETDEWEB)
Nunez, Dario; Zavala, Jesus; Nellen, Lukas; Sussman, Roberto A [Instituto de Ciencias Nucleares, Universidad Nacional Autonoma de Mexico (ICN-UNAM), AP 70-543, Mexico 04510 DF (Mexico); Cabral-Rosetti, Luis G [Departamento de Posgrado, Centro Interdisciplinario de Investigacion y Docencia en Educacion Tecnica (CIIDET), Avenida Universidad 282 Pte., Col. Centro, Apartado Postal 752, C. P. 76000, Santiago de Queretaro, Qro. (Mexico); Mondragon, Myriam, E-mail: nunez@nucleares.unam.mx, E-mail: jzavala@nucleares.unam.mx, E-mail: jzavala@shao.ac.cn, E-mail: lukas@nucleares.unam.mx, E-mail: sussman@nucleares.unam.mx, E-mail: lgcabral@ciidet.edu.mx, E-mail: myriam@fisica.unam.mx [Instituto de Fisica, Universidad Nacional Autonoma de Mexico (IF-UNAM), Apartado Postal 20-364, 01000 Mexico DF (Mexico); Collaboration: For the Instituto Avanzado de Cosmologia, IAC
2008-05-15
We derive an expression for the entropy of a dark matter halo described using a Navarro-Frenk-White model with a core. The comparison of this entropy with that of dark matter in the freeze-out era allows us to constrain the parameter space in mSUGRA models. Moreover, combining these constraints with the ones obtained from the usual abundance criterion and demanding that these criteria be consistent with the $2\sigma$ bounds for the abundance of dark matter, $0.112 \le \Omega_{\mathrm{DM}} h^2 \le 0.122$, we are able to clearly identify validity regions among the values of $\tan\beta$, which is one of the parameters of the mSUGRA model. We found that for the regions of the parameter space explored, small values of $\tan\beta$ are not favored; only for $\tan\beta \simeq 50$ are the two criteria significantly consistent. In the region where the two criteria are consistent we also found a lower bound for the neutralino mass, $m_\chi \ge 141$ GeV.
High Dimensional Modulation and MIMO Techniques for Access Networks
DEFF Research Database (Denmark)
Binti Othman, Maisara
Exploration of advanced modulation formats and multiplexing techniques for next generation optical access networks is of interest as a promising solution for delivering multiple services to end-users. This thesis addresses this from two different angles: high dimensionality carrierless...... the capacity per wavelength of the femto-cell network. A bit rate up to 1.59 Gbps with fiber-wireless transmission over 1 m air distance is demonstrated. The results presented in this thesis demonstrate the feasibility of high dimensionality CAP in increasing the number of dimensions and their potentially......) optical access network. 2 X 2 MIMO RoF employing orthogonal frequency division multiplexing (OFDM) with 5.6 GHz RoF signaling over all-vertical cavity surface emitting lasers (VCSEL) WDM passive optical networks (PONs). We have employed polarization division multiplexing (PDM) to further increase...
HSM: Heterogeneous Subspace Mining in High Dimensional Data
DEFF Research Database (Denmark)
Müller, Emmanuel; Assent, Ira; Seidl, Thomas
2009-01-01
Heterogeneous data, i.e. data with both categorical and continuous values, is common in many databases. However, most data mining algorithms assume either continuous or categorical attributes, but not both. In high dimensional data, phenomena due to the "curse of dimensionality" pose additional...... challenges. Usually, due to locally varying relevance of attributes, patterns do not show across the full set of attributes. In this paper we propose HSM, which defines a new pattern model for heterogeneous high dimensional data. It allows data mining in arbitrary subsets of the attributes that are relevant...... for the respective patterns. Based on this model we propose an efficient algorithm, which is aware of the heterogeneity of the attributes. We extend an indexing structure for continuous attributes such that HSM indexing adapts to different attribute types. In our experiments we show that HSM efficiently mines...
HIGH DIMENSIONAL COVARIANCE MATRIX ESTIMATION IN APPROXIMATE FACTOR MODELS.
Fan, Jianqing; Liao, Yuan; Mincheva, Martina
2011-01-01
The variance covariance matrix plays a central role in the inferential theories of high dimensional factor models in finance and economics. Popular regularization methods of directly exploiting sparsity are not directly applicable to many financial problems. Classical methods of estimating the covariance matrices are based on the strict factor models, assuming independent idiosyncratic components. This assumption, however, is restrictive in practical applications. By assuming sparse error covariance matrix, we allow the presence of the cross-sectional correlation even after taking out common factors, and it enables us to combine the merits of both methods. We estimate the sparse covariance using the adaptive thresholding technique as in Cai and Liu (2011), taking into account the fact that direct observations of the idiosyncratic components are unavailable. The impact of high dimensionality on the covariance matrix estimation based on the factor structure is then studied.
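The thresholding step can be sketched as follows. This is an illustration under stated assumptions, not the authors' estimator: it assumes the residuals (estimated idiosyncratic components) have already been obtained by removing common factors, and it uses an entry-adaptive soft threshold of the form $\lambda_{ij} = \delta \sqrt{\hat\theta_{ij} \log p / n}$, which is the general shape of adaptive thresholding as commonly described; the function name, the constant `delta`, and the toy data are hypothetical.

```python
import math

def adaptive_threshold_cov(residuals, delta=2.0):
    """Soft-threshold off-diagonal sample covariances, entry by entry.

    residuals: list of n observations, each a list of p values (assumed to
    be estimated idiosyncratic components after removing common factors).
    """
    n, p = len(residuals), len(residuals[0])
    means = [sum(row[j] for row in residuals) / n for j in range(p)]
    centered = [[row[j] - means[j] for j in range(p)] for row in residuals]
    cov = [[sum(c[i] * c[j] for c in centered) / n for j in range(p)]
           for i in range(p)]
    out = [row[:] for row in cov]
    for i in range(p):
        for j in range(p):
            if i == j:
                continue  # variances are left untouched
            # theta_ij: variability of the products that make up cov[i][j]
            theta = sum((c[i] * c[j] - cov[i][j]) ** 2 for c in centered) / n
            lam = delta * math.sqrt(theta * math.log(p) / n)
            s = cov[i][j]
            # Soft thresholding: shrink towards zero, zero out small entries.
            out[i][j] = math.copysign(max(abs(s) - lam, 0.0), s)
    return out

# Toy residuals (made up): 6 observations of 3 series.
data = [[1.0, 0.9, 0.1], [-1.0, -1.1, -0.2], [0.5, 0.6, 0.3],
        [-0.5, -0.4, -0.2], [0.2, 0.1, -0.4], [-0.2, -0.1, 0.4]]
sparse_cov = adaptive_threshold_cov(data, delta=1.0)
```

Because each threshold scales with the variability of its own entry, noisy cross-covariances are shrunk harder than stable ones, while the diagonal is preserved.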
High-dimensional data in economics and their (robust) analysis
Czech Academy of Sciences Publication Activity Database
Kalina, Jan
2017-01-01
Roč. 12, č. 1 (2017), s. 171-183 ISSN 1452-4864 R&D Projects: GA ČR GA17-07384S Institutional support: RVO:67985556 Keywords : econometrics * high-dimensional data * dimensionality reduction * linear regression * classification analysis * robustness Subject RIV: BA - General Mathematics OBOR OECD: Business and management http://library.utia.cas.cz/separaty/2017/SI/kalina-0474076.pdf
High-dimensional Data in Economics and their (Robust) Analysis
Czech Academy of Sciences Publication Activity Database
Kalina, Jan
2017-01-01
Roč. 12, č. 1 (2017), s. 171-183 ISSN 1452-4864 R&D Projects: GA ČR GA17-07384S Grant - others:GA ČR(CZ) GA13-01930S Institutional support: RVO:67985807 Keywords : econometrics * high-dimensional data * dimensionality reduction * linear regression * classification analysis * robustness Subject RIV: BB - Applied Statistics, Operational Research OBOR OECD: Statistics and probability
Ozkat, Erkan Caner; Franciosa, Pasquale; Ceglarek, Dariusz
2017-08-01
Remote laser welding technology offers opportunities for high production throughput at a competitive cost. However, the remote laser welding process of zinc-coated sheet metal parts in lap joint configuration poses a challenge due to the difference between the melting temperature of the steel (∼1500 °C) and the vapourizing temperature of the zinc (∼907 °C). In fact, the zinc layer at the faying surface is vapourized and the vapour might be trapped within the melting pool, leading to weld defects. Various solutions have been proposed to overcome this problem over the years. Among them, laser dimpling has been adopted by manufacturers because of its flexibility and effectiveness along with its cost advantages. In essence, the dimple works as a spacer between the two sheets in the lap joint and allows the zinc vapour to escape during the welding process, thereby preventing weld defects. However, there is a lack of comprehensive characterization of the dimpling process for effective implementation in real manufacturing systems, taking into consideration inherent changes in variability of process parameters. This paper introduces a methodology to develop (i) a surrogate model for dimpling process characterization considering a multiple-inputs (i.e. key control characteristics) and multiple-outputs (i.e. key performance indicators) system by conducting physical experimentation and using multivariate adaptive regression splines; (ii) a process capability space (Cp-Space) based on the developed surrogate model that allows the estimation of a desired process fallout rate in the case of violation of process requirements in the presence of stochastic variation; and, (iii) selection and optimization of the process parameters based on the process capability space. The proposed methodology provides a unique capability to: (i) simulate the effect of process variation as generated by manufacturing process; (ii) model quality requirements with multiple and coupled quality requirements; and (iii
High-dimensional quantum cloning and applications to quantum hacking.
Bouchard, Frédéric; Fickler, Robert; Boyd, Robert W; Karimi, Ebrahim
2017-02-01
Attempts at cloning a quantum system result in the introduction of imperfections in the state of the copies. This is a consequence of the no-cloning theorem, which is a fundamental law of quantum physics and the backbone of security for quantum communications. Although perfect copies are prohibited, a quantum state may be copied with maximal accuracy via various optimal cloning schemes. Optimal quantum cloning, which lies at the border of the physical limit imposed by the no-signaling theorem and the Heisenberg uncertainty principle, has been experimentally realized for low-dimensional photonic states. However, an increase in the dimensionality of quantum systems is greatly beneficial to quantum computation and communication protocols. Nonetheless, no experimental demonstration of optimal cloning machines has hitherto been shown for high-dimensional quantum systems. We perform optimal cloning of high-dimensional photonic states by means of the symmetrization method. We show the universality of our technique by conducting cloning of numerous arbitrary input states and fully characterize our cloning machine by performing quantum state tomography on cloned photons. In addition, a cloning attack on a Bennett and Brassard (BB84) quantum key distribution protocol is experimentally demonstrated to reveal the robustness of high-dimensional states in quantum cryptography.
Non-Abelian monopole in the parameter space of point-like interactions
International Nuclear Information System (INIS)
Ohya, Satoshi
2014-01-01
We study non-Abelian geometric phase in N=2 supersymmetric quantum mechanics for a free particle on a circle with two point-like interactions at antipodal points. We show that non-Abelian Berry’s connection is that of SU(2) magnetic monopole discovered by Moody, Shapere and Wilczek in the context of adiabatic decoupling limit of diatomic molecule. - Highlights: • Supersymmetric quantum mechanics is an ideal playground for studying geometric phase. • We determine the parameter space of supersymmetric point-like interactions. • Berry’s connection is given by a Wu–Yang-like magnetic monopole in SU(2) Yang–Mills
Probing the parameter space of HD 49933: A comparison between global and local methods
Energy Technology Data Exchange (ETDEWEB)
Creevey, O L [Instituto de Astrofisica de Canarias (IAC), E-38200 La Laguna, Tenerife (Spain); Bazot, M, E-mail: orlagh@iac.es, E-mail: bazot@astro.up.pt [Centro de Astrofisica da Universidade do Porto, Rua das Estrelas, 4150-762 Porto (Portugal)
2011-01-01
We present two independent methods for studying the global stellar parameter space (mass $M$, age, chemical composition $X_0$, $Z_0$) of HD 49933 with seismic data. Using a local minimization and an MCMC algorithm, we obtain consistent results for the determination of the stellar properties: $M$ of 1.1-1.2 $M_\odot$, age $\approx 3.0$ Gyr, $Z_0 \approx 0.008$. A description of the error ellipses can be defined using Singular Value Decomposition techniques, and this is validated by comparing the errors with those from the MCMC method.
A morphing technique for signal modelling in a multidimensional space of coupling parameters
The ATLAS collaboration
2015-01-01
This note describes a morphing method that produces signal models for fits to data in which both the affected event yields and kinematic distributions are simultaneously taken into account. The signal model is morphed in a continuous manner through the available multi-dimensional parameter space. Searches for deviations from Standard Model predictions for Higgs boson properties have so far used information either from event yields or kinematic distributions. The combined approach described here is expected to substantially enhance the sensitivity to beyond the Standard Model contributions.
Abidi, Yassine; Bellassoued, Mourad; Mahjoub, Moncef; Zemzemi, Nejib
2018-03-01
In this paper, we consider the inverse problem of space dependent multiple ionic parameters identification in cardiac electrophysiology modelling from a set of observations. We use the monodomain system known as a state-of-the-art model in cardiac electrophysiology and we consider a general Hodgkin-Huxley formalism to describe the ionic exchanges at the microscopic level. This formalism covers many physiological transmembrane potential models including those in cardiac electrophysiology. Our main result is the proof of the uniqueness and a Lipschitz stability estimate of ion channels conductance parameters based on some observations on an arbitrary subdomain. The key idea is a Carleman estimate for a parabolic operator with multiple coefficients and an ordinary differential equation system.
Effect of alloy deformation on the average spacing parameters of non-deforming particles
International Nuclear Information System (INIS)
Fisher, J.; Gurland, J.
1980-02-01
It is shown on the basis of stereological definitions and a few simple experiments that the commonly used average dispersion parameters, area fraction $(A_A)_\beta$, areal particle density $N_{A\beta}$ and mean free path $\lambda_\alpha$, remain invariant during plastic deformation in the case of non-deforming equiaxed particles. Directional effects on the spacing parameters $N_{A\beta}$ and $\lambda_\alpha$ arise during uniaxial deformation by rotation and preferred orientation of nonequiaxed particles. Particle arrangement in stringered or layered structures and the effect of deformation on nearest neighbor distances of particles and voids are briefly discussed in relation to strength and fracture theories.
Covariance Method of the Tunneling Radiation from High Dimensional Rotating Black Holes
Li, Hui-Ling; Han, Yi-Wen; Chen, Shuai-Ru; Ding, Cong
2018-04-01
In this paper, the Angheben-Nadalini-Vanzo-Zerbini (ANVZ) covariance method is used to study the tunneling radiation from the Kerr-Gödel black hole and the Myers-Perry black hole with two independent angular momenta. By solving the Hamilton-Jacobi equation and separating the variables, the radial motion equation of a tunneling particle is obtained. Using near horizon approximation and the distance of the proper pure space, we calculate the tunneling rate and the temperature of Hawking radiation. Thus, the method of ANVZ covariance is extended to the research of high dimensional black hole tunneling radiation.
DEFF Research Database (Denmark)
Ding, Yunhong; Bacco, Davide; Dalgaard, Kjeld
2017-01-01
is intrinsically limited to 1 bit/photon. Here we propose and experimentally demonstrate, for the first time, a high-dimensional quantum key distribution protocol based on space division multiplexing in multicore fiber using silicon photonic integrated lightwave circuits. We successfully realized three mutually......-dimensional quantum states, and enables breaking the information efficiency limit of traditional quantum key distribution protocols. In addition, the silicon photonic circuits used in our work integrate variable optical attenuators, highly efficient multicore fiber couplers, and Mach-Zehnder interferometers, enabling...
Inferring biological tasks using Pareto analysis of high-dimensional data.
Hart, Yuval; Sheftel, Hila; Hausser, Jean; Szekely, Pablo; Ben-Moshe, Noa Bossel; Korem, Yael; Tendler, Avichai; Mayo, Avraham E; Alon, Uri
2015-03-01
We present the Pareto task inference method (ParTI; http://www.weizmann.ac.il/mcb/UriAlon/download/ParTI) for inferring biological tasks from high-dimensional biological data. Data are described as a polytope, and features maximally enriched closest to the vertices (or archetypes) allow identification of the tasks the vertices represent. We demonstrate that human breast tumors and mouse tissues are well described by tetrahedrons in gene expression space, with specific tumor types and biological functions enriched at each of the vertices, suggesting four key tasks.
Liu, W.; Wang, H.; Liu, D.; Miu, Y.
2018-05-01
Precise geometric parameters are essential to ensure the positioning accuracy of space optical cameras. However, state-of-the-art on-orbit calibration methods inevitably suffer from long update cycles and poor timeliness. To this end, in this paper we exploit the optical auto-collimation principle and propose a real-time onboard calibration scheme for monitoring key geometric parameters. Specifically, in the proposed scheme, auto-collimation devices are first designed by installing collimated light sources, area-array CCDs, and prisms inside the satellite payload system. Using those devices, the changes in the geometric parameters are converted into changes in the spot image positions. The variation of geometric parameters can be derived by extracting and processing the spot images. An experimental platform is then set up to verify the feasibility and analyze the precision index of the proposed scheme. The experimental results demonstrate that it is feasible to apply the optical auto-collimation principle for real-time onboard monitoring.
Parameter retrieval of chiral metamaterials based on the state-space approach.
Zarifi, Davoud; Soleimani, Mohammad; Abdolali, Ali
2013-08-01
This paper deals with the introduction of an approach for the electromagnetic characterization of homogeneous chiral layers. The proposed method is based on the state-space approach and properties of a 4×4 state transition matrix. Based on this, first, the forward problem analysis through the state-space method is reviewed and properties of the state transition matrix of a chiral layer are presented and proved as two theorems. The formulation of a proposed electromagnetic characterization method is then presented. In this method, scattering data for a linearly polarized plane wave incident normally on a homogeneous chiral slab are combined with properties of a state transition matrix and provide a powerful characterization method. The main difference with respect to other well-established retrieval procedures based on the use of the scattering parameters relies on the direct computation of the transfer matrix of the slab as opposed to the conventional calculation of the propagation constant and impedance of the modes supported by the medium. The proposed approach allows avoiding nonlinearity of the problem but requires getting enough equations to fulfill the task which was provided by considering some properties of the state transition matrix. To demonstrate the applicability and validity of the method, the constitutive parameters of two well-known dispersive chiral metamaterial structures at microwave frequencies are retrieved. The results show that the proposed method is robust and reliable.
A General 2D Meshless Interpolating Boundary Node Method Based on the Parameter Space
Directory of Open Access Journals (Sweden)
Hongyin Yang
2017-01-01
The presented study proposed an improved interpolating boundary node method (IIBNM) for 2D potential problems. The improved interpolating moving least-square (IIMLS) method was applied to construct the shape functions, of which the delta function properties and boundary conditions were directly implemented. In addition, any weight function used in the moving least-square (MLS) method was also applicable in the IIMLS method. Boundary cells were required in the computation of the boundary integrals, and additional discretization error was not avoided if traditional cells were used to approximate the geometry. The present study applied the parametric cells created in the parameter space to preserve the exact geometry, and the geometry was maintained due to the number of cells. Only the number of nodes on the boundary was required as additional information for boundary node construction. Most importantly, the IIMLS method can be applied in the parameter space to construct shape functions without the requirement of additional computations for the curve length.
On the identifiability of inertia parameters of planar Multi-Body Space Systems
Nabavi-Chashmi, Seyed Yaser; Malaek, Seyed Mohammad-Bagher
2018-04-01
This work describes a new formulation to study the identifiability characteristics of Serially Linked Multi-body Space Systems (SLMBSS). The process exploits the so called "Lagrange Formulation" to develop a linear form of Equations of Motion w.r.t the system Inertia Parameters (IPs). Having developed a specific form of regressor matrix, we aim to expedite the identification process. The new approach allows analytical as well as numerical identification and identifiability analysis for different SLMBSSs' configurations. Moreover, the explicit forms of SLMBSSs identifiable parameters are derived by analyzing the identifiability characteristics of the robot. We further show that any SLMBSS designed with Variable Configurations Joint allows all IPs to be identifiable through comparing two successive identification outcomes. This feature paves the way to design new class of SLMBSS for which accurate identification of all IPs is at hand. Different case studies reveal that proposed formulation provides fast and accurate results, as required by the space applications. Further studies might be necessary for cases where planar-body assumption becomes inaccurate.
Application of separable parameter space techniques to multi-tracer PET compartment modeling
International Nuclear Information System (INIS)
Zhang, Jeff L; Michael Morey, A; Kadrmas, Dan J
2016-01-01
Multi-tracer positron emission tomography (PET) can image two or more tracers in a single scan, characterizing multiple aspects of biological functions to provide new insights into many diseases. The technique uses dynamic imaging, resulting in time-activity curves that contain contributions from each tracer present. The process of separating and recovering separate images and/or imaging measures for each tracer requires the application of kinetic constraints, which are most commonly applied by fitting parallel compartment models for all tracers. Such multi-tracer compartment modeling presents challenging nonlinear fits in multiple dimensions. This work extends separable parameter space kinetic modeling techniques, previously developed for fitting single-tracer compartment models, to fitting multi-tracer compartment models. The multi-tracer compartment model solution equations were reformulated to maximally separate the linear and nonlinear aspects of the fitting problem, and separable least-squares techniques were applied to effectively reduce the dimensionality of the nonlinear fit. The benefits of the approach are then explored through a number of illustrative examples, including characterization of separable parameter space multi-tracer objective functions and demonstration of exhaustive search fits which guarantee the true global minimum to within arbitrary search precision. Iterative gradient-descent algorithms using Levenberg–Marquardt were also tested, demonstrating improved fitting speed and robustness as compared to corresponding fits using conventional model formulations. The proposed technique overcomes many of the challenges in fitting simultaneous multi-tracer PET compartment models. (paper)
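The separable least-squares idea can be sketched on a toy model. The example below assumes a hypothetical single-exponential curve $y(t) = a\,e^{-kt} + b$ rather than a PET compartment model: for each candidate rate $k$ the linear parameters $(a, b)$ are solved in closed form, so the exhaustive search runs over $k$ alone. The function names, the grid, and the synthetic data are illustrative and not taken from the paper.

```python
import math

def fit_linear_part(t, y, k):
    """For a fixed nonlinear rate k, solve the 2x2 normal equations for
    the linear parameters (a, b) in y ~ a*exp(-k*t) + b."""
    e = [math.exp(-k * ti) for ti in t]
    n = len(t)
    s_ee = sum(ei * ei for ei in e)
    s_e1 = sum(e)
    r_e = sum(ei * yi for ei, yi in zip(e, y))
    r_1 = sum(y)
    det = s_ee * n - s_e1 * s_e1
    a = (r_e * n - s_e1 * r_1) / det
    b = (s_ee * r_1 - s_e1 * r_e) / det
    sse = sum((a * ei + b - yi) ** 2 for ei, yi in zip(e, y))
    return a, b, sse

def separable_fit(t, y, k_grid):
    """Exhaustive search over the nonlinear parameter only."""
    best = None
    for k in k_grid:
        a, b, sse = fit_linear_part(t, y, k)
        if best is None or sse < best[3]:
            best = (k, a, b, sse)
    return best

# Synthetic, noise-free data with a = 2, k = 0.5, b = 1 (made-up values).
t = [0.5 * i for i in range(20)]
y = [2.0 * math.exp(-0.5 * ti) + 1.0 for ti in t]
k_grid = [0.01 * j for j in range(1, 201)]
k_hat, a_hat, b_hat, sse = separable_fit(t, y, k_grid)
```

The three-parameter nonlinear fit is reduced to a one-dimensional search, which is what makes an exhaustive search to a chosen grid precision affordable.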
Constraining the mSUGRA parameter space through entropy and abundance criteria
International Nuclear Information System (INIS)
Cabral-Rosetti, Luis G.; Mondragon, Myriam; Nunez, Dario; Sussman, Roberto A.; Zavala, Jesus; Nellen, Lukas
2007-01-01
We explore the use of two criteria to constrain the allowed parameter space in mSUGRA models; both criteria are based on the calculation of the present density of neutralinos $\chi^0$ as Dark Matter in the Universe. The first one is the usual "abundance" criterion, which requires that the present neutralino relic density complies with $0.0945 < \Omega_{\mathrm{CDM}} h^2 < 0.1287$, which are the $2\sigma$ bounds according to WMAP. To calculate the relic density we use the public numerical code micrOMEGAS. The second criterion is the original idea presented in [3], which basically applies the microcanonical definition of entropy to a weakly interacting and self-gravitating gas, and then evaluates the change in entropy per particle of this gas between the freeze-out era and present day virialized structures. An "entropy consistency" criterion emerges by comparing theoretical and empirical estimates of this entropy. One of the objectives of the work is to analyze the joint application of both criteria, already done in [3], to see if their results, using approximations for the calculations of the relic density, agree with the results coming from the exact numerical results of micrOMEGAS. The main objective of the work is to use this method to constrain the parameter space in mSUGRA models that are inputs for the calculations of micrOMEGAS, and thus to get some bounds on the predictions for the SUSY spectra.
The effect of environmental parameters to dust concentration in air-conditioned space
Ismail, A. M. M.; Manssor, N. A. S.; Nalisa, A.; Yahaya, N.
2017-08-01
Malaysia has a wet and hot climate, so most indoor spaces are air-conditioned. The environment might affect the dust concentration inside a space and hence the indoor air quality (IAQ). The main objective of this study is to examine the dust concentration collected inside enclosed air-conditioned spaces. Measurements were taken on site at four selected offices and two classrooms, using several instruments to measure the dust concentration and the environmental parameters, namely temperature and relative air humidity. The highest dust concentration found in the offices was 0.075 mg/m3 (temperature of 24.7°C, relative humidity of 66.5%), whereas the highest found in the classrooms was 0.060 mg/m3 (temperature of 25.9°C, relative humidity of 64.0%). Both values are still within the safety levels set by DOSH Malaysia (2005-2010) and ASHRAE 62.2 2016. The offices contained a higher dust concentration than the classrooms because of the frequent daily movement associated with the function of the offices.
Fast estimation of space-robots inertia parameters: A modular mathematical formulation
Nabavi Chashmi, Seyed Yaser; Malaek, Seyed Mohammad-Bagher
2016-10-01
This work aims to propose a new technique that considerably improves the time and precision needed to identify the "Inertia Parameters (IPs)" of a typical Autonomous Space-Robot (ASR). Operations might include capturing an unknown Target Space-Object (TSO), "active space-debris removal" or "automated in-orbit assemblies". In these operations, generating precise successive commands is essential to the success of the mission. We show how a generalized, repeatable estimation process could play an effective role in managing the operation. With the help of the well-known Force-Based approach, a new "modular formulation" has been developed to simultaneously identify IPs of an ASR while it captures a TSO. The idea is to reorganize the equations with associated IPs with a "Modular Set" of matrices instead of a single matrix representing the overall system dynamics. The devised Modular Matrix Set will then facilitate the estimation process. It provides a conjugate linear model in mass and inertia terms. The new formulation is, therefore, well-suited for "simultaneous estimation processes" using recursive algorithms like RLS. Further enhancements would be needed for cases where the effect of center of mass location becomes important. Extensive case studies reveal that estimation time is drastically reduced, which in turn paves the way to acquire better results.
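Since the abstract points to recursive algorithms such as RLS, a minimal scalar sketch may help. It assumes a deliberately simplified, hypothetical model $F = m\,a$ with a single unknown mass, which is far simpler than the modular multi-body formulation described above; the function name, the initial covariance `p0`, and the sample data are illustrative.

```python
def rls_scalar(phis, ys, theta0=0.0, p0=1e6, forgetting=1.0):
    """Recursive least squares for the scalar linear model y = phi * theta.

    phis: regressors (here: measured accelerations)
    ys:   observations (here: applied forces)
    Returns the final estimate and the history of estimates.
    """
    theta, p = theta0, p0
    history = []
    for phi, y in zip(phis, ys):
        gain = p * phi / (forgetting + phi * p * phi)
        theta += gain * (y - phi * theta)        # correct by the prediction error
        p = (p - gain * phi * p) / forgetting    # update the estimate covariance
        history.append(theta)
    return theta, history

# Made-up noise-free samples generated with a true mass of 3.0.
accels = [0.5, -1.0, 2.0, 1.5, -0.7]
forces = [3.0 * a for a in accels]
mass_hat, history = rls_scalar(accels, forces)
print(mass_hat)
```

Each new sample refines the estimate without re-solving the whole problem, which is why a model that is linear in the inertia parameters suits on-line identification.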
The cross-validated AUC for MCP-logistic regression with high-dimensional data.
Jiang, Dingfeng; Huang, Jian; Zhang, Ying
2013-10-01
We propose a cross-validated area under the receiver operating characteristic (ROC) curve (CV-AUC) criterion for tuning parameter selection for penalized methods in sparse, high-dimensional logistic regression models. We use this criterion in combination with the minimax concave penalty (MCP) method for variable selection. The CV-AUC criterion is specifically designed for optimizing the classification performance for binary outcome data. To implement the proposed approach, we derive an efficient coordinate descent algorithm to compute the MCP-logistic regression solution surface. Simulation studies are conducted to evaluate the finite sample performance of the proposed method and its comparison with the existing methods including the Akaike information criterion (AIC), Bayesian information criterion (BIC) or Extended BIC (EBIC). The model selected based on the CV-AUC criterion tends to have a larger predictive AUC and smaller classification error than those with tuning parameters selected using the AIC, BIC or EBIC. We illustrate the application of the MCP-logistic regression with the CV-AUC criterion on three microarray datasets from the studies that attempt to identify genes related to cancers. Our simulation studies and data examples demonstrate that the CV-AUC is an attractive method for tuning parameter selection for penalized methods in high-dimensional logistic regression models.
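Two ingredients named in this abstract can be written down compactly. The sketch below shows the standard MCP penalty function and a pairwise (Mann-Whitney) AUC; the `cv_auc` helper simply averages per-fold AUCs, which is an assumption made here for illustration and may differ from the exact criterion defined in the paper. The coordinate descent solver is not reproduced, and all function names are hypothetical.

```python
def mcp_penalty(beta, lam, gamma):
    """Minimax concave penalty for one coefficient (lam > 0, gamma > 1)."""
    b = abs(beta)
    if b <= gamma * lam:
        return lam * b - b * b / (2.0 * gamma)
    return 0.5 * gamma * lam * lam  # the penalty flattens for large coefficients

def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly,
    counting ties as one half."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def cv_auc(scores_by_fold, labels_by_fold):
    """Average held-out AUC over folds (an assumed form of the criterion)."""
    values = [auc(s, l) for s, l in zip(scores_by_fold, labels_by_fold)]
    return sum(values) / len(values)

print(auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1]))
```

In a tuning loop one would compute such a cross-validated AUC for each candidate penalty level and keep the level with the largest value.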
Virtual walks in spin space: A study in a family of two-parameter models
Mullick, Pratik; Sen, Parongama
2018-05-01
We investigate the dynamics of classical spins mapped as walkers in a virtual "spin" space using a generalized two-parameter family of spin models characterized by parameters $y$ and $z$ [de Oliveira et al., J. Phys. A 26, 2317 (1993), 10.1088/0305-4470/26/10/006]. The behavior of $S(x,t)$, the probability that the walker is at position $x$ at time $t$, is studied in detail. In general $S(x,t) \sim t^{-\alpha} f(x/t^{\alpha})$ with $\alpha \simeq 1$ or $0.5$ at large times depending on the parameters. In particular, $S(x,t)$ for the point $y=1$, $z=0.5$ corresponding to the Voter model shows a crossover in time; associated with this crossover, two timescales can be defined which vary with the system size $L$ as $L^2 \log L$. We also show that as the Voter model point is approached from the disordered regions along different directions, the width of the Gaussian distribution $S(x,t)$ diverges in a power law manner with different exponents. For the majority Voter case, the results indicate that the virtual walk can detect the phase transition perhaps more efficiently compared to other nonequilibrium methods.
Halogenation of Hydraulic Fracturing Additives in the Shale Well Parameter Space
Sumner, A. J.; Plata, D.
2017-12-01
Horizontal drilling and hydraulic fracturing (HDHF) involves the deep-well injection of a 'fracking fluid' composed of diverse and numerous chemical additives designed to facilitate the release and collection of natural gas from shale plays. The potential impacts of HDHF operations on water resources and ecosystems are numerous, and analyses of flowback samples revealed organic compounds from both geogenic and anthropogenic sources. Furthermore, halogenated chemicals were also detected, and these compounds are rarely disclosed, suggesting the in situ halogenation of reactive additives. To test this transformation hypothesis, we designed and operated a novel high pressure and temperature reactor system to simulate the shale well parameter space and investigate the chemical reactivity of twelve commonly disclosed and functionally diverse HDHF additives. Early results revealed an unanticipated halogenation pathway of an α,β-unsaturated aldehyde, cinnamaldehyde, in the presence of oxidant and concentrated brine. Ongoing experiments over a range of parameters informed a proposed mechanism, demonstrating the role of various shale-well specific parameters in enabling the demonstrated halogenation pathway. Ultimately, these results will inform a host of potentially unintended interactions of HDHF additives during the extreme conditions down-bore of a shale well during HDHF activities.
International Nuclear Information System (INIS)
Tripathy, Rohit; Bilionis, Ilias; Gonzalez, Marcial
2016-01-01
Uncertainty quantification (UQ) tasks, such as model calibration, uncertainty propagation, and optimization under uncertainty, typically require several thousand evaluations of the underlying computer codes. To cope with the cost of simulations, one replaces the real response surface with a cheap surrogate based, e.g., on polynomial chaos expansions, neural networks, support vector machines, or Gaussian processes (GP). However, the number of simulations required to learn a generic multivariate response grows exponentially as the input dimension increases. This curse of dimensionality can only be addressed if the response exhibits some special structure that can be discovered and exploited. A wide range of physical responses exhibit a special structure known as an active subspace (AS). An AS is a linear manifold of the stochastic space characterized by maximal response variation. The idea is that one should first identify this low dimensional manifold, project the high-dimensional input onto it, and then link the projection to the output. If the dimensionality of the AS is low enough, then learning the link function is a much easier problem than the original problem of learning a high-dimensional function. The classic approach to discovering the AS requires gradient information, a fact that severely limits its applicability. Furthermore, and partly because of its reliance on gradients, it is not able to handle noisy observations. The latter is an essential trait if one wants to be able to propagate uncertainty through stochastic simulators, e.g., through molecular dynamics codes. In this work, we develop a probabilistic version of AS which is gradient-free and robust to observational noise. Our approach relies on a novel Gaussian process regression with built-in dimensionality reduction. In particular, the AS is represented as an orthogonal projection matrix that serves as yet another covariance function hyper-parameter to be estimated from the data. To train the
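For orientation, the classic gradient-based construction that this abstract contrasts itself with can be sketched in a few lines. This is not the gradient-free probabilistic method proposed by the authors: it assumes gradients are available (here by finite differences), averages the outer products of sampled gradients, and extracts the dominant direction by power iteration. The test function, the sampling range, and the function names are hypothetical.

```python
import math
import random

def gradient(f, x, h=1e-5):
    """Central finite-difference gradient of f at x."""
    g = []
    for i in range(len(x)):
        xp, xm = x[:], x[:]
        xp[i] += h
        xm[i] -= h
        g.append((f(xp) - f(xm)) / (2.0 * h))
    return g

def leading_direction(f, dim, n_samples=200, seed=0, n_iter=50):
    """Dominant eigenvector of C = E[grad f grad f^T], i.e. a
    one-dimensional active subspace, via power iteration."""
    rng = random.Random(seed)
    C = [[0.0] * dim for _ in range(dim)]
    for _ in range(n_samples):
        x = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
        g = gradient(f, x)
        for i in range(dim):
            for j in range(dim):
                C[i][j] += g[i] * g[j] / n_samples
    v = [1.0 / math.sqrt(dim)] * dim
    for _ in range(n_iter):
        w = [sum(C[i][j] * v[j] for j in range(dim)) for i in range(dim)]
        norm = math.sqrt(sum(wi * wi for wi in w))
        v = [wi / norm for wi in w]
    return v

# Hypothetical response that varies only along w_true (a ridge function).
w_true = [0.6, 0.8, 0.0]

def response(x):
    return math.sin(sum(wi * xi for wi, xi in zip(w_true, x)))

v = leading_direction(response, 3)
```

For this ridge function every gradient is parallel to `w_true`, so the recovered direction aligns with it; the dependence on gradients is exactly the limitation that motivates the gradient-free approach described in the abstract.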
Tripathy, Rohit; Bilionis, Ilias; Gonzalez, Marcial
2016-09-01
Energy Technology Data Exchange (ETDEWEB)
Tripathy, Rohit, E-mail: rtripath@purdue.edu; Bilionis, Ilias, E-mail: ibilion@purdue.edu; Gonzalez, Marcial, E-mail: marcial-gonzalez@purdue.edu
2016-09-15
Hawking radiation of a high-dimensional rotating black hole
Energy Technology Data Exchange (ETDEWEB)
Zhao, Ren; Zhang, Lichun; Li, Huaifan; Wu, Yueqin [Shanxi Datong University, Institute of Theoretical Physics, Department of Physics, Datong (China)]
2010-01-15
We extend the classical Damour-Ruffini method and discuss the Hawking radiation spectrum of a high-dimensional rotating black hole using a tortoise coordinate transformation defined by taking the reaction of the radiation on the spacetime into consideration. Under the condition that energy and angular momentum are conserved, and taking self-gravitation into account, we derive Hawking radiation spectra that satisfy the unitarity principle of quantum mechanics. It is shown that the emission of particles with energy ω by the black hole is a continuous tunneling process. We provide a theoretical basis for further studying the physical mechanism of black-hole radiation. (orig.)
On spectral distribution of high dimensional covariation matrices
DEFF Research Database (Denmark)
Heinrich, Claudio; Podolskij, Mark
In this paper we present the asymptotic theory for spectral distributions of high dimensional covariation matrices of Brownian diffusions. More specifically, we consider N-dimensional Itô integrals with time varying matrix-valued integrands. We observe n equidistant high frequency data points...... of the underlying Brownian diffusion and we assume that N/n → c ∈ (0, ∞). We show that under a certain mixed spectral moment condition the spectral distribution of the empirical covariation matrix converges in distribution almost surely. Our proof relies on the method of moments and applications of graph theory....
High-dimensional quantum channel estimation using classical light
CSIR Research Space (South Africa)
Mabena, Chemist M
2017-11-01
PHYSICAL REVIEW A 96, 053860... (2017). High-dimensional quantum channel estimation using classical light. Chemist M. Mabena, CSIR National Laser Centre, P.O. Box 395, Pretoria 0001, South Africa and School of Physics, University of the Witwatersrand, Johannesburg 2000, South...
Ghosts in high dimensional non-linear dynamical systems: The example of the hypercycle
International Nuclear Information System (INIS)
Sardanyes, Josep
2009-01-01
Ghost-induced delayed transitions are analyzed in high dimensional non-linear dynamical systems by means of the hypercycle model. The hypercycle is a network of catalytically-coupled self-replicating RNA-like macromolecules, and has been suggested to be involved in the transition from non-living to living matter in the context of earlier prebiotic evolution. It is demonstrated that, in the vicinity of the saddle-node bifurcation for symmetric hypercycles, the persistence time before extinction, T_ε, tends to infinity as n→∞ (n being the number of units of the hypercycle), thus suggesting that an increase in the number of hypercycle units involves a longer resilient time before extinction because of the ghost. Furthermore, by means of numerical analysis the dynamics of three large hypercycle networks is also studied, focusing on their extinction dynamics associated with the ghosts. Such networks allow exploration of the properties of the ghosts living in high dimensional phase space with n = 5, n = 10 and n = 15 dimensions. These hypercyclic networks, in agreement with other works, are shown to exhibit self-maintained oscillations governed by stable limit cycles. The bifurcation scenarios for these hypercycles are analyzed, as well as the effect of the phase space dimensionality on the delayed transition phenomena and on the scaling properties of the ghosts near the bifurcation threshold.
Reducing the Complexity of Genetic Fuzzy Classifiers in Highly-Dimensional Classification Problems
Directory of Open Access Journals (Sweden)
Dimitris G. Stavrakoudis
2012-04-01
This paper introduces the Fast Iterative Rule-based Linguistic Classifier (FaIRLiC), a Genetic Fuzzy Rule-Based Classification System (GFRBCS) which aims to reduce the structural complexity of the resulting rule base, as well as the computational requirements of its learning algorithm, especially when dealing with high-dimensional feature spaces. The proposed methodology follows the principles of the iterative rule learning (IRL) approach, whereby a rule extraction algorithm (REA) is invoked in an iterative fashion, producing one fuzzy rule at a time. The REA is performed in two successive steps: the first one selects the relevant features of the currently extracted rule, whereas the second one decides the antecedent part of the fuzzy rule, using the previously selected subset of features. The performance of the classifier is finally optimized through a genetic tuning post-processing stage. Comparative results in a hyperspectral remote sensing classification as well as in 12 real-world classification datasets indicate the effectiveness of the proposed methodology in generating high-performing and compact fuzzy rule-based classifiers, even for very high-dimensional feature spaces.
Schaefer, Andreas; Wenzel, Friedemann
2017-04-01
Subduction zones are generally the sources of the earthquakes with the highest magnitudes. Not only in Japan or Chile, but also in Pakistan, the Solomon Islands or the Lesser Antilles, subduction zones pose a significant hazard for the population. To understand the behavior of subduction zones, especially to identify their capability to produce maximum magnitude earthquakes, various physical models have been developed, leading to a large number of datasets, e.g. from geodesy, geomagnetics, structural geology, etc. There have been various studies that used these data to compile a database of subduction zone parameters, but most concentrated on only the major zones. Here, we compile the largest dataset of subduction zone parameters, both in parameter diversity and in the number of subduction zones considered. In total, more than 70 individual sources have been assessed, and the aforementioned parametric data have been combined with seismological data and many more sources, leading to more than 60 individual parameters. Not all parameters have been resolved for each zone, since the data completeness depends on the data availability and quality for each source. In addition, the 3D down-dip geometry of a majority of the subduction zones has been resolved using historical earthquake hypocenter data and centroid moment tensors where available, and additionally compared and verified with results from previous studies. With such a database, a statistical study has been undertaken to identify not only correlations between those parameters, in order to derive a parameter-driven way to identify potentials for maximum possible magnitudes, but also similarities between the sources themselves. This identification of similarities leads to a classification system for subduction zones. Here, it can be expected that if two sources share enough common characteristics, other characteristics of interest may be similar as well. This concept
Xue, Zhang-Na; Yu, Ya-Jun; Tian, Xiao-Geng
2017-07-01
Based upon coupled thermoelasticity and the Green and Lindsay theory, new governing equations of two-temperature thermoelastic theory with a thermal nonlocal parameter are formulated. To more realistically model thermal loading of a half-space surface, a linear temperature ramping function is adopted. Laplace transform techniques are used to get the general analytical solutions in the Laplace domain, and the inverse Laplace transforms based on Fourier expansion techniques are numerically implemented to obtain the numerical solutions in the time domain. Specific attention is paid to the effect of the thermal nonlocal parameter, ramping time, and two-temperature parameter on the distributions of temperature, displacement and stress.
International Nuclear Information System (INIS)
Parzen, G.
1997-01-01
It will be shown that starting from a coordinate system where the 6 phase space coordinates are linearly coupled, one can go to a new coordinate system, where the motion is uncoupled, by means of a linear transformation. The original coupled coordinates and the new uncoupled coordinates are related by a 6 x 6 matrix, R. It will be shown that of the 36 elements of the 6 x 6 decoupling matrix R, only 12 elements are independent. A set of equations is given from which the 12 elements of R can be computed from the one period transfer matrix. This set of equations also allows the linear parameters, the β_i, α_i, i = 1, 3, for the uncoupled coordinates, to be computed from the one period transfer matrix.
DRAGON solutions to the 3D transport benchmark over a range in parameter space
International Nuclear Information System (INIS)
Martin, Nicolas; Hebert, Alain; Marleau, Guy
2010-01-01
DRAGON solutions to the 'NEA suite of benchmarks for 3D transport methods and codes over a range in parameter space' are discussed in this paper. A description of the benchmark is first provided, followed by a detailed review of the different computational models used in the lattice code DRAGON. Two numerical methods were selected for generating the required quantities for the 729 configurations of this benchmark. First, S_N calculations were performed using fully symmetric angular quadratures and high-order diamond differencing for spatial discretization. To compare S_N results with those of another deterministic method, the method of characteristics (MoC) was also considered for this benchmark. Comparisons between reference solutions, S_N and MoC results illustrate the advantages and drawbacks of each method for this 3-D transport problem.
Constraints on pre-big-bang parameter space from CMBR anisotropies
International Nuclear Information System (INIS)
Bozza, V.; Gasperini, M.; Giovannini, M.; Veneziano, G.
2003-01-01
The so-called curvaton mechanism, a way to convert isocurvature perturbations into adiabatic ones, is investigated both analytically and numerically in a pre-big-bang scenario where the role of the curvaton is played by a sufficiently massive Kalb-Ramond axion of superstring theory. When combined with observations of CMBR anisotropies at large and moderate angular scales, the present analysis allows us to constrain quite considerably the parameter space of the model: in particular, the initial displacement of the axion from the minimum of its potential and the rate of evolution of the compactification volume during pre-big-bang inflation. The combination of theoretical and experimental constraints favors a slightly blue spectrum of scalar perturbations, and/or a value of the string scale in the vicinity of the SUSY GUT scale.
A hybrid method of estimating pulsating flow parameters in the space-time domain
Pałczyński, Tomasz
2017-05-01
This paper presents a method for estimating pulsating flow parameters in partially open pipes, such as pipelines, internal combustion engine inlets, exhaust pipes and piston compressors. The procedure is based on the method of characteristics, and employs a combination of measurements and simulations. An experimental test rig is described, which enables pressure, temperature and mass flow rate to be measured within a defined cross section. The second part of the paper discusses the main assumptions of a simulation algorithm elaborated in the Matlab/Simulink environment. The simulation results are shown as 3D plots in the space-time domain, and compared with proposed models of phenomena relating to wave propagation, boundary conditions, acoustics and fluid mechanics. The simulation results are finally compared with acoustic phenomena, with an emphasis on the identification of resonant frequencies.
Constraints on pre-big bang parameter space from CMBR anisotropies
Bozza, Valerio; Giovannini, Massimo; Veneziano, Gabriele
2003-01-01
The so-called curvaton mechanism, a way to convert isocurvature perturbations into adiabatic ones, is investigated both analytically and numerically in a pre-big bang scenario where the role of the curvaton is played by a sufficiently massive Kalb-Ramond axion of superstring theory. When combined with observations of CMBR anisotropies at large and moderate angular scales, the present analysis allows us to constrain quite considerably the parameter space of the model: in particular, the initial displacement of the axion from the minimum of its potential and the rate of evolution of the compactification volume during pre-big bang inflation. The combination of theoretical and experimental constraints favours a slightly blue spectrum of scalar perturbations, and/or a value of the string scale in the vicinity of the SUSY-GUT scale.
High-Dimensional Adaptive Particle Swarm Optimization on Heterogeneous Systems
International Nuclear Information System (INIS)
Wachowiak, M P; Sarlo, B B; Foster, A E Lambe
2014-01-01
Much work has recently been reported in parallel GPU-based particle swarm optimization (PSO). Motivated by the encouraging results of these investigations, while also recognizing the limitations of GPU-based methods for big problems using a large amount of data, this paper explores the efficacy of employing other types of parallel hardware for PSO. Most commodity systems feature a variety of architectures whose high-performance capabilities can be exploited. In this paper, high-dimensional problems and those that employ a large amount of external data are explored within the context of heterogeneous systems. Large problems are decomposed into constituent components, and analyses are undertaken of which components would benefit from multi-core or GPU parallelism. The current study therefore provides another demonstration that "supercomputing on a budget" is possible when subtasks of large problems are run on hardware most suited to these tasks. Experimental results show that large speedups can be achieved on high dimensional, data-intensive problems. Cost functions must first be analysed for parallelization opportunities, and assigned hardware based on the particular task.
High-dimensional single-cell cancer biology.
Irish, Jonathan M; Doxie, Deon B
2014-01-01
Cancer cells are distinguished from each other and from healthy cells by features that drive clonal evolution and therapy resistance. New advances in high-dimensional flow cytometry make it possible to systematically measure mechanisms of tumor initiation, progression, and therapy resistance on millions of cells from human tumors. Here we describe flow cytometry techniques that enable a "single-cell" view of cancer. High-dimensional techniques like mass cytometry enable multiplexed single-cell analysis of cell identity, clinical biomarkers, signaling network phospho-proteins, transcription factors, and functional readouts of proliferation, cell cycle status, and apoptosis. This capability pairs well with a signaling profiles approach that dissects mechanism by systematically perturbing and measuring many nodes in a signaling network. Single-cell approaches enable study of cellular heterogeneity of primary tissues and turn cell subsets into experimental controls or opportunities for new discovery. Rare populations of stem cells or therapy-resistant cancer cells can be identified and compared to other types of cells within the same sample. In the long term, these techniques will enable tracking of minimal residual disease (MRD) and disease progression. By better understanding biological systems that control development and cell-cell interactions in healthy and diseased contexts, we can learn to program cells to become therapeutic agents or target malignant signaling events to specifically kill cancer cells. Single-cell approaches that provide deep insight into cell signaling and fate decisions will be critical to optimizing the next generation of cancer treatments combining targeted approaches and immunotherapy.
Drummond, Alexei J; Nicholls, Geoff K; Rodrigo, Allen G; Solomon, Wiremu
2002-07-01
Molecular sequences obtained at different sampling times from populations of rapidly evolving pathogens and from ancient subfossil and fossil sources are increasingly available with modern sequencing technology. Here, we present a Bayesian statistical inference approach to the joint estimation of mutation rate and population size that incorporates the uncertainty in the genealogy of such temporally spaced sequences by using Markov chain Monte Carlo (MCMC) integration. The Kingman coalescent model is used to describe the time structure of the ancestral tree. We recover information about the unknown true ancestral coalescent tree, population size, and the overall mutation rate from temporally spaced data, that is, from nucleotide sequences gathered at different times, from different individuals, in an evolving haploid population. We briefly discuss the methodological implications and show what can be inferred, in various practically relevant states of prior knowledge. We develop extensions for exponentially growing population size and joint estimation of substitution model parameters. We illustrate some of the important features of this approach on a genealogy of HIV-1 envelope (env) partial sequences.
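The Kingman coalescent named in this abstract can be illustrated with a short simulation. This is a minimal sketch under simplifying assumptions: all sequences are sampled at the same time (the paper's contribution is precisely the handling of different sampling times, which is not reproduced here), the population size is constant, and time is measured in coalescent units. The function names are hypothetical.

```python
import random


def simulate_coalescent_times(n, rng):
    """Simulate the n - 1 coalescence times of a Kingman coalescent tree.

    While k lineages remain, the waiting time to the next merger is
    exponential with rate k * (k - 1) / 2 (one unit of rate per pair).
    """
    times, t = [], 0.0
    for k in range(n, 1, -1):
        t += rng.expovariate(k * (k - 1) / 2.0)
        times.append(t)
    return times


def expected_tmrca(n):
    """Expected time to the most recent common ancestor: sum of 2 / (k (k - 1))."""
    return sum(2.0 / (k * (k - 1)) for k in range(2, n + 1))


times = simulate_coalescent_times(10, random.Random(1))
mean_root_age = expected_tmrca(10)  # telescopes to 2 * (1 - 1/10) = 1.8
```

In an MCMC scheme of the kind the abstract describes, a density built from these waiting times would serve as the prior on the tree's time structure, with the sequence likelihood supplying the rest of the posterior.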
Energy Technology Data Exchange (ETDEWEB)
Plesko, Catherine S [Los Alamos National Laboratory; Clement, R Ryan [Los Alamos National Laboratory; Weaver, Robert P [Los Alamos National Laboratory; Bradley, Paul A [Los Alamos National Laboratory; Huebner, Walter F [Los Alamos National Laboratory
2009-01-01
The mitigation of impact hazards resulting from Earth-approaching asteroids and comets has received much attention in the popular press. However, many questions remain about the near-term and long-term feasibility and appropriate application of all proposed methods. Recent and ongoing ground- and space-based observations of small solar-system body composition and dynamics have revolutionized our understanding of these bodies (e.g., Ryan (2000), Fujiwara et al. (2006), and Jedicke et al. (2006)). Ongoing increases in computing power and algorithm sophistication make it possible to calculate the response of these inhomogeneous objects to proposed mitigation techniques. Here we present the first phase of a comprehensive hazard mitigation planning effort undertaken by Southwest Research Institute and Los Alamos National Laboratory. We begin by reviewing the parameter space of the object's physical and chemical composition and trajectory. We then use the radiation hydrocode RAGE (Gittings et al. 2008), Monte Carlo N-Particle (MCNP) radiation transport (see Clement et al., this conference), and N-body dynamics codes to explore the effects these variations in object properties have on the coupling of energy into the object from a variety of mitigation techniques, including deflection and disruption by nuclear and conventional munitions, and a kinetic impactor.
Reconciling Planck with the local value of H0 in extended parameter space
Directory of Open Access Journals (Sweden)
Eleonora Di Valentino
2016-10-01
The recent determination of the local value of the Hubble constant by Riess et al., 2016 (hereafter R16) is now 3.3 sigma higher than the value derived from the most recent CMB anisotropy data provided by the Planck satellite in a ΛCDM model. Here we perform a combined analysis of the Planck and R16 results in an extended parameter space, varying simultaneously 12 cosmological parameters instead of the usual 6. We find that a phantom-like dark energy component, with effective equation of state $w = -1.29^{+0.15}_{-0.12}$ at 68% c.l., can solve the current tension between the Planck dataset and the R16 prior in an extended ΛCDM scenario. On the other hand, the neutrino effective number is fully compatible with standard expectations. This result is confirmed when including cosmic shear data from the CFHTLenS survey and CMB lensing constraints from Planck. However, when BAO measurements are included we find that some of the tension with R16 remains, as is also the case when we include the supernova type Ia luminosity distances from the JLA catalog.
GMC COLLISIONS AS TRIGGERS OF STAR FORMATION. I. PARAMETER SPACE EXPLORATION WITH 2D SIMULATIONS
Energy Technology Data Exchange (ETDEWEB)
Wu, Benjamin [Department of Physics, University of Florida, Gainesville, FL 32611 (United States); Loo, Sven Van [School of Physics and Astronomy, University of Leeds, Leeds LS2 9JT (United Kingdom); Tan, Jonathan C. [Departments of Astronomy and Physics, University of Florida, Gainesville, FL 32611 (United States); Bruderer, Simon, E-mail: benwu@phys.ufl.edu [Max Planck Institute for Extraterrestrial Physics, Giessenbachstrasse 1, D-85748 Garching (Germany)
2015-09-20
We utilize magnetohydrodynamic (MHD) simulations to develop a numerical model for giant molecular cloud (GMC)–GMC collisions between nearly magnetically critical clouds. The goal is to determine if, and under what circumstances, cloud collisions can cause pre-existing magnetically subcritical clumps to become supercritical and undergo gravitational collapse. We first develop and implement new photodissociation region based heating and cooling functions that span the atomic to molecular transition, creating a multiphase ISM and allowing modeling of non-equilibrium temperature structures. Then in 2D and with ideal MHD, we explore a wide parameter space of magnetic field strength, magnetic field geometry, collision velocity, and impact parameter and compare isolated versus colliding clouds. We find factors of ∼2–3 increase in mean clump density from typical collisions, with strong dependence on collision velocity and magnetic field strength, but ultimately limited by flux-freezing in 2D geometries. For geometries enabling flow along magnetic field lines, greater degrees of collapse are seen. We discuss observational diagnostics of cloud collisions, focussing on $^{13}$CO(J = 2–1), $^{13}$CO(J = 3–2), and $^{12}$CO(J = 8–7) integrated intensity maps and spectra, which we synthesize from our simulation outputs. We find that the ratio of J = 8–7 to lower-J emission is a powerful diagnostic probe of GMC collisions.
Energy Technology Data Exchange (ETDEWEB)
Fatima, Zareen; Motosugi, Utaroh; Ishigame, Keiichi; Araki, Tsutomu [University of Yamanashi, Department of Radiology, Chuo-shi, Yamanashi (Japan); Waqar, Ahmed Bilal [University of Yamanashi, Department of Molecular Pathology, Interdisciplinary Graduate School of Medicine and Engineering, Chuo-shi, Yamanashi (Japan); Hori, Masaaki [Juntendo University, Department of Radiology, School of Medicine, Tokyo (Japan); Oishi, Naoki; Katoh, Ryohei [University of Yamanashi, Department of Pathology, Chuo-shi, Yamanashi (Japan); Onodera, Toshiyuki; Yagi, Kazuo [Tokyo Metropolitan University, Department of Radiological Sciences, Graduate School of Human Health Sciences, Tokyo (Japan)
2013-08-15
The purposes of this MR-based study were to calculate q-space imaging (QSI)-derived mean displacement (MDP) in meningiomas, to evaluate the correlation of MDP values with apparent diffusion coefficient (ADC) and to investigate the relationships among these diffusion parameters, tumour cell count (TCC) and MIB-1 labelling index (LI). MRI, including QSI and conventional diffusion-weighted imaging (DWI), was performed in 44 meningioma patients (52 lesions). ADC and MDP maps were acquired from post-processing of the data. Quantitative analyses of these maps were performed by applying regions of interest. Pearson correlation coefficients were calculated for ADC and MDP in all lesions and for ADC and TCC, MDP and TCC, ADC and MIB-1 LI, and MDP and MIB-1 LI in 17 patients who underwent subsequent surgery. ADC and MDP values were found to have a strong correlation: r = 0.78 (P < 0.0001). Both ADC and MDP values had a significant negative association with TCC: r = -0.53 (P = 0.02) and -0.48 (P = 0.04), respectively. MIB-1 LI was not, however, found to have a significant association with these diffusion parameters. In meningiomas, both ADC and MDP may be representative of cell density. (orig.)
International Nuclear Information System (INIS)
Fatima, Zareen; Motosugi, Utaroh; Ishigame, Keiichi; Araki, Tsutomu; Waqar, Ahmed Bilal; Hori, Masaaki; Oishi, Naoki; Katoh, Ryohei; Onodera, Toshiyuki; Yagi, Kazuo
2013-01-01
He, Ling Yan; Wang, Tie-Jun; Wang, Chuan
2016-07-11
High-dimensional quantum systems provide a higher quantum channel capacity, which has potential applications in quantum information processing. However, high-dimensional universal quantum logic gates are difficult to achieve directly with only high-dimensional interaction between two quantum systems, and a large number of two-dimensional gates are required to build even a small high-dimensional quantum circuit. In this paper, we propose a scheme to implement a general controlled-flip (CF) gate where the high-dimensional single photon serves as the target qudit and stationary qubits work as the control logic qudit, by employing a three-level Λ-type system coupled with a whispering-gallery-mode microresonator. In our scheme, the required number of interactions between the photon and the solid state system is greatly reduced compared with the traditional method, which decomposes the high-dimensional Hilbert space into 2-dimensional quantum spaces, and the experimental realization works on a shorter temporal scale. Moreover, we discuss the performance and feasibility of our hybrid CF gate, concluding that it can be easily extended to a 2n-dimensional case and that it is feasible with current technology.
Class prediction for high-dimensional class-imbalanced data
Directory of Open Access Journals (Sweden)
Lusa Lara
2010-10-01
Background: The goal of class prediction studies is to develop rules to accurately predict the class membership of new samples. The rules are derived using the values of the variables available for each subject: the main characteristic of high-dimensional data is that the number of variables greatly exceeds the number of samples. Frequently the classifiers are developed using class-imbalanced data, i.e., data sets where the number of samples in each class is not equal. Standard classification methods used on class-imbalanced data often produce classifiers that do not accurately predict the minority class; the prediction is biased towards the majority class. In this paper we investigate if the high-dimensionality poses additional challenges when dealing with class-imbalanced prediction. We evaluate the performance of six types of classifiers on class-imbalanced data, using simulated data and a publicly available data set from a breast cancer gene-expression microarray study. We also investigate the effectiveness of some strategies that are available to overcome the effect of class imbalance. Results: Our results show that the evaluated classifiers are highly sensitive to class imbalance and that variable selection introduces an additional bias towards classification into the majority class. Most new samples are assigned to the majority class from the training set, unless the difference between the classes is very large. As a consequence, the class-specific predictive accuracies differ considerably. When the class imbalance is not too severe, down-sizing and asymmetric bagging embedding variable selection work well, while over-sampling does not. Variable normalization can further worsen the performance of the classifiers. Conclusions: Our results show that matching the prevalence of the classes in training and test set does not guarantee good performance of classifiers and that the problems related to classification with class
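The down-sizing strategy mentioned in the results can be sketched in a few lines. This is a minimal sketch that reads "down-sizing" as random undersampling of the larger classes to the size of the smallest one; that reading, the function name `downsize`, and the toy 90/10 split are assumptions for illustration, not details taken from the paper.

```python
import random
from collections import defaultdict


def downsize(samples, labels, seed=0):
    """Randomly drop samples from larger classes until all classes are equal in size."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_class[label].append(sample)
    smallest = min(len(group) for group in by_class.values())
    kept_samples, kept_labels = [], []
    for label, group in by_class.items():
        # rng.sample draws without replacement, so no sample is duplicated.
        for sample in rng.sample(group, smallest):
            kept_samples.append(sample)
            kept_labels.append(label)
    return kept_samples, kept_labels


# Hypothetical toy data: 90 majority-class and 10 minority-class samples.
samples = list(range(100))
labels = [0] * 90 + [1] * 10
balanced_samples, balanced_labels = downsize(samples, labels)
```

Down-sizing discards information from the majority class, which is one plausible reason the abstract limits its recommendation to cases where the imbalance is not too severe.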
High-dimensional change-point estimation: Combining filtering with convex optimization
Soh, Yong Sheng; Chandrasekaran, Venkat
2017-01-01
We consider change-point estimation in a sequence of high-dimensional signals given noisy observations. Classical approaches to this problem such as the filtered derivative method are useful for sequences of scalar-valued signals, but they have undesirable scaling behavior in the high-dimensional setting. However, many high-dimensional signals encountered in practice frequently possess latent low-dimensional structure. Motivated by this observation, we propose a technique for high-dimensional...
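The classical filtered derivative idea mentioned in this abstract can be sketched for a scalar sequence as follows. This is only a minimal illustration of the baseline, not the high-dimensional method the authors propose; the window length and the toy signal are assumptions made for the example.

```python
def filtered_derivative(signal, window):
    """Absolute difference between the mean of the next `window` samples
    and the mean of the previous `window` samples, for each valid index."""
    n = len(signal)
    stats = {}
    for t in range(window, n - window + 1):
        left = sum(signal[t - window:t]) / window
        right = sum(signal[t:t + window]) / window
        stats[t] = abs(right - left)
    return stats


# Hypothetical noisy scalar signal whose mean jumps from about 0 to about 2 at index 6.
signal = [0.1, -0.2, 0.0, 0.2, -0.1, 0.1, 2.1, 1.9, 2.0, 2.2, 1.8, 2.0]
stats = filtered_derivative(signal, window=3)
change_point = max(stats, key=stats.get)
print("estimated change-point index:", change_point)
```

In practice the statistic would be compared against a threshold so that several change-points can be reported; taking the single maximum keeps the sketch short.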
Applying recursive numerical integration techniques for solving high dimensional integrals
International Nuclear Information System (INIS)
Ammon, Andreas; Genz, Alan; Hartung, Tobias; Jansen, Karl; Volmer, Julia; Leoevey, Hernan
2016-11-01
The error scaling for Markov-Chain Monte Carlo techniques (MCMC) with N samples behaves like 1/√(N). This scaling makes it often very time intensive to reduce the error of computed observables, in particular for applications in lattice QCD. It is therefore highly desirable to have alternative methods at hand which show an improved error scaling. One candidate for such an alternative integration technique is the method of recursive numerical integration (RNI). The basic idea of this method is to use an efficient low-dimensional quadrature rule (usually of Gaussian type) and apply it iteratively to integrate over high-dimensional observables and Boltzmann weights. We present the application of such an algorithm to the topological rotor and the anharmonic oscillator and compare the error scaling to MCMC results. In particular, we demonstrate that the RNI technique shows an error scaling in the number of integration points m that is at least exponential.
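To make the contrast in error scaling concrete, the sketch below compares a sampling estimate with a low-order Gaussian quadrature rule on a one-dimensional test integral. It is not the authors' RNI algorithm: plain independent sampling stands in for MCMC, and the integrand exp(x) on [0, 1], the sample count and the 3-point rule are all assumptions chosen for illustration.

```python
import math
import random


def monte_carlo(f, n, seed=0):
    """Plain Monte Carlo estimate of the integral of f over [0, 1]."""
    rng = random.Random(seed)
    return sum(f(rng.random()) for _ in range(n)) / n


def gauss_legendre_3(f):
    """3-point Gauss-Legendre estimate of the integral of f over [0, 1]."""
    # Standard nodes and weights on [-1, 1].
    nodes = (-math.sqrt(3.0 / 5.0), 0.0, math.sqrt(3.0 / 5.0))
    weights = (5.0 / 9.0, 8.0 / 9.0, 5.0 / 9.0)
    # Affine map from [-1, 1] to [0, 1] contributes a factor 1/2.
    return 0.5 * sum(w * f(0.5 * (x + 1.0)) for w, x in zip(weights, nodes))


exact = math.e - 1.0
mc_error = abs(monte_carlo(math.exp, 10_000) - exact)
gl_error = abs(gauss_legendre_3(math.exp) - exact)
print(f"Monte Carlo, N = 10000 samples: error {mc_error:.2e}")
print(f"Gauss-Legendre, m = 3 points:   error {gl_error:.2e}")
```

For a smooth integrand like this one, three quadrature points typically beat thousands of random samples, which is the motivation for applying such rules recursively in higher dimensions.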
High-dimensional cluster analysis with the Masked EM Algorithm
Kadir, Shabnam N.; Goodman, Dan F. M.; Harris, Kenneth D.
2014-01-01
Cluster analysis faces two problems in high dimensions: first, the “curse of dimensionality” that can lead to overfitting and poor generalization performance; and second, the sheer time taken for conventional algorithms to process large amounts of high-dimensional data. We describe a solution to these problems, designed for the application of “spike sorting” for next-generation high channel-count neural probes. In this problem, only a small subset of features provide information about the cluster membership of any one data vector, but this informative feature subset is not the same for all data points, rendering classical feature selection ineffective. We introduce a “Masked EM” algorithm that allows accurate and time-efficient clustering of up to millions of points in thousands of dimensions. We demonstrate its applicability to synthetic data, and to real-world high-channel-count spike sorting data. PMID:25149694
Applying recursive numerical integration techniques for solving high dimensional integrals
Energy Technology Data Exchange (ETDEWEB)
Ammon, Andreas [IVU Traffic Technologies AG, Berlin (Germany)]; Genz, Alan [Washington State Univ., Pullman, WA (United States). Dept. of Mathematics]; Hartung, Tobias [King's College, London (United Kingdom). Dept. of Mathematics]; Jansen, Karl; Volmer, Julia [Deutsches Elektronen-Synchrotron (DESY), Zeuthen (Germany). John von Neumann-Inst. fuer Computing NIC]; Leoevey, Hernan [Humboldt Univ. Berlin (Germany). Inst. fuer Mathematik]
2016-11-15
The error scaling for Markov-Chain Monte Carlo techniques (MCMC) with N samples behaves like 1/√(N). This scaling makes it often very time intensive to reduce the error of computed observables, in particular for applications in lattice QCD. It is therefore highly desirable to have alternative methods at hand which show an improved error scaling. One candidate for such an alternative integration technique is the method of recursive numerical integration (RNI). The basic idea of this method is to use an efficient low-dimensional quadrature rule (usually of Gaussian type) and apply it iteratively to integrate over high-dimensional observables and Boltzmann weights. We present the application of such an algorithm to the topological rotor and the anharmonic oscillator and compare the error scaling to MCMC results. In particular, we demonstrate that the RNI technique shows an error scaling in the number of integration points m that is at least exponential.
Reduced order surrogate modelling (ROSM) of high dimensional deterministic simulations
Mitry, Mina
Often, computationally expensive engineering simulations can prohibit the engineering design process. As a result, designers may turn to a less computationally demanding approximate, or surrogate, model to facilitate their design process. However, owing to the curse of dimensionality, classical surrogate models become too computationally expensive for high dimensional data. To address this limitation of classical methods, we develop linear and non-linear Reduced Order Surrogate Modelling (ROSM) techniques. Two algorithms are presented, which are based on a combination of linear/kernel principal component analysis and radial basis functions. These algorithms are applied to subsonic and transonic aerodynamic data, as well as a model for a chemical spill in a channel. The results of this thesis show that ROSM can provide a significant computational benefit over classical surrogate modelling, sometimes at the expense of a minor loss in accuracy.
Asymptotics of empirical eigenstructure for high dimensional spiked covariance.
Wang, Weichen; Fan, Jianqing
2017-06-01
We derive the asymptotic distributions of the spiked eigenvalues and eigenvectors under a generalized and unified asymptotic regime, which takes into account the magnitude of spiked eigenvalues, sample size, and dimensionality. This regime allows high dimensionality and diverging eigenvalues and provides new insights into the roles that the leading eigenvalues, sample size, and dimensionality play in principal component analysis. Our results are a natural extension of those in Paul (2007) to a more general setting and solve the rates of convergence problems in Shen et al. (2013). They also reveal the biases of estimating leading eigenvalues and eigenvectors by using principal component analysis, and lead to a new covariance estimator for the approximate factor model, called shrinkage principal orthogonal complement thresholding (S-POET), that corrects the biases. Our results are successfully applied to outstanding problems in estimation of risks of large portfolios and false discovery proportions for dependent test statistics and are illustrated by simulation studies.
Directory of Open Access Journals (Sweden)
Laurent Berge
2012-01-01
Full Text Available This paper presents the R package HDclassif which is devoted to the clustering and the discriminant analysis of high-dimensional data. The classification methods proposed in the package result from a new parametrization of the Gaussian mixture model which combines the idea of dimension reduction and model constraints on the covariance matrices. The supervised classification method using this parametrization is called high dimensional discriminant analysis (HDDA). In a similar manner, the associated clustering method is called high dimensional data clustering (HDDC) and uses the expectation-maximization algorithm for inference. In order to correctly fit the data, both methods estimate the specific subspace and the intrinsic dimension of the groups. Due to the constraints on the covariance matrices, the number of parameters to estimate is significantly lower than other model-based methods and this allows the methods to be stable and efficient in high dimensions. Two introductory examples illustrated with R codes allow the user to discover the hdda and hddc functions. Experiments on simulated and real datasets also compare HDDC and HDDA with existing classification methods on high-dimensional datasets. HDclassif is a free software and distributed under the general public license, as part of the R software project.
Finding viable models in SUSY parameter spaces with signal specific discovery potential
Burgess, Thomas; Lindroos, Jan Øye; Lipniacka, Anna; Sandaker, Heidi
2013-08-01
Recent results from ATLAS giving a Higgs mass of 125.5 GeV, further constrain already highly constrained supersymmetric models such as pMSSM or CMSSM/mSUGRA. As a consequence, finding potentially discoverable and non-excluded regions of model parameter space is becoming increasingly difficult. Several groups have invested large effort in studying the consequences of Higgs mass bounds, upper limits on rare B-meson decays, and limits on relic dark matter density on constrained models, aiming at predicting superpartner masses, and establishing likelihood of SUSY models compared to that of the Standard Model vis-à-vis experimental data. In this paper a framework for efficient search for discoverable, non-excluded regions of different SUSY spaces giving specific experimental signature of interest is presented. The method employs an improved Markov Chain Monte Carlo (MCMC) scheme exploiting an iteratively updated likelihood function to guide search for viable models. Existing experimental and theoretical bounds as well as the LHC discovery potential are taken into account. This includes recent bounds on relic dark matter density, the Higgs sector and rare B-meson decays. A clustering algorithm is applied to classify selected models according to expected phenomenology enabling automated choice of experimental benchmarks and regions to be used for optimizing searches. The aim is to provide experimentalists with a viable tool helping to target experimental signatures to search for, once a class of models of interest is established. As an example a search for viable CMSSM models with τ-lepton signatures observable with the 2012 LHC data set is presented. In the search 105209 unique models were probed. From these, ten reference benchmark points covering different ranges of phenomenological observables at the LHC were selected.
Zhang, Bo; Chen, Zhen; Albert, Paul S
2012-01-01
High-dimensional biomarker data are often collected in epidemiological studies when assessing the association between biomarkers and human disease is of interest. We develop a latent class modeling approach for joint analysis of high-dimensional semicontinuous biomarker data and a binary disease outcome. To model the relationship between complex biomarker expression patterns and disease risk, we use latent risk classes to link the 2 modeling components. We characterize complex biomarker-specific differences through biomarker-specific random effects, so that different biomarkers can have different baseline (low-risk) values as well as different between-class differences. The proposed approach also accommodates data features that are common in environmental toxicology and other biomarker exposure data, including a large number of biomarkers, numerous zero values, and complex mean-variance relationship in the biomarkers levels. A Monte Carlo EM (MCEM) algorithm is proposed for parameter estimation. Both the MCEM algorithm and model selection procedures are shown to work well in simulations and applications. In applying the proposed approach to an epidemiological study that examined the relationship between environmental polychlorinated biphenyl (PCB) exposure and the risk of endometriosis, we identified a highly significant overall effect of PCB concentrations on the risk of endometriosis.
Briseño, Jessica; Herrera, Graciela S.
2010-05-01
Herrera (1998) proposed a method for the optimal design of groundwater quality monitoring networks that involves space and time in a combined form. The method was applied later by Herrera et al (2001) and by Herrera and Pinder (2005). To get the estimates of the contaminant concentration being analyzed, this method uses a space-time ensemble Kalman filter, based on a stochastic flow and transport model. When the method is applied, it is important that the characteristics of the stochastic model be congruent with field data, but, in general, it is laborious to manually achieve a good match between them. For this reason, the main objective of this work is to extend the space-time ensemble Kalman filter proposed by Herrera, to estimate the hydraulic conductivity, together with hydraulic head and contaminant concentration, and its application in a synthetic example. The method has three steps: 1) Given the mean and the semivariogram of the natural logarithm of hydraulic conductivity (ln K), random realizations of this parameter are obtained through two alternatives: Gaussian simulation (SGSim) and Latin Hypercube Sampling method (LHC). 2) The stochastic model is used to produce hydraulic head (h) and contaminant (C) realizations, for each one of the conductivity realizations. With these realizations the means of ln K, h and C are obtained (for h and C, the mean is calculated in space and time), and also the cross covariance matrix h-ln K-C in space and time. The covariance matrix is obtained by averaging products of the ln K, h and C realizations on the estimation points and times, and the positions and times with data of the analyzed variables. The estimation points are the positions at which estimates of ln K, h or C are gathered. In an analogous way, the estimation times are those at which estimates of any of the three variables are gathered. 3) Finally the ln K, h and C estimates are obtained using the space-time ensemble Kalman filter. The realization mean for each one
Remote sensing of refractivity from space for global observations of atmospheric parameters
International Nuclear Information System (INIS)
Gorbunov, M.E.; Sokolovskiy, S.V.
1993-01-01
This report presents the first results of computational simulations on the retrieval of meteorological parameters from space refractometric data on the basis of the ECHAM 3 model developed at the Max Planck Institute for Meteorology (Roeckner et al. 1992). For this purpose the grid fields of temperature, geopotential and humidity available from the model were interpolated and a continuous spatial field of refractivity (together with its first derivative) was generated. This field was used for calculating the trajectories of electromagnetic rays for the given orbits of transmitting and receiving satellites and for the determination of the quantities (incident angles or Doppler frequency shifts) being measured at receiving satellite during occultation. These quantities were then used for solving the inverse problem - retrieving the distribution of refractivity in the vicinity of the ray perigees. The retrieved refractivity was used to calculate pressure and temperature (using the hydrostatic equation and the equation of state). The results were compared with initial data, and the retrieval errors were evaluated. The study shows that the refractivity can be retrieved with very high accuracy in particular if a tomographic reconstruction is applied. Effects of humidity and temperature are not separable. Stratospheric temperatures globally and upper tropospheric temperatures at middle and high latitudes can be accurately retrieved, other areas require humidity data. Alternatively humidity data can be retrieved if the temperature fields are known. (orig.)
Directory of Open Access Journals (Sweden)
O. Tkachenko
2017-06-01
Full Text Available Physicochemical and biochemical indices, which characterize quality of white wine grape varieties Zagrey and Aromatnyi of selection of NNC «IV&W named after V. Ye. Tairov» (harvest of 2016), were determined. The field trial, which includes various variants of planting density and vine training systems, made it possible to study the influence of viticulture practices on the criteria of carbohydrate-acid and phenolic complex, oxidative enzyme system of grapes. Low-density plantings of Aromatnyi variety (2222 vines per ha) were characterized by harvest that slightly exceeded the grapes obtained from dense plantations (4000 vines per ha) in terms of carbohydrate-acid and phenolic complexes. The most optimal in terms of the mass concentration of sugars, phenolic substances, polymer forms, macerating ability of must, activity of oxidizing enzyme system was cultivation of this variety on a 160 cm high trunk. Growing grapes of Zagrey variety with vine spacing, corresponding to 4000 plants per ha, contributed to obtaining harvest with optimal parameters of carbohydrate-acid complex, low technological reserve and mass concentration of phenolic compounds, moderate macerating ability and activity of monophenol monooxygenase in must. Training vines of this variety on a 40 cm high trunk with vertical shoot positioning led to significant deterioration of grape quality due to increased content of phenolic substances and their polymer forms, high macerating capacity of must.
International Nuclear Information System (INIS)
Funk, J.G.; Sykes, G.F. Jr.
1989-04-01
The effects of simulated space environmental parameters on microdamage induced by the environment in a series of commercially available graphite-fiber-reinforced composite materials were determined. Composites with both thermoset and thermoplastic resin systems were studied. Low-Earth-Orbit (LEO) exposures were simulated by thermal cycling; geosynchronous-orbit (GEO) exposures were simulated by electron irradiation plus thermal cycling. The thermal cycling temperature range was -250 F to either 200 F or 150 F. The upper limits of the thermal cycles were different to ensure that an individual composite material was not cycled above its glass transition temperature. Material response was characterized through assessment of the induced microcracking and its influence on mechanical property changes at both room temperature and -250 F. Microdamage was induced in both thermoset and thermoplastic advanced composite materials exposed to the simulated LEO environment. However, a 350 F cure single-phase toughened epoxy composite was not damaged during exposure to the LEO environment. The simulated GEO environment produced microdamage in all materials tested
Displacement in the parameter space versus spurious solution of discretization with large time step
International Nuclear Information System (INIS)
Mendes, Eduardo; Letellier, Christophe
2004-01-01
In order to investigate a possible correspondence between differential and difference equations, it is important to possess discretization of ordinary differential equations. It is well known that when differential equations are discretized, the solution thus obtained depends on the time step used. In the majority of cases, such a solution is considered spurious when it does not resemble the expected solution of the differential equation. This often happens when the time step taken into consideration is too large. In this work, we show that, even for quite large time steps, some solutions which do not correspond to the expected ones are still topologically equivalent to solutions of the original continuous system if a displacement in the parameter space is considered. To reduce such a displacement, a judicious choice of the discretization scheme should be made. To this end, a recent discretization scheme, based on the Lie expansion of the original differential equations, proposed by Monaco and Normand-Cyrot will be analysed. Such a scheme will be shown to be sufficient for providing an adequate discretization for quite large time steps compared to the pseudo-period of the underlying dynamics
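A small numerical experiment can illustrate how a large time step changes the qualitative behaviour of a discretized system. The sketch below uses the forward Euler scheme on the logistic equation dx/dt = x(1 - x); both choices are assumptions made for illustration and are not the Lie-expansion scheme of Monaco and Normand-Cyrot analysed in the paper.

```python
def euler_orbit(h, x0=0.5, steps=200):
    """Iterate the forward-Euler map x -> x + h*x*(1 - x) for dx/dt = x(1 - x)."""
    x = x0
    orbit = [x]
    for _ in range(steps):
        x = x + h * x * (1.0 - x)
        orbit.append(x)
    return orbit


small = euler_orbit(h=0.5)  # small step: settles on the fixed point x = 1
large = euler_orbit(h=2.3)  # large step: the fixed point is unstable for the map
print("h = 0.5, last two iterates:", small[-2:])
print("h = 2.3, last two iterates:", large[-2:])
```

The continuous equation relaxes monotonically to x = 1, whereas the Euler map with h = 2.3 oscillates between two values. Under the change of variable y = h x / (1 + h), this map is the logistic map with parameter 1 + h, so the time step acts here like a shift of a parameter in a related discrete system.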
FORECASTING COSMOLOGICAL PARAMETER CONSTRAINTS FROM NEAR-FUTURE SPACE-BASED GALAXY SURVEYS
International Nuclear Information System (INIS)
Pavlov, Anatoly; Ratra, Bharat; Samushia, Lado
2012-01-01
The next generation of space-based galaxy surveys is expected to measure the growth rate of structure to a level of about one percent over a range of redshifts. The rate of growth of structure as a function of redshift depends on the behavior of dark energy and so can be used to constrain parameters of dark energy models. In this work, we investigate how well these future data will be able to constrain the time dependence of the dark energy density. We consider parameterizations of the dark energy equation of state, such as XCDM and ωCDM, as well as a consistent physical model of time-evolving scalar field dark energy, φCDM. We show that if the standard, spatially flat cosmological model is taken as a fiducial model of the universe, these near-future measurements of structure growth will be able to constrain the time dependence of scalar field dark energy density to a precision of about 10%, which is almost an order of magnitude better than what can be achieved from a compilation of currently available data sets.
Yu, Wenbao; Park, Taesung
2014-01-01
It is common to seek an optimal combination of markers for disease classification and prediction when multiple markers are available. Many approaches based on the area under the receiver operating characteristic curve (AUC) have been proposed. Existing works based on AUC in a high-dimensional context depend mainly on a non-parametric, smooth approximation of AUC, with no work using a parametric AUC-based approach for high-dimensional data. We propose an AUC-based approach using penalized regression (AucPR), which is a parametric method used for obtaining a linear combination for maximizing the AUC. To obtain the AUC maximizer in a high-dimensional context, we transform a classical parametric AUC maximizer, which is used in a low-dimensional context, into a regression framework and thus apply the penalization regression approach directly. Two kinds of penalization, lasso and elastic net, are considered. The parametric approach can avoid some of the difficulties of a conventional non-parametric AUC-based approach, such as the lack of an appropriate concave objective function and a prudent choice of the smoothing parameter. We apply the proposed AucPR for gene selection and classification using four real microarray and synthetic data. Through numerical studies, AucPR is shown to perform better than the penalized logistic regression and the nonparametric AUC-based method, in the sense of AUC and sensitivity for a given specificity, particularly when there are many correlated genes. We propose a powerful parametric and easily-implementable linear classifier AucPR, for gene selection and disease prediction for high-dimensional data. AucPR is recommended for its good prediction performance. Besides gene expression microarray data, AucPR can be applied to other types of high-dimensional omics data, such as miRNA and protein data.
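The quantity being maximized, the AUC of a linear combination of markers, can be computed empirically as sketched below. This only evaluates the objective for given weights using the Mann-Whitney form of the AUC; it is not the AucPR estimator, and the toy marker values and weights are hypothetical.

```python
def empirical_auc(scores_pos, scores_neg):
    """Mann-Whitney estimate of AUC: the fraction of (diseased, healthy) pairs
    in which the diseased subject scores higher, with ties counted as 1/2."""
    wins = 0.0
    for sp in scores_pos:
        for sn in scores_neg:
            if sp > sn:
                wins += 1.0
            elif sp == sn:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


def combine(markers, weights):
    """Linear combination of the marker values of each subject."""
    return [sum(w * m for w, m in zip(weights, row)) for row in markers]


# Hypothetical toy data: two markers per subject.
diseased = [[2.0, 1.0], [1.5, 2.5], [3.0, 0.5]]
healthy = [[1.0, 1.0], [0.5, 2.0], [2.0, 0.0]]

auc_single = empirical_auc(combine(diseased, [1.0, 0.0]), combine(healthy, [1.0, 0.0]))
auc_combined = empirical_auc(combine(diseased, [1.0, 1.0]), combine(healthy, [1.0, 1.0]))
print("first marker alone:", round(auc_single, 3))
print("sum of both markers:", round(auc_combined, 3))
```

The empirical AUC is a step function of the weights, which is why the approaches discussed in the abstract work with either a smooth approximation or a parametric form.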
Explorations on High Dimensional Landscapes: Spin Glasses and Deep Learning
Sagun, Levent
This thesis deals with understanding the structure of high-dimensional and non-convex energy landscapes. In particular, its focus is on the optimization of two classes of functions: homogeneous polynomials and loss functions that arise in machine learning. In the first part, the notion of complexity of a smooth, real-valued function is studied through its critical points. Existing theoretical results predict that certain random functions that are defined on high dimensional domains have a narrow band of values whose pre-image contains the bulk of its critical points. This section provides empirical evidence for convergence of gradient descent to local minima whose energies are near the predicted threshold justifying the existing asymptotic theory. Moreover, it is empirically shown that a similar phenomenon may hold for deep learning loss functions. Furthermore, there is a comparative analysis of gradient descent and its stochastic version showing that in high dimensional regimes the latter is a mere speedup. The next study focuses on the halting time of an algorithm at a given stopping condition. Given an algorithm, the normalized fluctuations of the halting time follow a distribution that remains unchanged even when the input data is sampled from a new distribution. Two qualitative classes are observed: a Gumbel-like distribution that appears in Google searches, human decision times, and spin glasses and a Gaussian-like distribution that appears in conjugate gradient method, deep learning with MNIST and random input data. Following the universality phenomenon, the Hessian of the loss functions of deep learning is studied. The spectrum is seen to be composed of two parts, the bulk which is concentrated around zero, and the edges which are scattered away from zero. Empirical evidence is presented for the bulk indicating how over-parametrized the system is, and for the edges that depend on the input data. Furthermore, an algorithm is proposed such that it would
Saini, Harsh; Lal, Sunil Pranit; Naidu, Vimal Vikash; Pickering, Vincel Wince; Singh, Gurmeet; Tsunoda, Tatsuhiko; Sharma, Alok
2016-12-05
High dimensional feature space generally degrades classification in several applications. In this paper, we propose a strategy called gene masking, in which non-contributing dimensions are heuristically removed from the data to improve classification accuracy. Gene masking is implemented via a binary encoded genetic algorithm that can be integrated seamlessly with classifiers during the training phase of classification to perform feature selection. It can also be used to discriminate between features that contribute most to the classification, thereby, allowing researchers to isolate features that may have special significance. This technique was applied on publicly available datasets whereby it substantially reduced the number of features used for classification while maintaining high accuracies. The proposed technique can be extremely useful in feature selection as it heuristically removes non-contributing features to improve the performance of classifiers.
A sparse grid based method for generative dimensionality reduction of high-dimensional data
Bohn, Bastian; Garcke, Jochen; Griebel, Michael
2016-03-01
Generative dimensionality reduction methods play an important role in machine learning applications because they construct an explicit mapping from a low-dimensional space to the high-dimensional data space. We discuss a general framework to describe generative dimensionality reduction methods, where the main focus lies on a regularized principal manifold learning variant. Since most generative dimensionality reduction algorithms exploit the representer theorem for reproducing kernel Hilbert spaces, their computational costs grow at least quadratically in the number n of data. Instead, we introduce a grid-based discretization approach which automatically scales just linearly in n. To circumvent the curse of dimensionality of full tensor product grids, we use the concept of sparse grids. Furthermore, in real-world applications, some embedding directions are usually more important than others and it is reasonable to refine the underlying discretization space only in these directions. To this end, we employ a dimension-adaptive algorithm which is based on the ANOVA (analysis of variance) decomposition of a function. In particular, the reconstruction error is used to measure the quality of an embedding. As an application, the study of large simulation data from an engineering application in the automotive industry (car crash simulation) is performed.
Progress in high-dimensional percolation and random graphs
Heydenreich, Markus
2017-01-01
This text presents an engaging exposition of the active field of high-dimensional percolation that will likely provide an impetus for future work. With over 90 exercises designed to enhance the reader’s understanding of the material, as well as many open problems, the book is aimed at graduate students and researchers who wish to enter the world of this rich topic. The text may also be useful in advanced courses and seminars, as well as for reference and individual study. Part I, consisting of 3 chapters, presents a general introduction to percolation, stating the main results, defining the central objects, and proving its main properties. No prior knowledge of percolation is assumed. Part II, consisting of Chapters 4–9, discusses mean-field critical behavior by describing the two main techniques used, namely, differential inequalities and the lace expansion. In Parts I and II, all results are proved, making this the first self-contained text discussing high-dimensional percolation. Part III, consist...
Effects of dependence in high-dimensional multiple testing problems
Directory of Open Access Journals (Sweden)
van de Wiel Mark A
2008-02-01
Full Text Available Abstract Background We consider effects of dependence among variables of high-dimensional data in multiple hypothesis testing problems, in particular the False Discovery Rate (FDR) control procedures. Recent simulation studies consider only simple correlation structures among variables, which is hardly inspired by real data features. Our aim is to systematically study effects of several network features like sparsity and correlation strength by imposing dependence structures among variables using random correlation matrices. Results We study the robustness against dependence of several FDR procedures that are popular in microarray studies, such as Benjamini-Hochberg FDR, Storey's q-value, SAM and resampling based FDR procedures. False Non-discovery Rates and estimates of the number of null hypotheses are computed from those methods and compared. Our simulation study shows that methods such as SAM and the q-value do not adequately control the FDR to the level claimed under dependence conditions. On the other hand, the adaptive Benjamini-Hochberg procedure seems to be most robust while remaining conservative. Finally, the estimates of the number of true null hypotheses under various dependence conditions are variable. Conclusion We discuss a new method for efficient guided simulation of dependent data, which satisfy imposed network constraints as conditional independence structures. Our simulation set-up allows for a structural study of the effect of dependencies on multiple testing criteria and is useful for testing a potentially new method on π0 or FDR estimation in a dependency context.
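For readers unfamiliar with the procedures being compared, the standard Benjamini-Hochberg step-up rule can be written in a few lines. The sketch below is the plain (non-adaptive) version, not the adaptive variant the abstract singles out, and the p-values are made up for illustration.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans marking the hypotheses rejected by the
    Benjamini-Hochberg step-up rule at nominal FDR level alpha."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k (1-based) with p_(k) <= k * alpha / m.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * alpha / m:
            k_max = rank
    # Reject the k_max hypotheses with the smallest p-values.
    rejected = [False] * m
    for idx in order[:k_max]:
        rejected[idx] = True
    return rejected


# Hypothetical p-values from eight tests.
p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
print(benjamini_hochberg(p, alpha=0.05))
```

The FDR guarantee of this rule is proved under independence or certain forms of positive dependence among the test statistics, which is why its behaviour under more general correlation structures is studied by simulation.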
High-dimensional quantum cryptography with twisted light
International Nuclear Information System (INIS)
Mirhosseini, Mohammad; Magaña-Loaiza, Omar S; O’Sullivan, Malcolm N; Rodenburg, Brandon; Malik, Mehul; Boyd, Robert W; Lavery, Martin P J; Padgett, Miles J; Gauthier, Daniel J
2015-01-01
Quantum key distribution (QKD) systems often rely on polarization of light for encoding, thus limiting the amount of information that can be sent per photon and placing tight bounds on the error rates that such a system can tolerate. Here we describe a proof-of-principle experiment that indicates the feasibility of high-dimensional QKD based on the transverse structure of the light field allowing for the transfer of more than 1 bit per photon. Our implementation uses the orbital angular momentum (OAM) of photons and the corresponding mutually unbiased basis of angular position (ANG). Our experiment uses a digital micro-mirror device for the rapid generation of OAM and ANG modes at 4 kHz, and a mode sorter capable of sorting single photons based on their OAM and ANG content with a separation efficiency of 93%. Through the use of a seven-dimensional alphabet encoded in the OAM and ANG bases, we achieve a channel capacity of 2.05 bits per sifted photon. Our experiment demonstrates that, in addition to having an increased information capacity, multilevel QKD systems based on spatial-mode encoding can be more resilient against intercept-resend eavesdropping attacks. (paper)
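To see why a larger alphabet can carry more than 1 bit per photon, one can evaluate the mutual information of a d-symbol channel. The sketch below assumes a simple model in which a symbol error is equally likely to land on any of the d - 1 wrong symbols; this model and the error rates used are assumptions for illustration, not figures taken from the experiment.

```python
import math


def mutual_information(d, error_rate):
    """Bits per sifted photon for a d-symbol channel whose errors are spread
    uniformly over the d - 1 incorrect symbols."""
    e = error_rate
    info = math.log2(d)
    if e < 1.0:
        info += (1.0 - e) * math.log2(1.0 - e)
    if e > 0.0:
        info += e * math.log2(e / (d - 1))
    return info


for d in (2, 7):
    values = [round(mutual_information(d, e), 3) for e in (0.0, 0.05, 0.10)]
    print(f"d = {d}: {values}")
```

In the error-free limit the value is log2(d), about 2.81 bits for the seven-dimensional alphabet against 1 bit for a two-state encoding, so the reported 2.05 bits per sifted photon sits between the two.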
Inference for High-dimensional Differential Correlation Matrices.
Cai, T Tony; Zhang, Anru
2016-01-01
Motivated by differential co-expression analysis in genomics, we consider in this paper estimation and testing of high-dimensional differential correlation matrices. An adaptive thresholding procedure is introduced and theoretical guarantees are given. Minimax rate of convergence is established and the proposed estimator is shown to be adaptively rate-optimal over collections of paired correlation matrices with approximately sparse differences. Simulation results show that the procedure significantly outperforms two other natural methods that are based on separate estimation of the individual correlation matrices. The procedure is also illustrated through an analysis of a breast cancer dataset, which provides evidence at the gene co-expression level that several genes, of which a subset has been previously verified, are associated with the breast cancer. Hypothesis testing on the differential correlation matrices is also considered. A test, which is particularly well suited for testing against sparse alternatives, is introduced. In addition, other related problems, including estimation of a single sparse correlation matrix, estimation of the differential covariance matrices, and estimation of the differential cross-correlation matrices, are also discussed.
Bayesian Subset Modeling for High-Dimensional Generalized Linear Models
Liang, Faming
2013-06-01
This article presents a new prior setting for high-dimensional generalized linear models, which leads to a Bayesian subset regression (BSR) with the maximum a posteriori model approximately equivalent to the minimum extended Bayesian information criterion model. The consistency of the resulting posterior is established under mild conditions. Further, a variable screening procedure is proposed based on the marginal inclusion probability, which shares the same properties of sure screening and consistency with the existing sure independence screening (SIS) and iterative sure independence screening (ISIS) procedures. However, since the proposed procedure makes use of joint information from all predictors, it generally outperforms SIS and ISIS in real applications. This article also makes extensive comparisons of BSR with the popular penalized likelihood methods, including Lasso, elastic net, SIS, and ISIS. The numerical results indicate that BSR can generally outperform the penalized likelihood methods. The models selected by BSR tend to be sparser and, more importantly, of higher prediction ability. In addition, the performance of the penalized likelihood methods tends to deteriorate as the number of predictors increases, while this is not significant for BSR. Supplementary materials for this article are available online. © 2013 American Statistical Association.
Gene flow analysis method, the D-statistic, is robust in a wide parameter space.
Zheng, Yichen; Janke, Axel
2018-01-08
We evaluated the sensitivity of the D-statistic, a parsimony-like method widely used to detect gene flow between closely related species. This method has been applied to a variety of taxa with a wide range of divergence times. However, its parameter space and thus its applicability to a wide taxonomic range has not been systematically studied. Divergence time, population size, time of gene flow, distance of outgroup and number of loci were examined in a sensitivity analysis. The sensitivity study shows that the primary determinant of the D-statistic is the relative population size, i.e. the population size scaled by the number of generations since divergence. This is consistent with the fact that the main confounding factor in gene flow detection is incomplete lineage sorting, which dilutes the signal. The sensitivity of the D-statistic is also affected by the direction of gene flow, size and number of loci. In addition, we examined the ability of the f-statistics, [Formula: see text] and [Formula: see text], to estimate the fraction of a genome affected by gene flow; while these statistics are difficult to apply to practical questions in biology due to lack of knowledge of when the gene flow happened, they can be used to compare datasets with identical or similar demographic background. The D-statistic, as a method to detect gene flow, is robust against a wide range of genetic distances (divergence times) but it is sensitive to population size. The D-statistic should only be applied with critical reservation to taxa where population sizes are large relative to branch lengths in generations.
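For readers unfamiliar with the statistic, a minimal sketch of the commonly used ABBA-BABA form is given below, assuming one haploid sequence per taxon and biallelic sites. The function names and the toy sequences are hypothetical; the sketch does not reproduce the simulation setup of the study.

```python
def count_abba_baba(p1, p2, p3, outgroup):
    """Count ABBA and BABA site patterns in four aligned sequences.

    The outgroup allele is treated as ancestral (A); any other shared
    allele is treated as derived (B).
    """
    abba = 0
    baba = 0
    for a, b, c, o in zip(p1, p2, p3, outgroup):
        if a == o and b == c and b != o:
            abba += 1  # P2 and P3 share the derived allele
        elif b == o and a == c and a != o:
            baba += 1  # P1 and P3 share the derived allele
    return abba, baba


def d_statistic(n_abba, n_baba):
    """D = (ABBA - BABA) / (ABBA + BABA); values near 0 suggest no excess sharing."""
    total = n_abba + n_baba
    if total == 0:
        raise ValueError("no informative sites")
    return (n_abba - n_baba) / total


# Toy alignment: two ABBA sites, one BABA site, one uninformative site.
abba, baba = count_abba_baba("ATAA", "TATA", "TTTA", "AAAA")
d_value = d_statistic(abba, baba)
```

Under incomplete lineage sorting alone the two patterns are expected in equal numbers, which is why a large population size relative to branch length weakens the signal.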
Data-driven forecasting of high-dimensional chaotic systems with long short-term memory networks.
Vlachas, Pantelis R; Byeon, Wonmin; Wan, Zhong Y; Sapsis, Themistoklis P; Koumoutsakos, Petros
2018-05-01
We introduce a data-driven forecasting method for high-dimensional chaotic systems using long short-term memory (LSTM) recurrent neural networks. The proposed LSTM neural networks perform inference of high-dimensional dynamical systems in their reduced order space and are shown to be an effective set of nonlinear approximators of their attractor. We demonstrate the forecasting performance of the LSTM and compare it with Gaussian processes (GPs) in time series obtained from the Lorenz 96 system, the Kuramoto-Sivashinsky equation and a prototype climate model. The LSTM networks outperform the GPs in short-term forecasting accuracy in all applications considered. A hybrid architecture, extending the LSTM with a mean stochastic model (MSM-LSTM), is proposed to ensure convergence to the invariant measure. This novel hybrid method is fully data-driven and extends the forecasting capabilities of LSTM networks.
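As a concrete reference for one of the test systems named above, the sketch below integrates the Lorenz 96 equations, dx_i/dt = (x_{i+1} - x_{i-2}) x_{i-1} - x_i + F, with a fourth-order Runge-Kutta step. It only generates the kind of time series a forecasting model would be trained on; the LSTM itself is not shown. The dimension, forcing value, and step size are conventional choices assumed here, not values taken from the paper.

```python
import numpy as np


def lorenz96_rhs(x, forcing=8.0):
    """Right-hand side of the Lorenz 96 system with periodic indices."""
    # np.roll(x, -1)[i] is x[i+1]; np.roll(x, 2)[i] is x[i-2]; np.roll(x, 1)[i] is x[i-1].
    return (np.roll(x, -1) - np.roll(x, 2)) * np.roll(x, 1) - x + forcing


def rk4_step(x, dt, forcing=8.0):
    """Advance the state by one classical Runge-Kutta step."""
    k1 = lorenz96_rhs(x, forcing)
    k2 = lorenz96_rhs(x + 0.5 * dt * k1, forcing)
    k3 = lorenz96_rhs(x + 0.5 * dt * k2, forcing)
    k4 = lorenz96_rhs(x + dt * k3, forcing)
    return x + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0


def simulate(x0, n_steps, dt=0.01, forcing=8.0):
    """Return an array of shape (n_steps + 1, dim) holding the trajectory."""
    trajectory = np.empty((n_steps + 1, x0.size))
    trajectory[0] = x0
    for step in range(n_steps):
        trajectory[step + 1] = rk4_step(trajectory[step], dt, forcing)
    return trajectory


# Start near the unstable fixed point x_i = F with a small perturbation.
initial_state = np.full(40, 8.0)
initial_state[0] += 0.01
series = simulate(initial_state, n_steps=500)
```

A data-driven forecaster would then be fitted to windows of `series`, possibly after projecting onto a reduced-order basis as the abstract describes.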
Joint Adaptive Mean-Variance Regularization and Variance Stabilization of High Dimensional Data.
Dazard, Jean-Eudes; Rao, J Sunil
2012-07-01
The paper addresses a common problem in the analysis of high-dimensional high-throughput "omics" data, which is parameter estimation across multiple variables in a set of data where the number of variables is much larger than the sample size. Among the problems posed by this type of data are that variable-specific estimators of variances are not reliable and variable-wise test statistics have low power, both due to a lack of degrees of freedom. In addition, it has been observed in this type of data that the variance increases as a function of the mean. We introduce a non-parametric adaptive regularization procedure that is innovative in that (i) it employs a novel "similarity statistic"-based clustering technique to generate local-pooled or regularized shrinkage estimators of population parameters, and (ii) the regularization is done jointly on population moments, benefiting from C. Stein's result on inadmissibility, which implies that the usual sample variance estimator is improved by a shrinkage estimator using information contained in the sample mean. From these joint regularized shrinkage estimators, we derive regularized t-like statistics and show in simulation studies that they offer more statistical power in hypothesis testing than their standard sample counterparts, or regular common value-shrinkage estimators, or when the information contained in the sample mean is simply ignored. Finally, we show that these estimators feature interesting properties of variance stabilization and normalization that can be used for preprocessing high-dimensional multivariate data. The method is available as an R package, called 'MVR' ('Mean-Variance Regularization'), downloadable from the CRAN website.
Interpolation of final geometry and result fields in process parameter space
Misiun, Grzegorz Stefan; Wang, Chao; Geijselaers, Hubertus J.M.; van den Boogaard, Antonius H.; Saanouni, K.
2016-01-01
Different routes to produce a product in a bulk forming process can be described by a limited set of process parameters. The parameters determine the final geometry as well as the distribution of state variables in the final shape. Ring rolling has been simulated using different parameter settings.
Braak, ter C.J.F.
2006-01-01
Differential Evolution (DE) is a simple genetic algorithm for numerical optimization in real parameter spaces. In a statistical context one would not just want the optimum but also its uncertainty. The uncertainty distribution can be obtained by a Bayesian analysis (after specifying prior and
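The optimizer mentioned in the first sentence can be sketched in its common "rand/1/bin" form. This is only the plain optimization step, not the Bayesian uncertainty analysis the abstract goes on to discuss; the function name, parameter names, and default values are assumptions chosen for illustration.

```python
import numpy as np


def differential_evolution(objective, bounds, pop_size=20, f_weight=0.8,
                           crossover=0.9, generations=200, seed=0):
    """Minimize `objective` over a box using DE/rand/1/bin with greedy selection."""
    rng = np.random.default_rng(seed)
    low, high = np.asarray(bounds, dtype=float).T
    dim = low.size
    population = rng.uniform(low, high, size=(pop_size, dim))
    fitness = np.array([objective(x) for x in population])
    for _ in range(generations):
        for i in range(pop_size):
            # Pick three distinct members other than i.
            others = [j for j in range(pop_size) if j != i]
            a, b, c = population[rng.choice(others, size=3, replace=False)]
            # Mutation: base vector plus a scaled difference of two others.
            mutant = np.clip(a + f_weight * (b - c), low, high)
            # Binomial crossover, forcing at least one mutant coordinate.
            mask = rng.random(dim) < crossover
            mask[rng.integers(dim)] = True
            trial = np.where(mask, mutant, population[i])
            trial_fitness = objective(trial)
            # Greedy selection: keep the trial only if it is no worse.
            if trial_fitness <= fitness[i]:
                population[i] = trial
                fitness[i] = trial_fitness
    best = int(np.argmin(fitness))
    return population[best], float(fitness[best])


def sphere(x):
    """Simple convex test function with its minimum at the origin."""
    return float(np.sum(x ** 2))


best_point, best_value = differential_evolution(sphere, [(-5.0, 5.0)] * 3)
```

Because selection is greedy, this routine returns a single optimum and carries no information about uncertainty, which is the limitation the abstract raises.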
Takizawa, Kenji; Tezduyar, Tayfun E.; Otoguro, Yuto
2018-04-01
Stabilized methods, which have been very common in flow computations for many years, typically involve stabilization parameters, and discontinuity-capturing (DC) parameters if the method is supplemented with a DC term. Various well-performing stabilization and DC parameters have been introduced for stabilized space-time (ST) computational methods in the context of the advection-diffusion equation and the Navier-Stokes equations of incompressible and compressible flows. These parameters were all originally intended for finite element discretization but are quite often also used for isogeometric discretization. The stabilization and DC parameters we present here for ST computations are in the context of the advection-diffusion equation and the Navier-Stokes equations of incompressible flows, target isogeometric discretization, and are also applicable to finite element discretization. The parameters are based on a direction-dependent element length expression. The expression is the outcome of an easy-to-understand derivation. The key components of the derivation are mapping the direction vector from the physical ST element to the parent ST element, accounting for the discretization spacing along each of the parametric coordinates, and mapping what we have in the parent element back to the physical element. The test computations we present for pure-advection cases show that the proposed parameters result in good solution profiles.
High-dimensional statistical inference: From vector to matrix
Zhang, Anru
Statistical inference for sparse signals or low-rank matrices in high-dimensional settings is of significant interest in a range of contemporary applications. It has attracted significant recent attention in many fields including statistics, applied mathematics and electrical engineering. In this thesis, we consider several problems including sparse signal recovery (compressed sensing under restricted isometry) and low-rank matrix recovery (matrix recovery via rank-one projections and structured matrix completion). The first part of the thesis discusses compressed sensing and affine rank minimization in both noiseless and noisy cases and establishes sharp restricted isometry conditions for sparse signal and low-rank matrix recovery. The analysis relies on a key technical tool which represents points in a polytope by convex combinations of sparse vectors. The technique is elementary yet leads to sharp results. It is shown that, in compressed sensing, for any $\epsilon > 0$, the conditions $\delta_k^A < 1/3 + \epsilon$, $\delta_k^A + \theta_{k,k}^A < 1 + \epsilon$, or $\delta_{tk}^A < \sqrt{(t-1)/t} + \epsilon$ are not sufficient to guarantee the exact recovery of all $k$-sparse signals for large $k$. A similar result also holds for matrix recovery. In addition, the conditions $\delta_k^A < 1/3$, $\delta_k^A + \theta_{k,k}^A < 1$, $\delta_{tk}^A < \sqrt{(t-1)/t}$ and $\delta_r^M < 1/3$, $\delta_r^M + \theta_{r,r}^M < 1$, $\delta_{tr}^M < \sqrt{(t-1)/t}$ are also shown to be sufficient, respectively, for stable recovery of approximately sparse signals and low-rank matrices in the noisy case. For the second part of the thesis, we introduce a rank-one projection model for low-rank matrix recovery and propose a constrained nuclear norm minimization method for stable recovery of low-rank matrices in the noisy case. The procedure is adaptive to the rank and robust against small perturbations. Both upper and lower bounds for the estimation accuracy under the Frobenius norm loss are obtained. The proposed estimator is shown to be rate-optimal under certain conditions. The
da Costa, Diogo Ricardo; Hansen, Matheus; Guarise, Gustavo; Medrano-T, Rene O.; Leonel, Edson D.
2016-04-01
We show that extreme orbits, trajectories that connect local maximum and minimum values of one-dimensional maps, play a major role in the parameter space of dissipative systems, dictating the organization of the windows of periodicity and hence producing sets of shrimp-like structures. Here we solve three fundamental problems regarding the distribution of these sets and give: (i) their precise localization in the parameter space, even for sets of very high periods; (ii) their local and global distributions along cascades; and (iii) the association of these cascades to complicated sets of periodicity. The extreme orbits are proved to be a powerful indicator to investigate the organization of windows of periodicity in parameter planes. As applications of the theory, we obtain some results for the circle map and perturbed logistic map. The formalism presented here can be extended to many other different nonlinear and dissipative systems.
Genuinely high-dimensional nonlocality optimized by complementary measurements
International Nuclear Information System (INIS)
Lim, James; Ryu, Junghee; Yoo, Seokwon; Lee, Changhyoup; Bang, Jeongho; Lee, Jinhyoung
2010-01-01
Qubits exhibit extreme nonlocality when their state is maximally entangled and this is observed by mutually unbiased local measurements. This criterion does not hold for the Bell inequalities of high-dimensional systems (qudits), recently proposed by Collins-Gisin-Linden-Massar-Popescu and Son-Lee-Kim. Taking an alternative approach, called the quantum-to-classical approach, we derive a series of Bell inequalities for qudits that satisfy the criterion as for the qubits. In the derivation each d-dimensional subsystem is assumed to be measured by one of d possible measurements with d being a prime integer. By applying to two qubits (d=2), we find that a derived inequality is reduced to the Clauser-Horne-Shimony-Holt inequality when the degree of nonlocality is optimized over all the possible states and local observables. Further applying to two and three qutrits (d=3), we find Bell inequalities that are violated for the three-dimensionally entangled states but are not violated by any two-dimensionally entangled states. In other words, the inequalities discriminate three-dimensional (3D) entanglement from two-dimensional (2D) entanglement and in this sense they are genuinely 3D. In addition, for the two qutrits we give a quantitative description of the relations among the three degrees of complementarity, entanglement and nonlocality. It is shown that the degree of complementarity jumps abruptly to very close to its maximum as nonlocality starts appearing. These characteristics imply that complementarity plays a more significant role in the present inequality compared with the previously proposed inequality.
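The qubit case (d=2) mentioned above can be checked numerically. The sketch below evaluates the Clauser-Horne-Shimony-Holt (CHSH) expression for a maximally entangled two-qubit state with a standard choice of mutually unbiased local observables; the particular state and measurement settings are a textbook choice assumed here, not the construction derived in the paper.

```python
import numpy as np

# Pauli observables used as local measurements.
PAULI_Z = np.array([[1.0, 0.0], [0.0, -1.0]])
PAULI_X = np.array([[0.0, 1.0], [1.0, 0.0]])


def expectation(state, observable_a, observable_b):
    """Expectation value of observable_a (x) observable_b in a pure two-qubit state."""
    joint = np.kron(observable_a, observable_b)
    return float(np.real(np.conj(state) @ joint @ state))


def chsh_value(state):
    """CHSH combination E(A0,B0) + E(A0,B1) + E(A1,B0) - E(A1,B1)."""
    a0, a1 = PAULI_Z, PAULI_X
    b0 = (PAULI_Z + PAULI_X) / np.sqrt(2.0)
    b1 = (PAULI_Z - PAULI_X) / np.sqrt(2.0)
    return (expectation(state, a0, b0) + expectation(state, a0, b1)
            + expectation(state, a1, b0) - expectation(state, a1, b1))


# Maximally entangled state (|00> + |11>) / sqrt(2) and the product state |00>.
bell_state = np.array([1.0, 0.0, 0.0, 1.0]) / np.sqrt(2.0)
product_state = np.array([1.0, 0.0, 0.0, 0.0])

entangled_chsh = chsh_value(bell_state)
separable_chsh = chsh_value(product_state)
```

The entangled state reaches 2*sqrt(2), above the local bound of 2, while the product state stays below it, which is the qubit behaviour the abstract says its qudit inequalities are built to reproduce.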
Statistical mechanics of complex neural systems and high dimensional data
International Nuclear Information System (INIS)
Advani, Madhu; Lahiri, Subhaneil; Ganguli, Surya
2013-01-01
Recent experimental advances in neuroscience have opened new vistas into the immense complexity of neuronal networks. This proliferation of data challenges us on two parallel fronts. First, how can we form adequate theoretical frameworks for understanding how dynamical network processes cooperate across widely disparate spatiotemporal scales to solve important computational problems? Second, how can we extract meaningful models of neuronal systems from high dimensional datasets? To aid in these challenges, we give a pedagogical review of a collection of ideas and theoretical methods arising at the intersection of statistical physics, computer science and neurobiology. We introduce the interrelated replica and cavity methods, which originated in statistical physics as powerful ways to quantitatively analyze large highly heterogeneous systems of many interacting degrees of freedom. We also introduce the closely related notion of message passing in graphical models, which originated in computer science as a distributed algorithm capable of solving large inference and optimization problems involving many coupled variables. We then show how both the statistical physics and computer science perspectives can be applied in a wide diversity of contexts to problems arising in theoretical neuroscience and data analysis. Along the way we discuss spin glasses, learning theory, illusions of structure in noise, random matrices, dimensionality reduction and compressed sensing, all within the unified formalism of the replica method. Moreover, we review recent conceptual connections between message passing in graphical models, and neural computation and learning. Overall, these ideas illustrate how statistical physics and computer science might provide a lens through which we can uncover emergent computational functions buried deep within the dynamical complexities of neuronal networks. (paper)
Image-based Exploration of Iso-surfaces for Large Multi-Variable Datasets using Parameter Space.
Binyahib, Roba S.
2013-05-13
With an increase in processing power, more complex simulations have resulted in larger data size, with higher resolution and more variables. Many techniques have been developed to help the user to visualize and analyze data from such simulations. However, dealing with a large amount of multivariate data is challenging, time-consuming and often requires high-end clusters. Consequently, novel visualization techniques are needed to explore such data. Many users would like to visually explore their data and change certain visual aspects without the need to use special clusters or having to load a large amount of data. This is the idea behind explorable images (EI). Explorable images are a novel approach that provides limited interactive visualization without the need to re-render from the original data [40]. In this work, the concept of EI has been used to create a workflow that deals with explorable iso-surfaces for scalar fields in a multivariate, time-varying dataset. As a pre-processing step, a set of iso-values for each scalar field is inferred and extracted from a user-assisted sampling technique in time-parameter space. These iso-values are then used to generate iso-surfaces that are then pre-rendered (from a fixed viewpoint) along with additional buffers (i.e. normals, depth, values of other fields, etc.) to provide a compressed representation of iso-surfaces in the dataset. We present a tool that at run-time allows the user to interactively browse and calculate a combination of iso-surfaces superimposed on each other. The result is the same as calculating multiple iso-surfaces from the original data but without the memory and processing overhead. Our tool also allows the user to change the (scalar) values superimposed on each of the surfaces, modify their color map, and interactively re-light the surfaces. We demonstrate the effectiveness of our approach over a multi-terabyte combustion dataset. We also illustrate the efficiency and accuracy of our
Estimates for Parameter Littlewood-Paley gκ⁎ Functions on Nonhomogeneous Metric Measure Spaces
Directory of Open Access Journals (Sweden)
Guanghui Lu
2016-01-01
Let $(X,d,\mu)$ be a metric measure space which satisfies the geometrically doubling measure and the upper doubling measure conditions. In this paper, the authors prove that, under the assumption that the kernel of $M_\kappa^{*}$ satisfies a certain Hörmander-type condition, $M_\kappa^{*,\rho}$ is bounded from Lebesgue spaces $L^p(\mu)$ to Lebesgue spaces $L^p(\mu)$ for $p \geq 2$ and is bounded from $L^1(\mu)$ into $L^{1,\infty}(\mu)$. As a corollary, $M_\kappa^{*,\rho}$ is bounded on $L^p(\mu)$ for 1
space $H^1(\mu)$ into the Lebesgue space $L^1(\mu)$.
Bhadra, Anindya
2013-04-22
We describe a Bayesian technique to (a) perform a sparse joint selection of significant predictor variables and significant inverse covariance matrix elements of the response variables in a high-dimensional linear Gaussian sparse seemingly unrelated regression (SSUR) setting and (b) perform an association analysis between the high-dimensional sets of predictors and responses in such a setting. To search the high-dimensional model space, where both the number of predictors and the number of possibly correlated responses can be larger than the sample size, we demonstrate that a marginalization-based collapsed Gibbs sampler, in combination with spike and slab type of priors, offers a computationally feasible and efficient solution. As an example, we apply our method to an expression quantitative trait loci (eQTL) analysis on publicly available single nucleotide polymorphism (SNP) and gene expression data for humans where the primary interest lies in finding the significant associations between the sets of SNPs and possibly correlated genetic transcripts. Our method also allows for inference on the sparse interaction network of the transcripts (response variables) after accounting for the effect of the SNPs (predictor variables). We exploit properties of Gaussian graphical models to make statements concerning conditional independence of the responses. Our method compares favorably to existing Bayesian approaches developed for this purpose. © 2013, The International Biometric Society.
Energy Technology Data Exchange (ETDEWEB)
Costa, Diogo Ricardo da, E-mail: diogo_cost@hotmail.com [Departamento de Física, UNESP – Universidade Estadual Paulista, Av. 24A, 1515, Bela Vista, 13506-900, Rio Claro, SP (Brazil); Hansen, Matheus [Departamento de Física, UNESP – Universidade Estadual Paulista, Av. 24A, 1515, Bela Vista, 13506-900, Rio Claro, SP (Brazil); Instituto de Física, Univ. São Paulo, Rua do Matão, Cidade Universitária, 05314-970, São Paulo – SP (Brazil); Guarise, Gustavo [Departamento de Física, UNESP – Universidade Estadual Paulista, Av. 24A, 1515, Bela Vista, 13506-900, Rio Claro, SP (Brazil); Medrano-T, Rene O. [Departamento de Ciências Exatas e da Terra, UNIFESP – Universidade Federal de São Paulo, Rua São Nicolau, 210, Centro, 09913-030, Diadema, SP (Brazil); Department of Mathematics, Imperial College London, London SW7 2AZ (United Kingdom); Leonel, Edson D. [Departamento de Física, UNESP – Universidade Estadual Paulista, Av. 24A, 1515, Bela Vista, 13506-900, Rio Claro, SP (Brazil); Abdus Salam International Center for Theoretical Physics, Strada Costiera 11, 34151 Trieste (Italy)
2016-04-22
We show that extreme orbits, trajectories that connect local maximum and minimum values of one-dimensional maps, play a major role in the parameter space of dissipative systems, dictating the organization of the windows of periodicity and hence producing sets of shrimp-like structures. Here we solve three fundamental problems regarding the distribution of these sets and give: (i) their precise localization in the parameter space, even for sets of very high periods; (ii) their local and global distributions along cascades; and (iii) the association of these cascades to complicated sets of periodicity. The extreme orbits are proved to be a powerful indicator to investigate the organization of windows of periodicity in parameter planes. As applications of the theory, we obtain some results for the circle map and perturbed logistic map. The formalism presented here can be extended to many other different nonlinear and dissipative systems. - Highlights: • Extreme orbits and the organization of periodic regions in parameter space. • One-dimensional dissipative mappings. • The circle map and also a time-perturbed logistic map were studied.
International Nuclear Information System (INIS)
Costa, Diogo Ricardo da; Hansen, Matheus; Guarise, Gustavo; Medrano-T, Rene O.; Leonel, Edson D.
2016-01-01
We show that extreme orbits, trajectories that connect local maximum and minimum values of one-dimensional maps, play a major role in the parameter space of dissipative systems, dictating the organization of the windows of periodicity and hence producing sets of shrimp-like structures. Here we solve three fundamental problems regarding the distribution of these sets and give: (i) their precise localization in the parameter space, even for sets of very high periods; (ii) their local and global distributions along cascades; and (iii) the association of these cascades to complicated sets of periodicity. The extreme orbits are proved to be a powerful indicator to investigate the organization of windows of periodicity in parameter planes. As applications of the theory, we obtain some results for the circle map and perturbed logistic map. The formalism presented here can be extended to many other different nonlinear and dissipative systems. - Highlights: • Extreme orbits and the organization of periodic regions in parameter space. • One-dimensional dissipative mappings. • The circle map and also a time-perturbed logistic map were studied.
Accelerated Sensitivity Analysis in High-Dimensional Stochastic Reaction Networks.
Arampatzis, Georgios; Katsoulakis, Markos A; Pantazis, Yannis
2015-01-01
Existing sensitivity analysis approaches are not able to handle efficiently stochastic reaction networks with a large number of parameters and species, which are typical in the modeling and simulation of complex biochemical phenomena. In this paper, a two-step strategy for parametric sensitivity analysis for such systems is proposed, exploiting advantages and synergies between two recently proposed sensitivity analysis methodologies for stochastic dynamics. The first method performs sensitivity analysis of the stochastic dynamics by means of the Fisher Information Matrix on the underlying distribution of the trajectories; the second method is a reduced-variance, finite-difference, gradient-type sensitivity approach relying on stochastic coupling techniques for variance reduction. Here we demonstrate that these two methods can be combined and deployed together by means of a new sensitivity bound which incorporates the variance of the quantity of interest as well as the Fisher Information Matrix estimated from the first method. The first step of the proposed strategy labels sensitivities using the bound and screens out the insensitive parameters in a controlled manner. In the second step of the proposed strategy, a finite-difference method is applied only for the sensitivity estimation of the (potentially) sensitive parameters that have not been screened out in the first step. Results on an epidermal growth factor network with fifty parameters and on a protein homeostasis with eighty parameters demonstrate that the proposed strategy is able to quickly discover and discard the insensitive parameters and in the remaining potentially sensitive parameters it accurately estimates the sensitivities. The new sensitivity strategy can be several times faster than current state-of-the-art approaches that test all parameters, especially in "sloppy" systems. In particular, the computational acceleration is quantified by the ratio between the total number of parameters over the
Accelerated Sensitivity Analysis in High-Dimensional Stochastic Reaction Networks.
Directory of Open Access Journals (Sweden)
Georgios Arampatzis
Existing sensitivity analysis approaches are not able to handle efficiently stochastic reaction networks with a large number of parameters and species, which are typical in the modeling and simulation of complex biochemical phenomena. In this paper, a two-step strategy for parametric sensitivity analysis for such systems is proposed, exploiting advantages and synergies between two recently proposed sensitivity analysis methodologies for stochastic dynamics. The first method performs sensitivity analysis of the stochastic dynamics by means of the Fisher Information Matrix on the underlying distribution of the trajectories; the second method is a reduced-variance, finite-difference, gradient-type sensitivity approach relying on stochastic coupling techniques for variance reduction. Here we demonstrate that these two methods can be combined and deployed together by means of a new sensitivity bound which incorporates the variance of the quantity of interest as well as the Fisher Information Matrix estimated from the first method. The first step of the proposed strategy labels sensitivities using the bound and screens out the insensitive parameters in a controlled manner. In the second step of the proposed strategy, a finite-difference method is applied only for the sensitivity estimation of the (potentially) sensitive parameters that have not been screened out in the first step. Results on an epidermal growth factor network with fifty parameters and on a protein homeostasis with eighty parameters demonstrate that the proposed strategy is able to quickly discover and discard the insensitive parameters and in the remaining potentially sensitive parameters it accurately estimates the sensitivities. The new sensitivity strategy can be several times faster than current state-of-the-art approaches that test all parameters, especially in "sloppy" systems. In particular, the computational acceleration is quantified by the ratio between the total number of
Large model-space calculation of the nuclear level density parameter
International Nuclear Information System (INIS)
Agrawal, B.K.; Samaddar, S.K.; De, J.N.; Shlomo, S.
1998-01-01
Recently, several attempts have been made to obtain nuclear level density (ρ) and level density parameter (α) within the microscopic approaches based on path integral representation of the partition function. The results for the inverse level density parameter K es and the level density as a function of excitation energy are presented
An adaptive ANOVA-based PCKF for high-dimensional nonlinear inverse modeling
Li, Weixuan; Lin, Guang; Zhang, Dongxiao
2014-02-01
The probabilistic collocation-based Kalman filter (PCKF) is a recently developed approach for solving inverse problems. It resembles the ensemble Kalman filter (EnKF) in every aspect, except that it represents and propagates model uncertainty by polynomial chaos expansion (PCE) instead of an ensemble of model realizations. Previous studies have shown that PCKF is a more efficient alternative to EnKF for many data assimilation problems. However, the accuracy and efficiency of PCKF depend on an appropriate truncation of the PCE series. Having more polynomial chaos basis functions in the expansion helps to capture uncertainty more accurately but increases computational cost. Selection of basis functions is particularly important for high-dimensional stochastic problems because the number of polynomial chaos basis functions required to represent model uncertainty grows dramatically as the number of input parameters (random dimensions) increases. In classic PCKF algorithms, the PCE basis functions are pre-set based on users' experience. Also, for sequential data assimilation problems, the basis functions kept in the PCE expression remain unchanged in different Kalman filter loops, which could limit the accuracy and computational efficiency of classic PCKF algorithms. To address this issue, we present a new algorithm that adaptively selects PCE basis functions for different problems and automatically adjusts the number of basis functions in different Kalman filter loops. The algorithm is based on adaptive functional ANOVA (analysis of variance) decomposition, which approximates a high-dimensional function with the summation of a set of low-dimensional functions. Thus, instead of expanding the original model into PCE, we implement the PCE expansion on these low-dimensional functions, which is much less costly. We also propose a new adaptive criterion for ANOVA that is more suited for solving inverse problems. The new algorithm was tested with different examples and demonstrated
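The growth in basis size described above can be made concrete with a short count. For a total-degree truncation, the number of polynomial chaos basis functions in n random dimensions up to degree p is the binomial coefficient C(n + p, p). The second function counts the terms of a purely additive (first-order ANOVA) expansion for comparison. Both function names are hypothetical, and the first-order truncation is an illustrative assumption rather than the adaptive criterion proposed in the paper.

```python
import math


def full_pce_basis_count(n_dims: int, degree: int) -> int:
    """Number of total-degree PCE basis functions: C(n_dims + degree, degree)."""
    return math.comb(n_dims + degree, degree)


def additive_basis_count(n_dims: int, degree: int) -> int:
    """Basis size when only a constant and univariate terms are kept.

    Each dimension contributes `degree` univariate polynomials, plus one
    shared constant term (a first-order ANOVA truncation, assumed here).
    """
    return 1 + n_dims * degree


for n in (2, 10, 100):
    print(n, full_pce_basis_count(n, 3), additive_basis_count(n, 3))
```

With cubic polynomials the full basis grows from 10 terms in 2 dimensions to 176851 terms in 100 dimensions, while the additive expansion grows only linearly, which is the cost gap that motivates expanding low-dimensional ANOVA components instead of the full model.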
Operational definition of (brane-induced) space-time and constraints on the fundamental parameters
International Nuclear Information System (INIS)
Maziashvili, Michael
2008-01-01
First we contemplate the operational definition of space-time in four dimensions in light of basic principles of quantum mechanics and general relativity and consider some of its phenomenological consequences. The quantum gravitational fluctuations of the background metric that comes through the operational definition of space-time are controlled by the Planck scale and are therefore strongly suppressed. Then we extend our analysis to the braneworld setup with low fundamental scale of gravity. It is observed that in this case the quantum gravitational fluctuations on the brane may become unacceptably large. The magnification of fluctuations is not linked directly to the low quantum gravity scale but rather to the higher-dimensional modification of Newton's inverse square law at relatively large distances. For models with compact extra dimensions the shape modulus of extra space can be used as a most natural and safe stabilization mechanism against these fluctuations
International Nuclear Information System (INIS)
Zhang Zhongcan; Hu Chenguo; Fang Zhenyun
1998-01-01
The authors study a method that directly uses the azimuthal angles and the rotation angle of the axis to describe the evolution of the angular momentum eigenstates under spatial rotation. They obtain the path integral for the angular momentum rotation and multi-rotation matrix elements, which evolves with the parameter λ (0→θ, where θ is the rotation angle), and establish a general method of treating the functional (path) integral as an ordinary multiple integral.
International Nuclear Information System (INIS)
Barnes, G.D.
1982-01-01
The feasibility of a polygeneration plant at Kennedy Space Center was studied. Liquid hydrogen and gaseous nitrogen are the two principal products in consideration. Environmental parameters (air quality, water quality, biological diversity and hazardous waste disposal) necessary for the feasibility study were investigated. A National Environmental Policy Act (NEPA) project flow sheet was to be formulated for the environmental impact statement. Water quality criteria for Florida waters were to be established
Directory of Open Access Journals (Sweden)
Zhang Peiguo
2011-01-01
Full Text Available Abstract By obtaining intervals of the parameter λ, this article investigates the existence of a positive solution for a class of nonlinear boundary value problems of second-order differential equations with integral boundary conditions in abstract spaces. The arguments are based upon a specially constructed cone and the fixed point theory in cone for a strict set contraction operator. MSC: 34B15; 34B16.
Elastic SCAD as a novel penalization method for SVM classification tasks in high-dimensional data.
Becker, Natalia; Toedt, Grischa; Lichter, Peter; Benner, Axel
2011-05-09
Classification and variable selection play an important role in knowledge discovery in high-dimensional data. Although Support Vector Machine (SVM) algorithms are among the most powerful classification and prediction methods with a wide range of scientific applications, the SVM does not include automatic feature selection and therefore a number of feature selection procedures have been developed. Regularisation approaches extend SVM to a feature selection method in a flexible way using penalty functions like LASSO, SCAD and Elastic Net. We propose a novel penalty function for SVM classification tasks, Elastic SCAD, a combination of SCAD and ridge penalties which overcomes the limitations of each penalty alone. Since SVM models are extremely sensitive to the choice of tuning parameters, we adopted an interval search algorithm, which in comparison to a fixed grid search finds rapidly and more precisely a global optimal solution. Feature selection methods with combined penalties (Elastic Net and Elastic SCAD SVMs) are more robust to a change of the model complexity than methods using single penalties. Our simulation study showed that Elastic SCAD SVM outperformed LASSO (L1) and SCAD SVMs. Moreover, Elastic SCAD SVM provided sparser classifiers in terms of median number of features selected than Elastic Net SVM and often better predicted than Elastic Net in terms of misclassification error. Finally, we applied the penalization methods described above on four publicly available breast cancer data sets. Elastic SCAD SVM was the only method providing robust classifiers in sparse and non-sparse situations. The proposed Elastic SCAD SVM algorithm provides the advantages of the SCAD penalty and at the same time avoids sparsity limitations for non-sparse data. We were first to demonstrate that the integration of the interval search algorithm and penalized SVM classification techniques provides fast solutions on the optimization of tuning parameters. The penalized SVM
Matrix correlations for high-dimensional data: The modified RV-coefficient
Smilde, A.K.; Kiers, H.A.L.; Bijlsma, S.; Rubingh, C.M.; Erk, M.J. van
2009-01-01
Motivation: Modern functional genomics generates high-dimensional datasets. It is often convenient to have a single simple number characterizing the relationship between pairs of such high-dimensional datasets in a comprehensive way. Matrix correlations are such numbers and are appealing since they
Observatory data as a proxy of space weather parameters: The importance of historical archives
Czech Academy of Sciences Publication Activity Database
Hejda, Pavel
2016-01-01
Vol. 20, No. 2 (2016), pp. 47-53 ISSN 0257-7968 R&D Projects: GA MŠk LM2010008 Institutional support: RVO:67985530 Keywords: geomagnetic observatory * geomagnetic indices * sunspot members * space weather Subject RIV: DE - Earth Magnetism, Geodesy, Geography OECD field: Physical geography
Cui, Tiangang; Marzouk, Youssef; Willcox, Karen
2016-06-01
Two major bottlenecks to the solution of large-scale Bayesian inverse problems are the scaling of posterior sampling algorithms to high-dimensional parameter spaces and the computational cost of forward model evaluations. Yet incomplete or noisy data, the state variation and parameter dependence of the forward model, and correlations in the prior collectively provide useful structure that can be exploited for dimension reduction in this setting, both in the parameter space of the inverse problem and in the state space of the forward model. To this end, we show how to jointly construct low-dimensional subspaces of the parameter space and the state space in order to accelerate the Bayesian solution of the inverse problem. As a byproduct of state dimension reduction, we also show how to identify low-dimensional subspaces of the data in problems with high-dimensional observations. These subspaces enable approximation of the posterior as a product of two factors: (i) a projection of the posterior onto a low-dimensional parameter subspace, wherein the original likelihood is replaced by an approximation involving a reduced model; and (ii) the marginal prior distribution on the high-dimensional complement of the parameter subspace. We present and compare several strategies for constructing these subspaces using only a limited number of forward and adjoint model simulations. The resulting posterior approximations can rapidly be characterized using standard sampling techniques, e.g., Markov chain Monte Carlo. Two numerical examples demonstrate the accuracy and efficiency of our approach: inversion of an integral equation in atmospheric remote sensing, where the data dimension is very high; and the inference of a heterogeneous transmissivity field in a groundwater system, which involves a partial differential equation forward model with high dimensional state and parameters.
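The two-factor posterior approximation described in this abstract can be written schematically as below. The notation is an assumption introduced here for illustration and does not come from the abstract: $x_r$ denotes the coordinates of the parameter in the low-dimensional subspace, $x_\perp$ the coordinates in its complement, $\pi_0$ the prior (and its marginals), $y$ the data, and $\tilde{L}$ the likelihood evaluated with the reduced model.

```latex
\pi(x \mid y) \;\approx\; \tilde{\pi}(x_r \mid y)\,\pi_0(x_\perp),
\qquad
\tilde{\pi}(x_r \mid y) \;\propto\; \tilde{L}(y \mid x_r)\,\pi_0(x_r).
```

Under this reading, only the first factor needs sampling (for example by Markov chain Monte Carlo), while the second factor is drawn directly from the prior.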
On the breakdown modes and parameter space of Ohmic Tokamak startup
Peng, Yanli; Jiang, Wei; Zhang, Ya; Hu, Xiwei; Zhuang, Ge; Innocenti, Maria; Lapenta, Giovanni
2017-10-01
Tokamak plasma has to be hot. The process of turning the initial dilute neutral hydrogen gas at room temperature into fully ionized plasma is called tokamak startup. Even with over 40 years of research, the parameter ranges for successful startup still aren't determined by numerical simulations but by trial and error. However, in recent years it has drawn much attention due to one of the challenges faced by ITER: the maximum electric field for startup can't exceed 0.3 V/m, which makes the parameter range for successful startup narrower. Besides, this physical mechanism is far from being understood either theoretically or numerically. In this work, we have simulated the plasma breakdown phase driven by pure Ohmic heating using a particle-in-cell/Monte Carlo code, with the aim of giving a predictive parameter range for most tokamaks, even for ITER. We have found three situations during the discharge, as a function of the initial parameters: no breakdown, breakdown and runaway. Moreover, breakdown delay and volt-second consumption under different initial conditions are evaluated. In addition, we have simulated breakdown on ITER and confirmed that when the electric field is 0.3 V/m, the optimal pre-filling pressure is 0.001 Pa, which is in good agreement with ITER's design.
Electromagnetic weather in the near-earth space in dependence on solar wind parameters
International Nuclear Information System (INIS)
Belov, B.A.; Burtsev, Yu.A.; Dremukhina, L.A.; Papitashvili, V.O.
1995-01-01
Analysis of modern models of electrical and magnetic fields, electrical current and plasma convection is carried out with the purpose of quantitative description of the near-earth electrodynamic parameters. Possibility of utilizing such models simultaneously with radar and geomagnetic observations for continuous real time control of electromagnetic weather in the earth magnetosphere is considered. Refs. 24, refs. 3
TRICE - A program for reconstructing 3D reciprocal space and determining unit-cell parameters
International Nuclear Information System (INIS)
Zou Xiaodong; Hovmoeller, Anders; Hovmoeller, Sven
2004-01-01
A program system, Trice, for reconstructing the 3D reciprocal lattice from an electron diffraction tilt series is described. The unit-cell parameters can be determined from electron diffraction patterns directly by Trice. The unit cell can be checked and the lattice type and crystal system can be determined from the 3D reciprocal lattice. Trice can be applied to all crystal systems and lattice types
Determination of parameters for hypervelocity dust grains encountered in near-Earth space
Tanner, William G.; Maag, Carl R.; Alexander, W. Merle; Sappenfield, Patricia
1993-01-01
Primary interest was in the determination of the population of micrometeoroids and space debris and interpretation of the hole size in a thin film or in a micropore foam returned from space with theoretical calculations describing the event. In order to augment the significance of the theoretical calculations of the impact event, an experiment designed to analyze the charge production due to hypervelocity impacts on thin films also produced data which described the penetration properties of micron and sub-micron sized projectiles. The thin film penetration sites in the 500 A and 1000 A aluminum films were counted and a size distribution function was derived. In the case of the very smallest dust grains, there were no independent measurements of velocities like that which existed for the larger dust grains (d_p ≤ 1 micron). The primary task then became to assess the relationship between the penetration hole and the particle diameter of the projectile which made the hole. The most promising means to assess the measure of the diameters of impacting grains came in the form of comparing cratering mechanics to penetration mechanics. Future experimentation will produce measurements of the cratering as opposed to the penetrating event. Particles encountered by surfaces while being flown in space will degrade that surface in a systematic manner even when the impact is with small hypervelocity particles, d_p ≤ 10 microns. Though not to a degree which would precipitate a catastrophic failure of a system, the degradation of the materials comprising the interconnected system will occur. It is the degradation of the optical system and the subsequent embrittlement of other materials that can lead to degradation if not to failure. It is to this end that research was conducted to compare the primary consequences for experiments which will be flown to those which have been returned.
State and parameter estimation of state-space model with entry-wise correlated uniform noise
Czech Academy of Sciences Publication Activity Database
Pavelková, Lenka; Kárný, Miroslav
2014-01-01
Vol. 28, No. 11 (2014), pp. 1189-1205 ISSN 0890-6327 R&D Projects: GA TA ČR TA01030123; GA ČR GA13-13502S Institutional research plan: CEZ:AV0Z1075907 Keywords: state-space models * bounded noise * filtering problems * estimation algorithms * uncertain dynamic systems Subject RIV: BC - Control Systems Theory Impact factor: 1.346, year: 2014 http://library.utia.cas.cz/separaty/2014/AS/pavelkova-0422958.pdf
Biomarker identification and effect estimation on schizophrenia –a high dimensional data analysis
Directory of Open Access Journals (Sweden)
Yuanzhang Li
2015-05-01
Full Text Available Biomarkers have been examined in schizophrenia research for decades. Medical morbidity and mortality rates, as well as personal and societal costs, are associated with schizophrenia patients. The identification of biomarkers and alleles, which often have a small effect individually, may help to develop new diagnostic tests for early identification and treatment. Currently, there is not a commonly accepted statistical approach to identify predictive biomarkers from high dimensional data. We used the space Decomposition-Gradient-Regression (DGR) method to select biomarkers that are associated with the risk of schizophrenia. Then, we used the gradient scores, generated from the selected biomarkers, as the prediction factor in regression to estimate their effects. We also used an alternative approach, classification and regression tree (CART), for comparison with the biomarkers selected by DGR and found that about 70% of the selected biomarkers were the same. However, the advantage of DGR is that it can evaluate individual effects for each biomarker from their combined effect. In DGR analysis of serum specimens of US military service members with a diagnosis of schizophrenia from 1992 to 2005 and their controls, Alpha-1-Antitrypsin (AAT), Interleukin-6 receptor (IL-6r) and Connective Tissue Growth Factor (CTGF) were selected to identify schizophrenia for males; and Alpha-1-Antitrypsin (AAT), Apolipoprotein B (Apo B) and Sortilin were selected for females. If these findings from military subjects are replicated by other studies, they suggest the possibility of a novel biomarker panel as an adjunct to earlier diagnosis and initiation of treatment.
Hydrogenated amorphous silicon-selenium alloys - a short journey through parameter space
International Nuclear Information System (INIS)
Al-Dallal, S.; Al-Alawi, S.M.; Aljishi, S.
1999-01-01
Hydrogenated amorphous silicon-selenium alloy thin films were grown by capacitively coupled radio frequency glow discharge decomposition of (SiH4 + He) and (H2Se + He) gas mixtures. In this work we report on a study to correlate the deposition parameters of a-Si, Se:H thin films with their optical, electronic and spectroscopic properties. The alloy composition was varied by changing the gas volume ratio R_v = [H2Se]/[SiH4]. The films are characterized via infrared spectroscopy, photoconductivity, photoluminescence, constant current method and conductivity measurements. (author)
Bhutwala, Krish; Beg, Farhat; Mariscal, Derek; Wilks, Scott; Ma, Tammy
2017-10-01
The Advanced Radiographic Capability (ARC) laser at the National Ignition Facility (NIF) at Lawrence Livermore National Laboratory is the world's most energetic short-pulse laser. It comprises four beamlets, each of substantial energy ( 1.5 kJ), extended short-pulse duration (10-30 ps), and large focal spot (>=50% of energy in 150 µm spot). This allows ARC to achieve proton and light ion acceleration via the Target Normal Sheath Acceleration (TNSA) mechanism, but it is yet unknown how proton beam characteristics scale with ARC-regime laser parameters. As theory has also not yet been validated for laser-generated protons at ARC-regime laser parameters, we attempt to formulate the scaling physics of proton beam characteristics as a function of laser energy, intensity, focal spot size, pulse length, target geometry, etc. through a review of relevant proton acceleration experiments from laser facilities across the world. These predicted scaling laws should then guide target design and future diagnostics for desired proton beam experiments on the NIF ARC. This work performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344 and funded by the LLNL LDRD program under tracking code 17-ERD-039.
From bare to renormalized order parameter in gauge space: Structure and reactions
Potel, G.; Idini, A.; Barranco, F.; Vigezzi, E.; Broglia, R. A.
2017-09-01
It is not physically obvious why one can calculate with similar accuracy, as compared to the experimental data, the absolute cross section associated with two-nucleon transfer processes between members of pairing rotational bands, making use of simple BCS (constant matrix elements) or of many-body [Nambu-Gorkov (NG), nuclear field theory (NFT)] spectroscopic amplitudes. Restoration of spontaneous symmetry breaking and associated emergent generalized rigidity in gauge space provides the answer and points to a new emergence: A physical sum rule resulting from the intertwining of structure and reaction processes, closely connected with the central role induced pairing interaction plays in structure, together with the fact that successive transfer dominates Cooper pair tunneling.
International Nuclear Information System (INIS)
Gazoya, E.D.K.; Prempeh, E.; Banini, G.K.
2015-01-01
The relationship between the spin transformations of the special linear group of order 2, SL(2, C), and the aggregate SO(3) of the three-dimensional pure rotations, when considered as a group in itself (and not as a subgroup of the Lorentz group), is investigated. It is shown, by the spinor map $X \to A X A^{\dagger}$ (with $A^{\dagger}$ the conjugate transpose of $A$), which is an action of SL(2, C) on the space of Hermitian matrices, that the one-parameter subgroups of rotations generated are precisely those of angles which are multiples of 2π. (au)
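A minimal numerical sketch of the spinor map for one special case may help fix ideas. It assumes the standard identification of a vector (x, y, z) with the Hermitian matrix x·σx + y·σy + z·σz and uses the diagonal matrix diag(e^{-iθ/2}, e^{iθ/2}), which lies in SU(2) and hence in SL(2, C). The function names are hypothetical and nothing here is taken from the paper itself.

```python
import cmath
import math


def matmul2(p, q):
    """Product of two 2x2 complex matrices given as nested lists."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def dagger(a):
    """Conjugate transpose of a 2x2 complex matrix."""
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]


def z_rotation_spinor(theta):
    """Diagonal SU(2) element associated with a rotation by theta about z."""
    return [[cmath.exp(-0.5j * theta), 0j], [0j, cmath.exp(0.5j * theta)]]


def spinor_map(a, vec):
    """Apply X -> A X A^dagger to the Hermitian matrix built from vec."""
    x, y, z = vec
    herm = [[complex(z), x - 1j * y], [x + 1j * y, complex(-z)]]
    out = matmul2(matmul2(a, herm), dagger(a))
    # Read the rotated vector back off the matrix entries.
    return (out[1][0].real, out[1][0].imag, out[0][0].real)
```

With θ = π/2 the map sends the x axis to the y axis. With θ = 2π the matrix equals minus the identity yet the induced rotation is the identity, which is the familiar two-to-one covering of SO(3).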
Prediction-Oriented Marker Selection (PROMISE): With Application to High-Dimensional Regression.
Kim, Soyeon; Baladandayuthapani, Veerabhadran; Lee, J Jack
2017-06-01
In personalized medicine, biomarkers are used to select therapies with the highest likelihood of success based on an individual patient's biomarker/genomic profile. Two goals are to choose important biomarkers that accurately predict treatment outcomes and to cull unimportant biomarkers to reduce the cost of biological and clinical verifications. These goals are challenging due to the high dimensionality of genomic data. Variable selection methods based on penalized regression (e.g., the lasso and elastic net) have yielded promising results. However, selecting the right amount of penalization is critical to simultaneously achieving these two goals. Standard approaches based on cross-validation (CV) typically provide high prediction accuracy with high true positive rates but at the cost of too many false positives. Alternatively, stability selection (SS) controls the number of false positives, but at the cost of yielding too few true positives. To circumvent these issues, we propose prediction-oriented marker selection (PROMISE), which combines SS with CV to conflate the advantages of both methods. Our application of PROMISE with the lasso and elastic net in data analysis shows that, compared to CV, PROMISE produces sparse solutions, few false positives, and small type I + type II error, and maintains good prediction accuracy, with a marginal decrease in the true positive rates. Compared to SS, PROMISE offers better prediction accuracy and true positive rates. In summary, PROMISE can be applied in many fields to select regularization parameters when the goals are to minimize false positives and maximize prediction accuracy.
Energy Technology Data Exchange (ETDEWEB)
Dan Maljovec; Bei Wang; Valerio Pascucci; Peer-Timo Bremer; Michael Pernice; Robert Nourgaliev
2013-05-01
The next generation of methodologies for nuclear reactor Probabilistic Risk Assessment (PRA) explicitly accounts for the time element in modeling the probabilistic system evolution and uses numerical simulation tools to account for possible dependencies between failure events. The Monte-Carlo (MC) and the Dynamic Event Tree (DET) approaches belong to this new class of dynamic PRA methodologies. A challenge of dynamic PRA algorithms is the large amount of data they produce, which may be difficult to visualize and analyze in order to extract useful information. We present a software tool that is designed to address these goals. We model a large-scale nuclear simulation dataset as a high-dimensional scalar function defined over a discrete sample of the domain. First, we provide structural analysis of such a function at multiple scales and provide insight into the relationship between the input parameters and the output. Second, we enable exploratory analysis for users, where we help the users to differentiate features from noise through multi-scale analysis on an interactive platform, based on domain knowledge and data characterization. Our analysis is performed by exploiting the topological and geometric properties of the domain, building statistical models based on its topological segmentations and providing interactive visual interfaces to facilitate such explorations. We provide a user’s guide to our software tool by highlighting its analysis and visualization capabilities, along with a use case involving a dataset from a nuclear reactor safety simulation.
Julien, Clavel; Leandro, Aristide; Hélène, Morlon
2018-06-19
Working with high-dimensional phylogenetic comparative datasets is challenging because likelihood-based multivariate methods suffer from low statistical performance as the number of traits p approaches the number of species n and because some computational complications occur when p exceeds n. Alternative phylogenetic comparative methods have recently been proposed to deal with the large p small n scenario but their use and performance are limited. Here we develop a penalized likelihood framework to deal with high-dimensional comparative datasets. We propose various penalizations and methods for selecting the intensity of the penalties. We apply this general framework to the estimation of parameters (the evolutionary trait covariance matrix and parameters of the evolutionary model) and model comparison for the high-dimensional multivariate Brownian (BM), Early-burst (EB), Ornstein-Uhlenbeck (OU) and Pagel's lambda models. We show using simulations that our penalized likelihood approach dramatically improves the estimation of evolutionary trait covariance matrices and model parameters when p approaches n, and allows for their accurate estimation when p equals or exceeds n. In addition, we show that penalized likelihood models can be efficiently compared using Generalized Information Criterion (GIC). We implement these methods, as well as the related estimation of ancestral states and the computation of phylogenetic PCA, in the R packages RPANDA and mvMORPH. Finally, we illustrate the utility of the new proposed framework by evaluating evolutionary model fit, analyzing integration patterns, and reconstructing evolutionary trajectories for a high-dimensional 3-D dataset of brain shape in the New World monkeys. We find clear support for an Early-burst model, suggesting an early diversification of brain morphology during the ecological radiation of the clade. Penalized likelihood offers an efficient way to deal with high-dimensional multivariate comparative data.
Mapping magnetized geologic structures from space: The effect of orbital and body parameters
Schnetzler, C. C.; Taylor, P. T.; Langel, R. A.
1984-01-01
When comparing previous satellite magnetometer missions (such as MAGSAT) with proposed new programs (for example, Geopotential Research Mission, GRM) it is important to quantify the difference in scientific information obtained. The ability to resolve separate magnetic blocks (simulating geological units) is used as a parameter for evaluating the expected geologic information from each mission. The effect of satellite orbital altitude on the ability to resolve two magnetic blocks with varying separations is evaluated and quantified. A systematic, nonlinear, relationship exists between resolution and distance between magnetic blocks as a function of orbital altitude. The proposed GRM would provide an order-of-magnitude greater anomaly resolution than the earlier MAGSAT mission for widely separated bodies. The resolution achieved at any particular altitude varies depending on the location of the bodies and orientation.
Counting and classifying attractors in high dimensional dynamical systems.
Bagley, R J; Glass, L
1996-12-07
Randomly connected Boolean networks have been used as mathematical models of neural, genetic, and immune systems. A key quantity of such networks is the number of basins of attraction in the state space. The number of basins of attraction changes as a function of the size of the network, its connectivity and its transition rules. In discrete networks, a simple count of the number of attractors does not reveal the combinatorial structure of the attractors. These points are illustrated in a reexamination of dynamics in a class of random Boolean networks considered previously by Kauffman. We also consider comparisons between dynamics in discrete networks and continuous analogues. A continuous analogue of a discrete network may have a different number of attractors for many different reasons. Some attractors in discrete networks may be associated with unstable dynamics, and several different attractors in a discrete network may be associated with a single attractor in the continuous case. Special problems in determining attractors in continuous systems arise when there are aperiodic dynamics associated with quasiperiodicity or deterministic chaos.
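For very small networks, the count of attractors and the sizes of their basins can be obtained by exhaustive enumeration of the state space. The sketch below assumes synchronous, deterministic updating and a user-supplied update function. The name `find_attractors` is hypothetical, and the approach is brute force for illustration, not the authors' method.

```python
from itertools import product


def find_attractors(n, step):
    """Map each attractor (as a frozenset of states) to its basin size.

    n is the number of Boolean nodes; step maps a state tuple to the next
    state tuple under synchronous updating.
    """
    basins = {}
    for state in product((0, 1), repeat=n):
        # With only 2**n states, 2**n updates always land on the attractor.
        for _ in range(2 ** n):
            state = step(state)
        # Walk once around the cycle to collect its states.
        cycle = [state]
        nxt = step(state)
        while nxt != cycle[0]:
            cycle.append(nxt)
            nxt = step(nxt)
        key = frozenset(cycle)
        basins[key] = basins.get(key, 0) + 1
    return basins
```

For a two-node network in which each node copies the other, `find_attractors(2, lambda s: (s[1], s[0]))` reports two fixed points and one cycle of length two. The cost grows as 4**n in the worst case, so this is only practical for a handful of nodes.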
Ren, Jie; He, Tao; Li, Ye; Liu, Sai; Du, Yinhao; Jiang, Yu; Wu, Cen
2017-05-16
Over the past decades, the prevalence of type 2 diabetes mellitus (T2D) has been steadily increasing around the world. Despite large efforts devoted to better understanding the genetic basis of the disease, the identified susceptibility loci can only account for a small portion of the T2D heritability. Some of the existing approaches proposed for the high dimensional genetic data from the T2D case-control study are limited by analyzing a small number of SNPs at a time from a large pool of SNPs, by ignoring the correlations among SNPs and by adopting inefficient selection techniques. We propose a network constrained regularization method to select important SNPs by taking the linkage disequilibrium into account. To accommodate the case-control study, an iteratively reweighted least squares algorithm has been developed within the coordinate descent framework, where optimization of the regularized logistic loss function is performed with respect to one parameter at a time and iteratively cycles through all the parameters until convergence. In this article, a novel approach is developed to identify important SNPs more effectively through incorporating the interconnections among them in the regularized selection. A coordinate descent based iteratively reweighted least squares (IRLS) algorithm has been proposed. Both the simulation study and the analysis of the Nurses' Health Study, a case-control study of type 2 diabetes data with high dimensional SNP measurements, demonstrate the advantage of the network based approach over the competing alternatives.
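The general pattern of coordinate descent inside IRLS for a penalized logistic loss can be sketched as follows. This is a simplified illustration, not the authors' algorithm: a plain lasso (L1) penalty is swapped in for the network-constrained penalty, there is no intercept, and the function names and iteration counts are hypothetical.

```python
import math


def soft_threshold(value, threshold):
    """Soft-thresholding operator used by lasso coordinate updates."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_logistic_irls(X, y, lam, outer=25, inner=50):
    """L1-penalized logistic regression via IRLS with coordinate descent."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(outer):
        # IRLS step: quadratic approximation of the logistic loss at beta.
        eta = [sum(X[i][j] * beta[j] for j in range(p)) for i in range(n)]
        prob = [1.0 / (1.0 + math.exp(-e)) for e in eta]
        w = [max(q * (1.0 - q), 1e-5) for q in prob]
        z = [eta[i] + (y[i] - prob[i]) / w[i] for i in range(n)]
        # Coordinate descent on the weighted least squares subproblem.
        for _ in range(inner):
            for j in range(p):
                num = den = 0.0
                for i in range(n):
                    partial = sum(X[i][k] * beta[k] for k in range(p) if k != j)
                    num += w[i] * X[i][j] * (z[i] - partial)
                    den += w[i] * X[i][j] ** 2
                beta[j] = soft_threshold(num / n, lam) / (den / n) if den > 0 else 0.0
    return beta
```

Each inner pass updates one coefficient at a time while holding the others fixed, which mirrors the "one parameter at a time" cycling described in the abstract. A sufficiently large `lam` shrinks every coefficient to exactly zero.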
Mitigating the Insider Threat Using High-Dimensional Search and Modeling
National Research Council Canada - National Science Library
Van Den Berg, Eric; Uphadyaya, Shambhu; Ngo, Phi H; Muthukrishnan, Muthu; Palan, Rajago
2006-01-01
In this project a system was built aimed at mitigating insider attacks centered around a high-dimensional search engine for correlating the large number of monitoring streams necessary for detecting insider attacks...
Using a High-Dimensional Graph of Semantic Space to Model Relationships among Words
Directory of Open Access Journals (Sweden)
Alice F Jackson
2014-05-01
Full Text Available The GOLD model (Graph Of Language Distribution) is a network model constructed based on co-occurrence in a large corpus of natural language that may be used to explore what information may be present in a graph-structured model of language, and what information may be extracted through theoretically-driven algorithms as well as standard graph analysis methods. The present study will employ GOLD to examine two types of relationship between words: semantic similarity and associative relatedness. Semantic similarity refers to the degree of overlap in meaning between words, while associative relatedness refers to the degree to which two words occur in the same schematic context. It is expected that a graph structured model of language constructed based on co-occurrence should easily capture associative relatedness, because this type of relationship is thought to be present directly in lexical co-occurrence. However, it is hypothesized that semantic similarity may be extracted from the intersection of the set of first-order connections, because two words that are semantically similar may occupy similar thematic or syntactic roles across contexts and thus would co-occur lexically with the same set of nodes. Two versions of the GOLD model that differed in terms of the co-occurrence window, bigGOLD at the paragraph level and smallGOLD at the adjacent word level, were directly compared to the performance of a well-established distributional model, Latent Semantic Analysis (LSA). The superior performance of the GOLD models (big and small) suggests that a single acquisition and storage mechanism, namely co-occurrence, can account for associative and conceptual relationships between words and is more psychologically plausible than models using singular value decomposition.
Using a high-dimensional graph of semantic space to model relationships among words.
Jackson, Alice F; Bolger, Donald J
2014-01-01
The GOLD model (Graph Of Language Distribution) is a network model constructed based on co-occurrence in a large corpus of natural language that may be used to explore what information may be present in a graph-structured model of language, and what information may be extracted through theoretically-driven algorithms as well as standard graph analysis methods. The present study will employ GOLD to examine two types of relationship between words: semantic similarity and associative relatedness. Semantic similarity refers to the degree of overlap in meaning between words, while associative relatedness refers to the degree to which two words occur in the same schematic context. It is expected that a graph structured model of language constructed based on co-occurrence should easily capture associative relatedness, because this type of relationship is thought to be present directly in lexical co-occurrence. However, it is hypothesized that semantic similarity may be extracted from the intersection of the set of first-order connections, because two words that are semantically similar may occupy similar thematic or syntactic roles across contexts and thus would co-occur lexically with the same set of nodes. Two versions of the GOLD model that differed in terms of the co-occurrence window, bigGOLD at the paragraph level and smallGOLD at the adjacent word level, were directly compared to the performance of a well-established distributional model, Latent Semantic Analysis (LSA). The superior performance of the GOLD models (big and small) suggests that a single acquisition and storage mechanism, namely co-occurrence, can account for associative and conceptual relationships between words and is more psychologically plausible than models using singular value decomposition (SVD).
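The distinction between the two relationships can be illustrated on a toy co-occurrence graph. In this sketch an edge (direct co-occurrence) stands in for associative relatedness, and the overlap of first-order neighbor sets stands in for semantic similarity. The adjacent-word window loosely follows the smallGOLD description, while the Jaccard overlap score and all function names are assumptions made here, not the measures used in the study.

```python
from collections import defaultdict


def cooccurrence_graph(sentences):
    """Build an undirected graph linking words that appear next to each other."""
    graph = defaultdict(set)
    for sentence in sentences:
        words = sentence.lower().split()
        for left, right in zip(words, words[1:]):
            if left != right:
                graph[left].add(right)
                graph[right].add(left)
    return graph


def neighbor_overlap(graph, a, b):
    """Jaccard overlap of the first-order neighbor sets of a and b."""
    na = graph.get(a, set()) - {b}
    nb = graph.get(b, set()) - {a}
    union = na | nb
    return len(na & nb) / len(union) if union else 0.0
```

For the two sentences "the cat drinks milk" and "the dog drinks water", the words "cat" and "dog" never co-occur, so they share no edge, yet their neighbor sets coincide and the overlap score is 1.0. "cat" and "drinks" are directly linked, which is the associative case.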
Bayesian Inference of High-Dimensional Dynamical Ocean Models
Lin, J.; Lermusiaux, P. F. J.; Lolla, S. V. T.; Gupta, A.; Haley, P. J., Jr.
2015-12-01
This presentation addresses a holistic set of challenges in high-dimension ocean Bayesian nonlinear estimation: (i) predict the probability distribution functions (pdfs) of large nonlinear dynamical systems using stochastic partial differential equations (PDEs); (ii) assimilate data using Bayes' law with these pdfs; (iii) predict the future data that optimally reduce uncertainties; and (iv) rank the known and learn the new model formulations themselves. Overall, we allow the joint inference of the state, equations, geometry, boundary conditions and initial conditions of dynamical models. Examples are provided for time-dependent fluid and ocean flows, including cavity, double-gyre and Strait flows with jets and eddies. The Bayesian model inference, based on limited observations, is illustrated first by the estimation of obstacle shapes and positions in fluid flows. Next, the Bayesian inference of biogeochemical reaction equations and of their states and parameters is presented, illustrating how PDE-based machine learning can rigorously guide the selection and discovery of complex ecosystem models. Finally, the inference of multiscale bottom gravity current dynamics is illustrated, motivated in part by classic overflows and dense water formation sites and their relevance to climate monitoring and dynamics. This is joint work with our MSEAS group at MIT.
International Nuclear Information System (INIS)
Zhang, Liangwei; Lin, Jing; Karim, Ramin
2015-01-01
The accuracy of traditional anomaly detection techniques implemented on full-dimensional spaces degrades significantly as dimensionality increases, thereby hampering many real-world applications. This work proposes an approach to selecting a meaningful feature subspace and conducting anomaly detection in the corresponding subspace projection. The aim is to maintain the detection accuracy in high-dimensional circumstances. The suggested approach assesses the angle between all pairs of two lines for one specific anomaly candidate: the first line connects the relevant data point and the center of its adjacent points; the other line is one of the axis-parallel lines. Those dimensions which have a relatively small angle with the first line are then chosen to constitute the axis-parallel subspace for the candidate. Next, a normalized Mahalanobis distance is introduced to measure the local outlier-ness of an object in the subspace projection. To comprehensively compare the proposed algorithm with several existing anomaly detection techniques, we constructed artificial datasets with various high-dimensional settings and found the algorithm displayed superior accuracy. A further experiment on an industrial dataset demonstrated the applicability of the proposed algorithm in fault detection tasks and highlighted another of its merits, namely, to provide preliminary interpretation of abnormality through feature ordering in relevant subspaces. - Highlights: • An anomaly detection approach for high-dimensional reliability data is proposed. • The approach selects relevant subspaces by assessing vectorial angles. • The novel ABSAD approach displays superior accuracy over other alternatives. • A numerical illustration confirms its efficacy in fault detection applications
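The angle-based selection step described in the abstract above can be illustrated with a minimal sketch. It is not the authors' ABSAD implementation: the function name `relevant_dimensions`, the fixed threshold `max_angle_deg`, and the assumption that the neighbors are already known are all hypothetical choices made for illustration, and the normalized Mahalanobis scoring step is omitted.

```python
import math

def relevant_dimensions(point, neighbors, max_angle_deg=60.0):
    """Pick the axes that form a small angle with the line joining
    `point` to the center of its neighbors (illustrative rule only)."""
    dim = len(point)
    # Center (mean) of the adjacent points.
    center = [sum(nb[j] for nb in neighbors) / len(neighbors) for j in range(dim)]
    # Direction from the neighborhood center to the candidate point.
    diff = [point[j] - center[j] for j in range(dim)]
    norm = math.sqrt(sum(d * d for d in diff))
    if norm == 0.0:
        return [], []  # the point coincides with the center: no direction
    # Angle between the direction and each coordinate axis, in degrees.
    angles = [math.degrees(math.acos(min(1.0, abs(d) / norm))) for d in diff]
    # Hypothetical rule: keep axes whose angle is below a fixed threshold.
    selected = [j for j, a in enumerate(angles) if a <= max_angle_deg]
    return selected, angles

neighbors = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (-1.0, 0.0, 0.0)]
selected, angles = relevant_dimensions((10.0, 0.1, 0.0), neighbors)
print(selected, [round(a, 2) for a in angles])
```

In this toy case the candidate deviates from its neighborhood almost entirely along the first axis, so only that axis is retained as the relevant subspace.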
Macroscopicity of quantum superpositions on a one-parameter unitary path in Hilbert space
Volkoff, T. J.; Whaley, K. B.
2014-12-01
We analyze quantum states formed as superpositions of an initial pure product state and its image under local unitary evolution, using two measurement-based measures of superposition size: one based on the optimal quantum binary distinguishability of the branches of the superposition and another based on the ratio of the maximal quantum Fisher information of the superposition to that of its branches, i.e., the relative metrological usefulness of the superposition. A general formula for the effective sizes of these states according to the branch-distinguishability measure is obtained and applied to superposition states of N quantum harmonic oscillators composed of Gaussian branches. Considering optimal distinguishability of pure states on a time-evolution path leads naturally to a notion of distinguishability time that generalizes the well-known orthogonalization times of Mandelstam and Tamm and Margolus and Levitin. We further show that the distinguishability time provides a compact operational expression for the superposition size measure based on the relative quantum Fisher information. By restricting the maximization procedure in the definition of this measure to an appropriate algebra of observables, we show that the superposition size of, e.g., NOON states and hierarchical cat states, can scale linearly with the number of elementary particles comprising the superposition state, implying precision scaling inversely with the total number of photons when these states are employed as probes in quantum parameter estimation of a 1-local Hamiltonian in this algebra.
X-ray Pulsars Across the Parameter Space of Luminosity, Accretion Mode, and Spin
Laycock, Silas
-ray Binary (HMXB) populations. Our unique library is already fueling progress on fundamental NS parameters and accretion physics.
Roberts, Arthur; Lhuillier, Andrew; Liu, Yi; Ruggiu, Alessandra; Shi, Yufang
Elucidation of the effects of space flight on the immune system of astronauts and other animal species is important for the survival and success of manned space flight, especially long-term missions. Space flight exposes astronauts to microgravity, galactic cosmic radiation (GCR), and various psycho-social stressors. Blood samples from astronauts returning from space flight have shown changes in the numbers and types of circulating leukocytes. Similarly, normal lymphocyte homeostasis has been shown to be severely affected in mice using ground-based models of microgravity and GCR exposure, as demonstrated by profound effects on several immunological parameters examined by other investigators and ourselves. In particular, lymphocyte numbers are significantly reduced and subpopulation distribution is altered in the spleen, thymus, and peripheral blood following hindlimb unloading (HU) in mice. Lymphocyte depletion was found to be mediated through corticosteroid-induced apoptosis, although the molecular mechanism of apoptosis induction is still under investigation. The proliferative capacity of TCR-stimulated lymphocytes was also inhibited after HU. We have similarly shown that mice exposed to high-energy 56Fe ion radiation have decreased lymphocyte numbers and perturbations in proportions of various subpopulations, including CD4+ and CD8+ T cells, and B cells in the spleen, and maturation stages of immature T cells in the thymus. To compare these ground-based results to the effects of actual space-flight, fresh spleen and thymus samples were recently obtained from normal and transgenic mice immediately after 90 d. space-flight in the MDS, and identically-housed ground control mice. Total leukocyte numbers in each organ were enumerated, and subpopulation distribution was examined by flow cytometric analysis of CD3, CD4, CD8, CD19, CD25, DX-5, and CD11b. Splenic T cells were stimulated with anti-CD3 and assessed for proliferation after 2-4 d., and production of
Energy Technology Data Exchange (ETDEWEB)
Zawadzka-Kazimierczuk, Anna; Kozminski, Wiktor [University of Warsaw, Faculty of Chemistry (Poland); Billeter, Martin, E-mail: martin.billeter@chem.gu.se [University of Gothenburg, Biophysics Group, Department of Chemistry and Molecular Biology (Sweden)
2012-09-15
While NMR studies of proteins typically aim at structure, dynamics or interactions, resonance assignments represent in almost all cases the initial step of the analysis. With increasing complexity of the NMR spectra, for example due to decreasing extent of ordered structure, this task often becomes both difficult and time-consuming, and the recording of high-dimensional data with high resolution may be essential. Random sampling of the evolution time space, combined with sparse multidimensional Fourier transform (SMFT), allows for efficient recording of very high dimensional spectra (≥4 dimensions) while maintaining high resolution. However, the nature of these data demands automation of the assignment process. Here we present the program TSAR (Tool for SMFT-based Assignment of Resonances), which exploits all advantages of SMFT input. Moreover, its flexibility allows processing of data from any type of experiment that provides sequential connectivities. The algorithm was tested on several protein samples, including a disordered 81-residue fragment of the δ subunit of RNA polymerase from Bacillus subtilis containing various repetitive sequences. For our test examples, TSAR achieves a high percentage of assigned residues without any erroneous assignments.
Wang, Xueyi
2012-02-08
The k-nearest neighbors (k-NN) algorithm is a widely used machine learning method that finds nearest neighbors of a test object in a feature space. We present a new exact k-NN algorithm called kMkNN (k-Means for k-Nearest Neighbors) that uses k-means clustering and the triangle inequality to accelerate the search for nearest neighbors in a high dimensional space. The kMkNN algorithm has two stages. In the buildup stage, instead of using complex tree structures such as metric trees, kd-trees, or ball-trees, kMkNN uses a simple k-means clustering method to preprocess the training dataset. In the searching stage, given a query object, kMkNN finds nearest training objects starting from the nearest cluster to the query object and uses the triangle inequality to reduce the distance calculations. Experiments show that the performance of kMkNN is surprisingly good compared to the traditional k-NN algorithm and tree-based k-NN algorithms such as kd-trees and ball-trees. On a collection of 20 datasets with up to 10^6 records and 10^4 dimensions, kMkNN shows a 2- to 80-fold reduction of distance calculations and a 2- to 60-fold speedup over the traditional k-NN algorithm for 16 datasets. Furthermore, kMkNN performs significantly better than a kd-tree based k-NN algorithm for all datasets and performs better than a ball-tree based k-NN algorithm for most datasets. The results show that kMkNN is effective for searching nearest neighbors in high dimensional spaces.
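The two stages described in the abstract above can be sketched in a few lines of pure Python. This is a hedged illustration of the general idea (cluster with k-means, then prune with the triangle inequality), not the published kMkNN code: the helper names `kmeans`, `build`, and `knn`, the fixed iteration count, and the specific pruning tests are assumptions made for this sketch.

```python
import heapq
import math
import random

def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def kmeans(points, n_clusters, n_iter=20, seed=0):
    """Plain Lloyd iterations; returns the cluster centers."""
    rng = random.Random(seed)
    centers = rng.sample(points, n_clusters)
    for _ in range(n_iter):
        groups = [[] for _ in centers]
        for p in points:
            j = min(range(len(centers)), key=lambda c: dist(p, centers[c]))
            groups[j].append(p)
        centers = [[sum(col) / len(g) for col in zip(*g)] if g else centers[j]
                   for j, g in enumerate(groups)]
    return centers

def build(points, n_clusters):
    """Buildup stage: store each point with its distance to its center."""
    centers = kmeans(points, n_clusters)
    clusters = [[] for _ in centers]
    for p in points:
        j = min(range(len(centers)), key=lambda c: dist(p, centers[c]))
        clusters[j].append((dist(p, centers[j]), p))
    for c in clusters:
        c.sort(key=lambda t: t[0], reverse=True)  # farthest from center first
    return centers, clusters

def knn(query, centers, clusters, k):
    """Search stage: visit clusters nearest-first and skip points that the
    triangle inequality d(q,p) >= |d(q,c) - d(p,c)| rules out."""
    order = sorted(range(len(centers)), key=lambda j: dist(query, centers[j]))
    heap, n_dist = [], 0  # max-heap of (-distance, point)
    for j in order:
        d_qc = dist(query, centers[j])
        for d_pc, p in clusters[j]:
            if len(heap) == k:
                worst = -heap[0][0]
                if d_qc - d_pc >= worst:
                    break      # remaining points are even closer to the center
                if d_pc - d_qc >= worst:
                    continue   # this point cannot beat the current k-th best
            d = dist(query, p)
            n_dist += 1
            if len(heap) < k:
                heapq.heappush(heap, (-d, p))
            elif d < -heap[0][0]:
                heapq.heapreplace(heap, (-d, p))
    return sorted((-nd, p) for nd, p in heap), n_dist
```

A call such as `knn(q, *build(points, 8), k=5)` returns the five nearest points with their distances, plus the number of point-to-query distances actually evaluated, which can be compared with `len(points)` for a brute-force scan.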
Krenn, Julia; Zangerl, Christian; Mergili, Martin
2017-04-01
r.randomwalk is a GIS-based, multi-functional, conceptual open source model application for forward and backward analyses of the propagation of mass flows. It relies on a set of empirically derived, uncertain input parameters. In contrast to many other tools, r.randomwalk accepts input parameter ranges (or, in case of two or more parameters, spaces) in order to directly account for these uncertainties. Parameter spaces offer a way to move away from discrete input values, which in most cases are likely to be off target. r.randomwalk automatically performs multiple calculations with various parameter combinations in a given parameter space, resulting in the impact indicator index (III), which denotes the fraction of parameter value combinations predicting an impact on a given pixel. Still, there is a need to constrain the parameter space used for a certain process type or magnitude prior to performing forward calculations. This can be done by optimizing the parameter space in terms of bringing the model results in line with well-documented past events. As most existing parameter optimization algorithms are designed for discrete values rather than for ranges or spaces, the necessity for a new and innovative technique arises. The present study aims at developing such a technique and at applying it to derive guiding parameter spaces for the forward calculation of rock avalanches through back-calculation of multiple events. In order to automate the workflow we have designed r.ranger, an optimization and sensitivity analysis tool for parameter spaces which can be directly coupled to r.randomwalk. With r.ranger we apply a nested approach where the total value range of each parameter is divided into various levels of subranges. All possible combinations of subranges of all parameters are tested for the performance of the associated pattern of III. Performance indicators are the area under the ROC curve (AUROC) and the factor of conservativeness (FoC). This
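As a rough illustration of the impact indicator index and the AUROC score mentioned in the abstract above, the following sketch evaluates a toy model over a grid of parameter combinations. It does not use or reproduce r.randomwalk or r.ranger: `toy_model`, the regular sampling of each range, and the one-dimensional row of pixels are hypothetical stand-ins, and the factor of conservativeness is left out because the abstract does not define it.

```python
from itertools import product

def impact_indicator_index(model, param_ranges, n_levels=5):
    """Per-pixel fraction of parameter combinations that predict an impact."""
    # Sample each (low, high) range at n_levels evenly spaced values.
    grids = [[lo + (hi - lo) * i / (n_levels - 1) for i in range(n_levels)]
             for lo, hi in param_ranges]
    counts, n_runs = None, 0
    for combo in product(*grids):
        impacted = model(*combo)  # one boolean per pixel
        if counts is None:
            counts = [0] * len(impacted)
        for i, hit in enumerate(impacted):
            counts[i] += bool(hit)
        n_runs += 1
    return [c / n_runs for c in counts]

def auroc(scores, observed):
    """Probability that an observed-impact pixel outscores a non-impact one."""
    pos = [s for s, o in zip(scores, observed) if o]
    neg = [s for s, o in zip(scores, observed) if not o]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def toy_model(reach, spread):
    # Hypothetical stand-in for a mass-flow run on a row of 10 pixels.
    return [i <= reach + spread for i in range(10)]

iii = impact_indicator_index(toy_model, [(2.0, 6.0), (0.0, 2.0)], n_levels=3)
observed = [i <= 5 for i in range(10)]  # made-up "documented" impact area
print([round(v, 2) for v in iii], auroc(iii, observed))
```

Pixels close to the release point are hit by every combination (index 1.0), distant pixels by none (index 0.0), and the AUROC measures how well the index separates observed from unobserved impact pixels.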
Jia, Bing
2014-03-01
A comb-shaped chaotic region has been simulated in multiple two-dimensional parameter spaces using the Hindmarsh-Rose (HR) neuron model in many recent studies, which can interpret almost all of the previously simulated bifurcation processes with chaos in neural firing patterns. In the present paper, a comb-shaped chaotic region in a two-dimensional parameter space was reproduced, which presented different processes of period-adding bifurcations with chaos when one parameter was varied and the other was fixed at different levels. In the biological experiments, different period-adding bifurcation scenarios with chaos by decreasing the extra-cellular calcium concentration were observed from some neural pacemakers at different levels of extra-cellular 4-aminopyridine concentration and from other pacemakers at different levels of extra-cellular caesium concentration. By using the nonlinear time series analysis method, the deterministic dynamics of the experimental chaotic firings were investigated. The period-adding bifurcations with chaos observed in the experiments resembled those simulated in the comb-shaped chaotic region using the HR model. The experimental results show that period-adding bifurcations with chaos are preserved in different two-dimensional parameter spaces, which provides evidence of the existence of the comb-shaped chaotic region and a demonstration of the simulation results in different two-dimensional parameter spaces in the HR neuron model. The results also present relationships between different firing patterns in two-dimensional parameter spaces.
Engineering two-photon high-dimensional states through quantum interference
Zhang, Yingwen; Roux, Filippus S.; Konrad, Thomas; Agnew, Megan; Leach, Jonathan; Forbes, Andrew
2016-01-01
Many protocols in quantum science, for example, linear optical quantum computing, require access to large-scale entangled quantum states. Such systems can be realized through many-particle qubits, but this approach often suffers from scalability problems. An alternative strategy is to consider a lesser number of particles that exist in high-dimensional states. The spatial modes of light are one such candidate that provides access to high-dimensional quantum states, and thus they increase the storage and processing potential of quantum information systems. We demonstrate the controlled engineering of two-photon high-dimensional states entangled in their orbital angular momentum through Hong-Ou-Mandel interference. We prepare a large range of high-dimensional entangled states and implement precise quantum state filtering. We characterize the full quantum state before and after the filter, and are thus able to determine that only the antisymmetric component of the initial state remains. This work paves the way for high-dimensional processing and communication of multiphoton quantum states, for example, in teleportation beyond qubits. PMID:26933685
A Comparison of Methods for Estimating the Determinant of High-Dimensional Covariance Matrix
Hu, Zongliang
2017-09-27
The determinant of the covariance matrix for high-dimensional data plays an important role in statistical inference and decision. It has many real applications including statistical tests and information theory. Due to the statistical and computational challenges with high dimensionality, little work has been proposed in the literature for estimating the determinant of high-dimensional covariance matrix. In this paper, we estimate the determinant of the covariance matrix using some recent proposals for estimating high-dimensional covariance matrix. Specifically, we consider a total of eight covariance matrix estimation methods for comparison. Through extensive simulation studies, we explore and summarize some interesting comparison results among all compared methods. We also provide practical guidelines based on the sample size, the dimension, and the correlation of the data set for estimating the determinant of high-dimensional covariance matrix. Finally, from a perspective of the loss function, the comparison study in this paper may also serve as a proxy to assess the performance of the covariance matrix estimation.
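To make the difficulty concrete, the sketch below shows why the plain sample covariance fails for determinant estimation when the dimension exceeds the sample size, and how a simple shrinkage toward the diagonal restores a finite log-determinant. This is a generic illustration, not one of the eight estimators compared in the paper (the abstract does not name them); the function names and the fixed shrinkage weight are assumptions.

```python
import math
import random

def sample_covariance(rows):
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    return [[sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / (n - 1)
             for j in range(p)] for i in range(p)]

def log_det_spd(matrix, tol=1e-12):
    """Log-determinant via Cholesky; None if not numerically positive definite."""
    p = len(matrix)
    chol = [[0.0] * p for _ in range(p)]
    log_det = 0.0
    for i in range(p):
        for j in range(i + 1):
            s = matrix[i][j] - sum(chol[i][k] * chol[j][k] for k in range(j))
            if i == j:
                if s <= tol:
                    return None  # singular or indefinite pivot
                chol[i][i] = math.sqrt(s)
                log_det += 2.0 * math.log(chol[i][i])
            else:
                chol[i][j] = s / chol[j][j]
    return log_det

def shrink_to_diagonal(cov, weight):
    """(1 - weight) * cov + weight * diag(cov): off-diagonals are damped."""
    p = len(cov)
    return [[cov[i][j] if i == j else (1.0 - weight) * cov[i][j]
             for j in range(p)] for i in range(p)]

rng = random.Random(0)
data = [[rng.gauss(0.0, 1.0) for _ in range(10)] for _ in range(5)]  # n=5, p=10
cov = sample_covariance(data)
print(log_det_spd(cov))                           # None: rank is at most n - 1
print(log_det_spd(shrink_to_diagonal(cov, 0.5)))  # finite value
```

With five observations in ten dimensions the sample covariance has rank at most four, so its determinant is zero; the shrunken matrix is positive definite for any weight in (0, 1], but its log-determinant depends on the weight chosen, which is part of what makes the estimation problem delicate.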
A Comparison of Methods for Estimating the Determinant of High-Dimensional Covariance Matrix.
Hu, Zongliang; Dong, Kai; Dai, Wenlin; Tong, Tiejun
2017-09-21
The determinant of the covariance matrix for high-dimensional data plays an important role in statistical inference and decision. It has many real applications including statistical tests and information theory. Due to the statistical and computational challenges with high dimensionality, little work has been proposed in the literature for estimating the determinant of high-dimensional covariance matrix. In this paper, we estimate the determinant of the covariance matrix using some recent proposals for estimating high-dimensional covariance matrix. Specifically, we consider a total of eight covariance matrix estimation methods for comparison. Through extensive simulation studies, we explore and summarize some interesting comparison results among all compared methods. We also provide practical guidelines based on the sample size, the dimension, and the correlation of the data set for estimating the determinant of high-dimensional covariance matrix. Finally, from a perspective of the loss function, the comparison study in this paper may also serve as a proxy to assess the performance of the covariance matrix estimation.
Gomez, Luis J; Yücel, Abdulkadir C; Hernandez-Garcia, Luis; Taylor, Stephan F; Michielssen, Eric
2015-01-01
A computational framework for uncertainty quantification in transcranial magnetic stimulation (TMS) is presented. The framework leverages high-dimensional model representations (HDMRs), which approximate observables (i.e., quantities of interest such as electric (E) fields induced inside targeted cortical regions) via series of iteratively constructed component functions involving only the most significant random variables (i.e., parameters that characterize the uncertainty in a TMS setup such as the position and orientation of TMS coils, as well as the size, shape, and conductivity of the head tissue). The component functions of HDMR expansions are approximated via a multielement probabilistic collocation (ME-PC) method. While approximating each component function, a quasi-static finite-difference simulator is used to compute observables at integration/collocation points dictated by the ME-PC method. The proposed framework requires far fewer simulations than traditional Monte Carlo methods for providing highly accurate statistical information (e.g., the mean and standard deviation) about the observables. The efficiency and accuracy of the proposed framework are demonstrated via its application to the statistical characterization of E-fields generated by TMS inside cortical regions of an MRI-derived realistic head model. Numerical results show that while uncertainties in tissue conductivities have negligible effects on TMS operation, variations in coil position/orientation and brain size significantly affect the induced E-fields. Our numerical results have several implications for the use of TMS during depression therapy: 1) uncertainty in the coil position and orientation may reduce the response rates of patients; 2) practitioners should favor targets on the crest of a gyrus to obtain maximal stimulation; and 3) an increasing scalp-to-cortex distance reduces the magnitude of E-fields on the surface and inside the cortex.
An Autonomous Sensor Tasking Approach for Large Scale Space Object Cataloging
Linares, R.; Furfaro, R.
The field of Space Situational Awareness (SSA) has progressed over the last few decades with new sensors coming online, the development of new approaches for making observations, and new algorithms for processing them. Although there has been success in the development of new approaches, a missing piece is the translation of SSA goals to sensors and resource allocation, otherwise known as the Sensor Management Problem (SMP). This work solves the SMP using an artificial intelligence approach called Deep Reinforcement Learning (DRL). Stable methods for training DRL approaches based on neural networks exist, but most of these approaches are not suitable for high dimensional systems. The Asynchronous Advantage Actor-Critic (A3C) method is a recently developed and effective approach for high dimensional systems, and this work leverages these results and applies this approach to decision making in SSA. The decision space for the SSA problems can be high dimensional, even for tasking of a single telescope. Since the number of SOs in space is relatively high, each sensor will have a large number of possible actions at a given time. Therefore, efficient DRL approaches are required when solving the SMP for SSA. This work develops an A3C based method for DRL applied to SSA sensor tasking. One of the key benefits of DRL approaches is the ability to handle high dimensional data. For example, DRL methods have been applied to image processing for autonomous cars: a 256x256 RGB image has 196608 parameters (256*256*3=196608), which is very high dimensional, and deep learning approaches routinely take images like this as inputs. Therefore, when applied to the whole catalog the DRL approach offers the ability to solve this high dimensional problem. This work has the potential to, for the first time, solve the non-myopic sensor tasking problem for the whole SO catalog (over 22,000 objects) providing a truly revolutionary result.
Tsutagawa, Michael H.; Michael, Sherif
2009-01-01
This paper presents the design parameters for a triple junction InGaP/GaAs/Ge space solar cell with a simulated maximum efficiency of 36.28% using Silvaco ATLAS Virtual Wafer Fabrication tool. Design parameters include the layer material, doping concentration, and thicknesses.
Wang, S.; Huang, G. H.; Huang, W.; Fan, Y. R.; Li, Z.
2015-10-01
In this study, a fractional factorial probabilistic collocation method is proposed to reveal statistical significance of hydrologic model parameters and their multi-level interactions affecting model outputs, facilitating uncertainty propagation in a reduced dimensional space. The proposed methodology is applied to the Xiangxi River watershed in China to demonstrate its validity and applicability, as well as its capability of revealing complex and dynamic parameter interactions. A set of reduced polynomial chaos expansions (PCEs) only with statistically significant terms can be obtained based on the results of factorial analysis of variance (ANOVA), achieving a reduction of uncertainty in hydrologic predictions. The predictive performance of reduced PCEs is verified by comparing against standard PCEs and the Monte Carlo with Latin hypercube sampling (MC-LHS) method in terms of reliability, sharpness, and Nash-Sutcliffe efficiency (NSE). Results reveal that the reduced PCEs are able to capture hydrologic behaviors of the Xiangxi River watershed, and they are efficient functional representations for propagating uncertainties in hydrologic predictions.
Alexander, LYSENKO; Iurii, VOLK
2018-03-01
We developed a cubic non-linear theory describing the dynamics of the multiharmonic space-charge wave (SCW), with harmonics frequencies smaller than the two-stream instability critical frequency, with different relativistic electron beam (REB) parameters. The self-consistent differential equation system for multiharmonic SCW harmonic amplitudes was elaborated in a cubic non-linear approximation. This system considers plural three-wave parametric resonant interactions between wave harmonics and the two-stream instability effect. Different REB parameters such as the input angle with respect to focusing magnetic field, the average relativistic factor value, difference of partial relativistic factors, and plasma frequency of partial beams were investigated regarding their influence on the frequency spectrum width and multiharmonic SCW saturation levels. We suggested ways in which the multiharmonic SCW frequency spectrum widths could be increased in order to use them in multiharmonic two-stream superheterodyne free-electron lasers, with the main purpose of forming a powerful multiharmonic electromagnetic wave.
Linear stability theory as an early warning sign for transitions in high dimensional complex systems
International Nuclear Information System (INIS)
Piovani, Duccio; Grujić, Jelena; Jensen, Henrik Jeldtoft
2016-01-01
We analyse in detail a new approach to the monitoring and forecasting of the onset of transitions in high dimensional complex systems by application to the Tangled Nature model of evolutionary ecology and high dimensional replicator systems with a stochastic element. A high dimensional stability matrix is derived in the mean field approximation to the stochastic dynamics. This allows us to determine the stability spectrum about the observed quasi-stable configurations. From overlap of the instantaneous configuration vector of the full stochastic system with the eigenvectors of the unstable directions of the deterministic mean field approximation, we are able to construct a good early-warning indicator of the transitions occurring intermittently. (paper)
Fickler, Robert; Lapkiewicz, Radek; Huber, Marcus; Lavery, Martin P J; Padgett, Miles J; Zeilinger, Anton
2014-07-30
Photonics has become a mature field of quantum information science, where integrated optical circuits offer a way to scale the complexity of the set-up as well as the dimensionality of the quantum state. On photonic chips, paths are the natural way to encode information. To distribute those high-dimensional quantum states over large distances, transverse spatial modes, like orbital angular momentum possessing Laguerre Gauss modes, are favourable as flying information carriers. Here we demonstrate a quantum interface between these two vibrant photonic fields. We create three-dimensional path entanglement between two photons in a nonlinear crystal and use a mode sorter as the quantum interface to transfer the entanglement to the orbital angular momentum degree of freedom. Thus our results show a flexible way to create high-dimensional spatial mode entanglement. Moreover, they pave the way to implement broad complex quantum networks where high-dimensionally entangled states could be distributed over distant photonic chips.
Directory of Open Access Journals (Sweden)
Thenmozhi Srinivasan
2015-01-01
Techniques for clustering high-dimensional data are emerging in response to the challenges posed by noisy and poor-quality data. This paper clusters data using a high-dimensional similarity-based PCM (SPCM) with ant colony optimization intelligence, which is effective in clustering nonspatial data without requiring the user to supply the number of clusters. The PCM is made similarity-based by combining it with the mountain method. Although this clustering is efficient, it is further checked for optimization using an ant colony algorithm with swarm intelligence. A scalable clustering technique is thus obtained, and the evaluation results are checked with synthetic datasets.
The validation and assessment of machine learning: a game of prediction from high-dimensional data
DEFF Research Database (Denmark)
Pers, Tune Hannes; Albrechtsen, A; Holst, C
2009-01-01
In applied statistics, tools from machine learning are popular for analyzing complex and high-dimensional data. However, few theoretical results are available that could guide to the appropriate machine learning tool in a new application. Initial development of an overall strategy thus often...... the ideas, the game is applied to data from the Nugenob Study where the aim is to predict the fat oxidation capacity based on conventional factors and high-dimensional metabolomics data. Three players have chosen to use support vector machines, LASSO, and random forests, respectively....
Precision Parameter Estimation and Machine Learning
Wandelt, Benjamin D.
2008-12-01
I discuss the strategy of "Acceleration by Parallel Precomputation and Learning" (APPLe) that can vastly accelerate parameter estimation in high-dimensional parameter spaces and costly likelihood functions, using trivially parallel computing to speed up sequential exploration of parameter space. This strategy combines the power of distributed computing with machine learning and Markov-Chain Monte Carlo techniques efficiently to explore a likelihood function, posterior distribution or χ²-surface. This strategy is particularly successful in cases where computing the likelihood is costly and the number of parameters is moderate or large. We apply this technique to two central problems in cosmology: the solution of the cosmological parameter estimation problem with sufficient accuracy for the Planck data using PICo; and the detailed calculation of cosmological helium and hydrogen recombination with RICO. Since the APPLe approach is designed to be able to use massively parallel resources to speed up problems that are inherently serial, we can bring the power of distributed computing to bear on parameter estimation problems. We have demonstrated this with the Cosmology@Home project.
Ravindranath, Swara; Ho, Luis C.; Peng, Chien Y.; Filippenko, Alexei V.; Sargent, Wallace L. W.
2001-08-01
We present surface photometry for the central regions of a sample of 33 early-type (E, S0, and S0/a) galaxies observed at 1.6 μm (H band) using the Hubble Space Telescope. Dust absorption has less of an impact on the galaxy morphologies in the near-infrared than found in previous work based on observations at optical wavelengths. When present, dust seems to be most commonly associated with optical line emission. We employ a new technique of two-dimensional fitting to extract quantitative parameters for the bulge light distribution and nuclear point sources, taking into consideration the effects of the point-spread function. By parameterizing the bulge profile with a Nuker law, we confirm that the central surface brightness distributions largely fall into two categories, each of which correlates with the global properties of the galaxies. "Core" galaxies tend to be luminous elliptical galaxies with boxy or pure elliptical isophotes, whereas "power-law" galaxies are preferentially lower luminosity systems with disky isophotes. The infrared surface brightness profiles are very similar to those in the optical, with notable exceptions being very dusty objects. Similar to the study of Faber et al., based on optical data, we find that galaxy cores obey a set of fundamental plane relations wherein more luminous galaxies with higher central stellar velocity dispersions generally possess larger cores with lower surface brightnesses. Unlike most previous studies, however, we do not find a clear gap in the distribution of inner cusp slopes; several objects have inner cusp slopes (0.3law galaxies. The nature of these intermediate objects is unclear. We draw attention to two objects in the sample that appear to be promising cases of galaxies with isothermal cores that are not the brightest members of a cluster. Unresolved nuclear point sources are found in ~50% of the sample galaxies, roughly independent of profile type, with magnitudes in the range m_H^nuc = 12.8 to 17.4 mag
An irregular grid approach for pricing high-dimensional American options
Berridge, S.J.; Schumacher, J.M.
2008-01-01
We propose and test a new method for pricing American options in a high-dimensional setting. The method is centered around the approximation of the associated complementarity problem on an irregular grid. We approximate the partial differential operator on this grid by appealing to the SDE
CSIR Research Space (South Africa)
Giovannini, D
2013-06-01
QELS_Fundamental Science, San Jose, California, United States, 9-14 June 2013. Reconstruction of High-Dimensional States Entangled in Orbital Angular Momentum Using Mutually Unbiased Measurements. D. Giovannini, J. Romero, J. Leach, A...
Global communication schemes for the numerical solution of high-dimensional PDEs
DEFF Research Database (Denmark)
Hupp, Philipp; Heene, Mario; Jacob, Riko
2016-01-01
The numerical treatment of high-dimensional partial differential equations is among the most compute-hungry problems and in urgent need for current and future high-performance computing (HPC) systems. It is thus also facing the grand challenges of exascale computing such as the requirement...
High-Dimensional Exploratory Item Factor Analysis by a Metropolis-Hastings Robbins-Monro Algorithm
Cai, Li
2010-01-01
A Metropolis-Hastings Robbins-Monro (MH-RM) algorithm for high-dimensional maximum marginal likelihood exploratory item factor analysis is proposed. The sequence of estimates from the MH-RM algorithm converges with probability one to the maximum likelihood solution. Details on the computer implementation of this algorithm are provided. The…
Estimating the effect of a variable in a high-dimensional regression model
DEFF Research Database (Denmark)
Jensen, Peter Sandholt; Wurtz, Allan
assume that the effect is identified in a high-dimensional linear model specified by unconditional moment restrictions. We consider properties of the following methods, which rely on lowdimensional models to infer the effect: Extreme bounds analysis, the minimum t-statistic over models, Sala...
Multi-Scale Factor Analysis of High-Dimensional Brain Signals
Ting, Chee-Ming; Ombao, Hernando; Salleh, Sh-Hussain
2017-01-01
In this paper, we develop an approach to modeling high-dimensional networks with a large number of nodes arranged in a hierarchical and modular structure. We propose a novel multi-scale factor analysis (MSFA) model which partitions the massive
Spectrally-Corrected Estimation for High-Dimensional Markowitz Mean-Variance Optimization
Z. Bai (Zhidong); H. Li (Hua); M.J. McAleer (Michael); W.-K. Wong (Wing-Keung)
2016-01-01
This paper considers the portfolio problem for high dimensional data when the dimension and size are both large. We analyze the traditional Markowitz mean-variance (MV) portfolio by large dimension matrix theory, and find the spectral distribution of the sample covariance is the main
Berridge, S.J.; Schumacher, J.M.
2004-01-01
We propose a method for pricing high-dimensional American options on an irregular grid; the method involves using quadratic functions to approximate the local effect of the Black-Scholes operator. Once such an approximation is known, one can solve the pricing problem by time stepping in an explicit
Multigrid for high dimensional elliptic partial differential equations on non-equidistant grids
bin Zubair, H.; Oosterlee, C.E.; Wienands, R.
2006-01-01
This work presents techniques, theory and numbers for multigrid in a general d-dimensional setting. The main focus is the multigrid convergence for high-dimensional partial differential equations (PDEs). As a model problem we have chosen the anisotropic diffusion equation, on a unit hypercube. We
An Irregular Grid Approach for Pricing High-Dimensional American Options
Berridge, S.J.; Schumacher, J.M.
2004-01-01
We propose and test a new method for pricing American options in a high-dimensional setting. The method is centred around the approximation of the associated complementarity problem on an irregular grid. We approximate the partial differential operator on this grid by appealing to the SDE
Pricing and hedging high-dimensional American options : an irregular grid approach
Berridge, S.; Schumacher, H.
2002-01-01
We propose and test a new method for pricing American options in a high dimensional setting. The method is centred around the approximation of the associated variational inequality on an irregular grid. We approximate the partial differential operator on this grid by appealing to the SDE
Regulation of NF-κB oscillation by spatial parameters in true intracellular space (TiCS)
Ohshima, Daisuke; Sagara, Hiroshi; Ichikawa, Kazuhisa
2013-10-01
Transcription factor NF-κB is activated by cytokine stimulation, viral infection, or hypoxic environment leading to its translocation to the nucleus. The nuclear NF-κB is exported from the nucleus to the cytoplasm again, and by repetitive import and export, NF-κB shows damped oscillation with the period of 1.5-2.0 h. Oscillation pattern of NF-κB is thought to determine the gene expression profile. We published a report on a computational simulation for the oscillation of nuclear NF-κB in a 3D spherical cell, and showed the importance of spatial parameters such as diffusion coefficient and locus of translation for determining the oscillation pattern. Although the value of diffusion coefficient is inherent to protein species, its effective value can be modified by organelle crowding in intracellular space. Here we tested this possibility by computer simulation. The results indicate that the effective value of diffusion coefficient is significantly changed by the organelle crowding, and this alters the oscillation pattern of nuclear NF-κB.
International Nuclear Information System (INIS)
Liu, W; Sawant, A; Ruan, D
2016-01-01
Purpose: The development of high dimensional imaging systems (e.g. volumetric MRI, CBCT, photogrammetry systems) in image-guided radiotherapy provides important pathways to the ultimate goal of real-time volumetric/surface motion monitoring. This study aims to develop a prediction method for the high dimensional state subject to respiratory motion. Compared to conventional linear dimension reduction based approaches, our method utilizes manifold learning to construct a descriptive feature submanifold, where more efficient and accurate prediction can be performed. Methods: We developed a prediction framework for high-dimensional state subject to respiratory motion. The proposed method performs dimension reduction in a nonlinear setting to permit more descriptive features compared to its linear counterparts (e.g., classic PCA). Specifically, a kernel PCA is used to construct a proper low-dimensional feature manifold, where low-dimensional prediction is performed. A fixed-point iterative pre-image estimation method is applied subsequently to recover the predicted value in the original state space. We evaluated and compared the proposed method with PCA-based method on 200 level-set surfaces reconstructed from surface point clouds captured by the VisionRT system. The prediction accuracy was evaluated with respect to root-mean-squared-error (RMSE) for both 200ms and 600ms lookahead lengths. Results: The proposed method outperformed PCA-based approach with statistically higher prediction accuracy. In one-dimensional feature subspace, our method achieved mean prediction accuracy of 0.86mm and 0.89mm for 200ms and 600ms lookahead lengths respectively, compared to 0.95mm and 1.04mm from PCA-based method. The paired t-tests further demonstrated the statistical significance of the superiority of our method, with p-values of 6.33e-3 and 5.78e-5, respectively. Conclusion: The proposed approach benefits from the descriptiveness of a nonlinear manifold and the prediction
International Nuclear Information System (INIS)
Ershov-Pavlov, E.A.; Katsalap, K.Yu.; Stepanov, K.L.; Stankevich, Yu.A.
2008-01-01
A physical model is developed accounting for the dynamics and radiation of plasma plumes induced by nanosecond laser pulses on the surface of solid samples. The model has been applied to simulate emission spectra of the laser erosion plasma in the elemental analysis of metals using single- and double-pulse excitation modes. Dynamics of the sample heating and expansion of the erosion products are accounted for by the thermal conductivity and gas dynamic equations, respectively, supposing axial symmetry. Using the resulting time-space distributions of the plasma parameters, emission spectra of the laser plumes are evaluated by solving the radiation transfer equation. Particle concentration in consecutive ionization stages is described by the Saha equation in the Debye approximation. The population of excited levels is determined according to the Boltzmann distribution. Local characteristics determining spectral emission and absorption coefficients are obtained point-by-point along an observation line. Voigt spectral line profiles are considered with the main broadening mechanisms taken into account. The plasma dynamics and plume emission spectra have been studied experimentally and by the model. A Q-switched Nd:YAG laser at 1064 nm wavelength has been used to irradiate an Al sample with pulses of 15 ns duration and 50 mJ energy. This has resulted in a maximum power density of 0.8 MW/cm^2 on the sample surface. The laser plume emission spectra have been recorded at a side-on observation. Problems of the spectra contrast and of the elemental analysis efficiency are considered relying on a comparative study of the measurement and simulation results for both excitation modes
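As a rough illustration of the Boltzmann step mentioned in this abstract, the sketch below computes the population ratio of two excited levels at a given temperature. It is not the authors' model: the function name, the statistical weights, and the 3 eV level spacing are hypothetical values chosen only for the example.

```python
import math

K_B_EV = 8.617333262e-5  # Boltzmann constant in eV/K


def boltzmann_ratio(g_upper, g_lower, delta_e_ev, temperature_k):
    """Population ratio n_upper / n_lower for two levels separated by delta_e_ev (eV)."""
    return (g_upper / g_lower) * math.exp(-delta_e_ev / (K_B_EV * temperature_k))


# Hypothetical levels: equal statistical weights, 3 eV apart.
ratio_10kK = boltzmann_ratio(1, 1, 3.0, 10_000)
ratio_15kK = boltzmann_ratio(1, 1, 3.0, 15_000)
print(ratio_10kK, ratio_15kK)  # the upper level is more populated in the hotter plasma
```

With zero energy gap the ratio reduces to the ratio of statistical weights, which is a quick sanity check on the formula.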
Bit-Table Based Biclustering and Frequent Closed Itemset Mining in High-Dimensional Binary Data
Directory of Open Access Journals (Sweden)
András Király
2014-01-01
During the last decade various algorithms have been developed and proposed for discovering overlapping clusters in high-dimensional data. The two most prominent application fields in this research, proposed independently, are frequent itemset mining (developed for market basket data) and biclustering (applied to gene expression data analysis). The common limitation of both methodologies is the limited applicability for very large binary data sets. In this paper we propose a novel and efficient method to find both frequent closed itemsets and biclusters in high-dimensional binary data. The method is based on simple but very powerful matrix and vector multiplication approaches that ensure that all patterns can be discovered in a fast manner. The proposed algorithm has been implemented in the commonly used MATLAB environment and is freely available for researchers.
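The idea of mining a binary table with matrix products can be sketched as follows. This is only a toy illustration in Python, not the MATLAB implementation the abstract refers to: the item names, the data, and both helper functions are hypothetical, and only item pairs (not full closed itemsets) are counted.

```python
# Hypothetical toy data: rows are transactions, columns are items (1 = present).
ITEMS = ["bread", "milk", "eggs", "tea"]
BIT_TABLE = [
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 1, 1, 0],
    [1, 1, 0, 1],
]


def cooccurrence(bit_table):
    """Return B^T B: entry (i, j) counts the rows containing both item i and item j."""
    n_items = len(bit_table[0])
    counts = [[0] * n_items for _ in range(n_items)]
    for row in bit_table:
        for i in range(n_items):
            if row[i]:
                for j in range(n_items):
                    counts[i][j] += row[j]
    return counts


def frequent_pairs(bit_table, min_support):
    """List (i, j, support) for item pairs i < j whose support meets min_support."""
    counts = cooccurrence(bit_table)
    n_items = len(counts)
    return [
        (i, j, counts[i][j])
        for i in range(n_items)
        for j in range(i + 1, n_items)
        if counts[i][j] >= min_support
    ]


for i, j, support in frequent_pairs(BIT_TABLE, min_support=2):
    print(ITEMS[i], ITEMS[j], support)
```

The diagonal of the product holds the single-item supports, so one multiplication yields the supports of all 1- and 2-itemsets at once.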
Statistical Analysis for High-Dimensional Data : The Abel Symposium 2014
Bühlmann, Peter; Glad, Ingrid; Langaas, Mette; Richardson, Sylvia; Vannucci, Marina
2016-01-01
This book features research contributions from The Abel Symposium on Statistical Analysis for High Dimensional Data, held in Nyvågar, Lofoten, Norway, in May 2014. The focus of the symposium was on statistical and machine learning methodologies specifically developed for inference in “big data” situations, with particular reference to genomic applications. The contributors, who are among the most prominent researchers on the theory of statistics for high dimensional inference, present new theories and methods, as well as challenging applications and computational solutions. Specific themes include, among others, variable selection and screening, penalised regression, sparsity, thresholding, low dimensional structures, computational challenges, non-convex situations, learning graphical models, sparse covariance and precision matrices, semi- and non-parametric formulations, multiple testing, classification, factor models, clustering, and preselection. Highlighting cutting-edge research and casting light on...
Su, Yapeng; Shi, Qihui; Wei, Wei
2017-02-01
New insights on cellular heterogeneity in the last decade provoke the development of a variety of single cell omics tools at a lightning pace. The resultant high-dimensional single cell data generated by these tools require new theoretical approaches and analytical algorithms for effective visualization and interpretation. In this review, we briefly survey the state-of-the-art single cell proteomic tools with a particular focus on data acquisition and quantification, followed by an elaboration of a number of statistical and computational approaches developed to date for dissecting the high-dimensional single cell data. The underlying assumptions, unique features, and limitations of the analytical methods with the designated biological questions they seek to answer will be discussed. Particular attention will be given to those information theoretical approaches that are anchored in a set of first principles of physics and can yield detailed (and often surprising) predictions.
A Shell Multi-dimensional Hierarchical Cubing Approach for High-Dimensional Cube
Zou, Shuzhi; Zhao, Li; Hu, Kongfa
The pre-computation of data cubes is critical for improving the response time of OLAP systems and accelerating data mining tasks in large data warehouses. However, as the sizes of data warehouses grow, the time it takes to perform this pre-computation becomes a significant performance bottleneck. In a high dimensional data warehouse, it might not be practical to build all these cuboids and their indices. In this paper, we propose a shell multi-dimensional hierarchical cubing algorithm, based on an extension of the previous minimal cubing approach. This method partitions the high dimensional data cube into low multi-dimensional hierarchical cube. Experimental results show that the proposed method is significantly more efficient than other existing cubing methods.
Minimax Rate-optimal Estimation of High-dimensional Covariance Matrices with Incomplete Data.
Cai, T Tony; Zhang, Anru
2016-09-01
Missing data occur frequently in a wide range of applications. In this paper, we consider estimation of high-dimensional covariance matrices in the presence of missing observations under a general missing completely at random model in the sense that the missingness is not dependent on the values of the data. Based on incomplete data, estimators for bandable and sparse covariance matrices are proposed and their theoretical and numerical properties are investigated. Minimax rates of convergence are established under the spectral norm loss and the proposed estimators are shown to be rate-optimal under mild regularity conditions. Simulation studies demonstrate that the estimators perform well numerically. The methods are also illustrated through an application to data from four ovarian cancer studies. The key technical tools developed in this paper are of independent interest and potentially useful for a range of related problems in high-dimensional statistical inference with missing data.
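A common starting point for covariance estimation with values missing completely at random is to compute each entry from the rows where both variables are observed. The sketch below shows that pairwise-complete estimate only; it is an assumption-laden illustration, not the banded or thresholded estimators proposed in the paper, and the function name and data are hypothetical.

```python
def pairwise_covariance(rows):
    """Covariance from incomplete rows; None marks a missing value.

    Entry (j, k) uses only the rows where both variables j and k are observed,
    with means taken over those same rows and a 1/n normalisation.
    """
    p = len(rows[0])
    cov = [[None] * p for _ in range(p)]
    for j in range(p):
        for k in range(j, p):
            pairs = [(r[j], r[k]) for r in rows
                     if r[j] is not None and r[k] is not None]
            if not pairs:
                continue  # no jointly observed rows: leave the entry undefined
            n = len(pairs)
            mean_j = sum(a for a, _ in pairs) / n
            mean_k = sum(b for _, b in pairs) / n
            value = sum((a - mean_j) * (b - mean_k) for a, b in pairs) / n
            cov[j][k] = cov[k][j] = value
    return cov


# Hypothetical data set with two missing entries.
DATA = [
    [1.0, 2.0, None],
    [2.0, 4.0, 1.0],
    [3.0, None, 2.0],
    [4.0, 8.0, 3.0],
]
COV = pairwise_covariance(DATA)
print(COV)
```

Because each entry is built from a different subset of rows, the resulting matrix is not guaranteed to be positive semidefinite, which is one reason structured estimators are studied for this setting.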
Minimax Rate-optimal Estimation of High-dimensional Covariance Matrices with Incomplete Data*
Cai, T. Tony; Zhang, Anru
2016-01-01
Missing data occur frequently in a wide range of applications. In this paper, we consider estimation of high-dimensional covariance matrices in the presence of missing observations under a general missing completely at random model in the sense that the missingness is not dependent on the values of the data. Based on incomplete data, estimators for bandable and sparse covariance matrices are proposed and their theoretical and numerical properties are investigated. Minimax rates of convergence are established under the spectral norm loss and the proposed estimators are shown to be rate-optimal under mild regularity conditions. Simulation studies demonstrate that the estimators perform well numerically. The methods are also illustrated through an application to data from four ovarian cancer studies. The key technical tools developed in this paper are of independent interest and potentially useful for a range of related problems in high-dimensional statistical inference with missing data. PMID:27777471
Xu, Chao; Fang, Jian; Shen, Hui; Wang, Yu-Ping; Deng, Hong-Wen
2018-01-25
Extreme phenotype sampling (EPS) is a broadly-used design to identify candidate genetic factors contributing to the variation of quantitative traits. By enriching the signals in extreme phenotypic samples, EPS can boost the association power compared to random sampling. Most existing statistical methods for EPS examine the genetic factors individually, even though many quantitative traits have multiple genetic factors underlying their variation. It is desirable to model the joint effects of genetic factors, which may increase the power and identify novel quantitative trait loci under EPS. The joint analysis of genetic data in high-dimensional situations requires specialized techniques, e.g., the least absolute shrinkage and selection operator (LASSO). Although there is extensive research and application related to LASSO, the statistical inference and testing for the sparse model under EPS remain unknown. We propose a novel sparse model (EPS-LASSO) with hypothesis test for high-dimensional regression under EPS based on a decorrelated score function. The comprehensive simulation shows EPS-LASSO outperforms existing methods with stable type I error and FDR control. EPS-LASSO can provide a consistent power for both low- and high-dimensional situations compared with the other methods dealing with high-dimensional situations. The power of EPS-LASSO is close to other low-dimensional methods when the causal effect sizes are small and is superior when the effects are large. Applying EPS-LASSO to a transcriptome-wide gene expression study for obesity reveals 10 significant body mass index associated genes. Our results indicate that EPS-LASSO is an effective method for EPS data analysis, which can account for correlated predictors. The source code is available at https://github.com/xu1912/EPSLASSO.
A Comparison of Machine Learning Methods in a High-Dimensional Classification Problem
Zekić-Sušac, Marijana; Pfeifer, Sanja; Šarlija, Nataša
2014-01-01
Background: Large-dimensional data modelling often relies on variable reduction methods in the pre-processing and in the post-processing stage. However, such a reduction usually provides less information and yields a lower accuracy of the model. Objectives: The aim of this paper is to assess the high-dimensional classification problem of recognizing entrepreneurial intentions of students by machine learning methods. Methods/Approach: Four methods were tested: artificial neural networks, CART ...
Preface [HD3-2015: International meeting on high-dimensional data-driven science
International Nuclear Information System (INIS)
2016-01-01
A never-ending series of innovations in measurement technology and evolutions in information and communication technologies have led to the ongoing generation and accumulation of large quantities of high-dimensional data every day. While detailed data-centric approaches have been pursued in respective research fields, situations have been encountered where the same mathematical framework of high-dimensional data analysis can be found in a wide variety of seemingly unrelated research fields, such as estimation on the basis of undersampled Fourier transform in nuclear magnetic resonance spectroscopy in chemistry, in magnetic resonance imaging in medicine, and in astronomical interferometry in astronomy. In such situations, bringing diverse viewpoints together therefore becomes a driving force for the creation of innovative developments in various different research fields. This meeting focuses on “Sparse Modeling” (SpM) as a methodology for creation of innovative developments through the incorporation of a wide variety of viewpoints in various research fields. The objective of this meeting is to offer a forum where researchers with interest in SpM can assemble and exchange information on the latest results and newly established methodologies, and discuss future directions of the interdisciplinary studies for High-Dimensional Data-Driven science (HD^3). The meeting was held in Kyoto from 14-17 December 2015. We are pleased to publish 22 papers contributed by invited speakers in this volume of Journal of Physics: Conference Series. We hope that this volume will promote further development of High-Dimensional Data-Driven science. (paper)
Runcie, Daniel E; Mukherjee, Sayan
2013-07-01
Quantitative genetic studies that model complex, multivariate phenotypes are important for both evolutionary prediction and artificial selection. For example, changes in gene expression can provide insight into developmental and physiological mechanisms that link genotype and phenotype. However, classical analytical techniques are poorly suited to quantitative genetic studies of gene expression where the number of traits assayed per individual can reach many thousand. Here, we derive a Bayesian genetic sparse factor model for estimating the genetic covariance matrix (G-matrix) of high-dimensional traits, such as gene expression, in a mixed-effects model. The key idea of our model is that we need consider only G-matrices that are biologically plausible. An organism's entire phenotype is the result of processes that are modular and have limited complexity. This implies that the G-matrix will be highly structured. In particular, we assume that a limited number of intermediate traits (or factors, e.g., variations in development or physiology) control the variation in the high-dimensional phenotype, and that each of these intermediate traits is sparse - affecting only a few observed traits. The advantages of this approach are twofold. First, sparse factors are interpretable and provide biological insight into mechanisms underlying the genetic architecture. Second, enforcing sparsity helps prevent sampling errors from swamping out the true signal in high-dimensional data. We demonstrate the advantages of our model on simulated data and in an analysis of a published Drosophila melanogaster gene expression data set.
Sepehry-Fard, F.; Coulthard, Maurice H.
1995-01-01
The objective of this publication is to introduce the enhancement methods for the overall reliability and maintainability methods of assessment on the International Space Station. It is essential that the process to predict the values of the maintenance time dependent variable parameters such as mean time between failure (MTBF) over time do not in themselves generate uncontrolled deviation in the results of the ILS analysis such as life cycle costs, spares calculation, etc. Furthermore, the very acute problems of micrometeorite, Cosmic rays, flares, atomic oxygen, ionization effects, orbital plumes and all the other factors that differentiate maintainable space operations from non-maintainable space operations and/or ground operations must be accounted for. Therefore, these parameters need be subjected to a special and complex process. Since reliability and maintainability strongly depend on the operating conditions that are encountered during the entire life of the International Space Station, it is important that such conditions are accurately identified at the beginning of the logistics support requirements process. Environmental conditions which exert a strong influence on International Space Station will be discussed in this report. Concurrent (combined) space environments may be more detrimental to the reliability and maintainability of the International Space Station than the effects of a single environment. In characterizing the logistics support requirements process, the developed design/test criteria must consider both the single and/or combined environments in anticipation of providing hardware capability to withstand the hazards of the International Space Station profile. The effects of the combined environments (typical) in a matrix relationship on the International Space Station will be shown. The combinations of the environments where the total effect is more damaging than the cumulative effects of the environments acting singly, may include a
Directory of Open Access Journals (Sweden)
R. Talebitooti
In this paper the effect of quadratic and cubic non-linearities of the system consisting of the crankshaft and torsional vibration damper (TVD) is taken into account. TVD consists of non-linear elastomer material used for controlling the torsional vibration of crankshaft. The method of multiple scales is used to solve the governing equations of the system. Meanwhile, the frequency response of the system for both harmonic and sub-harmonic resonances is extracted. In addition, the effects of detuning parameters and other dimensionless parameters for a case of harmonic resonance are investigated. Moreover, the external forces including both inertia and gas forces are simultaneously applied into the model. Finally, in order to study the effectiveness of the parameters, the dimensionless governing equations of the system are solved, considering the state space method. Then, the effects of the torsional damper as well as all corresponding parameters of the system are discussed.
On-chip generation of high-dimensional entangled quantum states and their coherent control.
Kues, Michael; Reimer, Christian; Roztocki, Piotr; Cortés, Luis Romero; Sciara, Stefania; Wetzel, Benjamin; Zhang, Yanbing; Cino, Alfonso; Chu, Sai T; Little, Brent E; Moss, David J; Caspani, Lucia; Azaña, José; Morandotti, Roberto
2017-06-28
Optical quantum states based on entangled photons are essential for solving questions in fundamental physics and are at the heart of quantum information science. Specifically, the realization of high-dimensional states (D-level quantum systems, that is, qudits, with D > 2) and their control are necessary for fundamental investigations of quantum mechanics, for increasing the sensitivity of quantum imaging schemes, for improving the robustness and key rate of quantum communication protocols, for enabling a richer variety of quantum simulations, and for achieving more efficient and error-tolerant quantum computation. Integrated photonics has recently become a leading platform for the compact, cost-efficient, and stable generation and processing of non-classical optical states. However, so far, integrated entangled quantum sources have been limited to qubits (D = 2). Here we demonstrate on-chip generation of entangled qudit states, where the photons are created in a coherent superposition of multiple high-purity frequency modes. In particular, we confirm the realization of a quantum system with at least one hundred dimensions, formed by two entangled qudits with D = 10. Furthermore, using state-of-the-art, yet off-the-shelf telecommunications components, we introduce a coherent manipulation platform with which to control frequency-entangled states, capable of performing deterministic high-dimensional gate operations. We validate this platform by measuring Bell inequality violations and performing quantum state tomography. Our work enables the generation and processing of high-dimensional quantum states in a single spatial mode.
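The dimension count quoted in this abstract follows from the tensor-product structure of a two-qudit system. The block below is a short worked version; the symbols $c_k$ and the signal/idler labels $s$, $i$ are notation assumed here for illustration and are not taken from the abstract.

```latex
% Two qudits, each with D levels, span a D x D dimensional Hilbert space.
\[
  \dim\left(\mathcal{H}_s \otimes \mathcal{H}_i\right) = D \times D = D^2,
  \qquad D = 10 \;\Rightarrow\; D^2 = 100 .
\]
% A frequency-entangled two-photon state of this kind can be written, under the
% assumption of perfectly correlated mode pairs, as
\[
  \lvert \psi \rangle = \sum_{k=1}^{D} c_k \, \lvert k \rangle_s \lvert k \rangle_i ,
  \qquad \sum_{k=1}^{D} \lvert c_k \rvert^2 = 1 .
\]
```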
High-dimensional chaos from self-sustained collisions of solitons
Energy Technology Data Exchange (ETDEWEB)
Yildirim, O. Ozgur, E-mail: oozgury@gmail.com [Cavium, Inc., 600 Nickerson Rd., Marlborough, Massachusetts 01752 (United States)]; Ham, Donhee, E-mail: donhee@seas.harvard.edu [Harvard University, 33 Oxford St., Cambridge, Massachusetts 02138 (United States)]
2014-06-16
We experimentally demonstrate chaos generation based on collisions of electrical solitons on a nonlinear transmission line. The nonlinear line creates solitons, and an amplifier connected to it provides gain to these solitons for their self-excitation and self-sustenance. Critically, the amplifier also provides a mechanism to enable and intensify collisions among solitons. These collisional interactions are of intrinsically nonlinear nature, modulating the phase and amplitude of solitons, thus causing chaos. This chaos generated by the exploitation of the nonlinear wave phenomena is inherently high-dimensional, which we also demonstrate.
A novel algorithm of artificial immune system for high-dimensional function numerical optimization
Institute of Scientific and Technical Information of China (English)
DU Haifeng; GONG Maoguo; JIAO Licheng; LIU Ruochen
2005-01-01
Based on the clonal selection theory and immune memory theory, a novel artificial immune system algorithm, immune memory clonal programming algorithm (IMCPA), is put forward. Using the theorem of Markov chain, it is proved that IMCPA is convergent. Compared with some other evolutionary programming algorithms (like Breeder genetic algorithm), IMCPA is shown to be an evolutionary strategy capable of solving complex machine learning tasks, like high-dimensional function optimization, which maintains the diversity of the population and avoids prematurity to some extent, and has a higher convergence speed.
Computing and visualizing time-varying merge trees for high-dimensional data
Energy Technology Data Exchange (ETDEWEB)
Oesterling, Patrick [Univ. of Leipzig (Germany); Heine, Christian [Univ. of Kaiserslautern (Germany); Weber, Gunther H. [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Morozov, Dmitry [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Scheuermann, Gerik [Univ. of Leipzig (Germany)
2017-06-03
We introduce a new method that identifies and tracks features in arbitrary dimensions using the merge tree -- a structure for identifying topological features based on thresholding in scalar fields. This method analyzes the evolution of features of the function by tracking changes in the merge tree and relates features by matching subtrees between consecutive time steps. Using the time-varying merge tree, we present a structural visualization of the changing function that illustrates both features and their temporal evolution. We demonstrate the utility of our approach by applying it to temporal cluster analysis of high-dimensional point clouds.
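To make the merge-tree idea concrete, the sketch below sweeps a scalar field from high to low values and records when thresholded components are born and when they merge. It is a minimal toy under strong assumptions: the field lives on a 1-D chain of samples rather than a high-dimensional point cloud, there is no time dimension, and the function name and data are hypothetical.

```python
def merge_events(values):
    """Sweep a 1-D scalar field from high to low and record component merges.

    Returns (maxima, merges): indices of local maxima that start a component,
    and (saddle_index, surviving_max, absorbed_max) for each merge.
    """
    parent = {}  # union-find forest over the samples visited so far
    top = {}     # root -> index of the highest maximum in that component

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    maxima, merges = [], []
    for i in sorted(range(len(values)), key=lambda k: -values[k]):
        # Components already present next to sample i, highest peak first.
        roots = sorted({find(j) for j in (i - 1, i + 1) if j in parent},
                       key=lambda r: -values[top[r]])
        if not roots:
            parent[i] = i
            top[i] = i
            maxima.append(i)  # no visited neighbour: a new feature is born
            continue
        survivor = roots[0]
        for other in roots[1:]:
            merges.append((i, top[survivor], top[other]))
            parent[other] = survivor  # the lower peak is absorbed
        parent[i] = survivor
    return maxima, merges


FIELD = [1, 5, 2, 4, 0, 3]  # hypothetical samples
MAXIMA, MERGES = merge_events(FIELD)
print(MAXIMA, MERGES)
```

Each merge tuple corresponds to an internal node of the tree, and each maximum to a leaf; tracking how these tuples change between time steps is the kind of comparison a time-varying analysis would build on.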
Non-Asymptotic Oracle Inequalities for the High-Dimensional Cox Regression via Lasso.
Kong, Shengchun; Nan, Bin
2014-01-01
We consider finite sample properties of the regularized high-dimensional Cox regression via lasso. Existing literature focuses on linear models or generalized linear models with Lipschitz loss functions, where the empirical risk functions are the summations of independent and identically distributed (iid) losses. The summands in the negative log partial likelihood function for censored survival data, however, are neither iid nor Lipschitz. We first approximate the negative log partial likelihood function by a sum of iid non-Lipschitz terms, then derive the non-asymptotic oracle inequalities for the lasso penalized Cox regression using pointwise arguments to tackle the difficulties caused by lacking iid Lipschitz losses.
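For reference, the lasso-penalized Cox estimator discussed here is usually written as below. The notation is assumed for illustration (observed times $T_i$, event indicators $\delta_i$, covariates $x_i$, tuning parameter $\lambda$, no tied event times) and may differ from the paper's.

```latex
\[
  \hat{\beta} = \arg\min_{\beta}
  \left\{
    -\frac{1}{n} \sum_{i=1}^{n} \delta_i
    \left[
      x_i^{\top}\beta
      - \log \sum_{j \,:\, T_j \ge T_i} \exp\!\left(x_j^{\top}\beta\right)
    \right]
    + \lambda \lVert \beta \rVert_1
  \right\}
\]
% The inner sum runs over the risk set of subject i, so every summand depends on
% other subjects' data; this is why the terms are not independent.
```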
High-dimensional data: p >> n in mathematical statistics and bio-medical applications
Van De Geer, Sara A.; Van Houwelingen, Hans C.
2004-01-01
The workshop 'High-dimensional data: p >> n in mathematical statistics and bio-medical applications' was held at the Lorentz Center in Leiden from 9 to 20 September 2002. This special issue of Bernoulli contains a selection of papers presented at that workshop. The introduction of high-throughput micro-array technology to measure gene-expression levels and the publication of the pioneering paper by Golub et al. (1999) has brought to life a whole new branch of data analysis under the name of...
Energy Technology Data Exchange (ETDEWEB)
Storm, Emma; Weniger, Christoph [GRAPPA, Institute of Physics, University of Amsterdam, Science Park 904, 1090 GL Amsterdam (Netherlands); Calore, Francesca, E-mail: e.m.storm@uva.nl, E-mail: c.weniger@uva.nl, E-mail: francesca.calore@lapth.cnrs.fr [LAPTh, CNRS, 9 Chemin de Bellevue, BP-110, Annecy-le-Vieux, 74941, Annecy Cedex (France)
2017-08-01
We present SkyFACT (Sky Factorization with Adaptive Constrained Templates), a new approach for studying, modeling and decomposing diffuse gamma-ray emission. Like most previous analyses, the approach relies on predictions from cosmic-ray propagation codes like GALPROP and DRAGON. However, in contrast to previous approaches, we account for the fact that models are not perfect and allow for a very large number (≳ 10^5) of nuisance parameters to parameterize these imperfections. We combine methods of image reconstruction and adaptive spatio-spectral template regression in one coherent hybrid approach. To this end, we use penalized Poisson likelihood regression, with regularization functions that are motivated by the maximum entropy method. We introduce methods to efficiently handle the high dimensionality of the convex optimization problem as well as the associated semi-sparse covariance matrix, using the L-BFGS-B algorithm and Cholesky factorization. We test the method both on synthetic data as well as on gamma-ray emission from the inner Galaxy, |ℓ| < 90° and |b| < 20°, as observed by the Fermi Large Area Telescope. We finally define a simple reference model that removes most of the residual emission from the inner Galaxy, based on conventional diffuse emission components as well as components for the Fermi bubbles, the Fermi Galactic center excess, and extended sources along the Galactic disk. Variants of this reference model can serve as basis for future studies of diffuse emission in and outside the Galactic disk.
Wu, Shuang; Liu, Zhi-Ping; Qiu, Xing; Wu, Hulin
2014-01-01
The immune response to viral infection is regulated by an intricate network of many genes and their products. The reverse engineering of gene regulatory networks (GRNs) using mathematical models from time course gene expression data collected after influenza infection is key to our understanding of the mechanisms involved in controlling influenza infection within a host. A five-step pipeline: detection of temporally differentially expressed genes, clustering genes into co-expressed modules, identification of network structure, parameter estimate refinement, and functional enrichment analysis, is developed for reconstructing high-dimensional dynamic GRNs from genome-wide time course gene expression data. Applying the pipeline to the time course gene expression data from influenza-infected mouse lungs, we have identified 20 distinct temporal expression patterns in the differentially expressed genes and constructed a module-based dynamic network using a linear ODE model. Both intra-module and inter-module annotations and regulatory relationships of our inferred network show some interesting findings and are highly consistent with existing knowledge about the immune response in mice after influenza infection. The proposed method is a computationally efficient, data-driven pipeline bridging experimental data, mathematical modeling, and statistical analysis. The application to the influenza infection data elucidates the potentials of our pipeline in providing valuable insights into systematic modeling of complicated biological processes.
Bayesian Multiresolution Variable Selection for Ultra-High Dimensional Neuroimaging Data.
Zhao, Yize; Kang, Jian; Long, Qi
2018-01-01
Ultra-high dimensional variable selection has become increasingly important in analysis of neuroimaging data. For example, in the Autism Brain Imaging Data Exchange (ABIDE) study, neuroscientists are interested in identifying important biomarkers for early detection of the autism spectrum disorder (ASD) using high resolution brain images that include hundreds of thousands voxels. However, most existing methods are not feasible for solving this problem due to their extensive computational costs. In this work, we propose a novel multiresolution variable selection procedure under a Bayesian probit regression framework. It recursively uses posterior samples for coarser-scale variable selection to guide the posterior inference on finer-scale variable selection, leading to very efficient Markov chain Monte Carlo (MCMC) algorithms. The proposed algorithms are computationally feasible for ultra-high dimensional data. Also, our model incorporates two levels of structural information into variable selection using Ising priors: the spatial dependence between voxels and the functional connectivity between anatomical brain regions. Applied to the resting state functional magnetic resonance imaging (R-fMRI) data in the ABIDE study, our methods identify voxel-level imaging biomarkers highly predictive of the ASD, which are biologically meaningful and interpretable. Extensive simulations also show that our methods achieve better performance in variable selection compared to existing methods.
Energy Efficient MAC Scheme for Wireless Sensor Networks with High-Dimensional Data Aggregate
Directory of Open Access Journals (Sweden)
Seokhoon Kim
2015-01-01
This paper presents a novel and sustainable medium access control (MAC) scheme for wireless sensor network (WSN) systems that process high-dimensional aggregated data. Based on a preamble signal and buffer threshold analysis, it maximizes the energy efficiency of the wireless sensor devices, which have limited energy resources. The proposed group management MAC (GM-MAC) approach not only sets the buffer threshold value of a sensor device to be reciprocal to the preamble signal but also sets a transmittable group value for each sensor device by using the preamble signal of the sink node. The primary difference between the previous and the proposed approach is that existing state-of-the-art schemes use duty cycle and sleep mode to reduce the energy consumption of individual sensor devices, whereas the proposed scheme employs the group management MAC scheme for sensor devices to maximize the overall energy efficiency of the whole WSN system by minimizing the energy consumption of sensor devices located near the sink node. Performance evaluations show that the proposed scheme outperforms the previous schemes in terms of active time of sensor devices, transmission delay, control overhead, and energy consumption. Therefore, the proposed scheme is suitable for sensor devices in a variety of wireless sensor networking environments with high-dimensional data aggregate.
Selecting Optimal Feature Set in High-Dimensional Data by Swarm Search
Directory of Open Access Journals (Sweden)
Simon Fong
2013-01-01
Selecting the right set of features from data of high dimensionality for inducing an accurate classification model is a tough computational challenge. It is almost an NP-hard problem, as the combinations of features escalate exponentially as the number of features increases. Unfortunately, in data mining, as well as in other engineering applications and bioinformatics, some data are described by a long array of features. Many feature subset selection algorithms have been proposed in the past, but not all of them are effective. Since it takes seemingly forever to use brute force in exhaustively trying every possible combination of features, stochastic optimization may be a solution. In this paper, we propose a new feature selection scheme called Swarm Search to find an optimal feature set by using metaheuristics. The advantage of Swarm Search is its flexibility in integrating any classifier into its fitness function and plugging in any metaheuristic algorithm to facilitate heuristic search. Simulation experiments are carried out by testing the Swarm Search over some high-dimensional datasets, with different classification algorithms and various metaheuristic algorithms. The comparative experiment results show that Swarm Search is able to attain relatively low error rates in classification without shrinking the size of the feature subset to its minimum.
The validation and assessment of machine learning: a game of prediction from high-dimensional data.
Directory of Open Access Journals (Sweden)
Tune H Pers
In applied statistics, tools from machine learning are popular for analyzing complex and high-dimensional data. However, few theoretical results are available that could guide the choice of the appropriate machine learning tool in a new application. Initial development of an overall strategy thus often implies that multiple methods are tested and compared on the same set of data. This is particularly difficult in situations that are prone to over-fitting, where the number of subjects is low compared to the number of potential predictors. The article presents a game which provides some grounds for conducting a fair model comparison. Each player selects a modeling strategy for predicting individual response from potential predictors. A strictly proper scoring rule, bootstrap cross-validation, and a set of rules are used to make the results obtained with different strategies comparable. To illustrate the ideas, the game is applied to data from the Nugenob Study where the aim is to predict the fat oxidation capacity based on conventional factors and high-dimensional metabolomics data. Three players have chosen to use support vector machines, LASSO, and random forests, respectively.
High-dimensional quantum key distribution with the entangled single-photon-added coherent state
Energy Technology Data Exchange (ETDEWEB)
Wang, Yang [Zhengzhou Information Science and Technology Institute, Zhengzhou, 450001 (China); Synergetic Innovation Center of Quantum Information and Quantum Physics, University of Science and Technology of China, Hefei, Anhui 230026 (China); Bao, Wan-Su, E-mail: 2010thzz@sina.com [Zhengzhou Information Science and Technology Institute, Zhengzhou, 450001 (China); Synergetic Innovation Center of Quantum Information and Quantum Physics, University of Science and Technology of China, Hefei, Anhui 230026 (China); Bao, Hai-Ze; Zhou, Chun; Jiang, Mu-Sheng; Li, Hong-Wei [Zhengzhou Information Science and Technology Institute, Zhengzhou, 450001 (China); Synergetic Innovation Center of Quantum Information and Quantum Physics, University of Science and Technology of China, Hefei, Anhui 230026 (China)
2017-04-25
High-dimensional quantum key distribution (HD-QKD) can generate more secure bits for one detection event so that it can achieve long distance key distribution with a high secret key capacity. In this Letter, we present a decoy state HD-QKD scheme with the entangled single-photon-added coherent state (ESPACS) source. We present two tight formulas to estimate the single-photon fraction of postselected events and Eve's Holevo information and derive lower bounds on the secret key capacity and the secret key rate of our protocol. We also present finite-key analysis for our protocol by using the Chernoff bound. Our numerical results show that our protocol using one decoy state can perform better than that of previous HD-QKD protocol with the spontaneous parametric down conversion (SPDC) using two decoy states. Moreover, when considering finite resources, the advantage is more obvious. - Highlights: • Implement the single-photon-added coherent state source into the high-dimensional quantum key distribution. • Enhance both the secret key capacity and the secret key rate compared with previous schemes. • Show an excellent performance in view of statistical fluctuations.
A Feature Subset Selection Method Based On High-Dimensional Mutual Information
Directory of Open Access Journals (Sweden)
Chee Keong Kwoh
2011-04-01
Feature selection is an important step in building accurate classifiers and provides better understanding of the data sets. In this paper, we propose a feature subset selection method based on high-dimensional mutual information. We also propose to use the entropy of the class attribute as a criterion to determine the appropriate subset of features when building classifiers. We prove that if the mutual information between a feature set X and the class attribute Y equals the entropy of Y, then X is a Markov blanket of Y. We show that in some cases, it is infeasible to approximate the high-dimensional mutual information with algebraic combinations of pairwise mutual information of any form. In addition, an exhaustive search of all combinations of features is a prerequisite for finding the optimal feature subsets for classifying these kinds of data sets. We show that our approach outperforms existing filter feature subset selection methods for most of the 24 selected benchmark data sets.
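The entropy criterion and the limits of pairwise mutual information can be illustrated on a toy XOR data set. The sketch below is only an illustration with empirical (plug-in) estimates. The helper names `entropy` and `mutual_information` and the data are hypothetical and are not the authors' implementation.

```python
from collections import Counter
from math import log2


def entropy(symbols):
    """Empirical Shannon entropy (in bits) of a sequence of hashable symbols."""
    n = len(symbols)
    return -sum((c / n) * log2(c / n) for c in Counter(symbols).values())


def mutual_information(rows, labels):
    """Empirical I(X;Y) = H(X) + H(Y) - H(X,Y), treating each row as one joint symbol."""
    xs = [tuple(r) for r in rows]
    joint = list(zip(xs, labels))
    return entropy(xs) + entropy(labels) - entropy(joint)


# Toy data (assumed for illustration): the class is the XOR of two binary features.
rows = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = [a ^ b for a, b in rows]

h_y = entropy(labels)                                             # 1.0 bit
mi_joint = mutual_information(rows, labels)                       # 1.0 bit, equals H(Y)
mi_first = mutual_information([(r[0],) for r in rows], labels)    # 0.0 bits
mi_second = mutual_information([(r[1],) for r in rows], labels)   # 0.0 bits

print(h_y, mi_joint, mi_first, mi_second)
```

Here each feature alone carries no information about the class, while the pair carries all of it, so no combination of the two pairwise values can recover the joint value.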
Using High-Dimensional Image Models to Perform Highly Undetectable Steganography
Pevný, Tomáš; Filler, Tomáš; Bas, Patrick
This paper presents a complete methodology for designing practical and highly undetectable stegosystems for real digital media. The main design principle is to minimize a suitably defined distortion by means of an efficient coding algorithm. The distortion is defined as a weighted difference of extended state-of-the-art feature vectors already used in steganalysis. This allows us to "preserve" the model used by the steganalyst and thus be undetectable even for large payloads. This framework can be efficiently implemented even when the dimensionality of the feature set used by the embedder is larger than 10^7. The high-dimensional model is necessary to avoid known security weaknesses. Although high-dimensional models might be a problem in steganalysis, we explain why they are acceptable in steganography. As an example, we introduce HUGO, a new embedding algorithm for spatial-domain digital images, and we contrast its performance with LSB matching. On the BOWS2 image database, HUGO allows the embedder to hide a 7× longer message than LSB matching at the same level of security.
Quantum secret sharing based on modulated high-dimensional time-bin entanglement
International Nuclear Information System (INIS)
Takesue, Hiroki; Inoue, Kyo
2006-01-01
We propose a scheme for quantum secret sharing (QSS) that uses a modulated high-dimensional time-bin entanglement. By modulating the relative phase randomly by {0,π}, a sender with the entanglement source can randomly change the sign of the correlation of the measurement outcomes obtained by two distant recipients. The two recipients must cooperate if they are to obtain the sign of the correlation, which is used as a secret key. We show that our scheme is secure against intercept-and-resend (IR) and beam splitting attacks by an outside eavesdropper thanks to the nonorthogonality of high-dimensional time-bin entangled states. We also show that a cheating attempt based on an IR attack by one of the recipients can be detected by changing the dimension of the time-bin entanglement randomly and inserting two 'vacant' slots between the packets. Then, cheating attempts can be detected by monitoring the count rate in the vacant slots. The proposed scheme has better experimental feasibility than previously proposed entanglement-based QSS schemes
Similarity measurement method of high-dimensional data based on normalized net lattice subspace
Institute of Scientific and Technical Information of China (English)
Li Wenfa; Wang Gongming; Li Ke; Huang Su
2017-01-01
The performance of conventional similarity measurement methods is affected seriously by the curse of dimensionality of high-dimensional data. The reason is that data difference between sparse and noisy dimensionalities occupies a large proportion of the similarity, leading to the dissimilarities between any results. A similarity measurement method of high-dimensional data based on normalized net lattice subspace is proposed. The data range of each dimension is divided into several intervals, and the components in different dimensions are mapped onto the corresponding interval. Only the component in the same or adjacent interval is used to calculate the similarity. To validate this method, three data types are used, and seven common similarity measurement methods are compared. The experimental result indicates that the relative difference of the method is increasing with the dimensionality and is approximately two or three orders of magnitude higher than the conventional method. In addition, the similarity range of this method in different dimensions is [0, 1], which is fit for similarity analysis after dimensionality reduction.
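The interval-mapping idea can be sketched as follows. The abstract does not give the scoring formula, so the per-dimension closeness term, the function name `lattice_similarity`, and the default of 10 intervals are all assumptions made for illustration, not the authors' method. Only the two stated ingredients are taken from the text: components are mapped to intervals, and only components in the same or adjacent interval contribute.

```python
def lattice_similarity(x, y, lows, highs, n_intervals=10):
    """Hypothetical similarity in [0, 1] between two vectors x and y.

    Each dimension's range [low, high] is split into n_intervals equal intervals.
    A dimension contributes only if both components fall in the same or an
    adjacent interval; the contribution formula itself is an assumed choice.
    """
    total = 0.0
    for xi, yi, lo, hi in zip(x, y, lows, highs):
        width = (hi - lo) / n_intervals
        # Interval index of each component; the upper bound goes in the last interval.
        bx = min(int((xi - lo) / width), n_intervals - 1)
        by = min(int((yi - lo) / width), n_intervals - 1)
        if abs(bx - by) <= 1:
            # Components are less than two interval widths apart, so this lies in (0, 1].
            total += 1.0 - abs(xi - yi) / (2 * width)
    # Dimensions that were skipped contribute zero.
    return total / len(x)


# Toy usage: the second dimension differs too much and is ignored.
lows, highs = (0.0, 0.0), (1.0, 1.0)
print(lattice_similarity((0.05, 0.95), (0.05, 0.05), lows, highs))  # 0.5
print(lattice_similarity((0.05, 0.95), (0.05, 0.95), lows, highs))  # 1.0
```

Dividing by the total number of dimensions keeps the score in [0, 1] regardless of dimensionality, which matches the range stated in the abstract.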
Yu, Hualong; Ni, Jun
2014-01-01
Training classifiers on skewed data can be technically challenging, and the task becomes even more difficult when the data are also high-dimensional. Skewed data often appear in the biomedical field. In this study, we try to deal with this problem by combining the asymmetric bagging ensemble classifier (asBagging) presented in previous work with an improved random subspace (RS) generation strategy called feature subspace (FSS). Specifically, FSS is a novel method to promote the balance between accuracy and diversity of base classifiers in asBagging. In view of the strong generalization capability of the support vector machine (SVM), we adopt it as the base classifier. Extensive experiments on four benchmark biomedical data sets indicate that the proposed ensemble learning method outperforms many baseline approaches in terms of the Accuracy, F-measure, G-mean and AUC evaluation criteria; thus it can be regarded as an effective and efficient tool to deal with high-dimensional and imbalanced biomedical data.
Zhang, Yu; Wu, Jianxin; Cai, Jianfei
2016-05-01
In large-scale visual recognition and image retrieval tasks, feature vectors, such as Fisher vector (FV) or the vector of locally aggregated descriptors (VLAD), have achieved state-of-the-art results. However, the combination of the large numbers of examples and high-dimensional vectors necessitates dimensionality reduction, in order to reduce its storage and CPU costs to a reasonable range. In spite of the popularity of various feature compression methods, this paper shows that the feature (dimension) selection is a better choice for high-dimensional FV/VLAD than the feature (dimension) compression methods, e.g., product quantization. We show that strong correlation among the feature dimensions in the FV and the VLAD may not exist, which renders feature selection a natural choice. We also show that, many dimensions in FV/VLAD are noise. Throwing them away using feature selection is better than compressing them and useful dimensions altogether using feature compression methods. To choose features, we propose an efficient importance sorting algorithm considering both the supervised and unsupervised cases, for visual recognition and image retrieval, respectively. Combining with the 1-bit quantization, feature selection has achieved both higher accuracy and less computational cost than feature compression methods, such as product quantization, on the FV and the VLAD image representations.
High-dimensional quantum key distribution with the entangled single-photon-added coherent state
International Nuclear Information System (INIS)
Wang, Yang; Bao, Wan-Su; Bao, Hai-Ze; Zhou, Chun; Jiang, Mu-Sheng; Li, Hong-Wei
2017-01-01
High-dimensional quantum key distribution (HD-QKD) can generate more secure bits for one detection event so that it can achieve long distance key distribution with a high secret key capacity. In this Letter, we present a decoy state HD-QKD scheme with the entangled single-photon-added coherent state (ESPACS) source. We present two tight formulas to estimate the single-photon fraction of postselected events and Eve's Holevo information and derive lower bounds on the secret key capacity and the secret key rate of our protocol. We also present finite-key analysis for our protocol by using the Chernoff bound. Our numerical results show that our protocol using one decoy state can perform better than that of previous HD-QKD protocol with the spontaneous parametric down conversion (SPDC) using two decoy states. Moreover, when considering finite resources, the advantage is more obvious. - Highlights: • Implement the single-photon-added coherent state source into the high-dimensional quantum key distribution. • Enhance both the secret key capacity and the secret key rate compared with previous schemes. • Show an excellent performance in view of statistical fluctuations.
High-Dimensional Single-Photon Quantum Gates: Concepts and Experiments.
Babazadeh, Amin; Erhard, Manuel; Wang, Feiran; Malik, Mehul; Nouroozi, Rahman; Krenn, Mario; Zeilinger, Anton
2017-11-03
Transformations on quantum states form a basic building block of every quantum information system. From photonic polarization to two-level atoms, complete sets of quantum gates for a variety of qubit systems are well known. For multilevel quantum systems beyond qubits, the situation is more challenging. The orbital angular momentum modes of photons comprise one such high-dimensional system for which generation and measurement techniques are well studied. However, arbitrary transformations for such quantum states are not known. Here we experimentally demonstrate a four-dimensional generalization of the Pauli X gate and all of its integer powers on single photons carrying orbital angular momentum. Together with the well-known Z gate, this forms the first complete set of high-dimensional quantum gates implemented experimentally. The concept of the X gate is based on independent access to quantum states with different parities and can thus be generalized to other photonic degrees of freedom and potentially also to other quantum systems.
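The action of such gates on a four-level state can be sketched numerically. This assumes the common textbook convention that X cyclically shifts the basis state |l> to |(l+1) mod d> and Z multiplies |l> by the phase omega^l with omega = exp(2*pi*i/d). The function names and the test state are hypothetical, and the sketch says nothing about the optical implementation.

```python
import cmath

D = 4  # dimension of the state space (four levels, as in the abstract)


def shift_x(state, power=1):
    """Apply X**power: the amplitude of |l> moves to |(l + power) mod d>."""
    d = len(state)
    out = [0j] * d
    for l, amp in enumerate(state):
        out[(l + power) % d] += amp
    return out


def clock_z(state):
    """Apply Z: multiply the amplitude of |l> by omega**l, omega = exp(2*pi*i/d)."""
    d = len(state)
    omega = cmath.exp(2j * cmath.pi / d)
    return [amp * omega ** l for l, amp in enumerate(state)]


basis0 = [1, 0, 0, 0]
after_x = shift_x(basis0)             # |0> -> |1>
after_x4 = shift_x(basis0, power=4)   # X**4 is the identity, so back to |0>

# Check the commutation relation Z X = omega * X Z on an arbitrary state.
omega = cmath.exp(2j * cmath.pi / D)
state = [1, 2, 3, 4]
zx = clock_z(shift_x(state))
xz = shift_x(clock_z(state))
relation_holds = all(abs(a - omega * b) < 1e-12 for a, b in zip(zx, xz))
print(after_x, after_x4, relation_holds)
```

Under this convention the integer powers X, X^2 and X^3 are the three non-trivial cyclic shifts, and X^4 returns every state to itself.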
Zhu, Lingxue; Lei, Jing; Devlin, Bernie; Roeder, Kathryn
2017-09-01
Scientists routinely compare gene expression levels in cases versus controls in part to determine genes associated with a disease. Similarly, detecting case-control differences in co-expression among genes can be critical to understanding complex human diseases; however statistical methods have been limited by the high dimensional nature of this problem. In this paper, we construct a sparse-Leading-Eigenvalue-Driven (sLED) test for comparing two high-dimensional covariance matrices. By focusing on the spectrum of the differential matrix, sLED provides a novel perspective that accommodates what we assume to be common, namely sparse and weak signals in gene expression data, and it is closely related with Sparse Principal Component Analysis. We prove that sLED achieves full power asymptotically under mild assumptions, and simulation studies verify that it outperforms other existing procedures under many biologically plausible scenarios. Applying sLED to the largest gene-expression dataset obtained from post-mortem brain tissue from Schizophrenia patients and controls, we provide a novel list of genes implicated in Schizophrenia and reveal intriguing patterns in gene co-expression change for Schizophrenia subjects. We also illustrate that sLED can be generalized to compare other gene-gene "relationship" matrices that are of practical interest, such as the weighted adjacency matrices.
Tao, Chenyang; Nichols, Thomas E; Hua, Xue; Ching, Christopher R K; Rolls, Edmund T; Thompson, Paul M; Feng, Jianfeng
2017-01-01
We propose a generalized reduced rank latent factor regression model (GRRLF) for the analysis of tensor field responses and high dimensional covariates. The model is motivated by the need from imaging-genetic studies to identify genetic variants that are associated with brain imaging phenotypes, often in the form of high dimensional tensor fields. GRRLF identifies from the structure in the data the effective dimensionality of the data, and then jointly performs dimension reduction of the covariates, dynamic identification of latent factors, and nonparametric estimation of both covariate and latent response fields. After accounting for the latent and covariate effects, GRRLF performs a nonparametric test on the remaining factor of interest. GRRLF provides a better factorization of the signals compared with common solutions, and is less susceptible to overfitting because it exploits the effective dimensionality. The generality and the flexibility of GRRLF also allow various statistical models to be handled in a unified framework and solutions can be efficiently computed. Within the field of neuroimaging, it improves the sensitivity for weak signals and is a promising alternative to existing approaches. The operation of the framework is demonstrated with both synthetic datasets and a real-world neuroimaging example in which the effects of a set of genes on the structure of the brain at the voxel level were measured, and the results compared favorably with those from existing approaches.
Challenges and Approaches to Statistical Design and Inference in High Dimensional Investigations
Garrett, Karen A.; Allison, David B.
2015-01-01
Advances in modern technologies have facilitated high-dimensional experiments (HDEs) that generate tremendous amounts of genomic, proteomic, and other “omic” data. HDEs involving whole-genome sequences and polymorphisms, expression levels of genes, protein abundance measurements, and combinations thereof have become a vanguard for new analytic approaches to the analysis of HDE data. Such situations demand creative approaches to the processes of statistical inference, estimation, prediction, classification, and study design. The novel and challenging biological questions asked from HDE data have resulted in many specialized analytic techniques being developed. This chapter discusses some of the unique statistical challenges facing investigators studying high-dimensional biology, and describes some approaches being developed by statistical scientists. We have included some focus on the increasing interest in questions involving testing multiple propositions simultaneously, appropriate inferential indicators for the types of questions biologists are interested in, and the need for replication of results across independent studies, investigators, and settings. A key consideration inherent throughout is the challenge in providing methods that a statistician judges to be sound and a biologist finds informative. PMID:19588106
Challenges and approaches to statistical design and inference in high-dimensional investigations.
Gadbury, Gary L; Garrett, Karen A; Allison, David B
2009-01-01
Advances in modern technologies have facilitated high-dimensional experiments (HDEs) that generate tremendous amounts of genomic, proteomic, and other "omic" data. HDEs involving whole-genome sequences and polymorphisms, expression levels of genes, protein abundance measurements, and combinations thereof have become a vanguard for new analytic approaches to the analysis of HDE data. Such situations demand creative approaches to the processes of statistical inference, estimation, prediction, classification, and study design. The novel and challenging biological questions asked from HDE data have resulted in many specialized analytic techniques being developed. This chapter discusses some of the unique statistical challenges facing investigators studying high-dimensional biology and describes some approaches being developed by statistical scientists. We have included some focus on the increasing interest in questions involving testing multiple propositions simultaneously, appropriate inferential indicators for the types of questions biologists are interested in, and the need for replication of results across independent studies, investigators, and settings. A key consideration inherent throughout is the challenge in providing methods that a statistician judges to be sound and a biologist finds informative.
Tikhonov, Mikhail; Monasson, Remi
2018-01-01
Much of our understanding of ecological and evolutionary mechanisms derives from analysis of low-dimensional models: with few interacting species, or few axes defining "fitness". It is not always clear to what extent the intuition derived from low-dimensional models applies to the complex, high-dimensional reality. For instance, most naturally occurring microbial communities are strikingly diverse, harboring a large number of coexisting species, each of which contributes to shaping the environment of others. Understanding the eco-evolutionary interplay in these systems is an important challenge, and an exciting new domain for statistical physics. Recent work identified a promising new platform for investigating highly diverse ecosystems, based on the classic resource competition model of MacArthur. Here, we describe how the same analytical framework can be used to study evolutionary questions. Our analysis illustrates how, at high dimension, the intuition promoted by a one-dimensional (scalar) notion of fitness can become misleading. Specifically, while the low-dimensional picture emphasizes organism cost or efficiency, we exhibit a regime where cost becomes irrelevant for survival, and link this observation to generic properties of high-dimensional geometry.
Vernon, Ian; Liu, Junli; Goldstein, Michael; Rowe, James; Topping, Jen; Lindsey, Keith
2018-01-02
Many mathematical models have now been employed across every area of systems biology. These models increasingly involve large numbers of unknown parameters, have complex structure which can result in substantial evaluation time relative to the needs of the analysis, and need to be compared to observed data of various forms. The correct analysis of such models usually requires a global parameter search, over a high dimensional parameter space, that incorporates and respects the most important sources of uncertainty. This can be an extremely difficult task, but it is essential for any meaningful inference or prediction to be made about any biological system. It hence represents a fundamental challenge for the whole of systems biology. Bayesian statistical methodology for the uncertainty analysis of complex models is introduced, which is designed to address the high dimensional global parameter search problem. Bayesian emulators that mimic the systems biology model but which are extremely fast to evaluate are embedded within an iterative history match: an efficient method to search high dimensional spaces within a more formal statistical setting, while incorporating major sources of uncertainty. The approach is demonstrated via application to a model of hormonal crosstalk in Arabidopsis root development, which has 32 rate parameters, for which we identify the sets of rate parameter values that lead to acceptable matches between model output and observed trend data. The multiple insights into the model's structure that this analysis provides are discussed. The methodology is applied to a second related model, and the biological consequences of the resulting comparison, including the evaluation of gene functions, are described. Bayesian uncertainty analysis for complex models using both emulators and history matching is shown to be a powerful technique that can greatly aid the study of a large class of systems biology models. It both provides insight into model behaviour
Sepehry-Fard, F.; Coulthard, Maurice H.
1995-01-01
The process of predicting the values of maintenance time dependent variable parameters such as mean time between failures (MTBF) over time must be one that will not in turn introduce uncontrolled deviation in the results of the ILS analysis such as life cycle costs, spares calculation, etc. A minor deviation in the values of the maintenance time dependent variable parameters such as MTBF over time will have a significant impact on the logistics resources demands, International Space Station availability and maintenance support costs. There are two types of parameters in the logistics and maintenance world: (a) fixed and (b) variable. Fixed parameters, such as cost per man hour, are relatively easy to predict and forecast. These parameters normally follow a linear path and they do not change randomly. However, the variable parameters subject to the study in this report, such as MTBF, do not follow a linear path and they normally fall within the distribution curves which are discussed in this publication. The very challenging task then becomes the utilization of statistical techniques to accurately forecast the future non-linear time dependent variable arisings and events with a high confidence level. This, in turn, shall translate into tremendous cost savings and improved availability all around.
International Nuclear Information System (INIS)
Hawley, J.T.; Chiu, C.; Todreas, N.E.; Rohsenow, W.M.
1980-01-01
Correlations are presented for subchannel and bundle friction factors and flowsplit parameters for laminar, transition and turbulent longitudinal flows in wire wrap spaced hexagonal arrays. These results are obtained from pressure drop models of flow in individual subchannels. For turbulent flow, an existing pressure drop model for flow in edge subchannels is extended, and the resulting edge subchannel friction factor is identified. Using the expressions for flowsplit parameters and the equal pressure drops assumption, the interior subchannel and bundle friction factors are obtained. For laminar flow, models are developed for pressure drops of individual subchannels. From these models, expressions for the subchannel friction factors are identified and expressions for the flowsplit parameters are derived.
Energy Technology Data Exchange (ETDEWEB)
Hawley, J.T.; Chiu, C.; Rohsenow, W.M.; Todreas, N.E.
1980-08-01
Correlations are presented for subchannel and bundle friction factors and flowsplit parameters for laminar, transition and turbulent longitudinal flows in wire wrap spaced hexagonal arrays. These results are obtained from pressure drop models of flow in individual subchannels. For turbulent flow, an existing pressure drop model for flow in edge subchannels is extended, and the resulting edge subchannel friction factor is identified. Using the expressions for flowsplit parameters and the equal pressure drop assumption, the interior subchannel and bundle friction factors are obtained. For laminar flow, models are developed for pressure drops of individual subchannels. From these models, expressions for the subchannel friction factors are identified and expressions for the flowsplit parameters are derived.
Yu, Wenbao; Park, Taesung
2014-01-01
Motivation It is common to get an optimal combination of markers for disease classification and prediction when multiple markers are available. Many approaches based on the area under the receiver operating characteristic curve (AUC) have been proposed. Existing works based on AUC in a high-dimensional context depend mainly on a non-parametric, smooth approximation of AUC, with no work using a parametric AUC-based approach, for high-dimensional data. Results We propose an AUC-based approach u...
Systematic parameter inference in stochastic mesoscopic modeling
Energy Technology Data Exchange (ETDEWEB)
Lei, Huan; Yang, Xiu [Pacific Northwest National Laboratory, Richland, WA 99352 (United States); Li, Zhen [Division of Applied Mathematics, Brown University, Providence, RI 02912 (United States); Karniadakis, George Em, E-mail: george_karniadakis@brown.edu [Division of Applied Mathematics, Brown University, Providence, RI 02912 (United States)
2017-02-01
We propose a method to efficiently determine the optimal coarse-grained force field in mesoscopic stochastic simulations of Newtonian fluid and polymer melt systems modeled by dissipative particle dynamics (DPD) and energy conserving dissipative particle dynamics (eDPD). The response surfaces of various target properties (viscosity, diffusivity, pressure, etc.) with respect to model parameters are constructed based on the generalized polynomial chaos (gPC) expansion using simulation results on sampling points (e.g., individual parameter sets). To alleviate the computational cost to evaluate the target properties, we employ the compressive sensing method to compute the coefficients of the dominant gPC terms given the prior knowledge that the coefficients are “sparse”. The proposed method shows comparable accuracy with the standard probabilistic collocation method (PCM) while it imposes a much weaker restriction on the number of the simulation samples, especially for systems with high dimensional parametric space. Full access to the response surfaces within the confidence range enables us to infer the optimal force parameters given the desirable values of target properties at the macroscopic scale. Moreover, it enables us to investigate the intrinsic relationship between the model parameters, identify possible degeneracies in the parameter space, and optimize the model by eliminating model redundancies. The proposed method provides an efficient alternative approach for constructing mesoscopic models by inferring model parameters to recover target properties of the physical systems (e.g., from experimental measurements), where those force field parameters and formulation cannot be derived from the microscopic level in a straightforward way.
DEFF Research Database (Denmark)
Koziel, Slawomir; Bandler, John W.; Madsen, Kaj
2006-01-01
the surrogate, we perform parameter extraction with weighting coefficients dependent on the distance between the point of interest and base points. We provide theoretical results showing that the new methodology can assure any accuracy that is required (provided the base set is dense enough), which...
Energy Technology Data Exchange (ETDEWEB)
Khawli, Toufik Al; Eppelt, Urs; Hermanns, Torsten [RWTH Aachen University, Chair for Nonlinear Dynamics, Steinbachstr. 15, 52047 Aachen (Germany); Gebhardt, Sascha [RWTH Aachen University, Virtual Reality Group, IT Center, Seffenter Weg 23, 52074 Aachen (Germany); Kuhlen, Torsten [Forschungszentrum Jülich GmbH, Institute for Advanced Simulation (IAS), Jülich Supercomputing Centre (JSC), Wilhelm-Johnen-Straße, 52425 Jülich (Germany); Schulz, Wolfgang [Fraunhofer, ILT Laser Technology, Steinbachstr. 15, 52047 Aachen (Germany)
2016-06-08
In production industries, parameter identification, sensitivity analysis and multi-dimensional visualization are vital steps in the planning process for achieving optimal designs and gaining valuable information. Sensitivity analysis and visualization can help in identifying the most-influential parameters and quantify their contribution to the model output, reduce the model complexity, and enhance the understanding of the model behavior. Typically, this requires a large number of simulations, which can be both very expensive and time consuming when the simulation models are numerically complex and the number of parameter inputs increases. There are three main constituent parts in this work. The first part is to substitute the numerical, physical model by an accurate surrogate model, the so-called metamodel. The second part includes a multi-dimensional visualization approach for the visual exploration of metamodels. In the third part, the metamodel is used to provide the two global sensitivity measures: i) the Elementary Effect for screening the parameters, and ii) the variance decomposition method for calculating the Sobol indices that quantify both the main and interaction effects. The application of the proposed approach is illustrated with an industrial application with the goal of optimizing a drilling process using a Gaussian laser beam.
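The screening step mentioned in the abstract above (the Elementary Effect measure) can be sketched in a few lines of Python. This is a simplified one-at-a-time design rather than full Morris trajectories, and the toy metamodel, the step size `delta` and the number of base points are all hypothetical choices made for illustration; they are not taken from the paper.

```python
import random

def elementary_effects(f, dim, n_base=20, delta=0.1, seed=0):
    """Mean absolute one-at-a-time effect of each input of f over [0, 1]^dim."""
    rng = random.Random(seed)
    mu_star = [0.0] * dim
    for _ in range(n_base):
        # Base point chosen so that x[i] + delta stays inside the unit cube.
        x = [rng.uniform(0.0, 1.0 - delta) for _ in range(dim)]
        fx = f(x)
        for i in range(dim):
            stepped = list(x)
            stepped[i] += delta  # perturb one input, hold the others fixed
            mu_star[i] += abs(f(stepped) - fx) / delta
    return [m / n_base for m in mu_star]

def toy_metamodel(x):
    # Hypothetical surrogate: strong linear effect of x[0],
    # weaker nonlinear effect of x[1], no effect of x[2].
    return 5.0 * x[0] + x[1] ** 2 + 0.0 * x[2]

mu_star = elementary_effects(toy_metamodel, dim=3)
ranking = sorted(range(3), key=lambda i: -mu_star[i])
print(mu_star, ranking)
```

Inputs with a mean absolute effect near zero are candidates for being fixed before the more expensive variance-based (Sobol) analysis is run on the remaining parameters.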
Directory of Open Access Journals (Sweden)
Boulesteix Anne-Laure
2009-12-01
Full Text Available Abstract Background In biometric practice, researchers often apply a large number of different methods in a "trial-and-error" strategy to get as much as possible out of their data and, due to publication pressure or pressure from the consulting customer, present only the most favorable results. This strategy may induce a substantial optimistic bias in prediction error estimation, which is quantitatively assessed in the present manuscript. The focus of our work is on class prediction based on high-dimensional data (e.g. microarray data), since such analyses are particularly exposed to this kind of bias. Methods In our study we consider a total of 124 variants of classifiers (possibly including variable selection or tuning steps) within a cross-validation evaluation scheme. The classifiers are applied to original and modified real microarray data sets, some of which are obtained by randomly permuting the class labels to mimic non-informative predictors while preserving their correlation structure. Results We assess the minimal misclassification rate over the different variants of classifiers in order to quantify the bias arising when the optimal classifier is selected a posteriori in a data-driven manner. The bias resulting from the parameter tuning (including gene selection parameters as a special case) and the bias resulting from the choice of the classification method are examined both separately and jointly. Conclusions The median minimal error rate over the investigated classifiers was as low as 31% and 41% based on permuted uninformative predictors from studies on colon cancer and prostate cancer, respectively. We conclude that the strategy to present only the optimal result is not acceptable because it yields a substantial bias in error rate estimation, and suggest alternative approaches for properly reporting classification accuracy.
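The mechanism behind this optimistic bias can be reproduced with a deliberately crude Python simulation. It is only a sketch under strong assumptions: the 124 "classifiers" are independent coin flips rather than real cross-validated methods, and the sample size of 60 is hypothetical. The only figure borrowed from the abstract is the number of variants.

```python
import random

def min_error_over_random_classifiers(n_samples=60, n_classifiers=124, seed=1):
    """Return (minimal, mean) error rate of classifiers that guess at random."""
    rng = random.Random(seed)
    # Uninformative setting: labels carry no signal that any classifier can use.
    labels = [rng.randint(0, 1) for _ in range(n_samples)]
    errors = []
    for _ in range(n_classifiers):
        preds = [rng.randint(0, 1) for _ in range(n_samples)]
        wrong = sum(p != y for p, y in zip(preds, labels))
        errors.append(wrong / n_samples)
    return min(errors), sum(errors) / len(errors)

best, average = min_error_over_random_classifiers()
print(f"mean error {average:.2f}, minimal error {best:.2f}")
```

Every classifier here has a true error rate of 50%, yet the minimum over many of them falls well below that, which is the effect that reporting only the best result would conceal. Real classifiers evaluated on the same data are correlated, so the size of the bias in practice will differ from this toy setting.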
High dimensional biological data retrieval optimization with NoSQL technology
2014-01-01
Background High-throughput transcriptomic data generated by microarray experiments is the most abundant and frequently stored kind of data currently used in translational medicine studies. Although microarray data is supported in data warehouses such as tranSMART, when querying relational databases for hundreds of different patient gene expression records queries are slow due to poor performance. Non-relational data models, such as the key-value model implemented in NoSQL databases, hold promise to be more performant solutions. Our motivation is to improve the performance of the tranSMART data warehouse with a view to supporting Next Generation Sequencing data. Results In this paper we introduce a new data model better suited for high-dimensional data storage and querying, optimized for database scalability and performance. We have designed a key-value pair data model to support faster queries over large-scale microarray data and implemented the model using HBase, an implementation of Google's BigTable storage system. An experimental performance comparison was carried out against the traditional relational data model implemented in both MySQL Cluster and MongoDB, using a large publicly available transcriptomic data set taken from NCBI GEO concerning Multiple Myeloma. Our new key-value data model implemented on HBase exhibits an average 5.24-fold increase in high-dimensional biological data query performance compared to the relational model implemented on MySQL Cluster, and an average 6.47-fold increase on query performance on MongoDB. Conclusions The performance evaluation found that the new key-value data model, in particular its implementation in HBase, outperforms the relational model currently implemented in tranSMART. We propose that NoSQL technology holds great promise for large-scale data management, in particular for high-dimensional biological data such as that demonstrated in the performance evaluation described in this paper. We aim to use this new data
Geraci, Joseph; Dharsee, Moyez; Nuin, Paulo; Haslehurst, Alexandria; Koti, Madhuri; Feilotter, Harriet E; Evans, Ken
2014-03-01
We introduce a novel method for visualizing high dimensional data via a discrete dynamical system. This method provides a 2D representation of the relationship between subjects according to a set of variables without geometric projections, transformed axes or principal components. The algorithm exploits a memory-type mechanism inherent in a certain class of discrete dynamical systems collectively referred to as the chaos game that are closely related to iterated function systems. The goal of the algorithm was to create a human readable representation of high dimensional patient data that was capable of detecting unrevealed subclusters of patients from within anticipated classifications. This provides a mechanism to further pursue a more personalized exploration of pathology when used with medical data. For clustering and classification protocols, the dynamical system portion of the algorithm is designed to come after some feature selection filter and before some model evaluation (e.g. clustering accuracy) protocol. In the version given here, a univariate feature selection step is performed (in practice more complex feature selection methods are used), a discrete dynamical system is driven by this reduced set of variables (which results in a set of 2D cluster models), these models are evaluated for their accuracy (according to a user-defined binary classification) and finally a visual representation of the top classification models is returned. Thus, in addition to the visualization component, this methodology can be used for both supervised and unsupervised machine learning as the top performing models are returned in the protocol we describe here. Butterfly, the algorithm we introduce and provide working code for, uses a discrete dynamical system to classify high dimensional data and provide a 2D representation of the relationship between subjects. We report results on three datasets (two in the article; one in the appendix) including a public lung cancer
Taşkin Kaya, Gülşen
2013-10-01
Recently, earthquake damage assessment using satellite images has been a very popular ongoing research direction. Especially with the availability of very high resolution (VHR) satellite images, quite detailed damage maps at building scale have been produced, and various studies have been conducted in the literature. As the spatial resolution of satellite images increases, distinguishing damage patterns becomes more difficult, especially when only the spectral information is used during classification. In order to overcome this difficulty, textural information needs to be included in the classification to improve the visual quality and reliability of the damage map. There are many kinds of textural information which can be derived from VHR satellite images depending on the algorithm used. However, extracting and evaluating textural information has generally been a time consuming process, especially for large areas affected by the earthquake, due to the size of the VHR image. Therefore, in order to provide a quick damage map, the most useful features describing damage patterns need to be known in advance, as well as the redundant features. In this study, a very high resolution satellite image acquired after the Bam, Iran earthquake was used to identify the earthquake damage. In addition to the spectral information, textural information was also used during the classification. For textural information, second order Haralick features were extracted from the panchromatic image for the area of interest using the gray level co-occurrence matrix with different window sizes and directions. In addition to using spatial features in classification, the most useful features representing the damage characteristics were selected with a novel feature selection method based on high dimensional model representation (HDMR), giving the sensitivity of each feature during classification. The method called HDMR was recently proposed as an efficient tool to capture the input
High dimensional biological data retrieval optimization with NoSQL technology.
Wang, Shicai; Pandis, Ioannis; Wu, Chao; He, Sijin; Johnson, David; Emam, Ibrahim; Guitton, Florian; Guo, Yike
2014-01-01
High-throughput transcriptomic data generated by microarray experiments is the most abundant and frequently stored kind of data currently used in translational medicine studies. Although microarray data is supported in data warehouses such as tranSMART, when querying relational databases for hundreds of different patient gene expression records queries are slow due to poor performance. Non-relational data models, such as the key-value model implemented in NoSQL databases, hold promise to be more performant solutions. Our motivation is to improve the performance of the tranSMART data warehouse with a view to supporting Next Generation Sequencing data. In this paper we introduce a new data model better suited for high-dimensional data storage and querying, optimized for database scalability and performance. We have designed a key-value pair data model to support faster queries over large-scale microarray data and implemented the model using HBase, an implementation of Google's BigTable storage system. An experimental performance comparison was carried out against the traditional relational data model implemented in both MySQL Cluster and MongoDB, using a large publicly available transcriptomic data set taken from NCBI GEO concerning Multiple Myeloma. Our new key-value data model implemented on HBase exhibits an average 5.24-fold increase in high-dimensional biological data query performance compared to the relational model implemented on MySQL Cluster, and an average 6.47-fold increase on query performance on MongoDB. The performance evaluation found that the new key-value data model, in particular its implementation in HBase, outperforms the relational model currently implemented in tranSMART. We propose that NoSQL technology holds great promise for large-scale data management, in particular for high-dimensional biological data such as that demonstrated in the performance evaluation described in this paper. We aim to use this new data model as a basis for migrating
Kozawa, Takahiro; Oizumi, Hiroaki; Itani, Toshiro; Tagawa, Seiichi
2010-11-01
The development of extreme ultraviolet (EUV) lithography has progressed owing to worldwide effort. As the development status of EUV lithography approaches the requirements for the high-volume production of semiconductor devices with a minimum line width of 22 nm, the extraction of resist parameters becomes increasingly important from the viewpoints of the accurate evaluation of resist materials for resist screening and the accurate process simulation for process and mask designs. In this study, we demonstrated that resist parameters (namely, quencher concentration, acid diffusion constant, proportionality constant of line edge roughness, and dissolution point) can be extracted from the scanning electron microscopy (SEM) images of patterned resists without the knowledge on the details of resist contents using two types of latest EUV resist.
International Nuclear Information System (INIS)
Krylov, V.I.; Sorokin, S.V.
1998-01-01
The dynamics of an Euler-Bernoulli beam with a time-and-space dependent bending stiffness is studied. The problem is considered in connection with the application of noise control using smart structures. It is shown that a control for the vibrations of the beam can be achieved by varying the bending stiffness. The technique of direct separation of fast and slow motion coupled with a Green's function method is used to analyze the dynamics of the beam with high-frequency modulation of the stiffness.
International Nuclear Information System (INIS)
Haddad, K; Alopoor, H
2016-01-01
Purpose: Recently, multileaf collimators (MLC) have become an important part of any LINAC collimation system because they reduce the treatment planning time and improve the conformity. Important factors that affect the MLC collimation performance are the leaves' material composition and their thickness. In this study, we investigate the main dosimetric parameters of the 120-leaf Millennium MLC, including dose in the buildup point, physical penumbra, and average and end leaf leakages. Effects of the leaves' geometry and density on these parameters are evaluated. Methods: From the EGSnrc Monte Carlo code, the BEAMnrc and DOSXYZnrc modules are used to evaluate the dosimetric parameters of a water phantom exposed to a Varian xi for 100cm SSD. Using IAEA phasespace data just above the MLC (Z=46cm) and BEAMnrc, new phase space data at Z=52cm are produced for the modified 120-leaf Millennium MLC. The MLC is modified both in leaf thickness and material composition. The EGSgui code generates the 521ICRU library for tungsten alloys. DOSXYZnrc with the new phase space evaluates the dose distribution in a water phantom of 60×60×20 cm3 with voxel size of 4×4×2 mm3. Using the DOSXYZnrc dose distributions for open beam and closed beam as well as the leakage definitions, end leakage, average leakage and physical penumbra are evaluated. Results: A new MLC with improved dosimetric parameters is proposed. The physical penumbra for the proposed MLC is 4.7 mm compared to 5.16 mm for Millennium. Average leakage in our design is reduced to 1.16% compared to 1.73% for Millennium; the end leaf leakage in the suggested design is also reduced to 4.86% compared to 7.26% for Millennium. Conclusion: The results show that the proposed MLC with enhanced dosimetric parameters could improve the conformity of treatment planning.
Energy Technology Data Exchange (ETDEWEB)
Haddad, K; Alopoor, H [Shiraz University, Shiraz, I.R. Iran (Iran, Islamic Republic of)
2016-06-15
Purpose: Recently, multileaf collimators (MLC) have become an important part of any LINAC collimation system because they reduce the treatment planning time and improve the conformity. Important factors that affect the MLC collimation performance are the leaves' material composition and their thickness. In this study, we investigate the main dosimetric parameters of the 120-leaf Millennium MLC, including dose in the buildup point, physical penumbra, and average and end leaf leakages. Effects of the leaves' geometry and density on these parameters are evaluated. Methods: From the EGSnrc Monte Carlo code, the BEAMnrc and DOSXYZnrc modules are used to evaluate the dosimetric parameters of a water phantom exposed to a Varian xi for 100cm SSD. Using IAEA phasespace data just above the MLC (Z=46cm) and BEAMnrc, new phase space data at Z=52cm are produced for the modified 120-leaf Millennium MLC. The MLC is modified both in leaf thickness and material composition. The EGSgui code generates the 521ICRU library for tungsten alloys. DOSXYZnrc with the new phase space evaluates the dose distribution in a water phantom of 60×60×20 cm3 with voxel size of 4×4×2 mm3. Using the DOSXYZnrc dose distributions for open beam and closed beam as well as the leakage definitions, end leakage, average leakage and physical penumbra are evaluated. Results: A new MLC with improved dosimetric parameters is proposed. The physical penumbra for the proposed MLC is 4.7 mm compared to 5.16 mm for Millennium. Average leakage in our design is reduced to 1.16% compared to 1.73% for Millennium; the end leaf leakage in the suggested design is also reduced to 4.86% compared to 7.26% for Millennium. Conclusion: The results show that the proposed MLC with enhanced dosimetric parameters could improve the conformity of treatment planning.
Penalized estimation for competing risks regression with applications to high-dimensional covariates
DEFF Research Database (Denmark)
Ambrogi, Federico; Scheike, Thomas H.
2016-01-01
of competing events. The direct binomial regression model of Scheike and others (2008. Predicting cumulative incidence probability by direct binomial regression. Biometrika 95: (1), 205-220) is reformulated in a penalized framework to possibly fit a sparse regression model. The developed approach is easily...... Research 19: (1), 29-51), the research regarding competing risks is less developed (Binder and others, 2009. Boosting for high-dimensional time-to-event data with competing risks. Bioinformatics 25: (7), 890-896). The aim of this work is to consider how to do penalized regression in the presence...... implementable using existing high-performance software to do penalized regression. Results from simulation studies are presented together with an application to genomic data when the endpoint is progression-free survival. An R function is provided to perform regularized competing risks regression according...
Energy Technology Data Exchange (ETDEWEB)
Tahira, Rabia; Ikram, Manzoor; Zubairy, M Suhail [Centre for Quantum Physics, COMSATS Institute of Information Technology, Islamabad (Pakistan); Bougouffa, Smail [Department of Physics, Faculty of Science, Taibah University, PO Box 30002, Madinah (Saudi Arabia)
2010-02-14
We investigate the phenomenon of sudden death of entanglement in a high-dimensional bipartite system subjected to dissipative environments with an arbitrary initial pure entangled state between two fields in the cavities. We find that in a vacuum reservoir, the presence of the state where one or more than one (two) photons in each cavity are present is a necessary condition for the sudden death of entanglement. Otherwise entanglement remains for infinite time and decays asymptotically with the decay of individual qubits. For pure two-qubit entangled states in a thermal environment, we observe that sudden death of entanglement always occurs. The sudden death time of the entangled states is related to the number of photons in the cavities, the temperature of the reservoir and the initial preparation of the entangled states.
International Nuclear Information System (INIS)
Tahira, Rabia; Ikram, Manzoor; Zubairy, M Suhail; Bougouffa, Smail
2010-01-01
We investigate the phenomenon of sudden death of entanglement in a high-dimensional bipartite system subjected to dissipative environments with an arbitrary initial pure entangled state between two fields in the cavities. We find that in a vacuum reservoir, the presence of the state where one or more than one (two) photons in each cavity are present is a necessary condition for the sudden death of entanglement. Otherwise entanglement remains for infinite time and decays asymptotically with the decay of individual qubits. For pure two-qubit entangled states in a thermal environment, we observe that sudden death of entanglement always occurs. The sudden death time of the entangled states is related to the number of photons in the cavities, the temperature of the reservoir and the initial preparation of the entangled states.
Time–energy high-dimensional one-side device-independent quantum key distribution
International Nuclear Information System (INIS)
Bao Hai-Ze; Bao Wan-Su; Wang Yang; Chen Rui-Ke; Ma Hong-Xin; Zhou Chun; Li Hong-Wei
2017-01-01
Compared with full device-independent quantum key distribution (DI-QKD), one-side device-independent QKD (1sDI-QKD) has fewer requirements, which are much easier to meet. In this paper, by applying recently developed novel time–energy entropic uncertainty relations, we present a time–energy high-dimensional one-side device-independent quantum key distribution (HD-QKD) and provide the security proof against coherent attacks. Besides, we connect the security with the quantum steering. By numerical simulation, we obtain the secret key rate for Alice’s different detection efficiencies. The results show that our protocol can perform much better than the original 1sDI-QKD. Furthermore, we clarify the relation among the secret key rate, Alice’s detection efficiency, and the dispersion coefficient. Finally, we briefly analyze its performance in the optical fiber channel. (paper)
A Cure for Variance Inflation in High Dimensional Kernel Principal Component Analysis
DEFF Research Database (Denmark)
Abrahamsen, Trine Julie; Hansen, Lars Kai
2011-01-01
Small sample high-dimensional principal component analysis (PCA) suffers from variance inflation and lack of generalizability. It has earlier been pointed out that a simple leave-one-out variance renormalization scheme can cure the problem. In this paper we generalize the cure in two directions......: First, we propose a computationally less intensive approximate leave-one-out estimator, secondly, we show that variance inflation is also present in kernel principal component analysis (kPCA) and we provide a non-parametric renormalization scheme which can quite efficiently restore generalizability in kPCA....... As for PCA our analysis also suggests a simplified approximate expression. © 2011 Trine J. Abrahamsen and Lars K. Hansen....
Inference for feature selection using the Lasso with high-dimensional data
DEFF Research Database (Denmark)
Brink-Jensen, Kasper; Ekstrøm, Claus Thorn
2014-01-01
Penalized regression models such as the Lasso have proved useful for variable selection in many fields - especially for situations with high-dimensional data where the number of predictors far exceeds the number of observations. These methods identify and rank variables of importance but do...... not generally provide any inference of the selected variables. Thus, the variables selected might be the "most important" but need not be significant. We propose a significance test for the selection found by the Lasso. We introduce a procedure that computes inference and p-values for features chosen...... by the Lasso. This method rephrases the null hypothesis and uses a randomization approach which ensures that the error rate is controlled even for small samples. We demonstrate the ability of the algorithm to compute $p$-values of the expected magnitude with simulated data using a multitude of scenarios...
Diagonal Likelihood Ratio Test for Equality of Mean Vectors in High-Dimensional Data
Hu, Zongliang; Tong, Tiejun; Genton, Marc G.
2017-01-01
We propose a likelihood ratio test framework for testing normal mean vectors in high-dimensional data under two common scenarios: the one-sample test and the two-sample test with equal covariance matrices. We derive the test statistics under the assumption that the covariance matrices follow a diagonal matrix structure. In comparison with the diagonal Hotelling's tests, our proposed test statistics display some interesting characteristics. In particular, they are a summation of the log-transformed squared t-statistics rather than a direct summation of those components. More importantly, to derive the asymptotic normality of our test statistics under the null and local alternative hypotheses, we do not require the assumption that the covariance matrix follows a diagonal matrix structure. As a consequence, our proposed test methods are very flexible and can be widely applied in practice. Finally, simulation studies and a real data analysis are also conducted to demonstrate the advantages of our likelihood ratio test method.
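The contrast drawn in the abstract above, between summing log-transformed squared t-statistics and summing the squared t-statistics directly, can be made concrete with a short Python sketch for the one-sample case. It relies on the standard identity that the univariate normal likelihood ratio statistic for a zero mean equals n·log(1 + t²/(n−1)); under a diagonal covariance the coordinates contribute additively. The function names and the toy data are hypothetical, and the centring and scaling the authors use to obtain asymptotic normality are not reproduced here.

```python
import math

def squared_t_statistics(data):
    """Squared one-sample t-statistic (null mean 0) for each column of data."""
    n, p = len(data), len(data[0])
    result = []
    for j in range(p):
        col = [row[j] for row in data]
        mean = sum(col) / n
        s2 = sum((v - mean) ** 2 for v in col) / (n - 1)  # unbiased sample variance
        result.append(n * mean ** 2 / s2)
    return result

def diagonal_lrt_sum(data):
    """Sum over coordinates of n * log(1 + t_j^2 / (n - 1))."""
    n = len(data)
    return sum(n * math.log(1.0 + t2 / (n - 1)) for t2 in squared_t_statistics(data))

def diagonal_hotelling_sum(data):
    """Direct sum of squared t-statistics, shown for comparison."""
    return sum(squared_t_statistics(data))

# Toy data: 3 observations of a 2-dimensional vector (hypothetical values).
data = [[1.0, 0.0], [2.0, 1.0], [3.0, -1.0]]
lrt = diagonal_lrt_sum(data)
hot = diagonal_hotelling_sum(data)
print(lrt, hot)
```

Because the logarithm grows slowly, a single coordinate with a very large t-statistic dominates the log-transformed sum much less than it dominates the direct sum.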
Characterization of differentially expressed genes using high-dimensional co-expression networks
DEFF Research Database (Denmark)
Coelho Goncalves de Abreu, Gabriel; Labouriau, Rodrigo S.
2010-01-01
We present a technique to characterize differentially expressed genes in terms of their position in a high-dimensional co-expression network. The set-up of Gaussian graphical models is used to construct representations of the co-expression network in such a way that redundancy and the propagation...... that allow to make effective inference in problems with high degree of complexity (e.g. several thousands of genes) and small number of observations (e.g. 10-100) as typically occurs in high throughput gene expression studies. Taking advantage of the internal structure of decomposable graphical models, we...... construct a compact representation of the co-expression network that allows to identify the regions with high concentration of differentially expressed genes. It is argued that differentially expressed genes located in highly interconnected regions of the co-expression network are less informative than...
Kernel based methods for accelerated failure time model with ultra-high dimensional data
Directory of Open Access Journals (Sweden)
Jiang Feng
2010-12-01
Full Text Available Abstract Background Most genomic data have ultra-high dimensions with more than 10,000 genes (probes). Regularization methods with L1 and Lp penalty have been extensively studied in survival analysis with high-dimensional genomic data. However, when the sample size n ≪ m (the number of genes), directly identifying a small subset of genes from ultra-high (m > 10,000) dimensional data is time-consuming and not computationally efficient. In current microarray analysis, what people really do is select a couple of thousands (or hundreds) of genes using univariate analysis or statistical tests, and then apply the LASSO-type penalty to further reduce the number of disease associated genes. This two-step procedure may introduce bias and inaccuracy and lead us to miss biologically important genes. Results The accelerated failure time (AFT) model is a linear regression model and a useful alternative to the Cox model for survival analysis. In this paper, we propose a nonlinear kernel based AFT model and an efficient variable selection method with adaptive kernel ridge regression. Our proposed variable selection method is based on the kernel matrix and dual problem with a much smaller n × n matrix. It is very efficient when the number of unknown variables (genes) is much larger than the number of samples. Moreover, the primal variables are explicitly updated and the sparsity in the solution is exploited. Conclusions Our proposed methods can simultaneously identify survival associated prognostic factors and predict survival outcomes with ultra-high dimensional genomic data. We have demonstrated the performance of our methods with both simulation and real data. The proposed method performs superbly with limited computational studies.
Travnik, Jaden B; Pilarski, Patrick M
2017-07-01
Prosthetic devices have advanced in their capabilities and in the number and type of sensors included in their design. As the space of sensorimotor data available to a conventional or machine learning prosthetic control system increases in dimensionality and complexity, it becomes increasingly important that this data be represented in a useful and computationally efficient way. Well structured sensory data allows prosthetic control systems to make informed, appropriate control decisions. In this study, we explore the impact that increased sensorimotor information has on current machine learning prosthetic control approaches. Specifically, we examine the effect that high-dimensional sensory data has on the computation time and prediction performance of a true-online temporal-difference learning prediction method as embedded within a resource-limited upper-limb prosthesis control system. We present results comparing tile coding, the dominant linear representation for real-time prosthetic machine learning, with a newly proposed modification to Kanerva coding that we call selective Kanerva coding. In addition to showing promising results for selective Kanerva coding, our results confirm potential limitations to tile coding as the number of sensory input dimensions increases. To our knowledge, this study is the first to explicitly examine representations for realtime machine learning prosthetic devices in general terms. This work therefore provides an important step towards forming an efficient prosthesis-eye view of the world, wherein prompt and accurate representations of high-dimensional data may be provided to machine learning control systems within artificial limbs and other assistive rehabilitation technologies.
Lombardo, Luca; D'Ercole, Antonio; Latini, Michele Carmelo; Siciliani, Giuseppe
2014-11-27
The aim of this study was to provide clinical indications for the correct management of appliances in space closure treatment of patients with agenesis of the upper lateral incisors. Virtual setup for space closure was performed in 30 patients with upper lateral incisor agenesis. Tip, torque and in-out values were measured and compared with those of previous authors. In the upper dentition, the tip values were comparable to those described by Andrews (Am J Orthod 62(3):296-309, 1972), except for at the first premolars, which require a greater tip, and the first molars, a lesser tip. The torque values showed no differences except for at the canines, where it was greater, and the in-out values were between those reported by Andrews and those by Watanabe et al. (The Shikwa Gakuho 96:209-222, 1996) (except for U3 and U4). The following prescriptions are advisable: tip 5°, torque 8° and in-out 2.5 for U1; tip 9°, torque 3° and in-out 3.25 for U3; tip 10°, torque -8° and in-out 3.75 for U4; and tip 5°, torque -8° and in-out 4 for U5. Andrews' prescription is suitable for the lower jaw, except for at L6. It is also advisable to execute selective grinding (1.33±0.5 mm) and extrusion (0.68±0.23 mm) on the upper canine during treatment, and the first premolar requires some intrusion (0.56±0.30 mm).
International Nuclear Information System (INIS)
Potlog, T.
2007-01-01
Thin film CdS/CdTe solar cells were fabricated by close space sublimation at substrate temperatures ranging from 300 degrees ± 5 degrees to 340 degrees ± degrees. The best photovoltaic parameters were achieved at a substrate temperature of 320 degrees and a source temperature of 610 degrees. The open circuit voltage and current density change significantly with the substrate temperature and depend on the grain size. Grain size is an efficiency-limiting parameter for CdTe layers with large grains. The open circuit voltage and current density are best for cells with grain dimensions between 1.0 μm and ∼ 5.0 μm. CdS/CdTe solar cells with an efficiency of ∼ 10% were obtained. (author)
International Nuclear Information System (INIS)
Newton, W. G.; Gearheart, M.; Li Baoan
2013-01-01
We present a systematic survey of the range of predictions of the neutron star inner crust composition, crust-core transition densities and pressures, and density range of the nuclear 'pasta' phases at the bottom of the crust provided by the compressible liquid drop model in light of the current experimental and theoretical constraints on model parameters. Using a Skyrme-like model for nuclear matter, we construct baseline sequences of crust models by consistently varying the density dependence of the bulk symmetry energy at nuclear saturation density, L, under two conditions: (1) that the magnitude of the symmetry energy at saturation density J is held constant, and (2) J correlates with L under the constraint that the pure neutron matter (PNM) equation of state (EoS) satisfies the results of ab initio calculations at low densities. Such baseline crust models facilitate consistent exploration of the L dependence of crustal properties. The remaining surface energy and symmetric nuclear matter parameters are systematically varied around the baseline, and different functional forms of the PNM EoS at sub-saturation densities implemented, to estimate theoretical 'error bars' for the baseline predictions. Inner crust composition and transition densities are shown to be most sensitive to the surface energy at very low proton fractions and to the behavior of the sub-saturation PNM EoS. Recent calculations of the energies of neutron drops suggest that the low-proton-fraction surface energy might be higher than predicted in Skyrme-like models, which our study suggests may result in a greatly reduced volume of pasta in the crust than conventionally predicted.
Energy Technology Data Exchange (ETDEWEB)
Newton, W. G.; Gearheart, M.; Li Baoan, E-mail: william.newton@tamuc.edu [Department of Physics and Astronomy, Texas A and M University-Commerce, Commerce, TX 75429-3011 (United States)
2013-01-15
Namysłowska-Wilczyńska, Barbara
2016-04-01
These data were subjected to spatial analyses using statistical and geostatistical methods. This article presents basic statistics of the investigated quality parameters, including histograms of their distributions, scatter diagrams between the parameters, and correlation coefficients r. The directional semivariogram function and the ordinary (block) kriging procedure were used to build the 3D geostatistical model. The geostatistical parameters of the theoretical models of directional semivariograms of the studied water quality parameters, calculated along the time interval and along the well depth (taking into account the terrain elevation), were used in the ordinary (block) kriging estimation. The estimation results, i.e. block diagrams, made it possible to determine the levels of increased values Z* of the studied underground water quality parameters. The analysis of the variability of the selected underground water quality parameters in the Kłodzko water intake area was enriched by referring to the results of geostatistical studies, discussed in earlier works, carried out for underground water quality parameters and also for treated water in the Kłodzko water supply system (iron Fe, manganese Mn, ammonium ion NH4+ contents). Spatial and time variation in the latter parameters was analysed on the basis of the data (2007-2011, 2008-2011). Generally, the behaviour of the underground water quality parameters has been found to vary in space and time. Thanks to the spatial analyses of the variation in the quality parameters in the Kłodzko underground water intake area, some regularities (trends) in the variation in water quality have been identified.
Directory of Open Access Journals (Sweden)
Diego CARVALHO
2018-02-01
Experiments were carried out to analyze the effect of growth rates (VL) and cooling rates (TR) on both secondary dendritic arm spacings (λ2) and Vickers microhardness (HV) of an Al-9wt.%Si alloy during horizontal directional solidification under transient heat flow conditions. A water-cooled solidification experimental apparatus was developed, allowing a wide range of TR (from 0.2 to 3.5 ºC/s) to be experienced. Five computer-guided thermocouples were connected with the metal, and the time-temperature data were recorded automatically. The solidification path was also calculated by the Scheil model in Thermo-Calc software. Casting samples were characterized by combined analyses of optical microscopy (OM) and scanning electron microscopy coupled with energy dispersive spectrometry (SEM-EDS), revealing a complex arrangement of phases including binary (α-Al + Si) and ternary (α-Al + Si + β-AlFeSi) mixtures within interdendritic regions. It was observed that power law functions characterize the variation of λ2 as a function of VL and TR, with exponents of -2/3 and -1/3, respectively. Finally, experimental power-law and Hall-Petch-type relations are proposed relating the resulting HV to λ2. According to these results, HV decreases as λ2 increases. DOI: http://dx.doi.org/10.5755/j01.ms.24.1.17319
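Written out, the scaling relations described in this abstract take the form below. The prefactors C_1 and C_2 and the Hall-Petch constants H_0 and k are placeholders introduced here, since the abstract reports the exponents but not the fitted values; the exponent -1/2 in the last relation is the conventional Hall-Petch form and is an assumption, not a quoted result.

```latex
\lambda_2 = C_1\, V_L^{-2/3}, \qquad
\lambda_2 = C_2\, T_R^{-1/3}, \qquad
HV = H_0 + k\, \lambda_2^{-1/2}, \quad k > 0
```

With k > 0 the last relation reproduces the stated trend: coarser secondary arm spacing gives lower microhardness.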
International Nuclear Information System (INIS)
Barthelemy, O.; Margot, J.; Chaker, M.; Sabsabi, M.; Vidal, F.; Johnston, T.W.; Laville, S.; Le Drogoff, B.
2005-01-01
In this work, an aluminum laser plasma produced in ambient air at atmospheric pressure by laser pulses at a fluence of 10 J/cm² is characterized by time- and space-resolved measurements of electron density and temperature. Varying the laser pulse duration from 6 ns to 80 fs and the laser wavelength from ultraviolet to infrared only slightly influences the plasma properties. The temperature exhibits a slight decrease both at the plasma edge and close to the target surface. The electron density is found to be spatially homogeneous in the ablation plume during the first microsecond. Finally, the plasma expansion is in good agreement with Sedov's model during the first 500 ns and becomes subsonic, with respect to the velocity of sound in air, typically 1 μs after the plasma creation. The physical interpretation of the experimental results is also discussed in the light of a one-dimensional fluid model, which provides good qualitative agreement with the measurements.
Directory of Open Access Journals (Sweden)
Rui Xu
2013-01-01
Minimum description length (MDL) based group-wise registration was a state-of-the-art method to determine the corresponding points of 3D shapes for the construction of statistical shape models (SSMs). However, it suffered from the problem that determined corresponding points did not uniformly spread on original shapes, since corresponding points were obtained by uniformly sampling the aligned shape on the parameterized space of unit sphere. We proposed a particle-system based method to obtain adaptive sampling positions on the unit sphere to resolve this problem. Here, a set of particles was placed on the unit sphere to construct a particle system whose energy was related to the distortions of parameterized meshes. By minimizing this energy, each particle was moved on the unit sphere. When the system became steady, particles were treated as vertices to build a spherical mesh, which was then relaxed to slightly adjust vertices to obtain optimal sampling-positions. We used 47 cases of (left and right) lungs and 50 cases of livers, (left and right) kidneys, and spleens for evaluations. Experiments showed that the proposed method was able to resolve the problem of the original MDL method, and the proposed method performed better in the generalization and specificity tests.
International Nuclear Information System (INIS)
Sivasakthivel, T.; Murugesan, K.; Thomas, H.R.
2014-01-01
Highlights: • Ground Source Heat Pump (GSHP) technology is suitable for both heating and cooling. • Important parameters that affect GSHP performance are listed. • Parameters of the GSHP system are optimized for heating and cooling modes. • The Taguchi technique and the utility concept are developed for GSHP optimization. - Abstract: Use of ground source energy for space heating applications through a Ground Source Heat Pump (GSHP) has been established as an efficient thermodynamic process. The electricity input to the GSHP can be reduced by increasing the COP of the system. However, the COP of a GSHP system will be different for heating and cooling mode operations. Hence, in order to reduce the electricity input to the GSHP, an optimum value of COP has to be determined when the GSHP is operated in both heating and cooling modes. In the present research, a methodology is proposed to optimize the operating parameters of a GSHP system which will operate in both heating and cooling modes. Condenser inlet temperature, condenser outlet temperature, dryness fraction at evaporator inlet and evaporator outlet temperature are considered as the influencing parameters of the heat pump. Optimization of these parameters for only heating or only cooling mode operation is achieved by employing the Taguchi method for three-level variations of the above parameters using an L9 (3^4) orthogonal array. The higher-the-better concept has been used to obtain a higher COP. A computer program in FORTRAN has been developed to carry out the computations, and the results have been analyzed for the optimum conditions using the Signal-to-Noise (SN) ratio and the Analysis Of Variance (ANOVA) method. Based on this analysis, the maximum COP values for only heating and only cooling operation are obtained as 4.25 and 3.32, respectively. By making use of the utility concept, both the higher values of COP obtained for heating and cooling modes are optimized to get a single optimum COP for heating and cooling modes. A single
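The Taguchi step described in this abstract (four factors at three levels on an L9 array, scored with a higher-the-better signal-to-noise ratio) can be sketched as follows. The array is the standard L9 (3^4) layout and the S/N formula is the usual larger-the-better definition; the COP values are invented for illustration and are not data from the paper, and the assignment of array columns to the four GSHP parameters is likewise an assumption.

```python
import numpy as np

# Standard L9(3^4) orthogonal array, levels coded 0..2 (rows = runs, columns = factors).
L9 = np.array([
    [1, 1, 1, 1], [1, 2, 2, 2], [1, 3, 3, 3],
    [2, 1, 2, 3], [2, 2, 3, 1], [2, 3, 1, 2],
    [3, 1, 3, 2], [3, 2, 1, 3], [3, 3, 2, 1],
]) - 1

def sn_larger_is_better(y):
    """Larger-the-better S/N ratio in dB for one run; y holds the replicate responses."""
    y = np.asarray(y, dtype=float)
    return -10.0 * np.log10(np.mean(1.0 / y**2))

# Hypothetical COP values, one replicate per run (not taken from the paper).
cop = np.array([3.1, 3.4, 3.6, 3.3, 3.9, 3.5, 3.8, 3.7, 4.0])
sn = np.array([sn_larger_is_better([c]) for c in cop])

# Mean S/N for each factor (rows) at each level (columns);
# the preferred level of a factor is the one with the largest mean S/N.
level_means = np.array([[sn[L9[:, f] == lvl].mean() for lvl in range(3)]
                        for f in range(4)])
best_levels = level_means.argmax(axis=1)
```

Because the array is orthogonal, each level of every factor appears in exactly three runs, so the level means can be compared factor by factor without running all 81 combinations.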
Andrade, Henrique; Alcoforado, Maria-João; Oliveira, Sandra
2011-09-01
We aim to understand the relationship between people's declared bioclimatic comfort, their personal characteristics (age, origin, clothing, activity and motivation, etc.) and the atmospheric conditions. To attain this goal, questionnaire surveys were made concurrently with weather measurements (air temperature, relative humidity, solar and long-wave radiation and wind speed) in two open leisure areas of Lisbon (Portugal), during the years 2006 and 2007. We analysed the desire expressed by the interviewees to decrease, maintain or increase the values of air temperature and wind speed, in order to improve their level of comfort. Multiple logistic regression was used to analyse the quantitative relation between preference votes and environmental and personal parameters. The preference for a different temperature depends on the season and is strongly associated with wind speed. Furthermore, a general decrease of discomfort with increasing age was also found. Most people declared a preference for lower wind speed in all seasons; the perception of wind shows significant differences depending on gender, with women declaring a lower level of comfort with higher wind speed. It was also found that the tolerance of warmer conditions is higher than of cooler conditions, and that adaptive strategies are undertaken by people to improve their level of comfort outdoors.
Bonetti, Matteo; Haardt, Francesco; Sesana, Alberto; Barausse, Enrico
2018-04-01
Massive black hole binaries (MBHBs) are expected to form at the centre of merging galaxies during the hierarchical assembly of the cosmic structure, and are expected to be the loudest sources of gravitational waves (GWs) in the low frequency domain. However, because of the dearth of energy exchanges with background stars and gas, many of these MBHBs may stall at separations too large for GW emission to drive them to coalescence in less than a Hubble time. Triple MBH systems are then bound to form after a further galaxy merger, triggering a complex and rich dynamics that can eventually lead to MBH coalescence. Here we report on the results of a large set of numerical simulations, where MBH triplets are set in spherical stellar potentials and MBH dynamics is followed through 2.5 post-Newtonian order in the equations of motion. From our full suite of simulated systems we find that a fraction ≃ 20 - 30 % of the MBH binaries that would otherwise stall are led to coalesce within a Hubble time. The corresponding coalescence timescale peaks around 300 Myr, while the eccentricity close to the plunge, albeit small, is non-negligible (≲ 0.1). We construct and discuss marginalised probability distributions of the main parameters involved and, in a companion paper of the series, we will use the results presented here to forecast the contribution of MBH triplets to the GW signal in the nHz regime probed by Pulsar Timing Array experiments.
Vogt, Martin; Bajorath, Jürgen
2008-01-01
Bayesian classifiers are increasingly being used to distinguish active from inactive compounds and search large databases for novel active molecules. We introduce an approach to directly combine the contributions of property descriptors and molecular fingerprints in the search for active compounds that is based on a Bayesian framework. Conventionally, property descriptors and fingerprints are used as alternative features for virtual screening methods. Following the approach introduced here, probability distributions of descriptor values and fingerprint bit settings are calculated for active and database molecules and the divergence between the resulting combined distributions is determined as a measure of biological activity. In test calculations on a large number of compound activity classes, this methodology was found to consistently perform better than similarity searching using fingerprints and multiple reference compounds or Bayesian screening calculations using probability distributions calculated only from property descriptors. These findings demonstrate that there is considerable synergy between different types of property descriptors and fingerprints in recognizing diverse structure-activity relationships, at least in the context of Bayesian modeling.
DEFF Research Database (Denmark)
Pham, Ninh Dang; Pagh, Rasmus
2012-01-01
Outlier mining in d-dimensional point sets is a fundamental and well studied data mining task due to its variety of applications. Most such applications arise in high-dimensional domains. A bottleneck of existing approaches is that implicit or explicit assessments on concepts of distance or nearest neighbor are deteriorated in high-dimensional data. Following up on the work of Kriegel et al. (KDD '08), we investigate the use of angle-based outlier factor in mining high-dimensional outliers. While their algorithm runs in cubic time (with a quadratic time heuristic), we propose a novel random projection-based technique that is able to estimate the angle-based outlier factor for all data points in time near-linear in the size of the data. Also, our approach is suitable to be performed in parallel environment to achieve a parallel speedup. We introduce a theoretical analysis of the quality......
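For orientation, an angle-based outlier score can be computed by brute force as sketched below. This assumes the simplified, unweighted variance-of-angles form and is the naive cubic-time computation, not the random-projection estimator proposed in the abstract; the function name and the synthetic data are illustrative only.

```python
import numpy as np

def angle_variance(points):
    """For each point, the variance of the angles it forms with all pairs of
    other points. Naive O(n^3) computation; a small variance suggests an outlier."""
    P = np.asarray(points, dtype=float)
    n = len(P)
    iu = np.triu_indices(n - 1, k=1)          # each unordered pair once
    scores = np.empty(n)
    for i in range(n):
        diff = np.delete(P, i, axis=0) - P[i]  # vectors from point i to the others
        diff /= np.linalg.norm(diff, axis=1, keepdims=True)
        cos = np.clip(diff @ diff.T, -1.0, 1.0)
        scores[i] = np.var(np.arccos(cos[iu]))
    return scores

rng = np.random.default_rng(1)
cluster = rng.standard_normal((60, 20))        # inliers
outlier = np.full((1, 20), 8.0)                # one point far from the cluster
scores = angle_variance(np.vstack([cluster, outlier]))
```

Seen from the distant point, all other points lie in nearly the same direction, so its angle variance is the smallest in the set.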
Robust and sparse correlation matrix estimation for the analysis of high-dimensional genomics data.
Serra, Angela; Coretto, Pietro; Fratello, Michele; Tagliaferri, Roberto; Stegle, Oliver
2018-02-15
Microarray technology can be used to study the expression of thousands of genes across a number of different experimental conditions, usually hundreds. The underlying principle is that genes sharing similar expression patterns across different samples can be part of the same co-expression system, or they may share the same biological functions. Groups of genes are usually identified based on cluster analysis. Clustering methods rely on the similarity matrix between genes. A common choice to measure similarity is to compute the sample correlation matrix. Dimensionality reduction is another popular data analysis task which is also based on covariance/correlation matrix estimates. Unfortunately, covariance/correlation matrix estimation suffers from the intrinsic noise present in high-dimensional data. Sources of noise are: sampling variations, the presence of outlying sample units, and the fact that in most cases the number of units is much smaller than the number of genes. In this paper, we propose a robust correlation matrix estimator that is regularized based on adaptive thresholding. The resulting method jointly tames the effects of high dimensionality and data contamination. Computations are easy to implement and do not require hand tuning. Both simulated and real data are analyzed. A Monte Carlo experiment shows that the proposed method is capable of remarkable performance. Our correlation metric is more robust to outliers than the existing alternatives in two gene expression datasets. It is also shown how the regularization allows spurious correlations to be detected and filtered automatically. The same regularization is also extended to other less robust correlation measures. Finally, we apply the ARACNE algorithm to the SyNTreN gene expression data. The sensitivity and specificity of the reconstructed network are compared with the gold standard. We show that ARACNE performs better when it takes the proposed correlation matrix estimator as input. The R
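The idea of regularizing a correlation matrix by thresholding can be shown with a deliberately simplified sketch. It uses the ordinary (non-robust) sample correlation and a fixed cut-off `tau`, both stand-ins chosen here; the estimator in the paper is robust and selects its threshold adaptively, which this example does not attempt.

```python
import numpy as np

def thresholded_correlation(X, tau):
    """Sample correlation of the columns of X, with entries below tau in
    absolute value set to zero (hard thresholding)."""
    R = np.corrcoef(X, rowvar=False)
    R_sparse = np.where(np.abs(R) >= tau, R, 0.0)
    np.fill_diagonal(R_sparse, 1.0)
    return R_sparse

rng = np.random.default_rng(0)
n_samples, n_genes = 30, 200                    # far fewer samples than genes
X = rng.standard_normal((n_samples, n_genes))
X[:, 1] = X[:, 0] + 0.1 * rng.standard_normal(n_samples)  # one truly co-expressed pair
R_sparse = thresholded_correlation(X, tau=0.8)
```

With only 30 samples, many of the unrelated gene pairs show moderate sample correlations by chance; thresholding discards those while keeping the strongly correlated pair.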
Directory of Open Access Journals (Sweden)
Sushko Iryna
2012-08-01
This work contributes to the classification of the dynamic behaviors of piecewise smooth systems in which border collision bifurcations characterize the qualitative changes in the dynamics. A central point of our investigation is the intersection of two border collision bifurcation curves in a parameter plane. This problem is also associated with the breaking of continuity at a fixed point of a piecewise smooth map. We relax the hypothesis needed in [4], where it was proved that, in the case of increasing/decreasing contracting functions on the left/right side of a border point, such a crossing point is a big-bang bifurcation from which infinitely many border collision bifurcation curves issue.
Directory of Open Access Journals (Sweden)
Mohammad Hossein Moazzeni
2016-07-01
Daylight can be considered as one of the most important principles of sustainable architecture. It is unfortunate that this is neglected by designers in Tehran, a city that benefits from a significant amount of daylight and many clear sunny days during the year. Using a daylight controller system improves the quality of natural light in a space and decreases building lighting consumption by 60%. It also affects building thermal behavior, because most such systems also act as shading. The light shelf is one of the passive systems for controlling daylight, mostly used with shading and installed in the upper half of the windows above eye level. The influence of light shelf parameters, such as its dimensions, shelf rotation angle and orientation, on daylight efficiency and visual comfort in educational spaces is investigated in this article. Daylight simulation software and annual analysis based on climate information during space occupation hours were used. The results show that light shelf dimensions, as well as different orientations, especially in the southern part, are influential in the distribution of natural light and visual comfort. At the southern orientation, increased light shelf dimensions result in an increase of the area of the work plane with suitable daylight levels by 2%–40% and a significant decrease in disturbing and intolerable glare hours.
Cosmological parameter estimation using particle swarm optimization
Prasad, Jayanti; Souradeep, Tarun
2012-06-01
Constraining theoretical models, which are represented by a set of parameters, using observational data is an important exercise in cosmology. In the Bayesian framework this is done by finding the probability distribution of parameters which best fits the observational data using sampling-based methods like Markov chain Monte Carlo (MCMC). It has been argued that MCMC may not be the best option in certain problems in which the target function (likelihood) possesses local maxima or has very high dimensionality. Apart from this, there may be examples in which we are mainly interested in finding the point in the parameter space at which the probability distribution has the largest value. In this situation the problem of parameter estimation becomes an optimization problem. In the present work we show that particle swarm optimization (PSO), which is an artificial-intelligence-inspired population-based search procedure, can also be used for cosmological parameter estimation. Using PSO we were able to recover the best-fit Λ cold dark matter (LCDM) model parameters from the WMAP seven-year data without using any prior guess value or any other property of the probability distribution of parameters like standard deviation, as is common in MCMC. We also report the results of an exercise in which we consider a binned primordial power spectrum (to increase the dimensionality of the problem) and find that a power spectrum with features gives a lower chi square than the standard power law. Since PSO does not sample the likelihood surface in a fair way, we follow a fitting procedure to find the spread of the likelihood function around the best-fit point.
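A generic global-best particle swarm optimizer is short enough to sketch in full. This is a textbook variant, not the authors' implementation: the inertia and acceleration coefficients, the swarm size, and the toy quadratic "chi-square" standing in for a cosmological likelihood are all assumptions made for illustration.

```python
import numpy as np

def pso_minimize(f, lower, upper, n_particles=30, n_iter=200,
                 w=0.729, c1=1.494, c2=1.494, seed=0):
    """Global-best PSO: each particle is pulled toward its own best position
    and toward the best position found by the whole swarm."""
    rng = np.random.default_rng(seed)
    lower, upper = np.asarray(lower, float), np.asarray(upper, float)
    x = rng.uniform(lower, upper, size=(n_particles, lower.size))
    v = np.zeros_like(x)
    pbest, pbest_val = x.copy(), np.array([f(p) for p in x])
    g = pbest[np.argmin(pbest_val)].copy()
    for _ in range(n_iter):
        r1, r2 = rng.random(x.shape), rng.random(x.shape)
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (g - x)
        x = np.clip(x + v, lower, upper)          # keep particles inside the box
        vals = np.array([f(p) for p in x])
        better = vals < pbest_val
        pbest[better], pbest_val[better] = x[better], vals[better]
        g = pbest[np.argmin(pbest_val)].copy()
    return g, float(pbest_val.min())

# Toy stand-in for a chi-square surface with a known best-fit point.
truth = np.array([0.3, -1.2, 2.0])
chi2 = lambda p: float(np.sum((p - truth) ** 2))
best, best_val = pso_minimize(chi2, lower=[-5, -5, -5], upper=[5, 5, 5])
```

Only the search box is supplied, with no starting guess or proposal width, which mirrors the practical advantage claimed in the abstract. The swarm returns a best-fit point rather than a posterior sample, so uncertainties have to be estimated separately.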
High-Dimensional Analysis of Convex Optimization-Based Massive MIMO Decoders
Ben Atitallah, Ismail
2017-04-01
A wide range of modern large-scale systems relies on recovering a signal from noisy linear measurements. In many applications, the useful signal has inherent properties, such as sparsity, low-rankness, or boundedness, and making use of these properties and structures allows a more efficient recovery. Hence, a significant amount of work has been dedicated to developing and analyzing algorithms that can take advantage of the signal structure. In particular, since the advent of Compressed Sensing (CS), there has been significant progress in this direction. Generally speaking, the signal structure can be harnessed by solving an appropriate regularized or constrained M-estimator. In modern Multi-input Multi-output (MIMO) communication systems, all transmitted signals are drawn from finite constellations and are thus bounded. Besides, most recent modulation schemes such as Generalized Space Shift Keying (GSSK) or Generalized Spatial Modulation (GSM) yield signals that are inherently sparse. In the recovery procedure, sparsity and boundedness can be promoted by using ℓ1 norm regularization and by imposing an ℓ∞ norm constraint, respectively. In this thesis, we propose novel optimization algorithms to recover certain classes of structured signals, with emphasis on MIMO communication systems. The exact analysis permits a clear characterization of how well these systems perform. It also allows automatic tuning of the parameters. In each context, we define the appropriate performance metrics and analyze them exactly in the High Dimensional Regime (HDR). The framework we use for the analysis is based on Gaussian process inequalities; in particular, on a new strong and tight version of a classical comparison inequality (due to Gordon, 1988) in the presence of additional convexity assumptions. The new framework that emerged from this inequality is called the Convex Gaussian Min-max Theorem (CGMT).
Diagonal Likelihood Ratio Test for Equality of Mean Vectors in High-Dimensional Data
Hu, Zongliang
2017-10-27
We propose a likelihood ratio test framework for testing normal mean vectors in high-dimensional data under two common scenarios: the one-sample test and the two-sample test with equal covariance matrices. We derive the test statistics under the assumption that the covariance matrices follow a diagonal matrix structure. In comparison with the diagonal Hotelling's tests, our proposed test statistics display some interesting characteristics. In particular, they are a summation of the log-transformed squared t-statistics rather than a direct summation of those components. More importantly, to derive the asymptotic normality of our test statistics under the null and local alternative hypotheses, we do not require the assumption that the covariance matrix follows a diagonal matrix structure. As a consequence, our proposed test methods are very flexible and can be widely applied in practice. Finally, simulation studies and a real data analysis are also conducted to demonstrate the advantages of our likelihood ratio test method.
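The "summation of log-transformed squared t-statistics" follows directly from the normal likelihood with a diagonal covariance: for coordinate j, the ratio of the variance estimate under the null to the unrestricted one equals 1 + t_j²/(n − 1). A minimal sketch of the one-sample statistic in that textbook form is given below; the centering and scaling used in the paper to obtain asymptotic normality are not reproduced, and the function name and test data are illustrative assumptions.

```python
import numpy as np

def diagonal_lrt_one_sample(X, mu0=None):
    """-2 log likelihood ratio for H0: mean = mu0 under a diagonal normal model,
    written as a sum of log-transformed squared t-statistics."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    mu0 = np.zeros(p) if mu0 is None else np.asarray(mu0, dtype=float)
    xbar = X.mean(axis=0)
    s = X.std(axis=0, ddof=1)                   # per-coordinate sample std
    t = np.sqrt(n) * (xbar - mu0) / s           # per-coordinate t-statistics
    return n * np.sum(np.log1p(t**2 / (n - 1)))

rng = np.random.default_rng(0)
X = rng.standard_normal((25, 500))              # n = 25 samples, p = 500 coordinates
stat = diagonal_lrt_one_sample(X)

# The same quantity computed directly from the two variance estimates.
direct = 25 * np.sum(np.log(np.mean(X**2, axis=0) / np.var(X, axis=0)))
```

The logarithm damps the contribution of any single very large t-statistic, which is the contrast with a direct sum of squared t-statistics noted in the abstract.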
International Nuclear Information System (INIS)
Snyder, Abigail C.; Jiao, Yu
2010-01-01
Neutron experiments at the Spallation Neutron Source (SNS) at Oak Ridge National Laboratory (ORNL) frequently generate large amounts of data (on the order of 10^6 to 10^12 data points). Hence, traditional data analysis tools run on a single CPU take too long to be practical and scientists are unable to efficiently analyze all data generated by experiments. Our goal is to develop a scalable algorithm to efficiently compute high-dimensional integrals of arbitrary functions. This algorithm can then be used to integrate the four-dimensional integrals that arise as part of modeling intensity from the experiments at the SNS. Here, three different one-dimensional numerical integration solvers from the GNU Scientific Library were modified and implemented to solve four-dimensional integrals. The results of these solvers on a final integrand provided by scientists at the SNS can be compared to the results of other methods, such as quasi-Monte Carlo methods, computing the same integral. A parallelized version of the most efficient method can allow scientists the opportunity to more effectively analyze all experimental data.
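Building a four-dimensional integrator out of a one-dimensional rule can be sketched with a tensor product of Gauss-Legendre nodes. This is an illustration of the general idea only: the abstract's solvers come from the GNU Scientific Library and are not reproduced here, and the fixed-order rule, the function name and the Gaussian test integrand are assumptions.

```python
import numpy as np

def integrate_4d(f, bounds, order=8):
    """Tensor-product Gauss-Legendre quadrature over a 4D box.
    bounds is a list of four (a, b) pairs; f must accept four arrays."""
    nodes, weights = np.polynomial.legendre.leggauss(order)
    xs, ws = [], []
    for a, b in bounds:                          # map [-1, 1] to [a, b] per axis
        xs.append(0.5 * (b - a) * nodes + 0.5 * (b + a))
        ws.append(0.5 * (b - a) * weights)
    X = np.meshgrid(*xs, indexing="ij")
    W = (ws[0][:, None, None, None] * ws[1][None, :, None, None]
         * ws[2][None, None, :, None] * ws[3][None, None, None, :])
    return float(np.sum(W * f(*X)))

f = lambda x, y, z, t: np.exp(-(x**2 + y**2 + z**2 + t**2))
approx = integrate_4d(f, [(0.0, 1.0)] * 4)
```

The cost grows as order^4 function evaluations, which is why the choice of one-dimensional rule and parallel evaluation matter for integrals of this kind.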
Directory of Open Access Journals (Sweden)
Enkelejda Miho
2018-02-01
Full Text Available The adaptive immune system recognizes antigens via an immense array of antigen-binding antibodies and T-cell receptors, the immune repertoire. The interrogation of immune repertoires is of high relevance for understanding the adaptive immune response in disease and infection (e.g., autoimmunity, cancer, HIV). Adaptive immune receptor repertoire sequencing (AIRR-seq) has driven the quantitative and molecular-level profiling of immune repertoires, thereby revealing the high-dimensional complexity of the immune receptor sequence landscape. Several methods for the computational and statistical analysis of large-scale AIRR-seq data have been developed to resolve immune repertoire complexity and to understand the dynamics of adaptive immunity. Here, we review the current research on (i) diversity, (ii) clustering and network, (iii) phylogenetic, and (iv) machine learning methods applied to dissect, quantify, and compare the architecture, evolution, and specificity of immune repertoires. We summarize outstanding questions in computational immunology and propose future directions for systems immunology toward coupling AIRR-seq with the computational discovery of immunotherapeutics, vaccines, and immunodiagnostics.
Construction of high-dimensional neural network potentials using environment-dependent atom pairs.
Jose, K V Jovan; Artrith, Nongnuch; Behler, Jörg
2012-05-21
An accurate determination of the potential energy is the crucial step in computer simulations of chemical processes, but using electronic structure methods on-the-fly in molecular dynamics (MD) is computationally too demanding for many systems. Constructing more efficient interatomic potentials becomes intricate with increasing dimensionality of the potential-energy surface (PES), and for numerous systems the accuracy that can be achieved is still not satisfying and far from the reliability of first-principles calculations. Feed-forward neural networks (NNs) have a very flexible functional form, and in recent years they have been shown to be an accurate tool to construct efficient PESs. High-dimensional NN potentials based on environment-dependent atomic energy contributions have been presented for a number of materials. Still, these potentials may be improved by a more detailed structural description, e.g., in form of atom pairs, which directly reflect the atomic interactions and take the chemical environment into account. We present an implementation of an NN method based on atom pairs, and its accuracy and performance are compared to the atom-based NN approach using two very different systems, the methanol molecule and metallic copper. We find that both types of NN potentials provide an excellent description of both PESs, with the pair-based method yielding a slightly higher accuracy making it a competitive alternative for addressing complex systems in MD simulations.
Xia, Yin; Cai, Tianxi; Cai, T Tony
2018-01-01
Motivated by applications in genomics, we consider in this paper global and multiple testing for the comparisons of two high-dimensional linear regression models. A procedure for testing the equality of the two regression vectors globally is proposed and shown to be particularly powerful against sparse alternatives. We then introduce a multiple testing procedure for identifying unequal coordinates while controlling the false discovery rate and false discovery proportion. Theoretical justifications are provided to guarantee the validity of the proposed tests and optimality results are established under sparsity assumptions on the regression coefficients. The proposed testing procedures are easy to implement. Numerical properties of the procedures are investigated through simulation and data analysis. The results show that the proposed tests maintain the desired error rates under the null and have good power under the alternative at moderate sample sizes. The procedures are applied to the Framingham Offspring study to investigate the interactions between smoking and cardiovascular related genetic mutations important for an inflammation marker.
A Comparison of Machine Learning Methods in a High-Dimensional Classification Problem
Directory of Open Access Journals (Sweden)
Zekić-Sušac Marijana
2014-09-01
Full Text Available Background: Large-dimensional data modelling often relies on variable reduction methods in the pre-processing and in the post-processing stage. However, such a reduction usually provides less information and yields a lower accuracy of the model. Objectives: The aim of this paper is to assess the high-dimensional classification problem of recognizing entrepreneurial intentions of students by machine learning methods. Methods/Approach: Four methods were tested: artificial neural networks, CART classification trees, support vector machines, and k-nearest neighbour on the same dataset in order to compare their efficiency in the sense of classification accuracy. The performance of each method was compared on ten subsamples in a 10-fold cross-validation procedure in order to compute the sensitivity and specificity of each model. Results: The artificial neural network model based on multilayer perceptron yielded a higher classification rate than the models produced by other methods. The pairwise t-test showed a statistically significant difference between the artificial neural network and the k-nearest neighbour model, while the difference among other methods was not statistically significant. Conclusions: Tested machine learning methods are able to learn fast and achieve high classification accuracy. However, further advancement can be assured by testing a few additional methodological refinements in machine learning methods.
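The evaluation protocol described above (10-fold cross-validation with sensitivity and specificity per model) can be sketched independently of any particular classifier. The helpers below are hypothetical names, the fold assignment is a simple interleaving chosen for illustration, and labels are assumed to be coded 1 (positive) and 0 (negative) with both classes present in each evaluated set.

```python
def k_fold_indices(n_samples, k=10):
    """Yield (train_indices, test_indices) for k interleaved folds."""
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    for i, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx

def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels coded 1 and 0."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)

# Each sample is used for testing exactly once across the ten folds.
splits = list(k_fold_indices(25, k=10))
```

In use, each compared model would be fitted on the training indices of a fold and scored on its test indices, giving ten sensitivity and specificity values per model that can then be compared pairwise.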
Schran, Christoph; Uhl, Felix; Behler, Jörg; Marx, Dominik
2018-03-01
The design of accurate helium-solute interaction potentials for the simulation of chemically complex molecules solvated in superfluid helium has long been a cumbersome task due to the rather weak but strongly anisotropic nature of the interactions. We show that this challenge can be met by using a combination of an effective pair potential for the He-He interactions and a flexible high-dimensional neural network potential (NNP) for describing the complex interaction between helium and the solute in a pairwise additive manner. This approach yields an excellent agreement with a mean absolute deviation as small as 0.04 kJ/mol for the interaction energy between helium and both hydronium and Zundel cations compared with coupled cluster reference calculations with an energetically converged basis set. The construction and improvement of the potential can be performed in a highly automated way, which opens the door for applications to a variety of reactive molecules to study the effect of solvation on the solute as well as the solute-induced structuring of the solvent. Furthermore, we show that this NNP approach yields very convincing agreement with the coupled cluster reference for properties like many-body spatial and radial distribution functions. This holds for the microsolvation of the protonated water monomer and dimer by a few helium atoms up to their solvation in bulk helium as obtained from path integral simulations at about 1 K.
Multi-Scale Factor Analysis of High-Dimensional Brain Signals
Ting, Chee-Ming
2017-05-18
In this paper, we develop an approach to modeling high-dimensional networks with a large number of nodes arranged in a hierarchical and modular structure. We propose a novel multi-scale factor analysis (MSFA) model which partitions the massive spatio-temporal data defined over the complex networks into a finite set of regional clusters. To achieve further dimension reduction, we represent the signals in each cluster by a small number of latent factors. The correlation matrix for all nodes in the network is approximated by lower-dimensional sub-structures derived from the cluster-specific factors. To estimate regional connectivity between numerous nodes (within each cluster), we apply principal components analysis (PCA) to produce factors which are derived as the optimal reconstruction of the observed signals under the squared loss. Then, we estimate global connectivity (between clusters or sub-networks) based on the factors across regions using the RV-coefficient as the cross-dependence measure. This gives a reliable and computationally efficient multi-scale analysis of both regional and global dependencies of the large networks. The proposed novel approach is applied to estimate brain connectivity networks using functional magnetic resonance imaging (fMRI) data. Results on resting-state fMRI reveal interesting modular and hierarchical organization of human brain networks during rest.
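The RV-coefficient used above as a cross-dependence measure between two sets of factors can be written as RV(X, Y) = ||Xc'Yc||_F^2 / (||Xc'Xc||_F ||Yc'Yc||_F), where Xc and Yc are the column-centered data matrices. The sketch below is a minimal standard-library illustration of that formula; the function names and the toy matrices standing in for regional factor time series are hypothetical.

```python
import math

def center(matrix):
    """Subtract the column means from a row-major matrix."""
    n = len(matrix)
    means = [sum(col) / n for col in zip(*matrix)]
    return [[v - m for v, m in zip(row, means)] for row in matrix]

def cross_product(a, b):
    """Return A^T B for row-major matrices with the same number of rows."""
    return [[sum(x * y for x, y in zip(ca, cb)) for cb in zip(*b)] for ca in zip(*a)]

def frobenius_sq(matrix):
    """Squared Frobenius norm."""
    return sum(v * v for row in matrix for v in row)

def rv_coefficient(x, y):
    """RV-coefficient between two multivariate signals observed at the same times."""
    xc, yc = center(x), center(y)
    numerator = frobenius_sq(cross_product(xc, yc))
    denominator = math.sqrt(frobenius_sq(cross_product(xc, xc)) *
                            frobenius_sq(cross_product(yc, yc)))
    return numerator / denominator

# Toy factor series for two regions (rows are time points, columns are factors).
region_a = [[1.0, 2.0], [2.0, 1.0], [3.0, 5.0], [4.0, 3.0]]
region_b = [[2.0 * v for v in row] for row in region_a]
rv = rv_coefficient(region_a, region_b)
```

The coefficient lies between 0 and 1 and is unchanged by rescaling either set of factors; for two single-column inputs it reduces to the squared Pearson correlation.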
Meng, Xi; Nguyen, Bao D; Ridge, Clark; Shaka, A J
2009-01-01
High-dimensional (HD) NMR spectra have poorer digital resolution than low-dimensional (LD) spectra, for a fixed amount of experiment time. This has led to "reduced-dimensionality" strategies, in which several LD projections of the HD NMR spectrum are acquired, each with higher digital resolution; an approximate HD spectrum is then inferred by some means. We propose a strategy that moves in the opposite direction, by adding more time dimensions to increase the information content of the data set, even if only a very sparse time grid is used in each dimension. The full HD time-domain data can be analyzed by the filter diagonalization method (FDM), yielding very narrow resonances along all of the frequency axes, even those with sparse sampling. Integrating over the added dimensions of HD FDM NMR spectra reconstitutes LD spectra with enhanced resolution, often more quickly than direct acquisition of the LD spectrum with a larger number of grid points in each of the fewer dimensions. If the extra-dimensions do not appear in the final spectrum, and are used solely to boost information content, we propose the moniker hidden-dimension NMR. This work shows that HD peaks have unmistakable frequency signatures that can be detected as single HD objects by an appropriate algorithm, even though their patterns would be tricky for a human operator to visualize or recognize, and even if digital resolution in an HD FT spectrum is very coarse compared with natural line widths.
Feature Augmentation via Nonparametrics and Selection (FANS) in High-Dimensional Classification.
Fan, Jianqing; Feng, Yang; Jiang, Jiancheng; Tong, Xin
We propose a high dimensional classification method that involves nonparametric feature augmentation. Knowing that marginal density ratios are the most powerful univariate classifiers, we use the ratio estimates to transform the original feature measurements. Subsequently, penalized logistic regression is invoked, taking as input the newly transformed or augmented features. This procedure trains models equipped with local complexity and global simplicity, thereby avoiding the curse of dimensionality while creating a flexible nonlinear decision boundary. The resulting method is called Feature Augmentation via Nonparametrics and Selection (FANS). We motivate FANS by generalizing the Naive Bayes model, writing the log ratio of joint densities as a linear combination of those of marginal densities. It is related to generalized additive models, but has better interpretability and computability. Risk bounds are developed for FANS. In numerical analysis, FANS is compared with competing methods, so as to provide a guideline on its best application domain. Real data analysis demonstrates that FANS performs very competitively on benchmark email spam and gene expression data sets. Moreover, FANS is implemented by an extremely fast algorithm through parallel computing.
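The feature transformation step described above, replacing each measurement by an estimated log marginal density ratio, can be sketched with a simple Gaussian kernel density estimate. This is only an illustration of the transformation: the bandwidth, the density floor, and all function names are assumptions, and the subsequent penalized logistic regression on the transformed features is not shown.

```python
import math

def gaussian_kde(sample, bandwidth):
    """Return a 1D Gaussian kernel density estimate built from sample."""
    norm = len(sample) * bandwidth * math.sqrt(2.0 * math.pi)
    def density(x):
        return sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in sample) / norm
    return density

def fit_log_ratio_transform(rows, labels, bandwidth=0.5, floor=1e-300):
    """Fit per-feature class densities and return a row transformer.

    Labels are assumed to be coded 1 and 0. The floor (an assumption of this
    sketch) keeps the logarithm finite where a density estimate underflows.
    """
    estimators = []
    for col in zip(*rows):
        pos = [v for v, y in zip(col, labels) if y == 1]
        neg = [v for v, y in zip(col, labels) if y == 0]
        estimators.append((gaussian_kde(pos, bandwidth), gaussian_kde(neg, bandwidth)))
    def transform(row):
        # Each feature becomes log f1(x) - log f0(x), its marginal log density ratio.
        return [math.log(max(f1(v), floor)) - math.log(max(f0(v), floor))
                for v, (f1, f0) in zip(row, estimators)]
    return transform

# Toy single-feature data: class 1 sits near +2.5, class 0 near -2.5.
rows = [[2.0], [2.5], [3.0], [-3.0], [-2.5], [-2.0]]
labels = [1, 1, 1, 0, 0, 0]
transform = fit_log_ratio_transform(rows, labels)
augmented = transform([2.5])
```

A positive transformed value indicates that the measurement is more typical of class 1 than of class 0 for that feature, so a linear classifier on the transformed features can still produce a nonlinear boundary in the original measurements.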
Simulation-based hypothesis testing of high dimensional means under covariance heterogeneity.
Chang, Jinyuan; Zheng, Chao; Zhou, Wen-Xin; Zhou, Wen
2017-12-01
In this article, we study the problem of testing the mean vectors of high dimensional data in both one-sample and two-sample cases. The proposed testing procedures employ maximum-type statistics and the parametric bootstrap techniques to compute the critical values. Different from the existing tests that heavily rely on the structural conditions on the unknown covariance matrices, the proposed tests allow general covariance structures of the data and therefore enjoy a wide scope of applicability in practice. To enhance the power of the tests against sparse alternatives, we further propose two-step procedures with a preliminary feature screening step. Theoretical properties of the proposed tests are investigated. Through extensive numerical experiments on synthetic data sets and a human acute lymphoblastic leukemia gene expression data set, we illustrate the performance of the new tests and how they may provide assistance on detecting disease-associated gene-sets. The proposed methods have been implemented in an R-package HDtest and are available on CRAN. © 2017, The International Biometric Society.
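A one-sample version of a maximum-type test with simulated critical values can be sketched as follows. This is a simplified illustration, not the HDtest implementation: it uses a non-studentized statistic, tests a zero mean vector, omits the screening step, and draws from a normal distribution with the estimated covariance by weighting the centered observations with independent standard normal multipliers. All function names are hypothetical.

```python
import math
import random

def max_statistic(data):
    """Maximum over coordinates of sqrt(n) * |sample mean| (non-studentized)."""
    n = len(data)
    return max(abs(sum(col)) / math.sqrt(n) for col in zip(*data))

def bootstrap_pvalue(data, n_boot=500, seed=0):
    """Approximate p-value for H0: mean vector = 0 using simulated maxima."""
    rng = random.Random(seed)
    n = len(data)
    means = [sum(col) / n for col in zip(*data)]
    centered = [[v - m for v, m in zip(row, means)] for row in data]
    observed = max_statistic(data)
    exceed = 0
    for _ in range(n_boot):
        # Conditional on the data, this weighted sum is a draw from a centered
        # normal distribution whose covariance is the sample covariance
        # (with divisor n), so the covariance matrix is never formed.
        weights = [rng.gauss(0.0, 1.0) for _ in range(n)]
        draw = max(abs(sum(w * row[j] for w, row in zip(weights, centered))) / math.sqrt(n)
                   for j in range(len(means)))
        if draw >= observed:
            exceed += 1
    return (exceed + 1) / (n_boot + 1)

# Hypothetical data: 20 observations of 5 coordinates with a clear mean shift.
rng = random.Random(1)
shifted = [[10.0 + rng.gauss(0.0, 1.0) for _ in range(5)] for _ in range(20)]
p_value = bootstrap_pvalue(shifted, n_boot=200)
```

Because the simulated maxima inherit the dependence between coordinates from the data, no structural condition on the covariance matrix is imposed in this sketch.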
Multi-SOM: an Algorithm for High-Dimensional, Small Size Datasets
Directory of Open Access Journals (Sweden)
Shen Lu
2013-04-01
Full Text Available Since it takes time to do experiments in bioinformatics, biological datasets are sometimes small but with high dimensionality. From probability theory, in order to discover knowledge from a set of data, we have to have a sufficient number of samples. Otherwise, the error bounds can become too large to be useful. For the SOM (Self-Organizing Map) algorithm, the initial map is based on the training data. In order to avoid the bias caused by insufficient training data, in this paper we present an algorithm called Multi-SOM. Multi-SOM builds a number of small self-organizing maps, instead of just one big map. Bayesian decision theory is used to make the final decision among similar neurons on different maps. In this way, we can better ensure that we get a truly random initial weight vector set, the map size is less of a consideration, and errors tend to average out. In our experiments on microarray datasets, which are highly intense data composed of genetic related information, the precision of Multi-SOMs is 10.58% greater than that of SOMs, and its recall is 11.07% greater than that of SOMs. Thus, the Multi-SOMs algorithm is practical.
Directory of Open Access Journals (Sweden)
Andre Lamert
2018-03-01
Full Text Available We present and compare two flexible and effective methodologies to predict disturbance zones ahead of underground tunnels by using elastic full-waveform inversion. One methodology uses a linearized, iterative approach based on misfit gradients computed with the adjoint method while the other uses iterative, gradient-free unscented Kalman filtering in conjunction with a level-set representation. Whereas the former does not involve a priori assumptions on the distribution of elastic properties ahead of the tunnel, the latter introduces a massive reduction in the number of explicit model parameters to be inverted for by focusing on the geometric form of potential disturbances and their average elastic properties. Both imaging methodologies are validated through successful reconstructions of simple disturbances. As an application, we consider an elastic multiple disturbance scenario. By using identical synthetic time-domain seismograms as test data, we obtain satisfactory, albeit different, reconstruction results from the two inversion methodologies. The computational costs of both approaches are of the same order of magnitude, with the gradient-based approach showing a slight advantage. The model parameter space reduction approach compensates for this by additionally providing a posteriori estimates of model parameter uncertainty. Keywords: Tunnel seismics, Full waveform inversion, Seismic waves, Level-set method, Adjoint method, Kalman filter
Mao, Baoguang; Guo, Donglei; Qin, Jinwen; Meng, Tao; Wang, Xin; Cao, Minhua
2018-01-08
Despite significant advancement in preparing various hollow structures by Ostwald ripening, one common problem is the difficulty of controlling the initiation of Ostwald ripening due to the complexity of the reaction processes. Here, a new strategy of Hansen solubility parameter (HSP)-guided solvent selection to initiate Ostwald ripening is proposed. Based on this comprehensive principle for solvent optimization, N,N-dimethylformamide (DMF) was screened out, achieving accurate synthesis of interior space-tunable MoSe2 spherical structures (solid, core-shell, yolk-shell and hollow spheres). The resultant MoSe2 structures exhibit architecture-dependent electrochemical performances towards hydrogen evolution reaction and sodium-ion batteries. This pre-solvent selection strategy can help researchers synthesize various hollow structures efficiently. This work paves a new pathway for deeply understanding Ostwald ripening. © 2018 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.
Passas, Georgios; Freear, Steven; Fawcett, Darren
2010-08-01
Orthogonal frequency division multiplexing (OFDM)-based feed-forward space-time trellis code (FFSTTC) encoders can be synthesised as very high speed integrated circuit hardware description language (VHDL) designs. Evaluation of their FPGA implementation can lead to conclusions that help a designer to decide the optimum implementation, given the encoder structural parameters. VLSI architectures based on 1-bit multipliers and look-up tables (LUTs) are compared in terms of FPGA slices and block RAMs (area), as well as in terms of minimum clock period (speed). Area and speed graphs versus encoder memory order are provided for quadrature phase shift keying (QPSK) and 8 phase shift keying (8-PSK) modulation and two transmit antennas, revealing best implementation under these conditions. The effect of number of modulation bits and transmit antennas on the encoder implementation complexity is also investigated.
International Nuclear Information System (INIS)
Langrene, Nicolas
2014-01-01
This thesis deals with the numerical solution of general stochastic control problems, with notable applications for electricity markets. We first propose a structural model for the price of electricity, allowing for price spikes well above the marginal fuel price under strained market conditions. This model allows one to price and partially hedge electricity derivatives, using fuel forwards as hedging instruments. Then, we propose an algorithm, which combines Monte-Carlo simulations with local basis regressions, to solve general optimal switching problems. A comprehensive rate of convergence of the method is provided. Moreover, we manage to make the algorithm parsimonious in memory (and hence suitable for high dimensional problems) by generalizing to this framework a memory reduction method that avoids the storage of the sample paths. We illustrate this on the problem of investments in new power plants (our structural power price model allowing the new plants to impact the price of electricity). Finally, we study more general stochastic control problems (the control can be continuous and impact the drift and volatility of the state process), the solutions of which belong to the class of fully nonlinear Hamilton-Jacobi-Bellman equations, and can be handled via constrained Backward Stochastic Differential Equations, for which we develop a backward algorithm based on control randomization and parametric optimizations. A rate of convergence between the constrained BSDE and its discrete version is provided, as well as an estimate of the optimal control. This algorithm is then applied to the problem of super replication of options under uncertain volatilities (and correlations). (author)
Evaluation of a new high-dimensional miRNA profiling platform
Directory of Open Access Journals (Sweden)
Lamblin Anne-Francoise
2009-08-01
Full Text Available Abstract Background: MicroRNAs (miRNAs) are a class of approximately 22 nucleotide long, widely expressed RNA molecules that play important regulatory roles in eukaryotes. To investigate miRNA function, it is essential that methods to quantify their expression levels be available. Methods: We evaluated a new miRNA profiling platform that utilizes Illumina's existing robust DASL chemistry as the basis for the assay. Using total RNA from five colon cancer patients and four cell lines, we evaluated the reproducibility of miRNA expression levels across replicates and with varying amounts of input RNA. The beta test version comprised 735 miRNA targets of Illumina's miRNA profiling application. Results: Reproducibility between sample replicates within a plate was good (Spearman's correlation 0.91 to 0.98), as was the plate-to-plate reproducibility of replicates run on different days (Spearman's correlation 0.84 to 0.98). To determine whether quality data could be obtained from a broad range of input RNA, data obtained from amounts ranging from 25 ng to 800 ng were compared to those obtained at 200 ng. No effect across the range of RNA input was observed. Conclusion: These results indicate that very small amounts of starting material are sufficient to allow sensitive miRNA profiling using the Illumina miRNA high-dimensional platform. Nonlinear biases were observed between replicates, indicating the need for abundance-dependent normalization. Overall, the performance characteristics of the Illumina miRNA profiling system were excellent.
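Replicate reproducibility of the kind reported above is commonly summarized with Spearman's rank correlation, which is the Pearson correlation of the ranks. The sketch below shows one way to compute it with average ranks for ties; the function names and the expression values are hypothetical and are not data from the study.

```python
import math

def average_ranks(values):
    """Rank values from 1, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def spearman(x, y):
    """Spearman's rank correlation between two replicate profiles."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical expression levels of the same targets in two replicates.
replicate_1 = [120.0, 15.0, 980.0, 43.0, 310.0]
replicate_2 = [135.0, 11.0, 1020.0, 60.0, 290.0]
rho = spearman(replicate_1, replicate_2)
```

Because only ranks enter the calculation, the measure is insensitive to monotone nonlinear biases between replicates, which is why a high Spearman correlation can coexist with the need for abundance-dependent normalization.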
Multivariate linear regression of high-dimensional fMRI data with multiple target variables.
Valente, Giancarlo; Castellanos, Agustin Lage; Vanacore, Gianluca; Formisano, Elia
2014-05-01
Multivariate regression is increasingly used to study the relation between fMRI spatial activation patterns and experimental stimuli or behavioral ratings. With linear models, informative brain locations are identified by mapping the model coefficients. This is a central aspect in neuroimaging, as it provides the sought-after link between the activity of neuronal populations and subject's perception, cognition or behavior. Here, we show that mapping of informative brain locations using multivariate linear regression (MLR) may lead to incorrect conclusions and interpretations. MLR algorithms for high dimensional data are designed to deal with targets (stimuli or behavioral ratings, in fMRI) separately, and the predictive map of a model integrates information deriving from both neural activity patterns and experimental design. Not accounting explicitly for the presence of other targets whose associated activity spatially overlaps with the one of interest may lead to predictive maps of troublesome interpretation. We propose a new model that can correctly identify the spatial patterns associated with a target while achieving good generalization. For each target, the training is based on an augmented dataset, which includes all remaining targets. The estimation on such datasets produces both maps and interaction coefficients, which are then used to generalize. The proposed formulation is independent of the regression algorithm employed. We validate this model on simulated fMRI data and on a publicly available dataset. Results indicate that our method achieves high spatial sensitivity and good generalization and that it helps disentangle specific neural effects from interaction with predictive maps associated with other targets. Copyright © 2013 Wiley Periodicals, Inc.
Directory of Open Access Journals (Sweden)
Datta Susmita
2010-08-01
Full Text Available Abstract Background: Generally speaking, different classifiers tend to work well for certain types of data and conversely, it is usually not known a priori which algorithm will be optimal in any given classification application. In addition, for most classification problems, selecting the best performing classification algorithm amongst a number of competing algorithms is a difficult task for various reasons. For example, the order of performance may depend on the performance measure employed for such a comparison. In this work, we present a novel adaptive ensemble classifier constructed by combining bagging and rank aggregation that is capable of adaptively changing its performance depending on the type of data that is being classified. The attractive feature of the proposed classifier is its multi-objective nature where the classification results can be simultaneously optimized with respect to several performance measures, for example, accuracy, sensitivity and specificity. We also show that our somewhat complex strategy has better predictive performance as judged on test samples than a more naive approach that attempts to directly identify the optimal classifier based on the training data performances of the individual classifiers. Results: We illustrate the proposed method with two simulated and two real-data examples. In all cases, the ensemble classifier performs at the level of the best individual classifier comprising the ensemble or better. Conclusions: For complex high-dimensional datasets resulting from present day high-throughput experiments, it may be wise to consider a number of classification algorithms combined with dimension reduction techniques rather than a fixed standard algorithm set a priori.
Landfors, Mattias; Philip, Philge; Rydén, Patrik; Stenberg, Per
2011-01-01
Genome-wide analysis of gene expression or protein binding patterns using different array or sequencing based technologies is now routinely performed to compare different populations, such as treatment and reference groups. It is often necessary to normalize the data obtained to remove technical variation introduced in the course of conducting experimental work, but standard normalization techniques are not capable of eliminating technical bias in cases where the distribution of the truly altered variables is skewed, i.e. when a large fraction of the variables are either positively or negatively affected by the treatment. However, several experiments are likely to generate such skewed distributions, including ChIP-chip experiments for the study of chromatin, gene expression experiments for the study of apoptosis, and SNP-studies of copy number variation in normal and tumour tissues. A preliminary study using spike-in array data established that the capacity of an experiment to identify altered variables and generate unbiased estimates of the fold change decreases as the fraction of altered variables and the skewness increases. We propose the following work-flow for analyzing high-dimensional experiments with regions of altered variables: (1) Pre-process raw data using one of the standard normalization techniques. (2) Investigate if the distribution of the altered variables is skewed. (3) If the distribution is not believed to be skewed, no additional normalization is needed. Otherwise, re-normalize the data using a novel HMM-assisted normalization procedure. (4) Perform downstream analysis. Here, ChIP-chip data and simulated data were used to evaluate the performance of the work-flow. It was found that skewed distributions can be detected by using the novel DSE-test (Detection of Skewed Experiments). Furthermore, applying the HMM-assisted normalization to experiments where the distribution of the truly altered variables is skewed results in considerably higher
From Ambiguities to Insights: Query-based Comparisons of High-Dimensional Data
Kowalski, Jeanne; Talbot, Conover; Tsai, Hua L.; Prasad, Nijaguna; Umbricht, Christopher; Zeiger, Martha A.
2007-11-01
Genomic technologies will revolutionize drug discovery and development; that much is universally agreed upon. The high dimension of data from such technologies has challenged available data analytic methods; that much is apparent. To date, large-scale data repositories have not been utilized in ways that permit their wealth of information to be efficiently processed for knowledge, presumably due in large part to inadequate analytical tools to address numerous comparisons of high-dimensional data. In candidate gene discovery, expression comparisons are often made between two features (e.g., cancerous versus normal), such that the enumeration of outcomes is manageable. With multiple features, the setting becomes more complex, in terms of comparing expression levels of tens of thousands of transcripts across hundreds of features. In this case, the number of outcomes, while enumerable, rapidly becomes large and unmanageable, and scientific inquiries become more abstract, such as "which one of these (compounds, stimuli, etc.) is not like the others?" We develop analytical tools that promote more extensive, efficient, and rigorous utilization of the public data resources generated by the massive support of genomic studies. Our work innovates by enabling access to such metadata with logically formulated scientific inquiries that define, compare and integrate query-comparison pair relations for analysis. We demonstrate our computational tool's potential to address an outstanding biomedical informatics issue of identifying reliable molecular markers in thyroid cancer. Our proposed query-based comparison (QBC) facilitates access to and efficient utilization of metadata through logically formed inquiries expressed as query-based comparisons by organizing and comparing results from biotechnologies to address applications in biomedicine.
Dahm, T.; Heimann, S.; Isken, M.; Vasyura-Bathke, H.; Kühn, D.; Sudhaus, H.; Kriegerowski, M.; Daout, S.; Steinberg, A.; Cesca, S.
2017-12-01
Seismic source and moment tensor waveform inversion is often ill-posed or non-unique if station coverage is poor or signals are weak. Therefore, the interpretation of moment tensors can become difficult if the full model space, including all its trade-offs and uncertainties, is not explored. This is especially true for non-double couple components of weak or shallow earthquakes, as for instance found in volcanic, geothermal or mining environments. We developed a bootstrap-based probabilistic optimization scheme (Grond), which is based on pre-calculated Green's function full waveform databases (e.g. fomosto tool, doi.org/10.5880/GFZ.2.1.2017.001). Grond is able to efficiently explore the full model space, the trade-offs and the uncertainties of source parameters. The program is highly flexible with respect to the adaptation to specific problems, the design of objective functions, and the diversity of empirical datasets. It uses an integrated, robust waveform data processing based on a newly developed Python toolbox for seismology (Pyrocko, see Heimann et al., 2017, http://doi.org/10.5880/GFZ.2.1.2017.001), and allows for visual inspection of many aspects of the optimization problem. Grond has been applied to the CMT moment tensor inversion using W-phases, to nuclear explosions in Korea, to meteorite atmospheric explosions, to volcano-tectonic events during caldera collapse and to intra-plate volcanic and tectonic crustal events. Grond can be used to simultaneously optimize seismological waveforms, amplitude spectra and static displacements of geodetic data such as InSAR and GPS (e.g. KITE, Isken et al., 2017, http://doi.org/10.5880/GFZ.2.1.2017.002). We present examples of Grond optimizations to demonstrate the advantage of a full exploration of source parameter uncertainties for interpretation.
Filaments of Meaning in Word Space
Karlgren, Jussi; Holst, Anders; Sahlgren, Magnus
2008-01-01
Word space models, in the sense of vector space models built on distributional data taken from texts, are used to model semantic relations between words. We argue that the high dimensionality of typical vector space models leads to unintuitive effects on modeling likeness of meaning and that the local structure of word spaces is where interesting semantic relations reside. We show that the local structure of word spaces has substantially different dimensionality and character than the global s...
Cowley, Benjamin R.; Kaufman, Matthew T.; Churchland, Mark M.; Ryu, Stephen I.; Shenoy, Krishna V.; Yu, Byron M.
2012-01-01
The activity of tens to hundreds of neurons can be succinctly summarized by a smaller number of latent variables extracted using dimensionality reduction methods. These latent variables define a reduced-dimensional space in which we can study how population activity varies over time, across trials, and across experimental conditions. Ideally, we would like to visualize the population activity directly in the reduced-dimensional space, whose optimal dimensionality (as determined from the data)...
Integrating high dimensional bi-directional parsing models for gene mention tagging.
Hsu, Chun-Nan; Chang, Yu-Ming; Kuo, Cheng-Ju; Lin, Yu-Shi; Huang, Han-Shen; Chung, I-Fang
2008-07-01
Tagging gene and gene product mentions in scientific text is an important initial step of literature mining. In this article, we describe in detail our gene mention tagger, which participated in the BioCreative 2 challenge, and analyze what contributes to its good performance. Our tagger is based on the conditional random fields model (CRF), the most prevalent method for the gene mention tagging task in BioCreative 2. Our tagger is interesting because it achieved the highest F-score among CRF-based methods and the second highest overall. Moreover, we obtained our results mostly by applying open source packages, making it easy to duplicate our results. We first describe in detail how we developed our CRF-based tagger. We designed a very high dimensional feature set that includes most of the information that may be relevant. We trained bi-directional CRF models with the same set of features, one applying forward parsing and the other backward parsing, and integrated the two models based on the output scores and dictionary filtering. One of the most prominent factors contributing to the good performance of our tagger is the integration of an additional backward parsing model. However, from the definition of CRF, it appears that a CRF model is symmetric and that bi-directional parsing models will produce the same results. We show that, due to different feature settings, a CRF model can be asymmetric, and that the feature setting for our tagger in BioCreative 2 not only produces different results but also gives backward parsing models a slight but constant advantage over the forward parsing model. To fully explore the potential of integrating bi-directional parsing models, we applied different asymmetric feature settings to generate many bi-directional parsing models and integrated them based on the output scores. Experimental results show that this integrated model can achieve an even higher F-score solely based on the training corpus for gene mention tagging. Data sets, programs and an on-line service of our gene
Greedy algorithms for high-dimensional non-symmetric linear problems
Directory of Open Access Journals (Sweden)
Cancès E.
2013-12-01
In this article, we present a family of numerical approaches to solve high-dimensional linear non-symmetric problems. The principle of these methods is to approximate a function which depends on a large number of variates by a sum of tensor product functions, each term of which is iteratively computed via a greedy algorithm. There exists a good theoretical framework for these methods in the case of (linear and nonlinear) symmetric elliptic problems. However, the convergence results are no longer valid as soon as the problems under consideration are not symmetric. We present here a review of the main algorithms proposed in the literature to circumvent this difficulty, together with some new approaches. The theoretical convergence results and the practical implementation of these algorithms are discussed. Their behaviors are illustrated through some numerical examples.
Optimal design criteria - prediction vs. parameter estimation
Waldl, Helmut
2014-05-01
G-optimality is a popular design criterion for optimal prediction; it tries to minimize the kriging variance over the whole design region. A G-optimal design minimizes the maximum variance of all predicted values. If we use kriging methods for prediction, it is natural to use the kriging variance as a measure of uncertainty for the estimates. However, computing the kriging variance, and even more so the empirical kriging variance, is computationally very costly, and finding the maximum kriging variance in high-dimensional regions can be so time-demanding that in practice the G-optimal design cannot really be found with currently available computer equipment. We cannot always avoid this problem by using space-filling designs, because small designs that minimize the empirical kriging variance are often non-space-filling. D-optimality is the design criterion related to parameter estimation. A D-optimal design maximizes the determinant of the information matrix of the estimates. D-optimality in terms of trend parameter estimation and D-optimality in terms of covariance parameter estimation yield fundamentally different designs. The Pareto frontier of these two competing determinant criteria corresponds to designs that perform well under both criteria. Under certain conditions, searching for the G-optimal design on this Pareto frontier yields almost as good results as searching for the G-optimal design in the whole design region. In doing so, the maximum of the empirical kriging variance has to be computed only a few times. The method is demonstrated by means of a computer simulation experiment based on data provided by the Belgian institute Management Unit of the North Sea Mathematical Models (MUMM) that describe the evolution of inorganic and organic carbon and nutrients, phytoplankton, bacteria and zooplankton in the Southern Bight of the North Sea.
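The two criteria named in this abstract can be illustrated with a minimal sketch. It assumes an ordinary straight-line regression model rather than the kriging setting of the paper, so the information matrix is simply X^T X and the prediction variance is taken in units of the noise variance. The function names and the two candidate designs are hypothetical and chosen only for illustration.

```python
import numpy as np

def model_matrix(points):
    """Rows f(x) = (1, x) for a straight-line model (an assumption, not kriging)."""
    points = np.asarray(points, dtype=float)
    return np.column_stack([np.ones_like(points), points])

def d_criterion(points):
    """D-criterion: determinant of the information matrix X^T X (larger is better)."""
    X = model_matrix(points)
    return float(np.linalg.det(X.T @ X))

def g_criterion(points, region):
    """G-criterion: maximum scaled prediction variance over the region (smaller is better)."""
    X = model_matrix(points)
    M_inv = np.linalg.inv(X.T @ X)
    F = model_matrix(region)
    variances = np.einsum("ij,jk,ik->i", F, M_inv, F)  # f(x)^T M^{-1} f(x) per grid point
    return float(variances.max())

region = np.linspace(-1.0, 1.0, 201)           # design region [-1, 1] on a grid
endpoints = [-1.0, -1.0, 1.0, 1.0]             # four runs at the two extremes
equispaced = [-1.0, -1.0 / 3, 1.0 / 3, 1.0]    # four equally spaced runs

d_end, d_eq = d_criterion(endpoints), d_criterion(equispaced)
g_end, g_eq = g_criterion(endpoints, region), g_criterion(equispaced, region)
```

For this simple model the endpoint design is better under both criteria; in the kriging setting discussed in the abstract the two criteria can disagree, which is what motivates the search along the Pareto frontier.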
Examining a Thermodynamic Order Parameter of Protein Folding.
Chong, Song-Ho; Ham, Sihyun
2018-05-08
Dimensionality reduction with a suitable choice of order parameters or reaction coordinates is commonly used for analyzing high-dimensional time-series data generated by atomistic biomolecular simulations. So far, geometric order parameters, such as the root mean square deviation, fraction of native amino acid contacts, and collective coordinates that best characterize rare or large conformational transitions, have been prevailing in protein folding studies. Here, we show that the solvent-averaged effective energy, which is a thermodynamic quantity but unambiguously defined for individual protein conformations, serves as a good order parameter of protein folding. This is illustrated through the application to the folding-unfolding simulation trajectory of villin headpiece subdomain. We rationalize the suitability of the effective energy as an order parameter by the funneledness of the underlying protein free energy landscape. We also demonstrate that an improved conformational space discretization is achieved by incorporating the effective energy. The most distinctive feature of this thermodynamic order parameter is that it works in pointing to near-native folded structures even when the knowledge of the native structure is lacking, and the use of the effective energy will also find applications in combination with methods of protein structure prediction.
Virtual screening of inorganic materials synthesis parameters with deep learning
Kim, Edward; Huang, Kevin; Jegelka, Stefanie; Olivetti, Elsa
2017-12-01
Virtual materials screening approaches have proliferated in the past decade, driven by rapid advances in first-principles computational techniques, and machine-learning algorithms. By comparison, computationally driven materials synthesis screening is still in its infancy, and is mired by the challenges of data sparsity and data scarcity: Synthesis routes exist in a sparse, high-dimensional parameter space that is difficult to optimize over directly, and, for some materials of interest, only scarce volumes of literature-reported syntheses are available. In this article, we present a framework for suggesting quantitative synthesis parameters and potential driving factors for synthesis outcomes. We use a variational autoencoder to compress sparse synthesis representations into a lower dimensional space, which is found to improve the performance of machine-learning tasks. To realize this screening framework even in cases where there are few literature data, we devise a novel data augmentation methodology that incorporates literature synthesis data from related materials systems. We apply this variational autoencoder framework to generate potential SrTiO3 synthesis parameter sets, propose driving factors for brookite TiO2 formation, and identify correlations between alkali-ion intercalation and MnO2 polymorph selection.
Joudaki, Shahab; Blake, Chris; Johnson, Andrew; Amon, Alexandra; Asgari, Marika; Choi, Ami; Erben, Thomas; Glazebrook, Karl; Harnois-Déraps, Joachim; Heymans, Catherine; Hildebrandt, Hendrik; Hoekstra, Henk; Klaes, Dominik; Kuijken, Konrad; Lidman, Chris; Mead, Alexander; Miller, Lance; Parkinson, David; Poole, Gregory B.; Schneider, Peter; Viola, Massimo; Wolf, Christian
2018-03-01
We perform a combined analysis of cosmic shear tomography, galaxy-galaxy lensing tomography, and redshift-space multipole power spectra (monopole and quadrupole) using 450 deg$^2$ of imaging data by the Kilo Degree Survey (KiDS-450) overlapping with two spectroscopic surveys: the 2-degree Field Lensing Survey (2dFLenS) and the Baryon Oscillation Spectroscopic Survey (BOSS). We restrict the galaxy-galaxy lensing and multipole power spectrum measurements to the overlapping regions with KiDS, and self-consistently compute the full covariance between the different observables using a large suite of N-body simulations. We methodically analyse different combinations of the observables, finding that the galaxy-galaxy lensing measurements are particularly useful in improving the constraint on the intrinsic alignment amplitude, while the multipole power spectra are useful in tightening the constraints along the lensing degeneracy direction. The fully combined constraint on $S_8 \equiv \sigma_8 \sqrt{\Omega_m/0.3} = 0.742 \pm 0.035$, which is an improvement by 20 per cent compared to KiDS alone, corresponds to a $2.6\sigma$ discordance with Planck, and is not significantly affected by fitting to a more conservative set of scales. Given the tightening of the parameter space, we are unable to resolve the discordance with an extended cosmology that is simultaneously favoured in a model selection sense, including the sum of neutrino masses, curvature, evolving dark energy and modified gravity. The complementarity of our observables allows for constraints on modified gravity degrees of freedom that are not simultaneously bounded with either probe alone, and up to a factor of three improvement in the $S_8$ constraint in the extended cosmology compared to KiDS alone.
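As a small worked illustration of the definition $S_8 \equiv \sigma_8 \sqrt{\Omega_m/0.3}$ quoted above, the sketch below evaluates the formula for a pair of input values. The inputs are arbitrary illustrative numbers, not values from the paper, and the function name `s8` is hypothetical.

```python
import math

def s8(sigma8, omega_m):
    """S_8 = sigma_8 * sqrt(Omega_m / 0.3), as defined in the abstract."""
    return sigma8 * math.sqrt(omega_m / 0.3)

# Illustrative inputs only (assumed, not taken from the paper).
pivot = s8(0.8, 0.3)      # at Omega_m = 0.3 the square root is 1, so S_8 equals sigma_8
example = s8(0.8, 0.27)   # a lower Omega_m reduces S_8 at fixed sigma_8
```

The combination is useful because lensing data constrain this product more tightly than either parameter alone, which is the degeneracy direction the abstract refers to.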
Trembach, Vera
2014-01-01
Space is an introduction to the mysteries of the Universe. Included are Task Cards for independent learning, Journal Word Cards for creative writing, and Hands-On Activities for reinforcing skills in Math and Language Arts. Space is a perfect introduction to further research of the Solar System.
Shaffer, Patrick; Valsson, Omar; Parrinello, Michele
2016-02-02
The capabilities of molecular simulations have been greatly extended by a number of widely used enhanced sampling methods that facilitate escaping from metastable states and crossing large barriers. Despite these developments there are still many problems which remain out of reach for these methods which has led to a vigorous effort in this area. One of the most important problems that remains unsolved is sampling high-dimensional free-energy landscapes and systems that are not easily described by a small number of collective variables. In this work we demonstrate a new way to compute free-energy landscapes of high dimensionality based on the previously introduced variationally enhanced sampling, and we apply it to the miniprotein chignolin.
Shaffer, Patrick; Valsson, Omar; Parrinello, Michele
2016-01-01
The capabilities of molecular simulations have been greatly extended by a number of widely used enhanced sampling methods that facilitate escaping from metastable states and crossing large barriers. Despite these developments there are still many problems which remain out of reach for these methods which has led to a vigorous effort in this area. One of the most important problems that remains unsolved is sampling high-dimensional free-energy landscapes and systems that are not easily described by a small number of collective variables. In this work we demonstrate a new way to compute free-energy landscapes of high dimensionality based on the previously introduced variationally enhanced sampling, and we apply it to the miniprotein chignolin. PMID:26787868
International Nuclear Information System (INIS)
Oganesian, A.G.
1998-01-01
A method is proposed for estimating unknown vacuum expectation values of high-dimensional operators. The method is based on the idea that the factorization hypothesis is self-consistent. Results are obtained for all vacuum expectation values of dimension-7 operators, and some estimates for dimension-10 operators are presented as well. The resulting values are used to compute corrections of higher dimensions to the Bjorken and Ellis-Jaffe sum rules.
Multisymplectic Structure-Preserving in Simple Finite Element Method in High Dimensional Case
Institute of Scientific and Technical Information of China (English)
BAI Yong-Qiang; LIU Zhen; PEI Ming; ZHENG Zhu-Jun
2003-01-01
In this paper, we study a finite element scheme of some semi-linear elliptic boundary value problems in high-dimensional space. With a uniform mesh, we find that the numerical scheme derived from the finite element method preserves a multisymplectic structure.
Restoring the Generalizability of SVM Based Decoding in High Dimensional Neuroimage Data
DEFF Research Database (Denmark)
Abrahamsen, Trine Julie; Hansen, Lars Kai
2011-01-01
Variance inflation is caused by a mismatch between linear projections of test and training data when projections are estimated on training sets smaller than the dimensionality of the feature space. We demonstrate that variance inflation can lead to an increased neuroimage decoding error rate...
Pini, Núbia Inocencya Pavesi; Marchi, Luciana Manzotti De; Pascotto, Renata Corrêa
2015-01-01
Maxillary lateral incisor agenesis (MLIA) is a condition that affects both dental esthetics and function in young patients, and represents an important challenge for clinicians. Although several treatment options are available, two have recently emerged as important treatment approaches: mesial repositioning of the canines followed by recontouring of the teeth into lateral incisors, and space opening/maintenance followed by implant placement. In this article, the current and latest literature has been reviewed in order to summarize the functional and esthetic outcomes obtained with these two forms of treatment of MLIA patients in recent years. Indications, clinical limitations and the most important parameters to achieve the best possible results with each treatment modality are also discussed. Within the limitations of this review, it is not possible to assert at this point in time that one treatment approach is more advantageous than the other. Long-term follow-up studies comparing the existing treatment options are still lacking in the literature, and they are necessary to shed some light on the issue. It is possible, however, to state that adequate multidisciplinary diagnosis and planning are imperative to define the treatment option that will provide the best individual results for patients with MLIA. PMID:25646137
Henderson, Calen B.; Poleski, Radoslaw; Penny, Matthew; Street, Rachel A.; Bennett, David P.; Hogg, David W.; Gaudi, B. Scott; Zhu, W.; Barclay, T.; Barentsen, G.;
2016-01-01
K2's Campaign 9 (K2C9) will conduct a $\sim$3.7 deg$^2$ survey toward the Galactic bulge from 2016 April 22 through July 2 that will leverage the spatial separation between K2 and the Earth to facilitate measurement of the microlens parallax $\pi_{\rm E}$ for $\gtrsim$170 microlensing events. These will include several that are planetary in nature as well as many short-timescale microlensing events, which are potentially indicative of free-floating planets (FFPs). These satellite parallax measurements will in turn allow for the direct measurement of the masses of and distances to the lensing systems. In this article we provide an overview of the K2C9 space- and ground-based microlensing survey. Specifically, we detail the demographic questions that can be addressed by this program, including the frequency of FFPs and the Galactic distribution of exoplanets, the observational parameters of K2C9, and the array of resources dedicated to concurrent observations. Finally, we outline the avenues through which the larger community can become involved, and generally encourage participation in K2C9, which constitutes an important pathfinding mission and community exercise in anticipation of WFIRST.
Hands-on parameter search for neural simulations by a MIDI-controller.
Eichner, Hubert; Borst, Alexander
2011-01-01
Computational neuroscientists frequently encounter the challenge of parameter fitting--exploring a usually high dimensional variable space to find a parameter set that reproduces an experimental data set. One common approach is using automated search algorithms such as gradient descent or genetic algorithms. However, these approaches suffer from several shortcomings related to their lack of understanding of the underlying question, such as defining a suitable error function or getting stuck in local minima. Another widespread approach is manual parameter fitting using a keyboard or a mouse, evaluating different parameter sets following the user's intuition. However, this process is often cumbersome and time-intensive. Here, we present a new method for manual parameter fitting. A MIDI controller provides input to the simulation software, where model parameters are then tuned according to the knob and slider positions on the device. The model is immediately updated on every parameter change, continuously plotting the latest results. Given reasonably short simulation times of less than one second, we find this method to be highly efficient in quickly determining good parameter sets. Our approach bears a close resemblance to tuning the sound of an analog synthesizer, giving the user a very good intuition of the problem at hand, such as immediate feedback on whether and how results are affected by specific parameter changes. In addition to being used in research, our approach should be an ideal teaching tool, allowing students to interactively explore complex models such as Hodgkin-Huxley or dynamical systems.
Hands-on parameter search for neural simulations by a MIDI-controller.
Directory of Open Access Journals (Sweden)
Hubert Eichner
Computational neuroscientists frequently encounter the challenge of parameter fitting--exploring a usually high dimensional variable space to find a parameter set that reproduces an experimental data set. One common approach is using automated search algorithms such as gradient descent or genetic algorithms. However, these approaches suffer from several shortcomings related to their lack of understanding of the underlying question, such as defining a suitable error function or getting stuck in local minima. Another widespread approach is manual parameter fitting using a keyboard or a mouse, evaluating different parameter sets following the user's intuition. However, this process is often cumbersome and time-intensive. Here, we present a new method for manual parameter fitting. A MIDI controller provides input to the simulation software, where model parameters are then tuned according to the knob and slider positions on the device. The model is immediately updated on every parameter change, continuously plotting the latest results. Given reasonably short simulation times of less than one second, we find this method to be highly efficient in quickly determining good parameter sets. Our approach bears a close resemblance to tuning the sound of an analog synthesizer, giving the user a very good intuition of the problem at hand, such as immediate feedback on whether and how results are affected by specific parameter changes. In addition to being used in research, our approach should be an ideal teaching tool, allowing students to interactively explore complex models such as Hodgkin-Huxley or dynamical systems.
Graph Based Models for Unsupervised High Dimensional Data Clustering and Network Analysis
2015-01-01
A. Porter and my advisor. The text is primarily written by me. Chapter 5 is a version of [46] where my contribution is all of the analytical ...inn Euclidean space, a variational method refers to using calculus of variations techniques to find the minimizer (or maximizer) of a functional (energy... geometric interpretation of modularity optimization contrasts with existing interpretations (e.g., probabilistic ones or in terms of the Potts model
Ghattas, O.; Petra, N.; Cui, T.; Marzouk, Y.; Benjamin, P.; Willcox, K.
2016-12-01
Model-based projections of the dynamics of the polar ice sheets play a central role in anticipating future sea level rise. However, a number of mathematical and computational challenges place significant barriers on improving the predictability of these models. One such challenge is caused by the unknown model parameters (e.g., in the basal boundary conditions) that must be inferred from heterogeneous observational data, leading to an ill-posed inverse problem and the need to quantify uncertainties in its solution. In this talk we discuss the problem of estimating the uncertainty in the solution of (large-scale) ice sheet inverse problems within the framework of Bayesian inference. Computing the general solution of the inverse problem--i.e., the posterior probability density--is intractable with current methods on today's computers, due to the expense of solving the forward model (3D full Stokes flow with nonlinear rheology) and the high dimensionality of the uncertain parameters (which are discretizations of the basal sliding coefficient field). To overcome these twin computational challenges, it is essential to exploit problem structure (e.g., sensitivity of the data to parameters, the smoothing property of the forward model, and correlations in the prior). To this end, we present a data-informed approach that identifies low-dimensional structure in both parameter space and the forward model state space. This approach exploits the fact that the observations inform only a low-dimensional parameter space and allows us to construct a parameter-reduced posterior. Sampling this parameter-reduced posterior still requires multiple evaluations of the forward problem; therefore, we also aim to identify a low-dimensional state space to reduce the computational cost. To this end, we apply a proper orthogonal decomposition (POD) approach to approximate the state using a low-dimensional manifold constructed using "snapshots" from the parameter-reduced posterior, and the discrete
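The proper orthogonal decomposition step mentioned in this abstract can be sketched generically as a truncated SVD of a snapshot matrix. The sketch below is not the authors' implementation: the synthetic rank-3 snapshots stand in for forward-model states, and the function name `pod_basis` and the 99.9 per cent energy threshold are assumptions made for illustration.

```python
import numpy as np

def pod_basis(snapshots, energy=0.999):
    """Return the leading left singular vectors capturing the given energy fraction.

    snapshots: array of shape (n_state, n_snapshots), one state vector per column.
    """
    U, s, _ = np.linalg.svd(snapshots, full_matrices=False)
    cumulative = np.cumsum(s**2) / np.sum(s**2)
    r = int(np.searchsorted(cumulative, energy) + 1)  # smallest r reaching the threshold
    return U[:, :r]

# Synthetic stand-in for forward-model states: 40 snapshots of a 200-dimensional
# state that actually lives in a 3-dimensional subspace (assumed test data).
rng = np.random.default_rng(0)
snapshots = rng.standard_normal((200, 3)) @ rng.standard_normal((3, 40))

basis = pod_basis(snapshots)          # orthonormal columns, shape (200, r)
reduced = basis.T @ snapshots         # low-dimensional coordinates, shape (r, 40)
reconstructed = basis @ reduced       # approximation in the full state space
rel_err = np.linalg.norm(snapshots - reconstructed) / np.linalg.norm(snapshots)
```

The relative reconstruction error is bounded by the square root of the discarded energy fraction, which is why the threshold directly controls the accuracy of the reduced state space.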
Banks, H. T.; Ito, K.
1991-01-01
A hybrid method for computing the feedback gains in the linear quadratic regulator problem is proposed. The method, which combines the use of a Chandrasekhar type system with an iteration of the Newton-Kleinman form with variable acceleration parameter Smith schemes, is formulated to compute the feedback gains directly and efficiently, rather than solutions of an associated Riccati equation. The hybrid method is particularly appropriate when used with large dimensional systems such as those arising in approximating infinite-dimensional (distributed parameter) control systems (e.g., those governed by delay-differential and partial differential equations). Computational advantages of the proposed algorithm over the standard eigenvector (Potter, Laub-Schur) based techniques are discussed, and numerical evidence of the efficacy of these ideas is presented.
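For orientation, the textbook Newton-Kleinman iteration for the continuous-time linear quadratic regulator can be sketched as follows. This is not the hybrid Chandrasekhar/Smith scheme of the paper: each step here solves a Lyapunov equation with SciPy's dense solver, and the small test system, the zero initial gain (valid only because the chosen A is already stable) and the function name are assumptions for illustration.

```python
import numpy as np
from scipy.linalg import solve_continuous_are, solve_continuous_lyapunov

def newton_kleinman(A, B, Q, R, K0, iterations=20):
    """Plain Newton-Kleinman iteration; K0 must stabilize A - B K0."""
    K = K0
    for _ in range(iterations):
        A_cl = A - B @ K
        # Solve A_cl^T P + P A_cl = -(Q + K^T R K) for P.
        P = solve_continuous_lyapunov(A_cl.T, -(Q + K.T @ R @ K))
        # Gain update K = R^{-1} B^T P.
        K = np.linalg.solve(R, B.T @ P)
    return K

# Small stable test system (assumed for illustration); eigenvalues of A are -1 and -2.
A = np.array([[0.0, 1.0], [-2.0, -3.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])

K = newton_kleinman(A, B, Q, R, K0=np.zeros((1, 2)))

# Reference gain from the algebraic Riccati equation, for comparison only.
K_ref = np.linalg.solve(R, B.T @ solve_continuous_are(A, B, Q, R))
```

In this dense form every iteration still forms the full matrix P; the abstract's point is that for large systems it is preferable to work with the much smaller gain matrix directly.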
Cowley, Benjamin R.; Kaufman, Matthew T.; Butler, Zachary S.; Churchland, Mark M.; Ryu, Stephen I.; Shenoy, Krishna V.; Yu, Byron M.
2013-12-01
Objective. Analyzing and interpreting the activity of a heterogeneous population of neurons can be challenging, especially as the number of neurons, experimental trials, and experimental conditions increases. One approach is to extract a set of latent variables that succinctly captures the prominent co-fluctuation patterns across the neural population. A key problem is that the number of latent variables needed to adequately describe the population activity is often greater than 3, thereby preventing direct visualization of the latent space. By visualizing a small number of 2-d projections of the latent space or each latent variable individually, it is easy to miss salient features of the population activity. Approach. To address this limitation, we developed a Matlab graphical user interface (called DataHigh) that allows the user to quickly and smoothly navigate through a continuum of different 2-d projections of the latent space. We also implemented a suite of additional visualization tools (including playing out population activity timecourses as a movie and displaying summary statistics, such as covariance ellipses and average timecourses) and an optional tool for performing dimensionality reduction. Main results. To demonstrate the utility and versatility of DataHigh, we used it to analyze single-trial spike count and single-trial timecourse population activity recorded using a multi-electrode array, as well as trial-averaged population activity recorded using single electrodes. Significance. DataHigh was developed to fulfil a need for visualization in exploratory neural data analysis, which can provide intuition that is critical for building scientific hypotheses and models of population activity.
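The idea of viewing a latent space through 2-d projections can be sketched in a few lines. DataHigh itself is a Matlab GUI and its internals are not described here, so the following is only a generic illustration in Python: a random orthonormal 2-d projection plane is drawn via a QR decomposition, and the six-dimensional latent data are synthetic placeholders.

```python
import numpy as np

def random_2d_projection(latent_dim, rng):
    """Return a (latent_dim, 2) matrix with orthonormal columns spanning a random plane."""
    q, _ = np.linalg.qr(rng.standard_normal((latent_dim, 2)))
    return q

rng = np.random.default_rng(1)
latents = rng.standard_normal((100, 6))    # placeholder: 100 trials, 6 latent variables
projection = random_2d_projection(6, rng)
points_2d = latents @ projection           # coordinates of each trial in the 2-d view
```

Any single such view discards four of the six dimensions, which is why a tool that moves smoothly through many projection planes can reveal structure that a fixed pair of axes would miss.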
Cowley, Benjamin R; Kaufman, Matthew T; Butler, Zachary S; Churchland, Mark M; Ryu, Stephen I; Shenoy, Krishna V; Yu, Byron M
2013-12-01
Analyzing and interpreting the activity of a heterogeneous population of neurons can be challenging, especially as the number of neurons, experimental trials, and experimental conditions increases. One approach is to extract a set of latent variables that succinctly captures the prominent co-fluctuation patterns across the neural population. A key problem is that the number of latent variables needed to adequately describe the population activity is often greater than 3, thereby preventing direct visualization of the latent space. By visualizing a small number of 2-d projections of the latent space or each latent variable individually, it is easy to miss salient features of the population activity. To address this limitation, we developed a Matlab graphical user interface (called DataHigh) that allows the user to quickly and smoothly navigate through a continuum of different 2-d projections of the latent space. We also implemented a suite of additional visualization tools (including playing out population activity timecourses as a movie and displaying summary statistics, such as covariance ellipses and average timecourses) and an optional tool for performing dimensionality reduction. To demonstrate the utility and versatility of DataHigh, we used it to analyze single-trial spike count and single-trial timecourse population activity recorded using a multi-electrode array, as well as trial-averaged population activity recorded using single electrodes. DataHigh was developed to fulfil a need for visualization in exploratory neural data analysis, which can provide intuition that is critical for building scientific hypotheses and models of population activity.
Cowley, Benjamin R.; Kaufman, Matthew T.; Butler, Zachary S.; Churchland, Mark M.; Ryu, Stephen I.; Shenoy, Krishna V.; Yu, Byron M.
2014-01-01
Objective. Analyzing and interpreting the activity of a heterogeneous population of neurons can be challenging, especially as the number of neurons, experimental trials, and experimental conditions increases. One approach is to extract a set of latent variables that succinctly captures the prominent co-fluctuation patterns across the neural population. A key problem is that the number of latent variables needed to adequately describe the population activity is often greater than three, thereby preventing direct visualization of the latent space. By visualizing a small number of 2-d projections of the latent space or each latent variable individually, it is easy to miss salient features of the population activity. Approach. To address this limitation, we developed a Matlab graphical user interface (called DataHigh) that allows the user to quickly and smoothly navigate through a continuum of different 2-d projections of the latent space. We also implemented a suite of additional visualization tools (including playing out population activity timecourses as a movie and displaying summary statistics, such as covariance ellipses and average timecourses) and an optional tool for performing dimensionality reduction. Main results. To demonstrate the utility and versatility of DataHigh, we used it to analyze single-trial spike count and single-trial timecourse population activity recorded using a multi-electrode array, as well as trial-averaged population activity recorded using single electrodes. Significance. DataHigh was developed to fulfill a need for visualization in exploratory neural data analysis, which can provide intuition that is critical for building scientific hypotheses and models of population activity. PMID:24216250
Calculation of high-dimensional fission-fusion potential-energy surfaces in the SHE region
International Nuclear Information System (INIS)
Moeller, Peter; Sierk, Arnold J.; Ichikawa, Takatoshi; Iwamoto, Akira
2004-01-01
We calculate in a macroscopic-microscopic model fission-fusion potential-energy surfaces relevant to the analysis of heavy-ion reactions employed to form heavy-element evaporation residues. We study these multidimensional potential-energy surfaces both inside and outside the touching point. Inside the point of contact we define the potential on a multi-million-point grid in 5D deformation space where elongation, merging projectile and target spheroidal shapes, neck radius and projectile/target mass asymmetry are independent shape variables. The same deformation space and the corresponding potential-energy surface also describe the shape evolution from the nuclear ground state to separating fragments in fission, and the fast-fission trajectories in incomplete fusion. For separated nuclei we study the macroscopic-microscopic potential energy, that is, the "collision surface" between a spheroidally deformed target and a spheroidally deformed projectile as a function of three coordinates: the relative location of the projectile center-of-mass with respect to the target center-of-mass and the spheroidal deformations of the target and the projectile. We limit our study to the most favorable relative positions of target and projectile, namely those in which the symmetry axes of the target and projectile are collinear.
International Nuclear Information System (INIS)
Von Nessi, G T; Hole, M J
2014-01-01
We present recent results and technical breakthroughs for the Bayesian inference of tokamak equilibria using force-balance as a prior constraint. Issues surrounding model parameter representation and posterior analysis are discussed and addressed. These points motivate the recent advancements embodied in the Bayesian Equilibrium Analysis and Simulation Tool (BEAST) software being presently utilized to study equilibria on the Mega-Ampere Spherical Tokamak (MAST) experiment in the UK (von Nessi et al 2012 J. Phys. A 46 185501). State-of-the-art results of using BEAST to study MAST equilibria are reviewed, with recent code advancements being systematically presented throughout the manuscript. (paper)
Bich, Cao Thi; Dat, Le Thanh; Van Hop, Nguyen; An, Nguyen Ba
2018-04-01
Entanglement plays a vital and in many cases irreplaceable role in quantum network communication. Here, we propose two new protocols to jointly and remotely prepare a special so-called bipartite equatorial state which is hybrid in the sense that it entangles two Hilbert spaces with arbitrary different dimensions D and N (i.e., a type of entanglement between a quDit and a quNit). The quantum channels required to do that are, however, not necessarily hybrid. In fact, we utilize four high-dimensional Einstein-Podolsky-Rosen pairs, two of which are quDit-quDit entanglements, while the other two are quNit-quNit ones. In the first protocol the receiver has to be involved actively in the process of remote state preparation, while in the second protocol the receiver is passive as he/she needs to participate only in the final step for reconstructing the target hybrid state. Each protocol meets a specific circumstance that may be encountered in practice and both can be performed with unit success probability. Moreover, the concerned equatorial hybrid entangled state can also be jointly prepared for two receivers at two separated locations by slightly modifying the initial particles' distribution, thereby establishing between them an entangled channel ready for later use.
Garashchuk, Sophya; Rassolov, Vitaly A
2008-07-14
Semiclassical implementation of the quantum trajectory formalism [J. Chem. Phys. 120, 1181 (2004)] is further developed to give a stable long-time description of zero-point energy in anharmonic systems of high dimensionality. The method is based on a numerically cheap linearized quantum force approach; stabilizing terms compensating for the linearization errors are added into the time-evolution equations for the classical and nonclassical components of the momentum operator. The wave function normalization and energy are rigorously conserved. Numerical tests are performed for model systems of up to 40 degrees of freedom.
Benediktsson, J. A.; Swain, P. H.; Ersoy, O. K.
1993-01-01
Application of neural networks to classification of remote sensing data is discussed. Conventional two-layer backpropagation is found to give good results in classification of remote sensing data but is not efficient in training. A more efficient variant, based on conjugate-gradient optimization, is used for classification of multisource remote sensing and geographic data and very-high-dimensional data. The conjugate-gradient neural networks give excellent performance in classification of multisource data, but do not compare as well with statistical methods in classification of very-high-dimensional data.
Safaei, S.; Haghnegahdar, A.; Razavi, S.
2016-12-01
Complex environmental models are now the primary tool to inform decision makers for the current or future management of environmental resources under climate and environmental change. These complex models often contain a large number of parameters that need to be determined by a computationally intensive calibration procedure. Sensitivity analysis (SA) is a very useful tool that not only allows for understanding the model behavior, but also helps in reducing the number of calibration parameters by identifying unimportant ones. The issue is that most global sensitivity techniques are themselves highly computationally demanding when generating robust and stable sensitivity metrics over the entire model response surface. Recently, a novel global sensitivity analysis method, Variogram Analysis of Response Surfaces (VARS), was introduced that can efficiently provide a comprehensive assessment of global sensitivity using the variogram concept. In this work, we aim to evaluate the effectiveness of this highly efficient GSA method in reducing the computational burden when applied to systems with an extra-large number of input factors ( 100). We use a test function and a hydrological modelling case study to demonstrate the capability of the VARS method in reducing problem dimensionality by identifying important vs unimportant input factors.
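The variogram idea behind this record can be illustrated with a minimal sketch. It is not the VARS algorithm or its sampling scheme; it only estimates, for each input factor, the directional variogram gamma(h) = 0.5 * E[(f(x + h e_i) - f(x))^2] at a single lag h on the unit hypercube. The function name `directional_variogram`, the lag value, and the toy test function are assumptions made for illustration.

```python
import numpy as np

def directional_variogram(f, dim, h, n_base=256, seed=0):
    """Estimate gamma_i(h) = 0.5 * E[(f(x + h*e_i) - f(x))^2] for each factor i.

    `f` must accept an (n, dim) array of points in the unit hypercube and
    return an array of n responses. This is an illustrative estimator only.
    """
    rng = np.random.default_rng(seed)
    gammas = np.empty(dim)
    for i in range(dim):
        x = rng.random((n_base, dim))
        x[:, i] *= (1.0 - h)          # keep x_i + h inside [0, 1]
        shifted = x.copy()
        shifted[:, i] += h            # perturb only factor i by the lag h
        gammas[i] = 0.5 * np.mean((f(shifted) - f(x)) ** 2)
    return gammas

# Hypothetical test function: factor 0 matters a lot, factor 1 a little, factor 2 not at all.
def toy_model(x):
    return 5.0 * x[:, 0] + 0.1 * x[:, 1] + 0.0 * x[:, 2]

gammas = directional_variogram(toy_model, dim=3, h=0.1)
ranking = np.argsort(gammas)[::-1]   # most to least influential factor
```

Factors whose variogram stays near zero across lags are candidates for being fixed during calibration, which is the dimensionality reduction the abstract refers to.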
Mah, Yee-Haur; Jager, Rolf; Kennard, Christopher; Husain, Masud; Nachev, Parashkev
2014-07-01
Making robust inferences about the functional neuroanatomy of the brain is critically dependent on experimental techniques that examine the consequences of focal loss of brain function. Unfortunately, the use of the most comprehensive such technique, lesion-function mapping, is complicated by the need for time-consuming and subjective manual delineation of the lesions, greatly limiting the practicability of the approach. Here we exploit a recently-described general measure of statistical anomaly, zeta, to devise a fully-automated, high-dimensional algorithm for identifying the parameters of lesions within a brain image given a reference set of normal brain images. We proceed to evaluate such an algorithm in the context of diffusion-weighted imaging of the commonest type of lesion used in neuroanatomical research: ischaemic damage. Summary performance metrics exceed those previously published for diffusion-weighted imaging and approach the current gold standard, manual segmentation, sufficiently closely for fully-automated lesion-mapping studies to become a possibility. We apply the new method to 435 unselected images of patients with ischaemic stroke to derive a probabilistic map of the pattern of damage in lesions involving the occipital lobe, demonstrating the variation of anatomical resolvability of occipital areas so as to guide future lesion-function studies of the region. Copyright © 2012 Elsevier Ltd. All rights reserved.
A Framework for the Interactive Handling of High-Dimensional Simulation Data in Complex Geometries
Benzina, Amal; Buse, Gerrit; Butnaru, Daniel; Murarasu, Alin; Treib, Marc; Varduhn, Vasco; Mundani, Ralf-Peter
2013-01-01
Flow simulations around building infrastructure models involve large scale complex geometries, which when discretized in adequate detail entail high computational cost. Moreover, tasks such as simulation insight by steering or optimization require many such costly simulations. In this paper, we illustrate the whole pipeline of an integrated solution for interactive computational steering, developed for complex flow simulation scenarios that depend on a moderate number of both geometric and physical parameters. A mesh generator takes building information model input data and outputs a valid cartesian discretization. A sparse-grids-based surrogate model—a less costly substitute for the parameterized simulation—uses precomputed data to deliver approximated simulation results at interactive rates. Furthermore, a distributed multi-display visualization environment shows building infrastructure together with flow data. The focus is set on scalability and intuitive user interaction.
Diaz-Ruelas, Alvaro; Jeldtoft Jensen, Henrik; Piovani, Duccio; Robledo, Alberto
2016-12-01
It is well known that low-dimensional nonlinear deterministic maps close to a tangent bifurcation exhibit intermittency and this circumstance has been exploited, e.g., by Procaccia and Schuster [Phys. Rev. A 28, 1210 (1983)], to develop a general theory of 1/f spectra. This suggests it is interesting to study the extent to which the behavior of a high-dimensional stochastic system can be described by such tangent maps. The Tangled Nature (TaNa) Model of evolutionary ecology is an ideal candidate for such a study, a significant model as it is capable of reproducing a broad range of the phenomenology of macroevolution and ecosystems. The TaNa model exhibits strong intermittency reminiscent of punctuated equilibrium and, like the fossil record of mass extinction, the intermittency in the model is found to be non-stationary, a feature typical of many complex systems. We derive a mean-field version for the evolution of the likelihood function controlling the reproduction of species and find a local map close to tangency. This mean-field map, by our own local approximation, is able to describe qualitatively only one episode of the intermittent dynamics of the full TaNa model. To complement this result, we construct a complete nonlinear dynamical system model consisting of successive tangent bifurcations that generates time evolution patterns resembling those of the full TaNa model in macroscopic scales. The switch from one tangent bifurcation to the next in the sequences produced in this model is stochastic in nature, based on criteria obtained from the local mean-field approximation, and capable of imitating the changing set of types of species and total population in the TaNa model. The model combines full deterministic dynamics with instantaneous parameter random jumps at stochastically drawn times. In spite of the limitations of our approach, which entails a drastic collapse of degrees of freedom, the description of a high-dimensional model system in terms of a low
Local Likelihood Approach for High-Dimensional Peaks-Over-Threshold Inference
Baki, Zhuldyzay
2018-05-14
Global warming is affecting the Earth climate year by year, the biggest difference being observable in increasing temperatures in the World Ocean. Following the long-term global ocean warming trend, average sea surface temperatures across the global tropics and subtropics have increased by 0.4–1 °C in the last 40 years. These rates become even higher in semi-enclosed southern seas, such as the Red Sea, threatening the survival of thermal-sensitive species. As average sea surface temperatures are projected to continue to rise, careful study of future developments of extreme temperatures is paramount for the sustainability of marine ecosystem and biodiversity. In this thesis, we use Extreme-Value Theory to study sea surface temperature extremes from a gridded dataset comprising 16703 locations over the Red Sea. The data were provided by Operational SST and Sea Ice Analysis (OSTIA), a satellite-based data system designed for numerical weather prediction. After pre-processing the data to account for seasonality and global trends, we analyze the marginal distribution of extremes, defined as observations exceeding a high spatially varying threshold, using the Generalized Pareto distribution. This model allows us to extrapolate beyond the observed data to compute the 100-year return levels over the entire Red Sea, confirming the increasing trend of extreme temperatures. To understand the dynamics governing the dependence of extreme temperatures in the Red Sea, we propose a flexible local approach based on R-Pareto processes, which extend the univariate Generalized Pareto distribution to the spatial setting. Assuming that the sea surface temperature varies smoothly over space, we perform inference based on the gradient score method over small regional neighborhoods, in which the data are assumed to be stationary in space. This approach allows us to capture spatial non-stationarity, and to reduce the overall computational cost by taking advantage of
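The marginal step described in this record (fit a Generalized Pareto distribution to threshold exceedances, then extrapolate to a return level) can be sketched for a single location as follows. This is an illustration under simplifying assumptions, not the thesis code: the threshold is a constant rather than spatially varying, the series is assumed already detrended and deseasonalized, and the function name `gpd_return_level` and the synthetic data are hypothetical. The return level uses the standard formula z = u + (sigma / xi) * (m^xi - 1), where m is the expected number of exceedances in T years.

```python
import numpy as np
from scipy.stats import genpareto

def gpd_return_level(x, threshold, obs_per_year, return_period_years):
    """Fit a GPD to exceedances of `threshold` and return the T-year return level."""
    x = np.asarray(x, dtype=float)
    excess = x[x > threshold] - threshold
    rate = excess.size / x.size            # empirical exceedance probability
    # floc=0 fixes the location at zero, so only shape (xi) and scale (sigma) are fitted.
    xi, _, sigma = genpareto.fit(excess, floc=0)
    m = return_period_years * obs_per_year * rate   # expected exceedances in T years
    if abs(xi) < 1e-8:                     # exponential-tail limit as xi -> 0
        return threshold + sigma * np.log(m)
    return threshold + sigma / xi * (m ** xi - 1.0)

# Synthetic daily series standing in for one grid cell (not real SST data).
rng = np.random.default_rng(0)
series = rng.exponential(scale=1.0, size=5000)
u = np.quantile(series, 0.95)
level_10 = gpd_return_level(series, u, obs_per_year=365, return_period_years=10)
level_100 = gpd_return_level(series, u, obs_per_year=365, return_period_years=100)
```

Repeating this fit independently at every grid cell gives a map of marginal return levels; the spatial dependence modelled with R-Pareto processes in the abstract is a separate step that this sketch does not attempt.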
Directory of Open Access Journals (Sweden)
Omid Hamidi
2014-01-01
Microarray technology results in high-dimensional, low-sample-size data sets. Fitting sparse models is therefore essential, because only a small number of influential genes can reliably be identified. A number of variable selection approaches have been proposed for high-dimensional time-to-event data based on Cox proportional hazards where censoring is present. The present study applied three sparse variable selection techniques (Lasso, smoothly clipped absolute deviation, and the smooth integration of counting and absolute deviation) to gene expression survival time data using the additive risk model, which is adopted when the absolute effects of multiple predictors on the hazard function are of interest. The performance of the techniques was evaluated by time-dependent ROC curves and bootstrap .632+ prediction error curves. The genes selected by all methods were highly significant (P<0.001). The Lasso showed the maximum median area under the ROC curve over time (0.95), and smoothly clipped absolute deviation showed the lowest prediction error (0.105). The genes selected by all methods improved on the prediction of the purely clinical model, indicating the valuable information contained in the microarray features. It was concluded that the approaches used can satisfactorily predict survival based on selected gene expression measurements.
An automatic iris occlusion estimation method based on high-dimensional density estimation.
Li, Yung-Hui; Savvides, Marios
2013-04-01
Iris masks play an important role in iris recognition. They indicate which part of the iris texture map is useful and which part is occluded or contaminated by noisy image artifacts such as eyelashes, eyelids, eyeglasses frames, and specular reflections. The accuracy of the iris mask is extremely important. The performance of the iris recognition system will decrease dramatically when the iris mask is inaccurate, even when the best recognition algorithm is used. Traditionally, rule-based algorithms have been used to estimate iris masks from iris images. However, the accuracy of the iris masks generated this way is questionable. In this work, we propose to use Figueiredo and Jain's Gaussian Mixture Models (FJ-GMMs) to model the underlying probabilistic distributions of both valid and invalid regions on iris images. We also explored possible features and found that a Gabor Filter Bank (GFB) provides the most discriminative information for our goal. Finally, we applied the Simulated Annealing (SA) technique to optimize the parameters of the GFB in order to achieve the best recognition rate. Experimental results show that the masks generated by the proposed algorithm increase the iris recognition rate on both the ICE2 and UBIRIS datasets, verifying the effectiveness and importance of our proposed method for iris occlusion estimation.
Strategies to reduce the complexity of hydrologic data assimilation for high-dimensional models
Hernandez, F.; Liang, X.
2017-12-01
Probabilistic forecasts in the geosciences offer invaluable information by allowing the uncertainty of predicted conditions (including threats like floods and droughts) to be estimated. However, while forecast systems based on modern data assimilation algorithms are capable of producing multi-variate probability distributions of future conditions, the computational resources required to fully characterize the dependencies between the model's state variables render their applicability impractical for high-resolution cases. This occurs because of the quadratic space complexity of storing the covariance matrices that encode these dependencies and the cubic time complexity of performing inference operations with them. In this work we introduce two complementary strategies to reduce the size of the covariance matrices that are at the heart of Bayesian assimilation methods, like some variants of (ensemble) Kalman filters and of particle filters, and variational methods. The first strategy involves the optimized grouping of state variables by clustering individual cells of the model into "super-cells." A dynamic fuzzy clustering approach is used to take into account the states (e.g., soil moisture) and forcings (e.g., precipitation) of each cell at each time step. The second strategy consists in finding a compressed representation of the covariance matrix that still encodes the most relevant information but that can be more efficiently stored and processed. A learning and a belief-propagation inference algorithm are developed to take advantage of this modified low-rank representation. The two proposed strategies are incorporated into OPTIMISTS, a state-of-the-art hybrid Bayesian/variational data assimilation algorithm, and comparative streamflow forecasting tests are performed using two watersheds modeled with the Distributed Hydrology Soil Vegetation Model (DHSVM). Contrasts are made between the efficiency gains and forecast accuracy losses of each strategy used in
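The storage argument in this record (a full covariance of n state variables needs n^2 entries, while a rank-r factorization needs about n*r) can be illustrated with a plain truncated SVD of ensemble anomalies. This is a generic substitute chosen for illustration, not the modified low-rank representation or the belief-propagation algorithm of the abstract; the function name `low_rank_covariance` and the ensemble sizes are assumptions.

```python
import numpy as np

def low_rank_covariance(samples, rank):
    """Return (U, s) such that cov(samples) is approximated by U @ diag(s) @ U.T.

    `samples` has shape (n_members, n_state). The SVD is taken of the centred
    ensemble, so the n_state x n_state covariance matrix is never formed.
    """
    x = np.asarray(samples, dtype=float)
    anomalies = x - x.mean(axis=0)
    _, sing, vt = np.linalg.svd(anomalies, full_matrices=False)
    variances = sing[:rank] ** 2 / (x.shape[0] - 1)   # leading covariance eigenvalues
    return vt[:rank].T, variances

# Hypothetical ensemble: 50 members, 200 state variables.
rng = np.random.default_rng(1)
ensemble = rng.standard_normal((50, 200))
U, s = low_rank_covariance(ensemble, rank=49)   # centred 50-member ensemble has rank <= 49
approx = U @ np.diag(s) @ U.T
stored = U.size + s.size                        # 200*49 + 49 numbers instead of 200*200
```

With a smaller `rank` the approximation discards the weakest directions of variability, trading accuracy for storage and speed, which mirrors the efficiency versus accuracy contrast mentioned in the abstract.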
Directory of Open Access Journals (Sweden)
Ottavia eDipasquale
2015-02-01
High dimensional independent component analysis (ICA), compared to low dimensional ICA, allows a detailed parcellation of the resting state networks. The purpose of this study was to give further insight into functional connectivity (FC) in Alzheimer's disease (AD) using high dimensional ICA. For this reason, we performed both low and high dimensional ICA analyses of resting state fMRI (rfMRI) data of 20 healthy controls and 21 AD patients, focusing on the primarily altered default mode network (DMN) and exploring the sensory motor network (SMN). As expected, results obtained at low dimensionality were in line with previous literature. Moreover, high dimensional results allowed us to observe both the presence of within-network disconnections and FC damage confined to some of the resting state sub-networks. Due to the higher sensitivity of the high dimensional ICA analysis, our results suggest that high-dimensional decomposition in sub-networks is very promising to better localize FC alterations in AD and that FC damage is not confined to the default mode network.
Use of high-dimensional spectral data to evaluate organic matter, reflectance relationships in soils
Henderson, T. L.; Baumgardner, M. F.; Coster, D. C.; Franzmeier, D. P.; Stott, D. E.
1990-01-01
Recent breakthroughs in remote sensing technology have led to the development of a spaceborne high spectral resolution imaging sensor, HIRIS, to be launched in the mid-1990s for observation of earth surface features. The effects of organic carbon content on soil reflectance over the spectral range of HIRIS were evaluated, and the contributions of humic and fulvic acid fractions to soil reflectance were examined. Organic matter from four Indiana agricultural soils was extracted, fractionated, and purified, and six individual components of each soil were isolated and prepared for spectral analysis. The four soils, ranging in organic carbon content from 0.99 percent, represented various combinations of genetic parameters such as parent material, age, drainage, and native vegetation. An experimental procedure was developed to measure reflectance of very small soil and organic component samples in the laboratory, simulating the spectral coverage and resolution of the HIRIS sensor. Reflectance in 210 narrow (10 nm) bands was measured using the CARY 17D spectrophotometer over the 400 to 2500 nm wavelength range. Reflectance data were analyzed statistically to determine the regions of the reflective spectrum which provided useful information about soil organic matter content and composition. Wavebands providing significant information about soil organic carbon content were located in all three major regions of the reflective spectrum: visible, near infrared, and middle infrared. The purified humic acid fractions of the four soils were separable in six bands in the 1600 to 2400 nm range, suggesting that longwave middle infrared reflectance may be useful as a non-destructive laboratory technique for humic acid characterization.
Clark, James S.; Soltoff, Benjamin D.; Powell, Amanda S.; Read, Quentin D.
2012-01-01
Background For competing species to coexist, individuals must compete more with others of the same species than with those of other species. Ecologists search for tradeoffs in how species might partition the environment. The negative correlations among competing species that would be indicative of tradeoffs are rarely observed. A recent analysis showed that evidence for partitioning the environment is available when responses are disaggregated to the individual scale, in terms of the covariance structure of responses to environmental variation. That study did not relate that variation to the variables to which individuals were responding. To understand how this pattern of variation is related to niche variables, we analyzed responses to canopy gaps, long viewed as a key variable responsible for species coexistence. Methodology/Principal Findings A longitudinal intervention analysis of individual responses to experimental canopy gaps with 12 yr of pre-treatment and 8 yr post-treatment responses showed that species-level responses are positively correlated – species that grow fast on average in the understory also grow fast on average in response to gap formation. In other words, there is no tradeoff. However, the joint distribution of individual responses to understory and gap showed a negative correlation – species having individuals that respond most to gaps when previously growing slowly also have individuals that respond least to gaps when previously growing rapidly (e.g., Morus rubra), and vice versa (e.g., Quercus prinus). Conclusions/Significance Because competition occurs at the individual scale, not the species scale, aggregated species-level parameters and correlations hide the species-level differences needed for coexistence. By disaggregating models to the scale at which the interaction occurs we show that individual variation provides insight for species differences. PMID:22393349
Neutrino oscillation parameter sampling with MonteCUBES
Blennow, Mattias; Fernandez-Martinez, Enrique
2010-01-01
We present MonteCUBES ("Monte Carlo Utility Based Experiment Simulator"), a software package designed to sample the neutrino oscillation parameter space through Markov Chain Monte Carlo algorithms. MonteCUBES makes use of the GLoBES software so that the existing experiment definitions for GLoBES, describing long baseline and reactor experiments, can be used with MonteCUBES. MonteCUBES consists of two main parts: The first is a C library, written as a plug-in for GLoBES, implementing the Markov Chain Monte Carlo algorithm to sample the parameter space. The second part is a user-friendly graphical Matlab interface to easily read, analyze, plot and export the results of the parameter space sampling.
Program summary
Program title: MonteCUBES (Monte Carlo Utility Based Experiment Simulator)
Catalogue identifier: AEFJ_v1_0
Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AEFJ_v1_0.html
Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland
Licensing provisions: GNU General Public Licence
No. of lines in distributed program, including test data, etc.: 69 634
No. of bytes in distributed program, including test data, etc.: 3 980 776
Distribution format: tar.gz
Programming language: C
Computer: MonteCUBES builds and installs on 32 bit and 64 bit Linux systems where GLoBES is installed
Operating system: 32 bit and 64 bit Linux
RAM: Typically a few MBs
Classification: 11.1
External routines: GLoBES [1,2] and routines/libraries used by GLoBES
Subprograms used: Cat Id ADZI_v1_0, Title GLoBES, Reference CPC 177 (2007) 439
Nature of problem: Since neutrino masses do not appear in the standard model of particle physics, many models of neutrino masses also induce other types of new physics, which could affect the outcome of neutrino oscillation experiments. In general, these new physics imply high-dimensional parameter spaces that are difficult to explore using classical methods such as multi-dimensional projections and minimizations, such as those
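To make the sampling idea in this record concrete, here is a minimal random-walk Metropolis sketch in Python. It does not reproduce the MonteCUBES C library or any GLoBES call; the function name `metropolis`, the Gaussian proposal, the step size, and the toy Gaussian log-posterior are all assumptions standing in for a real oscillation-parameter likelihood.

```python
import numpy as np

def metropolis(log_post, x0, step, n_samples, seed=0):
    """Random-walk Metropolis sampler for an unnormalized log-posterior."""
    rng = np.random.default_rng(seed)
    x = np.atleast_1d(np.asarray(x0, dtype=float))
    lp = log_post(x)
    chain = np.empty((n_samples, x.size))
    for i in range(n_samples):
        proposal = x + step * rng.standard_normal(x.size)
        lp_new = log_post(proposal)
        # Accept with probability min(1, exp(lp_new - lp)); otherwise keep the current point.
        if np.log(rng.random()) < lp_new - lp:
            x, lp = proposal, lp_new
        chain[i] = x
    return chain

# Toy 4-parameter target (standard normal), standing in for a chi-square based posterior.
def toy_log_post(theta):
    return -0.5 * np.sum(theta ** 2)

chain = metropolis(toy_log_post, x0=np.zeros(4), step=0.8, n_samples=20000)
posterior_mean = chain.mean(axis=0)
```

Marginal distributions of any subset of parameters follow from histogramming the corresponding chain columns, which is why chain sampling scales better with dimension than the grid-based projections and minimizations the summary contrasts it with.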
Semiletov, I. P.; Shakhova, N. E.; Pipko, I. I.; Pugach, S. P.; Charkin, A. N.; Dudarev, O. V.; Kosmach, D. A.; Nishino, S.
2013-09-01
This study aims to improve understanding of carbon cycling in the Buor-Khaya Bay (BKB) and adjacent part of the Laptev Sea by studying the inter-annual, seasonal, and meso-scale variability of carbon and related hydrological and biogeochemical parameters in the water, as well as factors controlling carbon dioxide (CO2) emission. Here we present data sets obtained on summer cruises and winter expeditions during 12 yr of investigation. Based on data analysis, we suggest that in the heterotrophic BKB area, input of terrestrially borne organic carbon (OC) varies seasonally and inter-annually and is largely determined by rates of coastal erosion and river discharge. Two different BKB sedimentation regimes were revealed: Type 1 (erosion accumulation) and Type 2 (accumulation). A Type 1 sedimentation regime occurs more often and is believed to be the quantitatively most important mechanism for suspended particulate matter (SPM) and particulate organic carbon (POC) delivery to the BKB. The mean SPM concentration observed in the BKB under a Type 1 regime was one order of magnitude greater than the mean concentration of SPM (~20 mg L⁻¹) observed along the Lena River stream in summer 2003. Loadings of the BKB water column with particulate material vary by more than a factor of two between the two regimes. Higher partial pressure of CO2 (pCO2), higher concentrations of nutrients, and lower levels of oxygen saturation were observed in the bottom water near the eroded coasts, implying that coastal erosion and subsequent oxidation of eroded organic matter (OM) rather than the Lena River serves as the predominant source of nutrients to the BKB. Atmospheric CO2 fluxes from the sea surface in the BKB vary from 1 to 95 mmol m⁻² day⁻¹ and are determined by specific features of hydrology and wind conditions, which change spatially, seasonally, and inter-annually. Mean values of CO2 emission from the shallow Laptev Sea were similar in September 1999 and 2005 (7.2 and 7.8 mmol m⁻² day⁻¹
Multiobjective constraints for climate model parameter choices: Pragmatic Pareto fronts in CESM1
Langenbrunner, B.; Neelin, J. D.
2017-09-01
Global climate models (GCMs) are examples of high-dimensional input-output systems, where model output is a function of many variables, and an update in model physics commonly improves performance in one objective function (i.e., measure of model performance) at the expense of degrading another. Here concepts from multiobjective optimization in the engineering literature are used to investigate parameter sensitivity and optimization in the face of such trade-offs. A metamodeling technique called cut high-dimensional model representation (cut-HDMR) is leveraged in the context of multiobjective optimization to improve GCM simulation of the tropical Pacific climate, focusing on seasonal precipitation, column water vapor, and skin temperature. An evolutionary algorithm is used to solve for Pareto fronts, which are surfaces in objective function space along which trade-offs in GCM performance occur. This approach allows the modeler to visualize trade-offs quickly and identify the physics at play. In some cases, Pareto fronts are small, implying that trade-offs are minimal, optimal parameter value choices are more straightforward, and the GCM is well-functioning. In all cases considered here, the control run was found not to be Pareto-optimal (i.e., not on the front), highlighting an opportunity for model improvement through objectively informed parameter selection. Taylor diagrams illustrate that these improvements occur primarily in field magnitude, not spatial correlation, and they show that specific parameter updates can improve fields fundamental to tropical moist processes—namely precipitation and skin temperature—without significantly impacting others. These results provide an example of how basic elements of multiobjective optimization can facilitate pragmatic GCM tuning processes.
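The notion of a Pareto front used in this record can be made concrete with a small dominance filter. The sketch below is not the evolutionary algorithm or the cut-HDMR metamodel of the paper; it only marks which candidate parameter sets are non-dominated, assuming every objective is an error measure to be minimized. The function name `pareto_front_mask` and the example error values are hypothetical.

```python
import numpy as np

def pareto_front_mask(costs):
    """Return a boolean mask of non-dominated rows; every objective is minimized."""
    costs = np.asarray(costs, dtype=float)
    mask = np.ones(costs.shape[0], dtype=bool)
    for i in range(costs.shape[0]):
        # Row j dominates row i if it is no worse in every objective
        # and strictly better in at least one.
        no_worse = np.all(costs <= costs[i], axis=1)
        strictly_better = np.any(costs < costs[i], axis=1)
        if np.any(no_worse & strictly_better):
            mask[i] = False
    return mask

# Hypothetical (precipitation error, skin temperature error) pairs for five parameter sets.
errors = np.array([[1.0, 4.0], [2.0, 2.0], [4.0, 1.0], [3.0, 3.0], [2.0, 5.0]])
on_front = pareto_front_mask(errors)
```

A control run whose row is flagged False here would be one that some other parameter set improves upon in at least one objective without degrading any other, which is the sense in which the abstract describes the control run as not Pareto-optimal.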
Energy Technology Data Exchange (ETDEWEB)
Miao, Yan-Gang [Nankai University, School of Physics, Tianjin (China); Chinese Academy of Sciences, State Key Laboratory of Theoretical Physics, Institute of Theoretical Physics, P.O. Box 2735, Beijing (China); CERN, PH-TH Division, Geneva 23 (Switzerland); Xu, Zhen-Ming [Nankai University, School of Physics, Tianjin (China)
2016-04-15
Considering non-Gaussian smeared matter distributions, we investigate the thermodynamic behaviors of the noncommutative high-dimensional Schwarzschild-Tangherlini anti-de Sitter black hole, and we obtain the condition for the existence of extreme black holes. We indicate that the Gaussian smeared matter distribution, which is a special case of non-Gaussian smeared matter distributions, is not applicable for the six- and higher-dimensional black holes due to the hoop conjecture. In particular, the phase transition is analyzed in detail. Moreover, we point out that the Maxwell equal area law holds for the noncommutative black hole whose Hawking temperature is within a specific range, but fails for one whose Hawking temperature is beyond this range. (orig.)
Miao, Yan-Gang
2016-01-01
Considering non-Gaussian smeared matter distributions, we investigate thermodynamic behaviors of the noncommutative high-dimensional Schwarzschild-Tangherlini anti-de Sitter black hole, and obtain the condition for the existence of extreme black holes. We indicate that the Gaussian smeared matter distribution, which is a special case of non-Gaussian smeared matter distributions, is not applicable for the 6- and higher-dimensional black holes due to the hoop conjecture. In particular, the phase transition is analyzed in detail. Moreover, we point out that the Maxwell equal area law holds for the noncommutative black hole with the Hawking temperature within a specific range, but fails with the Hawking temperature beyond this range.
Directory of Open Access Journals (Sweden)
F. C. Cooper
2013-04-01
The fluctuation-dissipation theorem (FDT) has been proposed as a method of calculating the response of the earth's atmosphere to a forcing. For this problem the high dimensionality of the relevant data sets makes truncation necessary. Here we propose a method of truncation based upon the assumption that the response to a localised forcing is spatially localised, as an alternative to the standard method of choosing a number of the leading empirical orthogonal functions. For systems where this assumption holds, the response to any sufficiently small non-localised forcing may be estimated using a set of truncations that are chosen algorithmically. We test our algorithm using 36 and 72 variable versions of a stochastic Lorenz 95 system of ordinary differential equations. We find that, for long integrations, the bias in the response estimated by the FDT is reduced from ~75% of the true response to ~30%.
Directory of Open Access Journals (Sweden)
Ali Dashti
This paper presents an implementation of the brute-force exact k-Nearest Neighbor Graph (k-NNG) construction for ultra-large high-dimensional data clouds. The proposed method uses Graphics Processing Units (GPUs) and is scalable with multiple levels of parallelism (between nodes of a cluster, between different GPUs on a single node, and within a GPU). The method is applicable to homogeneous computing clusters with a varying number of nodes and GPUs per node. We achieve a 6-fold speedup in data processing as compared with an optimized method running on a cluster of CPUs and bring a hitherto impossible [Formula: see text]-NNG generation for a dataset of twenty million images with 15 k dimensionality into the realm of practical possibility.
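To make the term "brute-force exact k-NNG" concrete, here is a minimal single-threaded sketch in plain Python. It is not the multi-GPU implementation described above: it computes every pairwise Euclidean distance and keeps the k smallest per point. The function name `knn_graph` and the sample points are illustrative assumptions.

```python
import heapq
import math


def knn_graph(points, k):
    """Exact k-nearest-neighbour graph by exhaustive search.

    Returns a dict mapping each point index to the indices of its k nearest
    neighbours, closest first. The cost is O(n^2) distance evaluations, which
    is the part a GPU implementation would parallelise.
    """
    graph = {}
    for i, p in enumerate(points):
        # Distance from point i to every other point (self excluded).
        dists = [(math.dist(p, q), j) for j, q in enumerate(points) if j != i]
        graph[i] = [j for _, j in heapq.nsmallest(k, dists)]
    return graph


# Small made-up 2-D data set; real inputs would have thousands of dimensions.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (5.0, 5.0)]
graph = knn_graph(points, k=2)
print(graph)
```

The quadratic number of distance evaluations is what makes the exact approach expensive at the scale of millions of points, and it is also why the work splits naturally across devices: each point's neighbour list can be computed independently.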
McParland, D; Phillips, C M; Brennan, L; Roche, H M; Gormley, I C
2017-12-10
The LIPGENE-SU.VI.MAX study, like many others, recorded high-dimensional continuous phenotypic data and categorical genotypic data. LIPGENE-SU.VI.MAX focuses on the need to account for both phenotypic and genetic factors when studying the metabolic syndrome (MetS), a complex disorder that can lead to higher risk of type 2 diabetes and cardiovascular disease. Interest lies in clustering the LIPGENE-SU.VI.MAX participants into homogeneous groups or sub-phenotypes, by jointly considering their phenotypic and genotypic data, and in determining which variables are discriminatory. A novel latent variable model that elegantly accommodates high dimensional, mixed data is developed to cluster LIPGENE-SU.VI.MAX participants using a Bayesian finite mixture model. A computationally efficient variable selection algorithm is incorporated, estimation is via a Gibbs sampling algorithm and an approximate BIC-MCMC criterion is developed to select the optimal model. Two clusters or sub-phenotypes ('healthy' and 'at risk') are uncovered. A small subset of variables is deemed discriminatory, which notably includes phenotypic and genotypic variables, highlighting the need to jointly consider both factors. Further, 7 years after the LIPGENE-SU.VI.MAX data were collected, participants underwent further analysis to diagnose presence or absence of the MetS. The two uncovered sub-phenotypes strongly correspond to the 7-year follow-up disease classification, highlighting the role of phenotypic and genotypic factors in the MetS and emphasising the potential utility of the clustering approach in early screening. Additionally, the ability of the proposed approach to define the uncertainty in sub-phenotype membership at the participant level is synonymous with the concepts of precision medicine and nutrition. Copyright © 2017 John Wiley & Sons, Ltd.
Buncher system parameter optimization
International Nuclear Information System (INIS)
Wadlinger, E.A.
1981-01-01
A least-squares algorithm is presented to calculate the RF amplitudes and cavity spacings for a series of buncher cavities each resonating at a frequency that is a multiple of a fundamental frequency of interest. The longitudinal phase-space distribution, obtained by particle tracing through the bunching system, is compared to a desired distribution function of energy and phase. The buncher cavity parameters are adjusted to minimize the difference between these two distributions. Examples are given for zero space charge. The manner in which the method can be extended to include space charge using the 3-D space-charge calculation procedure is indicated.
International Nuclear Information System (INIS)
Shayganpour, A; Izman, S; Idris, M H; Jafari, H
2012-01-01
Lost foam casting, a relatively new manufacturing process, is extensively employed to produce sound, complicated castings. In this study, an experimental investigation of lost foam casting of an Al-Si-Cu aluminium cast alloy was conducted. The research was aimed at evaluating the effect of different pouring temperatures, slurry viscosities, vibration durations and sand grain sizes on the eutectic silicon spacing of thin-wall castings. A stepped pattern was used in the study and the investigations focused on the thinnest 3 mm section. A full two-level factorial experimental design was used to plan the experiments and afterwards to identify the significant factors affecting casting silicon spacing. The results showed that pouring temperature and its interaction with vibration time have a pronounced effect on eutectic silicon phase size. Increasing pouring temperature coarsened the eutectic silicon spacing, while a higher vibration time diminished the coarsening effect. Moreover, no significant effects on silicon spacing were found with variation of sand size and slurry viscosity.
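A full two-level factorial design over the four factors named above has 2^4 = 16 runs. The sketch below enumerates such a design in coded units (-1 for the low level, +1 for the high level) and estimates main and interaction effects as the difference between mean responses at the two levels of a contrast. The response values are synthetic and only illustrate the arithmetic; they are not measurements from the study, and the factor names are shorthand chosen here.

```python
from itertools import product
from math import prod

factors = ["pouring_temperature", "slurry_viscosity", "vibration_time", "sand_size"]

# Every combination of low (-1) and high (+1) levels: 2**4 = 16 runs.
design = list(product((-1, 1), repeat=len(factors)))


def effect(design, responses, columns):
    """Mean response where the contrast is +1 minus mean response where it is -1.

    One column index gives a main effect; several indices give the interaction
    effect of those factors (the contrast is the product of their coded levels).
    """
    high = [y for run, y in zip(design, responses) if prod(run[c] for c in columns) == 1]
    low = [y for run, y in zip(design, responses) if prod(run[c] for c in columns) == -1]
    return sum(high) / len(high) - sum(low) / len(low)


# Synthetic response: a temperature main effect plus a temperature x vibration
# interaction of opposite sign (made-up numbers for illustration only).
responses = [10.0 + 2.0 * run[0] - 0.5 * run[0] * run[2] for run in design]

main_effects = {name: effect(design, responses, [i]) for i, name in enumerate(factors)}
temp_x_vibration = effect(design, responses, [0, 2])
print(main_effects, temp_x_vibration)
```

Because the design is balanced, each effect estimate is unaffected by the others, which is what allows a significant interaction to be separated from the main effects.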
Directory of Open Access Journals (Sweden)
Namysłowska-Wilczyńska Barbara
2016-09-01
This paper presents selected results of research connected with the development of a (3D) geostatistical hydrogeochemical model of the Kłodzko Drainage Basin, dedicated to the spatial variation in the different quality parameters of underground water in the water intake area (SW part of Poland). The research covers the period 2011-2012. Spatial analyses of the variation in various quality parameters, i.e., contents of iron, manganese, ammonium ion, nitrate ion, phosphate ion and total organic carbon, as well as pH, redox potential and temperature, were carried out on the basis of the chemical determinations of the quality parameters of underground water samples taken from the wells in the water intake area. Spatial variation in the parameters was analyzed on the basis of data obtained (November 2011) from tests of water taken from 14 existing wells with a depth ranging from 9.5 to 38.0 m b.g.l. The latest data (January 2012) were obtained from 3 new piezometers, made in other locations in the relevant area. The depth of these piezometers is 9-10 m.
Directory of Open Access Journals (Sweden)
Malgorzata Nowicka
2017-05-01
High dimensional mass and flow cytometry (HDCyto) experiments have become a method of choice for high throughput interrogation and characterization of cell populations. Here, we present an R-based pipeline for differential analyses of HDCyto data, largely based on Bioconductor packages. We computationally define cell populations using FlowSOM clustering, and facilitate an optional but reproducible strategy for manual merging of algorithm-generated clusters. Our workflow offers different analysis paths, including association of cell type abundance with a phenotype or changes in signaling markers within specific subpopulations, or differential analyses of aggregated signals. Importantly, the differential analyses we show are based on regression frameworks where the HDCyto data is the response; thus, we are able to model arbitrary experimental designs, such as those with batch effects, paired designs and so on. In particular, we apply generalized linear mixed models to analyses of cell population abundance or cell-population-specific analyses of signaling markers, allowing overdispersion in cell count or aggregated signals across samples to be appropriately modeled. To support the formal statistical analyses, we encourage exploratory data analysis at every step, including quality control (e.g. multi-dimensional scaling plots), reporting of clustering results (dimensionality reduction, heatmaps with dendrograms) and differential analyses (e.g. plots of aggregated signals).
Regis, Rommel G.
2014-02-01
This article develops two new algorithms for constrained expensive black-box optimization that use radial basis function surrogates for the objective and constraint functions. These algorithms are called COBRA and Extended ConstrLMSRBF and, unlike previous surrogate-based approaches, they can be used for high-dimensional problems where all initial points are infeasible. They both follow a two-phase approach where the first phase finds a feasible point while the second phase improves this feasible point. COBRA and Extended ConstrLMSRBF are compared with alternative methods on 20 test problems and on the MOPTA08 benchmark automotive problem (D.R. Jones, Presented at MOPTA 2008), which has 124 decision variables and 68 black-box inequality constraints. The alternatives include a sequential penalty derivative-free algorithm, a direct search method with kriging surrogates, and two multistart methods. Numerical results show that COBRA algorithms are competitive with Extended ConstrLMSRBF and they generally outperform the alternatives on the MOPTA08 problem and most of the test problems.
Schröder, Markus; Meyer, Hans-Dieter
2017-08-01
We propose a Monte Carlo method, "Monte Carlo Potfit," for transforming high-dimensional potential energy surfaces evaluated on discrete grid points into a sum-of-products form, more precisely into a Tucker form. To this end we use a variational ansatz in which we replace numerically exact integrals with Monte Carlo integrals. This largely reduces the numerical cost by avoiding the evaluation of the potential on all grid points and allows a treatment of surfaces with up to 15-18 degrees of freedom. We furthermore show that the error made with this ansatz can be controlled and vanishes in certain limits. We present calculations on the potential of HFCO to demonstrate the features of the algorithm. To demonstrate the power of the method, we transformed a 15D potential of the protonated water dimer (Zundel cation) into a sum-of-products form and calculated the ground and lowest 26 vibrationally excited states of the Zundel cation with the multi-configuration time-dependent Hartree method.
Meng, Xi; Nguyen, Bao D.; Ridge, Clark; Shaka, A. J.
2009-01-01
High-dimensional (HD) NMR spectra have poorer digital resolution than low-dimensional (LD) spectra, for a fixed amount of experiment time. This has led to “reduced-dimensionality” strategies, in which several LD projections of the HD NMR spectrum are acquired, each with higher digital resolution; an approximate HD spectrum is then inferred by some means. We propose a strategy that moves in the opposite direction, by adding more time dimensions to increase the information content of the data set, even if only a very sparse time grid is used in each dimension. The full HD time-domain data can be analyzed by the Filter Diagonalization Method (FDM), yielding very narrow resonances along all of the frequency axes, even those with sparse sampling. Integrating over the added dimensions of HD FDM NMR spectra reconstitutes LD spectra with enhanced resolution, often more quickly than direct acquisition of the LD spectrum with a larger number of grid points in each of the fewer dimensions. If the extra dimensions do not appear in the final spectrum, and are used solely to boost information content, we propose the moniker hidden-dimension NMR. This work shows that HD peaks have unmistakable frequency signatures that can be detected as single HD objects by an appropriate algorithm, even though their patterns would be tricky for a human operator to visualize or recognize, and even if digital resolution in an HD FT spectrum is very coarse compared with natural line widths. PMID:18926747
Chiu, Mei Choi; Pun, Chi Seng; Wong, Hoi Ying
2017-08-01
Investors interested in the global financial market must analyze financial securities internationally. Making an optimal global investment decision involves processing a huge amount of data for a high-dimensional portfolio. This article investigates the big data challenges of two mean-variance optimal portfolios: continuous-time precommitment and constant-rebalancing strategies. We show that both optimized portfolios implemented with the traditional sample estimates converge to the worst performing portfolio when the portfolio size becomes large. The crux of the problem is the estimation error accumulated from the huge dimension of stock data. We then propose a linear programming optimal (LPO) portfolio framework, which applies a constrained ℓ1 minimization to the theoretical optimal control to mitigate the risk associated with the dimensionality issue. The resulting portfolio becomes a sparse portfolio that selects stocks with a data-driven procedure and hence offers a stable mean-variance portfolio in practice. When the number of observations becomes large, the LPO portfolio converges to the oracle optimal portfolio, which is free of estimation error, even though the number of stocks grows faster than the number of observations. Our numerical and empirical studies demonstrate the superiority of the proposed approach. © 2017 Society for Risk Analysis.
Cavaglieri, Daniele; Bewley, Thomas
2015-04-01
Implicit/explicit (IMEX) Runge-Kutta (RK) schemes are effective for time-marching ODE systems with both stiff and nonstiff terms on the RHS; such schemes implement an (often A-stable or better) implicit RK scheme for the stiff part of the ODE, which is often linear, and, simultaneously, a (more convenient) explicit RK scheme for the nonstiff part of the ODE, which is often nonlinear. Low-storage RK schemes are especially effective for time-marching high-dimensional ODE discretizations of PDE systems on modern (cache-based) computational hardware, in which memory management is often the most significant computational bottleneck. In this paper, we develop and characterize eight new low-storage implicit/explicit RK schemes which have higher accuracy and better stability properties than the only low-storage implicit/explicit RK scheme available previously, the venerable second-order Crank-Nicolson/Runge-Kutta-Wray (CN/RKW3) algorithm that has dominated the DNS/LES literature for the last 25 years, while requiring similar storage (two, three, or four registers of length N) and comparable floating-point operations per timestep.
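The implicit/explicit splitting idea can be shown with the simplest possible member of the family, a first-order IMEX Euler step. It is not one of the low-storage schemes developed in the paper. The sketch assumes a scalar ODE u' = -lam*u + f(u), treats the stiff linear term implicitly and the nonstiff nonlinear term explicitly, and compares the result with fully explicit Euler at the same step size. The values of `lam`, `h`, and the choice f(u) = cos(u) are illustrative assumptions.

```python
import math


def imex_euler_step(u, h, lam, f):
    """One IMEX Euler step for u' = -lam*u + f(u).

    Stiff linear part implicit, nonstiff part explicit:
        u_new = u + h*(-lam*u_new + f(u))  =>  u_new = (u + h*f(u)) / (1 + h*lam)
    """
    return (u + h * f(u)) / (1.0 + h * lam)


def explicit_euler_step(u, h, lam, f):
    """Fully explicit Euler step for the same ODE, for comparison."""
    return u + h * (-lam * u + f(u))


lam = 1000.0   # stiff decay rate (assumed for illustration)
h = 0.01       # time step far above the explicit stability limit 2/lam
f = math.cos   # bounded nonstiff, nonlinear term (assumed for illustration)

u_imex = 1.0
u_explicit = 1.0
for _ in range(50):
    u_imex = imex_euler_step(u_imex, h, lam, f)
    u_explicit = explicit_euler_step(u_explicit, h, lam, f)

print(u_imex, u_explicit)
```

With these settings the IMEX iterate settles near the steady state where lam*u = cos(u), while the explicit iterate grows without bound, which is the practical motivation for treating the stiff part implicitly.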
Tan, Ivy; Storelvmo, Trude
2015-04-01
Substantial improvements have been made to the cloud microphysical schemes used in the latest generation of global climate models (GCMs). However, an outstanding weakness of these schemes lies in the arbitrariness of their tuning parameters, which are also notoriously fraught with uncertainties. Despite the growing effort in improving the cloud microphysical schemes in GCMs, most of this effort has neglected to focus on improving the ability of GCMs to accurately simulate the present-day global distribution of thermodynamic phase partitioning in mixed-phase clouds. Liquid droplets and ice crystals influence the Earth's radiative budget, and hence climate sensitivity, not only via their contrasting optical properties but also through the effects of their lifetimes in the atmosphere. The current study employs NCAR's CAM5.1 and uses observations of cloud phase obtained by NASA's CALIOP lidar over a 79-month period (November 2007 to June 2014) to guide the accurate simulation of the global distribution of mixed-phase clouds in 20° latitudinal bands at the -10 °C, -20 °C and -30 °C isotherms, by adjusting six relevant cloud microphysical tuning parameters in CAM5.1 via Quasi-Monte Carlo sampling. The parameters include those that control the Wegener-Bergeron-Findeisen (WBF) timescale for the conversion of supercooled liquid droplets to ice and snow in mixed-phase clouds, the fraction of ice nuclei that nucleate ice in the atmosphere, ice crystal sedimentation speed, and wet scavenging in stratiform and convective clouds. Using a Generalized Linear Model as a variance-based sensitivity analysis, the relative contributions of each of the six parameters are quantified to gain a better understanding of the importance of their individual and two-way interaction effects on the liquid to ice proportion in mixed-phase clouds. Thus, the methodology implemented in the current study aims to search for the combination of cloud microphysical parameters in a GCM that
DEFF Research Database (Denmark)
Nutzman, Philip; Gilliland, Ronald L.; McCullough, Peter R.
2011-01-01
We present observations of three distinct transits of HD 17156b obtained with the Fine Guidance Sensors on board the Hubble Space Telescope. We analyzed both the transit photometry and previously published radial velocities to find the planet-star radius ratio Rp/R⋆ = 0.07454 ± 0.00035, in... -composition gas giant of the same mass and equilibrium temperature. For the three transits, we determine the times of mid-transit to a precision of 6.2 s, 7.6 s, and 6.9 s, and the transit times for HD 17156 do not show any significant departures from a constant period. The joint analysis of transit photometry...
Directory of Open Access Journals (Sweden)
Qingjun Zhang
2014-01-01
This paper proposes a novel image formation algorithm for bistatic synthetic aperture radar (BiSAR) configured with a noncooperative transmitter and a stationary receiver, a configuration in which the traditional imaging algorithm fails because the necessary imaging parameters cannot be estimated from the limited information supplied by the noncooperative data provider. In the new algorithm, the essential parameters for imaging, such as squint angle, Doppler centroid, and Doppler chirp-rate, are estimated by full exploration of the recorded direct signal (the direct signal is the echo travelling directly from the satellite transmitter to the stationary receiver). The Doppler chirp-rate is retrieved by modeling the peak phase of the direct signal as a quadratic polynomial. The Doppler centroid frequency and the squint angle can be derived from the image contrast optimization. Then the range focusing, the range cell migration correction (RCMC), and the azimuth focusing are implemented by secondary range compression (SRC) and the range cell migration, respectively. Finally, the proposed algorithm is validated by imaging of a BiSAR experiment configured with China's YAOGAN 10 SAR as the transmitter and the receiver platform located on a building at a height of 109 m in Jiangsu province. The experimental image with geometric correction shows good agreement with local Google images.
Ireland, Gareth; North, Matthew R.; Petropoulos, George P.; Srivastava, Prashant K.; Hodges, Crona
2015-04-01
Acquiring accurate information on the spatio-temporal variability of soil moisture content (SM) and evapotranspiration (ET) is of key importance to extend our understanding of the Earth system's physical processes, and is also required in a wide range of multi-disciplinary research studies and applications. Earth Observation (EO) technology provides an economically feasible way to derive continuous spatio-temporal estimates of key parameters characterising land surface interactions, including ET as well as SM. Such information is of key value to practitioners, decision makers and scientists alike. The PREMIER-EO project, recently funded by High Performance Computing Wales (HPCW), is a research initiative directed towards developing a better understanding of EO technology's present ability to derive operational estimations of surface fluxes and SM. Moreover, the project aims at addressing knowledge gaps related to the operational estimation of such parameters, and thus at contributing to ongoing global efforts to enhance the accuracy of those products. In this presentation we introduce the PREMIER-EO project, providing a detailed overview of the research aims and objectives for the 1 year duration of the project's implementation. Subsequently, we make available the initial results of the work carried out herein, in particular those related to an all-inclusive and robust evaluation of the accuracy of existing operational products of ET and SM from different ecosystems globally. The research outcomes of this project, once completed, will provide an important contribution towards addressing the knowledge gaps related to the operational estimation of ET and SM. The project's results will also support ongoing global efforts towards the operational development of related products using technologically advanced EO instruments which were launched recently or are planned to be launched in the next 1-2 years. Key Words: PREMIER
International Nuclear Information System (INIS)
Aschwanden, Markus J.; Zhang, Jie; Liu, Kai
2013-01-01
We extend a previous statistical solar flare study of 155 GOES M- and X-class flares observed with AIA/SDO to all seven coronal wavelengths (94, 131, 171, 193, 211, 304, and 335 Å) to test the wavelength dependence of scaling laws and statistical distributions. Except for the 171 and 193 Å wavelengths, which are affected by EUV dimming caused by coronal mass ejections (CMEs), we find near-identical size distributions of geometric (lengths $L$, flare areas $A$, volumes $V$, and fractal dimension $D_2$), temporal (flare durations $T$), and spatio-temporal parameters (diffusion coefficient $\kappa$, spreading exponent $\beta$, and maximum expansion velocities $v_{\max}$) in different wavelengths, which are consistent with the universal predictions of the fractal-diffusive avalanche model of a slowly driven, self-organized criticality (FD-SOC) system, i.e., $N(L) \propto L^{-3}$, $N(A) \propto A^{-2}$, $N(V) \propto V^{-5/3}$, $N(T) \propto T^{-2}$, and $D_2 = 3/2$, for a Euclidean dimension $d = 3$. Empirically, we find also a new strong correlation $\kappa \propto L^{0.94 \pm 0.01}$ and the three-parameter scaling law $L \propto \kappa\, T^{0.1}$, which is more consistent with the logistic-growth model than with classical diffusion. The findings suggest long-range correlation lengths in the FD-SOC system that operate in the vicinity of a critical state, which could be used for predictions of individual extreme events. We find also that eruptive flares (with accompanying CMEs) have larger volumes V, longer flare durations T, higher EUV and soft X-ray fluxes, and somewhat larger diffusion coefficients κ than confined flares (without CMEs).
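An empirical power-law correlation such as the one reported between the diffusion coefficient κ and the length scale L is commonly estimated by a straight-line fit in log-log space. The sketch below shows that generic procedure on synthetic, noise-free data. It is not the fitting method or the data of the study: the helper `fit_power_law`, the prefactor 3.0, and the sample lengths are assumptions made for illustration.

```python
import math


def fit_power_law(x, y):
    """Least-squares fit of y = c * x**a via a straight line in log-log space.

    Returns (a, c): the exponent and the prefactor.
    """
    lx = [math.log(v) for v in x]
    ly = [math.log(v) for v in y]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    sxx = sum((u - mean_x) ** 2 for u in lx)
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(lx, ly))
    a = sxy / sxx
    c = math.exp(mean_y - a * mean_x)
    return a, c


# Synthetic data generated from kappa = 3.0 * L**0.94 (made-up prefactor).
lengths = [5.0, 10.0, 20.0, 40.0, 80.0]
kappa = [3.0 * length ** 0.94 for length in lengths]

exponent, prefactor = fit_power_law(lengths, kappa)
print(round(exponent, 3), round(prefactor, 3))
```

On real measurements the points scatter around the line, and the uncertainty of the slope gives an error bar on the exponent of the kind quoted in the abstract.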
Binder, Harald; Porzelius, Christine; Schumacher, Martin
2011-03-01
Analysis of molecular data promises identification of biomarkers for improving prognostic models, thus potentially enabling better patient management. For identifying such biomarkers, risk prediction models can be employed that link high-dimensional molecular covariate data to a clinical endpoint. In low-dimensional settings, a multitude of statistical techniques already exists for building such models, e.g. allowing for variable selection or for quantifying the added value of a new biomarker. We provide an overview of techniques for regularized estimation that transfer this toward high-dimensional settings, with a focus on models for time-to-event endpoints. Techniques for incorporating specific covariate structure are discussed, as well as techniques for dealing with more complex endpoints. Employing gene expression data from patients with diffuse large B-cell lymphoma, some typical modeling issues from low-dimensional settings are illustrated in a high-dimensional application. First, the performance of classical stepwise regression is compared to stage-wise regression, as implemented by a component-wise likelihood-based boosting approach. A second issue arises when artificially transforming the response into a binary variable. The effects of the resulting loss of efficiency and potential bias in a high-dimensional setting are illustrated, and a link to competing risks models is provided. Finally, we discuss conditions for adequately quantifying the added value of high-dimensional gene expression measurements, both at the stage of model fitting and when performing evaluation. Copyright © 2011 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.