parallel genetic algorithm: Topics by WorldWideScience.org

Sample records for parallel genetic algorithm

Optimal Design of Passive Power Filters Based on Pseudo-parallel Genetic Algorithm

Science.gov (United States)

Li, Pei; Li, Hongbo; Gao, Nannan; Niu, Lin; Guo, Liangfeng; Pei, Ying; Zhang, Yanyan; Xu, Minmin; Chen, Kerui

2017-05-01

Economic cost and filter efficiency are taken as the targets for optimizing the parameters of a passive filter. A pseudo-parallel genetic algorithm is combined with an adaptive genetic algorithm: in the early stages the pseudo-parallel genetic algorithm is used to increase population diversity, and in the late stages the adaptive genetic algorithm is used to reduce the workload. In addition, the migration rate of the pseudo-parallel genetic algorithm is made to change adaptively with population diversity. Simulation results show that the filter designed by the proposed method has a better filtering effect at a lower economic cost and can be used in engineering.
A Parallel Genetic Algorithm for Automated Electronic Circuit Design

Science.gov (United States)

Lohn, Jason D.; Colombano, Silvano P.; Haith, Gary L.; Stassinopoulos, Dimitris

2000-01-01

Parallelized versions of genetic algorithms (GAs) are popular primarily for three reasons: the GA is an inherently parallel algorithm, typical GA applications are very compute intensive, and powerful computing platforms, especially Beowulf-style computing clusters, are becoming more affordable and easier to implement. In addition, the low communication bandwidth required allows the use of inexpensive networking hardware such as standard office ethernet. In this paper we describe a parallel GA and its use in automated high-level circuit design. Genetic algorithms are a type of trial-and-error search technique that is guided by principles of Darwinian evolution. Just as the genetic material of two living organisms can intermix to produce offspring that are better adapted to their environment, GAs expose genetic material, frequently strings of 1s and 0s, to the forces of artificial evolution: selection, mutation, recombination, etc. GAs start with a pool of randomly generated candidate solutions, which are then tested and scored with respect to their utility. Solutions are then bred by probabilistically selecting high-quality parents and recombining their genetic representations to produce offspring solutions. Offspring are typically subjected to a small amount of random mutation. After a pool of offspring is produced, this process iterates until a satisfactory solution is found or an iteration limit is reached. Genetic algorithms have been applied to a wide variety of problems in many fields, including chemistry, biology, and many engineering disciplines. There are many styles of parallelism used in implementing parallel GAs. One such method is called the master-slave or processor farm approach. In this technique, slave nodes are used solely to compute fitness evaluations (the most time-consuming part). The master processor collects fitness scores from the nodes and performs the genetic operators (selection, reproduction, variation, etc.). Because of dependency
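The master-slave (processor farm) scheme described in this abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' circuit-design system: the bit-counting `fitness` function, the parameter values, binary tournament selection, and the use of Python's `concurrent.futures.ThreadPoolExecutor` as the worker pool are all choices made here for illustration.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def fitness(bits):
    # Toy objective (an assumption): count the 1s in the bit string.
    return sum(bits)


def evolve(pop_size=20, length=16, generations=30, workers=4, seed=0):
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for _ in range(generations):
            # Workers only score candidates; the master does everything else.
            scores = list(pool.map(fitness, pop))

            def pick():
                # Binary tournament selection, performed by the master.
                a, b = rng.randrange(pop_size), rng.randrange(pop_size)
                return pop[a] if scores[a] >= scores[b] else pop[b]

            children = []
            while len(children) < pop_size:
                p1, p2 = pick(), pick()
                cut = rng.randrange(1, length)        # one-point crossover
                child = p1[:cut] + p2[cut:]
                # Flip each bit with a small probability (mutation).
                child = [b ^ 1 if rng.random() < 0.02 else b for b in child]
                children.append(child)
            pop = children
        scores = list(pool.map(fitness, pop))
    best = max(range(pop_size), key=scores.__getitem__)
    return pop[best], scores[best]


if __name__ == "__main__":
    print(evolve())
```

Threads keep the sketch easy to run; for a CPU-bound fitness function in CPython, a process pool or a cluster scheduler would be the more realistic worker backend.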
Parallel genetic algorithms with migration for the hybrid flow shop scheduling problem

Directory of Open Access Journals (Sweden)

K. Belkadi

2006-01-01

This paper addresses scheduling problems in hybrid flow shop-like systems with a migration parallel genetic algorithm (PGA_MIG). This parallel genetic algorithm model preserves genetic diversity by applying selection and reproduction mechanisms that are closer to nature. The spatial structure of the population is modified by dividing it into disjoint subpopulations. From time to time, individuals are exchanged between the different subpopulations (migration). The influence of parameters and dedicated strategies is studied. These parameters are the number of independent subpopulations, the interconnection topology between subpopulations, the choice/replacement strategy of the migrant individuals, and the migration frequency. A comparison between the sequential and parallel versions of the genetic algorithm (GA) is provided. This comparison relates to the quality of the solution and the execution time of the two versions. The efficiency of the parallel model depends strongly on the parameters, especially the migration frequency. Likewise, the parallel model gives a significant improvement in computational time if it is implemented on a parallel architecture that offers an acceptable number of processors (as many processors as subpopulations).
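The parameters listed in this abstract (number of subpopulations, topology, choice/replacement strategy, migration frequency) map directly onto a small island-model sketch. The version below is illustrative only and is not PGA_MIG: it assumes a ring topology, a best-replaces-worst migration policy, a toy bit-counting `fitness`, and it runs the islands one after another in a single process, whereas a parallel implementation would give each island its own processor.

```python
import random


def fitness(bits):
    # Toy objective (an assumption): count the 1s in the bit string.
    return sum(bits)


def step(island, rng, mutation=0.02):
    """One generation on one island: tournament selection, crossover, mutation."""
    length = len(island[0])

    def pick():
        a, b = rng.choice(island), rng.choice(island)
        return a if fitness(a) >= fitness(b) else b

    nxt = []
    for _ in range(len(island)):
        p1, p2 = pick(), pick()
        cut = rng.randrange(1, length)
        child = p1[:cut] + p2[cut:]
        nxt.append([b ^ 1 if rng.random() < mutation else b for b in child])
    return nxt


def migrate_ring(islands, k=1):
    """Copy each island's k best individuals over the k worst of the next island."""
    # Select all migrants first so that every island sends pre-migration individuals.
    migrants = [sorted(isl, key=fitness, reverse=True)[:k] for isl in islands]
    for i, isl in enumerate(islands):
        incoming = migrants[(i - 1) % len(islands)]
        isl.sort(key=fitness)                 # worst individuals first
        isl[:k] = [m[:] for m in incoming]    # replace them with copies


def run(n_islands=4, island_size=10, length=16, generations=40,
        migration_interval=5, seed=1):
    rng = random.Random(seed)
    islands = [[[rng.randint(0, 1) for _ in range(length)]
                for _ in range(island_size)] for _ in range(n_islands)]
    for g in range(1, generations + 1):
        islands = [step(isl, rng) for isl in islands]
        if g % migration_interval == 0:       # migration frequency
            migrate_ring(islands)
    return islands
```

Changing `migration_interval`, `k`, or the neighbour rule in `migrate_ring` is the kind of parameter study the abstract refers to.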
Predicting mining activity with parallel genetic algorithms

Science.gov (United States)

Talaie, S.; Leigh, R.; Louis, S.J.; Raines, G.L.; Beyer, H.G.; O'Reilly, U.M.; Banzhaf, Arnold D.; Blum, W.; Bonabeau, C.; Cantu-Paz, E.W.

2005-01-01

We explore several different techniques in our quest to improve the overall model performance of a genetic algorithm calibrated probabilistic cellular automata. We use the Kappa statistic to measure correlation between ground truth data and data predicted by the model. Within the genetic algorithm, we introduce a new evaluation function sensitive to spatial correctness and we explore the idea of evolving different rule parameters for different subregions of the land. We reduce the time required to run a simulation from 6 hours to 10 minutes by parallelizing the code and employing a 10-node cluster. Our empirical results suggest that using the spatially sensitive evaluation function does indeed improve the performance of the model and our preliminary results also show that evolving different rule parameters for different regions tends to improve overall model performance. Copyright 2005 ACM.
High-speed detection of emergent market clustering via an unsupervised parallel genetic algorithm

Directory of Open Access Journals (Sweden)

Dieter Hendricks

2016-02-01

We implement a master-slave parallel genetic algorithm with a bespoke log-likelihood fitness function to identify emergent clusters within price evolutions. We use graphics processing units (GPUs) to implement a parallel genetic algorithm and visualise the results using disjoint minimal spanning trees. We demonstrate that our GPU parallel genetic algorithm, implemented on a commercially available general purpose GPU, is able to recover stock clusters in sub-second speed, based on a subset of stocks in the South African market. This approach represents a pragmatic choice for low-cost, scalable parallel computing and is significantly faster than a prototype serial implementation in an optimised C-based fourth-generation programming language, although the results are not directly comparable because of compiler differences. Combined with fast online intraday correlation matrix estimation from high frequency data for cluster identification, the proposed implementation offers cost-effective, near-real-time risk assessment for financial practitioners.
Optical flow optimization using parallel genetic algorithm

Science.gov (United States)

Zavala-Romero, Olmo; Botella, Guillermo; Meyer-Bäse, Anke; Meyer Base, Uwe

2011-06-01

A new approach to optimizing the parameters of a gradient-based optical flow model using a parallel genetic algorithm (GA) is proposed. The main characteristics of the optical flow algorithm are its bio-inspiration and its robustness against contrast, static patterns and noise; it also works consistently with several optical illusions where other algorithms fail. The model depends on many parameters, including the number of channels, the orientations required, and the length and shape of the kernel functions used in the convolution stage, among many more. The GA is used to find a set of parameters that improves the accuracy of the optical flow on inputs where ground-truth data is available. This set of parameters helps to understand which of them are better suited to each type of input and can be used to estimate the parameters of the optical flow algorithm when it is used with videos that share similar characteristics. The proposed implementation takes into account the embarrassingly parallel nature of the GA and uses the OpenMP Application Programming Interface (API) to speed up the process of estimating an optimal set of parameters. The information obtained in this work can be used to dynamically reconfigure systems, with potential applications in robotics, medical imaging and tracking.
Genetic algorithms

Science.gov (United States)

Wang, Lui; Bayer, Steven E.

1991-01-01

Genetic algorithms are mathematical, highly parallel, adaptive search procedures (i.e., problem solving methods) based loosely on the processes of natural genetics and Darwinian survival of the fittest. Basic genetic algorithms concepts are introduced, genetic algorithm applications are introduced, and results are presented from a project to develop a software tool that will enable the widespread use of genetic algorithm technology.
A parallel attractor-finding algorithm based on Boolean satisfiability for genetic regulatory networks.

Directory of Open Access Journals (Sweden)

Wensheng Guo

In biological systems, the dynamic analysis method has gained increasing attention in the past decade. The Boolean network is the most common model of a genetic regulatory network. The interactions of activation and inhibition in the genetic regulatory network are modeled as a set of functions of the Boolean network, while the state transitions in the Boolean network reflect the dynamic property of a genetic regulatory network. A difficult problem for state transition analysis is the finding of attractors. In this paper, we modeled the genetic regulatory network as a Boolean network and proposed a solving algorithm to tackle the attractor finding problem. In the proposed algorithm, we partitioned the Boolean network into several blocks consisting of the strongly connected components according to their gradients, and defined the connections between blocks as decision nodes. Based on the solutions calculated on the decision nodes and using a satisfiability solving algorithm, we identified the attractors in the state transition graph of each block. The proposed algorithm is benchmarked on a variety of genetic regulatory networks. Compared with existing algorithms, it achieved similar performance on small test cases and outperformed them on larger and more complex ones, which is the direction in which modern genetic regulatory networks are trending. Furthermore, while the existing satisfiability-based algorithms cannot be parallelized due to their inherent algorithm design, the proposed algorithm exhibits good scalability on parallel computing architectures.
A Parallel Genetic Algorithm for Automated Electronic Circuit Design

Science.gov (United States)

Lohn, Jason D.; Colombano, Silvano P.; Haith, Gary L.; Stassinopoulos, Dimitris; Norvig, Peter (Technical Monitor)

2000-01-01

We describe a parallel genetic algorithm (GA) that automatically generates circuit designs using evolutionary search. A circuit-construction programming language is introduced and we show how evolution can generate practical analog circuit designs. Our system allows circuit size (number of devices), circuit topology, and device values to be evolved. We present experimental results as applied to analog filter and amplifier design tasks.
Development of a parallel genetic algorithm using MPI and its application in a nuclear reactor core. Design optimization

International Nuclear Information System (INIS)

Waintraub, Marcel; Pereira, Claudio M.N.A.; Baptista, Rafael P.

2005-01-01

This work presents the development of a distributed parallel genetic algorithm applied to a nuclear reactor core design optimization. In the implementation of the parallelism, a 'Message Passing Interface' (MPI) library, standard for parallel computation in distributed memory platforms, has been used. Another important characteristic of MPI is its portability for various architectures. The main objectives of this paper are: validation of the results obtained by the application of this algorithm in a nuclear reactor core optimization problem, through comparisons with previous results presented by Pereira et al.; and performance test of the Brazilian Nuclear Engineering Institute (IEN) cluster in reactors physics optimization problems. The experiments demonstrated that the developed parallel genetic algorithm using the MPI library presented significant gains in the obtained results and an accentuated reduction of the processing time. Such results ratify the use of the parallel genetic algorithms for the solution of nuclear reactor core optimization problems. (author)
Optimization of tokamak plasma equilibrium shape using parallel genetic algorithms

International Nuclear Information System (INIS)

Zhulin An; Bin Wu; Lijian Qiu

2006-01-01

In tokamaks with a non-circular cross section, the plasma equilibrium shape has a strong influence on confinement and MHD stability. The plasma equilibrium shape is determined by the configuration of the poloidal field (PF) system. Usually there are many PF systems that could support the specified plasma equilibrium; they differ in the number of coils used and in their positions, sizes and currents. It is necessary to find the optimal choice that meets the engineering constraints, which is often done by constrained optimization. A Genetic Algorithms (GAs) based method has been used to solve the optimization problem, but its time complexity prevents the algorithms from becoming widely used. Because of the large search space of the optimization, it takes several hours to get a good result. The inherent parallelism in GAs can be exploited to enhance their search efficiency. In this paper, we introduce a parallel genetic algorithms (PGAs) based approach which can reduce the computational time. The algorithm has a master-slave structure: the slaves explore the search space separately and return the results to the master. A program is also developed, and it can run on any computer that supports the message passing interface. Both the algorithm and the program are discussed in detail in the paper. We also include an application that uses the program to determine the positions and currents of PF coils in EAST. The program reaches the target value within half an hour and yields a speedup of 5.21 on 8 CPUs. (author)
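As a rough reading of the reported speedup, one can estimate the serial (non-parallelized) fraction it implies using the Karp-Flatt metric. This metric is not used in the abstract; it is brought in here as an assumption, with $S$ the measured speedup and $p$ the number of CPUs:

```latex
e \;=\; \frac{\dfrac{1}{S} - \dfrac{1}{p}}{1 - \dfrac{1}{p}}
  \;=\; \frac{\dfrac{1}{5.21} - \dfrac{1}{8}}{1 - \dfrac{1}{8}}
  \;\approx\; \frac{0.1919 - 0.1250}{0.8750}
  \;\approx\; 0.077
```

Under this reading, roughly 8% of the run time behaves as if it were serial work or parallel overhead, which would include the master collecting results from the slaves.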
Creating IRT-Based Parallel Test Forms Using the Genetic Algorithm Method

Science.gov (United States)

Sun, Koun-Tem; Chen, Yu-Jen; Tsai, Shu-Yen; Cheng, Chien-Fen

2008-01-01

In educational measurement, the construction of parallel test forms is often a combinatorial optimization problem that involves the time-consuming selection of items to construct tests having approximately the same test information functions (TIFs) and constraints. This article proposes a novel method, genetic algorithm (GA), to construct parallel…
Parallel Genetic Algorithms for calibrating Cellular Automata models: Application to lava flows

International Nuclear Information System (INIS)

D'Ambrosio, D.; Spataro, W.; Di Gregorio, S.; Calabria Univ., Cosenza; Crisci, G.M.; Rongo, R.; Calabria Univ., Cosenza

2005-01-01

Cellular Automata are highly nonlinear dynamical systems which are suitable for simulating natural phenomena whose behaviour may be specified in terms of local interactions. The Cellular Automata model SCIARA, developed for the simulation of lava flows, has been shown to reproduce the behaviour of Etnean events. However, in order to apply the model to the prediction of future scenarios, a thorough calibration phase is required. This work presents the application of Genetic Algorithms, general-purpose search algorithms inspired by natural selection and genetics, to the parameter optimisation of the model SCIARA. Difficulties due to the high computational time suggested the adoption of a Master-Slave Parallel Genetic Algorithm for the calibration of the model with respect to the 2001 Mt. Etna eruption. Results demonstrated the usefulness of the approach, both in terms of computing time and quality of the simulations performed.
Eddy current testing probe optimization using a parallel genetic algorithm

Directory of Open Access Journals (Sweden)

Dolapchiev Ivaylo

2008-01-01

This paper uses the developed parallel version of Michalewicz's Genocop III Genetic Algorithm (GA) searching technique to optimize the coil geometry of an eddy current non-destructive testing probe (ECTP). The electromagnetic field is computed using the FEMM 2D finite element code. The aim of this optimization was to determine coil dimensions and positions that improve ECTP sensitivity to physical properties of the tested devices.
Reliability–redundancy allocation problem considering optimal redundancy strategy using parallel genetic algorithm

International Nuclear Information System (INIS)

Kim, Heungseob; Kim, Pansoo

2017-01-01

To maximize the reliability of a system, the traditional reliability–redundancy allocation problem (RRAP) determines the component reliability and level of redundancy for each subsystem. This paper proposes an advanced RRAP that also considers the optimal redundancy strategy, either active or cold standby. In addition, new examples are presented for it. Furthermore, the exact reliability function for a cold standby redundant subsystem with an imperfect detector/switch is suggested, and is expected to replace the previous approximating model that has been used in most related studies. A parallel genetic algorithm for solving the RRAP as a mixed-integer nonlinear programming model is presented, and its performance is compared with those of previous studies by using numerical examples on three benchmark problems. - Highlights: • Optimal strategy is proposed to solve reliability redundancy allocation problem. • The redundancy strategy uses parallel genetic algorithm. • Improved reliability function for a cold standby subsystem is suggested. • Proposed redundancy strategy enhances the system reliability.
Optimization Design by Genetic Algorithm Controller for Trajectory Control of a 3-RRR Parallel Robot

Directory of Open Access Journals (Sweden)

Lianchao Sheng

2018-01-01

In order to improve the control precision and robustness of the existing proportional-integral-derivative (PID) controller of a 3-Revolute–Revolute–Revolute (3-RRR) parallel robot, a variable PID parameter controller optimized by a genetic algorithm controller is proposed in this paper. Firstly, the inverse kinematics model of the 3-RRR parallel robot was established according to the vector method, and the motor conversion matrix was deduced. Then, the integral of the squared error was chosen as the fitness function, and the genetic algorithm controller was designed. Finally, the control precision of the new controller was verified through a simulation model of the 3-RRR planar parallel robot (built in SimMechanics), and the robustness of the new controller was verified by adding interference. The results show that, compared with the traditional PID controller, the new controller designed in this paper has better control precision and robustness, which provides the basis for practical application.
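The fitness function named in this abstract, the integral of the squared tracking error, can be approximated from sampled signals as in the short sketch below. The function name `ise`, the rectangle-rule approximation, and the sample values are assumptions made for illustration; the paper's own simulation and discretization are not given in the abstract.

```python
def ise(reference, measured, dt):
    """Approximate the integral of squared error, sum((r - y)^2) * dt, by a rectangle rule."""
    return sum((r - y) ** 2 for r, y in zip(reference, measured)) * dt


# Hypothetical step response sampled every 0.1 s.
reference = [1.0, 1.0, 1.0]
measured = [0.0, 0.5, 1.0]
cost = ise(reference, measured, dt=0.1)   # (1.0 + 0.25 + 0.0) * 0.1
```

A GA tuning PID gains would simulate the closed loop for each candidate gain set and prefer candidates with a lower value of this cost.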
Foundations of genetic algorithms 1991

CERN Document Server

1991-01-01

Foundations of Genetic Algorithms 1991 (FOGA 1) discusses the theoretical foundations of genetic algorithms (GA) and classifier systems. This book compiles research papers on selection and convergence, coding and representation, problem hardness, deception, classifier system design, variation and recombination, parallelization, and population divergence. Other topics include the non-uniform Walsh-schema transform; spurious correlations and premature convergence in genetic algorithms; and variable default hierarchy separation in a classifier system. The grammar-based genetic algorithm; condition
Where genetic algorithms excel.

Science.gov (United States)

Baum, E B; Boneh, D; Garrett, C

2001-01-01

We analyze the performance of a genetic algorithm (GA) we call Culling, and a variety of other algorithms, on a problem we refer to as the Additive Search Problem (ASP). We show that the problem of learning the Ising perceptron is reducible to a noisy version of ASP. Noisy ASP is the first problem we are aware of where a genetic-type algorithm bests all known competitors. We generalize ASP to k-ASP to study whether GAs will achieve "implicit parallelism" in a problem with many more schemata. GAs fail to achieve this implicit parallelism, but we describe an algorithm we call Explicitly Parallel Search that succeeds. We also compute the optimal culling point for selective breeding, which turns out to be independent of the fitness function or the population distribution. We also analyze a mean field theoretic algorithm performing similarly to Culling on many problems. These results provide insight into when and how GAs can beat competing methods.
Parallel genetic algorithm as a tool for nuclear reactors reload

International Nuclear Information System (INIS)

Santos, Darley Roberto G.; Schirru, Roberto

1999-01-01

This work presents a tool that designers can use to obtain better solutions, in terms of computational cost, to nuclear reactor reload problems. The design of a nuclear fuel reload is known to be a complex combinatorial problem. Generally, iterative processes are the most widely used because they generate answers that satisfy all restrictions. The model presented here uses Artificial Intelligence techniques, more precisely Genetic Algorithm techniques, combined with parallelization techniques. Tests of the tool were highly satisfactory, owing to a considerable reduction in computational time. (author)
A parallel adaptive quantum genetic algorithm for the controllability of arbitrary networks.

Science.gov (United States)

Li, Yuhong; Gong, Guanghong; Li, Ni

2018-01-01

In this paper, we propose a novel algorithm, the parallel adaptive quantum genetic algorithm, which can rapidly determine the minimum set of control nodes of arbitrary networks with both control nodes and state nodes. The corresponding network can be fully controlled with the obtained control scheme. We transformed the network controllability issue into a combinatorial optimization problem based on the Popov-Belevitch-Hautus rank condition. Experiments were run on a set of canonical networks and a list of real-world networks. Comparison results demonstrated that the algorithm was better suited to optimizing the controllability of networks, especially larger networks. We subsequently demonstrated that there were links between the optimal control nodes and some statistical characteristics of the networks. The proposed algorithm provides an effective approach to improving the controllability optimization of large networks or even extra-large networks with hundreds of thousands of nodes.

Using Hadoop MapReduce for Parallel Genetic Algorithms: A Comparison of the Global, Grid and Island Models.

Science.gov (United States)

Ferrucci, Filomena; Salza, Pasquale; Sarro, Federica

2017-06-29

The need to improve the scalability of Genetic Algorithms (GAs) has motivated the research on Parallel Genetic Algorithms (PGAs), and different technologies and approaches have been used. Hadoop MapReduce represents one of the most mature technologies to develop parallel algorithms. Based on the fact that parallel algorithms introduce communication overhead, the aim of the present work is to understand if, and possibly when, the parallel GAs solutions using Hadoop MapReduce show better performance than sequential versions in terms of execution time. Moreover, we are interested in understanding which PGA model can be most effective among the global, grid, and island models. We empirically assessed the performance of these three parallel models with respect to a sequential GA on a software engineering problem, evaluating the execution time and the achieved speedup. We also analysed the behaviour of the parallel models in relation to the overhead produced by the use of Hadoop MapReduce and the GAs' computational effort, which gives a more machine-independent measure of these algorithms. We exploited three problem instances to differentiate the computation load and three cluster configurations based on 2, 4, and 8 parallel nodes. Moreover, we estimated the costs of the execution of the experimentation on a potential cloud infrastructure, based on the pricing of the major commercial cloud providers. The empirical study revealed that the use of PGA based on the island model outperforms the other parallel models and the sequential GA for all the considered instances and clusters. Using 2, 4, and 8 nodes, the island model achieves an average speedup over the three datasets of 1.8, 3.4, and 7.0 times, respectively. Hadoop MapReduce has a set of different constraints that need to be considered during the design and the implementation of parallel algorithms. The overhead of data store (i.e., HDFS) accesses, communication, and latency requires solutions that reduce data store
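The island-model speedups quoted in this abstract can be turned into parallel efficiencies (speedup divided by node count) with a few lines of arithmetic. The speedup figures come from the abstract; the `efficiency` helper and the variable names are introduced here only for illustration.

```python
# Average island-model speedups reported in the abstract, keyed by node count.
speedups = {2: 1.8, 4: 3.4, 8: 7.0}


def efficiency(speedup, nodes):
    """Parallel efficiency: the fraction of ideal linear speedup actually achieved."""
    return speedup / nodes


efficiencies = {n: round(efficiency(s, n), 3) for n, s in speedups.items()}
print(efficiencies)  # {2: 0.9, 4: 0.85, 8: 0.875}
```

Efficiency stays between 0.85 and 0.90 across the three cluster sizes, which is consistent with the abstract's statement that the island model scales better than the other models in this Hadoop setting.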
Parallel island genetic algorithm applied to a nuclear power plant auxiliary feedwater system surveillance tests policy optimization

International Nuclear Information System (INIS)

Pereira, Claudio M.N.A.; Lapa, Celso M.F.

2003-01-01

In this work, we focus on the application of an Island Genetic Algorithm (IGA), a coarse-grained parallel genetic algorithm (PGA) model, to a Nuclear Power Plant (NPP) Auxiliary Feedwater System (AFWS) surveillance tests policy optimization. Here, the main objective is to outline, by means of comparisons, the advantages of the IGA over the simple (non-parallel) genetic algorithm (GA), which has been successfully applied to this kind of problem. The goal of the optimization is to maximize the system's average availability for a given period of time, considering realistic features such as: i) aging effects on standby components during the tests; ii) failures revealed in the tests imply corrective maintenance, increasing outage times; iii) components have distinct test parameters (outage time, aging factors, etc.); and iv) tests are not necessarily periodic. In our experiments, which were run on a cluster of eight 1-GHz personal computers, we could clearly observe gains not only in the computational time, which decreased linearly with the number of computers, but also in the optimization outcome.
The parallel processing impact in the optimization of the reactors neutronic by genetic algorithms

International Nuclear Information System (INIS)

Pereira, Claudio M.N.A.; Universidade Federal, Rio de Janeiro, RJ; Lapa, Celso M.F.; Mol, Antonio C.A.

2002-01-01

Nowadays, many optimization problems found in nuclear engineering have been solved through genetic algorithms (GA). The robustness of such methods is strongly related to the nature of the search process, which is based on populations of solution candidates, and this implies a high computational cost in the optimization process. The use of GA becomes more critical when the evaluation of a solution candidate is highly time consuming. Problems of this nature are common in nuclear engineering; an example is reactor design optimization, where neutronic codes, which consume high CPU time, must be run. To investigate the impact of parallel computation on the solution, through GA, of a reactor design optimization problem, a parallel genetic algorithm (PGA) using the Island Model was developed. Exhaustive experiments, taking more than 1500 processing hours on 550 MHz personal computers, were carried out in order to compare the conventional GA with the PGA. These experiments demonstrated the superiority of the PGA not only in terms of execution time but also in the optimization results. (author)
Parallel sorting algorithms

CERN Document Server

Akl, Selim G

1985-01-01

Parallel Sorting Algorithms explains how to use parallel algorithms to sort a sequence of items on a variety of parallel computers. The book reviews the sorting problem, the parallel models of computation, parallel algorithms, and the lower bounds on the parallel sorting problems. The text also presents twenty different algorithms, such as linear arrays, mesh-connected computers, cube-connected computers. Another example where algorithm can be applied is on the shared-memory SIMD (single instruction stream multiple data stream) computers in which the whole sequence to be sorted can fit in the
A distributed parallel genetic algorithm of placement strategy for virtual machines deployment on cloud platform.

Science.gov (United States)

Dong, Yu-Shuang; Xu, Gao-Chao; Fu, Xiao-Dong

2014-01-01

The cloud platform provides various services to users. More and more cloud centers provide infrastructure as the main way of operating. To improve the utilization rate of the cloud center and to decrease the operating cost, the cloud center provides services according to requirements of users by sharding the resources with virtualization. Considering both QoS for users and cost saving for cloud computing providers, we try to maximize performance and minimize energy cost as well. In this paper, we propose a distributed parallel genetic algorithm (DPGA) of placement strategy for virtual machines deployment on cloud platform. In the first stage, it executes the genetic algorithm in parallel, distributed across several selected physical hosts. Then it continues to execute the genetic algorithm of the second stage with solutions obtained from the first stage as the initial population. The solution calculated by the genetic algorithm of the second stage is the optimal one of the proposed approach. The experimental results show that the proposed placement strategy of VM deployment can ensure QoS for users and it is more effective and more energy efficient than other placement strategies on the cloud platform.
A Distributed Parallel Genetic Algorithm of Placement Strategy for Virtual Machines Deployment on Cloud Platform

Directory of Open Access Journals (Sweden)

Yu-Shuang Dong

2014-01-01

The cloud platform provides various services to users. More and more cloud centers provide infrastructure as the main way of operating. To improve the utilization rate of the cloud center and to decrease the operating cost, the cloud center provides services according to requirements of users by sharding the resources with virtualization. Considering both QoS for users and cost saving for cloud computing providers, we try to maximize performance and minimize energy cost as well. In this paper, we propose a distributed parallel genetic algorithm (DPGA) of placement strategy for virtual machines deployment on cloud platform. In the first stage, it executes the genetic algorithm in parallel, distributed across several selected physical hosts. Then it continues to execute the genetic algorithm of the second stage with solutions obtained from the first stage as the initial population. The solution calculated by the genetic algorithm of the second stage is the optimal one of the proposed approach. The experimental results show that the proposed placement strategy of VM deployment can ensure QoS for users and it is more effective and more energy efficient than other placement strategies on the cloud platform.
Cloud identification using genetic algorithms and massively parallel computation

Science.gov (United States)

Buckles, Bill P.; Petry, Frederick E.

1996-01-01

As a Guest Computational Investigator under the NASA administered component of the High Performance Computing and Communication Program, we implemented a massively parallel genetic algorithm on the MasPar SIMD computer. Experiments were conducted using Earth Science data in the domains of meteorology and oceanography. Results obtained in these domains are competitive with, and in most cases better than, similar problems solved using other methods. In the meteorological domain, we chose to identify clouds using AVHRR spectral data. Four cloud speciations were used although most researchers settle for three. Results were remarkably consistent across all tests (91% accuracy). Refinements of this method may lead to more timely and complete information for Global Circulation Models (GCMs) that are prevalent in weather forecasting and global environment studies. In the oceanographic domain, we chose to identify ocean currents from a spectrometer having similar characteristics to AVHRR. Here the results were mixed (60% to 80% accuracy). Given that one is willing to run the experiment several times (say 10), then it is acceptable to claim the higher accuracy rating. This problem has never been successfully automated. Therefore, these results are encouraging even though less impressive than the cloud experiment. Successful conclusion of an automated ocean current detection system would impact coastal fishing, naval tactics, and the study of micro-climates. Finally we contributed to the basic knowledge of GA (genetic algorithm) behavior in parallel environments. We developed better knowledge of the use of subpopulations in the context of shared breeding pools and the migration of individuals. Rigorous experiments were conducted based on quantifiable performance criteria. While much of the work confirmed current wisdom, for the first time we were able to submit conclusive evidence. The software developed under this grant was placed in the public domain. An extensive user
The Parallel Algorithm Based on Genetic Algorithm for Improving the Performance of Cognitive Radio

Directory of Open Access Journals (Sweden)

Liu Miao

2018-01-01

The intercarrier interference (ICI) problem of cognitive radio (CR) is severe. In this paper, the machine learning algorithm is used to obtain the optimal interference subcarriers of an unlicensed user (un-LU). Masking the optimal interference subcarriers can suppress the ICI of CR. Moreover, the parallel ICI suppression algorithm is designed to improve the calculation speed and meet the practical requirement of CR. Simulation results show that the data transmission rate threshold of un-LU can be set, the data transmission quality of un-LU can be ensured, the ICI of a licensed user (LU) is suppressed, and the bit error rate (BER) performance of LU is improved by implementing the parallel suppression algorithm. The ICI problem of CR is solved well by the new machine learning algorithm. The computing performance of the algorithm is improved by designing a new parallel structure and the communication performance of CR is enhanced.
Coarse-grained parallel genetic algorithm applied to a nuclear reactor core design optimization problem

International Nuclear Information System (INIS)

Pereira, Claudio M.N.A.; Lapa, Celso M.F.

2003-01-01

This work extends the research related to genetic algorithms (GA) in core design optimization problems, whose basic investigations were presented in previous work. Here we explore the use of the Island Genetic Algorithm (IGA), a coarse-grained parallel GA model, comparing its performance to that obtained by the application of a traditional non-parallel GA. The optimization problem consists of adjusting several reactor cell parameters, such as dimensions, enrichment and materials, in order to minimize the average peak-factor in a 3-enrichment zone reactor, considering restrictions on the average thermal flux, criticality and sub-moderation. Our IGA implementation runs as a distributed application on a conventional local area network (LAN), avoiding the use of expensive parallel computers or architectures. After exhaustive experiments, taking more than 1500 h in 550 MHz personal computers, we have observed that the IGA provided gains not only in terms of computational time, but also in the optimization outcome. Besides, we have also realized that, for this kind of problem, whose fitness evaluation is itself time consuming, the time overhead in the IGA, due to the communication in LANs, is practically imperceptible, leading to the conclusion that the use of expensive parallel computers or architecture can be avoided.
Problem solving with genetic algorithms and Splicer

Science.gov (United States)

Bayer, Steven E.; Wang, Lui

1991-01-01

Genetic algorithms are highly parallel, adaptive search procedures (i.e., problem-solving methods) loosely based on the processes of population genetics and Darwinian survival of the fittest. Genetic algorithms have proven useful in domains where other optimization techniques perform poorly. The main purpose of the paper is to discuss a NASA-sponsored software development project to develop a general-purpose tool for using genetic algorithms. The tool, called Splicer, can be used to solve a wide variety of optimization problems and is currently available from NASA and COSMIC. This discussion is preceded by an introduction to basic genetic algorithm concepts and a discussion of genetic algorithm applications.
Crystal structure prediction of flexible molecules using parallel genetic algorithms with a standard force field.

Science.gov (United States)

Kim, Seonah; Orendt, Anita M; Ferraro, Marta B; Facelli, Julio C

2009-10-01

This article describes the application of our distributed computing framework for crystal structure prediction (CSP), the modified genetic algorithms for crystal and cluster prediction (MGAC), to predict the crystal structure of flexible molecules using the general Amber force field (GAFF) and the CHARMM program. The MGAC distributed computing framework includes a series of tightly integrated computer programs for generating the molecule's force field, sampling crystal structures using a distributed parallel genetic algorithm and local energy minimization of the structures followed by the classifying, sorting, and archiving of the most relevant structures. Our results indicate that the method can consistently find the experimentally known crystal structures of flexible molecules, but the number of missing structures and poor ranking observed in some crystals show the need for further improvement of the potential. Copyright 2009 Wiley Periodicals, Inc.
A Hybrid Genetic Algorithm to Minimize Total Tardiness for Unrelated Parallel Machine Scheduling with Precedence Constraints

Directory of Open Access Journals (Sweden)

Chunfeng Liu

2013-01-01

The paper presents a novel hybrid genetic algorithm (HGA) for a deterministic scheduling problem where multiple jobs with arbitrary precedence constraints are processed on multiple unrelated parallel machines. The objective is to minimize total tardiness, since delays of the jobs may lead to punishment cost or cancellation of orders by the clients in many situations. A priority rule-based heuristic algorithm, which schedules a prior job on a prior machine according to the priority rule at each iteration, is suggested and embedded in the HGA for initial feasible schedules that can be improved in further stages. Computational experiments are conducted to show that the proposed HGA performs well with respect to accuracy and efficiency of solution for small-sized problems and gets better results than the conventional genetic algorithm within the same runtime for large-sized problems.
A parallel adaptive quantum genetic algorithm for the controllability of arbitrary networks

Science.gov (United States)

Li, Yuhong

2018-01-01

In this paper, we propose a novel algorithm, the parallel adaptive quantum genetic algorithm, which can rapidly determine the minimum control nodes of arbitrary networks with both control nodes and state nodes. The corresponding network can be fully controlled with the obtained control scheme. We transformed the network controllability issue into a combinational optimization problem based on the Popov-Belevitch-Hautus rank condition. Experiments were run on a set of canonical networks and a list of real-world networks. Comparison results demonstrated that the algorithm was better suited to optimizing the controllability of networks, especially larger networks. We demonstrated subsequently that there were links between the optimal control nodes and some network statistical characteristics. The proposed algorithm provides an effective approach to improve the controllability optimization of large networks or even extra-large networks with hundreds of thousands of nodes. PMID:29554140
Totally parallel multilevel algorithms

Science.gov (United States)

Frederickson, Paul O.

1988-01-01

Four totally parallel algorithms for the solution of a sparse linear system have common characteristics which become quite apparent when they are implemented on a highly parallel hypercube such as the CM2. These four algorithms are Parallel Superconvergent Multigrid (PSMG) of Frederickson and McBryan, Robust Multigrid (RMG) of Hackbusch, the FFT based Spectral Algorithm, and Parallel Cyclic Reduction. In fact, all four can be formulated as particular cases of the same totally parallel multilevel algorithm, which are referred to as TPMA. In certain cases the spectral radius of TPMA is zero, and it is recognized to be a direct algorithm. In many other cases the spectral radius, although not zero, is small enough that a single iteration per timestep keeps the local error within the required tolerance.
Parallel Multi-Objective Genetic Algorithm for Short-Term Economic Environmental Hydrothermal Scheduling

Directory of Open Access Journals (Sweden)

Zhong-Kai Feng

2017-01-01

With the increasingly serious energy crisis and environmental pollution, the short-term economic environmental hydrothermal scheduling (SEEHTS) problem is becoming more and more important in modern electrical power systems. In order to handle the SEEHTS problem efficiently, the parallel multi-objective genetic algorithm (PMOGA) is proposed in the paper. Based on the Fork/Join parallel framework, PMOGA divides the whole population of individuals into several subpopulations which will evolve in different cores simultaneously. In this way, PMOGA can avoid the wastage of computational resources and increase the population diversity. Moreover, the constraint handling technique is used to handle the complex constraints in SEEHTS, and a selection strategy based on constraint violation is also employed to ensure the convergence speed and solution feasibility. The results from a hydrothermal system in different cases indicate that PMOGA can make the most of system resources to significantly improve the computing efficiency and solution quality. Moreover, PMOGA has competitive performance in SEEHTS when compared with several other methods reported in the previous literature, providing a new approach for the operation of hydrothermal systems.
A hybrid, massively parallel implementation of a genetic algorithm for optimization of the impact performance of a metal/polymer composite plate

KAUST Repository

Narayanan, Kiran; Mora Cordova, Angel; Allsopp, Nicholas; El Sayed, Tamer S.

2012-01-01

A hybrid parallelization method composed of a coarse-grained genetic algorithm (GA) and fine-grained objective function evaluations is implemented on a heterogeneous computational resource consisting of 16 IBM Blue Gene/P racks, a single x86 cluster
Genetic algorithm with small population size for search feasible control parameters for parallel hybrid electric vehicles

Directory of Open Access Journals (Sweden)

Yu-Huei Cheng

2017-11-01

The control strategy is a major unit in hybrid electric vehicles (HEVs). In order to provide suitable control parameters for reducing fuel consumption and engine emissions while maintaining vehicle performance requirements, the genetic algorithm (GA) with small population size is applied to search for feasible control parameters in parallel HEVs. The electric assist control strategy (EACS) is used as the fundamental control strategy of parallel HEVs. The dynamic performance requirements stipulated in the Partnership for a New Generation of Vehicles (PNGV) are considered to maintain the vehicle performance. The known ADvanced VehIcle SimulatOR (ADVISOR) is used to simulate a specific parallel HEV with urban dynamometer driving schedule (UDDS). Five population sets with size 5, 10, 15, 20, and 25 are used in the GA. The experimental results show that the GA with population size of 25 is the best for selecting feasible control parameters in parallel HEVs.
Predicting the severity of nuclear power plant transients using nearest neighbors modeling optimized by genetic algorithms on a parallel computer

International Nuclear Information System (INIS)

Lin, J.; Bartal, Y.; Uhrig, R.E.

1995-01-01

The importance of automatic diagnostic systems for nuclear power plants (NPPs) has been discussed in numerous studies, and various such systems have been proposed. None of those systems were designed to predict the severity of the diagnosed scenario. A classification and severity prediction system for NPP transients is developed. The system is based on nearest neighbors modeling, which is optimized using genetic algorithms. The optimization process is used to determine the most important variables for each of the transient types analyzed. An enhanced version of the genetic algorithms is used in which a local downhill search is performed to further increase the accuracy achieved. The genetic algorithms search was implemented on a massively parallel supercomputer, the KSR1-64, to perform the analysis in a reasonable time. The data for this study were supplied by the high-fidelity simulator of the San Onofre unit 1 pressurized water reactor
Identification of Arbitrary Zonation in Groundwater Parameters using the Level Set Method and a Parallel Genetic Algorithm

Science.gov (United States)

Lei, H.; Lu, Z.; Vesselinov, V. V.; Ye, M.

2017-12-01

Simultaneous identification of both the zonation structure of aquifer heterogeneity and the hydrogeological parameters associated with these zones is challenging, especially for complex subsurface heterogeneity fields. In this study, a new approach, based on the combination of the level set method and a parallel genetic algorithm is proposed. Starting with an initial guess for the zonation field (including both zonation structure and the hydraulic properties of each zone), the level set method ensures that material interfaces are evolved through the inverse process such that the total residual between the simulated and observed state variables (hydraulic head) always decreases, which means that the inversion result depends on the initial guess field and the minimization process might fail if it encounters a local minimum. To find the global minimum, the genetic algorithm (GA) is utilized to explore the parameters that define initial guess fields, and the minimal total residual corresponding to each initial guess field is considered as the fitness function value in the GA. Due to the expensive evaluation of the fitness function, a parallel GA is adapted in combination with a simulated annealing algorithm. The new approach has been applied to several synthetic cases in both steady-state and transient flow fields, including a case with real flow conditions at the chromium contaminant site at the Los Alamos National Laboratory. The results show that this approach is capable of identifying the arbitrary zonation structures of aquifer heterogeneity and the hydrogeological parameters associated with these zones effectively.
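When each fitness evaluation is an expensive simulation, one common way to parallelize a GA is to hand the evaluations of a generation to a pool of workers and collect the scores in order. The following is a minimal sketch of that idea only, not the authors' implementation: `expensive_fitness` is a hypothetical stand-in for a forward-model run, and `evaluate_population` and `max_workers` are illustrative names.

```python
from concurrent.futures import ThreadPoolExecutor


def expensive_fitness(candidate):
    """Hypothetical stand-in for a costly simulation.

    Returns the sum of squared parameters (lower is better).
    """
    return sum(x * x for x in candidate)


def evaluate_population(population, fitness=expensive_fitness, max_workers=4):
    """Evaluate all candidates concurrently.

    Scores are returned in the same order as the population.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fitness, population))


population = [[0.0, 1.0], [2.0, 2.0], [0.5, 0.5]]
scores = evaluate_population(population)
best = population[scores.index(min(scores))]
```

Threads help mainly when the fitness call waits on an external solver or I/O; for CPU-bound pure-Python fitness functions, `concurrent.futures.ProcessPoolExecutor` offers the same `map` interface with separate processes.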
Application of the distributed genetic algorithm for in-core fuel optimization problems under parallel computational environment

International Nuclear Information System (INIS)

Yamamoto, Akio; Hashimoto, Hiroshi

2002-01-01

The distributed genetic algorithm (DGA) is applied for loading pattern optimization problems of the pressurized water reactors. A basic concept of DGA follows that of the conventional genetic algorithm (GA). However, DGA equally distributes candidates of solutions (i.e. loading patterns) to several independent ''islands'' and evolves them in each island. Communications between islands, i.e. migrations of some candidates between islands are performed with a certain period. Since candidates of solutions independently evolve in each island while accepting different genes of migrants, premature convergence in the conventional GA can be prevented. Because many candidate loading patterns should be evaluated in GA or DGA, the parallelization is efficient to reduce turn around time. Parallel efficiency of DGA was measured using our optimization code and good efficiency was attained even in a heterogeneous cluster environment due to dynamic distribution of the calculation load. The optimization code is based on the client/server architecture with the TCP/IP native socket and a client (optimization) module and calculation server modules communicate the objects of loading patterns each other. Throughout the sensitivity study on optimization parameters of DGA, a suitable set of the parameters for a test problem was identified. Finally, optimization capability of DGA and the conventional GA was compared in the test problem and DGA provided better optimization results than the conventional GA. (author)
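The island scheme described above (independent subpopulations with periodic migration) can be sketched as follows. This is an illustrative toy under stated assumptions, not the authors' optimization code: the bit-string encoding, the `onemax` fitness, the ring migration topology, and all parameter values are hypothetical, and the islands are evolved one after another here, whereas a real run would place each island on its own process or node.

```python
import random


def onemax(bits):
    """Toy fitness: number of 1 bits (stands in for a loading-pattern score)."""
    return sum(bits)


def evolve_island(pop, fitness, rng, mutation_rate=0.02):
    """One generation on one island: tournament selection, one-point
    crossover, bit-flip mutation, and elitism."""
    def pick():
        a, b = rng.sample(pop, 2)
        return a if fitness(a) >= fitness(b) else b

    children = [max(pop, key=fitness)[:]]  # elitism: keep the best unchanged
    while len(children) < len(pop):
        p1, p2 = pick(), pick()
        cut = rng.randrange(1, len(p1))
        child = p1[:cut] + p2[cut:]
        children.append([1 - g if rng.random() < mutation_rate else g
                         for g in child])
    return children


def migrate(islands, fitness, n_migrants=1):
    """Ring migration: copies of each island's best individuals replace
    the worst individuals of the next island."""
    bests = [sorted(isl, key=fitness, reverse=True)[:n_migrants]
             for isl in islands]
    for i, isl in enumerate(islands):
        isl.sort(key=fitness)  # worst individuals first
        for k, migrant in enumerate(bests[(i - 1) % len(islands)]):
            isl[k] = migrant[:]
    return islands


def island_ga(n_islands=4, island_size=10, n_bits=20,
              generations=40, migration_interval=5, seed=0):
    """Run the island GA and return the best individual found."""
    rng = random.Random(seed)
    islands = [[[rng.randint(0, 1) for _ in range(n_bits)]
                for _ in range(island_size)] for _ in range(n_islands)]
    for gen in range(1, generations + 1):
        islands = [evolve_island(isl, onemax, rng) for isl in islands]
        if gen % migration_interval == 0:
            islands = migrate(islands, onemax)
    return max((ind for isl in islands for ind in isl), key=onemax)
```

The migration interval and the number of migrants control how strongly the islands are coupled: frequent migration behaves more like a single population, while rare migration preserves diversity for longer.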

A Parallel Approach To Optimum Actuator Selection With a Genetic Algorithm

Science.gov (United States)

Rogers, James L.

2000-01-01

Recent discoveries in smart technologies have created a variety of aerodynamic actuators which have great potential to enable entirely new approaches to aerospace vehicle flight control. For a revolutionary concept such as a seamless aircraft with no moving control surfaces, there is a large set of candidate locations for placing actuators, resulting in a substantially larger number of combinations to examine in order to find an optimum placement satisfying the mission requirements. The placement of actuators on a wing determines the control effectiveness of the airplane. One approach to placement maximizes the moments about the pitch, roll, and yaw axes, while minimizing the coupling. Genetic algorithms have been instrumental in achieving good solutions to discrete optimization problems, such as the actuator placement problem. As a proof of concept, a genetic algorithm has been developed to find the minimum number of actuators required to provide uncoupled pitch, roll, and yaw control for a simplified, untapered, unswept wing model. To find the optimum placement by searching all possible combinations would require 1,100 hours. Formulating the problem as a multi-objective problem and modifying it to take advantage of the parallel processing capabilities of a multi-processor computer reduces the optimization time to 22 hours.
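As a quick check on the two times quoted above, their ratio can be worked out directly. Note that this is only the ratio of the quoted figures: the abstract does not separate the gain from the multi-objective formulation from the gain from parallel processing, so it should not be read as a parallel speedup alone.

```latex
\[
\frac{T_{\text{exhaustive}}}{T_{\text{parallel GA}}}
  = \frac{1100\ \text{h}}{22\ \text{h}}
  = 50
\]
```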
Algorithms for parallel computers

International Nuclear Information System (INIS)

Churchhouse, R.F.

1985-01-01

Until relatively recently almost all the algorithms for use on computers had been designed on the (usually unstated) assumption that they were to be run on single processor, serial machines. With the introduction of vector processors, array processors and interconnected systems of mainframes, minis and micros, however, various forms of parallelism have become available. The advantage of parallelism is that it offers increased overall processing speed but it also raises some fundamental questions, including: (i) which, if any, of the existing 'serial' algorithms can be adapted for use in the parallel mode. (ii) How close to optimal can such adapted algorithms be and, where relevant, what are the convergence criteria. (iii) How can we design new algorithms specifically for parallel systems. (iv) For multi-processor systems how can we handle the software aspects of the interprocessor communications. Aspects of these questions illustrated by examples are considered in these lectures. (orig.)
Parallel Algorithms and Patterns

Energy Technology Data Exchange (ETDEWEB)

Robey, Robert W. [Los Alamos National Lab. (LANL), Los Alamos, NM (United States)

2016-06-16

This is a powerpoint presentation on parallel algorithms and patterns. A parallel algorithm is a well-defined, step-by-step computational procedure that emphasizes concurrency to solve a problem. Examples of problems include: Sorting, searching, optimization, matrix operations. A parallel pattern is a computational step in a sequence of independent, potentially concurrent operations that occurs in diverse scenarios with some frequency. Examples are: Reductions, prefix scans, ghost cell updates. We only touch on parallel patterns in this presentation. It really deserves its own detailed discussion which Gabe Rockefeller would like to develop.
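Two of the patterns named above, reductions and prefix scans, can be illustrated with a short sequential sketch. This is not taken from the presentation: the function names are illustrative, and the list comprehensions only mark which operations are independent within a pass and could therefore run concurrently on parallel hardware. The scan follows the Hillis-Steele formulation, and both functions assume `op` is associative.

```python
def tree_reduce(values, op):
    """Pairwise (tree) reduction.

    Every pass combines independent neighbouring pairs, so about
    log2(n) passes are needed for n values.
    """
    items = list(values)
    if not items:
        raise ValueError("cannot reduce an empty sequence")
    while len(items) > 1:
        paired = [op(items[i], items[i + 1])
                  for i in range(0, len(items) - 1, 2)]
        if len(items) % 2:  # odd element carries over to the next pass
            paired.append(items[-1])
        items = paired
    return items[0]


def inclusive_scan(values, op):
    """Hillis-Steele inclusive scan.

    In each step every element is combined with the element `offset`
    positions to its left; the offset doubles after each step.
    """
    out = list(values)
    offset = 1
    while offset < len(out):
        out = [out[i] if i < offset else op(out[i - offset], out[i])
               for i in range(len(out))]
        offset *= 2
    return out
```

For example, `inclusive_scan([1, 2, 3, 4, 5], lambda a, b: a + b)` yields the running sums of the input, and `tree_reduce` over the same list yields their total.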
High-Speed General Purpose Genetic Algorithm Processor.

Science.gov (United States)

Hoseini Alinodehi, Seyed Pourya; Moshfe, Sajjad; Saber Zaeimian, Masoumeh; Khoei, Abdollah; Hadidi, Khairollah

2016-07-01

In this paper, an ultrafast steady-state genetic algorithm processor (GAP) is presented. Due to the heavy computational load of genetic algorithms (GAs), they usually take a long time to find optimum solutions. Hardware implementation is a significant approach to overcome the problem by speeding up the GA procedure. Hence, we designed a digital CMOS implementation of GA in [Formula: see text] process. The proposed processor is not bound to a specific application. Indeed, it is a general-purpose processor, which is capable of performing optimization in any possible application. Utilizing speed-boosting techniques, such as pipeline scheme, parallel coarse-grained processing, parallel fitness computation, parallel selection of parents, dual-population scheme, and support for pipelined fitness computation, the proposed processor significantly reduces the processing time. Furthermore, by relying on a built-in discard operator the proposed hardware may be used in constrained problems that are very common in control applications. In the proposed design, a large search space is achievable through the bit string length extension of individuals in the genetic population by connecting the 32-bit GAPs. In addition, the proposed processor supports parallel processing, in which the GA procedure can be run on several connected processors simultaneously.
A survey of parallel multigrid algorithms

Science.gov (United States)

Chan, Tony F.; Tuminaro, Ray S.

1987-01-01

A typical multigrid algorithm applied to well-behaved linear-elliptic partial-differential equations (PDEs) is described. Criteria for designing and evaluating parallel algorithms are presented. Before evaluating the performance of some parallel multigrid algorithms, consideration is given to some theoretical complexity results for solving PDEs in parallel and for executing the multigrid algorithm. The effect of mapping and load imbalance on the partial efficiency of the algorithm is studied.
Empirical valence bond models for reactive potential energy surfaces: a parallel multilevel genetic program approach.

Science.gov (United States)

Bellucci, Michael A; Coker, David F

2011-07-28

We describe a new method for constructing empirical valence bond potential energy surfaces using a parallel multilevel genetic program (PMLGP). Genetic programs can be used to perform an efficient search through function space and parameter space to find the best functions and sets of parameters that fit energies obtained by ab initio electronic structure calculations. Building on the traditional genetic program approach, the PMLGP utilizes a hierarchy of genetic programming on two different levels. The lower level genetic programs are used to optimize coevolving populations in parallel while the higher level genetic program (HLGP) is used to optimize the genetic operator probabilities of the lower level genetic programs. The HLGP allows the algorithm to dynamically learn the mutation or combination of mutations that most effectively increase the fitness of the populations, causing a significant increase in the algorithm's accuracy and efficiency. The algorithm's accuracy and efficiency is tested against a standard parallel genetic program with a variety of one-dimensional test cases. Subsequently, the PMLGP is utilized to obtain an accurate empirical valence bond model for proton transfer in 3-hydroxy-gamma-pyrone in gas phase and protic solvent. © 2011 American Institute of Physics
Parallel algorithms

CERN Document Server

Casanova, Henri; Robert, Yves

2008-01-01

""…The authors of the present book, who have extensive credentials in both research and instruction in the area of parallelism, present a sound, principled treatment of parallel algorithms. … This book is very well written and extremely well designed from an instructional point of view. … The authors have created an instructive and fascinating text. The book will serve researchers as well as instructors who need a solid, readable text for a course on parallelism in computing. Indeed, for anyone who wants an understandable text from which to acquire a current, rigorous, and broad vi
An Improved Hierarchical Genetic Algorithm for Sheet Cutting Scheduling with Process Constraints

OpenAIRE

Yunqing Rao; Dezhong Qi; Jinling Li

2013-01-01

For the first time, an improved hierarchical genetic algorithm for sheet cutting problem which involves n cutting patterns for m non-identical parallel machines with process constraints has been proposed in the integrated cutting stock model. The objective of the cutting scheduling problem is minimizing the weighted completed time. A mathematical model for this problem is presented, an improved hierarchical genetic algorithm (ant colony—hierarchical genetic algorithm) is developed for better ...
Parallel algorithms for continuum dynamics

International Nuclear Information System (INIS)

Hicks, D.L.; Liebrock, L.M.

1987-01-01

Simply porting existing parallel programs to a new parallel processor may not achieve the full speedup possible; to achieve the maximum efficiency may require redesigning the parallel algorithms for the specific architecture. The authors discuss here parallel algorithms that were developed first for the HEP processor and then ported to the CRAY X-MP/4, the ELXSI/10, and the Intel iPSC/32. Focus is mainly on the most recent parallel processing results produced, i.e., those on the Intel Hypercube. The applications are simulations of continuum dynamics in which the momentum and stress gradients are important. Examples of these are inertial confinement fusion experiments, severe breaks in the coolant system of a reactor, weapons physics, shock-wave physics. Speedup efficiencies on the Intel iPSC Hypercube are very sensitive to the ratio of communication to computation. Great care must be taken in designing algorithms for this machine to avoid global communication. This is much more critical on the iPSC than it was on the three previous parallel processors
Parallel Computing Strategies for Irregular Algorithms

Science.gov (United States)

Biswas, Rupak; Oliker, Leonid; Shan, Hongzhang; Biegel, Bryan (Technical Monitor)

2002-01-01

Parallel computing promises several orders of magnitude increase in our ability to solve realistic computationally-intensive problems, but relies on their efficient mapping and execution on large-scale multiprocessor architectures. Unfortunately, many important applications are irregular and dynamic in nature, making their effective parallel implementation a daunting task. Moreover, with the proliferation of parallel architectures and programming paradigms, the typical scientist is faced with a plethora of questions that must be answered in order to obtain an acceptable parallel implementation of the solution algorithm. In this paper, we consider three representative irregular applications: unstructured remeshing, sparse matrix computations, and N-body problems, and parallelize them using various popular programming paradigms on a wide spectrum of computer platforms ranging from state-of-the-art supercomputers to PC clusters. We present the underlying problems, the solution algorithms, and the parallel implementation strategies. Smart load-balancing, partitioning, and ordering techniques are used to enhance parallel performance. Overall results demonstrate the complexity of efficiently parallelizing irregular algorithms.
Algorithmically specialized parallel computers

CERN Document Server

Snyder, Lawrence; Gannon, Dennis B

1985-01-01

Algorithmically Specialized Parallel Computers focuses on the concept and characteristics of an algorithmically specialized computer. This book discusses the algorithmically specialized computers, algorithmic specialization using VLSI, and innovative architectures. The architectures and algorithms for digital signal, speech, and image processing and specialized architectures for numerical computations are also elaborated. Other topics include the model for analyzing generalized inter-processor, pipelined architecture for search tree maintenance, and specialized computer organization for raster
Experiments with parallel algorithms for combinatorial problems

NARCIS (Netherlands)

G.A.P. Kindervater (Gerard); H.W.J.M. Trienekens

1985-01-01

In the last decade many models for parallel computation have been proposed and many parallel algorithms have been developed. However, few of these models have been realized and most of these algorithms are supposed to run on idealized, unrealistic parallel machines. The parallel machines
Cloud Computing Task Scheduling Based on Cultural Genetic Algorithm

Directory of Open Access Journals (Sweden)

Li Jian-Wen

2016-01-01

A task scheduling strategy based on the cultural genetic algorithm (CGA) is proposed in order to improve the efficiency of task scheduling in the cloud computing platform, with the aim of minimizing the total time and cost of task scheduling. The improved genetic algorithm is used to construct the main population space and the knowledge space under the cultural framework; the two spaces evolve independently in parallel, forming a mechanism of mutual promotion to dispatch the cloud tasks. At the same time, to counter the genetic algorithm's tendency to fall into local optima, a non-uniform mutation operator is introduced to improve the search performance of the algorithm. The experimental results show that CGA reduces the total time and lowers the cost of the scheduling, making it an effective algorithm for cloud task scheduling.
Model-driven product line engineering for mapping parallel algorithms to parallel computing platforms

NARCIS (Netherlands)

Arkin, Ethem; Tekinerdogan, Bedir

2016-01-01

Mapping parallel algorithms to parallel computing platforms requires several activities such as the analysis of the parallel algorithm, the definition of the logical configuration of the platform, the mapping of the algorithm to the logical configuration platform and the implementation of the
Ensemble of hybrid genetic algorithm for two-dimensional phase unwrapping

Science.gov (United States)

Balakrishnan, D.; Quan, C.; Tay, C. J.

2013-06-01

Phase unwrapping is the final and trickiest step in any phase retrieval technique. Phase unwrapping by artificial intelligence methods (optimization algorithms) such as the hybrid genetic algorithm, reverse simulated annealing, particle swarm optimization, and minimum cost matching has shown better results than conventional phase unwrapping methods. In this paper, an ensemble of hybrid genetic algorithms with parallel populations is proposed to solve the branch-cut phase unwrapping problem. In a single-population hybrid genetic algorithm, the selection, cross-over and mutation operators are applied to obtain a new population in every generation. The parameters and choice of operators affect the performance of the hybrid genetic algorithm. The ensemble of hybrid genetic algorithms makes it possible to use different parameter sets and different choices of operators simultaneously. Each population uses a different set of parameters, and the offspring of each population compete against the offspring of all other populations, which use different sets of parameters. The effectiveness of the proposed algorithm is demonstrated by phase unwrapping examples and the advantages of the proposed method are discussed.
Parallel algorithms for mapping pipelined and parallel computations

Science.gov (United States)

Nicol, David M.

1988-01-01

Many computational problems in image processing, signal processing, and scientific computing are naturally structured for either pipelined or parallel computation. When mapping such problems onto a parallel architecture it is often necessary to aggregate an obvious problem decomposition. Even in this context the general mapping problem is known to be computationally intractable, but recent advances have been made in identifying classes of problems and architectures for which optimal solutions can be found in polynomial time. Among these, the mapping of pipelined or parallel computations onto linear array, shared memory, and host-satellite systems figures prominently. This paper extends that work first by showing how to improve existing serial mapping algorithms. These improvements have significantly lower time and space complexities: in one case a published O(nm^3) time algorithm for mapping m modules onto n processors is reduced to an O(nm log m) time complexity, and its space requirements reduced from O(nm^2) to O(m). Run time complexity is further reduced with parallel mapping algorithms based on these improvements, which run on the architecture for which they create the mappings.
Parallel Algorithms for Groebner-Basis Reduction

Science.gov (United States)

1987-09-25

Parallel Algorithms for Groebner-Basis Reduction. Technical Report. Productivity Engineering in the UNIX Environment.
A hybrid niched-island genetic algorithm applied to a nuclear core optimization problem

International Nuclear Information System (INIS)

Pereira, Claudio M.N.A.

2005-01-01

Diversity maintenance is a key feature in most genetic-based optimization processes. The quest for this characteristic has motivated improvements to the original genetic algorithm (GA). The use of multiple populations (called islands) has been shown to increase diversity, delaying genetic drift. Island Genetic Algorithms (IGA) lead to better results; however, the drift is only delayed, not avoided. An important advantage of this approach is its simplicity and efficiency for parallel processing. Diversity can also be improved by the use of niching techniques. Niched Genetic Algorithms (NGA) are able to avoid genetic drift by containing evolution in niches of a single-population GA; however, the computational cost is increased. This work investigates the use of a hybrid Niched-Island Genetic Algorithm (NIGA) in a nuclear core optimization problem found in the literature. Computational experiments demonstrate that it is possible to take advantage of both the performance enhancement due to parallelism and the drift avoidance due to the use of niches. Comparative results show that the proposed NIGA is more efficient and robust than an IGA and an NGA for solving the proposed optimization problem. (author)
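The abstract does not say which niching technique is used. One common choice is fitness sharing, where an individual's raw fitness is divided by a count of how crowded its neighbourhood is. The sketch below assumes that technique, bit-string individuals, Hamming distance, and hypothetical parameter names `sigma` (niche radius) and `alpha` (sharing shape).

```python
def shared_fitness(population, raw_fitness, sigma=3.0, alpha=1.0):
    """Fitness sharing: divide each raw fitness by its niche count."""
    def hamming(a, b):
        return sum(x != y for x, y in zip(a, b))

    result = []
    for ind in population:
        niche_count = 0.0
        for other in population:
            d = hamming(ind, other)
            if d < sigma:
                # Closer neighbours contribute more; an individual counts itself as 1.
                niche_count += 1.0 - (d / sigma) ** alpha
        result.append(raw_fitness(ind) / niche_count)
    return result
```

Two identical individuals halve each other's score, while an isolated individual keeps its raw fitness, so selection pressure is spread across niches instead of collapsing onto one peak.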
From Genetics to Genetic Algorithms

Indian Academy of Sciences (India)

Genetic algorithms (GAs) are computational optimisation schemes with an ... The algorithms solve optimisation problems ..... Genetic Algorithms in Search, Optimisation and Machine. Learning, Addison-Wesley Publishing Company, Inc. 1989.
Mapping robust parallel multigrid algorithms to scalable memory architectures

Science.gov (United States)

Overman, Andrea; Vanrosendale, John

1993-01-01

The convergence rate of standard multigrid algorithms degenerates on problems with stretched grids or anisotropic operators. The usual cure for this is the use of line or plane relaxation. However, multigrid algorithms based on line and plane relaxation have limited and awkward parallelism and are quite difficult to map effectively to highly parallel architectures. Newer multigrid algorithms that overcome anisotropy through the use of multiple coarse grids rather than relaxation are better suited to massively parallel architectures because they require only simple point-relaxation smoothers. In this paper, we look at the parallel implementation of a V-cycle multiple semicoarsened grid (MSG) algorithm on distributed-memory architectures such as the Intel iPSC/860 and Paragon computers. The MSG algorithms provide two levels of parallelism: parallelism within the relaxation or interpolation on each grid and across the grids on each multigrid level. Both levels of parallelism must be exploited to map these algorithms effectively to parallel architectures. This paper describes a mapping of an MSG algorithm to distributed-memory architectures that demonstrates how both levels of parallelism can be exploited. The result is a robust and effective multigrid algorithm for distributed-memory machines.

DWFS: A Wrapper Feature Selection Tool Based on a Parallel Genetic Algorithm

KAUST Repository

Soufan, Othman

2015-02-26

Many scientific problems can be formulated as classification tasks. Data that harbor relevant information are usually described by a large number of features. Frequently, many of these features are irrelevant for the class prediction. The efficient implementation of classification models requires identification of suitable combinations of features. A smaller number of features reduces the problem's dimensionality and may result in higher classification performance. We developed DWFS, a web-based tool that allows for efficient selection of features for a variety of problems. DWFS follows the wrapper paradigm and applies a search strategy based on Genetic Algorithms (GAs). A parallel GA implementation simultaneously examines and evaluates a large number of candidate collections of features. DWFS also integrates various filtering methods that may be applied as a pre-processing step in the feature selection process. Furthermore, weights and parameters in the fitness function of the GA can be adjusted according to the application requirements. Experiments using heterogeneous datasets from different biomedical applications demonstrate that DWFS is fast and leads to a significant reduction of the number of features without sacrificing performance as compared to several widely used existing methods. DWFS can be accessed online at www.cbrc.kaust.edu.sa/dwfs.
DWFS: A Wrapper Feature Selection Tool Based on a Parallel Genetic Algorithm

KAUST Repository

Soufan, Othman; Kleftogiannis, Dimitrios A.; Kalnis, Panos; Bajic, Vladimir B.

2015-01-01

Many scientific problems can be formulated as classification tasks. Data that harbor relevant information are usually described by a large number of features. Frequently, many of these features are irrelevant for the class prediction. The efficient implementation of classification models requires identification of suitable combinations of features. A smaller number of features reduces the problem's dimensionality and may result in higher classification performance. We developed DWFS, a web-based tool that allows for efficient selection of features for a variety of problems. DWFS follows the wrapper paradigm and applies a search strategy based on Genetic Algorithms (GAs). A parallel GA implementation simultaneously examines and evaluates a large number of candidate collections of features. DWFS also integrates various filtering methods that may be applied as a pre-processing step in the feature selection process. Furthermore, weights and parameters in the fitness function of the GA can be adjusted according to the application requirements. Experiments using heterogeneous datasets from different biomedical applications demonstrate that DWFS is fast and leads to a significant reduction of the number of features without sacrificing performance as compared to several widely used existing methods. DWFS can be accessed online at www.cbrc.kaust.edu.sa/dwfs.
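To make the wrapper paradigm concrete, the following sketch scores one candidate feature subset (a 0/1 mask, as a GA chromosome might encode it) by running a classifier on only the selected columns. Everything specific here is an assumption for illustration: the leave-one-out 1-nearest-neighbour classifier, the weighted form of the fitness, and the names `loo_1nn_accuracy`, `wrapper_fitness` and `weight` are not taken from DWFS.

```python
def loo_1nn_accuracy(X, y, mask):
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier on masked features."""
    cols = [i for i, keep in enumerate(mask) if keep]
    if not cols:
        return 0.0  # an empty feature subset cannot classify anything
    correct = 0
    for i, xi in enumerate(X):
        # Nearest other sample, using squared Euclidean distance on selected columns.
        nearest = min(
            (j for j in range(len(X)) if j != i),
            key=lambda j: sum((xi[c] - X[j][c]) ** 2 for c in cols),
        )
        correct += y[nearest] == y[i]
    return correct / len(X)

def wrapper_fitness(X, y, mask, weight=0.9):
    """Reward accuracy and penalize the fraction of features kept (assumed form)."""
    accuracy = loo_1nn_accuracy(X, y, mask)
    kept_fraction = sum(mask) / len(mask)
    return weight * accuracy - (1 - weight) * kept_fraction
```

Because each mask is scored independently, a population of masks can be evaluated simultaneously, which is where a parallel GA gains its speed.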
Parallel Architectures and Parallel Algorithms for Integrated Vision Systems. Ph.D. Thesis

Science.gov (United States)

Choudhary, Alok Nidhi

1989-01-01

Computer vision is regarded as one of the most complex and computationally intensive problems. An integrated vision system (IVS) is a system that uses vision algorithms from all levels of processing to perform a high-level application (e.g., object recognition). An IVS normally involves algorithms from low-level, intermediate-level, and high-level vision. Designing parallel architectures for vision systems is of tremendous interest to researchers. Several issues are addressed in parallel architectures and parallel algorithms for integrated vision systems.
Reliability optimization of series-parallel systems with a choice of redundancy strategies using a genetic algorithm

Energy Technology Data Exchange (ETDEWEB)

Tavakkoli-Moghaddam, R. [Department of Industrial Engineering, Faculty of Engineering, University of Tehran, P.O. Box 11365/4563, Tehran (Iran, Islamic Republic of); Department of Mechanical Engineering, The University of British Columbia, Vancouver (Canada)], E-mail: tavakoli@ut.ac.ir; Safari, J. [Department of Industrial Engineering, Science and Research Branch, Islamic Azad University, Tehran (Iran, Islamic Republic of)], E-mail: jalalsafari@pideco.com; Sassani, F. [Department of Mechanical Engineering, The University of British Columbia, Vancouver (Canada)], E-mail: sassani@mech.ubc.ca

2008-04-15

This paper proposes a genetic algorithm (GA) for a redundancy allocation problem for the series-parallel system when the redundancy strategy can be chosen for individual subsystems. The majority of the solution methods for general redundancy allocation problems assume that the redundancy strategy for each subsystem is predetermined and fixed. In general, active redundancy has received more attention in the past. However, in practice both active and cold-standby redundancies may be used within a particular system design, and the choice of the redundancy strategy becomes an additional decision variable. Thus, the problem is to select the best redundancy strategy, component, and redundancy level for each subsystem in order to maximize the system reliability under system-level constraints. This belongs to the NP-hard class of problems. Due to its complexity, it is difficult to solve such a problem optimally using traditional optimization tools. It is demonstrated in this paper that GA is an efficient method for solving this type of problem. Finally, computational results for a typical scenario are presented and the robustness of the proposed algorithm is discussed.
Reliability optimization of series-parallel systems with a choice of redundancy strategies using a genetic algorithm

International Nuclear Information System (INIS)

Tavakkoli-Moghaddam, R.; Safari, J.; Sassani, F.

2008-01-01

This paper proposes a genetic algorithm (GA) for a redundancy allocation problem for the series-parallel system when the redundancy strategy can be chosen for individual subsystems. The majority of the solution methods for general redundancy allocation problems assume that the redundancy strategy for each subsystem is predetermined and fixed. In general, active redundancy has received more attention in the past. However, in practice both active and cold-standby redundancies may be used within a particular system design, and the choice of the redundancy strategy becomes an additional decision variable. Thus, the problem is to select the best redundancy strategy, component, and redundancy level for each subsystem in order to maximize the system reliability under system-level constraints. This belongs to the NP-hard class of problems. Due to its complexity, it is difficult to solve such a problem optimally using traditional optimization tools. It is demonstrated in this paper that GA is an efficient method for solving this type of problem. Finally, computational results for a typical scenario are presented and the robustness of the proposed algorithm is discussed.
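For the active-redundancy case only, the objective a GA would maximize can be written down directly. The sketch assumes identical, independent components within each subsystem and omits cold-standby subsystems, whose reliability depends on switching and failure-time models not given in the abstract. The function name and the example numbers are hypothetical.

```python
from math import prod

def series_parallel_reliability(component_reliability, redundancy_level):
    """Reliability of subsystems in series, each with n_i identical
    active-redundant components of reliability r_i (independence assumed)."""
    return prod(
        1.0 - (1.0 - r) ** n  # the subsystem works unless all n components fail
        for r, n in zip(component_reliability, redundancy_level)
    )
```

For instance, two subsystems with component reliabilities 0.9 and 0.8 and redundancy levels 2 and 1 give (1 - 0.1^2) * 0.8 = 0.792.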
Parallelization of TMVA Machine Learning Algorithms

CERN Document Server

Hajili, Mammad

2017-01-01

This report reflects my work on the parallelization of TMVA machine learning algorithms integrated into the ROOT data analysis framework during a summer internship at CERN. The report consists of four important parts: the data set used in training and validation, the algorithms to which multiprocessing was applied, the parallelization techniques, and the changes in execution time as a function of the number of workers.
Parallel External Memory Graph Algorithms

DEFF Research Database (Denmark)

Arge, Lars Allan; Goodrich, Michael T.; Sitchinava, Nodari

2010-01-01

In this paper, we study parallel I/O efficient graph algorithms in the Parallel External Memory (PEM) model, one of the private-cache chip multiprocessor (CMP) models. We study the fundamental problem of list ranking which leads to efficient solutions to problems on trees, such as computing lowest...... an optimal speedup of Θ(P) in parallel I/O complexity and parallel computation time, compared to the single-processor external memory counterparts....
Parallel algorithms for numerical linear algebra

CERN Document Server

van der Vorst, H

1990-01-01

This is the first in a new series of books presenting research results and developments concerning the theory and applications of parallel computers, including vector, pipeline, array, fifth/future generation computers, and neural computers. All aspects of high-speed computing fall within the scope of the series, e.g. algorithm design, applications, software engineering, networking, taxonomy, models and architectural trends, performance, peripheral devices. Papers in Volume One cover the main streams of parallel linear algebra: systolic array algorithms, message-passing systems, algorithms for p
Improvement of remote monitoring on water quality in a subtropical reservoir by incorporating grammatical evolution with parallel genetic algorithms into satellite imagery.

Science.gov (United States)

Chen, Li; Tan, Chih-Hung; Kao, Shuh-Ji; Wang, Tai-Sheng

2008-01-01

Parallel GEGA was constructed by incorporating grammatical evolution (GE) into the parallel genetic algorithm (GA) to improve reservoir water quality monitoring based on remote sensing images. A cruise was conducted to ground-truth chlorophyll-a (Chl-a) concentration longitudinally along the Feitsui Reservoir, the primary water supply for Taipei City in Taiwan. Empirical functions with multiple spectral parameters from the Landsat 7 Enhanced Thematic Mapper (ETM+) data were constructed. The GE, an evolutionary automatic programming type system, automatically discovers complex nonlinear mathematical relationships between observed Chl-a concentrations and remotely sensed imagery. A GA was used afterward with GE to optimize the appropriate function type. Various parallel subpopulations were processed to enhance search efficiency during the optimization procedure with GA. Compared with a traditional linear multiple regression (LMR) model, parallel GEGA performed better, with lower estimation errors.
Improvement of Parallel Algorithm for MATRA Code

International Nuclear Information System (INIS)

Kim, Seong-Jin; Seo, Kyong-Won; Kwon, Hyouk; Hwang, Dae-Hyun

2014-01-01

The feasibility study to parallelize the MATRA code was conducted at KAERI early this year. As a result, a parallel algorithm for the MATRA code has been developed to considerably decrease the computing time required to solve a large problem, such as a whole-core pin-by-pin problem of a general PWR reactor, and to improve the overall performance of multi-physics coupling calculations. It was shown that the performance of the MATRA code was greatly improved by implementing the parallel algorithm using MPI communication. For problems of a 1/8 core and a whole core for the SMART reactor, a speedup of about 10 was obtained when 25 processors were used. However, it was also shown that the performance deteriorated as the axial node number increased. In this paper, the communication procedure between processors is optimized to improve the previous parallel algorithm. To address the performance deterioration of the parallelized MATRA code, a new communication algorithm between processors is presented. It was shown that the speedup was improved and stable regardless of the axial node number.
Parallel Algorithms for the Exascale Era

Energy Technology Data Exchange (ETDEWEB)

Robey, Robert W. [Los Alamos National Laboratory

2016-10-19

New parallel algorithms are needed to reach the Exascale level of parallelism with millions of cores. We look at some of the research developed by students in projects at LANL. The research blends ideas from the early days of computing while weaving in the fresh approach brought by students new to the field of high performance computing. We look at reproducibility of global sums and why it is important to parallel computing. Next we look at how the concept of hashing has led to the development of more scalable algorithms suitable for next-generation parallel computers. Nearly all of this work has been done by undergraduates and published in leading scientific journals.
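The reproducibility issue with global sums comes from floating-point addition not being associative: when a parallel reduction combines partial sums in a different order, the rounded result can change. The small sketch below shows the effect with four hand-picked values and contrasts it with Python's `math.fsum`, which returns a correctly rounded sum. It illustrates the problem only and is not the technique developed in the LANL projects.

```python
import math

def naive_sum(values):
    """Left-to-right accumulation, as a simple serial reduction would do."""
    total = 0.0
    for v in values:
        total += v
    return total

data = [1e16, 1.0, -1e16, 1.0]
reordered = [1.0, 1.0, 1e16, -1e16]  # same values, different order

# The naive results differ because 1e16 + 1.0 rounds back to 1e16.
order_a = naive_sum(data)       # 1.0
order_b = naive_sum(reordered)  # 2.0

# math.fsum tracks the lost low-order bits, so the order does not matter.
exact_a = math.fsum(data)       # 2.0
exact_b = math.fsum(reordered)  # 2.0
```

An explicit loop is used for the naive sum because the built-in `sum` may apply compensated summation to floats in recent Python versions, which would hide the effect.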
Parallelization of a spherical Sn transport theory algorithm

International Nuclear Information System (INIS)

Haghighat, A.

1989-01-01

The work described in this paper derives a parallel algorithm for an R-dependent spherical S_N transport theory algorithm and studies its performance by testing different sample problems. The S_N transport method is one of the most accurate techniques used to solve the linear Boltzmann equation. Several studies have been done on the vectorization of the S_N algorithms; however, very few studies have been performed on the parallelization of this algorithm. Weinke and Hommoto have looked at the parallel processing of the different energy groups, and Azmy recently studied the parallel processing of the inner iterations of an X-Y S_N nodal transport theory method. Both studies have reported very encouraging results, which have prompted us to look at the parallel processing of an R-dependent S_N spherical geometry algorithm. This geometry was chosen because, in spite of its simplicity, it contains the complications of the curvilinear geometries (i.e., redistribution of neutrons over the discretized angular bins).
Parallel Algorithms for Switching Edges in Heterogeneous Graphs.

Science.gov (United States)

Bhuiyan, Hasanuzzaman; Khan, Maleq; Chen, Jiangzhuo; Marathe, Madhav

2017-06-01

An edge switch is an operation on a graph (or network) where two edges are selected randomly and one end vertex of each is swapped with that of the other. Edge switch operations have important applications in graph theory and network analysis, such as in generating random networks with a given degree sequence, modeling and analyzing dynamic networks, and in studying various dynamic phenomena over a network. The recent growth of real-world networks motivates the need for efficient parallel algorithms. The dependencies among successive edge switch operations and the requirement to keep the graph simple (i.e., no self-loops or parallel edges) as the edges are switched lead to significant challenges in designing a parallel algorithm. Addressing these challenges requires complex synchronization and communication among the processors, leading to difficulties in achieving a good speedup by parallelization. In this paper, we present distributed memory parallel algorithms for switching edges in massive networks. These algorithms provide good speedup and scale well to a large number of processors. A harmonic mean speedup of 73.25 is achieved on eight different networks with 1024 processors. One of the steps in our edge switch algorithms requires the computation of multinomial random variables in parallel. This paper presents the first non-trivial parallel algorithm for the problem, achieving a speedup of 925 using 1024 processors.
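A single serial edge switch, including the checks that keep the graph simple, might look like the sketch below. The representation (a set of `frozenset` pairs for an undirected graph) and the function name are assumptions; the parallel algorithms of the paper additionally have to coordinate such switches across processors, which is not shown.

```python
def switch_edges(edges, e1, e2):
    """Try to replace edges (a, b) and (c, d) with (a, d) and (c, b).

    `edges` is a set of frozenset({u, v}) and is modified in place.
    Returns True if the switch was applied, False if it was rejected
    because it would create a self-loop or a parallel edge.
    """
    (a, b), (c, d) = e1, e2
    new1, new2 = frozenset((a, d)), frozenset((c, b))
    if len(new1) < 2 or len(new2) < 2:
        return False  # self-loop
    if new1 in edges or new2 in edges or new1 == new2:
        return False  # parallel edge
    edges.discard(frozenset(e1))
    edges.discard(frozenset(e2))
    edges.add(new1)
    edges.add(new2)
    return True
```

Every vertex keeps its degree after a successful switch, which is why repeated switches can randomize a network while preserving its degree sequence.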
A Novel Parallel Algorithm for Edit Distance Computation

Directory of Open Access Journals (Sweden)

Muhammad Murtaza Yousaf

2018-01-01

The edit distance between two sequences is the minimum number of weighted transformation operations required to transform one string into the other. The weighted transformation operations are insert, remove, and substitute. A dynamic programming solution for finding the edit distance exists, but it becomes computationally intensive when the strings become very long. This work presents a novel parallel algorithm to solve the edit distance problem of string matching. The algorithm is based on resolving dependencies in the dynamic programming solution of the problem, and it is able to compute each row of the edit distance table in parallel. In this way, it becomes possible to compute the complete table in min(m,n) iterations for strings of size m and n, whereas the state-of-the-art parallel algorithm solves the problem in max(m,n) iterations. The proposed algorithm also increases the amount of parallelism in each of its iterations. The algorithm is also capable of exploiting spatial locality in its implementation. Additionally, the algorithm works in a load-balanced way that further improves its performance. The algorithm is implemented for multicore systems having shared memory. Implementation of the algorithm in OpenMP shows linear speedup and better execution time as compared to the state-of-the-art parallel approach. The efficiency of the algorithm is also shown to be better than that of its competitor.
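For orientation, the serial dynamic programming baseline that the abstract refers to can be sketched as follows, assuming unit cost for each insert, remove and substitute. The within-row dependency on `curr[j - 1]` is the kind of dependency a row-parallel algorithm has to resolve; the parallel algorithm itself is not reproduced here.

```python
def edit_distance(a, b):
    """Serial edit distance with unit operation costs, keeping two table rows."""
    prev = list(range(len(b) + 1))  # row 0: build a prefix of b from the empty string
    for i, ca in enumerate(a, start=1):
        curr = [i]  # column 0: delete the first i characters of a
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # remove ca
                curr[j - 1] + 1,           # insert cb (depends on the same row)
                prev[j - 1] + (ca != cb),  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]
```

The table has (m+1) x (n+1) cells, so this baseline takes O(mn) time, which is what makes long strings expensive.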
Parallel paving: An algorithm for generating distributed, adaptive, all-quadrilateral meshes on parallel computers

Energy Technology Data Exchange (ETDEWEB)

Lober, R.R.; Tautges, T.J.; Vaughan, C.T.

1997-03-01

Paving is an automated mesh generation algorithm which produces all-quadrilateral elements. It can additionally generate these elements in varying sizes such that the resulting mesh adapts to a function distribution, such as an error function. While powerful, conventional paving is a very serial algorithm in its operation. Parallel paving is the extension of serial paving into parallel environments to perform the same meshing functions as conventional paving only on distributed, discretized models. This extension allows large, adaptive, parallel finite element simulations to take advantage of paving's meshing capabilities for h-remap remeshing. A significantly modified version of the CUBIT mesh generation code has been developed to host the parallel paving algorithm and demonstrate its capabilities on both two dimensional and three dimensional surface geometries and compare the resulting parallel produced meshes to conventionally paved meshes for mesh quality and algorithm performance. Sandia's "tiling" dynamic load balancing code has also been extended to work with the paving algorithm to retain parallel efficiency as subdomains undergo iterative mesh refinement.
Cellular Genetic Algorithm with Communicating Grids for Assembly Line Balancing Problems

Directory of Open Access Journals (Sweden)

BRUDARU, O.

2010-05-01

This paper presents a new approach with cellular multigrid genetic algorithms for the "I"-shaped and "U"-shaped assembly line balancing problems, including parallel workstations and compatibility constraints. First, a cellular hybrid genetic algorithm that uses a single grid is described. Appropriate operators for mutation, hypermutation, and crossover and two devoration techniques are proposed for creating and maintaining groups based on similarity. This monogrid algorithm is extended for handling many populations placed on different grids. In the multigrid version, the population of each grid is organized in clusters using the positional information of the chromosomes. A similarity preserving communication protocol between the clusters placed on different grids is introduced. The experimental evaluation shows that the multigrid cellular genetic algorithm with communicating grids is better than the hybrid genetic algorithm used for building it, whereas it dominates the monogrid version in all cases. Absolute performance is evaluated using classical benchmarks. The role of certain components of the cellular algorithm is explained and the effect of some parameters is evaluated.
An efficient parallel algorithm for matrix-vector multiplication

Energy Technology Data Exchange (ETDEWEB)

Hendrickson, B.; Leland, R.; Plimpton, S.

1993-03-01

The multiplication of a vector by a matrix is the kernel computation of many algorithms in scientific computation. A fast parallel algorithm for this calculation is therefore necessary if one is to make full use of the new generation of parallel supercomputers. This paper presents a high performance, parallel matrix-vector multiplication algorithm that is particularly well suited to hypercube multiprocessors. For an n x n matrix on p processors, the communication cost of this algorithm is O(n/sqrt(p) + log(p)), independent of the matrix sparsity pattern. The performance of the algorithm is demonstrated by employing it as the kernel in the well-known NAS conjugate gradient benchmark, where a run time of 6.09 seconds was observed. This is the best published performance on this benchmark achieved to date using a massively parallel supercomputer.
Iterative algorithms for large sparse linear systems on parallel computers

Science.gov (United States)

Adams, L. M.

1982-01-01

Algorithms are developed for assembling in parallel the sparse systems of linear equations that result from finite difference or finite element discretizations of elliptic partial differential equations, such as those that arise in structural engineering. Parallel linear stationary iterative algorithms and parallel preconditioned conjugate gradient algorithms are developed for solving these systems. In addition, a model for comparing parallel algorithms on array architectures is developed and results of this model for the algorithms are given.
A Parallel Butterfly Algorithm

KAUST Repository

Poulson, Jack; Demanet, Laurent; Maxwell, Nicholas; Ying, Lexing

2014-01-01

The butterfly algorithm is a fast algorithm which approximately evaluates a discrete analogue of the integral transform (Equation Presented.) at large numbers of target points when the kernel, K(x, y), is approximately low-rank when restricted to subdomains satisfying a certain simple geometric condition. In d dimensions with O(N^d) quasi-uniformly distributed source and target points, when each appropriate submatrix of K is approximately rank-r, the running time of the algorithm is at most O(r^2 N^d log N). A parallelization of the butterfly algorithm is introduced which, assuming a message latency of α and per-process inverse bandwidth of β, executes in at most (Equation Presented.) time using p processes. This parallel algorithm was then instantiated in the form of the open-source DistButterfly library for the special case where K(x, y) = exp(iΦ(x, y)), where Φ(x, y) is a black-box, sufficiently smooth, real-valued phase function. Experiments on Blue Gene/Q demonstrate impressive strong-scaling results for important classes of phase functions. Using quasi-uniform sources, hyperbolic Radon transforms and an analogue of a three-dimensional generalized Radon transform were observed to strong-scale from 1-node/16-cores up to 1024-nodes/16,384-cores with greater than 90% and 82% efficiency, respectively. © 2014 Society for Industrial and Applied Mathematics.
A Parallel Butterfly Algorithm

KAUST Repository

Poulson, Jack

2014-02-04

The butterfly algorithm is a fast algorithm which approximately evaluates a discrete analogue of the integral transform (Equation Presented.) at large numbers of target points when the kernel, K(x, y), is approximately low-rank when restricted to subdomains satisfying a certain simple geometric condition. In d dimensions with O(N^d) quasi-uniformly distributed source and target points, when each appropriate submatrix of K is approximately rank-r, the running time of the algorithm is at most O(r^2 N^d log N). A parallelization of the butterfly algorithm is introduced which, assuming a message latency of α and per-process inverse bandwidth of β, executes in at most (Equation Presented.) time using p processes. This parallel algorithm was then instantiated in the form of the open-source DistButterfly library for the special case where K(x, y) = exp(iΦ(x, y)), where Φ(x, y) is a black-box, sufficiently smooth, real-valued phase function. Experiments on Blue Gene/Q demonstrate impressive strong-scaling results for important classes of phase functions. Using quasi-uniform sources, hyperbolic Radon transforms and an analogue of a three-dimensional generalized Radon transform were observed to strong-scale from 1-node/16-cores up to 1024-nodes/16,384-cores with greater than 90% and 82% efficiency, respectively. © 2014 Society for Industrial and Applied Mathematics.

Genetic particle swarm parallel algorithm analysis of optimization arrangement on mistuned blades

Science.gov (United States)

Zhao, Tianyu; Yuan, Huiqun; Yang, Wenjun; Sun, Huagang

2017-12-01

This article introduces a method of mistuned parameter identification which consists of static frequency testing of blades, dichotomy and finite element analysis. A lumped parameter model of an engine bladed-disc system is then set up. A bladed arrangement optimization method, namely the genetic particle swarm optimization algorithm, is presented. It consists of a discrete particle swarm optimization and a genetic algorithm. From this, the local and global search ability is introduced. CUDA-based co-evolution particle swarm optimization, using a graphics processing unit, is presented and its performance is analysed. The results show that using optimization results can reduce the amplitude and localization of the forced vibration response of a bladed-disc system, while optimization based on the CUDA framework can improve the computing speed. This method could provide support for engineering applications in terms of effectiveness and efficiency.
High-Performance Psychometrics: The Parallel-E Parallel-M Algorithm for Generalized Latent Variable Models. Research Report. ETS RR-16-34

Science.gov (United States)

von Davier, Matthias

2016-01-01

This report presents results on a parallel implementation of the expectation-maximization (EM) algorithm for multidimensional latent variable models. The developments presented here are based on code that parallelizes both the E step and the M step of the parallel-E parallel-M algorithm. Examples presented in this report include item response…
High Performance Parallel Multigrid Algorithms for Unstructured Grids

Science.gov (United States)

Frederickson, Paul O.

1996-01-01

We describe a high performance parallel multigrid algorithm for a rather general class of unstructured grid problems in two and three dimensions. The algorithm PUMG, for parallel unstructured multigrid, is related in structure to the parallel multigrid algorithm PSMG introduced by McBryan and Frederickson, for they both obtain a higher convergence rate through the use of multiple coarse grids. Another reason for the high convergence rate of PUMG is its smoother, an approximate inverse developed by Baumgardner and Frederickson.
Parallel image encryption algorithm based on discretized chaotic map

International Nuclear Information System (INIS)

Zhou Qing; Wong Kwokwo; Liao Xiaofeng; Xiang Tao; Hu Yue

2008-01-01

Recently, a variety of chaos-based algorithms were proposed for image encryption. Nevertheless, none of them works efficiently in a parallel computing environment. In this paper, we propose a framework for parallel image encryption. Based on this framework, a new algorithm is designed using the discretized Kolmogorov flow map. It fulfills all the requirements for a parallel image encryption algorithm. Moreover, it is secure and fast. These properties make it a good choice for image encryption on parallel computing platforms.
Research on parallel algorithm for sequential pattern mining

Science.gov (United States)

Zhou, Lijuan; Qin, Bai; Wang, Yu; Hao, Zhongxiao

2008-03-01

Sequential pattern mining is the mining of frequent sequences related to time or other orders from a sequence database. Its initial motivation was to discover the laws of customer purchasing over a period of time by finding the frequent sequences. In recent years, sequential pattern mining has become an important direction of data mining, and its application field is no longer confined to business databases but has extended to new data sources such as the Web and to advanced science fields such as DNA analysis. The data used in sequential pattern mining have two characteristics: massive volume and distributed storage. Most existing sequential pattern mining algorithms have not considered these characteristics together. Based on these traits and on parallel computing theory, this paper puts forward a new distributed parallel algorithm, SPP (Sequential Pattern Parallel). The algorithm abides by the principle of pattern reduction and utilizes the divide-and-conquer strategy for parallelization. The first parallel task is to construct frequent item sets by applying the frequent concept and search space partition theory, and the second task is to construct frequent sequences using the depth-first search method at each processor. The algorithm only needs to access the database twice and does not generate candidate sequences, which reduces the access time and improves the mining efficiency. Based on a random data generation procedure and the different information structures designed, this paper simulated the SPP algorithm in a concrete parallel environment and implemented the AprioriAll algorithm. The experiments demonstrate that, compared with AprioriAll, the SPP algorithm had an excellent speedup factor and efficiency.
Explorations of the implementation of a parallel IDW interpolation algorithm in a Linux cluster-based parallel GIS

Science.gov (United States)

Huang, Fang; Liu, Dingsheng; Tan, Xicheng; Wang, Jian; Chen, Yunping; He, Binbin

2011-04-01

To design and implement an open-source parallel GIS (OP-GIS) based on a Linux cluster, the parallel inverse distance weighting (IDW) interpolation algorithm has been chosen as an example to explore the working model and the principle of algorithm parallel pattern (APP), one of the parallelization patterns for OP-GIS. Based on an analysis of the serial IDW interpolation algorithm of GRASS GIS, this paper has proposed and designed a specific parallel IDW interpolation algorithm, incorporating both single process, multiple data (SPMD) and master/slave (M/S) programming modes. The main steps of the parallel IDW interpolation algorithm are: (1) the master node packages the related information, and then broadcasts it to the slave nodes; (2) each node calculates its assigned data extent along one row using the serial algorithm; (3) the master node gathers the data from all nodes; and (4) iterations continue until all rows have been processed, after which the results are outputted. According to the experiments performed in the course of this work, the parallel IDW interpolation algorithm can attain an efficiency greater than 0.93 compared with similar algorithms, which indicates that the parallel algorithm can greatly reduce processing time and maximize speed and performance.
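The row-wise decomposition described in steps (1) to (4) can be illustrated with a minimal serial sketch. It is not the GRASS GIS or OP-GIS code: the function names, the grid convention (cell centres at integer coordinates) and the default power of 2 are assumptions made for illustration. Each call to `idw_row` is the unit of work that a node would receive; because rows do not depend on each other, the loop in `idw_grid` could be replaced by a scatter/gather over processes.

```python
import math

def idw_value(x, y, samples, power=2.0):
    """Inverse-distance-weighted estimate at (x, y) from (sx, sy, value) samples."""
    num = 0.0
    den = 0.0
    for sx, sy, value in samples:
        d = math.hypot(x - sx, y - sy)
        if d == 0.0:
            return value  # query point coincides with a sample point
        w = 1.0 / d ** power
        num += w * value
        den += w
    return num / den

def idw_row(row, ncols, samples, power=2.0):
    """Hypothetical work unit for one node: interpolate one full grid row."""
    return [idw_value(col, row, samples, power) for col in range(ncols)]

def idw_grid(nrows, ncols, samples, power=2.0):
    # Serial stand-in for the broadcast/compute/gather loop: rows are
    # independent, so each idw_row call could run on a different process.
    return [idw_row(r, ncols, samples, power) for r in range(nrows)]
```

Since every row needs the full sample set, the sketch mirrors the broadcast in step (1): the samples are passed unchanged to every work unit.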
Parallel algorithms and architecture for computation of manipulator forward dynamics

Science.gov (United States)

Fijany, Amir; Bejczy, Antal K.

1989-01-01

Parallel computation of manipulator forward dynamics is investigated. Considering three classes of algorithms for the solution of the problem, that is, the O(n), the O(n^2), and the O(n^3) algorithms, parallelism in the problem is analyzed. It is shown that the problem belongs to the class NC and that the time and processor bounds are O(log^2 n) and O(n^4), respectively. However, the fastest stable parallel algorithms achieve a computation time of O(n) and can be derived by parallelization of the O(n^3) serial algorithms. Parallel computation of the O(n^3) algorithms requires the development of parallel algorithms for a set of fundamentally different problems, that is, the Newton-Euler formulation, the computation of the inertia matrix, decomposition of the symmetric, positive definite matrix, and the solution of triangular systems. Parallel algorithms for this set of problems are developed which can be efficiently implemented on a unique architecture, a triangular array of n(n+2)/2 processors with a simple nearest-neighbor interconnection. This architecture is particularly suitable for VLSI and WSI implementations. The developed parallel algorithm, compared to the best serial O(n) algorithm, achieves an asymptotic speedup of more than two orders of magnitude in the computation of the forward dynamics.
The island model for parallel implementation of evolutionary algorithm of Population-Based Incremental Learning (PBIL) optimization

International Nuclear Information System (INIS)

Lima, Alan M.M. de; Schirru, Roberto

2000-01-01

Genetic algorithms are biologically motivated adaptive systems which have been used, with good results, for function optimization. The purpose of this work is to introduce a new parallelization method to be applied to the Population-Based Incremental Learning (PBIL) algorithm. PBIL combines standard genetic algorithm mechanisms with simple competitive learning and has been successfully used in combinatorial optimization problems. This algorithm is being developed for application to the reload optimization of PWR nuclear reactors. Tests have been performed with combinatorial optimization problems similar to the reload problem. Results are compared to the serial PBIL ones, showing the new method's superiority and its viability as a tool for the nuclear core reload problem solution. (author)
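For readers unfamiliar with PBIL, the serial baseline that the record refers to can be sketched as follows. This is a minimal illustration of the basic PBIL update on a toy objective (OneMax, i.e. counting 1-bits), not the authors' island-model method or their reload problem; the function name, parameter values and the choice of objective are all assumptions.

```python
import random

def pbil_onemax(n_bits=16, pop_size=20, learning_rate=0.1, generations=50, seed=0):
    """Serial PBIL on the toy OneMax objective; returns the probability vector."""
    rng = random.Random(seed)
    prob = [0.5] * n_bits  # start with no bias toward 0 or 1
    for _ in range(generations):
        # Sample a population of bit strings from the probability vector.
        population = [[1 if rng.random() < p else 0 for p in prob]
                      for _ in range(pop_size)]
        # Competitive learning: move the vector toward the best sample.
        best = max(population, key=sum)
        prob = [(1.0 - learning_rate) * p + learning_rate * bit
                for p, bit in zip(prob, best)]
    return prob
```

In an island-style parallelization one would presumably run several such probability vectors independently and exchange information between them from time to time; how the paper does this is not stated in the abstract.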
Study on the Method of Association Rules Mining Based on Genetic Algorithm and Application in Analysis of Seawater Samples

Directory of Open Access Journals (Sweden)

Qiuhong Sun

2014-04-01

Building on data mining research, this paper briefly introduces data mining methods based on the genetic algorithm and discusses the two theories on which the genetic algorithm rests, the template (schema) principle and implicit parallelism. Focusing on the application of genetic algorithms to association rule mining, the paper proposes improvements to the structure of the fitness function and to the data encoding; in particular, following earlier studies of the problem, an improved adaptive Pc, Pm algorithm is applied to the genetic algorithm, thereby improving the efficiency of the algorithm. Finally, a genetic-algorithm-based association rule mining algorithm is applied to data mining in a seawater sample database and is shown to be effective.
Introduction to parallel algorithms and architectures arrays, trees, hypercubes

CERN Document Server

Leighton, F Thomson

1991-01-01

Introduction to Parallel Algorithms and Architectures: Arrays, Trees, Hypercubes provides an introduction to the expanding field of parallel algorithms and architectures. This book focuses on parallel computation involving the most popular network architectures, namely, arrays, trees, hypercubes, and some closely related networks. Organized into three chapters, this book begins with an overview of the simplest architectures of arrays and trees. This text then presents the structures and relationships between the dominant network architectures, as well as the most efficient parallel algorithms for
An Improved Hierarchical Genetic Algorithm for Sheet Cutting Scheduling with Process Constraints

Directory of Open Access Journals (Sweden)

Yunqing Rao

2013-01-01

For the first time, an improved hierarchical genetic algorithm for sheet cutting problem which involves n cutting patterns for m non-identical parallel machines with process constraints has been proposed in the integrated cutting stock model. The objective of the cutting scheduling problem is minimizing the weighted completed time. A mathematical model for this problem is presented, an improved hierarchical genetic algorithm (ant colony--hierarchical genetic algorithm) is developed for better solution, and a hierarchical coding method is used based on the characteristics of the problem. Furthermore, to speed up convergence rates and resolve local convergence issues, a kind of adaptive crossover probability and mutation probability is used in this algorithm. The computational result and comparison prove that the presented approach is quite effective for the considered problem.
An improved hierarchical genetic algorithm for sheet cutting scheduling with process constraints.

Science.gov (United States)

Rao, Yunqing; Qi, Dezhong; Li, Jinling

2013-01-01

For the first time, an improved hierarchical genetic algorithm for sheet cutting problem which involves n cutting patterns for m non-identical parallel machines with process constraints has been proposed in the integrated cutting stock model. The objective of the cutting scheduling problem is minimizing the weighted completed time. A mathematical model for this problem is presented, an improved hierarchical genetic algorithm (ant colony--hierarchical genetic algorithm) is developed for better solution, and a hierarchical coding method is used based on the characteristics of the problem. Furthermore, to speed up convergence rates and resolve local convergence issues, a kind of adaptive crossover probability and mutation probability is used in this algorithm. The computational result and comparison prove that the presented approach is quite effective for the considered problem.
Parallelization of a blind deconvolution algorithm

Science.gov (United States)

Matson, Charles L.; Borelli, Kathy J.

2006-09-01

Often it is of interest to deblur imagery in order to obtain higher-resolution images. Deblurring requires knowledge of the blurring function - information that is often not available separately from the blurred imagery. Blind deconvolution algorithms overcome this problem by jointly estimating both the high-resolution image and the blurring function from the blurred imagery. Because blind deconvolution algorithms are iterative in nature, they can take minutes to days to deblur an image depending how many frames of data are used for the deblurring and the platforms on which the algorithms are executed. Here we present our progress in parallelizing a blind deconvolution algorithm to increase its execution speed. This progress includes sub-frame parallelization and a code structure that is not specialized to a specific computer hardware architecture.
MIP Models and Hybrid Algorithms for Simultaneous Job Splitting and Scheduling on Unrelated Parallel Machines

Science.gov (United States)

Ozmutlu, H. Cenk

2014-01-01

We developed mixed integer programming (MIP) models and hybrid genetic-local search algorithms for the scheduling problem of unrelated parallel machines with job sequence and machine-dependent setup times and with the job splitting property. The first contribution of this paper is to introduce novel algorithms that perform splitting and scheduling simultaneously with a variable number of subjobs. We propose a simple chromosome structure constituted by random key numbers in the hybrid genetic-local search algorithm (GAspLA). Random key numbers are used frequently in genetic algorithms, but they create additional difficulty when hybrid factors in local search are implemented. We developed algorithms that adapt the results of local search into the genetic algorithms with a minimum relocation operation of the genes' random key numbers. This is the second contribution of the paper. The third contribution is three new MIP models that perform splitting and scheduling simultaneously. The fourth contribution is the implementation of GAspLAMIP, which lets us verify the optimality of GAspLA for the studied combinations. The proposed methods are tested on a set of problems taken from the literature, and the results validate the effectiveness of the proposed algorithms. PMID:24977204
MIP models and hybrid algorithms for simultaneous job splitting and scheduling on unrelated parallel machines.

Science.gov (United States)

Eroglu, Duygu Yilmaz; Ozmutlu, H Cenk

2014-01-01

We developed mixed integer programming (MIP) models and hybrid genetic-local search algorithms for the scheduling problem of unrelated parallel machines with job sequence and machine-dependent setup times and with the job splitting property. The first contribution of this paper is to introduce novel algorithms that perform splitting and scheduling simultaneously with a variable number of subjobs. We propose a simple chromosome structure constituted by random key numbers in the hybrid genetic-local search algorithm (GAspLA). Random key numbers are used frequently in genetic algorithms, but they create additional difficulty when hybrid factors in local search are implemented. We developed algorithms that adapt the results of local search into the genetic algorithms with a minimum relocation operation of the genes' random key numbers. This is the second contribution of the paper. The third contribution is three new MIP models that perform splitting and scheduling simultaneously. The fourth contribution is the implementation of GAspLAMIP, which lets us verify the optimality of GAspLA for the studied combinations. The proposed methods are tested on a set of problems taken from the literature, and the results validate the effectiveness of the proposed algorithms.
Graph Transformation and Designing Parallel Sparse Matrix Algorithms beyond Data Dependence Analysis

Directory of Open Access Journals (Sweden)

H.X. Lin

2004-01-01

Algorithms are often parallelized based on data dependence analysis, either manually or by means of parallel compilers. Some vector/matrix computations, such as matrix-vector products with simple data dependence structures (data parallelism), can be easily parallelized. For problems with more complicated data dependence structures, parallelization is less straightforward. The data dependence graph is a powerful means for designing and analyzing parallel algorithms. However, for sparse matrix computations, parallelization based solely on exploiting the existing parallelism in an algorithm does not always give satisfactory results. For example, the conventional Gaussian elimination algorithm for the solution of a tri-diagonal system is inherently sequential, so algorithms designed specially for parallel computation are needed. After briefly reviewing different parallelization approaches, a powerful graph formalism for designing parallel algorithms is introduced. This formalism will be discussed using a tri-diagonal system as an example. Its application to general matrix computations is also discussed. Its power in designing parallel algorithms beyond the ability of data dependence analysis is shown by means of a new algorithm called ACER (Alternating Cyclic Elimination and Reduction algorithm).
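The remark that conventional Gaussian elimination on a tri-diagonal system is inherently sequential can be seen in a short sketch of the standard Thomas algorithm (this is the conventional serial method, not ACER; the function name and argument layout are assumptions). In both loops, step i reads a value produced at step i - 1 or i + 1, so the iterations form a dependence chain that data dependence analysis alone cannot break.

```python
def thomas_solve(lower, diag, upper, rhs):
    """Solve a tri-diagonal system; lower[0] and upper[-1] are ignored."""
    n = len(diag)
    c = [0.0] * n  # modified super-diagonal
    d = [0.0] * n  # modified right-hand side
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    # Forward elimination: step i needs c[i - 1] and d[i - 1].
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom if i < n - 1 else 0.0
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    # Back substitution: step i needs x[i + 1].
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x
```

No pivoting is performed, so the sketch assumes a system (for example a diagonally dominant one) for which elimination without row exchanges is stable.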
Fundamental Parallel Algorithms for Private-Cache Chip Multiprocessors

DEFF Research Database (Denmark)

Arge, Lars Allan; Goodrich, Michael T.; Nelson, Michael

2008-01-01

about the way cores are interconnected, for we assume that all inter-processor communication occurs through the memory hierarchy. We study several fundamental problems, including prefix sums, selection, and sorting, which often form the building blocks of other parallel algorithms. Indeed, we present...... two sorting algorithms, a distribution sort and a mergesort. Our algorithms are asymptotically optimal in terms of parallel cache accesses and space complexity under reasonable assumptions about the relationships between the number of processors, the size of memory, and the size of cache blocks....... In addition, we study sorting lower bounds in a computational model, which we call the parallel external-memory (PEM) model, that formalizes the essential properties of our algorithms for private-cache CMPs....
Academic training: From Evolution Theory to Parallel and Distributed Genetic Programming

CERN Multimedia

2007-01-01

2006-2007 ACADEMIC TRAINING PROGRAMME LECTURE SERIES 15, 16 March From 11:00 to 12:00 - Main Auditorium, bldg. 500 From Evolution Theory to Parallel and Distributed Genetic Programming F. FERNANDEZ DE VEGA / Univ. of Extremadura, SP Lecture No. 1: From Evolution Theory to Evolutionary Computation Evolutionary computation is a subfield of artificial intelligence (more particularly computational intelligence) involving combinatorial optimization problems, which are based to some degree on the evolution of biological life in the natural world. In this tutorial we will review the source of inspiration for this metaheuristic and its capability for solving problems. We will show the main flavours within the field, and different problems that have been successfully solved employing this kind of techniques. Lecture No. 2: Parallel and Distributed Genetic Programming The successful application of Genetic Programming (GP, one of the available Evolutionary Algorithms) to optimization problems has encouraged an ...
Parallel data encryption with RSA algorithm

OpenAIRE

Неретин, А. А.

2016-01-01

In this paper, a parallel RSA algorithm with preliminary shuffling of the source text is presented. The dependence of the encryption speed on the number of encryption nodes has been analysed. The proposed algorithm was implemented in C#.
Discrete Hadamard transformation algorithm's parallelism analysis and achievement

Science.gov (United States)

Hu, Hui

2009-07-01

The Discrete Hadamard Transformation (DHT) is widely applied in real-time signal processing, but its use is limited by the operation speed of DSPs. This article studies the parallelization of the DHT and analyses its parallel performance. Based on the programming structure of the TMS320C80 multiprocessor platform, two kinds of parallel DHT algorithms are implemented. Several experiments demonstrate the effectiveness of the proposed algorithms.
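As background on where the parallelism in such a transform comes from, the following is a minimal sketch of the standard fast Walsh-Hadamard butterfly in natural (Hadamard) ordering. It is not the TMS320C80 implementation from the record, and the function name is an assumption. Within one stage (one value of `h`), every butterfly touches a distinct pair of elements, so the butterflies of a stage could be distributed across processors.

```python
def fwht(values):
    """Unnormalized fast Walsh-Hadamard transform; length must be a power of two."""
    a = list(values)
    n = len(a)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    h = 1
    while h < n:
        # All butterflies in this stage are independent of each other.
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                x, y = a[i], a[i + h]
                a[i], a[i + h] = x + y, x - y
        h *= 2
    return a
```

Applying the unnormalized transform twice returns the input scaled by its length, which gives a convenient self-check.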

Genomic multiple sequence alignments: refinement using a genetic algorithm

Directory of Open Access Journals (Sweden)

Lefkowitz Elliot J

2005-08-01

Background: Genomic sequence data cannot be fully appreciated in isolation. Comparative genomics, the practice of comparing genomic sequences from different species, plays an increasingly important role in understanding the genotypic differences between species that result in phenotypic differences as well as in revealing patterns of evolutionary relationships. One of the major challenges in comparative genomics is producing a high-quality alignment between two or more related genomic sequences. In recent years, a number of tools have been developed for aligning large genomic sequences. Most utilize heuristic strategies to identify a series of strong sequence similarities, which are then used as anchors to align the regions between the anchor points. The resulting alignment is globally correct, but in many cases is suboptimal locally. We describe a new program, GenAlignRefine, which improves the overall quality of global multiple alignments by using a genetic algorithm to improve local regions of alignment. Regions of low quality are identified, realigned using the program T-Coffee, and then refined using a genetic algorithm. Because a better COFFEE (Consistency based Objective Function For alignmEnt Evaluation) score generally reflects greater alignment quality, the algorithm searches for an alignment that yields a better COFFEE score. To improve the intrinsic slowness of the genetic algorithm, GenAlignRefine was implemented as a parallel, cluster-based program. Results: We tested the GenAlignRefine algorithm by running it on a Linux cluster to refine sequences from a simulation, as well as refine a multiple alignment of 15 Orthopoxvirus genomic sequences approximately 260,000 nucleotides in length that initially had been aligned by Multi-LAGAN. It took approximately 150 minutes for a 40-processor Linux cluster to optimize some 200 fuzzy (poorly aligned) regions of the orthopoxvirus alignment. Overall sequence identity increased only
New Parallel Algorithms for Landscape Evolution Model

Science.gov (United States)

Jin, Y.; Zhang, H.; Shi, Y.

2017-12-01

Most landscape evolution models (LEM) developed in the last two decades solve the diffusion equation to simulate the transportation of surface sediments. This numerical approach is difficult to parallelize due to the computation of drainage area for each node, which needs a huge amount of communication if run in parallel. In order to overcome this difficulty, we developed two parallel algorithms for LEM with a stream net. One algorithm handles the partition of the grid with traditional methods and applies an efficient global reduction algorithm to compute the drainage areas and transport rates for the stream net; the other algorithm is based on a new partition algorithm, which first partitions the nodes in catchments between processes, and then partitions the cells according to the partition of nodes. Both methods focus on decreasing communication between processes and take advantage of massive computing techniques, and numerical experiments show that they are both adequate to handle large-scale problems with millions of cells. We implemented the two algorithms in our program based on the widely used finite element library deal.II, so that it can be easily coupled with ASPECT.
Where are the parallel algorithms?

Science.gov (United States)

Voigt, R. G.

1985-01-01

Four paradigms that can be useful in developing parallel algorithms are discussed. These include computational complexity analysis, changing the order of computation, asynchronous computation, and divide and conquer. Each is illustrated with an example from scientific computation, and it is shown that computational complexity must be used with great care or an inefficient algorithm may be selected.
Parallel conjugate gradient algorithms for manipulator dynamic simulation

Science.gov (United States)

Fijany, Amir; Scheld, Robert E.

1989-01-01

Parallel conjugate gradient algorithms for the computation of multibody dynamics are developed for the specialized case of a robot manipulator. For an n-dimensional positive-definite linear system, the Classical Conjugate Gradient (CCG) algorithms are guaranteed to converge in n iterations, each with a computation cost of O(n); this leads to a total computational cost of O(n^2) on a serial processor. Conjugate gradient algorithms are presented that provide greater efficiency by using a preconditioner, which reduces the number of iterations required, and by exploiting parallelism, which reduces the cost of each iteration. Two Preconditioned Conjugate Gradient (PCG) algorithms are proposed which respectively use a diagonal and a tridiagonal matrix, composed of the diagonal and tridiagonal elements of the mass matrix, as preconditioners. Parallel algorithms are developed to compute the preconditioners and their inversions in O(log_2 n) steps using n processors. A parallel algorithm is also presented which, on the same architecture, achieves a computational time of O(log_2 n) for each iteration. Simulation results for a seven degree-of-freedom manipulator are presented. Variants of the proposed algorithms are also developed which can be efficiently implemented on the Robot Mathematics Processor (RMP).
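A diagonally preconditioned conjugate gradient iteration of the kind mentioned above can be sketched in a few lines. This is the textbook serial PCG with a Jacobi (diagonal) preconditioner on a dense list-of-lists matrix, written under the assumption that the matrix is symmetric positive definite; it is not the authors' parallel formulation, and the function name, tolerance and iteration cap are illustrative choices.

```python
def pcg_diagonal(A, b, tol=1e-10, max_iter=None):
    """Solve A x = b for symmetric positive-definite A with a diagonal preconditioner."""
    n = len(b)
    if max_iter is None:
        max_iter = 10 * n

    def matvec(v):
        return [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]

    def dot(u, v):
        return sum(ui * vi for ui, vi in zip(u, v))

    x = [0.0] * n
    r = list(b)                                   # residual b - A x for x = 0
    z = [r[i] / A[i][i] for i in range(n)]        # apply the inverse diagonal
    p = list(z)
    rz = dot(r, z)
    for iteration in range(max_iter):
        if dot(r, r) ** 0.5 <= tol:
            return x, iteration
        Ap = matvec(p)
        alpha = rz / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        z = [r[i] / A[i][i] for i in range(n)]
        rz_next = dot(r, z)
        p = [zi + (rz_next / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_next
    return x, max_iter
```

The matrix-vector product and the dot products are the operations that dominate each iteration, which is why they are the natural targets for the parallel reductions described in the record.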
Empirical study of parallel LRU simulation algorithms

Science.gov (United States)

Carr, Eric; Nicol, David M.

1994-01-01

This paper reports on the performance of five parallel algorithms for simulating a fully associative cache operating under the LRU (Least-Recently-Used) replacement policy. Three of the algorithms are SIMD, and are implemented on the MasPar MP-2 architecture. Two other algorithms are parallelizations of an efficient serial algorithm on the Intel Paragon. One SIMD algorithm is quite simple, but its cost is linear in the cache size. The two other SIMD algorithms are more complex, but have costs that are independent of the cache size. Both the second and third SIMD algorithms compute all stack distances; the second SIMD algorithm is completely general, whereas the third SIMD algorithm presumes and takes advantage of bounds on the range of reference tags. Both MIMD algorithms implemented on the Paragon are general and compute all stack distances; they differ in one step that may affect their respective scalability. We assess the strengths and weaknesses of these algorithms as a function of problem size and characteristics, and compare their performance on traces derived from execution of three SPEC benchmark programs.
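The notion of stack distance used above can be made concrete with a naive serial sketch (not one of the five algorithms in the paper; function names are assumptions). The stack distance of a reference is the position of its tag in the LRU stack, counted from 1 at the most recently used end; a fully associative LRU cache of size C hits exactly those references whose distance is at most C, so one pass over the trace yields hit counts for every cache size.

```python
def lru_stack_distances(trace):
    """Return the LRU stack distance of each reference (None for a first touch)."""
    stack = []       # most recently used tag at index 0
    distances = []
    for tag in trace:
        if tag in stack:
            depth = stack.index(tag)
            distances.append(depth + 1)
            stack.pop(depth)
        else:
            distances.append(None)  # cold miss: effectively infinite distance
        stack.insert(0, tag)
    return distances

def hits_for_cache_size(distances, size):
    """Count references that would hit in an LRU cache holding `size` lines."""
    return sum(1 for d in distances if d is not None and d <= size)
```

This list-based version costs time proportional to the stack depth per reference, which is the kind of cost the more elaborate algorithms in the record aim to avoid.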
Systematic approach for deriving feasible mappings of parallel algorithms to parallel computing platforms

NARCIS (Netherlands)

Arkin, Ethem; Tekinerdogan, Bedir; Imre, Kayhan M.

2017-01-01

The need for high-performance computing together with the increasing trend from single processor to parallel computer architectures has leveraged the adoption of parallel computing. To benefit from parallel computing power, usually parallel algorithms are defined that can be mapped and executed
A hybrid, massively parallel implementation of a genetic algorithm for optimization of the impact performance of a metal/polymer composite plate

KAUST Repository

Narayanan, Kiran

2012-07-17

A hybrid parallelization method composed of a coarse-grained genetic algorithm (GA) and fine-grained objective function evaluations is implemented on a heterogeneous computational resource consisting of 16 IBM Blue Gene/P racks, a single x86 cluster node and a high-performance file system. The GA iterator is coupled with a finite-element (FE) analysis code developed in house to facilitate computational steering in order to calculate the optimal impact velocities of a projectile colliding with a polyurea/structural steel composite plate. The FE code is capable of capturing adiabatic shear bands and strain localization, which are typically observed in high-velocity impact applications, and it includes several constitutive models of plasticity, viscoelasticity and viscoplasticity for metals and soft materials, which allow simulation of ductile fracture by void growth. A strong scaling study of the FE code was conducted to determine the optimum number of processes run in parallel. The relative efficiency of the hybrid, multi-level parallelization method is studied in order to determine the parameters for the parallelization. Optimal impact velocities of the projectile calculated using the proposed approach, are reported. © The Author(s) 2012.
Comparison of multihardware parallel implementations for a phase unwrapping algorithm

Science.gov (United States)

Hernandez-Lopez, Francisco Javier; Rivera, Mariano; Salazar-Garibay, Adan; Legarda-Sáenz, Ricardo

2018-04-01

Phase unwrapping is an important problem in the areas of optical metrology, synthetic aperture radar (SAR) image analysis, and magnetic resonance imaging (MRI) analysis. These images are becoming larger in size and, particularly, the availability and need for processing of SAR and MRI data have increased significantly with the acquisition of remote sensing data and the popularization of magnetic resonators in clinical diagnosis. Therefore, it is important to develop faster and more accurate phase unwrapping algorithms. We propose a parallel multigrid algorithm of a phase unwrapping method named accumulation of residual maps, which builds on a serial algorithm that consists of the minimization of a cost function, achieved by means of a serial Gauss-Seidel type algorithm. Our algorithm also optimizes the original cost function, but unlike the original work, our algorithm is of a parallel Jacobi class with alternated minimizations. This strategy is known as the chessboard type, where red pixels can be updated in parallel in the same iteration since they are independent. Similarly, black pixels can be updated in parallel in an alternating iteration. We present parallel implementations of our algorithm for different parallel multicore architectures such as multicore CPU, Xeon Phi coprocessor, and Nvidia graphics processing unit. In all cases, we obtain superior performance of our parallel algorithm when compared with the original serial version. In addition, we present a detailed comparative performance analysis of the developed parallel versions.
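The chessboard ordering can be illustrated on a much simpler problem than phase unwrapping. The sketch below applies it to the 5-point Laplace stencil with fixed boundary values, which is an assumption made purely for illustration and is not the cost function of the paper. The point it shows is structural: with a nearest-neighbour stencil, every cell of one colour depends only on cells of the other colour, so all cells of a colour could be updated simultaneously.

```python
def red_black_sweep(u):
    """One in-place chessboard sweep of the 5-point Laplace stencil on grid u."""
    rows, cols = len(u), len(u[0])
    for color in (0, 1):  # 0 = "red" cells, 1 = "black" cells
        # Every cell of the current color reads only cells of the other
        # color, so this double loop could run fully in parallel.
        for i in range(1, rows - 1):
            for j in range(1, cols - 1):
                if (i + j) % 2 == color:
                    u[i][j] = 0.25 * (u[i - 1][j] + u[i + 1][j]
                                      + u[i][j - 1] + u[i][j + 1])
    return u
```

Boundary cells are never written, so they act as fixed (Dirichlet) values in this toy setting.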
Parallel/vector algorithms for the spherical SN transport theory method

International Nuclear Information System (INIS)

Haghighat, A.; Mattis, R.E.

1990-01-01

This paper discusses vector and parallel processing of a 1-D curvilinear (i.e. spherical) S_N transport theory algorithm on the Cornell National SuperComputer Facility (CNSF) IBM 3090/600E. Two different vector algorithms were developed and parallelized based on angular decomposition. It is shown that significant speedups are attainable. For example, for problems with large granularity, using 4 processors, the parallel/vector algorithm achieves speedups (for wall-clock time) of more than 4.5 relative to the old serial/scalar algorithm. Furthermore, this work has demonstrated the existing potential for the development of faster processing vector and parallel algorithms for multidimensional curvilinear geometries. (author)
Online Algorithms for Parallel Job Scheduling and Strip Packing

NARCIS (Netherlands)

Hurink, Johann L.; Paulus, J.J.

We consider the online scheduling problem of parallel jobs on parallel machines, $P|online{−}list,m_j |C_{max}$. For this problem we present a 6.6623-competitive algorithm. This improves the best known 7-competitive algorithm for this problem. The presented algorithm also applies to the problem
A Parallel Particle Swarm Optimization Algorithm Accelerated by Asynchronous Evaluations

Science.gov (United States)

Venter, Gerhard; Sobieszczanski-Sobieski, Jaroslaw

2005-01-01

A parallel Particle Swarm Optimization (PSO) algorithm is presented. Particle swarm optimization is a fairly recent addition to the family of non-gradient based, probabilistic search algorithms that is based on a simplified social model and is closely tied to swarming theory. Although PSO algorithms present several attractive properties to the designer, they are plagued by high computational cost as measured by elapsed time. One approach to reduce the elapsed time is to make use of coarse-grained parallelization to evaluate the design points. Previous parallel PSO algorithms were mostly implemented in a synchronous manner, where all design points within a design iteration are evaluated before the next iteration is started. This approach leads to poor parallel speedup in cases where a heterogeneous parallel environment is used and/or where the analysis time depends on the design point being analyzed. This paper introduces an asynchronous parallel PSO algorithm that greatly improves the parallel efficiency. The asynchronous algorithm is benchmarked on a cluster assembled of Apple Macintosh G5 desktop computers, using the multi-disciplinary optimization of a typical transport aircraft wing as an example.
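The difference between synchronous and asynchronous evaluation can be sketched with Python's standard `concurrent.futures` module. This is only an illustration of the scheduling idea, not the authors' implementation: the helper name `evaluate_async`, the use of threads and the worker count are assumptions. Results are handed back in completion order, so a caller does not have to wait for the slowest design point before acting on the others.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def evaluate_async(points, objective, workers=4):
    """Yield (index, value) pairs in completion order, not submission order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Map each future back to the index of the design point it evaluates.
        futures = {pool.submit(objective, p): i for i, p in enumerate(points)}
        for future in as_completed(futures):
            yield futures[future], future.result()
```

In an asynchronous PSO one would presumably update a particle and resubmit it inside the consuming loop as soon as its result arrives, whereas a synchronous variant would collect all results first.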
Algorithms for computational fluid dynamics on parallel processors

International Nuclear Information System (INIS)

Van de Velde, E.F.

1986-01-01

A study of parallel algorithms for the numerical solution of partial differential equations arising in computational fluid dynamics is presented. The actual implementation on parallel processors of shared and nonshared memory design is discussed. The performance of these algorithms is analyzed in terms of machine efficiency, communication time, bottlenecks and software development costs. For elliptic equations, a parallel preconditioned conjugate gradient method is described, which has been used to solve pressure equations discretized with high order finite elements on irregular grids. A parallel full multigrid method and a parallel fast Poisson solver are also presented. Hyperbolic conservation laws were discretized with parallel versions of finite difference methods like the Lax-Wendroff scheme and with the Random Choice method. Techniques are developed for comparing the behavior of an algorithm on different architectures as a function of problem size and local computational effort. Effective use of these advanced architecture machines requires the use of machine dependent programming. It is shown that the portability problems can be minimized by introducing high level operations on vectors and matrices structured into program libraries.
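As a concrete instance of the finite difference methods mentioned for hyperbolic conservation laws, the following sketch performs one Lax-Wendroff step for the linear advection equation u_t + a u_x = 0 on a periodic grid. The model equation, the periodic boundary and the function name are assumptions chosen for illustration; the record does not say which equations were solved. Each new value depends only on three old neighbouring values, so the grid can be split among processors with an exchange of one boundary cell per side.

```python
def lax_wendroff_step(u, courant):
    """One Lax-Wendroff step for linear advection; courant = a * dt / dx."""
    n = len(u)
    new = [0.0] * n
    for i in range(n):
        left = u[i - 1]            # index -1 wraps around: periodic boundary
        right = u[(i + 1) % n]
        new[i] = (u[i]
                  - 0.5 * courant * (right - left)
                  + 0.5 * courant ** 2 * (right - 2.0 * u[i] + left))
    return new
```

With a Courant number of exactly 1 the update reduces to new[i] = u[i - 1], that is, an exact shift by one cell, which is a handy sanity check.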
Design of PID Controller Simulator based on Genetic Algorithm

Directory of Open Access Journals (Sweden)

Fahri VATANSEVER

2013-08-01

PID (Proportional, Integral and Derivative) controllers take an important place in the field of system control. Various methods such as Ziegler-Nichols, Cohen-Coon, Chien-Hrones-Reswick (CHR) and Wang-Juang-Chan are available for the design of such controllers, drawing on the system's time and frequency domain data. These controllers are in compliance with system properties under certain criteria suitable to the system. Genetic algorithms have become widely used in control system applications in parallel with the advances in the fields of computing and artificial intelligence. In this study, PID controller designs have been carried out by means of classical methods and genetic algorithms, and comparative results have been analyzed. For this purpose, a graphical user interface program which can be used for educational purposes has been developed. For the entered transfer functions, the suitable P, PI and PID controller coefficients have been calculated by both classical methods and genetic algorithms, and many parameters and responses of the systems have been compared and presented numerically and graphically.
An optimization method of relativistic backward wave oscillator using particle simulation and genetic algorithms

Energy Technology Data Exchange (ETDEWEB)

Chen, Zaigao; Wang, Jianguo [Key Laboratory for Physical Electronics and Devices of the Ministry of Education, Xi'an Jiaotong University, Xi'an, Shaanxi 710049 (China); Northwest Institute of Nuclear Technology, P.O. Box 69-12, Xi'an, Shaanxi 710024 (China)]; Wang, Yue; Qiao, Hailiang; Zhang, Dianhui [Northwest Institute of Nuclear Technology, P.O. Box 69-12, Xi'an, Shaanxi 710024 (China)]; Guo, Weijie [Key Laboratory for Physical Electronics and Devices of the Ministry of Education, Xi'an Jiaotong University, Xi'an, Shaanxi 710049 (China)]

2013-11-15

An optimal design method for high-power microwave sources using particle simulation and parallel genetic algorithms is presented in this paper. The output power of the high-power microwave device, simulated by the fully electromagnetic particle simulation code UNIPIC, is taken as the fitness function, and float-encoding genetic algorithms are used to optimize the high-power microwave devices. Using this method, we encode the heights of the non-uniform slow wave structure in relativistic backward wave oscillators (RBWO), and optimize the parameters on massively parallel processors. Simulation results demonstrate that we can obtain the optimal parameters of the non-uniform slow wave structure in the RBWO, and the output microwave power increases by 52.6% after the device is optimized.
Analysis of a parallel multigrid algorithm

Science.gov (United States)

Chan, Tony F.; Tuminaro, Ray S.

1989-01-01

The parallel multigrid algorithm of Frederickson and McBryan (1987) is considered. This algorithm uses multiple coarse-grid problems (instead of one problem) in the hope of accelerating convergence and is found to have a close relationship to traditional multigrid methods. Specifically, the parallel coarse-grid correction operator is identical to a traditional multigrid coarse-grid correction operator, except that the mixing of high and low frequencies caused by aliasing error is removed. Appropriate relaxation operators can be chosen to take advantage of this property. Comparisons between the standard multigrid and the new method are made.
A Parallel Prefix Algorithm for Almost Toeplitz Tridiagonal Systems

Science.gov (United States)

Sun, Xian-He; Joslin, Ronald D.

1995-01-01

A compact scheme is a discretization scheme that is advantageous in obtaining highly accurate solutions. However, the resulting systems from compact schemes are tridiagonal systems that are difficult to solve efficiently on parallel computers. Considering the almost symmetric Toeplitz structure, a parallel algorithm, simple parallel prefix (SPP), is proposed. The SPP algorithm requires less memory than the conventional LU decomposition and is efficient on parallel machines. It consists of a prefix communication pattern and AXPY operations. Both the computation and the communication can be truncated without degrading the accuracy when the system is diagonally dominant. A formal accuracy study has been conducted to provide a simple truncation formula. Experimental results have been measured on a MasPar MP-1 SIMD machine and on a Cray 2 vector machine. Experimental results show that the simple parallel prefix algorithm is a good algorithm for symmetric, almost symmetric Toeplitz tridiagonal systems and for the compact scheme on high-performance computers.
Graphics Processing Unit–Enhanced Genetic Algorithms for Solving the Temporal Dynamics of Gene Regulatory Networks

Science.gov (United States)

García-Calvo, Raúl; Guisado, JL; Diaz-del-Rio, Fernando; Córdoba, Antonio; Jiménez-Morales, Francisco

2018-01-01

Understanding the regulation of gene expression is one of the key problems in current biology. A promising method for that purpose is the determination of the temporal dynamics between known initial and ending network states, by using simple acting rules. The huge amount of rule combinations and the nonlinear inherent nature of the problem make genetic algorithms an excellent candidate for finding optimal solutions. As this is a computationally intensive problem that needs long runtimes in conventional architectures for realistic network sizes, it is fundamental to accelerate this task. In this article, we study how to develop efficient parallel implementations of this method for the fine-grained parallel architecture of graphics processing units (GPUs) using the compute unified device architecture (CUDA) platform. An exhaustive and methodical study of various parallel genetic algorithm schemes—master-slave, island, cellular, and hybrid models, and various individual selection methods (roulette, elitist)—is carried out for this problem. Several procedures that optimize the use of the GPU’s resources are presented. We conclude that the implementation that produces better results (both from the performance and the genetic algorithm fitness perspectives) is simulating a few thousands of individuals grouped in a few islands using elitist selection. This model comprises 2 mighty factors for discovering the best solutions: finding good individuals in a short number of generations, and introducing genetic diversity via a relatively frequent and numerous migration. As a result, we have even found the optimal solution for the analyzed gene regulatory network (GRN). In addition, a comparative study of the performance obtained by the different parallel implementations on GPU versus a sequential application on CPU is carried out. In our tests, a multifold speedup was obtained for our optimized parallel implementation of the method on medium class GPU over an equivalent
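The island model with elitist selection and periodic migration described in this record can be sketched roughly as follows. This is a minimal sequential Python illustration, not the authors' CUDA implementation: the OneMax fitness function, the ring migration topology, and every parameter value (island count, population size, migration interval) are assumptions chosen only to keep the example small.

```python
import random

def fitness(bits):
    # Toy objective (an assumption for illustration): number of 1-bits ("OneMax").
    return sum(bits)

def evolve_island(pop, rng, elite=2, mutation_rate=0.02):
    """One generation with elitist selection: the best `elite` individuals survive unchanged."""
    pop = sorted(pop, key=fitness, reverse=True)
    next_pop = [ind[:] for ind in pop[:elite]]
    parents = pop[: max(2, len(pop) // 2)]      # breed only from the better half
    while len(next_pop) < len(pop):
        a, b = rng.sample(parents, 2)
        cut = rng.randrange(1, len(a))          # one-point crossover
        child = a[:cut] + b[cut:]
        child = [bit ^ 1 if rng.random() < mutation_rate else bit for bit in child]
        next_pop.append(child)
    return next_pop

def migrate(islands, n_migrants):
    """Ring migration: each island's best individuals replace the worst of the next island."""
    best = [sorted(isl, key=fitness, reverse=True)[:n_migrants] for isl in islands]
    for i, isl in enumerate(islands):
        incoming = best[(i - 1) % len(islands)]
        isl.sort(key=fitness)                   # worst individuals first
        isl[:n_migrants] = [ind[:] for ind in incoming]

def island_ga(n_islands=4, island_size=30, genome_len=40,
              generations=60, migrate_every=5, n_migrants=3, seed=0):
    rng = random.Random(seed)
    islands = [[[rng.randint(0, 1) for _ in range(genome_len)]
                for _ in range(island_size)] for _ in range(n_islands)]
    for gen in range(1, generations + 1):
        # On a GPU or cluster each island would evolve concurrently; here they run in turn.
        islands = [evolve_island(isl, rng) for isl in islands]
        if gen % migrate_every == 0:
            migrate(islands, n_migrants)
    return max((ind for isl in islands for ind in isl), key=fitness)
```

Because elites are copied unchanged and migrants only overwrite the worst individuals, the best fitness on each island never decreases from one generation to the next.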
A parallel 2-opt algorithm for the traveling salesman problem

NARCIS (Netherlands)

Verhoeven, M.G.A.; Aarts, E.H.L.; Swinkels, P.C.J.

1995-01-01

We present a scalable parallel local search algorithm based on data parallelism. The concept of distributed neighborhood structures is introduced, and applied to the Traveling Salesman Problem (TSP). Our parallel local search algorithm finds the same quality solutions as the classical 2-opt
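For reference, the classical sequential 2-opt local search that this record uses as its quality baseline can be sketched as below. This is only the textbook first-improvement variant in Python; it does not attempt the distributed neighborhood structures of the paper, and the function names are illustrative.

```python
import math

def tour_length(tour, pts):
    """Total length of the closed tour visiting pts in the given order."""
    return sum(math.dist(pts[tour[i]], pts[tour[(i + 1) % len(tour)]])
               for i in range(len(tour)))

def two_opt(tour, pts):
    """Reverse a segment whenever doing so shortens the tour, until no such move remains."""
    tour = tour[:]
    n = len(tour)
    improved = True
    while improved:
        improved = False
        for i in range(n - 1):
            for j in range(i + 2, n):
                if i == 0 and j == n - 1:
                    continue  # these two edges share a city; nothing to exchange
                a, b = pts[tour[i]], pts[tour[i + 1]]
                c, d = pts[tour[j]], pts[tour[(j + 1) % n]]
                # Change in length when edges (a,b),(c,d) are replaced by (a,c),(b,d).
                delta = (math.dist(a, c) + math.dist(b, d)
                         - math.dist(a, b) - math.dist(c, d))
                if delta < -1e-12:
                    tour[i + 1 : j + 1] = reversed(tour[i + 1 : j + 1])
                    improved = True
    return tour
```

On the four corners of a unit square visited in a self-crossing order, a single 2-opt move uncrosses the tour and yields the perimeter.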
Parallel algorithms for placement and routing in VLSI design. Ph.D. Thesis

Science.gov (United States)

Brouwer, Randall Jay

1991-01-01

The computational requirements for high quality synthesis, analysis, and verification of very large scale integration (VLSI) designs have rapidly increased with the fast growing complexity of these designs. Research in the past has focused on the development of heuristic algorithms, special purpose hardware accelerators, or parallel algorithms for the numerous design tasks to decrease the time required for solution. Two new parallel algorithms are proposed for two VLSI synthesis tasks, standard cell placement and global routing. The first algorithm, a parallel algorithm for global routing, uses hierarchical techniques to decompose the routing problem into independent routing subproblems that are solved in parallel. Results are then presented which compare the routing quality to the results of other published global routers and which evaluate the speedups attained. The second algorithm, a parallel algorithm for cell placement and global routing, hierarchically integrates a quadrisection placement algorithm, a bisection placement algorithm, and the previous global routing algorithm. Unique partitioning techniques are used to decompose the various stages of the algorithm into independent tasks which can be evaluated in parallel. Finally, results are presented which evaluate the various algorithm alternatives and compare the algorithm performance to other placement programs. Measurements are presented on the parallel speedups available.

A GPU-Based Genetic Algorithm for the P-Median Problem

OpenAIRE

AlBdaiwi, Bader F.; AboElFotoh, Hosam M. F.

2016-01-01

The p-median problem is a well-known NP-hard problem. Many heuristics have been proposed in the literature for this problem. In this paper, we exploit a GPGPU parallel computing platform to present a new genetic algorithm implemented in Cuda and based on a Pseudo Boolean formulation of the p-median problem. We have tested the effectiveness of our algorithm using a Tesla K40 (2880 Cuda cores) on 290 different benchmark instances obtained from OR-Library, discrete location problems benchmark li...
Exact parallel maximum clique algorithm for general and protein graphs.

Science.gov (United States)

Depolli, Matjaž; Konc, Janez; Rozman, Kati; Trobec, Roman; Janežič, Dušanka

2013-09-23

A new exact parallel maximum clique algorithm MaxCliquePara, which finds the maximum clique (the fully connected subgraph) in undirected general and protein graphs, is presented. First, a new branch and bound algorithm for finding a maximum clique on a single computer core, which builds on ideas presented in two published state of the art sequential algorithms is implemented. The new sequential MaxCliqueSeq algorithm is faster than the reference algorithms on both DIMACS benchmark graphs as well as on protein-derived product graphs used for protein structural comparisons. Next, the MaxCliqueSeq algorithm is parallelized by splitting the branch-and-bound search tree to multiple cores, resulting in MaxCliquePara algorithm. The ability to exploit all cores efficiently makes the new parallel MaxCliquePara algorithm markedly superior to other tested algorithms. On a 12-core computer, the parallelization provides up to 2 orders of magnitude faster execution on the large DIMACS benchmark graphs and up to an order of magnitude faster execution on protein product graphs. The algorithms are freely accessible on http://commsys.ijs.si/~matjaz/maxclique.
Efficient sequential and parallel algorithms for record linkage.

Science.gov (United States)

Mamun, Abdullah-Al; Mi, Tian; Aseltine, Robert; Rajasekaran, Sanguthevar

2014-01-01

Integrating data from multiple sources is a crucial and challenging problem. Even though there exist numerous algorithms for record linkage or deduplication, they suffer from either large time needs or restrictions on the number of datasets that they can integrate. In this paper we report efficient sequential and parallel algorithms for record linkage which handle any number of datasets and outperform previous algorithms. Our algorithms employ hierarchical clustering algorithms as the basis. A key idea that we use is radix sorting on certain attributes to eliminate identical records before any further processing. Another novel idea is to form a graph that links similar records and find the connected components. Our sequential and parallel algorithms have been tested on a real dataset of 1,083,878 records and synthetic datasets ranging in size from 50,000 to 9,000,000 records. Our sequential algorithm runs at least two times faster, for any dataset, than the previous best-known algorithm, the two-phase algorithm using faster computation of the edit distance (TPA (FCED)). The speedups obtained by our parallel algorithm are almost linear. For example, we get a speedup of 7.5 with 8 cores (residing in a single node), 14.1 with 16 cores (residing in two nodes), and 26.4 with 32 cores (residing in four nodes). We have compared the performance of our sequential algorithm with TPA (FCED) and found that our algorithm outperforms the previous one. The accuracy is the same as that of this previous best-known algorithm.
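The two ideas named in this abstract, dropping identical records first and then linking similar records in a graph whose connected components form the clusters, might look roughly like this in Python. It is a simplified sketch under stated assumptions: records are plain strings, similarity is an edit-distance threshold (`max_dist` is a hypothetical parameter), Python's built-in sort stands in for the radix sort, and all pairs are compared, which the paper's algorithms are designed to avoid.

```python
def edit_distance(a, b):
    """Classic dynamic-programming Levenshtein distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]

def link_records(records, max_dist=1):
    unique = sorted(set(records))          # exact duplicates are removed up front
    parent = list(range(len(unique)))      # union-find forest over the unique records

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    # Add an edge between every pair of similar records by merging their sets.
    for i in range(len(unique)):
        for j in range(i + 1, len(unique)):
            if edit_distance(unique[i], unique[j]) <= max_dist:
                parent[find(i)] = find(j)

    # Each connected component is one group of linked records.
    clusters = {}
    for i, rec in enumerate(unique):
        clusters.setdefault(find(i), []).append(rec)
    return sorted(clusters.values())
```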
Search of molecular ground state via genetic algorithm: Implementation on a hybrid SIMD-MIMD platform

International Nuclear Information System (INIS)

Pucello, N.; D'Agostino, G.; Pisacane, F.

1997-01-01

A genetic algorithm for the optimization of the ground-state structure of a metallic cluster has been developed and ported to a SIMD-MIMD parallel platform. The SIMD part of the parallel platform is represented by a Quadrics/APE100 consisting of 512 floating point units, while the MIMD part is formed by a cluster of workstations. The proposed algorithm is composed of a part where the genetic operators are applied to the elements of the population and a part which performs a further local relaxation and the fitness calculation via Molecular Dynamics. These parts have been implemented on the MIMD and on the SIMD part, respectively. Results have been compared to those generated by using Simulated Annealing.
Real Time Optima Tracking Using Harvesting Models of the Genetic Algorithm

Science.gov (United States)

Baskaran, Subbiah; Noever, D.

1999-01-01

Tracking optima in real time propulsion control, particularly for non-stationary optimization problems, is a challenging task. Several approaches have been put forward for such a study, including the numerical method called the genetic algorithm. In brief, this approach is built upon Darwinian-style competition between numerical alternatives displayed in the form of binary strings, or by analogy to 'pseudogenes'. Breeding of improved solutions is an often cited parallel to natural selection in evolutionary or soft computing. In this report we present our results of applying a novel model of a genetic algorithm for tracking optima in propulsion engineering and in real time control. We specialize the algorithm to mission profiling and planning optimizations, both to select reduced propulsion needs through trajectory planning and to explore time or fuel conservation strategies.
A class of parallel algorithms for computation of the manipulator inertia matrix

Science.gov (United States)

Fijany, Amir; Bejczy, Antal K.

1989-01-01

Parallel and parallel/pipeline algorithms for computation of the manipulator inertia matrix are presented. An algorithm based on composite rigid-body spatial inertia method, which provides better features for parallelization, is used for the computation of the inertia matrix. Two parallel algorithms are developed which achieve the time lower bound in computation. Also described is the mapping of these algorithms with topological variation on a two-dimensional processor array, with nearest-neighbor connection, and with cardinality variation on a linear processor array. An efficient parallel/pipeline algorithm for the linear array was also developed, but at significantly higher efficiency.
New algorithms for parallel MRI

International Nuclear Information System (INIS)

Anzengruber, S; Ramlau, R; Bauer, F; Leitao, A

2008-01-01

Magnetic Resonance Imaging with parallel data acquisition requires algorithms for reconstructing the patient's image from a small number of measured lines of the Fourier domain (k-space). In contrast to well-known algorithms like SENSE and GRAPPA and its flavors we consider the problem as a non-linear inverse problem. However, in order to avoid cost intensive derivatives we will use Landweber-Kaczmarz iteration and in order to improve the overall results some additional sparsity constraints.
Contact-impact algorithms on parallel computers

International Nuclear Information System (INIS)

Zhong Zhihua; Nilsson, Larsgunnar

1994-01-01

Contact-impact algorithms on parallel computers are discussed within the context of explicit finite element analysis. The algorithms concerned include a contact searching algorithm and an algorithm for contact force calculations. The contact searching algorithm is based on the territory concept of the general HITA algorithm. However, no distinction is made between different contact bodies, or between different contact surfaces. All contact segments from contact boundaries are taken as a single set. Hierarchy territories and contact territories are expanded. A three-dimensional bucket sort algorithm is used to sort contact nodes. The defence node algorithm is used in the calculation of contact forces. Both the contact searching algorithm and the defence node algorithm are implemented on the Connection Machine CM-200. The performance of the algorithms is examined under different circumstances, and numerical results are presented. (orig.)
Parallel grid generation algorithm for distributed memory computers

Science.gov (United States)

Moitra, Stuti; Moitra, Anutosh

1994-01-01

A parallel grid-generation algorithm and its implementation on the Intel iPSC/860 computer are described. The grid-generation scheme is based on an algebraic formulation of homotopic relations. Methods for utilizing the inherent parallelism of the grid-generation scheme are described, and implementation of multiple levels of parallelism on multiple instruction multiple data machines is indicated. The algorithm is capable of providing near orthogonality and spacing control at solid boundaries while requiring minimal interprocessor communications. Results obtained on the Intel hypercube for a blended wing-body configuration are used to demonstrate the effectiveness of the algorithm. Fortran implementations based on the native programming model of the iPSC/860 computer and the Express system of software tools are reported. Computational gains in execution time speed-up ratios are given.
Parallel algorithms for boundary value problems

Science.gov (United States)

Lin, Avi

1991-01-01

A general approach to solve boundary value problems numerically in a parallel environment is discussed. The basic algorithm consists of two steps: the local step where all the P available processors work in parallel, and the global step where one processor solves a tridiagonal linear system of the order P. The main advantages of this approach are twofold. First, this suggested approach is very flexible, especially in the local step and thus the algorithm can be used with any number of processors and with any of the SIMD or MIMD machines. Secondly, the communication complexity is very small and thus can be used as easily with shared memory machines. Several examples for using this strategy are discussed.
Improved Parallel Three-List Algorithm for the Knapsack Problem without Memory Conflicts

Institute of Scientific and Technical Information of China (English)

Pan Jun; Li Kenli; Li Qinghua

2006-01-01

Based on the two-list algorithm and the parallel three-list algorithm, an improved parallel three-list algorithm for the knapsack problem is proposed, in which the method of divide and conquer and parallel merging without memory conflicts are adopted. To find a solution for the n-element knapsack problem, the proposed algorithm needs O(2^{3n/8}) time when O(2^{3n/8}) shared memory units and O(2^{n/4}) processors are available. The comparisons between the proposed algorithm and 10 existing algorithms show that the improved parallel three-list algorithm is the first exclusive-read exclusive-write (EREW) parallel algorithm that can solve the knapsack instances in less than O(2^{n/2}) time when the available hardware resource is smaller than O(2^{n/2}), and hence is an improved result over previous research.
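The sequential two-list algorithm that this work builds on can be sketched as follows for the subset-sum form of the knapsack problem. This is a plain Python illustration of the classical meet-in-the-middle idea (two lists of about 2^{n/2} subset sums, scanned from opposite ends); it is not the parallel three-list algorithm of the paper, and the function names are illustrative.

```python
def subset_sums(items):
    """All (sum, bitmask) pairs over the subsets of `items`."""
    sums = [(0, 0)]
    for k, w in enumerate(items):
        sums += [(s + w, mask | (1 << k)) for s, mask in sums]
    return sums

def two_list_knapsack(weights, target):
    """Return a sub-list of `weights` summing to `target`, or None if none exists."""
    half = len(weights) // 2
    left, right = weights[:half], weights[half:]
    a = sorted(subset_sums(left))                   # ascending sums
    b = sorted(subset_sums(right), reverse=True)    # descending sums
    i = j = 0
    while i < len(a) and j < len(b):
        total = a[i][0] + b[j][0]
        if total == target:
            chosen = [w for k, w in enumerate(left) if (a[i][1] >> k) & 1]
            chosen += [w for k, w in enumerate(right) if (b[j][1] >> k) & 1]
            return chosen
        if total < target:
            i += 1      # need a larger sum from the ascending list
        else:
            j += 1      # need a smaller sum from the descending list
    return None
```

Each list holds at most 2^{n/2} entries, which is where the O(2^{n/2}) time and memory bounds quoted for the sequential baseline come from (up to the cost of sorting).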
Parallelizing flow-accumulation calculations on graphics processing units—From iterative DEM preprocessing algorithm to recursive multiple-flow-direction algorithm

Science.gov (United States)

Qin, Cheng-Zhi; Zhan, Lijun

2012-06-01

As one of the important tasks in digital terrain analysis, the calculation of flow accumulations from gridded digital elevation models (DEMs) usually involves two steps in a real application: (1) using an iterative DEM preprocessing algorithm to remove the depressions and flat areas commonly contained in real DEMs, and (2) using a recursive flow-direction algorithm to calculate the flow accumulation for every cell in the DEM. Because both algorithms are computationally intensive, quick calculation of the flow accumulations from a DEM (especially for a large area) presents a practical challenge to personal computer (PC) users. In recent years, rapid increases in hardware capacity of the graphics processing units (GPUs) provided in modern PCs have made it possible to meet this challenge in a PC environment. Parallel computing on GPUs using a compute-unified-device-architecture (CUDA) programming model has been explored to speed up the execution of the single-flow-direction algorithm (SFD). However, the parallel implementation on a GPU of the multiple-flow-direction (MFD) algorithm, which generally performs better than the SFD algorithm, has not been reported. Moreover, GPU-based parallelization of the DEM preprocessing step in the flow-accumulation calculations has not been addressed. This paper proposes a parallel approach to calculate flow accumulations (including both iterative DEM preprocessing and a recursive MFD algorithm) on a CUDA-compatible GPU. For the parallelization of an MFD algorithm (MFD-md), two different parallelization strategies using a GPU are explored. The first parallelization strategy, which has been used in the existing parallel SFD algorithm on GPU, has the problem of computing redundancy. Therefore, we designed a parallelization strategy based on graph theory. The application results show that the proposed parallel approach to calculate flow accumulations on a GPU performs much faster than either sequential algorithms or other parallel GPU
An Algorithm for Parallel Sn Sweeps on Unstructured Meshes

International Nuclear Information System (INIS)

Pautz, Shawn D.

2002-01-01

A new algorithm for performing parallel Sn sweeps on unstructured meshes is developed. The algorithm uses a low-complexity list ordering heuristic to determine a sweep ordering on any partitioned mesh. For typical problems and with 'normal' mesh partitionings, nearly linear speedups on up to 126 processors are observed. This is an important and desirable result, since although analyses of structured meshes indicate that parallel sweeps will not scale with normal partitioning approaches, no severe asymptotic degradation in the parallel efficiency is observed with modest (≤100) levels of parallelism. This result is a fundamental step in the development of efficient parallel Sn methods.
A Parallel Encryption Algorithm Based on Piecewise Linear Chaotic Map

Directory of Open Access Journals (Sweden)

Xizhong Wang

2013-01-01

We introduce a parallel chaos-based encryption algorithm that takes advantage of multicore processors. The chaotic cryptosystem is generated by the piecewise linear chaotic map (PWLCM). The parallel algorithm is designed with a master/slave communication model using the Message Passing Interface (MPI). The algorithm is suitable not only for multicore processors but also for single-processor architectures. The experimental results show that the chaos-based cryptosystem possesses good statistical properties. The parallel algorithm provides much better performance than the serial ones and would be useful for encrypting and decrypting large files or multimedia.
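To make the piecewise linear chaotic map concrete, the sketch below implements the commonly used PWLCM definition and derives a toy XOR keystream from it. Everything beyond the map itself is an assumption for illustration: the abstract does not describe how the cryptosystem turns map iterates into ciphertext, the default values of `x0` and `p` are arbitrary, and this toy construction should not be treated as secure or as the authors' scheme.

```python
def pwlcm(x, p):
    """Piecewise linear chaotic map on [0, 1] with control parameter 0 < p < 0.5."""
    if x >= 0.5:
        x = 1.0 - x              # the map is symmetric about x = 0.5
    if x < p:
        return x / p
    return (x - p) / (0.5 - p)

def keystream(x0, p, n, burn_in=100):
    """Iterate the map and quantize each state to one byte (illustrative quantization)."""
    x = x0
    for _ in range(burn_in):     # discard the initial transient
        x = pwlcm(x, p)
    out = bytearray()
    for _ in range(n):
        x = pwlcm(x, p)
        out.append(int(x * 256) % 256)
    return bytes(out)

def xor_crypt(data, x0=0.3141592, p=0.2718281):
    """XOR the data with the keystream; applying it twice restores the input."""
    ks = keystream(x0, p, len(data))
    return bytes(d ^ k for d, k in zip(data, ks))
```

In a master/slave setting, one plausible arrangement is for the master to split the file into blocks and hand each worker its own block and map state, but that division of work is not specified in the abstract.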
Parallel GPU implementation of iterative PCA algorithms.

Science.gov (United States)

Andrecut, M

2009-11-01

Principal component analysis (PCA) is a key statistical technique for multivariate data analysis. For large data sets, the common approach to PCA computation is based on the standard NIPALS-PCA algorithm, which unfortunately suffers from loss of orthogonality, and therefore its applicability is usually limited to the estimation of the first few components. Here we present an algorithm based on Gram-Schmidt orthogonalization (called GS-PCA), which eliminates this shortcoming of NIPALS-PCA. Also, we discuss the GPU (Graphics Processing Unit) parallel implementation of both NIPALS-PCA and GS-PCA algorithms. The numerical results show that the GPU parallel optimized versions, based on CUBLAS (NVIDIA), are substantially faster (up to 12 times) than the CPU optimized versions based on CBLAS (GNU Scientific Library).
A Parallel Saturation Algorithm on Shared Memory Architectures

Science.gov (United States)

Ezekiel, Jonathan; Siminiceanu

2007-01-01

Symbolic state-space generators are notoriously hard to parallelize. However, the Saturation algorithm implemented in the SMART verification tool differs from other sequential symbolic state-space generators in that it exploits the locality of firing events in asynchronous system models. This paper explores whether event locality can be utilized to efficiently parallelize Saturation on shared-memory architectures. Conceptually, we propose to parallelize the firing of events within a decision diagram node, which is technically realized via a thread pool. We discuss the challenges involved in our parallel design and conduct experimental studies on its prototypical implementation. On a dual-processor dual core PC, our studies show speed-ups for several example models, e.g., of up to 50% for a Kanban model, when compared to running our algorithm only on a single core.
From evolution theory to parallel and distributed genetic

CERN Multimedia

CERN. Geneva

2007-01-01

Lecture #1: From Evolution Theory to Evolutionary Computation. Evolutionary computation is a subfield of artificial intelligence (more particularly computational intelligence) involving combinatorial optimization problems, which are based to some degree on the evolution of biological life in the natural world. In this tutorial we will review the source of inspiration for this metaheuristic and its capability for solving problems. We will show the main flavours within the field, and different problems that have been successfully solved employing this kind of techniques. Lecture #2: Parallel and Distributed Genetic Programming. The successful application of Genetic Programming (GP, one of the available Evolutionary Algorithms) to optimization problems has encouraged an increasing number of researchers to apply these techniques to a large set of problems. Given the difficulty of some problems, much effort has been applied to improving the efficiency of GP during the last few years. Among the available proposals,...
Parallel Directionally Split Solver Based on Reformulation of Pipelined Thomas Algorithm

Science.gov (United States)

Povitsky, A.

1998-01-01

In this research an efficient parallel algorithm for 3-D directionally split problems is developed. The proposed algorithm is based on a reformulated version of the pipelined Thomas algorithm that starts the backward step computations immediately after the completion of the forward step computations for the first portion of lines. This algorithm has data available for other computational tasks while processors are idle from the Thomas algorithm. The proposed 3-D directionally split solver is based on the static scheduling of processors where local and non-local, data-dependent and data-independent computations are scheduled while processors are idle. A theoretical model of parallelization efficiency is used to define optimal parameters of the algorithm, to show an asymptotic parallelization penalty and to obtain an optimal cover of a global domain with subdomains. It is shown by computational experiments and by the theoretical model that the proposed algorithm reduces the parallelization penalty about two times over the basic algorithm for the range of the number of processors (subdomains) considered and the number of grid nodes per subdomain.
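The forward and backward steps mentioned here are those of the standard Thomas algorithm for a single tridiagonal line, sketched below in Python. This is the textbook sequential version only; the pipelining across processors and the scheduling of idle time described in the abstract are not shown, and the argument layout is an assumption of this sketch.

```python
def thomas(a, b, c, d):
    """Solve a tridiagonal system.

    a: sub-diagonal (a[0] unused), b: main diagonal,
    c: super-diagonal (c[-1] unused), d: right-hand side.
    All four lists have the same length n. No pivoting is done, so the
    matrix is assumed to be diagonally dominant or otherwise well behaved.
    """
    n = len(d)
    cp, dp = [0.0] * n, [0.0] * n
    cp[0] = c[0] / b[0]
    dp[0] = d[0] / b[0]
    for i in range(1, n):                  # forward step: eliminate the sub-diagonal
        denom = b[i] - a[i] * cp[i - 1]
        cp[i] = c[i] / denom
        dp[i] = (d[i] - a[i] * dp[i - 1]) / denom
    x = [0.0] * n
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):         # backward step: substitute from the last row up
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x
```

The backward step for a line cannot start until its forward step has finished, which is the dependency that leaves processors idle in a pipelined multi-processor version.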
Concatenating algorithms for parallel numerical simulations coupling radiation hydrodynamics with neutron transport

International Nuclear Information System (INIS)

Mo Zeyao

2004-11-01

Multiphysics parallel numerical simulations are usually essential to simplify research on complex physical phenomena in which several physics are tightly coupled. How to concatenate those coupled physics for fully scalable parallel simulation is very important. Meanwhile, three objectives should be balanced: the first is efficient data transfer among simulations, the second is efficient parallel execution, and the third is the simultaneous development of those simulation codes. Two concatenating algorithms for multiphysics parallel numerical simulations coupling radiation hydrodynamics with neutron transport on unstructured grids are presented. The first algorithm, Fully Loosely Concatenation (FLC), focuses on the independence of code development and on running each code independently with optimal performance. The second algorithm, Two Level Tightly Concatenation (TLTC), focuses on the optimal tradeoffs among the above three objectives. Theoretical analyses of communication complexity and parallel numerical experiments on hundreds of processors on two parallel machines have shown that these two algorithms are efficient and can be generalized to other multiphysics parallel numerical simulations. In particular, algorithm TLTC is linearly scalable and has achieved the optimal parallel performance. (authors)
Distributed parallel cooperative coevolutionary multi-objective large-scale immune algorithm for deployment of wireless sensor networks

DEFF Research Database (Denmark)

Cao, Bin; Zhao, Jianwei; Yang, Po

2018-01-01

Using immune algorithms is generally a time-intensive process, especially for problems with a large number of variables. In this paper, we propose a distributed parallel cooperative coevolutionary multi-objective large-scale immune algorithm that is implemented using the message passing interface (MPI). The proposed algorithm is composed of three layers: objective, group and individual layers. First, for each objective in the multi-objective problem to be addressed, a subpopulation is used for optimization, and an archive population is used to optimize all the objectives. Second, the large... -objective evolutionary algorithms the Cooperative Coevolutionary Generalized Differential Evolution 3, the Cooperative Multi-objective Differential Evolution and the Nondominated Sorting Genetic Algorithm III, the proposed algorithm addresses the deployment optimization problem efficiently and effectively.

Massively Parallel Algorithms for Solution of Schrodinger Equation

Science.gov (United States)

Fijany, Amir; Barhen, Jacob; Toomerian, Nikzad

1994-01-01

In this paper massively parallel algorithms for solution of Schrodinger equation are developed. Our results clearly indicate that the Crank-Nicolson method, in addition to its excellent numerical properties, is also highly suitable for massively parallel computation.
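The abstract gives no implementation details, so the following is only a minimal serial sketch of one Crank-Nicolson step for the one-dimensional free-particle Schrodinger equation, assuming units with hbar = m = 1 and a wavefunction that vanishes outside the grid. The function names (`solve_tridiagonal`, `crank_nicolson_step`, `norm`, `gaussian_packet`) are illustrative, and the serial Thomas solve merely stands in for whatever parallel solver the paper develops.

```python
import cmath


def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm for a tridiagonal system; lower[0] and upper[-1] are ignored."""
    n = len(rhs)
    c = [0j] * n
    d = [0j] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0j] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def crank_nicolson_step(psi, dx, dt):
    """Advance psi by dt for i dpsi/dt = -(1/2) d2psi/dx2 (assumed units hbar = m = 1)."""
    n = len(psi)
    r = 1j * dt / (4.0 * dx * dx)
    rhs = []
    for i in range(n):
        left = psi[i - 1] if i > 0 else 0j        # psi is taken as zero outside the grid
        right = psi[i + 1] if i < n - 1 else 0j
        # Explicit half of the scheme: (I - i*dt/2*H) psi
        rhs.append((1 - 2 * r) * psi[i] + r * (left + right))
    # Implicit half of the scheme: solve (I + i*dt/2*H) psi_new = rhs
    return solve_tridiagonal([-r] * n, [1 + 2 * r] * n, [-r] * n, rhs)


def norm(psi, dx):
    """Discrete approximation of the integral of |psi|^2."""
    return sum(abs(p) ** 2 for p in psi) * dx


def gaussian_packet(n, dx, x0, sigma, k):
    """Gaussian wave packet centred at x0 with wavenumber k (illustrative initial state)."""
    return [cmath.exp(-((i * dx - x0) ** 2) / (2 * sigma ** 2) + 1j * k * i * dx)
            for i in range(n)]
```

Because the update is a Cayley transform of a Hermitian matrix, the discrete norm returned by `norm` should stay constant up to rounding error from step to step, which is one of the numerical properties usually cited in favour of the method.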
Optimization in optical systems revisited: Beyond genetic algorithms

Science.gov (United States)

Gagnon, Denis; Dumont, Joey; Dubé, Louis

2013-05-01

Designing integrated photonic devices such as waveguides, beam-splitters and beam-shapers often requires optimization of a cost function over a large solution space. Metaheuristics - algorithms based on empirical rules for exploring the solution space - are specifically tailored to those problems. One of the most widely used metaheuristics is the standard genetic algorithm (SGA), based on the evolution of a population of candidate solutions. However, the stochastic nature of the SGA sometimes prevents access to the optimal solution. Our goal is to show that a parallel tabu search (PTS) algorithm is more suited to optimization problems in general, and to photonics in particular. PTS is based on several search processes using a pool of diversified initial solutions. To assess the performance of both algorithms (SGA and PTS), we consider an integrated photonics design problem, the generation of arbitrary beam profiles using a two-dimensional waveguide-based dielectric structure. The authors acknowledge financial support from the Natural Sciences and Engineering Research Council of Canada (NSERC).
Combined spatial/angular domain decomposition SN algorithms for shared memory parallel machines

International Nuclear Information System (INIS)

Hunter, M.A.; Haghighat, A.

1993-01-01

Several parallel processing algorithms on the basis of spatial and angular domain decomposition methods are developed and incorporated into a two-dimensional discrete ordinates transport theory code. These algorithms divide the spatial and angular domains into independent subdomains so that the flux calculations within each subdomain can be processed simultaneously. Two spatial parallel algorithms (Block-Jacobi, red-black), one angular parallel algorithm (η-level), and their combinations are implemented on an eight processor CRAY Y-MP. Parallel performances of the algorithms are measured using a series of fixed source RZ geometry problems. Some of the results are also compared with those executed on an IBM 3090/600J machine. (orig.)
Genetic algorithms and fuzzy multiobjective optimization

CERN Document Server

Sakawa, Masatoshi

2002-01-01

Since the introduction of genetic algorithms in the 1970s, an enormous number of articles together with several significant monographs and books have been published on this methodology. As a result, genetic algorithms have made a major contribution to optimization, adaptation, and learning in a wide variety of unexpected fields. Over the years, many excellent books in genetic algorithm optimization have been published; however, they focus mainly on single-objective discrete or other hard optimization problems under certainty. There appears to be no book that is designed to present genetic algorithms for solving not only single-objective but also fuzzy and multiobjective optimization problems in a unified way. Genetic Algorithms And Fuzzy Multiobjective Optimization introduces the latest advances in the field of genetic algorithm optimization for 0-1 programming, integer programming, nonconvex programming, and job-shop scheduling problems under multiobjectiveness and fuzziness. In addition, the book treats a w...
A new parallel molecular dynamics algorithm for organic systems

International Nuclear Information System (INIS)

Plimpton, S.; Hendrickson, B.; Heffelfinger, G.

1993-01-01

A new parallel algorithm for simulating bonded molecular systems such as polymers and proteins by molecular dynamics (MD) is presented. In contrast to methods that extract parallelism by breaking the spatial domain into sub-pieces, the new method does not require regular geometries or uniform particle densities to achieve high parallel efficiency. For very large, regular systems spatial methods are often the best choice, but in practice the new method is faster for systems with tens of thousands of atoms simulated on large numbers of processors. It is also several times faster than the techniques commonly used for parallelizing bonded MD that assign a subset of atoms to each processor and require all-to-all communication. Implementation of the algorithm in a CHARMm-like MD model with many-body forces and constraint dynamics is discussed, and timings on the Intel Delta and Paragon machines are given. Example calculations using the algorithm in simulations of polymers and liquid-crystal molecules are also briefly discussed.
A Pseudo-Parallel Genetic Algorithm Integrating Simulated Annealing for Stochastic Location-Inventory-Routing Problem with Consideration of Returns in E-Commerce

Directory of Open Access Journals (Sweden)

Bailing Liu

2015-01-01

Facility location, inventory control, and vehicle route scheduling are three key issues to be settled in the design of a logistics system for e-commerce. Due to the online shopping features of e-commerce, customer returns are much more frequent than in traditional commerce. This paper studies a three-phase supply chain distribution system consisting of one supplier, a set of retailers, and a single type of product with a continuous review (Q, r) inventory policy. We formulate a stochastic location-inventory-routing problem (LIRP) model with no quality defects returns. To solve the NP-hard problem, a pseudo-parallel genetic algorithm integrating simulated annealing (PPGASA) is proposed. The computational results show that PPGASA outperforms GA on optimal solution, computing time, and computing stability.
On a Recursive-Parallel Algorithm for Solving the Knapsack Problem

Directory of Open Access Journals (Sweden)

Vladimir V. Vasilchikov

2018-01-01

In this paper, we offer an efficient parallel algorithm for solving the NP-complete Knapsack Problem in its basic, so-called 0-1 variant. To find its exact solution, algorithms belonging to the category of "branch and bound methods" have long been used. Various options for parallelizing the computations are also used to speed up the solution, with varying degrees of efficiency. We propose here an algorithm for solving the problem based on the paradigm of recursive-parallel computations. We consider it well suited to problems of this kind, where it is difficult to immediately break up the computations into a sufficient number of subtasks of comparable complexity, since the subtasks appear dynamically at run time. We used the RPM ParLib library, developed by the author, as the main tool to program the algorithm. This library allows us to develop effective applications for parallel computing on a local network in the .NET Framework. Such applications can generate parallel branches of computation directly during program execution and dynamically redistribute work between computing modules. Any language with support for the .NET Framework can be used as a programming language in conjunction with this library. For our experiments, we developed some C# applications using this library. The main purpose of these experiments was to study the acceleration achieved by recursive-parallel computing. A detailed description of the algorithm and its testing, as well as the results obtained, are also given in the paper.
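To illustrate why branch-and-bound subtasks "appear dynamically", here is a minimal serial sketch of branch and bound for the 0-1 knapsack problem. It is not the paper's algorithm and does not use RPM ParLib; the function name `knapsack_branch_and_bound` and the fractional-relaxation bound are assumptions chosen for illustration, and all weights are assumed to be positive.

```python
def knapsack_branch_and_bound(values, weights, capacity):
    """Return the best total value of a 0-1 knapsack (weights assumed positive)."""
    # Sort by value density so the fractional bound below is valid and tight.
    items = sorted(zip(values, weights), key=lambda vw: vw[0] / vw[1], reverse=True)
    n = len(items)
    best = 0

    def bound(i, value, room):
        # Upper bound from the fractional relaxation of the remaining items.
        for v, w in items[i:]:
            if w <= room:
                value += v
                room -= w
            else:
                return value + v * room / w
        return value

    def branch(i, value, room):
        nonlocal best
        if value > best:
            best = value
        # Prune when no items remain or the bound cannot beat the incumbent.
        if i == n or bound(i, value, room) <= best:
            return
        v, w = items[i]
        # The two recursive calls are independent subproblems; a recursive-parallel
        # runtime could hand either of them to another computing module.
        if w <= room:
            branch(i + 1, value + v, room - w)
        branch(i + 1, value, room)

    branch(0, 0, capacity)
    return best
```

How much work each call to `branch` represents depends on how early the bound prunes it, which is not known in advance. That is the load-balancing difficulty the abstract refers to.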
Parallel algorithms on the ASTRA SIMD machine

International Nuclear Information System (INIS)

Odor, G.; Rohrbach, F.; Vesztergombi, G.; Varga, G.; Tatrai, F.

1996-01-01

In view of the tremendous jump in the computing power of modern RISC processors, interest in parallel computing seems to be thinning out. Why use a complicated system of parallel processors if the problem can be solved by a single powerful microchip? It is a general law, however, that exponential growth always ends in some kind of saturation, and then parallelism will again become a hot topic. We try to prepare ourselves for this eventuality. The MPPC project started in 1990, in the heyday of parallelism, and produced four ASTRA machines (presented at CHEP'92) with 4k processors (expandable to 16k) based on yesterday's chip technology (chip presented at CHEP'91). These machines now provide excellent test-beds for algorithmic developments in a complete, real environment. We are developing, for example, fast pattern-recognition algorithms which could be used in high-energy physics experiments at the LHC (planned to be operational after 2004 at CERN) for triggering and data reduction. The basic feature of our ASP (Associative String Processor) approach is to use extremely simple (thus very cheap) processor elements in huge quantities (up to millions of processors), connected together by a very simple string-like communication chain. In this paper we present powerful algorithms based on this architecture, indicating the performance perspectives if the hardware quality reaches present or even future technology levels. (author)
Application of genetic algorithms to in-core nuclear fuel management optimization

International Nuclear Information System (INIS)

Poon, P.W.; Parks, G.T.

1993-01-01

The search for an optimal arrangement of fresh and burnt fuel and control material within the core of a PWR represents a formidable optimization problem. The approach of combining the robust optimization capabilities of the Simulated Annealing (SA) algorithm with the computational speed of a Generalized Perturbation Theory (GPT) based evaluation methodology in the code FORMOSA has proved to be very effective. In this paper, we show that the incorporation of another stochastic search technique, a Genetic Algorithm, results in comparable optimization performance on serial computers and offers substantially superior performance on parallel machines. (orig.)
Parallel algorithms for nuclear reactor analysis via domain decomposition method

International Nuclear Information System (INIS)

Kim, Yong Hee

1995-02-01

In this thesis, the neutron diffusion equation in reactor physics is discretized by the finite difference method and is solved on a parallel computer network composed of T-800 transputers. The T-800 transputer is a message-passing MIMD (multiple instruction streams, multiple data streams) architecture. A parallel variant of the Schwarz alternating procedure for overlapping subdomains is developed with domain decomposition. The thesis provides a convergence analysis of the algorithm and improves its convergence. The convergence of the parallel Schwarz algorithms with DN (or ND), DD, NN, and mixed pseudo-boundary conditions (a weighted combination of Dirichlet and Neumann conditions) is analyzed for both continuous and discrete models in the two-subdomain case, and various underlying features are explored. The analysis shows that the convergence rate of the algorithm depends strongly on the pseudo-boundary conditions and that the theoretically best choice is the mixed boundary conditions (MM conditions). It is also shown that there may be a significant discrepancy between the continuous model analysis and the discrete model analysis. To accelerate the convergence of the parallel Schwarz algorithm, relaxation in the pseudo-boundary conditions is introduced and the convergence analysis of the algorithm for the two-subdomain case is carried out. The analysis shows that under-relaxation of the pseudo-boundary conditions accelerates the convergence of the parallel Schwarz algorithm if the convergence rate without relaxation is negative, and any relaxation (under or over) decelerates convergence if the convergence rate without relaxation is positive. Numerical implementation of the parallel Schwarz algorithm on an MIMD system requires multi-level iterations: two levels for fixed source problems, three levels for eigenvalue problems. Performance of the algorithm turns out to be very sensitive to the iteration strategy. In general, multi-level iterations provide good performance when
Parallel algorithms and cluster computing

CERN Document Server

Hoffmann, Karl Heinz

2007-01-01

This book presents major advances in high performance computing as well as major advances due to high performance computing. It contains a collection of papers in which results achieved in the collaboration of scientists from computer science, mathematics, physics, and mechanical engineering are presented. From the science problems to the mathematical algorithms and on to the effective implementation of these algorithms on massively parallel and cluster computers we present state-of-the-art methods and technology as well as exemplary results in these fields. This book shows that problems which seem superficially distinct become intimately connected on a computational level.
Adapting algorithms to massively parallel hardware

CERN Document Server

Sioulas, Panagiotis

2016-01-01

In recent years, the trend in computing has shifted from delivering processors with faster clock speeds to increasing the number of cores per processor. This marks a paradigm shift towards parallel programming, in which applications are programmed to exploit the power provided by multiple cores. Usually there is a gain in terms of time-to-solution and memory footprint. Specifically, this trend has sparked an interest in massively parallel systems that can provide a large number of processors, and possibly computing nodes, as in GPUs and MPPAs (Massively Parallel Processor Arrays). In this project, the focus was on two distinct computing problems: k-d tree searches and track seeding cellular automata. The goal was to adapt the algorithms to parallel systems and evaluate their performance in different cases.
APPLICATION OF GENETIC ALGORITHMS FOR ROBUST PARAMETER OPTIMIZATION

Directory of Open Access Journals (Sweden)

N. Belavendram

2010-12-01

Parameter optimization can be achieved by many methods such as Monte-Carlo, full, and fractional factorial designs. Genetic algorithms (GA) are fairly recent in this respect but afford a novel method of parameter optimization. In GA, there is an initial pool of individuals, each with its own specific phenotypic trait expressed as a 'genetic chromosome'. Different genes enable individuals with different fitness levels to reproduce according to natural reproductive gene theory. This reproduction is established in terms of selection, crossover and mutation of reproducing genes. The resulting child generation of individuals has a better fitness level, akin to natural selection, namely evolution. Populations evolve towards the fittest individuals. Such a mechanism has a parallel application in parameter optimization. Factors in a parameter design can be expressed as a genetic analogue in a pool of sub-optimal random solutions. Allowing this pool of sub-optimal solutions to evolve over several generations produces fitter generations converging to a pre-defined engineering optimum. In this paper, a genetic algorithm is used to study a seven-factor non-linear equation for a Wheatstone bridge as the equation to be optimized. A comparison of the full factorial design against a GA method shows that the GA method is about 1200 times faster in finding a comparable solution.
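The selection, crossover and mutation cycle described above can be sketched as follows. This is a generic illustration, not the paper's implementation: the function name `evolve`, the tournament size, the Gaussian mutation, the elitism step and the toy objective used in the usage note are all assumptions, and the Wheatstone bridge equation itself is not reproduced here.

```python
import random


def evolve(fitness, n_genes, pop_size=30, generations=40, mutation_rate=0.1, seed=0):
    """Maximize `fitness` over real-valued chromosomes of length n_genes (n_genes >= 2)."""
    rng = random.Random(seed)
    # Initial pool of random, sub-optimal candidate solutions.
    pop = [[rng.uniform(-1.0, 1.0) for _ in range(n_genes)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        elite = max(pop, key=fitness)
        history.append(fitness(elite))
        next_pop = [elite[:]]  # elitism (assumed): keep the best individual unchanged
        while len(next_pop) < pop_size:
            # Selection: tournament of three, fittest wins.
            p1 = max(rng.sample(pop, 3), key=fitness)
            p2 = max(rng.sample(pop, 3), key=fitness)
            # Crossover: one cut point, head from p1 and tail from p2.
            cut = rng.randrange(1, n_genes)
            child = p1[:cut] + p2[cut:]
            # Mutation: small Gaussian perturbation of some genes.
            child = [g + rng.gauss(0.0, 0.1) if rng.random() < mutation_rate else g
                     for g in child]
            next_pop.append(child)
        pop = next_pop
    best = max(pop, key=fitness)
    history.append(fitness(best))
    return best, history
```

For example, `evolve(lambda x: -sum(g * g for g in x), 7)` searches a seven-factor toy objective. Because the best individual is always carried over, the recorded best fitness never decreases from one generation to the next.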
A scalable method for parallelizing sampling-based motion planning algorithms

KAUST Repository

Jacobs, Sam Ade; Manavi, Kasra; Burgos, Juan; Denny, Jory; Thomas, Shawna; Amato, Nancy M.

2012-01-01

This paper describes a scalable method for parallelizing sampling-based motion planning algorithms. It subdivides configuration space (C-space) into (possibly overlapping) regions and independently, in parallel, uses standard (sequential) sampling-based planners to construct roadmaps in each region. Next, in parallel, regional roadmaps in adjacent regions are connected to form a global roadmap. By subdividing the space and restricting the locality of connection attempts, we reduce the work and inter-processor communication associated with nearest neighbor calculation, a critical bottleneck for scalability in existing parallel motion planning methods. We show that our method is general enough to handle a variety of planning schemes, including the widely used Probabilistic Roadmap (PRM) and Rapidly-exploring Random Trees (RRT) algorithms. We compare our approach to two other existing parallel algorithms and demonstrate that our approach achieves better and more scalable performance. Our approach achieves almost linear scalability on a 2400 core LINUX cluster and on a 153,216 core Cray XE6 petascale machine. © 2012 IEEE.
A scalable method for parallelizing sampling-based motion planning algorithms

KAUST Repository

Jacobs, Sam Ade

2012-05-01

This paper describes a scalable method for parallelizing sampling-based motion planning algorithms. It subdivides configuration space (C-space) into (possibly overlapping) regions and independently, in parallel, uses standard (sequential) sampling-based planners to construct roadmaps in each region. Next, in parallel, regional roadmaps in adjacent regions are connected to form a global roadmap. By subdividing the space and restricting the locality of connection attempts, we reduce the work and inter-processor communication associated with nearest neighbor calculation, a critical bottleneck for scalability in existing parallel motion planning methods. We show that our method is general enough to handle a variety of planning schemes, including the widely used Probabilistic Roadmap (PRM) and Rapidly-exploring Random Trees (RRT) algorithms. We compare our approach to two other existing parallel algorithms and demonstrate that our approach achieves better and more scalable performance. Our approach achieves almost linear scalability on a 2400 core LINUX cluster and on a 153,216 core Cray XE6 petascale machine. © 2012 IEEE.
Parallel algorithms for computation of the manipulator inertia matrix

Science.gov (United States)

Amin-Javaheri, Masoud; Orin, David E.

1989-01-01

The development of an O(log₂N) parallel algorithm for the manipulator inertia matrix is presented. It is based on the most efficient serial algorithm, which uses the composite rigid body method. Recursive doubling is used to reformulate the linear recurrence equations which are required to compute the diagonal elements of the matrix. It results in O(log₂N) levels of computation. Computation of the off-diagonal elements involves N linear recurrences of varying size, and a new method, which avoids redundant computation of position and orientation transforms for the manipulator, is developed. The O(log₂N) algorithm is presented in both equation and graphic forms which clearly show the parallelism inherent in the algorithm.
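Recursive doubling can be illustrated on the scalar first-order recurrence x_i = a_i * x_(i-1) + b_i. The sketch below is an assumption-laden simplification: the paper's recurrences involve rigid-body quantities rather than scalars, and the function name `recursive_doubling` is hypothetical. Each step of the recurrence is treated as an affine map, and maps are composed pairwise so that all N values are available after about log₂N levels.

```python
def recursive_doubling(a, b, x0):
    """Solve x_i = a[i] * x_(i-1) + b[i] for i = 0..n-1, with x_(-1) = x0.

    Returns the list of x_i and the number of doubling levels used.
    """
    n = len(a)
    # (A[i], B[i]) is the affine map taking x_(i-1) to x_i.
    A, B = list(a), list(b)
    stride, levels = 1, 0
    while stride < n:
        new_A, new_B = A[:], B[:]
        # Every i in this loop is independent of the others, so one level
        # could be executed by up to n processors at the same time.
        for i in range(stride, n):
            # Compose the map ending at i with the map ending at i - stride.
            new_A[i] = A[i] * A[i - stride]
            new_B[i] = A[i] * B[i - stride] + B[i]
        A, B = new_A, new_B
        stride *= 2
        levels += 1
    # After the last level, (A[i], B[i]) maps x0 directly to x_i.
    return [A[i] * x0 + B[i] for i in range(n)], levels
```

The serial loop needs N dependent steps, whereas here only the `while` loop is sequential, at the price of more total arithmetic.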
A simple and efficient parallel FFT algorithm using the BSP model

NARCIS (Netherlands)

Bisseling, R.H.; Inda, M.A.

2000-01-01

In this paper we present a new parallel radix FFT algorithm based on the BSP model. Our parallel algorithm uses the group-cyclic distribution family, which makes it simple to understand and easy to implement. We show how to reduce the communication cost of the algorithm by a factor of three in the case
Generation of Compliant Mechanisms using Hybrid Genetic Algorithm

Science.gov (United States)

Sharma, D.; Deb, K.

2014-10-01

A compliant mechanism is a single-piece elastic structure which can deform to perform the assigned task. In this work, compliant mechanisms are evolved using a constraint-based bi-objective optimization formulation which requires one user-defined parameter (η). This user-defined parameter limits the gap between a desired path and the actual path traced by the compliant mechanism. The non-linear and discrete optimization problems are solved using the hybrid Genetic Algorithm (GA), wherein domain-specific initialization, a two-dimensional crossover operator and repairing techniques are adopted. A bit-wise local search method is used with the elitist non-dominated sorting genetic algorithm to further refine the compliant mechanisms. Parallel computations are performed on the master-slave architecture to reduce the computation time. A parametric study is carried out for the η value, which suggests a range to evolve topologically different compliant mechanisms. The applied and boundary conditions of the compliant mechanisms are considered as the variables that are evolved by the hybrid GA. The post-analysis of results unveils that the compliant mechanisms are always supported at a unique location that can evolve the non-dominated solutions.
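In a master-slave arrangement, the master keeps the population and applies the genetic operators while the workers only evaluate fitness. The following is a minimal sketch of that division of labour, assuming a hypothetical helper `evaluate_population`. A thread pool from the Python standard library is used only to keep the example self-contained; for CPU-bound fitness functions a process pool or MPI would be the more realistic choice.

```python
from concurrent.futures import ThreadPoolExecutor


def evaluate_population(population, fitness, workers=4):
    """Master side: farm out fitness evaluations and collect the scores in order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in the same order as the input individuals,
        # so scores[i] always belongs to population[i].
        return list(pool.map(fitness, population))
```

The master would call `evaluate_population` once per generation and then perform selection and variation serially on the returned scores.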
A parallel algorithm for the non-symmetric eigenvalue problem

International Nuclear Information System (INIS)

Sidani, M.M.

1991-01-01

An algorithm is presented for the solution of the non-symmetric eigenvalue problem. The algorithm is based on a divide-and-conquer procedure that provides initial approximations to the eigenpairs, which are then refined using Newton iterations. Since the smaller subproblems can be solved independently, and since Newton iterations with different initial guesses can be started simultaneously, the algorithm, unlike the standard QR method, is ideal for parallel computers. The author also reports on his investigation of deflation methods designed to obtain further eigenpairs if needed. Numerical results from implementations on a host of parallel machines (distributed and shared-memory) are presented.
Parallel clustering algorithm for large-scale biological data sets.

Science.gov (United States)

Wang, Minchao; Zhang, Wu; Ding, Wang; Dai, Dongbo; Zhang, Huiran; Xie, Hao; Chen, Luonan; Guo, Yike; Xie, Jiang

2014-01-01

The recent explosion of biological data brings a great challenge for traditional clustering algorithms. With the increasing scale of data sets, much larger memory and longer runtime are required for cluster identification problems. The affinity propagation algorithm outperforms many other classical clustering algorithms and is widely applied in biological research. However, its time and space complexity become a great bottleneck when handling large-scale data sets. Moreover, the similarity matrix, whose construction takes a long runtime, is required before running the affinity propagation algorithm, since the algorithm clusters data sets based on the similarities between data pairs. Two types of parallel architectures are proposed in this paper to accelerate the similarity matrix construction and the affinity propagation algorithm. The shared-memory architecture is used to construct the similarity matrix, and the distributed system is used for the affinity propagation algorithm because of its large memory size and great computing capacity. An appropriate way of data partition and reduction is designed in our method in order to minimize the global communication cost among processes. A speedup of 100 is gained with 128 cores. The runtime is reduced from several hours to a few seconds, which indicates that the parallel algorithm is capable of handling large-scale data sets effectively. The parallel affinity propagation also achieves good performance when clustering large-scale gene data (microarray) and detecting families in large protein superfamilies.
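The reported speedup of 100 on 128 cores can be turned into a parallel efficiency, and, if one assumes Amdahl's law applies (an assumption not made in the abstract), into an implied serial fraction. The helper names below are illustrative.

```python
def parallel_efficiency(speedup, cores):
    """Efficiency E = S / p."""
    return speedup / cores


def amdahl_serial_fraction(speedup, cores):
    """Serial fraction f obtained by solving S = 1 / (f + (1 - f) / p) for f."""
    return (cores / speedup - 1.0) / (cores - 1.0)
```

With S = 100 and p = 128, the efficiency is 100 / 128 = 0.78125, and the serial fraction implied under the Amdahl assumption is roughly 0.2%.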

Efficient parallel implementation of active appearance model fitting algorithm on GPU.

Science.gov (United States)

Wang, Jinwei; Ma, Xirong; Zhu, Yuanping; Sun, Jizhou

2014-01-01

The active appearance model (AAM) is one of the most powerful model-based object detecting and tracking methods and has been widely used in various situations. However, the high-dimensional texture representation causes very time-consuming computations, which makes the AAM difficult to apply to real-time systems. The emergence of modern graphics processing units (GPUs) that feature a many-core, fine-grained parallel architecture provides new and promising solutions to overcome the computational challenge. In this paper, we propose an efficient parallel implementation of the AAM fitting algorithm on GPUs. Our design idea is fine-grained parallelism, in which we distribute the texture data of the AAM, in pixels, to thousands of parallel GPU threads for processing, which makes the algorithm fit better into the GPU architecture. We implement our algorithm using the compute unified device architecture (CUDA) on Nvidia's GTX 650 GPU, which has the latest Kepler architecture. To compare the performance of our algorithm with different data sizes, we built sixteen face AAM models with textures of different dimensions. The experimental results show that our parallel AAM fitting algorithm can achieve real-time performance for videos even on very high-dimensional textures.
Efficient Out of Core Sorting Algorithms for the Parallel Disks Model.

Science.gov (United States)

Kundeti, Vamsi; Rajasekaran, Sanguthevar

2011-11-01

In this paper we present efficient algorithms for sorting on the Parallel Disks Model (PDM). Numerous asymptotically optimal algorithms have been proposed in the literature. However many of these merge based algorithms have large underlying constants in the time bounds, because they suffer from the lack of read parallelism on PDM. The irregular consumption of the runs during the merge affects the read parallelism and contributes to the increased sorting time. In this paper we first introduce a novel idea called the dirty sequence accumulation that improves the read parallelism. Secondly, we show analytically that this idea can reduce the number of parallel I/O's required to sort the input close to the lower bound of [Formula: see text]. We experimentally verify our dirty sequence idea with the standard R-Way merge and show that our idea can reduce the number of parallel I/Os to sort on PDM significantly.
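For orientation, the baseline that the paper improves on, forming sorted runs and then merging R of them at once, can be sketched in memory as follows. This does not model disks or parallel I/O and does not implement the dirty sequence accumulation idea; the function name `run_formation_and_merge` and the `run_length` parameter are assumptions for illustration.

```python
import heapq


def run_formation_and_merge(data, run_length):
    """Sort `data` by forming sorted runs and merging them in a single R-way merge."""
    # Run formation: each run stands for one memory-load of input sorted internally.
    runs = [sorted(data[i:i + run_length]) for i in range(0, len(data), run_length)]
    # R-way merge: repeatedly emit the smallest head element among all runs.
    return list(heapq.merge(*runs))
```

The merge consumes the runs at data-dependent rates, which is the irregular consumption the abstract identifies as the cause of poor read parallelism on PDM.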
Viscometric characterization of cobalt nanoparticle-based magnetorheological fluids using genetic algorithms

International Nuclear Information System (INIS)

Chaudhuri, Anirban; Wereley, Norman M.; Kotha, Sanjay; Radhakrishnan, Ramachandran; Sudarshan, Tirumalai S.

2005-01-01

The rheological flow curves (shear stress vs. shear rate) of a nanoparticle cobalt-based magnetorheological fluid can be modeled using Bingham-plastic and Herschel-Bulkley constitutive models. Steady-state rheological flow curves were measured using a parallel disk rheometer for constant shear rates as a function of applied magnetic field. Genetic algorithms were used to identify constitutive model parameters from the flow curve data.
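The two constitutive models named above have standard post-yield forms: Bingham-plastic, τ = τ_y + μ·γ̇, and Herschel-Bulkley, τ = τ_y + K·γ̇ⁿ. The sketch below writes them as functions together with a sum-of-squared-errors objective of the kind a genetic algorithm could minimize. The function and parameter names are illustrative assumptions, and no measured data from the paper are reproduced.

```python
def bingham_plastic(shear_rate, yield_stress, plastic_viscosity):
    """Post-yield shear stress: tau = tau_y + mu * gamma_dot."""
    return yield_stress + plastic_viscosity * shear_rate


def herschel_bulkley(shear_rate, yield_stress, consistency, flow_index):
    """Post-yield shear stress: tau = tau_y + K * gamma_dot ** n."""
    return yield_stress + consistency * shear_rate ** flow_index


def fit_error(model, params, shear_rates, measured_stresses):
    """Sum of squared errors between a model and a measured flow curve."""
    return sum((model(rate, *params) - stress) ** 2
               for rate, stress in zip(shear_rates, measured_stresses))
```

With a flow index of n = 1 the Herschel-Bulkley model reduces to the Bingham-plastic model, so `fit_error` can be used to compare both models on the same data.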
An investigation of genetic algorithms

International Nuclear Information System (INIS)

Douglas, S.R.

1995-04-01

Genetic algorithms mimic biological evolution by natural selection in their search for better individuals within a changing population. They can be used as efficient optimizers. This report discusses the developing field of genetic algorithms. It gives a simple example of the search process and introduces the concept of schema. It also discusses modifications to the basic genetic algorithm that result in species and niche formation, in machine learning and artificial evolution of computer programs, and in the streamlining of human-computer interaction. (author). 3 refs., 1 tab., 2 figs
Efficient Parallel Algorithm For Direct Numerical Simulation of Turbulent Flows

Science.gov (United States)

Moitra, Stuti; Gatski, Thomas B.

1997-01-01

A distributed algorithm for a high-order-accurate finite-difference approach to the direct numerical simulation (DNS) of transition and turbulence in compressible flows is described. This work has two major objectives. The first objective is to demonstrate that parallel and distributed-memory machines can be successfully and efficiently used to solve computationally intensive and input/output intensive algorithms of the DNS class. The second objective is to show that the computational complexity involved in solving the tridiagonal systems inherent in the DNS algorithm can be reduced by algorithm innovations that obviate the need to use a parallelized tridiagonal solver.
Clustering and Genetic Algorithm Based Hybrid Flowshop Scheduling with Multiple Operations

Directory of Open Access Journals (Sweden)

Yingfeng Zhang

2014-01-01

This research is motivated by a flowshop scheduling problem of our collaborative manufacturing company for aeronautic products. The heat-treatment stage (HTS) and precision forging stage (PFS) of the case are selected as a two-stage hybrid flowshop system. In HTS, there are four parallel machines and each machine can process a batch of jobs simultaneously. In PFS, there are two machines. Each machine can install any of the four modules for processing workpieces of different sizes. The problem is characterized by many constraints, such as batching operation, blocking environment, and setup time and working time limitations of modules. To deal with these special characteristics, a clustering and genetic algorithm is used to calculate a good solution for the two-stage hybrid flowshop problem. The clustering is used to group the jobs according to the processing ranges of the different modules of PFS. The genetic algorithm is used to schedule the optimal sequence of the grouped jobs for the HTS and PFS. Finally, a case study is used to demonstrate the efficiency and effectiveness of the designed genetic algorithm.
Parallel simulated annealing algorithms for cell placement on hypercube multiprocessors

Science.gov (United States)

Banerjee, Prithviraj; Jones, Mark Howard; Sargent, Jeff S.

1990-01-01

Two parallel algorithms for standard cell placement using simulated annealing are developed to run on distributed-memory message-passing hypercube multiprocessors. The cells can be mapped in a two-dimensional area of a chip onto processors in an n-dimensional hypercube in two ways, such that both small and large cell exchange and displacement moves can be applied. The computation of the cost function in parallel among all the processors in the hypercube is described, along with a distributed data structure that needs to be stored in the hypercube to support the parallel cost evaluation. A novel tree broadcasting strategy is used extensively for updating cell locations in the parallel environment. A dynamic parallel annealing schedule estimates the errors due to interacting parallel moves and adapts the rate of synchronization automatically. Two novel approaches in controlling error in parallel algorithms are described: heuristic cell coloring and adaptive sequence control.
Computation of watersheds based on parallel graph algorithms

NARCIS (Netherlands)

Meijster, A.; Roerdink, J.B.T.M.; Maragos, P; Schafer, RW; Butt, MA

1996-01-01

In this paper the implementation of a parallel watershed algorithm is described. The algorithm has been implemented on a Cray J932, which is a shared memory architecture with 32 processors. The watershed transform has generally been considered to be inherently sequential, but recently a few research
Optimization approaches to MPI and area merging-based parallel buffer algorithm

Directory of Open Access Journals (Sweden)

Junfu Fan

In buffer zone construction, the rasterization-based dilation method inevitably introduces errors, and the double-sided parallel line method involves a series of complex operations. In this paper, we propose a parallel buffer algorithm based on area merging and MPI (Message Passing Interface) to improve the performance of buffer analyses on large datasets. Experimental results reveal three major performance bottlenecks that significantly affect the efficiency of serial and parallel buffer construction: the area merging strategy, the task load balance method, and the MPI inter-process results merging strategy. Corresponding optimization approaches, namely a tree-like area merging strategy, a vertex-number-oriented parallel task partition method, and an inter-process results merging strategy, are suggested to overcome these bottlenecks. Experiments were carried out to examine the performance of the optimized parallel algorithm. The results suggest that the optimization approaches can provide high performance and processing ability for buffer construction in a cluster parallel environment. Our method could provide insights into the parallelization of spatial analysis algorithms.
An Alternative Algorithm for Computing Watersheds on Shared Memory Parallel Computers

NARCIS (Netherlands)

Meijster, A.; Roerdink, J.B.T.M.

1995-01-01

In this paper a parallel implementation of a watershed algorithm is proposed. The algorithm can easily be implemented on shared memory parallel computers. The watershed transform is generally considered to be inherently sequential since the discrete watershed of an image is defined using recursion.
Comparative efficiencies of three parallel algorithms for nonlinear ...

Indian Academy of Sciences (India)

R. Narasimhan (Krishtel eMaging) 1461 1996 Oct 15 13:05:22

This algorithm is better suited for large size problems on coarse ... and reliable time integration algorithms for solving the second-order dynamic equilibrium equations that arise due ... Programming models required to take advantage of the parallel and distributed ..... In addition, MPI added the concept of a 'virtual topology'.
Efficient Parallel Implementation of Active Appearance Model Fitting Algorithm on GPU

Directory of Open Access Journals (Sweden)

Jinwei Wang

2014-01-01

The active appearance model (AAM) is one of the most powerful model-based object detecting and tracking methods and has been widely used in various situations. However, the high-dimensional texture representation causes very time-consuming computations, which makes the AAM difficult to apply to real-time systems. The emergence of modern graphics processing units (GPUs) that feature a many-core, fine-grained parallel architecture provides new and promising solutions to overcome the computational challenge. In this paper, we propose an efficient parallel implementation of the AAM fitting algorithm on GPUs. Our design idea is fine-grained parallelism, in which we distribute the texture data of the AAM, in pixels, to thousands of parallel GPU threads for processing, which makes the algorithm fit better into the GPU architecture. We implement our algorithm using the compute unified device architecture (CUDA) on Nvidia's GTX 650 GPU, which has the latest Kepler architecture. To compare the performance of our algorithm with different data sizes, we built sixteen face AAM models of different dimensional textures. The experiment results show that our parallel AAM fitting algorithm can achieve real-time performance for videos even on very high-dimensional textures.
Parallel asynchronous systems and image processing algorithms

Science.gov (United States)

Coon, D. D.; Perera, A. G. U.

1989-01-01

A new hardware approach to implementation of image processing algorithms is described. The approach is based on silicon devices which would permit an independent analog processing channel to be dedicated to every pixel. A laminar architecture consisting of a stack of planar arrays of the device would form a two-dimensional array processor with a 2-D array of inputs located directly behind a focal plane detector array. A 2-D image data stream would propagate in neuronlike asynchronous pulse coded form through the laminar processor. Such systems would integrate image acquisition and image processing. Acquisition and processing would be performed concurrently as in natural vision systems. The research is aimed at implementation of algorithms, such as the intensity dependent summation algorithm and pyramid processing structures, which are motivated by the operation of natural vision systems. Implementation of natural vision algorithms would benefit from the use of neuronlike information coding and the laminar, 2-D parallel, vision system type architecture. Besides providing a neural network framework for implementation of natural vision algorithms, a 2-D parallel approach could eliminate the serial bottleneck of conventional processing systems. Conversion to serial format would occur only after raw intensity data has been substantially processed. An interesting challenge arises from the fact that the mathematical formulation of natural vision algorithms does not specify the means of implementation, so that hardware implementation poses intriguing questions involving vision science.
Parallel preconditioned conjugate gradient algorithm applied to neutron diffusion problem

International Nuclear Information System (INIS)

Majumdar, A.; Martin, W.R.

1992-01-01

Numerical solution of the neutron diffusion problem requires solving a linear system of equations such as Ax = b, where A is an n x n symmetric positive definite (SPD) matrix; x and b are vectors with n components. The preconditioned conjugate gradient (PCG) algorithm is an efficient iterative method for solving such a linear system of equations. In this paper, the authors describe the implementation of a parallel PCG algorithm on a shared memory machine (BBN TC2000) and on a distributed workstation (IBM RS6000) environment created by the parallel virtual machine parallelization software
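
The PCG iteration described in this abstract can be sketched in a few lines. The following is a minimal serial sketch, not the authors' parallel implementation: the abstract does not say which preconditioner was used, so a Jacobi (diagonal) preconditioner is assumed here, and the function name `pcg` and its parameters are hypothetical.

```python
def pcg(A, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for a symmetric positive definite A (list of rows).

    Assumption: Jacobi preconditioning, i.e. M^{-1} = 1 / diag(A).
    """
    n = len(b)

    def matvec(v):
        return [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]

    def dot(u, v):
        return sum(ui * vi for ui, vi in zip(u, v))

    inv_diag = [1.0 / A[i][i] for i in range(n)]

    x = [0.0] * n
    r = list(b)                                  # residual b - A x for x = 0
    z = [inv_diag[i] * r[i] for i in range(n)]   # preconditioned residual
    p = list(z)                                  # first search direction
    rz = dot(r, z)
    for _ in range(max_iter):
        if dot(r, r) ** 0.5 < tol:               # stop on a small residual norm
            break
        Ap = matvec(p)
        alpha = rz / dot(p, Ap)                  # step length along p
        x = [x[i] + alpha * p[i] for i in range(n)]
        r = [r[i] - alpha * Ap[i] for i in range(n)]
        z = [inv_diag[i] * r[i] for i in range(n)]
        rz_new = dot(r, z)
        beta = rz_new / rz                       # keeps directions A-conjugate
        p = [z[i] + beta * p[i] for i in range(n)]
        rz = rz_new
    return x


x = pcg([[4.0, 1.0], [1.0, 3.0]], [1.0, 2.0])
print(x)  # close to [1/11, 7/11]
```

In a parallel setting the matrix-vector product and the dot products are the natural places to distribute work, since every other step is an element-wise vector update.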
Optimal Golomb Ruler Sequences Generation for Optical WDM Systems: A Novel Parallel Hybrid Multi-objective Bat Algorithm

Science.gov (United States)

Bansal, Shonak; Singh, Arun Kumar; Gupta, Neena

2017-02-01

Real-life multi-objective engineering design problems are very tough and time-consuming optimization problems due to their high degree of nonlinearity, complexity and inhomogeneity. Nature-inspired multi-objective optimization algorithms are now becoming popular for solving such problems. This paper proposes the original multi-objective Bat algorithm (MOBA) and its extended form, a novel parallel hybrid multi-objective Bat algorithm (PHMOBA), to generate shortest-length Golomb rulers, called optimal Golomb ruler (OGR) sequences, in a reasonable computation time. OGRs find application in optical wavelength division multiplexing (WDM) systems as a channel-allocation algorithm to reduce four-wave mixing (FWM) crosstalk. The performance of both proposed algorithms in generating OGRs for optical WDM channel allocation is compared with other existing classical computing and nature-inspired algorithms, including extended quadratic congruence (EQC), search algorithm (SA), genetic algorithms (GAs), biogeography based optimization (BBO) and big bang-big crunch (BB-BC) optimization algorithms. Simulations conclude that the proposed PHMOBA works efficiently compared with the original MOBA and other existing algorithms in generating OGRs for optical WDM systems. PHMOBA has a higher convergence and success rate than the original MOBA. The efficiency improvement of the proposed PHMOBA in generating OGRs up to 20 marks, in terms of ruler length and total optical channel bandwidth (TBW), is 100 %, whereas for the original MOBA it is 85 %. Finally, the implications for further research are also discussed.
A Parallel Compact Multi-Dimensional Numerical Algorithm with Aeroacoustics Applications

Science.gov (United States)

Povitsky, Alex; Morris, Philip J.

1999-01-01

In this study we propose a novel method to parallelize high-order compact numerical algorithms for the solution of three-dimensional PDEs (Partial Differential Equations) in a space-time domain. For this numerical integration most of the computer time is spent in computation of spatial derivatives at each stage of the Runge-Kutta temporal update. The most efficient direct method to compute spatial derivatives on a serial computer is a version of Gaussian elimination for narrow linear banded systems known as the Thomas algorithm. In a straightforward pipelined implementation of the Thomas algorithm processors are idle due to the forward and backward recurrences of the Thomas algorithm. To utilize processors during this time, we propose to use them for either non-local data independent computations, solving lines in the next spatial direction, or local data-dependent computations by the Runge-Kutta method. To achieve this goal, control of processor communication and computations by a static schedule is adopted. Thus, our parallel code is driven by a communication and computation schedule instead of the usual "creative, programming" approach. The obtained parallelization speed-up of the novel algorithm is about twice as much as that for the standard pipelined algorithm and close to that for the explicit DRP algorithm.
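
For reference, the serial Thomas algorithm mentioned in this abstract can be sketched as below. This is only the textbook single-processor version; the pipelined scheduling the authors propose is not reproduced, and the name `thomas_solve` and its argument layout are assumptions made for illustration.

```python
def thomas_solve(a, b, c, d):
    """Solve a tridiagonal system with the Thomas algorithm.

    a: sub-diagonal (a[0] is unused), b: main diagonal,
    c: super-diagonal (c[-1] is unused), d: right-hand side.
    All four lists have the same length n.
    """
    n = len(d)
    cp = [0.0] * n   # modified super-diagonal
    dp = [0.0] * n   # modified right-hand side
    cp[0] = c[0] / b[0]
    dp[0] = d[0] / b[0]
    # Forward recurrence: row i needs the result of row i - 1.
    for i in range(1, n):
        denom = b[i] - a[i] * cp[i - 1]
        cp[i] = c[i] / denom if i < n - 1 else 0.0
        dp[i] = (d[i] - a[i] * dp[i - 1]) / denom
    # Backward recurrence: row i needs the result of row i + 1.
    x = [0.0] * n
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x


# Diagonal 2, off-diagonals -1, right-hand side chosen so that x = [1, 1, 1].
print(thomas_solve([0.0, -1.0, -1.0], [2.0, 2.0, 2.0], [-1.0, -1.0, 0.0], [1.0, 0.0, 1.0]))
```

The two loops show why processors sit idle in a straightforward pipeline: each step of the forward and backward recurrences depends on the previous one, so a processor holding a later part of a line must wait for its neighbour.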
Interactive animation of fault-tolerant parallel algorithms

Energy Technology Data Exchange (ETDEWEB)

Apgar, S.W.

1992-02-01

Animation of algorithms makes understanding them intuitively easier. This paper describes the software tool Raft (Robust Animator of Fault Tolerant Algorithms). The Raft system allows the user to animate a number of parallel algorithms which achieve fault tolerant execution. In particular, we use it to illustrate the key Write-All problem. It has an extensive user-interface which allows a choice of the number of processors, the number of elements in the Write-All array, and the adversary to control the processor failures. The novelty of the system is that the interface allows the user to create new on-line adversaries as the algorithm executes.
Acoustic simulation in architecture with parallel algorithm

Science.gov (United States)

Li, Xiaohong; Zhang, Xinrong; Li, Dan

2004-03-01

To address the complexity of architectural environments and the need for real-time simulation of architectural acoustics, a parallel radiosity algorithm was developed. The distribution of sound energy in a scene is solved with this method. The impulse responses between sources and receivers in each frequency segment, which are calculated by multiple processes, are then combined into the whole frequency response. The numerical experiment shows that the parallel algorithm can improve the efficiency of acoustic simulation for complex scenes.
Comparison Of Hybrid Sorting Algorithms Implemented On Different Parallel Hardware Platforms

Directory of Open Access Journals (Sweden)

Dominik Zurek

2013-01-01

Sorting is a common problem in computer science. There are a lot of well-known sorting algorithms created for sequential execution on a single processor. Recently, hardware platforms have made it possible to create widely parallel algorithms. Standard processors consist of multiple cores, and hardware accelerators such as GPUs are available. Graphics cards with their parallel architecture offer new possibilities to speed up many algorithms. In this paper we describe the results of implementing a few different sorting algorithms on GPU cards and multicore processors. Then a hybrid algorithm is presented which consists of parts executed on both platforms, standard CPU and GPU.
Parallel optimization of IDW interpolation algorithm on multicore platform

Science.gov (United States)

Guan, Xuefeng; Wu, Huayi

2009-10-01

Due to increasing power consumption, heat dissipation, and other physical issues, the architecture of the central processing unit (CPU) has been turning to multicore rapidly in recent years. A multicore processor packages multiple processor cores in the same chip, which not only offers increased performance but also presents significant challenges to application developers. In the GIS field, most current algorithms were implemented serially and cannot fully exploit the parallelism potential of such multicore platforms. In this paper, we choose the Inverse Distance Weighted (IDW) spatial interpolation algorithm as an example to study how to optimize current serial GIS algorithms on multicore platforms in order to maximize performance speedup. With the help of OpenMP, a threading methodology is introduced to split and share the whole interpolation work among processor cores. After parallel optimization, the execution time of the interpolation algorithm is greatly reduced and good performance speedup is achieved. For example, the performance speedup on an Intel Xeon 5310 is 1.943 with 2 execution threads and 3.695 with 4 execution threads. An additional output comparison between pre-optimization and post-optimization is carried out and shows that parallel optimization does not affect the final interpolation result.
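
As an illustration of why IDW interpolation parallelizes well, here is a minimal serial sketch. It is not the paper's OpenMP code: the function names `idw` and `idw_grid`, the `(x, y, value)` sample layout, and the default power of 2 are assumptions.

```python
import math


def idw(samples, query, power=2.0):
    """Inverse distance weighted estimate at `query` from (x, y, value) samples."""
    qx, qy = query
    weighted_sum = 0.0
    weight_total = 0.0
    for x, y, value in samples:
        dist = math.hypot(x - qx, y - qy)
        if dist == 0.0:
            return value              # query coincides with a sample point
        weight = 1.0 / dist ** power  # nearer samples count more
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


def idw_grid(samples, queries, power=2.0):
    """Interpolate every query point; each point is independent of the others."""
    return [idw(samples, q, power) for q in queries]


samples = [(0.0, 0.0, 10.0), (2.0, 0.0, 20.0)]
print(idw_grid(samples, [(1.0, 0.0), (0.0, 0.0)]))  # [15.0, 10.0]
```

Because no query point depends on any other, the loop in `idw_grid` can be divided among threads without changing the result. The speedups reported in the abstract correspond to parallel efficiencies of about 97% (1.943 / 2) and 92% (3.695 / 4).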

Massively parallel red-black algorithms for x-y-z response matrix equations

International Nuclear Information System (INIS)

Hanebutte, U.R.; Laurin-Kovitz, K.; Lewis, E.E.

1992-01-01

Recently, both discrete ordinates and spherical harmonic (S_n and P_n) methods have been cast in the form of response matrices. In x-y geometry, massively parallel algorithms have been developed to solve the resulting response matrix equations on the Connection Machine family of parallel computers, the CM-2, CM-200, and CM-5. These algorithms utilize two-cycle iteration on a red-black checkerboard. In this work we examine the use of massively parallel red-black algorithms to solve response matrix equations in three dimensions. The longer term objective is to utilize massively parallel algorithms to solve S_n and/or P_n response matrix problems. In this exploratory examination, however, we consider the simple 6 x 6 response matrices that are derivable from fine-mesh diffusion approximations in three dimensions
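
The red-black checkerboard idea can be illustrated on a much simpler problem than the response matrix equations treated in this record. The sketch below applies red-black Gauss-Seidel sweeps to the 2-D Laplace equation on a small grid; the model problem and the name `red_black_sweep` are assumptions chosen only to show the two-cycle structure.

```python
def red_black_sweep(u):
    """One red-black Gauss-Seidel sweep for the 2-D Laplace equation.

    `u` is a list of rows; the outermost rows and columns hold fixed
    boundary values and are never modified.
    """
    rows, cols = len(u), len(u[0])
    for colour in (0, 1):                      # cycle 1: red cells, cycle 2: black cells
        for i in range(1, rows - 1):
            for j in range(1, cols - 1):
                if (i + j) % 2 == colour:
                    # All four neighbours have the opposite colour, so every
                    # cell of one colour could be updated at the same time.
                    u[i][j] = 0.25 * (u[i - 1][j] + u[i + 1][j]
                                      + u[i][j - 1] + u[i][j + 1])
    return u


# Boundary fixed at 1.0, interior starts at 0.0 and relaxes towards 1.0.
grid = [[1.0] * 5 for _ in range(5)]
for i in range(1, 4):
    for j in range(1, 4):
        grid[i][j] = 0.0
for _ in range(200):
    red_black_sweep(grid)
print(grid[2][2])
```

Within each of the two cycles no updated cell reads another cell of the same colour, which is what makes the scheme attractive on massively parallel machines.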
Optimal parallel algorithms for problems modeled by a family of intervals

Science.gov (United States)

Olariu, Stephan; Schwing, James L.; Zhang, Jingyuan

1992-01-01

A family of intervals on the real line provides a natural model for a vast number of scheduling and VLSI problems. Recently, a number of parallel algorithms to solve a variety of practical problems on such a family of intervals have been proposed in the literature. Computational tools are developed, and it is shown how they can be used for the purpose of devising cost-optimal parallel algorithms for a number of interval-related problems including finding a largest subset of pairwise nonoverlapping intervals, a minimum dominating subset of intervals, along with algorithms to compute the shortest path between a pair of intervals and, based on the shortest path, a parallel algorithm to find the center of the family of intervals. More precisely, with an arbitrary family of n intervals as input, all algorithms run in O(log n) time using O(n) processors in the EREW-PRAM model of computation.
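
For orientation, one of the problems listed in this abstract, finding a largest subset of pairwise nonoverlapping intervals, has a well-known sequential greedy solution, sketched below. This is a serial baseline only, not the O(log n) EREW-PRAM algorithm of the paper; the name `max_nonoverlapping` and the closed-interval convention are assumptions.

```python
def max_nonoverlapping(intervals):
    """Return a largest subset of pairwise nonoverlapping closed intervals.

    Greedy rule: scan intervals by increasing right endpoint and keep each
    one that starts strictly after the last kept interval ends.
    """
    chosen = []
    last_end = float("-inf")
    for left, right in sorted(intervals, key=lambda iv: iv[1]):
        if left > last_end:
            chosen.append((left, right))
            last_end = right
    return chosen


print(max_nonoverlapping([(1, 3), (2, 5), (4, 7), (6, 8)]))  # [(1, 3), (4, 7)]
```

The sort dominates the serial cost at O(n log n); a cost-optimal parallel algorithm must match that total work across its processors.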
GPGPU Implementation of a Genetic Algorithm for Stereo Refinement

Directory of Open Access Journals (Sweden)

Álvaro Arranz

2015-03-01

During the last decade, general-purpose computing on graphics processing units (GPGPU) has turned out to be a useful tool for speeding up many scientific calculations. Computer vision is known to be one of the fields with the greatest penetration of these new techniques. This paper explores the advantages of using a GPGPU implementation to speed up a genetic algorithm used for stereo refinement. The main contribution of this paper is an analysis of which genetic operators take advantage of a parallel approach and a description of an efficient state-of-the-art implementation for each one. As a result, speed-ups close to 80x can be achieved, which proves to be the only way of achieving close to real-time performance.
A parallel simulated annealing algorithm for standard cell placement on a hypercube computer

Science.gov (United States)

Jones, Mark Howard

1987-01-01

A parallel version of a simulated annealing algorithm is presented which is targeted to run on a hypercube computer. A strategy for mapping the cells in a two dimensional area of a chip onto processors in an n-dimensional hypercube is proposed such that both small and large distance moves can be applied. Two types of moves are allowed: cell exchanges and cell displacements. The computation of the cost function in parallel among all the processors in the hypercube is described along with a distributed data structure that needs to be stored in the hypercube to support parallel cost evaluation. A novel tree broadcasting strategy is used extensively in the algorithm for updating cell locations in the parallel environment. Studies on the performance of the algorithm on example industrial circuits show that it is faster and gives better final placement results than the uniprocessor simulated annealing algorithms. An improved uniprocessor algorithm is proposed which is based on the improved results obtained from parallelization of the simulated annealing algorithm.
Parallel Algorithm for Wireless Data Compression and Encryption

Directory of Open Access Journals (Sweden)

Qin Jiancheng

2017-01-01

As the wireless network has limited bandwidth and insecure shared media, data compression and encryption are very useful for the broadcast transport of big data in the IoT (Internet of Things). However, the traditional techniques of compression and encryption are neither competent nor efficient. In order to solve this problem, this paper presents a combined parallel algorithm named “CZ algorithm” which can compress and encrypt big data efficiently. The CZ algorithm uses a parallel pipeline, mixes the coding of compression and encryption, and supports a data window of up to 1 TB (or larger). Moreover, the CZ algorithm can encrypt big data as a chaotic cryptosystem which will not decrease the compression speed. Meanwhile, a shareware program named “ComZip” is developed based on the CZ algorithm. The experimental results show that ComZip on a 64-bit system can achieve a better compression ratio than WinRAR and 7-zip, and it can be faster than 7-zip in big data compression. In addition, ComZip encrypts big data without extra consumption of computing resources.
The Primordial Soup Algorithm : a systematic approach to the specification of parallel parsers

NARCIS (Netherlands)

Janssen, Wil; Janssen, W.P.M.; Poel, Mannes; Sikkel, Nicolaas; Zwiers, Jakob

1992-01-01

A general framework for parallel parsing is presented, which allows for a unified, systematic approach to parallel parsing. The Primordial Soup Algorithm creates trees by allowing partial parse trees to combine arbitrarily. By adding constraints to the general algorithm, a large class of parallel
Parallel Evolutionary Optimization Algorithms for Peptide-Protein Docking

Science.gov (United States)

Poluyan, Sergey; Ershov, Nikolay

2018-02-01

In this study we examine the possibility of using evolutionary optimization algorithms in protein-peptide docking. We present the main assumptions that reduce the docking problem to a continuous global optimization problem and provide a way of using evolutionary optimization algorithms. The Rosetta all-atom force field was used for structural representation and energy scoring. We describe the parallelization scheme and MPI/OpenMP realization of the considered algorithms. We demonstrate the efficiency and the performance for some algorithms which were applied to a set of benchmark tests.
Parallelized event chain algorithm for dense hard sphere and polymer systems

International Nuclear Information System (INIS)

Kampmann, Tobias A.; Boltz, Horst-Holger; Kierfeld, Jan

2015-01-01

We combine parallelization and cluster Monte Carlo for hard sphere systems and present a parallelized event chain algorithm for the hard disk system in two dimensions. For parallelization we use a spatial partitioning approach into simulation cells. We find that it is crucial for correctness to ensure detailed balance on the level of Monte Carlo sweeps by drawing the starting sphere of event chains within each simulation cell with replacement. We analyze the performance gains for the parallelized event chain and find a criterion for an optimal degree of parallelization. Because of the cluster nature of event chain moves massive parallelization will not be optimal. Finally, we discuss first applications of the event chain algorithm to dense polymer systems, i.e., bundle-forming solutions of attractive semiflexible polymers
Kinetic-Monte-Carlo-Based Parallel Evolution Simulation Algorithm of Dust Particles

Directory of Open Access Journals (Sweden)

Xiaomei Hu

2014-01-01

The evolution simulation of dust particles provides an important way to analyze the impact of dust on the environment. A KMC-based parallel algorithm is proposed to simulate the evolution of dust particles. In this algorithm, a data distribution scheme and a communication optimization strategy are proposed to balance the load of every process and reduce the communication expense among processes. The experimental results show that the simulation of diffusion, sedimentation, and resuspension of dust particles in a virtual campus is realized and that the parallel algorithm shortens the simulation time, which makes up for the shortcomings of serial computing and makes the simulation of large-scale virtual environments possible.
A Two-Pass Exact Algorithm for Selection on Parallel Disk Systems.

Science.gov (United States)

Mi, Tian; Rajasekaran, Sanguthevar

2013-07-01

Numerous OLAP queries in data warehousing applications involve selection operations such as "top N", median, and "top 5%". Selection is a well-studied problem that has numerous applications in the management of data and databases since, typically, any complex data query can be reduced to a series of basic operations such as sorting and selection. Parallel selection has also become an important fundamental operation, especially after parallel databases were introduced. In this paper, we present a deterministic algorithm, Recursive Sampling Selection (RSS), to solve the exact out-of-core selection problem, which we show needs no more than (2 + ε) passes (ε being a very small fraction). We have compared our RSS algorithm with two other algorithms in the literature, namely Deterministic Sampling Selection (DSS) and QuickSelect on the Parallel Disks Systems. Our analysis shows that DSS is a (2 + ε)-pass algorithm when the total number of input elements N is a polynomial in the memory size M (i.e., N = M^c for some constant c), whereas our proposed algorithm RSS runs in (2 + ε) passes without any assumptions. Experimental results indicate that both RSS and DSS outperform QuickSelect on the Parallel Disks Systems. In particular, the proposed algorithm RSS is more scalable and robust in handling big data when the input size is far greater than the core memory size, including the case of N ≫ M^c.
Parallelized Genetic Identification of the Thermal-Electrochemical Model for Lithium-Ion Battery

Directory of Open Access Journals (Sweden)

Liqiang Zhang

2013-01-01

The parameters of a model that predicts well can be used as health characteristics for a lithium-ion battery. This article reports a parallelized parameter identification of the thermal-electrochemical model, which significantly reduces the time consumption of parameter identification. Since the P2D model has the best predictive ability, it is chosen for further research and expanded to the thermal-electrochemical model by coupling the thermal effect and temperature-dependent parameters. A genetic algorithm (GA) is then used for parameter identification, but it takes too much time because of the long simulation time of the model. For this reason, a computer cluster is built from surplus computing resources in our laboratory based on the Parallel Computing Toolbox and Distributed Computing Server in MATLAB. The performance of two parallelized methods, namely Single Program Multiple Data (SPMD) and parallel FOR loop (PARFOR), is investigated, and then the parallelized GA identification is proposed. With this method, model simulations run in parallel and the parameter identification can be sped up more than a dozen times, and the identification result is better than that from the serial GA. This conclusion is validated by model parameter identification of a real LiFePO4 battery.
A parallel version of a multigrid algorithm for isotropic transport equations

International Nuclear Information System (INIS)

Manteuffel, T.; McCormick, S.; Yang, G.; Morel, J.; Oliveira, S.

1994-01-01

The focus of this paper is on a parallel algorithm for solving the transport equations in a slab geometry using multigrid. The spatial discretization scheme used is a finite element method called the modified linear discontinuous (MLD) scheme. The MLD scheme represents a lumped version of the standard linear discontinuous (LD) scheme. The parallel algorithm was implemented on the Connection Machine 2 (CM2). Convergence rates and timings for this algorithm on the CM2 and Cray-YMP are shown
A Robust Parallel Algorithm for Combinatorial Compressed Sensing

Science.gov (United States)

Mendoza-Smith, Rodrigo; Tanner, Jared W.; Wechsung, Florian

2018-04-01

In previous work two of the authors have shown that a vector $x \in \mathbb{R}^n$ with at most $k$ [...] Parallel-$\ell_0$ decoding algorithm, where $\mathrm{nnz}(A)$ denotes the number of nonzero entries in $A \in \mathbb{R}^{m \times n}$. In this paper we present the Robust-$\ell_0$ decoding algorithm, which robustifies Parallel-$\ell_0$ when the sketch $Ax$ is corrupted by additive noise. This robustness is achieved by approximating the asymptotic posterior distribution of values in the sketch given its corrupted measurements. We provide analytic expressions that approximate these posteriors under the assumptions that the nonzero entries in the signal and the noise are drawn from continuous distributions. Numerical experiments presented show that Robust-$\ell_0$ is superior to existing greedy and combinatorial compressed sensing algorithms in the presence of small to moderate signal-to-noise ratios in the setting of Gaussian signals and Gaussian additive noise.
Parallel computation of nondeterministic algorithms in VLSI

Energy Technology Data Exchange (ETDEWEB)

Hortensius, P D

1987-01-01

This work examines parallel VLSI implementations of nondeterministic algorithms. It is demonstrated that conventional pseudorandom number generators are unsuitable for highly parallel applications. Efficient parallel pseudorandom sequence generation can be accomplished using certain classes of elementary one-dimensional cellular automata. The pseudorandom numbers appear in parallel on each clock cycle. Extensive study of the properties of these new pseudorandom number generators is made using standard empirical random number tests, cycle length tests, and implementation considerations. Furthermore, it is shown these particular cellular automata can form the basis of efficient VLSI architectures for computations involved in the Monte Carlo simulation of both the percolation and Ising models from statistical mechanics. Finally, a variation on a Built-In Self-Test technique based upon cellular automata is presented. These Cellular Automata-Logic-Block-Observation (CALBO) circuits improve upon conventional design for testability circuitry.
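
The idea of drawing pseudorandom bits in parallel from a one-dimensional cellular automaton can be sketched as follows. The abstract does not name the rules studied, so the default of Wolfram's rule 30 on a ring of cells is purely an illustrative assumption, as are the function names `ca_step` and `ca_random_words`.

```python
def ca_step(cells, rule=30):
    """Advance a ring of binary cells one step under an elementary CA rule.

    Each new cell value is the bit of `rule` indexed by the 3-bit
    neighbourhood (left, self, right).
    """
    n = len(cells)
    return [(rule >> (cells[i - 1] << 2 | cells[i] << 1 | cells[(i + 1) % n])) & 1
            for i in range(n)]


def ca_random_words(seed_cells, count, rule=30):
    """Return `count` integers, one per step, read from the whole cell array."""
    cells = list(seed_cells)
    words = []
    for _ in range(count):
        cells = ca_step(cells, rule)
        # Every cell yields one bit per step, so a full word appears at once.
        words.append(int("".join(map(str, cells)), 2))
    return words


print(ca_step([0, 0, 1, 0, 0]))             # [0, 1, 1, 1, 0]
print(ca_random_words([0, 0, 0, 1, 0, 0, 0, 0], 4))
```

In hardware each cell is a small piece of logic that looks only at its neighbours, which is why all bits of a word can be produced on the same clock cycle. A real design would need the statistical and cycle-length testing the abstract describes before such a generator could be trusted.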
When do evolutionary algorithms optimize separable functions in parallel?

DEFF Research Database (Denmark)

Doerr, Benjamin; Sudholt, Dirk; Witt, Carsten

2013-01-01

is that evolutionary algorithms make progress on all subfunctions in parallel, so that optimizing a separable function does not take much longer than optimizing the hardest subfunction; subfunctions are optimized "in parallel." We show that this is only partially true, already for the simple (1+1) evolutionary...... algorithm ((1+1) EA). For separable functions composed of k Boolean functions indeed the optimization time is the maximum optimization time of these functions times a small O(log k) overhead. More generally, for sums of weighted subfunctions that each attain non-negative integer values less than r = o(log1...
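
The (1+1) EA referred to in this abstract is simple enough to sketch directly. The version below uses the standard mutation rate of 1/n per bit and is run on OneMax, a separable function that is a sum of n one-bit subfunctions; the function names, the `target` stopping value, and the evaluation budget are assumptions made for this illustration.

```python
import random


def one_plus_one_ea(fitness, n, target, max_evals=100_000, seed=None):
    """(1+1) EA on bit strings of length n, maximizing `fitness`.

    Keeps a single parent, flips each bit independently with probability
    1/n, and accepts the offspring if it is at least as fit as the parent.
    Stops when `target` fitness is reached or the budget is exhausted.
    """
    rng = random.Random(seed)
    parent = [rng.randint(0, 1) for _ in range(n)]
    parent_fit = fitness(parent)
    evals = 1
    while parent_fit < target and evals < max_evals:
        child = [bit ^ (rng.random() < 1.0 / n) for bit in parent]
        child_fit = fitness(child)
        evals += 1
        if child_fit >= parent_fit:
            parent, parent_fit = child, child_fit
    return parent, evals


def one_max(bits):
    """Separable test function: the number of one-bits."""
    return sum(bits)


best, evals = one_plus_one_ea(one_max, n=20, target=20, seed=1)
print(one_max(best), evals)
```

On OneMax a single mutation can improve several one-bit subfunctions at once, which is the intuitive sense in which the subfunctions are optimized in parallel.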
Genetic algorithms applied to nuclear reactor design optimization

International Nuclear Information System (INIS)

Pereira, C.M.N.A.; Schirru, R.; Martinez, A.S.

2000-01-01

A genetic algorithm is a powerful search technique that simulates natural evolution in order to fit a population of computational structures to the solution of an optimization problem. This technique presents several advantages over classical ones such as linear programming based techniques, often used in nuclear engineering optimization problems. However, genetic algorithms demand some extra computational cost. Nowadays, due to the fast computers available, the use of genetic algorithms has increased and their practical application has become a reality. In nuclear engineering there are many difficult optimization problems related to nuclear reactor design. The genetic algorithm is a suitable technique for tackling this kind of problem. This chapter presents applications of genetic algorithms for nuclear reactor core design optimization. A genetic algorithm has been designed to optimize the nuclear reactor cell parameters, such as array pitch, isotopic enrichment, dimensions and cell materials. Some advantages of this genetic algorithm implementation over a classical method based on linear programming are revealed through the application of both techniques to a simple optimization problem. In order to emphasize the suitability of genetic algorithms for design optimization, the technique was successfully applied to a more complex problem, where the classical method is not suitable. Results and comments about the applications are also presented. (orig.)
A backtracking algorithm for the stream AND-parallel execution of logic programs

Energy Technology Data Exchange (ETDEWEB)

Somogyi, Z.; Ramamohanarao, K.; Vaghani, J. (Univ. of Melbourne, Parkville (Australia))

1988-06-01

The authors present the first backtracking algorithm for stream AND-parallel logic programs. It relies on compile-time knowledge of the data flow graph of each clause to let it figure out efficiently which goals to kill or restart when a goal fails. This crucial information, which they derive from mode declarations, was not available at compile-time in any previous stream AND-parallel system. They show that modes can increase the precision of the backtracking algorithm, though their algorithm allows this precision to be traded off against overhead on a procedure-by-procedure and call-by-call basis. The modes also allow their algorithm to handle efficiently programs that manipulate partially instantiated data structures and an important class of programs with circular dependency graphs. On code that does not need backtracking, the efficiency of their algorithm approaches that of the committed-choice languages; on code that does need backtracking its overhead is comparable to that of the independent AND-parallel backtracking algorithms.
Optimal hydrogenerator governor tuning with a genetic algorithm

International Nuclear Information System (INIS)

Lansberry, J.E.; Wozniak, L.; Goldberg, D.E.

1992-01-01

Many techniques exist for developing optimal controllers. This paper investigates genetic algorithms as a means of finding optimal solutions over a parameter space. In particular, the genetic algorithm is applied to optimal tuning of a governor for a hydrogenerator plant. Analog and digital simulation methods are compared for use in conjunction with the genetic algorithm optimization process. It is shown that analog plant simulation provides advantages in speed over digital plant simulation. This speed advantage makes application of the genetic algorithm in an actual plant environment feasible. Furthermore, the genetic algorithm is shown to possess the ability to reject plant noise and other system anomalies in its search for optimizing solutions
Fast parallel algorithms for the x-ray transform and its adjoint.

Science.gov (United States)

Gao, Hao

2012-11-01

Iterative reconstruction methods often offer better imaging quality and allow for reconstructions with lower imaging dose than classical methods in computed tomography. However, computational speed is a major concern for these iterative methods, for which the x-ray transform and its adjoint are the two most time-consuming components. The speed issue becomes even more notable for 3D imaging such as cone beam scans or helical scans, since the x-ray transform and its adjoint are frequently computed, as there is usually not enough computer memory to save the corresponding system matrix. The purpose of this paper is to optimize the algorithm for computing the x-ray transform and its adjoint, and their parallel computation. Fast and highly parallelizable algorithms for the x-ray transform and its adjoint are proposed for the infinitely narrow beam in both 2D and 3D. The extension of these fast algorithms to the finite-size beam is proposed in 2D and discussed in 3D. The CPU and GPU codes are available at https://sites.google.com/site/fastxraytransform. The proposed algorithm is faster than Siddon's algorithm for computing the x-ray transform. In particular, the improvement for the parallel computation can be an order of magnitude. The authors have proposed fast and highly parallelizable algorithms for the x-ray transform and its adjoint, which are extendable to the finite-size beam. The proposed algorithms are suitable for parallel computing in the sense that the computational cost per parallel thread is O(1).
Implementation and analysis of a Navier-Stokes algorithm on parallel computers

Science.gov (United States)

Fatoohi, Raad A.; Grosch, Chester E.

1988-01-01

The results of the implementation of a Navier-Stokes algorithm on three parallel/vector computers are presented. The object of this research is to determine how well, or poorly, a single numerical algorithm would map onto three different architectures. The algorithm is a compact difference scheme for the solution of the incompressible, two-dimensional, time-dependent Navier-Stokes equations. The computers were chosen so as to encompass a variety of architectures. They are the following: the MPP, an SIMD machine with 16K bit serial processors; Flex/32, an MIMD machine with 20 processors; and Cray/2. The implementation of the algorithm is discussed in relation to these architectures and measures of the performance on each machine are given. The basic comparison is among SIMD instruction parallelism on the MPP, MIMD process parallelism on the Flex/32, and vectorization of a serial code on the Cray/2. Simple performance models are used to describe the performance. These models highlight the bottlenecks and limiting factors for this algorithm on these architectures. Finally, conclusions are presented.

A parallel algorithm for the two-dimensional time fractional diffusion equation with implicit difference method.

Science.gov (United States)

Gong, Chunye; Bao, Weimin; Tang, Guojian; Jiang, Yuewen; Liu, Jie

2014-01-01

It is very time consuming to solve fractional differential equations. The computational complexity of the two-dimensional time fractional diffusion equation (2D-TFDE) with an iterative implicit finite difference method is O(M_x M_y N^2). In this paper, we present a parallel algorithm for the 2D-TFDE and give an in-depth discussion of this algorithm. A task distribution model and data layout with virtual boundary are designed for this parallel algorithm. The experimental results show that the parallel algorithm compares well with the exact solution. The parallel algorithm on a single Intel Xeon X5540 CPU runs 3.16-4.17 times faster than the serial algorithm on a single CPU core. The parallel efficiency of 81 processes is up to 88.24% compared with 9 processes on a distributed memory cluster system. We do think that parallel computing technology will become a very basic method for computationally intensive fractional applications in the near future.
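The 88.24% figure is a relative efficiency, measured against a 9-process baseline rather than against a serial run. Assuming the usual definition of relative parallel efficiency (the symbols T_9 and T_81 for the wall-clock times with 9 and 81 processes are introduced here for illustration and do not appear in the abstract), the implied relative speedup can be worked out as follows:

```latex
E_{81 \mid 9} = \frac{9\,T_{9}}{81\,T_{81}} = 0.8824
\quad\Longrightarrow\quad
\frac{T_{9}}{T_{81}} = 0.8824 \times \frac{81}{9} \approx 7.94
```

Under that reading, a ninefold increase in process count would give roughly a 7.9-fold reduction in run time at best.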
A GPU-paralleled implementation of an enhanced face recognition algorithm

Science.gov (United States)

Chen, Hao; Liu, Xiyang; Shao, Shuai; Zan, Jiguo

2013-03-01

Face recognition based on compressed sensing and sparse representation has been hotly debated in recent years. This scheme increases the recognition rate as well as the anti-noise capability. However, its computational cost is high and has become a main restricting factor for real-world applications. In this paper, we introduce a GPU-accelerated hybrid variant of the face recognition algorithm, named the parallel face recognition algorithm (pFRA). We describe how to carry out a parallel optimization design that takes full advantage of the many-core structure of a GPU. The pFRA is tested and compared with several other implementations under different data sample sizes. Finally, our pFRA, implemented with an NVIDIA GPU and the Compute Unified Device Architecture (CUDA) programming model, achieves a significant speedup over traditional CPU implementations.
A new scheduling algorithm for parallel sparse LU factorization with static pivoting

Energy Technology Data Exchange (ETDEWEB)

Grigori, Laura; Li, Xiaoye S.

2002-08-20

In this paper we present a static scheduling algorithm for parallel sparse LU factorization with static pivoting. The algorithm is divided into mapping and scheduling phases, using the symmetric pruned graphs of L' and U to represent dependencies. The scheduling algorithm is designed for driving the parallel execution of the factorization on a distributed-memory architecture. Experimental results and comparisons with SuperLU_DIST are reported after applying this algorithm on real world application matrices on an IBM SP RS/6000 distributed memory machine.
GPU-based parallel algorithm for blind image restoration using midfrequency-based methods

Science.gov (United States)

Xie, Lang; Luo, Yi-han; Bao, Qi-liang

2013-08-01

GPU-based general-purpose computing is a new branch of modern parallel computing, so the study of parallel algorithms specially designed for GPU hardware architecture is of great significance. In order to solve the problem of high computational complexity and poor real-time performance in blind image restoration, the midfrequency-based algorithm for blind image restoration was analyzed and improved in this paper. Furthermore, a midfrequency-based filtering method is also used to restore the image with hardly any recursion or iteration. Combining the algorithm with data intensiveness, data parallel computing and the GPU execution model of single instruction and multiple threads, a new parallel midfrequency-based algorithm for blind image restoration is proposed in this paper, which is suitable for stream computing on the GPU. In this algorithm, the GPU is utilized to accelerate the estimation of class-G point spread functions and midfrequency-based filtering. Aiming at better management of the GPU threads, the threads in a grid are scheduled according to the decomposition of the filtering data in the frequency domain after the optimization of data access and the communication between the host and the device. The kernel parallelism structure is determined by the decomposition of the filtering data to ensure the transmission rate to get around the memory bandwidth limitation. The results show that, with the new algorithm, the operational speed is significantly increased and the real-time performance of image restoration is effectively improved, especially for high-resolution images.
Characterization of robotics parallel algorithms and mapping onto a reconfigurable SIMD machine

Science.gov (United States)

Lee, C. S. G.; Lin, C. T.

1989-01-01

The kinematics, dynamics, Jacobian, and their corresponding inverse computations are six essential problems in the control of robot manipulators. Efficient parallel algorithms for these computations are discussed and analyzed. Their characteristics are identified and a scheme for mapping these algorithms to a reconfigurable parallel architecture is presented. Based on characteristics including type of parallelism, degree of parallelism, uniformity of the operations, fundamental operations, data dependencies, and communication requirement, it is shown that most of the algorithms for robotic computations possess highly regular properties and some common structures, especially the linear recursive structure. Moreover, they are well-suited to be implemented on a single-instruction-stream multiple-data-stream (SIMD) computer with a reconfigurable interconnection network. The model of a reconfigurable dual network SIMD machine with internal direct feedback is introduced. A systematic procedure to map these computations to the proposed machine is presented. A new scheduling problem for SIMD machines is investigated and a heuristic algorithm, called neighborhood scheduling, that reorders the processing sequence of subtasks to reduce the communication time is described. Mapping results of a benchmark algorithm are illustrated and discussed.
Boolean Queries Optimization by Genetic Algorithms

Czech Academy of Sciences Publication Activity Database

Húsek, Dušan; Owais, S.S.J.; Krömer, P.; Snášel, Václav

2005-01-01

Vol. 15 (2005), pp. 395-409. ISSN 1210-0552. R&D Projects: GA AV ČR 1ET100300414. Institutional research plan: CEZ:AV0Z10300504. Keywords: evolutionary algorithms; genetic algorithms; genetic programming; information retrieval; Boolean query. Subject RIV: BB - Applied Statistics, Operational Research
A parallel algorithm for switch-level timing simulation on a hypercube multiprocessor

Science.gov (United States)

Rao, Hariprasad Nannapaneni

1989-01-01

The parallel approach to speeding up simulation is studied, specifically the simulation of digital LSI MOS circuitry on the Intel iPSC/2 hypercube. The simulation algorithm is based on RSIM, an event driven switch-level simulator that incorporates a linear transistor model for simulating digital MOS circuits. Parallel processing techniques based on the concepts of Virtual Time and rollback are utilized so that portions of the circuit may be simulated on separate processors, in parallel for as large an increase in speed as possible. A partitioning algorithm is also developed in order to subdivide the circuit for parallel processing.
Efficiency Analysis of the Parallel Implementation of the SIMPLE Algorithm on Multiprocessor Computers

Science.gov (United States)

Lashkin, S. V.; Kozelkov, A. S.; Yalozo, A. V.; Gerasimov, V. Yu.; Zelensky, D. K.

2017-12-01

This paper describes the details of the parallel implementation of the SIMPLE algorithm for numerical solution of the Navier-Stokes system of equations on arbitrary unstructured grids. The iteration schemes for the serial and parallel versions of the SIMPLE algorithm are implemented. In the description of the parallel implementation, special attention is paid to computational data exchange among processors under the condition of the grid model decomposition using fictitious cells. We discuss the specific features for the storage of distributed matrices and implementation of vector-matrix operations in parallel mode. It is shown that the proposed way of matrix storage reduces the number of interprocessor exchanges. A series of numerical experiments illustrates the effect of the multigrid SLAE solver tuning on the general efficiency of the algorithm; the tuning involves the types of the cycles used (V, W, and F), the number of iterations of a smoothing operator, and the number of cells for coarsening. Two ways (direct and indirect) of efficiency evaluation for parallelization of the numerical algorithm are demonstrated. The paper presents the results of solving some internal and external flow problems with the evaluation of parallelization efficiency by two algorithms. It is shown that the proposed parallel implementation enables efficient computations for the problems on a thousand processors. Based on the results obtained, some general recommendations are made for the optimal tuning of the multigrid solver, as well as for selecting the optimal number of cells per processor.
General upper bounds on the runtime of parallel evolutionary algorithms.

Science.gov (United States)

Lässig, Jörg; Sudholt, Dirk

2014-01-01

We present a general method for analyzing the runtime of parallel evolutionary algorithms with spatially structured populations. Based on the fitness-level method, it yields upper bounds on the expected parallel runtime. This allows for a rigorous estimate of the speedup gained by parallelization. Tailored results are given for common migration topologies: ring graphs, torus graphs, hypercubes, and the complete graph. Example applications for pseudo-Boolean optimization show that our method is easy to apply and that it gives powerful results. In our examples the performance guarantees improve with the density of the topology. Surprisingly, even sparse topologies such as ring graphs lead to a significant speedup for many functions while not increasing the total number of function evaluations by more than a constant factor. We also identify which number of processors lead to the best guaranteed speedups, thus giving hints on how to parameterize parallel evolutionary algorithms.
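The setting analyzed in this abstract, islands that evolve independently and periodically exchange individuals along a migration topology, can be illustrated with a minimal sketch. It is not the authors' analysis or code: it assumes a (1+1) EA on each island, the OneMax pseudo-Boolean function, and migration of the best neighboring individual, and all function names are hypothetical. The islands are simulated sequentially, so the returned generation count plays the role of the parallel runtime.

```python
import random

def ring_neighbors(i, n):
    """Islands adjacent to island i on a ring of n islands."""
    return sorted({(i - 1) % n, (i + 1) % n} - {i})

def complete_neighbors(i, n):
    """Every other island: the complete-graph topology."""
    return [j for j in range(n) if j != i]

def onemax(bits):
    """Fitness: number of one-bits."""
    return sum(bits)

def mutate(bits, rng):
    """Standard bit mutation: flip each bit independently with probability 1/n."""
    p = 1.0 / len(bits)
    return [b ^ 1 if rng.random() < p else b for b in bits]

def island_model(n_bits, n_islands, neighbors, migration_interval, max_gens, seed=0):
    """Return the generation in which some island first reaches the optimum, or None."""
    rng = random.Random(seed)
    islands = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(n_islands)]
    for gen in range(1, max_gens + 1):
        # One (1+1) EA step per island; on real hardware each island has its own processor.
        for k in range(n_islands):
            child = mutate(islands[k], rng)
            if onemax(child) >= onemax(islands[k]):
                islands[k] = child
        if max(onemax(x) for x in islands) == n_bits:
            return gen
        if gen % migration_interval == 0:
            # Migration: each island adopts the best neighbor if it is strictly better.
            snapshot = [list(x) for x in islands]
            for k in range(n_islands):
                best = max((snapshot[j] for j in neighbors(k, n_islands)),
                           key=onemax, default=snapshot[k])
                if onemax(best) > onemax(islands[k]):
                    islands[k] = list(best)
    return None
```

Passing `complete_neighbors` instead of `ring_neighbors` switches the topology, which is one simple way to compare sparse and dense migration graphs empirically.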
Innovative Software Algorithms and Tools parallel sessions summary

International Nuclear Information System (INIS)

Gaines, Irwin

2001-01-01

A variety of results were presented in the poster and 5 parallel sessions of the Innovative Software, Algorithms and Tools (ISAT) sessions. I will briefly summarize these presentations and attempt to identify some unifying trends
Learning Intelligent Genetic Algorithms Using Japanese Nonograms

Science.gov (United States)

Tsai, Jinn-Tsong; Chou, Ping-Yi; Fang, Jia-Cen

2012-01-01

An intelligent genetic algorithm (IGA) is proposed to solve Japanese nonograms and is used as a method in a university course to learn evolutionary algorithms. The IGA combines the global exploration capabilities of a canonical genetic algorithm (CGA) with effective condensed encoding, improved fitness function, and modified crossover and…
A New Approach of Parallelism and Load Balance for the Apriori Algorithm

Directory of Open Access Journals (Sweden)

BOLINA, A. C.

2013-06-01

The main goal of data mining is to discover relevant information in digital content. The Apriori algorithm is widely used for this purpose, but its sequential version performs poorly when executed over large volumes of data. Among the solutions to this problem is the parallel implementation of the algorithm, and among the Apriori-based parallel implementations presented in the literature, DPA (Distributed Parallel Apriori) [10] stands out. This paper presents the DMTA (Distributed Multithread Apriori) algorithm, which is based on DPA and exploits thread-level parallelism in order to increase performance. In addition, DMTA can be executed on heterogeneous hardware platforms, using different numbers of cores. The results showed that DMTA outperforms DPA, balances load among processes and threads, and is effective on current multicore architectures.
Cognitive radio resource allocation based on coupled chaotic genetic algorithm

International Nuclear Information System (INIS)

Zu Yun-Xiao; Zhou Jie; Zeng Chang-Chang

2010-01-01

A coupled chaotic genetic algorithm for cognitive radio resource allocation which is based on genetic algorithm and coupled Logistic map is proposed. A fitness function for cognitive radio resource allocation is provided. Simulations are conducted for cognitive radio resource allocation by using the coupled chaotic genetic algorithm, simple genetic algorithm and dynamic allocation algorithm respectively. The simulation results show that, compared with simple genetic and dynamic allocation algorithm, coupled chaotic genetic algorithm reduces the total transmission power and bit error rate in cognitive radio system, and has faster convergence speed
Designing a parallel evolutionary algorithm for inferring gene networks on the cloud computing environment.

Science.gov (United States)

Lee, Wei-Po; Hsiao, Yu-Ting; Hwang, Wei-Che

2014-01-16

To improve the tedious task of reconstructing gene networks through testing experimentally the possible interactions between genes, it becomes a trend to adopt the automated reverse engineering procedure instead. Some evolutionary algorithms have been suggested for deriving network parameters. However, to infer large networks by the evolutionary algorithm, it is necessary to address two important issues: premature convergence and high computational cost. To tackle the former problem and to enhance the performance of traditional evolutionary algorithms, it is advisable to use parallel model evolutionary algorithms. To overcome the latter and to speed up the computation, it is advocated to adopt the mechanism of cloud computing as a promising solution: most popular is the method of MapReduce programming model, a fault-tolerant framework to implement parallel algorithms for inferring large gene networks. This work presents a practical framework to infer large gene networks, by developing and parallelizing a hybrid GA-PSO optimization method. Our parallel method is extended to work with the Hadoop MapReduce programming model and is executed in different cloud computing environments. To evaluate the proposed approach, we use a well-known open-source software GeneNetWeaver to create several yeast S. cerevisiae sub-networks and use them to produce gene profiles. Experiments have been conducted and the results have been analyzed. They show that our parallel approach can be successfully used to infer networks with desired behaviors and the computation time can be largely reduced. Parallel population-based algorithms can effectively determine network parameters and they perform better than the widely-used sequential algorithms in gene network inference. These parallel algorithms can be distributed to the cloud computing environment to speed up the computation. By coupling the parallel model population-based optimization method and the parallel computational framework, high
Genetic algorithms and supernovae type Ia analysis

International Nuclear Information System (INIS)

Bogdanos, Charalampos; Nesseris, Savvas

2009-01-01

We introduce genetic algorithms as a means to analyze supernovae type Ia data and extract model-independent constraints on the evolution of the Dark Energy equation of state w(z) ≡ P_DE/ρ_DE. Specifically, we will give a brief introduction to the genetic algorithms along with some simple examples to illustrate their advantages and finally we will apply them to the supernovae type Ia data. We find that genetic algorithms can lead to results in line with already established parametric and non-parametric reconstruction methods and could be used as a complementary way of treating SNIa data. As a non-parametric method, genetic algorithms provide a model-independent way to analyze data and can minimize bias due to premature choice of a dark energy model
An Optimal Parallel Algorithm for the Knapsack Problem Based on EREW

Institute of Scientific and Technical Information of China (English)

李肯立; 蒋盛益; 王卉; 李庆华

2003-01-01

A new parallel algorithm is proposed for the knapsack problem where the method of divide and conquer is adopted. Based on an EREW-SIMD machine with shared memory, the proposed algorithm utilizes O((2^{n/4})^{1-ε}) processors, 0≤ε≤1, and O(2^{n/2}) memory to find a solution for the n-element knapsack problem in time O(2^{n/4}(2^{n/4})^ε). The cost of the proposed parallel algorithm is O(2^{n/2}), which is an optimal method for solving the knapsack problem without memory conflicts and an improved result over the past researches.
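The stated cost can be checked with the standard cost measure for parallel algorithms, cost = processors × time. The following short calculation assumes the processor bound is O((2^{n/4})^{1-ε}) and the time bound is O(2^{n/4}(2^{n/4})^{ε}); the symbols p and T are names chosen here for illustration.

```latex
\begin{aligned}
p &= \left(2^{n/4}\right)^{1-\varepsilon}, \qquad
T = 2^{n/4}\left(2^{n/4}\right)^{\varepsilon}, \\
p \cdot T &= 2^{\frac{n}{4}(1-\varepsilon)} \cdot 2^{\frac{n}{4}(1+\varepsilon)}
          = 2^{\frac{n}{4}\cdot 2} = 2^{n/2}.
\end{aligned}
```

The ε terms cancel, so the cost is O(2^{n/2}) for every ε in [0, 1], in agreement with the cost given in the abstract; ε only trades processors against time.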
On the Optimization and Parallelizing Little Algorithm for Solving the Traveling Salesman Problem

Directory of Open Access Journals (Sweden)

V. V. Vasilchikov

2016-01-01

The paper describes some ways to accelerate solving the NP-complete Traveling Salesman Problem. The classic Little algorithm, which belongs to the category of "branch and bound" methods, can solve it for both directed and undirected graphs. However, for undirected graphs its operation can be accelerated by eliminating the consideration of branches examined earlier. The paper proposes changes to be made in the key operations of the algorithm to speed up its execution. It also describes the results of an experiment that demonstrated a significant acceleration of solving the problem by using the modified algorithm. Another way to speed up the work is to parallelize the algorithm. For problems of this kind it is difficult to break the task into a sufficient number of subtasks of comparable complexity. Their parallelism arises dynamically during execution. For such problems, it seems reasonable to use parallel-recursive algorithms. In our case the use of the library RPM ParLib developed by the author was a good choice. It allows us to develop effective applications for parallel computing on a local network using any .NET-compatible programming language. We used C# to develop the programs. Parallel applications were developed for both the basic and the modified algorithms, and their speeds were compared. Experiments were performed for graphs with up to 45 vertices and with up to 16 networked computers. We also investigated the acceleration that can be achieved by parallelizing the basic Little algorithm for directed graphs. The results of these experiments are also presented in the paper.
Constraint treatment techniques and parallel algorithms for multibody dynamic analysis. Ph.D. Thesis

Science.gov (United States)

Chiou, Jin-Chern

1990-01-01

Computational procedures for kinematic and dynamic analysis of three-dimensional multibody dynamic (MBD) systems are developed from the differential-algebraic equations (DAE's) viewpoint. Constraint violations during the time integration process are minimized, and penalty constraint stabilization techniques and partitioning schemes are developed. The governing equations of motion are treated with a two-stage staggered explicit-implicit numerical algorithm, which takes advantage of a partitioned solution procedure. A robust and parallelizable integration algorithm is developed. This algorithm uses a two-stage staggered central difference algorithm to integrate the translational coordinates and the angular velocities. The angular orientations of bodies in MBD systems are then obtained by using an implicit algorithm via the kinematic relationship between Euler parameters and angular velocities. It is shown that the combination of the present solution procedures yields a computationally more accurate solution. To speed up the computational procedures, a parallel implementation of the present constraint treatment techniques and the two-stage staggered explicit-implicit numerical algorithm was efficiently carried out. The DAE's and the constraint treatment techniques were transformed into arrowhead matrices, from which the Schur complement form was derived. By fully exploiting sparse matrix structural analysis techniques, a parallel preconditioned conjugate gradient numerical algorithm is used to solve the system equations written in Schur complement form. A software testbed was designed and implemented on both sequential and parallel computers. This testbed was used to demonstrate the robustness and efficiency of the constraint treatment techniques, the accuracy of the two-stage staggered explicit-implicit numerical algorithm, and the speedup of the Schur-complement-based parallel preconditioned conjugate gradient algorithm on a parallel computer.
Parallel computing of physical maps--a comparative study in SIMD and MIMD parallelism.

Science.gov (United States)

Bhandarkar, S M; Chirravuri, S; Arnold, J

1996-01-01

Ordering clones from a genomic library into physical maps of whole chromosomes presents a central computational problem in genetics. Chromosome reconstruction via clone ordering is usually isomorphic to the NP-complete Optimal Linear Arrangement problem. Parallel SIMD and MIMD algorithms for simulated annealing based on Markov chain distribution are proposed and applied to the problem of chromosome reconstruction via clone ordering. Perturbation methods and problem-specific annealing heuristics are proposed and described. The SIMD algorithms are implemented on a 2048 processor MasPar MP-2 system which is an SIMD 2-D toroidal mesh architecture whereas the MIMD algorithms are implemented on an 8 processor Intel iPSC/860 which is an MIMD hypercube architecture. A comparative analysis of the various SIMD and MIMD algorithms is presented in which the convergence, speedup, and scalability characteristics of the various algorithms are analyzed and discussed. On a fine-grained, massively parallel SIMD architecture with a low synchronization overhead such as the MasPar MP-2, a parallel simulated annealing algorithm based on multiple periodically interacting searches performs the best. For a coarse-grained MIMD architecture with high synchronization overhead such as the Intel iPSC/860, a parallel simulated annealing algorithm based on multiple independent searches yields the best results. In either case, distribution of clonal data across multiple processors is shown to exacerbate the tendency of the parallel simulated annealing algorithm to get trapped in a local optimum.
A scalable parallel algorithm for multiple objective linear programs

Science.gov (United States)

Wiecek, Malgorzata M.; Zhang, Hong

1994-01-01

This paper presents an ADBASE-based parallel algorithm for solving multiple objective linear programs (MOLP's). Job balance, speedup and scalability are of primary interest in evaluating the efficiency of the new algorithm. Implementation results on Intel iPSC/2 and Paragon multiprocessors show that the algorithm significantly speeds up the process of solving MOLP's, which is understood as generating all or some efficient extreme points and unbounded efficient edges. The algorithm gives especially good results for large and very large problems. Motivation and justification for solving such large MOLP's are also included.

Results of Evolution Supervised by Genetic Algorithms

Directory of Open Access Journals (Sweden)

Lorentz JÄNTSCHI

2010-09-01

The efficiency of a genetic algorithm is frequently assessed using a series of operators of evolution like crossover operators, mutation operators or other dynamic parameters. The present paper aimed to review the main results of evolution supervised by genetic algorithms used to identify solutions to hard agricultural and horticultural problems and to discuss the results of using a genetic algorithm on structure-activity relationships in terms of the behavior of evolution supervised by genetic algorithms. A genetic algorithm was developed and implemented in order to identify the optimal solution in terms of the estimation power of a multiple linear regression approach for structure-activity relationships. Three survival and three selection strategies (proportional, deterministic and tournament) were investigated in order to identify the best survival-selection strategy able to lead to the model with higher estimation power. The Molecular Descriptors Family for structure characterization of a sample of 206 polychlorinated biphenyls with measured octanol-water partition coefficients was used as a case study. Evolution using different selection and survival strategies proved to create populations of genotypes living in the evolution space with different diversity and variability. Under a series of comparison criteria these populations proved to be grouped, and the groups were shown to be statistically different from one another. The conclusions about genetic algorithm evolution according to a number of criteria were also highlighted.
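The three selection strategies named in this abstract can be sketched in a few lines. The abstract does not define them, so the following is a hedged illustration using common textbook forms: fitness-proportional (roulette-wheel) selection, tournament selection, and, as an assumption, truncation selection standing in for the "deterministic" strategy. All function names and parameters are hypothetical, and fitness values are assumed to be non-negative with a positive sum.

```python
import random

def proportional_selection(pop, fitness, k, rng):
    """Roulette wheel: draw k individuals with probability proportional to fitness."""
    weights = [fitness(x) for x in pop]
    return rng.choices(pop, weights=weights, k=k)

def tournament_selection(pop, fitness, k, rng, size=2):
    """Run k tournaments; each picks the fittest of `size` distinct random individuals."""
    return [max(rng.sample(pop, size), key=fitness) for _ in range(k)]

def deterministic_selection(pop, fitness, k):
    """Truncation (assumed meaning of 'deterministic'): keep the k fittest individuals."""
    return sorted(pop, key=fitness, reverse=True)[:k]
```

A brief usage example: with `pop = [1, 2, 3, 4, 5]` and `fitness = lambda x: x`, `deterministic_selection(pop, fitness, 2)` keeps the two largest values, while the other two strategies return random samples that favor larger values to different degrees.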
A parallel algorithm for 3D particle tracking and Lagrangian trajectory reconstruction

International Nuclear Information System (INIS)

Barker, Douglas; Zhang, Yuanhui; Lifflander, Jonathan; Arya, Anshu

2012-01-01

Particle-tracking methods are widely used in fluid mechanics and multi-target tracking research because of their unique ability to reconstruct long trajectories with high spatial and temporal resolution. Researchers have recently demonstrated 3D tracking of several objects in real time, but as the number of objects is increased, real-time tracking becomes impossible due to data transfer and processing bottlenecks. This problem may be solved by using parallel processing. In this paper, a parallel-processing framework has been developed based on frame decomposition and is programmed using the asynchronous object-oriented Charm++ paradigm. This framework can be a key step in achieving a scalable Lagrangian measurement system for particle-tracking velocimetry and may lead to real-time measurement capabilities. The parallel tracking algorithm was evaluated with three data sets including the particle image velocimetry standard 3D images data set #352, a uniform data set for optimal parallel performance and a computational-fluid-dynamics-generated non-uniform data set to test trajectory reconstruction accuracy, consistency with the sequential version and scalability to more than 500 processors. The algorithm showed strong scaling up to 512 processors and no inherent limits of scalability were seen. Ultimately, up to a 200-fold speedup is observed compared to the serial algorithm when 256 processors were used. The parallel algorithm is adaptable and could be easily modified to use any sequential tracking algorithm, which inputs frames of 3D particle location data and outputs particle trajectories
Efficient Parallel Algorithms for Unsteady Incompressible Flows

KAUST Repository

Guermond, Jean-Luc; Minev, Peter D.

2013-01-01

The objective of this paper is to give an overview of recent developments on splitting schemes for solving the time-dependent incompressible Navier–Stokes equations and to discuss possible extensions to the variable density/viscosity case. A particular attention is given to algorithms that can be implemented efficiently on large parallel clusters.
Comparison of genetic algorithms with conjugate gradient methods

Science.gov (United States)

Bosworth, J. L.; Foo, N. Y.; Zeigler, B. P.

1972-01-01

Genetic algorithms for mathematical function optimization are modeled on search strategies employed in natural adaptation. Comparisons of genetic algorithms with conjugate gradient methods, which were made on an IBM 1800 digital computer, show that genetic algorithms display superior performance over gradient methods for functions which are poorly behaved mathematically, for multimodal functions, and for functions obscured by additive random noise. Genetic methods offer performance comparable to gradient methods for many of the standard functions.
Parallel Algorithms for Graph Optimization using Tree Decompositions

Energy Technology Data Exchange (ETDEWEB)

Sullivan, Blair D [ORNL; Weerapurage, Dinesh P [ORNL; Groer, Christopher S [ORNL

2012-06-01

Although many $\\cal{NP}$-hard graph optimization problems can be solved in polynomial time on graphs of bounded tree-width, the adoption of these techniques into mainstream scientific computation has been limited due to the high memory requirements of the necessary dynamic programming tables and excessive runtimes of sequential implementations. This work addresses both challenges by proposing a set of new parallel algorithms for all steps of a tree decomposition-based approach to solve the maximum weighted independent set problem. A hybrid OpenMP/MPI implementation includes a highly scalable parallel dynamic programming algorithm leveraging the MADNESS task-based runtime, and computational results demonstrate scaling. This work enables a significant expansion of the scale of graphs on which exact solutions to maximum weighted independent set can be obtained, and forms a framework for solving additional graph optimization problems with similar techniques.
Parallel-vector algorithms for particle simulations on shared-memory multiprocessors

International Nuclear Information System (INIS)

Nishiura, Daisuke; Sakaguchi, Hide

2011-01-01

Over the last few decades, the computational demands of massive particle-based simulations for both scientific and industrial purposes have been continuously increasing. Hence, considerable efforts are being made to develop parallel computing techniques on various platforms. In such simulations, particles freely move within a given space, and so on a distributed-memory system, load balancing, i.e., assigning an equal number of particles to each processor, is not guaranteed. However, shared-memory systems achieve better load balancing for particle models, but suffer from the intrinsic drawback of memory access competition, particularly during (1) pairing of contact candidates from among neighboring particles and (2) force summation for each particle. Here, novel algorithms are proposed to overcome these two problems. For the first problem, the key is a pre-conditioning process during which particle labels are sorted by a cell label in the domain to which the particles belong. Then, a list of contact candidates is constructed by pairing the sorted particle labels. For the latter problem, a table comprising the list indexes of the contact candidate pairs is created and used to sum the contact forces acting on each particle for all contacts according to Newton's third law. With just these methods, memory access competition is avoided without additional redundant procedures. The parallel efficiency and compatibility of these two algorithms were evaluated in discrete element method (DEM) simulations on four types of shared-memory parallel computers: a multicore multiprocessor computer, scalar supercomputer, vector supercomputer, and graphics processing unit. The computational efficiency of a DEM code was found to be drastically improved with our algorithms on all but the scalar supercomputer. Thus, the developed parallel algorithms are useful on shared-memory parallel computers with sufficient memory bandwidth.
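The two ideas described in this abstract, sorting particle labels by cell before pairing and using a per-particle table of pair indexes for force summation, can be sketched as follows. This is a simplified sequential illustration of the data layout, not the authors' implementation: it assumes a 2D uniform grid, a scalar pair force, and hypothetical function names. The point of the second function is that each particle only reads pair forces and writes its own entry, so concurrent workers would never write to the same memory location.

```python
from collections import defaultdict
from itertools import product
import math

def contact_candidates(positions, cell_size):
    """Return sorted (i, j) pairs, i < j, of particles in the same or adjacent cells."""
    cell_of = [(math.floor(x / cell_size), math.floor(y / cell_size))
               for x, y in positions]
    # Pre-conditioning step: order particle labels by the label of their cell.
    order = sorted(range(len(positions)), key=lambda i: cell_of[i])
    cells = defaultdict(list)
    for i in order:
        cells[cell_of[i]].append(i)
    pairs = set()
    for (cx, cy), members in cells.items():
        for dx, dy in product((-1, 0, 1), repeat=2):
            # .get avoids creating empty cells while iterating over the dict.
            for j in cells.get((cx + dx, cy + dy), ()):
                for i in members:
                    if i < j:
                        pairs.add((i, j))
    return sorted(pairs)

def sum_forces(n, pairs, pair_force):
    """Sum scalar pair forces per particle through a table of pair indexes."""
    # table[i] lists (pair index, sign); the sign encodes Newton's third law.
    table = [[] for _ in range(n)]
    for p, (i, j) in enumerate(pairs):
        table[i].append((p, +1.0))
        table[j].append((p, -1.0))
    # Each pair force is evaluated once (force on i due to j).
    forces = [pair_force(i, j) for i, j in pairs]
    # Gather phase: particle i reads its own pairs and writes only its own result.
    return [sum(s * forces[p] for p, s in table[i]) for i in range(n)]
```

In this sketch the candidate list is built once per step and then reused by the gather phase, which mirrors the separation between pairing and summation described in the abstract.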
Particle swarm genetic algorithm and its application

International Nuclear Information System (INIS)

Liu Chengxiang; Yan Changxiang; Wang Jianjun; Liu Zhenhai

2012-01-01

To address the slow convergence of standard particle swarm optimization and its tendency to fall into local optima when dealing with nonlinear constrained optimization problems, a particle swarm genetic algorithm is designed. The proposed algorithm adopts a feasibility principle to handle constraint conditions, which avoids the difficulty the penalty function method has in selecting a penalty factor; generates the initial feasible group randomly, which accelerates particle swarm convergence; and introduces genetic algorithm crossover and mutation strategies to keep the particle swarm from falling into a local optimum. Optimization calculations on typical test functions show that the particle swarm genetic algorithm has better optimization performance. The algorithm is applied to nuclear power plant optimization, and the optimization results are significant. (authors)
Parallel algorithms for testing finite state machines: Generating UIO sequences

OpenAIRE

Hierons, RM; Turker, UC

2016-01-01

This paper describes an efficient parallel algorithm that uses many-core GPUs for automatically deriving Unique Input Output sequences (UIOs) from Finite State Machines. The proposed algorithm uses the global scope of the GPU's global memory through coalesced memory access and minimises the transfer between CPU and GPU memory. The results of experiments indicate that the proposed method yields considerably better results compared to a single core UIO construction algorithm. Our algorithm is s...
Options for Parallelizing a Planning and Scheduling Algorithm

Science.gov (United States)

Clement, Bradley J.; Estlin, Tara A.; Bornstein, Benjamin D.

2011-01-01

Space missions have a growing interest in putting multi-core processors onboard spacecraft. For many missions, limited processing power significantly slows operations. We investigate how continual planning and scheduling algorithms can exploit multi-core processing and outline different potential design decisions for a parallelized planning architecture. This organization of choices and challenges helps us with an initial design for parallelizing the CASPER planning system for a mesh multi-core processor. This work extends that presented at another workshop with some preliminary results.
A Parallel Algorithm for Connected Component Labelling of Gray-scale Images on Homogeneous Multicore Architectures

International Nuclear Information System (INIS)

Niknam, Mehdi; Thulasiraman, Parimala; Camorlinga, Sergio

2010-01-01

Connected component labelling is an essential step in image processing. We provide a parallel version of Suzuki's sequential connected component algorithm in order to speed up the labelling process. Also, we modify the algorithm to enable labelling gray-scale images. Due to the data dependencies in the algorithm we used a method similar to pipeline to exploit parallelism. The parallel algorithm method achieved a speedup of 2.5 for image size of 256 x 256 pixels using 4 processing threads.
The development of a scalable parallel 3-D CFD algorithm for turbomachinery. M.S. Thesis Final Report

Science.gov (United States)

Luke, Edward Allen

1993-01-01

Two algorithms capable of computing a transonic 3-D inviscid flow field about rotating machines are considered for parallel implementation. During the study of these algorithms, a significant new method of measuring the performance of parallel algorithms is developed. The theory that supports this new method creates an empirical definition of scalable parallel algorithms that is used to produce quantifiable evidence that a scalable parallel application was developed. The implementation of the parallel application and an automated domain decomposition tool are also discussed.
The Algorithm for Algorithms: An Evolutionary Algorithm Based on Automatic Designing of Genetic Operators

Directory of Open Access Journals (Sweden)

Dazhi Jiang

2015-01-01

At present there is a wide range of evolutionary algorithms available to researchers and practitioners. Despite the great diversity of these algorithms, virtually all of the algorithms share one feature: they have been manually designed. A fundamental question is “are there any algorithms that can design evolutionary algorithms automatically?” A more complete definition of the question is “can computer construct an algorithm which will generate algorithms according to the requirement of a problem?” In this paper, a novel evolutionary algorithm based on automatic designing of genetic operators is presented to address these questions. The resulting algorithm not only explores solutions in the problem space like most traditional evolutionary algorithms do, but also automatically generates genetic operators in the operator space. In order to verify the performance of the proposed algorithm, comprehensive experiments on 23 well-known benchmark optimization problems are conducted. The results show that the proposed algorithm can outperform standard differential evolution algorithm in terms of convergence speed and solution accuracy which shows that the algorithm designed automatically by computers can compete with the algorithms designed by human beings.
Parallel Global Optimization with the Particle Swarm Algorithm (Preprint)

National Research Council Canada - National Science Library

Schutte, J. F; Reinbolt, J. A; Fregly, B. J; Haftka, R. T; George, A. D

2004-01-01

.... To obtain enhanced computational throughput and global search capability, we detail the coarse-grained parallelization of an increasingly popular global search method, the Particle Swarm Optimization (PSO) algorithm...
Detection of Defective Sensors in Phased Array Using Compressed Sensing and Hybrid Genetic Algorithm

Directory of Open Access Journals (Sweden)

Shafqat Ullah Khan

2016-01-01

A compressed sensing based array diagnosis technique has been presented. This technique starts from collecting the measurements of the far-field pattern. The system linking the difference between the field measured using the healthy reference array and the field radiated by the array under test is solved using a genetic algorithm (GA), a parallel coordinate descent (PCD) algorithm, and then a hybridized GA with PCD algorithm. These algorithms are applied for fully and partially defective antenna arrays. The simulation results indicate that the proposed hybrid algorithm performs better in terms of localization of element failure with a small number of measurements. In the proposed algorithm, the slow and early convergence of GA has been avoided by combining it with PCD algorithm. It has been shown that the hybrid GA-PCD algorithm provides an accurate diagnosis of fully and partially defective sensors as compared to GA or PCD alone. Different simulations have been provided to validate the performance of the designed algorithms in diversified scenarios.
Efficient sequential and parallel algorithms for planted motif search.

Science.gov (United States)

Nicolae, Marius; Rajasekaran, Sanguthevar

2014-01-31

Motif searching is an important step in the detection of rare events occurring in a set of DNA or protein sequences. One formulation of the problem is known as (l,d)-motif search or Planted Motif Search (PMS). In PMS we are given two integers l and d and n biological sequences. We want to find all sequences of length l that appear in each of the input sequences with at most d mismatches. The PMS problem is NP-complete. PMS algorithms are typically evaluated on certain instances considered challenging. Despite ample research in the area, a considerable performance gap exists because many state of the art algorithms have large runtimes even for moderately challenging instances. This paper presents a fast exact parallel PMS algorithm called PMS8. PMS8 is the first algorithm to solve the challenging (l,d) instances (25,10) and (26,11). PMS8 is also efficient on instances with larger l and d such as (50,21). We include a comparison of PMS8 with several state of the art algorithms on multiple problem instances. This paper also presents necessary and sufficient conditions for 3 l-mers to have a common d-neighbor. The program is freely available at http://engr.uconn.edu/~man09004/PMS8/. We present PMS8, an efficient exact algorithm for Planted Motif Search. PMS8 introduces novel ideas for generating common neighborhoods. We have also implemented a parallel version for this algorithm. PMS8 can solve instances not solved by any previous algorithms.
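The (l,d)-motif search problem defined in this abstract can be made concrete with a brute-force sketch. This is only an illustration of the problem statement, not PMS8: it enumerates every l-mer over the alphabet, so it is exponential in l and useful only for tiny instances. The function names and the DNA alphabet default are assumptions.

```python
from itertools import product

def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))

def occurs_within(motif, seq, d):
    """True if some window of seq matches motif with at most d mismatches."""
    l = len(motif)
    return any(hamming(motif, seq[i:i + l]) <= d
               for i in range(len(seq) - l + 1))

def planted_motifs(seqs, l, d, alphabet="ACGT"):
    """All l-mers that appear in every input sequence with at most d mismatches."""
    candidates = ("".join(p) for p in product(alphabet, repeat=l))
    return sorted(m for m in candidates
                  if all(occurs_within(m, s, d) for s in seqs))

result = planted_motifs(["ACGTAC", "TTACGA", "GACGTT"], l=3, d=0)
```

With d = 0 the three toy sequences share only the 3-mer ACG; raising d enlarges the answer set, which is what makes the challenging instances mentioned in the abstract expensive.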
Multirate-based fast parallel algorithms for 2-D DHT-based real-valued discrete Gabor transform.

Science.gov (United States)

Tao, Liang; Kwan, Hon Keung

2012-07-01

Novel algorithms for the multirate and fast parallel implementation of the 2-D discrete Hartley transform (DHT)-based real-valued discrete Gabor transform (RDGT) and its inverse transform are presented in this paper. A 2-D multirate-based analysis convolver bank is designed for the 2-D RDGT, and a 2-D multirate-based synthesis convolver bank is designed for the 2-D inverse RDGT. The parallel channels in each of the two convolver banks have a unified structure and can apply the 2-D fast DHT algorithm to speed up their computations. The computational complexity of each parallel channel is low and is independent of the Gabor oversampling rate. All the 2-D RDGT coefficients of an image are computed in parallel during the analysis process and can be reconstructed in parallel during the synthesis process. The computational complexity and time of the proposed parallel algorithms are analyzed and compared with those of the existing fastest algorithms for 2-D discrete Gabor transforms. The results indicate that the proposed algorithms are the fastest, which makes them attractive for real-time image processing.
An efficient parallel algorithm: Poststack and prestack Kirchhoff 3D depth migration using flexi-depth iterations

Science.gov (United States)

Rastogi, Richa; Srivastava, Abhishek; Khonde, Kiran; Sirasala, Kirannmayi M.; Londhe, Ashutosh; Chavhan, Hitesh

2015-07-01

This paper presents an efficient parallel 3D Kirchhoff depth migration algorithm suitable for the current class of multicore architectures. The fundamental Kirchhoff depth migration algorithm exhibits inherent parallelism; however, when it comes to 3D data migration, the resource requirement of the algorithm increases as the data size increases. This challenges its practical implementation even on current generation high performance computing systems. Therefore a smart parallelization approach is essential to handle 3D data for migration. The most compute intensive part of Kirchhoff depth migration algorithm is the calculation of traveltime tables due to its resource requirements such as memory/storage and I/O. In the current research work, we target this area and develop a competent parallel algorithm for post and prestack 3D Kirchhoff depth migration, using hybrid MPI+OpenMP programming techniques. We introduce a concept of flexi-depth iterations while depth migrating data in parallel imaging space, using optimized traveltime table computations. This concept provides flexibility to the algorithm by migrating data in a number of depth iterations, which depends upon the available node memory and the size of data to be migrated during runtime. Furthermore, it minimizes the requirements of storage, I/O and inter-node communication, thus making it advantageous over the conventional parallelization approaches. The developed parallel algorithm is demonstrated and analysed on Yuva II, a PARAM series of supercomputers. Optimization, performance and scalability experiment results along with the migration outcome show the effectiveness of the parallel algorithm.
Implementation of PHENIX trigger algorithms on massively parallel computers

International Nuclear Information System (INIS)

Petridis, A.N.; Wohn, F.K.

1995-01-01

The event selection requirements of contemporary high energy and nuclear physics experiments are met by the introduction of on-line trigger algorithms which identify potentially interesting events and reduce the data acquisition rate to levels that are manageable by the electronics. Such algorithms being parallel in nature can be simulated off-line using massively parallel computers. The PHENIX experiment intends to investigate the possible existence of a new phase of matter called the quark gluon plasma which has been theorized to have existed in very early stages of the evolution of the universe by studying collisions of heavy nuclei at ultra-relativistic energies. Such interactions can also reveal important information regarding the structure of the nucleus and mandate a thorough investigation of the simpler proton-nucleus collisions at the same energies. The complexity of PHENIX events and the need to analyze and also simulate them at rates similar to the data collection ones imposes enormous computation demands. This work is a first effort to implement PHENIX trigger algorithms on parallel computers and to study the feasibility of using such machines to run the complex programs necessary for the simulation of the PHENIX detector response. Fine and coarse grain approaches have been studied and evaluated. Depending on the application the performance of a massively parallel computer can be much better or much worse than that of a serial workstation. A comparison between single instruction and multiple instruction computers is also made and possible applications of the single instruction machines to high energy and nuclear physics experiments are outlined. copyright 1995 American Institute of Physics
A parallel algorithm for filtering gravitational waves from coalescing binaries

International Nuclear Information System (INIS)

Sathyaprakash, B.S.; Dhurandhar, S.V.

1992-10-01

Coalescing binary stars are perhaps the most promising sources for the observation of gravitational waves with laser interferometric gravity wave detectors. The waveform from these sources can be predicted with sufficient accuracy for matched filtering techniques to be applied. In this paper we present a parallel algorithm for detecting signals from coalescing compact binaries by the method of matched filtering. We also report the details of its implementation on a 256-node connection machine consisting of a network of transputers. The results of our analysis indicate that parallel processing is a promising approach to on-line analysis of data from gravitational wave detectors to filter out coalescing binary signals. The algorithm described is quite general in that the kernel of the algorithm is applicable to any set of matched filters. (author). 15 refs, 4 figs
BitPAl: a bit-parallel, general integer-scoring sequence alignment algorithm.

Science.gov (United States)

Loving, Joshua; Hernandez, Yozen; Benson, Gary

2014-11-15

Mapping of high-throughput sequencing data and other bulk sequence comparison applications have motivated a search for high-efficiency sequence alignment algorithms. The bit-parallel approach represents individual cells in an alignment scoring matrix as bits in computer words and emulates the calculation of scores by a series of logic operations composed of AND, OR, XOR, complement, shift and addition. Bit-parallelism has been successfully applied to the longest common subsequence (LCS) and edit-distance problems, producing fast algorithms in practice. We have developed BitPAl, a bit-parallel algorithm for general, integer-scoring global alignment. Integer-scoring schemes assign integer weights for match, mismatch and insertion/deletion. The BitPAl method uses structural properties in the relationship between adjacent scores in the scoring matrix to construct classes of efficient algorithms, each designed for a particular set of weights. In timed tests, we show that BitPAl runs 7-25 times faster than a standard iterative algorithm. Source code is freely available for download at http://lobstah.bu.edu/BitPAl/BitPAl.html. BitPAl is implemented in C and runs on all major operating systems. jloving@bu.edu or yhernand@bu.edu or gbenson@bu.edu Supplementary data are available at Bioinformatics online. © The Author 2014. Published by Oxford University Press.
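The abstract notes that bit-parallelism has been applied to the longest common subsequence (LCS) problem. As a minimal sketch of that idea (not BitPAl itself, which handles general integer scoring), the following computes the LCS length with one addition and a few logic operations per character of the second string, using the well-known update V = (V + (V AND M)) OR (V AND NOT M). Python's unbounded integers stand in for machine words, and the function name is an assumption.

```python
def lcs_length_bitparallel(a, b):
    """Length of the longest common subsequence of a and b, via bit-vector updates."""
    m = len(a)
    if m == 0:
        return 0
    mask = (1 << m) - 1
    # match[ch] has bit i set when a[i] == ch.
    match = {}
    for i, ch in enumerate(a):
        match[ch] = match.get(ch, 0) | (1 << i)
    v = mask  # all ones: no matches found yet
    for ch in b:
        mc = match.get(ch, 0)
        u = v & mc
        # The addition propagates carries through runs of ones, replacing
        # a cell-by-cell scan of one dynamic-programming row.
        v = ((v + u) | (v & ~mc)) & mask
    # Each zero bit in v marks one unit of LCS length.
    return m - bin(v).count("1")

length = lcs_length_bitparallel("abcbdab", "bdcaba")
```

For the classic pair "abcbdab" and "bdcaba" this yields 4. In a language with fixed-width words, strings longer than the word size would need multi-word arithmetic with carry handling, which this sketch sidesteps.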

A hybrid algorithm for parallel molecular dynamics simulations

Science.gov (United States)

Mangiardi, Chris M.; Meyer, R.

2017-10-01

This article describes algorithms for the hybrid parallelization and SIMD vectorization of molecular dynamics simulations with short-range forces. The parallelization method combines domain decomposition with a thread-based parallelization approach. The goal of the work is to enable efficient simulations of very large (tens of millions of atoms) and inhomogeneous systems on many-core processors with hundreds or thousands of cores and SIMD units with large vector sizes. In order to test the efficiency of the method, simulations of a variety of configurations with up to 74 million atoms have been performed. Results are shown that were obtained on multi-core systems with Sandy Bridge and Haswell processors as well as systems with Xeon Phi many-core processors.
Performance modeling of parallel algorithms for solving neutron diffusion problems

International Nuclear Information System (INIS)

Azmy, Y.Y.; Kirk, B.L.

1995-01-01

Neutron diffusion calculations are the most common computational methods used in the design, analysis, and operation of nuclear reactors and related activities. Here, mathematical performance models are developed for the parallel algorithm used to solve the neutron diffusion equation on message passing and shared memory multiprocessors represented by the Intel iPSC/860 and the Sequent Balance 8000, respectively. The performance models are validated through several test problems, and these models are used to estimate the performance of each of the two considered architectures in situations typical of practical applications, such as fine meshes and a large number of participating processors. While message passing computers are capable of producing speedup, the parallel efficiency deteriorates rapidly as the number of processors increases. Furthermore, the speedup fails to improve appreciably for massively parallel computers so that only small- to medium-sized message passing multiprocessors offer a reasonable platform for this algorithm. In contrast, the performance model for the shared memory architecture predicts very high efficiency over a wide range of number of processors reasonable for this architecture. Furthermore, the model efficiency of the Sequent remains superior to that of the hypercube if its model parameters are adjusted to make its processors as fast as those of the iPSC/860. It is concluded that shared memory computers are better suited for this parallel algorithm than message passing computers
Fast parallel molecular algorithms for DNA-based computation: factoring integers.

Science.gov (United States)

Chang, Weng-Long; Guo, Minyi; Ho, Michael Shan-Hui

2005-06-01

The RSA public-key cryptosystem is an algorithm that converts input data to an unrecognizable encryption and converts the unrecognizable data back into its original decryption form. The security of the RSA public-key cryptosystem is based on the difficulty of factoring the product of two large prime numbers. This paper demonstrates how to factor the product of two large prime numbers, and is a breakthrough in basic biological operations using a molecular computer. In order to achieve this, we propose three DNA-based algorithms for parallel subtractor, parallel comparator, and parallel modular arithmetic that formally verify our designed molecular solutions for factoring the product of two large prime numbers. Furthermore, this work indicates that the cryptosystems using public-key are perhaps insecure and also presents clear evidence of the ability of molecular computing to perform complicated mathematical operations.
A NEW HYBRID GENETIC ALGORITHM FOR VERTEX COVER PROBLEM

OpenAIRE

UĞURLU, Onur

2015-01-01

The minimum vertex cover problem belongs to the class of NP-complete graph theoretical problems. This paper presents a hybrid genetic algorithm to solve minimum vertex cover problem. In this paper, it has been shown that when local optimization technique is added to genetic algorithm to form hybrid genetic algorithm, it gives more quality solution than simple genetic algorithm. Also, a new mutation operator has been developed especially for minimum verte...
Parallel algorithms for network routing problems and recurrences

International Nuclear Information System (INIS)

Wisniewski, J.A.; Sameh, A.H.

1982-01-01

In this paper, we consider the parallel solution of recurrences, and linear systems in the regular algebra of Carre. These problems are equivalent to solving the shortest path problem in graph theory, and they also arise in the analysis of Fortran programs. Our methods for solving linear systems in the regular algebra are analogues of well-known methods for solving systems of linear algebraic equations. A parallel version of Dijkstra's method, which has no linear algebraic analogue, is presented. Considerations for choosing an algorithm when the problem is large and sparse are also discussed
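The equivalence this abstract mentions between shortest paths and linear algebra over a path algebra can be sketched with the min-plus semiring, where "addition" is min and "multiplication" is +. The sketch below is an illustration under assumptions (dense matrices, no negative cycles, hypothetical function names) and is not the authors' method: it squares the one-edge distance matrix repeatedly until paths of up to n - 1 edges are covered.

```python
INF = float("inf")

def min_plus_product(a, b):
    """Matrix product in the (min, +) semiring."""
    n = len(a)
    return [[min(a[i][k] + b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def all_pairs_shortest_paths(w):
    """Shortest path lengths from an edge-weight matrix w (INF where no edge exists)."""
    n = len(w)
    # Zero diagonal: a path may stay in place, so d covers paths of at most 1 edge.
    d = [[0 if i == j else w[i][j] for j in range(n)] for i in range(n)]
    edges_covered = 1
    while edges_covered < n - 1:
        d = min_plus_product(d, d)  # doubles the number of edges covered
        edges_covered *= 2
    return d

w = [[0, 4, INF],
     [INF, 0, 1],
     [2, INF, 0]]
dist = all_pairs_shortest_paths(w)
```

Every entry of a min-plus product can be computed independently of the others, which is the property that makes this formulation a natural fit for parallel machines.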
A Hybrid Shared-Memory Parallel Max-Tree Algorithm for Extreme Dynamic-Range Images.

Science.gov (United States)

Moschini, Ugo; Meijster, Arnold; Wilkinson, Michael H F

2018-03-01

Max-trees, or component trees, are graph structures that represent the connected components of an image in a hierarchical way. Nowadays, many application fields rely on images with high-dynamic range or floating point values. Efficient sequential algorithms exist to build trees and compute attributes for images of any bit depth. However, we show that the current parallel algorithms perform poorly already with integers at bit depths higher than 16 bits per pixel. We propose a parallel method combining the two worlds of flooding and merging max-tree algorithms. First, a pilot max-tree of a quantized version of the image is built in parallel using a flooding method. Later, this structure is used in a parallel leaf-to-root approach to compute efficiently the final max-tree and to drive the merging of the sub-trees computed by the threads. We present an analysis of the performance both on simulated and actual 2D images and 3D volumes. Execution times are about better than the fastest sequential algorithm and speed-up goes up to on 64 threads.
Using a genetic algorithm to solve fluid-flow problems

International Nuclear Information System (INIS)

Pryor, R.J.

1990-01-01

Genetic algorithms are based on the mechanics of the natural selection and natural genetics processes. These algorithms are finding increasing application to a wide variety of engineering optimization and machine learning problems. In this paper, the authors demonstrate the use of a genetic algorithm to solve fluid flow problems. Specifically, the authors use the algorithm to solve the one-dimensional flow equations for a pipe
Research in Parallel Algorithms and Software for Computational Aerosciences

Science.gov (United States)

Domel, Neal D.

1996-01-01

Phase 1 is complete for the development of a computational fluid dynamics (CFD) parallel code with automatic grid generation and adaptation for the Euler analysis of flow over complex geometries. SPLITFLOW, an unstructured Cartesian grid code developed at Lockheed Martin Tactical Aircraft Systems, has been modified for a distributed memory/massively parallel computing environment. The parallel code is operational on an SGI network, Cray J90 and C90 vector machines, SGI Power Challenge, and Cray T3D and IBM SP2 massively parallel machines. Parallel Virtual Machine (PVM) is the message passing protocol for portability to various architectures. A domain decomposition technique was developed which enforces dynamic load balancing to improve solution speed and memory requirements. A host/node algorithm distributes the tasks. The solver parallelizes very well, and scales with the number of processors. Partially parallelized and non-parallelized tasks consume most of the wall clock time in a very fine grain environment. Timing comparisons on a Cray C90 demonstrate that Parallel SPLITFLOW runs 2.4 times faster on 8 processors than its non-parallel counterpart autotasked over 8 processors.
Medical Image Retrieval Based On the Parallelization of the Cluster Sampling Algorithm

OpenAIRE

Ali, Hesham Arafat; Attiya, Salah; El-henawy, Ibrahim

2017-01-01

In this paper we develop parallel cluster sampling algorithms and show that a multi-chain version is embarrassingly parallel and can be used efficiently for medical image retrieval among other applications.
A new asynchronous parallel algorithm for inferring large-scale gene regulatory networks.

Directory of Open Access Journals (Sweden)

Xiangyun Xiao

The reconstruction of gene regulatory networks (GRNs) from high-throughput experimental data has been considered one of the most important issues in systems biology research. With the development of high-throughput technology and the complexity of biological problems, we need to reconstruct GRNs that contain thousands of genes. However, when many existing algorithms are used to handle these large-scale problems, they will encounter two important issues: low accuracy and high computational cost. To overcome these difficulties, the main goal of this study is to design an effective parallel algorithm to infer large-scale GRNs based on high-performance parallel computing environments. In this study, we proposed a novel asynchronous parallel framework to improve the accuracy and lower the time complexity of large-scale GRN inference by combining splitting technology and ordinary differential equation (ODE)-based optimization. The presented algorithm uses the sparsity and modularity of GRNs to split whole large-scale GRNs into many small-scale modular subnetworks. Through the ODE-based optimization of all subnetworks in parallel and their asynchronous communications, we can easily obtain the parameters of the whole network. To test the performance of the proposed approach, we used well-known benchmark datasets from Dialogue for Reverse Engineering Assessments and Methods challenge (DREAM), experimentally determined GRN of Escherichia coli and one published dataset that contains more than 10 thousand genes to compare the proposed approach with several popular algorithms on the same high-performance computing environments in terms of both accuracy and time complexity. The numerical results demonstrate that our parallel algorithm exhibits obvious superiority in inferring large-scale GRNs.
A new asynchronous parallel algorithm for inferring large-scale gene regulatory networks.

Science.gov (United States)

Xiao, Xiangyun; Zhang, Wei; Zou, Xiufen

2015-01-01

The reconstruction of gene regulatory networks (GRNs) from high-throughput experimental data has been considered one of the most important issues in systems biology research. With the development of high-throughput technology and the complexity of biological problems, we need to reconstruct GRNs that contain thousands of genes. However, when many existing algorithms are used to handle these large-scale problems, they will encounter two important issues: low accuracy and high computational cost. To overcome these difficulties, the main goal of this study is to design an effective parallel algorithm to infer large-scale GRNs based on high-performance parallel computing environments. In this study, we proposed a novel asynchronous parallel framework to improve the accuracy and lower the time complexity of large-scale GRN inference by combining splitting technology and ordinary differential equation (ODE)-based optimization. The presented algorithm uses the sparsity and modularity of GRNs to split whole large-scale GRNs into many small-scale modular subnetworks. Through the ODE-based optimization of all subnetworks in parallel and their asynchronous communications, we can easily obtain the parameters of the whole network. To test the performance of the proposed approach, we used well-known benchmark datasets from Dialogue for Reverse Engineering Assessments and Methods challenge (DREAM), experimentally determined GRN of Escherichia coli and one published dataset that contains more than 10 thousand genes to compare the proposed approach with several popular algorithms on the same high-performance computing environments in terms of both accuracy and time complexity. The numerical results demonstrate that our parallel algorithm exhibits obvious superiority in inferring large-scale GRNs.
Iterative schemes for parallel Sn algorithms in a shared-memory computing environment

International Nuclear Information System (INIS)

Haghighat, A.; Hunter, M.A.; Mattis, R.E.

1995-01-01

Several two-dimensional spatial domain partitioning Sn transport theory algorithms are developed on the basis of different iterative schemes. These algorithms are incorporated into TWOTRAN-II and tested on the shared-memory CRAY Y-MP C90 computer. For a series of fixed-source r-z geometry homogeneous problems, it is demonstrated that the concurrent red-black algorithms may result in large parallel efficiencies (>60%) on C90. It is also demonstrated that for a realistic shielding problem, the use of the negative flux fixup causes high load imbalance, which results in a significant loss of parallel efficiency
A Fast parallel tridiagonal algorithm for a class of CFD applications

Science.gov (United States)

Moitra, Stuti; Sun, Xian-He

1996-01-01

The parallel diagonal dominant (PDD) algorithm is an efficient tridiagonal solver. This paper presents for study a variation of the PDD algorithm, the reduced PDD algorithm. The new algorithm maintains the minimum communication provided by the PDD algorithm, but has a reduced operation count. The PDD algorithm also has a smaller operation count than the conventional sequential algorithm for many applications. Accuracy analysis is provided for the reduced PDD algorithm for symmetric Toeplitz tridiagonal (STT) systems. Implementation results on Langley's Intel Paragon and IBM SP2 show that both the PDD and reduced PDD algorithms are efficient and scalable.
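For reference, the "conventional sequential algorithm" that tridiagonal solvers such as PDD are usually compared against is the Thomas algorithm (forward elimination followed by back substitution). The sketch below shows that sequential baseline only, not PDD or reduced PDD. The argument layout is an assumption: all four lists have length n, `lower[0]` and `upper[n-1]` are unused, and no pivoting is done, which is safe for diagonally dominant systems.

```python
def thomas_solve(lower, diag, upper, rhs):
    """Solve a tridiagonal system; row i is lower[i]*x[i-1] + diag[i]*x[i] + upper[i]*x[i+1] = rhs[i]."""
    n = len(diag)
    c = [0.0] * n  # modified super-diagonal
    d = [0.0] * n  # modified right-hand side
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    # Forward elimination: each step depends on the previous one,
    # which is why this algorithm is inherently sequential.
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom if i < n - 1 else 0.0
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    # Back substitution.
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x

# A small symmetric Toeplitz tridiagonal system (4 on the diagonal, 1 off it)
# whose exact solution is [1, 2, 3, 4].
n = 4
solution = thomas_solve([1.0] * n, [4.0] * n, [1.0] * n, [6.0, 12.0, 18.0, 19.0])
```

The loop-carried dependence in the forward sweep is the obstacle that parallel tridiagonal methods are designed to work around, typically by partitioning the system and exploiting diagonal dominance.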
Optimization Solution of Troesch’s and Bratu’s Problems of Ordinary Type Using Novel Continuous Genetic Algorithm

Directory of Open Access Journals (Sweden)

Zaer Abo-Hammour

2014-01-01

A new kind of optimization technique, namely, continuous genetic algorithm, is presented in this paper for numerically approximating the solutions of Troesch’s and Bratu’s problems. The underlying idea of the method is to convert the two differential problems into discrete versions by replacing each of the second derivatives by an appropriate difference quotient approximation. The new method has the following characteristics. First, it should not resort to more advanced mathematical tools; that is, the algorithm should be simple to understand and implement and should be thus easily accepted in the mathematical and physical application fields. Second, the algorithm is of global nature in terms of the solutions obtained as well as its ability to solve other mathematical and physical problems. Third, the proposed methodology has an implicit parallel nature which points to its implementation on parallel machines. The algorithm is tested on different versions of Troesch’s and Bratu’s problems. Experimental results show that the proposed algorithm is effective, straightforward, and simple.
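The discretization step mentioned in this abstract can be written out as follows. This is a sketch under assumptions: the problems are taken in their commonly stated forms, the central difference is assumed as the "difference quotient", and the sum-of-squares fitness is one plausible choice rather than the paper's stated objective.

```latex
% Commonly stated forms on 0 <= x <= 1 (the paper tests several versions):
%   Bratu:   u''(x) + \lambda e^{u(x)} = 0,        u(0) = 0, \; u(1) = 0
%   Troesch: u''(x) = \lambda \sinh(\lambda u(x)), u(0) = 0, \; u(1) = 1
%
% Uniform grid x_i = i h, h = 1/N, with u_i \approx u(x_i), and the central difference
\[
  u''(x_i) \approx \frac{u_{i+1} - 2u_i + u_{i-1}}{h^2}, \qquad i = 1, \dots, N-1 .
\]
% Discrete residuals at the interior nodes:
\[
  r_i^{\mathrm{Bratu}} = \frac{u_{i+1} - 2u_i + u_{i-1}}{h^2} + \lambda e^{u_i},
  \qquad
  r_i^{\mathrm{Troesch}} = \frac{u_{i+1} - 2u_i + u_{i-1}}{h^2} - \lambda \sinh(\lambda u_i).
\]
% One possible fitness for a genetic algorithm to minimise (an assumption):
\[
  F(u_1, \dots, u_{N-1}) = \sum_{i=1}^{N-1} r_i^2 ,
\]
% with u_0 and u_N fixed by the boundary conditions.
```

Each candidate solution is then a vector of nodal values, and its residuals can be evaluated independently node by node, which is consistent with the implicit parallelism the abstract refers to.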
A parallel row-based algorithm for standard cell placement with integrated error control

Science.gov (United States)

Sargent, Jeff S.; Banerjee, Prith

1989-01-01

A new row-based parallel algorithm for standard-cell placement targeted for execution on a hypercube multiprocessor is presented. Key features of this implementation include a dynamic simulated-annealing schedule, row-partitioning of the VLSI chip image, and two novel approaches to control error in parallel cell-placement algorithms: (1) Heuristic Cell-Coloring; (2) Adaptive Sequence Length Control.
A Hybrid Parallel Preconditioning Algorithm For CFD

Science.gov (United States)

Barth, Timothy J.; Tang, Wei-Pai; Kwak, Dochan (Technical Monitor)

1995-01-01

A new hybrid preconditioning algorithm will be presented which combines the favorable attributes of incomplete lower-upper (ILU) factorization with the favorable attributes of the approximate inverse method recently advocated by numerous researchers. The quality of the preconditioner is adjustable and can be increased at the cost of additional computation while at the same time the storage required is roughly constant and approximately equal to the storage required for the original matrix. In addition, the preconditioning algorithm suggests an efficient and natural parallel implementation with reduced communication. Sample calculations will be presented for the numerical solution of multi-dimensional advection-diffusion equations. The matrix solver has also been embedded into a Newton algorithm for solving the nonlinear Euler and Navier-Stokes equations governing compressible flow. The full paper will show numerous examples in CFD to demonstrate the efficiency and robustness of the method.
A parallel ILP algorithm that incorporates incremental batch learning

OpenAIRE

Nuno Fonseca; Rui Camacho; Fernando Silva

2003-01-01

In this paper we tackle the problems of efficiency and scalability faced by Inductive Logic Programming (ILP) systems. We propose the use of parallelism to improve efficiency and the use of incremental batch learning to address the scalability problem. We describe a novel parallel algorithm that incorporates into ILP the method of incremental batch learning. The theoretical complexity of the algorithm indicates that a linear speedup can be achieved.
Using Genetic Algorithms for Building Metrics of Collaborative Systems

Directory of Open Access Journals (Sweden)

Cristian CIUREA

2011-01-01

The paper's objective is to reveal the importance of genetic algorithms in building robust metrics of collaborative systems. The main types of collaborative systems in the economy are presented and some characteristics of genetic algorithms are described. A genetic algorithm was implemented in order to determine the local maximum and minimum points of the relative complexity function associated with a collaborative banking system. The intelligent collaborative systems based on genetic algorithms, representing the new generation of collaborative systems, are analyzed and the implementation of auto-adaptive interfaces in a banking application is described.
A Globally Convergent Parallel SSLE Algorithm for Inequality Constrained Optimization

Directory of Open Access Journals (Sweden)

Zhijun Luo

2014-01-01

A new parallel variable distribution algorithm based on the interior-point SSLE algorithm is proposed for solving inequality constrained optimization problems whose constraints are block-separable, using the sequential system of linear equations (SSLE) technique. Each iteration of the algorithm needs to solve only three systems of linear equations with the same coefficient matrix to obtain the descent direction. Furthermore, under certain conditions, global convergence is achieved.
Redundancy allocation of series-parallel systems using a variable neighborhood search algorithm

International Nuclear Information System (INIS)

Liang, Y.-C.; Chen, Y.-C.

2007-01-01

This paper presents a meta-heuristic algorithm, variable neighborhood search (VNS), for the redundancy allocation problem (RAP). The RAP, an NP-hard problem, has attracted the attention of much prior research, generally in a restricted form where each subsystem must consist of identical components. The newer meta-heuristic methods overcome this limitation and offer a practical way to solve large instances of the relaxed RAP where different components can be used in parallel. The authors' previously published work has shown promise for the variable neighborhood descent (VND) method, the simplest version among VNS variations, on the RAP. The variable neighborhood search method itself has not been used in reliability design, yet it is a method that fits those combinatorial problems with potential neighborhood structures, as in the case of the RAP. Therefore, the authors further extended their work to develop a VNS algorithm for the RAP and tested it on a set of well-known benchmark problems from the literature. Results on 33 test instances ranging from less to severely constrained conditions show that the variable neighborhood search method improves the performance of VND and provides competitive solution quality at an economical computational expense in comparison with the best-known heuristics including ant colony optimization, genetic algorithm, and tabu search.

Redundancy allocation of series-parallel systems using a variable neighborhood search algorithm

Energy Technology Data Exchange (ETDEWEB)

Liang, Y.-C. [Department of Industrial Engineering and Management, Yuan Ze University, No 135 Yuan-Tung Road, Chung-Li, Taoyuan County, Taiwan 320 (China)]. E-mail: ycliang@saturn.yzu.edu.tw; Chen, Y.-C. [Department of Industrial Engineering and Management, Yuan Ze University, No 135 Yuan-Tung Road, Chung-Li, Taoyuan County, Taiwan 320 (China)]. E-mail: s927523@mail.yzu.edu.tw

2007-03-15

This paper presents a meta-heuristic algorithm, variable neighborhood search (VNS), for the redundancy allocation problem (RAP). The RAP, an NP-hard problem, has attracted the attention of much prior research, generally in a restricted form where each subsystem must consist of identical components. The newer meta-heuristic methods overcome this limitation and offer a practical way to solve large instances of the relaxed RAP where different components can be used in parallel. The authors' previously published work has shown promise for the variable neighborhood descent (VND) method, the simplest version among VNS variations, on the RAP. The variable neighborhood search method itself has not been used in reliability design, yet it is a method that fits those combinatorial problems with potential neighborhood structures, as in the case of the RAP. Therefore, the authors further extended their work to develop a VNS algorithm for the RAP and tested it on a set of well-known benchmark problems from the literature. Results on 33 test instances ranging from less to severely constrained conditions show that the variable neighborhood search method improves the performance of VND and provides competitive solution quality at an economical computational expense in comparison with the best-known heuristics including ant colony optimization, genetic algorithm, and tabu search.
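The general shape of a variable neighborhood search can be sketched as follows. This is a generic illustration on bit strings, not the authors' RAP implementation: the k-th neighborhood (flip k random bits), the single-bit-flip local search, and all parameter names are assumptions made for the example.

```python
import random

def local_search(x, cost):
    """First-improvement descent over single-bit flips."""
    x = x[:]
    improved = True
    while improved:
        improved = False
        for i in range(len(x)):
            y = x[:]
            y[i] = 1 - y[i]
            if cost(y) < cost(x):
                x, improved = y, True
    return x

def shake(x, k, rng):
    """Random point in the k-th neighborhood: flip k distinct bits."""
    y = x[:]
    for i in rng.sample(range(len(x)), k):
        y[i] = 1 - y[i]
    return y

def vns(x0, cost, k_max=3, iterations=50, seed=0):
    """Basic VNS: shake, descend, and restart from k = 1 after each improvement."""
    rng = random.Random(seed)
    best = local_search(x0, cost)
    for _ in range(iterations):
        k = 1
        while k <= k_max:
            candidate = local_search(shake(best, k, rng), cost)
            if cost(candidate) < cost(best):
                best, k = candidate, 1  # improvement: return to the smallest neighborhood
            else:
                k += 1                  # no improvement: try a larger neighborhood
    return best
```

Variable neighborhood descent, mentioned in the abstract as the simplest variant, drops the random shaking step and explores the neighborhoods deterministically.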
Big Data: A Parallel Particle Swarm Optimization-Back-Propagation Neural Network Algorithm Based on MapReduce.

Science.gov (United States)

Cao, Jianfang; Cui, Hongyan; Shi, Hao; Jiao, Lijuan

2016-01-01

A back-propagation (BP) neural network can solve complicated random nonlinear mapping problems; therefore, it can be applied to a wide range of problems. However, as the sample size increases, the time required to train BP neural networks becomes lengthy. Moreover, the classification accuracy decreases as well. To improve the classification accuracy and runtime efficiency of the BP neural network algorithm, we proposed a parallel design and realization method for a particle swarm optimization (PSO)-optimized BP neural network based on MapReduce on the Hadoop platform using both the PSO algorithm and a parallel design. The PSO algorithm was used to optimize the BP neural network's initial weights and thresholds and improve the accuracy of the classification algorithm. The MapReduce parallel programming model was utilized to achieve parallel processing of the BP algorithm, thereby solving the problems of hardware and communication overhead when the BP neural network addresses big data. Datasets on 5 different scales were constructed using the scene image library from the SUN Database. The classification accuracy of the parallel PSO-BP neural network algorithm is approximately 92%, and the system efficiency is approximately 0.85, which presents obvious advantages when processing big data. The algorithm proposed in this study demonstrated both higher classification accuracy and improved time efficiency, which represents a significant improvement obtained from applying parallel processing to an intelligent algorithm on big data.
Applicability of genetic algorithms to parameter estimation of economic models

Directory of Open Access Journals (Sweden)

Marcel Ševela

2004-01-01

The paper concentrates on the capability of genetic algorithms for parameter estimation of non-linear economic models. In the paper we test the ability of genetic algorithms to estimate the parameters of a demand function for durable goods and simultaneously search for the parameters of the genetic algorithm that lead to maximum effectiveness of the computation. Genetic algorithms connect deterministic iterative computation methods with stochastic methods. In the genetic algorithm approach each possible solution is represented by one individual, whose life, and the lives of all generations of individuals, run under a few parameters of the genetic algorithm. Our simulations resulted in an optimal mutation rate of 15% of all bits in chromosomes and an optimal elitism rate of 20%. We cannot set the optimal extent of generation, because it shows a positive correlation with the effectiveness of the genetic algorithm over the whole range under research, but its impact is decreasing. The genetic algorithm used was most sensitive to the mutation rate, then to the extent of generation. The sensitivity to the elitism rate is not so strong.
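To make the two reported parameters concrete, the sketch below shows one generation of a bit-string genetic algorithm with a 15% mutation rate and a 20% elitism rate. It reads the mutation rate as a per-bit flip probability, which is an interpretation of the abstract; tournament selection and one-point crossover are assumptions, since the abstract does not name the operators used.

```python
import random

def next_generation(population, fitness, rng, mutation_rate=0.15, elitism_rate=0.20):
    """Produce the next generation of bit-string individuals (lists of 0/1)."""
    ranked = sorted(population, key=fitness, reverse=True)
    # Elitism: the top fraction survives unchanged
    n_elite = max(1, int(elitism_rate * len(ranked)))
    new_pop = [ind[:] for ind in ranked[:n_elite]]
    while len(new_pop) < len(population):
        # Binary tournament selection (assumed operator)
        p1 = max(rng.sample(ranked, 2), key=fitness)
        p2 = max(rng.sample(ranked, 2), key=fitness)
        # One-point crossover (assumed operator)
        cut = rng.randrange(1, len(p1))
        child = p1[:cut] + p2[cut:]
        # Bit-flip mutation: each bit flips with probability mutation_rate
        child = [1 - b if rng.random() < mutation_rate else b for b in child]
        new_pop.append(child)
    return new_pop
```

Because the elite individuals are copied without mutation, the best fitness in the population cannot decrease from one generation to the next.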
Using parallel computing in modeling and optimization of mineral ...

African Journals Online (AJOL)

Then to solve ultimate pit limit problem it is required to find such a sub graph in a graph whose sum of weights will be maximal. One of the possible solutions of this problem is using genetic algorithms. We use a ... Details of implementation parallel genetic algorithm for searching open pit limits are provided. Comparison with ...
Parallelization of the model-based iterative reconstruction algorithm DIRA

International Nuclear Information System (INIS)

Oertenberg, A.; Sandborg, M.; Alm Carlsson, G.; Malusek, A.; Magnusson, M.

2016-01-01

New paradigms for parallel programming have been devised to simplify software development on multi-core processors and many-core graphical processing units (GPU). Despite their obvious benefits, the parallelization of existing computer programs is not an easy task. In this work, the use of the Open Multiprocessing (OpenMP) and Open Computing Language (OpenCL) frameworks is considered for the parallelization of the model-based iterative reconstruction algorithm DIRA with the aim to significantly shorten the code's execution time. Selected routines were parallelized using OpenMP and OpenCL libraries; some routines were converted from MATLAB to C and optimised. Parallelization of the code with the OpenMP was easy and resulted in an overall speedup of 15 on a 16-core computer. Parallelization with OpenCL was more difficult owing to differences between the central processing unit and GPU architectures. The resulting speedup was substantially lower than the theoretical peak performance of the GPU; the cause was explained. (authors)
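The reported OpenMP result (a speedup of 15 on 16 cores) can be put in perspective with two standard quantities: parallel efficiency, and the serial fraction implied by Amdahl's law (the Karp-Flatt metric). The helper names below are illustrative; only the figures 15 and 16 come from the abstract.

```python
def parallel_efficiency(speedup, cores):
    """Fraction of ideal linear speedup that was achieved."""
    return speedup / cores

def karp_flatt_serial_fraction(speedup, cores):
    """Serial fraction f implied by Amdahl's law, S = 1 / (f + (1 - f) / p)."""
    return (1.0 / speedup - 1.0 / cores) / (1.0 - 1.0 / cores)

if __name__ == "__main__":
    print(parallel_efficiency(15, 16))         # 0.9375
    print(karp_flatt_serial_fraction(15, 16))  # 1/225, roughly 0.0044
```

Under this simple model, less than half a percent of the runtime behaves as serial work, which is consistent with the abstract's remark that the OpenMP parallelization was straightforward.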
Genetic algorithms for protein threading.

Science.gov (United States)

Yadgari, J; Amir, A; Unger, R

1998-01-01

Despite many years of effort, a direct prediction of protein structure from sequence is still not possible. As a result, in the last few years researchers have started to address the "inverse folding problem": identifying and aligning a sequence to the fold with which it is most compatible, a process known as "threading". In two meetings in which protein folding predictions were objectively evaluated, it became clear that threading as a concept promises a real breakthrough, but that much improvement is still needed in the technique itself. Threading is an NP-hard problem, and thus no general polynomial solution can be expected. Still, a practical approach with demonstrated ability to find optimal solutions in many cases, and acceptable solutions in other cases, is needed. We applied the technique of Genetic Algorithms in order to significantly improve the ability of threading algorithms to find the optimal alignment of a sequence to a structure, i.e. the alignment with the minimum free energy. A major advance reported here is the design of a representation of the threading alignment as a string of fixed length. With this representation, validation of alignments and genetic operators are effectively implemented. Appropriate data structures and parameters have been selected. It is shown that Genetic Algorithm threading is effective and is able to find the optimal alignment in a few test cases. Furthermore, the described algorithm is shown to perform well even without pre-definition of core elements. Existing threading methods are dependent on such constraints to make their calculations feasible. But the concept of core elements is inherently arbitrary and should be avoided if possible. While a rigorous proof is hard to provide as yet, we present indications that Genetic Algorithm threading is indeed capable of consistently finding good solutions of full alignments in search spaces of size up to 10^70.
Meta-heuristic algorithms for parallel identical machines scheduling problem with weighted late work criterion and common due date.

Science.gov (United States)

Xu, Zhenzhen; Zou, Yongxing; Kong, Xiangjie

2015-01-01

To our knowledge, this paper investigates the first application of meta-heuristic algorithms to tackle the parallel machines scheduling problem with weighted late work criterion and common due date ([Formula: see text]). Late work criterion is one of the performance measures of scheduling problems which considers the length of late parts of particular jobs when evaluating the quality of scheduling. Since this problem is known to be NP-hard, three meta-heuristic algorithms, namely ant colony system, genetic algorithm, and simulated annealing are designed and implemented, respectively. We also propose a novel algorithm named LDF (largest density first) which is improved from LPT (longest processing time first). The computational experiments compared these meta-heuristic algorithms with LDF, LPT and LS (list scheduling), and the experimental results show that SA performs the best in most cases. However, LDF is better than SA in some conditions, moreover, the running time of LDF is much shorter than SA.
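The two classical baselines named in this abstract, LS (list scheduling) and LPT (longest processing time first), together with the late work of a job, can be sketched as follows. The late work of a job is taken here as the part of its processing that falls after the common due date, min(p, max(0, C - d)); job weights and the paper's LDF rule are omitted because the abstract does not define them. Function names are illustrative.

```python
import heapq

def list_schedule(jobs, m):
    """LS: assign each job, in the given order, to the currently least loaded machine."""
    loads = [(0, k) for k in range(m)]
    heapq.heapify(loads)
    assignment = [[] for _ in range(m)]
    for p in jobs:
        load, k = heapq.heappop(loads)
        assignment[k].append(p)
        heapq.heappush(loads, (load + p, k))
    return assignment

def lpt_schedule(jobs, m):
    """LPT: list scheduling applied to jobs sorted by decreasing processing time."""
    return list_schedule(sorted(jobs, reverse=True), m)

def total_late_work(assignment, due):
    """Sum over all jobs of the processing time executed after the common due date."""
    total = 0
    for machine in assignment:
        completion = 0
        for p in machine:
            completion += p
            total += min(p, max(0, completion - due))
    return total
```

For example, `lpt_schedule([5, 4, 3, 3, 3], 2)` yields the machine sequences `[[5, 3], [4, 3, 3]]`, and with a common due date of 6 the total late work is 6.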
Bioinformatics algorithm based on a parallel implementation of a machine learning approach using transducers

International Nuclear Information System (INIS)

Roche-Lima, Abiel; Thulasiram, Ruppa K

2012-01-01

Finite automata in which each transition is augmented with an output label in addition to the familiar input label are considered finite-state transducers. Transducers have been used to analyze some fundamental issues in bioinformatics. Weighted finite-state transducers have been proposed for pairwise alignments of DNA and protein sequences, as well as for developing kernels for computational biology. Machine learning algorithms for conditional transducers have been implemented and used for DNA sequence analysis. Transducer learning algorithms are based on conditional probability computation, which is calculated using techniques such as pair-database creation, normalization (with Maximum-Likelihood normalization) and parameter optimization (with Expectation-Maximization, EM). These techniques are intrinsically costly to compute, and even more so when applied to bioinformatics, because the databases are large. In this work, we describe a parallel implementation of an algorithm to learn conditional transducers using these techniques. The algorithm is oriented to bioinformatics applications, such as alignments, phylogenetic trees, and other genome evolution studies. Several experiments were carried out using the parallel and sequential algorithms on Westgrid (specifically, on the Breezy cluster). As a result, we find that our parallel algorithm is scalable, because execution times are reduced considerably when the data size parameter is increased. Another experiment was carried out by changing the precision parameter. In this case, we obtain smaller execution times using the parallel algorithm. Finally, the number of threads used to execute the parallel algorithm on the Breezy cluster was changed. In this last experiment, the speedup increases considerably when more threads are used; however, it levels off when the number of threads is equal to or greater than 16.
Genetic Algorithms for Multiple-Choice Problems

Science.gov (United States)

Aickelin, Uwe

2010-04-01

This thesis investigates the use of problem-specific knowledge to enhance a genetic algorithm approach to multiple-choice optimisation problems. It shows that such information can significantly enhance performance, but that the choice of information and the way it is included are important factors for success. Two multiple-choice problems are considered. The first is constructing a feasible nurse roster that considers as many requests as possible. In the second problem, shops are allocated to locations in a mall subject to constraints and maximising the overall income. Genetic algorithms are chosen for their well-known robustness and ability to solve large and complex discrete optimisation problems. However, a survey of the literature reveals room for further research into generic ways to include constraints into a genetic algorithm framework. Hence, the main theme of this work is to balance feasibility and cost of solutions. In particular, co-operative co-evolution with hierarchical sub-populations, problem structure exploiting repair schemes and indirect genetic algorithms with self-adjusting decoder functions are identified as promising approaches. The research starts by applying standard genetic algorithms to the problems and explaining the failure of such approaches due to epistasis. To overcome this, problem-specific information is added in a variety of ways, some of which are designed to increase the number of feasible solutions found whilst others are intended to improve the quality of such solutions. As well as a theoretical discussion as to the underlying reasons for using each operator, extensive computational experiments are carried out on a variety of data. These show that the indirect approach relies less on problem structure and hence is easier to implement and superior in solution quality.
Algorithms for parallel flow solvers on message passing architectures

Science.gov (United States)

Vanderwijngaart, Rob F.

1995-01-01

The purpose of this project has been to identify and test suitable technologies for implementation of fluid flow solvers -- possibly coupled with structures and heat equation solvers -- on MIMD parallel computers. In the course of this investigation much attention has been paid to efficient domain decomposition strategies for ADI-type algorithms. Multi-partitioning derives its efficiency from the assignment of several blocks of grid points to each processor in the parallel computer. A coarse-grain parallelism is obtained, and a near-perfect load balance results. In uni-partitioning every processor receives responsibility for exactly one block of grid points instead of several. This necessitates fine-grain pipelined program execution in order to obtain a reasonable load balance. Although fine-grain parallelism is less desirable on many systems, especially high-latency networks of workstations, uni-partition methods are still in wide use in production codes for flow problems. Consequently, it remains important to achieve good efficiency with this technique that has essentially been superseded by multi-partitioning for parallel ADI-type algorithms. Another reason for the concentration on improving the performance of pipeline methods is their applicability in other types of flow solver kernels with stronger implied data dependence. Analytical expressions can be derived for the size of the dynamic load imbalance incurred in traditional pipelines. From these it can be determined what is the optimal first-processor retardation that leads to the shortest total completion time for the pipeline process. Theoretical predictions of pipeline performance with and without optimization match experimental observations on the iPSC/860 very well. Analysis of pipeline performance also highlights the effect of uncareful grid partitioning in flow solvers that employ pipeline algorithms. If grid blocks at boundaries are not at least as large in the wall-normal direction as those
Fast parallel DNA-based algorithms for molecular computation: the set-partition problem.

Science.gov (United States)

Chang, Weng-Long

2007-12-01

This paper demonstrates that basic biological operations can be used to solve the set-partition problem. In order to achieve this, we propose three DNA-based algorithms, a signed parallel adder, a signed parallel subtractor and a signed parallel comparator, that formally verify our designed molecular solutions for solving the set-partition problem.
The Applications of Genetic Algorithms in Medicine

Directory of Open Access Journals (Sweden)

Ali Ghaheri

2015-11-01

A great wealth of information is hidden amid medical research data that in some cases cannot be easily analyzed, if at all, using classical statistical methods. Inspired by nature, metaheuristic algorithms have been developed to offer optimal or near-optimal solutions to complex data analysis and decision-making tasks in a reasonable time. Due to their powerful features, metaheuristic algorithms have frequently been used in other fields of science. In medicine, however, the use of these algorithms is not known by physicians who may well benefit by applying them to solve complex medical problems. Therefore, in this paper, we introduce the genetic algorithm and its applications in medicine. The use of the genetic algorithm has promising implications in various medical specialties including radiology, radiotherapy, oncology, pediatrics, cardiology, endocrinology, surgery, obstetrics and gynecology, pulmonology, infectious diseases, orthopedics, rehabilitation medicine, neurology, pharmacotherapy, and health care management. This review introduces the applications of the genetic algorithm in disease screening, diagnosis, treatment planning, pharmacovigilance, prognosis, and health care management, and enables physicians to envision possible applications of this metaheuristic method in their medical career.
The Applications of Genetic Algorithms in Medicine.

Science.gov (United States)

Ghaheri, Ali; Shoar, Saeed; Naderan, Mohammad; Hoseini, Sayed Shahabuddin

2015-11-01

A great wealth of information is hidden amid medical research data that in some cases cannot be easily analyzed, if at all, using classical statistical methods. Inspired by nature, metaheuristic algorithms have been developed to offer optimal or near-optimal solutions to complex data analysis and decision-making tasks in a reasonable time. Due to their powerful features, metaheuristic algorithms have frequently been used in other fields of science. In medicine, however, the use of these algorithms is not known by physicians who may well benefit by applying them to solve complex medical problems. Therefore, in this paper, we introduce the genetic algorithm and its applications in medicine. The use of the genetic algorithm has promising implications in various medical specialties including radiology, radiotherapy, oncology, pediatrics, cardiology, endocrinology, surgery, obstetrics and gynecology, pulmonology, infectious diseases, orthopedics, rehabilitation medicine, neurology, pharmacotherapy, and health care management. This review introduces the applications of the genetic algorithm in disease screening, diagnosis, treatment planning, pharmacovigilance, prognosis, and health care management, and enables physicians to envision possible applications of this metaheuristic method in their medical career.
Using Load Balancing to Scalably Parallelize Sampling-Based Motion Planning Algorithms

KAUST Repository

Fidel, Adam; Jacobs, Sam Ade; Sharma, Shishir; Amato, Nancy M.; Rauchwerger, Lawrence

2014-01-01

Motion planning, which is the problem of computing feasible paths in an environment for a movable object, has applications in many domains ranging from robotics, to intelligent CAD, to protein folding. The best methods for solving this PSPACE-hard problem are so-called sampling-based planners. Recent work introduced uniform spatial subdivision techniques for parallelizing sampling-based motion planning algorithms that scaled well. However, such methods are prone to load imbalance, as planning time depends on region characteristics and, for most problems, the heterogeneity of the sub problems increases as the number of processors increases. In this work, we introduce two techniques to address load imbalance in the parallelization of sampling-based motion planning algorithms: an adaptive work stealing approach and bulk-synchronous redistribution. We show that applying these techniques to representatives of the two major classes of parallel sampling-based motion planning algorithms, probabilistic roadmaps and rapidly-exploring random trees, results in a more scalable and load-balanced computation on more than 3,000 cores. © 2014 IEEE.
Using Load Balancing to Scalably Parallelize Sampling-Based Motion Planning Algorithms

KAUST Repository

Fidel, Adam

2014-05-01

Motion planning, which is the problem of computing feasible paths in an environment for a movable object, has applications in many domains ranging from robotics, to intelligent CAD, to protein folding. The best methods for solving this PSPACE-hard problem are so-called sampling-based planners. Recent work introduced uniform spatial subdivision techniques for parallelizing sampling-based motion planning algorithms that scaled well. However, such methods are prone to load imbalance, as planning time depends on region characteristics and, for most problems, the heterogeneity of the sub problems increases as the number of processors increases. In this work, we introduce two techniques to address load imbalance in the parallelization of sampling-based motion planning algorithms: an adaptive work stealing approach and bulk-synchronous redistribution. We show that applying these techniques to representatives of the two major classes of parallel sampling-based motion planning algorithms, probabilistic roadmaps and rapidly-exploring random trees, results in a more scalable and load-balanced computation on more than 3,000 cores. © 2014 IEEE.
Quantum Genetic Algorithms for Computer Scientists

OpenAIRE

Lahoz Beltrá, Rafael

2016-01-01

Genetic algorithms (GAs) are a class of evolutionary algorithms inspired by Darwinian natural selection. They are popular heuristic optimisation methods based on simulated genetic mechanisms, i.e., mutation, crossover, etc. and population dynamical processes such as reproduction, selection, etc. Over the last decade, the possibility to emulate a quantum computer (a computer using quantum-mechanical phenomena to perform operations on data) has led to a new class of GAs known as “Quantum Geneti...
Application of the distributed genetic algorithm for loading pattern optimization problems

International Nuclear Information System (INIS)

Hashimoto, Hiroshi; Yamamoto, Akio

2000-01-01

The distributed genetic algorithm (DGA) is applied to loading pattern optimization problems of pressurized water reactors (PWR). Due to the stiff nature of loading pattern optimization (e.g. multi-modality and non-linearity), stochastic methods like simulated annealing or the genetic algorithm (GA) are widely applied to these problems. The basic concept of DGA is based on that of GA. However, DGA equally distributes candidate solutions (i.e. loading patterns) to several independent 'islands' and evolves them in each island. Migrations of some candidates are performed among islands with a certain period. Since candidate solutions evolve independently in each island while accepting different genes of migrants from other islands, the premature convergence seen in the traditional GA can be prevented. Because many candidate loading patterns must be evaluated in one generation of GA or DGA, parallelization of these calculations works efficiently. Parallel efficiency was measured using our optimization code, and good load balance was attained even in a heterogeneous cluster environment due to dynamic distribution of the calculation load. The optimization code is based on a client/server architecture with native TCP/IP sockets, in which a client (optimization module) and calculation server modules exchange loading pattern objects with each other. Through a sensitivity study on the optimization parameters of DGA, a suitable set of parameters for a test problem was identified. Finally, the optimization capability of DGA and the traditional GA was compared on the test problem, and DGA provided better optimization results than the traditional GA. (author)
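The island scheme described above (independent sub-populations with periodic migration) can be sketched on bit strings. This is a generic illustration, not the authors' loading pattern code: the ring migration topology, the replace-the-worst policy, the genetic operators, and every parameter value are assumptions made for the example.

```python
import random

def evolve(island, fitness, rng, mutation_rate=0.05):
    """One generation on one island: keep the best, then tournament, crossover, mutation."""
    children = [max(island, key=fitness)[:]]
    while len(children) < len(island):
        p1 = max(rng.sample(island, 2), key=fitness)
        p2 = max(rng.sample(island, 2), key=fitness)
        cut = rng.randrange(1, len(p1))
        child = [1 - b if rng.random() < mutation_rate else b
                 for b in p1[:cut] + p2[cut:]]
        children.append(child)
    return children

def migrate(islands, fitness):
    """Ring migration: each island's best replaces the worst individual of the next island."""
    migrants = [max(isl, key=fitness)[:] for isl in islands]
    for i, migrant in enumerate(migrants):
        target = islands[(i + 1) % len(islands)]
        worst = min(range(len(target)), key=lambda k: fitness(target[k]))
        target[worst] = migrant

def island_ga(fitness, n_islands=4, island_size=10, n_bits=16,
              generations=40, interval=5, seed=0):
    """Evolve the islands independently and migrate every `interval` generations."""
    rng = random.Random(seed)
    islands = [[[rng.randint(0, 1) for _ in range(n_bits)]
                for _ in range(island_size)] for _ in range(n_islands)]
    for g in range(1, generations + 1):
        islands = [evolve(isl, fitness, rng) for isl in islands]
        if g % interval == 0:
            migrate(islands, fitness)
    return max((ind for isl in islands for ind in isl), key=fitness)
```

The call `evolve(isl, ...)` for each island is independent of the others between migrations, which is what makes the scheme easy to distribute across processes or machines.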
Optimization of Pressurizer Based on Genetic-Simplex Algorithm

International Nuclear Information System (INIS)

Wang, Cheng; Yan, Chang Qi; Wang, Jian Jun

2014-01-01

The pressurizer is one of the key components in a nuclear power system. It is important to control its dimensions during design through optimization techniques. In this work, a mathematical model of a vertical electric heating pressurizer was established. A new Genetic-Simplex Algorithm (GSA) that combines the genetic algorithm and the simplex algorithm was developed to enhance the searching ability, and the modified and original algorithms are compared on a benchmark function. Furthermore, the optimization design of the pressurizer, taking minimization of volume and net weight as objectives, was carried out through GSA considering thermal-hydraulic and geometric constraints. The results indicate that the mathematical model is suitable for the pressurizer and that the new algorithm is more effective than the traditional genetic algorithm. The optimization design shows obvious validity and can provide guidance for real engineering design.
A Computational Fluid Dynamics Algorithm on a Massively Parallel Computer

Science.gov (United States)

Jespersen, Dennis C.; Levit, Creon

1989-01-01

The discipline of computational fluid dynamics is demanding ever-increasing computational power to deal with complex fluid flow problems. We investigate the performance of a finite-difference computational fluid dynamics algorithm on a massively parallel computer, the Connection Machine. Of special interest is an implicit time-stepping algorithm; to obtain maximum performance from the Connection Machine, it is necessary to use a nonstandard algorithm to solve the linear systems that arise in the implicit algorithm. We find that the Connection Machine can achieve very high computation rates on both explicit and implicit algorithms. The performance of the Connection Machine puts it in the same class as today's most powerful conventional supercomputers.
On the impact of communication complexity in the design of parallel numerical algorithms

Science.gov (United States)

Gannon, D.; Vanrosendale, J.

1984-01-01

This paper describes two models of the cost of data movement in parallel numerical algorithms. One model is a generalization of an approach due to Hockney, and is suitable for shared memory multiprocessors where each processor has vector capabilities. The other model is applicable to highly parallel nonshared memory MIMD systems. In the second model, algorithm performance is characterized in terms of the communication network design. Techniques used in VLSI complexity theory are also brought in, and algorithm independent upper bounds on system performance are derived for several problems that are important to scientific computation.

Genetic algorithms in loading pattern optimization

International Nuclear Information System (INIS)

Yilmazbayhan, A.; Tombakoglu, M.; Bekar, K. B.; Erdemli, A. Oe

2001-01-01

Genetic Algorithm (GA) based systems are used for the loading pattern optimization. The use of Genetic Algorithm operators such as regional crossover, crossover and mutation, and selection of initial population size for PWRs are discussed. Antithetic variates are used to generate the initial population. The performance of GA with antithetic variates is compared to traditional GA. The results of multi-cycle optimization are discussed for objective function taking into account cycle burn-up and discharge burn-up
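The use of antithetic variates for the initial population can be illustrated with a minimal sketch. It assumes real-coded genes in the unit interval, which is an illustrative simplification (actual loading patterns are discrete fuel arrangements, and the abstract does not describe the encoding). The function name `antithetic_population` and its parameters are hypothetical. Each uniform draw `u` is paired with its mirror image `1 - u`, so the two halves of the population are negatively correlated and cover the search space more evenly than independent draws.

```python
import random

def antithetic_population(n_pairs, n_genes, seed=0):
    """Build 2 * n_pairs individuals as antithetic pairs (u, 1 - u).

    Genes are real numbers in [0, 1]; this encoding is an assumption
    made for illustration, not the encoding used in the paper.
    """
    rng = random.Random(seed)
    population = []
    for _ in range(n_pairs):
        u = [rng.random() for _ in range(n_genes)]
        population.append(u)
        # The antithetic partner mirrors every gene around 0.5.
        population.append([1.0 - x for x in u])
    return population

pop = antithetic_population(n_pairs=3, n_genes=4)
```

A decoding step that maps each real-valued gene to a discrete fuel position would still be needed before such individuals could represent a loading pattern.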
Adaptive sensor fusion using genetic algorithms

International Nuclear Information System (INIS)

Fitzgerald, D.S.; Adams, D.G.

1994-01-01

Past attempts at sensor fusion have used some form of Boolean logic to combine the sensor information. As an alternative, an adaptive "fuzzy" sensor fusion technique is described in this paper. This technique exploits the robust capabilities of fuzzy logic in the decision process as well as the optimization features of the genetic algorithm. This paper presents a brief background on fuzzy logic and genetic algorithms and how they are used in an online implementation of adaptive sensor fusion.
Availability allocation to repairable systems with genetic algorithms: a multi-objective formulation

International Nuclear Information System (INIS)

Elegbede, Charles; Adjallah, Kondo

2003-01-01

This paper describes a methodology based on genetic algorithms (GA) and design of experiments to optimize the availability and the cost of repairable parallel-series systems. It is an NP-hard problem of multi-objective combinatorial optimization, modeled with continuous and discrete variables. By using the weighting technique, the problem is transformed into a single-objective optimization problem whose constraints are then relaxed by the exterior penalty technique. We then search for a solution with a GA whose parameters are tuned using the design-of-experiments technique. A numerical example is used to assess the method.
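The two transformations named in the abstract, weighting and exterior penalty, can be sketched as a single fitness function. This is a minimal illustration under stated assumptions: the quadratic penalty form, the penalty coefficient `r`, and the toy cost and unavailability functions are all hypothetical and are not taken from the paper. Constraints are written as `g(x) <= 0`, so only violated constraints contribute to the penalty.

```python
def penalized_objective(x, objectives, weights, constraints, r=1000.0):
    """Weighted-sum scalarization plus a quadratic exterior penalty.

    objectives:  functions to minimize
    weights:     one non-negative weight per objective
    constraints: functions g with g(x) <= 0 when x is feasible
    r:           penalty coefficient (assumed value)
    """
    weighted = sum(w * f(x) for w, f in zip(weights, objectives))
    # Exterior penalty: zero inside the feasible region, grows outside it.
    penalty = sum(max(0.0, g(x)) ** 2 for g in constraints)
    return weighted + r * penalty

# Toy example: x is a redundancy level; cost grows and unavailability falls with x.
cost = lambda x: x
unavailability = lambda x: 1.0 / (1.0 + x)
budget = lambda x: x - 5.0  # feasible when x <= 5

feasible_value = penalized_objective(2.0, [cost, unavailability], [0.5, 0.5], [budget])
infeasible_value = penalized_objective(6.0, [cost, unavailability], [0.5, 0.5], [budget])
```

A GA can then minimize `penalized_objective` directly, since infeasible candidates remain comparable but are ranked poorly.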
Efficient parallel and out of core algorithms for constructing large bi-directed de Bruijn graphs

Directory of Open Access Journals (Sweden)

Vaughn Matthew

2010-11-01

Background: Assembling genomic sequences from a set of overlapping reads is one of the most fundamental problems in computational biology. Algorithms addressing the assembly problem fall into two broad categories, based on the data structures which they employ. The first class uses an overlap/string graph and the second type uses a de Bruijn graph. However, with the recent advances in short read sequencing technology, de Bruijn graph based algorithms seem to play a vital role in practice. Efficient algorithms for building these massive de Bruijn graphs are essential in large sequencing projects based on short reads. In an earlier work, an O(n/p) time parallel algorithm has been given for this problem. Here n is the size of the input and p is the number of processors. This algorithm enumerates all possible bi-directed edges which can overlap with a node and ends up generating Θ(nΣ) messages (Σ being the size of the alphabet). Results: In this paper we present a Θ(n/p) time parallel algorithm with a communication complexity that is equal to that of parallel sorting and is not sensitive to Σ. The generality of our algorithm makes it very easy to extend it even to the out-of-core model, and in this case it has an optimal I/O complexity of Θ(n log(n/B) / (B log(M/B))) (M being the main memory size and B being the size of the disk block). We demonstrate the scalability of our parallel algorithm on an SGI/Altix computer. A comparison of our algorithm with the previous approaches reveals that our algorithm is faster, both asymptotically and practically. We demonstrate the scalability of our sequential out-of-core algorithm by comparing it with the algorithm used by VELVET to build the bi-directed de Bruijn graph. Our experiments reveal that our algorithm can build the graph with a constant amount of memory, which clearly outperforms VELVET. We also provide efficient algorithms for the bi-directed chain compaction problem. Conclusions: The bi
Efficient parallel and out of core algorithms for constructing large bi-directed de Bruijn graphs.

Science.gov (United States)

Kundeti, Vamsi K; Rajasekaran, Sanguthevar; Dinh, Hieu; Vaughn, Matthew; Thapar, Vishal

2010-11-15

Assembling genomic sequences from a set of overlapping reads is one of the most fundamental problems in computational biology. Algorithms addressing the assembly problem fall into two broad categories, based on the data structures which they employ. The first class uses an overlap/string graph and the second type uses a de Bruijn graph. However, with the recent advances in short read sequencing technology, de Bruijn graph based algorithms seem to play a vital role in practice. Efficient algorithms for building these massive de Bruijn graphs are essential in large sequencing projects based on short reads. In an earlier work, an O(n/p) time parallel algorithm has been given for this problem. Here n is the size of the input and p is the number of processors. This algorithm enumerates all possible bi-directed edges which can overlap with a node and ends up generating Θ(nΣ) messages (Σ being the size of the alphabet). In this paper we present a Θ(n/p) time parallel algorithm with a communication complexity that is equal to that of parallel sorting and is not sensitive to Σ. The generality of our algorithm makes it very easy to extend it even to the out-of-core model, and in this case it has an optimal I/O complexity of Θ(n log(n/B) / (B log(M/B))) (M being the main memory size and B being the size of the disk block). We demonstrate the scalability of our parallel algorithm on an SGI/Altix computer. A comparison of our algorithm with the previous approaches reveals that our algorithm is faster, both asymptotically and practically. We demonstrate the scalability of our sequential out-of-core algorithm by comparing it with the algorithm used by VELVET to build the bi-directed de Bruijn graph. Our experiments reveal that our algorithm can build the graph with a constant amount of memory, which clearly outperforms VELVET. We also provide efficient algorithms for the bi-directed chain compaction problem. The bi-directed de Bruijn graph is a fundamental data structure for
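To make the data structure concrete, the following is a small sequential sketch of building bi-directed de Bruijn edges from reads. It is not the parallel or out-of-core algorithm of the paper; it only illustrates the idea that a k-mer and its reverse complement share one node, and that each edge carries an orientation for both endpoints. The edge representation (a 4-tuple with boolean orientation flags) and the function names are assumptions made for illustration.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement of a DNA string over the alphabet ACGT."""
    return seq[::-1].translate(_COMPLEMENT)

def canonical(seq):
    """Lexicographically smaller of a sequence and its reverse complement."""
    return min(seq, reverse_complement(seq))

def bidirected_de_bruijn(reads, k):
    """Return a set of edges (node_a, a_forward, node_b, b_forward).

    Nodes are canonical k-mers; a flag is True when the k-mer occurs
    in its canonical orientation on that edge.
    """
    edges = set()
    for read in reads:
        for i in range(len(read) - k):
            # Canonicalize the (k+1)-mer so both strands give the same edge.
            e = canonical(read[i:i + k + 1])
            a, b = e[:k], e[1:]
            edges.add((canonical(a), a == canonical(a),
                       canonical(b), b == canonical(b)))
    return edges

edges_fwd = bidirected_de_bruijn(["ACGTT"], k=2)
edges_rev = bidirected_de_bruijn([reverse_complement("ACGTT")], k=2)
```

Because each (k+1)-mer is canonicalized before it is split, a read and its reverse complement produce the same edge set, which is the property that distinguishes the bi-directed graph from a plain de Bruijn graph.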
Real-Coded Quantum-Inspired Genetic Algorithm-Based BP Neural Network Algorithm

Directory of Open Access Journals (Sweden)

Jianyong Liu

2015-01-01

The real-coded quantum-inspired genetic algorithm (RQGA) is proposed to optimize the weights and thresholds of a BP neural network, overcoming the defect that the gradient descent method makes the algorithm easily fall into a local optimum during learning. The quantum genetic algorithm (QGA) has good directional global optimization ability, but the conventional QGA is based on binary coding, and the coding and decoding processes reduce the speed of calculation. RQGA is therefore introduced to explore the search space, and an improved variable learning rate is adopted to train the BP neural network. Simulation tests show that the proposed algorithm converges rapidly to a solution that satisfies the constraint conditions.
Mission Planning for Unmanned Aircraft with Genetic Algorithms

DEFF Research Database (Denmark)

Hansen, Karl Damkjær

unmanned aircraft are used for aerial surveying of the crops. The farmer takes the role of the analyst above, who does not necessarily have any specific interest in remote controlled aircraft but needs the outcome of the survey. The recurring method in the study is the genetic algorithm; a flexible...... contributions are made in the area of the genetic algorithms. One is a method to decide on the right time to stop the computation of the plan, when the right balance is struck between time spent planning and time spent flying. The other contribution is a characterization of the evolutionary operators...... used in the genetic algorithm. The result is a measure based on entropy to evaluate and control the diversity of the population of the genetic algorithm, which is an important factor in its effectiveness....
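An entropy-based diversity measure of the kind mentioned above can be sketched as follows. The exact measure used in the thesis is not given in this excerpt, so this example assumes one common definition: the Shannon entropy of the allele distribution at each locus, averaged over all loci. The function name `population_entropy` is hypothetical.

```python
import math
from collections import Counter

def population_entropy(population):
    """Mean per-locus Shannon entropy (in bits) of equal-length chromosomes.

    0.0 means every individual is identical; higher values mean the
    alleles at each position are spread more evenly across the population.
    """
    n = len(population)
    length = len(population[0])
    total = 0.0
    for locus in range(length):
        counts = Counter(individual[locus] for individual in population)
        total -= sum((c / n) * math.log2(c / n) for c in counts.values())
    return total / length

converged = population_entropy(["0101", "0101", "0101"])
diverse = population_entropy(["00", "11"])
```

A GA could monitor this value each generation and, for instance, raise the mutation rate when it falls below a chosen threshold.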
Increasing the reach of forensic genetics with massively parallel sequencing.

Science.gov (United States)

Budowle, Bruce; Schmedes, Sarah E; Wendt, Frank R

2017-09-01

The field of forensic genetics has made great strides in the analysis of biological evidence related to criminal and civil matters. More so, the discipline has set a standard of performance and quality in the forensic sciences. The advent of massively parallel sequencing will allow the field to expand its capabilities substantially. This review describes the salient features of massively parallel sequencing and how it can impact forensic genetics. The features of this technology offer increased number and types of genetic markers that can be analyzed, higher throughput of samples, and the capability of targeting different organisms, all by one unifying methodology. While there are many applications, three are described where massively parallel sequencing will have immediate impact: molecular autopsy, microbial forensics and differentiation of monozygotic twins. The intent of this review is to expose the forensic science community to the potential enhancements that have arrived or are soon to arrive and to demonstrate the continued expansion of the field of forensic genetics and its service in the investigation of legal matters.
Reactor controller design using genetic algorithms with simulated annealing

International Nuclear Information System (INIS)

Erkan, K.; Buetuen, E.

2000-01-01

This chapter presents a digital control system for the ITU TRIGA Mark-II reactor using genetic algorithms with simulated annealing. The basic principles of genetic algorithms for problem solving are inspired by the mechanism of natural selection. Natural selection is a biological process in which stronger individuals are likely to be winners in a competing environment. Genetic algorithms use a direct analogy of natural evolution. Genetic algorithms are global search techniques for optimisation but they are poor at hill-climbing. Simulated annealing has the ability of probabilistic hill-climbing. Thus, the two techniques are combined here to get a fine-tuned algorithm that yields a faster convergence and a more accurate search by introducing a new mutation operator like simulated annealing or an adaptive cooling schedule. In control system design, there are currently no systematic approaches to choose the controller parameters to obtain the desired performance. The controller parameters are usually determined by trial and error with simulation and experimental analysis. A genetic algorithm is used to search automatically and efficiently for a set of controller parameters that gives better performance. (orig.)
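The idea of a mutation operator that behaves like simulated annealing can be sketched as a Metropolis-style acceptance test applied to each mutated candidate. This is an illustrative assumption about how such an operator might look, not the operator from the chapter: the Gaussian perturbation, the step size, the geometric cooling factor of 0.95, and the sphere test function are all hypothetical choices.

```python
import math
import random

def sa_mutation(x, cost, temperature, rng, step=0.1):
    """Perturb x and accept the result with the Metropolis criterion.

    Improvements are always accepted; a worse candidate is accepted with
    probability exp(-delta / temperature), which allows probabilistic
    hill-climbing while the temperature is high.
    """
    candidate = [xi + rng.gauss(0.0, step) for xi in x]
    delta = cost(candidate) - cost(x)
    if delta <= 0 or rng.random() < math.exp(-delta / temperature):
        return candidate
    return x

def sphere(x):
    """Toy cost function standing in for a controller performance index."""
    return sum(xi * xi for xi in x)

rng = random.Random(42)
x = [2.0, -1.5]
initial_cost = sphere(x)
temperature = 1.0
for _ in range(200):
    x = sa_mutation(x, sphere, temperature, rng)
    temperature *= 0.95  # geometric cooling schedule (assumed)
final_cost = sphere(x)
```

In a full GA this operator would replace the plain mutation step, with the temperature lowered once per generation so that the search becomes greedier over time.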
Parallel Newton-Krylov-Schwarz algorithms for the transonic full potential equation

Science.gov (United States)

Cai, Xiao-Chuan; Gropp, William D.; Keyes, David E.; Melvin, Robin G.; Young, David P.

1996-01-01

We study parallel two-level overlapping Schwarz algorithms for solving nonlinear finite element problems, in particular, for the full potential equation of aerodynamics discretized in two dimensions with bilinear elements. The overall algorithm, Newton-Krylov-Schwarz (NKS), employs an inexact finite-difference Newton method and a Krylov space iterative method, with a two-level overlapping Schwarz method as a preconditioner. We demonstrate that NKS, combined with a density upwinding continuation strategy for problems with weak shocks, is robust and economical for this class of mixed elliptic-hyperbolic nonlinear partial differential equations, with proper specification of several parameters. We study upwinding parameters, inner convergence tolerance, coarse grid density, subdomain overlap, and the level of fill-in in the incomplete factorization, and report their effect on numerical convergence rate, overall execution time, and parallel efficiency on a distributed-memory parallel computer.
Genetic algorithm-based neural network for accidents diagnosis of research reactors on FPGA

International Nuclear Information System (INIS)

Ghuname, A.A.A.

2012-01-01

Nuclear research reactor plants are expected to be operated with high levels of reliability, availability and safety. In order to achieve and maintain system stability and assure satisfactory and safe operation, there is increasing demand for automated systems to detect and diagnose failures. Artificial Neural Networks (ANNs) are one of the most popular solutions because of their parallel structure, high speed, and ability to give simple solutions to complicated problems. In recent years, genetic algorithms (GAs), which are search algorithms (optimization techniques), have been used to find the optimum construction of a neural network for a given application. Field Programmable Gate Arrays (FPGAs) have become an important implementation platform for neural networks because of their high performance and because they are easily parallelized. VHDL, which stands for VHSIC (Very High Speed Integrated Circuits) Hardware Description Language, has been used to describe the design behaviorally, in addition to schematic and other description languages. Describing designs in a synthesizable language such as VHDL makes them reusable and suitable for implementation in upgradeable systems such as nuclear research reactor plants. In this thesis, the work was carried out in three main parts. In the first part, pattern recognition of nuclear research reactor accidents is tackled with the artificial neural network approach. The patterns are introduced initially without noise. To increase the reliability of the neural network, noise ratios of up to 50% were added for training in order to ensure the recognition of these patterns when they are introduced with noise. The second part is concerned with the construction of Artificial Neural Networks (ANNs) using Genetic algorithms (GAs) for nuclear accident diagnosis. The MATLAB ANN toolbox and GA toolbox are employed to optimize an ANN for this purpose. The results obtained show
Genetic algorithm for neural networks optimization

Science.gov (United States)

Setyawati, Bina R.; Creese, Robert C.; Sahirman, Sidharta

2004-11-01

This paper examines the forecasting performance of multi-layer feed forward neural networks in modeling a particular foreign exchange rate, i.e. Japanese Yen/US Dollar. The effects of two learning methods, Back Propagation and Genetic Algorithm, with the neural network topology and other parameters fixed, were investigated. The early results indicate that the application of this hybrid system seems to be well suited for the forecasting of foreign exchange rates. The Neural Networks and Genetic Algorithm were programmed using MATLAB®.
Overcoming artificial spatial correlations in simulations of superstructure domain growth with parallel Monte Carlo algorithms

International Nuclear Information System (INIS)

Schleier, W.; Besold, G.; Heinz, K.

1992-01-01

The authors study the applicability of parallelized/vectorized Monte Carlo (MC) algorithms to the simulation of domain growth in two-dimensional lattice gas models undergoing an ordering process after a rapid quench below an order-disorder transition temperature. As examples they consider models with 2 x 1 and c(2 x 2) equilibrium superstructures on the square and rectangular lattices, respectively. They also study the case of phase separation ('1 x 1' islands) on the square lattice. A generalized parallel checkerboard algorithm for Kawasaki dynamics is shown to give rise to artificial spatial correlations in all three models. However, only if superstructure domains evolve do these correlations modify the kinetics by influencing the nucleation process and result in a reduced growth exponent compared to the value from the conventional heat bath algorithm with random single-site updates. In order to overcome these artificial modifications, two MC algorithms with a reduced degree of parallelism ('hybrid' and 'mask' algorithms, respectively) are presented and applied. As the results indicate, these algorithms are suitable for the simulation of superstructure domain growth on parallel/vector computers. 60 refs., 10 figs., 1 tab
Parallel Sn Sweeps on Unstructured Grids: Algorithms for Prioritization, Grid Partitioning, and Cycle Detection

International Nuclear Information System (INIS)

Plimpton, Steven J.; Hendrickson, Bruce; Burns, Shawn P.; McLendon, William III; Rauchwerger, Lawrence

2005-01-01

The method of discrete ordinates is commonly used to solve the Boltzmann transport equation. The solution in each ordinate direction is most efficiently computed by sweeping the radiation flux across the computational grid. For unstructured grids this poses many challenges, particularly when implemented on distributed-memory parallel machines where the grid geometry is spread across processors. We present several algorithms relevant to this approach: (a) an asynchronous message-passing algorithm that performs sweeps simultaneously in multiple ordinate directions, (b) a simple geometric heuristic to prioritize the computational tasks that a processor works on, (c) a partitioning algorithm that creates columnar-style decompositions for unstructured grids, and (d) an algorithm for detecting and eliminating cycles that sometimes exist in unstructured grids and can prevent sweeps from successfully completing. Algorithms (a) and (d) are fully parallel; algorithms (b) and (c) can be used in conjunction with (a) to achieve higher parallel efficiencies. We describe our message-passing implementations of these algorithms within a radiation transport package. Performance and scalability results are given for unstructured grids with up to 3 million elements (500 million unknowns) running on thousands of processors of Sandia National Laboratories' Intel Tflops machine and DEC-Alpha CPlant cluster
An intrinsic algorithm for parallel Poisson disk sampling on arbitrary surfaces.

Science.gov (United States)

Ying, Xiang; Xin, Shi-Qing; Sun, Qian; He, Ying

2013-09-01

Poisson disk sampling has excellent spatial and spectral properties, and plays an important role in a variety of visual computing applications. Although many promising algorithms have been proposed for multidimensional sampling in Euclidean space, very few studies have been reported with regard to the problem of generating Poisson disks on surfaces due to the complicated nature of the surface. This paper presents an intrinsic algorithm for parallel Poisson disk sampling on arbitrary surfaces. In sharp contrast to the conventional parallel approaches, our method neither partitions the given surface into small patches nor uses any spatial data structure to maintain the voids in the sampling domain. Instead, our approach assigns each sample candidate a random and unique priority that is unbiased with regard to the distribution. Hence, multiple threads can process the candidates simultaneously and resolve conflicts by checking the given priority values. Our algorithm guarantees that the generated Poisson disks are uniformly and randomly distributed without bias. It is worth noting that our method is intrinsic and independent of the embedding space. This intrinsic feature allows us to generate Poisson disk patterns on arbitrary surfaces in R^n. To our knowledge, this is the first intrinsic, parallel, and accurate algorithm for surface Poisson disk sampling. Furthermore, by manipulating the spatially varying density function, we can obtain adaptive sampling easily.
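The priority-based conflict resolution described in this abstract can be sketched in a simplified setting. The example below works with points in the Euclidean plane and straight-line distance, whereas the paper uses intrinsic (geodesic) distance on a surface; it also emulates the parallel rounds sequentially. The function name, the round structure, and the quadratic-time neighbor search are assumptions chosen for brevity rather than the paper's implementation.

```python
import math
import random

def priority_poisson_disk(candidates, radius, seed=0):
    """Select a maximal subset of candidates with pairwise distance >= radius.

    Every candidate gets a unique random priority. In each round a candidate
    is accepted if it beats all still-active candidates within `radius`; the
    per-candidate checks are independent, so a round could run in parallel.
    """
    rng = random.Random(seed)
    order = list(range(len(candidates)))
    rng.shuffle(order)
    priority = {idx: rank for rank, idx in enumerate(order)}  # lower rank wins

    def conflict(i, j):
        return math.dist(candidates[i], candidates[j]) < radius

    active = set(range(len(candidates)))
    accepted = []
    while active:
        winners = [i for i in active
                   if all(priority[i] < priority[j]
                          for j in active if j != i and conflict(i, j))]
        accepted.extend(winners)
        # Drop the winners and every candidate covered by a winner's disk.
        active = {i for i in active
                  if i not in winners
                  and not any(conflict(i, w) for w in winners)}
    return [candidates[i] for i in accepted]

point_rng = random.Random(1)
points = [(point_rng.random(), point_rng.random()) for _ in range(200)]
samples = priority_poisson_disk(points, radius=0.1)
```

Each round always makes progress because the active candidate with the best priority overall has no rival, and two winners can never conflict because each would need to outrank the other.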
Feed-forward volume rendering algorithm for moderately parallel MIMD machines

Science.gov (United States)

Yagel, Roni

1993-01-01

Algorithms for direct volume rendering on parallel and vector processors are investigated. Volumes are transformed efficiently on parallel processors by dividing the data into slices and beams of voxels. Equal sized sets of slices along one axis are distributed to processors. Parallelism is achieved at two levels. Because each slice can be transformed independently of others, processors transform their assigned slices with no communication, thus providing maximum possible parallelism at the first level. Within each slice, consecutive beams are incrementally transformed using coherency in the transformation computation. Also, coherency across slices can be exploited to further enhance performance. This coherency yields the second level of parallelism through the use of the vector processing or pipelining. Other ongoing efforts include investigations into image reconstruction techniques, load balancing strategies, and improving performance.
Optimal support arrangement of piping systems using genetic algorithm

International Nuclear Information System (INIS)

Chiba, T.; Okado, S.; Fujii, I.; Itami, K.

1996-01-01

The support arrangement is one of the important factors in the design of piping systems. Much time is required to decide the arrangement of the supports. The authors applied a genetic algorithm to find the optimum support arrangement for piping systems. Examples are provided to illustrate the effectiveness of the genetic algorithm. Good results are obtained when applying the genetic algorithm to the actual designing of the piping system
Modeling of genetic algorithms with a finite population

NARCIS (Netherlands)

C.H.M. van Kemenade

1997-01-01

Cross-competition between non-overlapping building blocks can strongly influence the performance of evolutionary algorithms. The choice of the selection scheme can have a strong influence on the performance of a genetic algorithm. This paper describes a number of different genetic
A Parallel Adaptive Particle Swarm Optimization Algorithm for Economic/Environmental Power Dispatch

Directory of Open Access Journals (Sweden)

Jinchao Li

2012-01-01

A parallel adaptive particle swarm optimization algorithm (PAPSO) is proposed for economic/environmental power dispatch. It can overcome premature convergence, slow convergence in the late evolutionary phase, and the lack of good direction in the particles' evolutionary process. A search population is randomly divided into several subpopulations. Then for each subpopulation, the optimal solution is searched synchronously using the proposed method, and thus parallel computing is realized. To avoid converging to a local optimum, a crossover operator is introduced to exchange information among the subpopulations, and the diversity of the population is sustained simultaneously. Simulation results show that the proposed algorithm can effectively solve the economic/environmental operation problem of hydropower generating units. Performance comparisons show that the solution from the proposed method is better than those from the conventional particle swarm algorithm and other optimization algorithms.
Application of the DMRG in two dimensions: a parallel tempering algorithm

Science.gov (United States)

Hu, Shijie; Zhao, Jize; Zhang, Xuefeng; Eggert, Sebastian

The Density Matrix Renormalization Group (DMRG) is known to be a powerful algorithm for treating one-dimensional systems. When the DMRG is applied in two dimensions, however, the convergence becomes much less reliable and typically "metastable states" may appear, which are unfortunately quite robust even when keeping a very high number of DMRG states. To overcome this problem we have now successfully developed a parallel tempering DMRG algorithm. Similar to parallel tempering in quantum Monte Carlo, this algorithm allows the systematic switching of DMRG states between different model parameters, which is very efficient for solving convergence problems. Using this method we have determined the phase diagram of the XXZ model on the anisotropic triangular lattice, which can be realized by hardcore bosons in optical lattices. SFB Transregio 49 of the Deutsche Forschungsgemeinschaft (DFG) and the Allianz für Hochleistungsrechnen Rheinland-Pfalz (AHRP).

Quantum Genetic Algorithms for Computer Scientists

Directory of Open Access Journals (Sweden)

Rafael Lahoz-Beltra

2016-10-01

Genetic algorithms (GAs) are a class of evolutionary algorithms inspired by Darwinian natural selection. They are popular heuristic optimisation methods based on simulated genetic mechanisms, i.e., mutation, crossover, etc. and population dynamical processes such as reproduction, selection, etc. Over the last decade, the possibility to emulate a quantum computer (a computer using quantum-mechanical phenomena to perform operations on data) has led to a new class of GAs known as “Quantum Genetic Algorithms” (QGAs). In this review, we present a discussion, future potential, pros and cons of this new class of GAs. The review will be oriented towards computer scientists interested in QGAs “avoiding” the possible difficulties of quantum-mechanical phenomena.
A parallel algorithm for 3D dislocation dynamics

International Nuclear Information System (INIS)

Wang Zhiqiang; Ghoniem, Nasr; Swaminarayan, Sriram; LeSar, Richard

2006-01-01

Dislocation dynamics (DD), a discrete dynamic simulation method in which dislocations are the fundamental entities, is a powerful tool for investigation of plasticity, deformation and fracture of materials at the micron length scale. However, severe computational difficulties arising from complex, long-range interactions between these curvilinear line defects limit the application of DD in the study of large-scale plastic deformation. We present here the development of a parallel algorithm for accelerated computer simulations of DD. By representing dislocations as a 3D set of dislocation particles, we show here that the problem of an interacting ensemble of dislocations can be converted to a problem of a particle ensemble, interacting with a long-range force field. A grid using binary space partitioning is constructed to keep track of node connectivity across domains. We demonstrate the computational efficiency of the parallel micro-plasticity code and discuss how O(N) methods map naturally onto the parallel data structure. Finally, we present results from applications of the parallel code to deformation in single crystal fcc metals
Big Data GPU-Driven Parallel Processing Spatial and Spatio-Temporal Clustering Algorithms

Science.gov (United States)

Konstantaras, Antonios; Skounakis, Emmanouil; Kilty, James-Alexander; Frantzeskakis, Theofanis; Maravelakis, Emmanuel

2016-04-01

Advances in graphics processing unit technology towards parallel architectures [1], comprising thousands of cores and multiple parallel threads, provide the hardware foundation for the rapid processing of various parallel applications in seismic big data analysis. Seismic data are normally stored as collections of vectors in massive matrices, growing rapidly in size as wider areas are covered, denser recording networks are being established and decades of data are being compiled together [2]. Yet, many processes regarding seismic data analysis are performed on each seismic event independently or as distinct tiles [3] of specific grouped seismic events within a much larger data set. Such processes, being independent of one another, can be performed in parallel, narrowing down processing times drastically [1,3]. This research work presents the development and implementation of three parallel processing algorithms using CUDA C [4] for the investigation of potentially distinct seismic regions [5,6] present in the vicinity of the southern Hellenic seismic arc. The algorithms, programmed and executed in parallel comparatively, are: fuzzy k-means clustering with expert knowledge [7] in assigning the overall number of clusters; density-based clustering [8]; and a self-developed spatio-temporal clustering algorithm encompassing expert [9] and empirical knowledge [10] for the specific area under investigation. Indexing terms: GPU parallel programming, CUDA C, heterogeneous processing, distinct seismic regions, parallel clustering algorithms, spatio-temporal clustering References [1] Kirk, D. and Hwu, W.: 'Programming massively parallel processors - A hands-on approach', 2nd Edition, Morgan Kaufmann Publishers, 2013 [2] Konstantaras, A., Valianatos, F., Varley, M.R. and Makris, J.P.: 'Soft-Computing Modelling of Seismicity in the Southern Hellenic Arc', Geoscience and Remote Sensing Letters, vol. 5 (3), pp. 323-327, 2008 [3] Papadakis, S. and
Efficient sequential and parallel algorithms for finding edit distance based motifs.

Science.gov (United States)

Pal, Soumitra; Xiao, Peng; Rajasekaran, Sanguthevar

2016-08-18

Motif search is an important step in extracting meaningful patterns from biological data. The general problem of motif search is intractable and there is a pressing need to develop efficient, exact and approximation algorithms to solve this problem. In this paper, we present several novel, exact, sequential and parallel algorithms for solving the (l,d) Edit-distance-based Motif Search (EMS) problem: given two integers l,d and n biological strings, find all strings of length l that appear in each input string with at most d errors of types substitution, insertion and deletion. One popular technique to solve the problem is to explore for each input string the set of all possible l-mers that belong to the d-neighborhood of any substring of the input string and output those which are common for all input strings. We introduce a novel and provably efficient neighborhood exploration technique. We show that it is enough to consider the candidates in the neighborhood which are at a distance of exactly d. We compactly represent these candidate motifs using wildcard characters and efficiently explore them with very few repetitions. Our sequential algorithm uses a trie based data structure to efficiently store and sort the candidate motifs. Our parallel algorithm in a multi-core shared memory setting uses arrays for storing and a novel modification of radix-sort for sorting the candidate motifs. The algorithms for EMS are customarily evaluated on several challenging instances such as (8,1), (12,2), (16,3), (20,4), and so on. The best previously known algorithm, EMS1, is sequential and in an estimated 3 days solves up to instance (16,3). Our sequential algorithms are more than 20 times faster on (16,3). On other hard instances such as (9,2), (11,3), (13,4), our algorithms are much faster. Our parallel algorithm has more than 600% scaling performance while using 16 threads. Our algorithms have pushed up the state-of-the-art of EMS solvers and we believe that the techniques introduced in
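The (l,d) EMS problem statement above can be made concrete with a brute-force reference sketch. This is deliberately not the neighborhood-exploration technique of the paper: it enumerates every l-mer over the alphabet and checks it against each input string, which is only practical for very small l. The function names and the DNA alphabet default are assumptions; the substring check uses the standard approximate-matching dynamic program in which a match may start at any text position at no cost.

```python
from itertools import product

def min_substring_edit_distance(pattern, text):
    """Smallest edit distance between pattern and any substring of text."""
    prev = [0] * (len(text) + 1)  # an empty pattern matches anywhere for free
    for i, pc in enumerate(pattern, start=1):
        cur = [i]  # matching i pattern characters against nothing costs i
        for j, tc in enumerate(text, start=1):
            cur.append(min(prev[j - 1] + (pc != tc),  # match or substitution
                           prev[j] + 1,               # skip a pattern character
                           cur[j - 1] + 1))           # skip a text character
        prev = cur
    return min(prev)  # the match may end at any text position

def ems_brute_force(strings, l, d, alphabet="ACGT"):
    """All l-mers occurring in every input string with at most d edits."""
    motifs = []
    for letters in product(alphabet, repeat=l):
        motif = "".join(letters)
        if all(min_substring_edit_distance(motif, s) <= d for s in strings):
            motifs.append(motif)
    return motifs

exact_motifs = ems_brute_force(["ACGT", "TACG"], l=3, d=0)
```

The cost of this check grows as the alphabet size raised to the power l, which is the blow-up that the candidate-generation techniques described in the abstract are designed to avoid.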
An intelligent allocation algorithm for parallel processing

Science.gov (United States)

Carroll, Chester C.; Homaifar, Abdollah; Ananthram, Kishan G.

1988-01-01

The problem of allocating nodes of a program graph to processors in a parallel processing architecture is considered. The algorithm is based on critical path analysis, some allocation heuristics, and the execution granularity of nodes in a program graph. These factors, and the structure of the interprocessor communication network, influence the allocation. To achieve realistic estimations of the execution durations of allocations, the algorithm considers the fact that nodes in a program graph have to communicate through varying numbers of tokens. Coarse and fine granularities have been implemented, with interprocessor token-communication duration varying from zero up to values comparable to the execution durations of individual nodes. The effect on allocation of communication network structures is demonstrated by performing allocations for crossbar (non-blocking) and star (blocking) networks. The algorithm assumes the availability of as many processors as it needs for the optimal allocation of any program graph. Hence, the focus of allocation has been on varying token-communication durations rather than varying the number of processors. The algorithm always utilizes as many processors as necessary for the optimal allocation of any program graph, depending upon granularity and characteristics of the interprocessor communication network.
Large-Scale Parallel Viscous Flow Computations using an Unstructured Multigrid Algorithm

Science.gov (United States)

Mavriplis, Dimitri J.

1999-01-01

The development and testing of a parallel unstructured agglomeration multigrid algorithm for steady-state aerodynamic flows is discussed. The agglomeration multigrid strategy uses a graph algorithm to construct the coarse multigrid levels from the given fine grid, similar to an algebraic multigrid approach, but operates directly on the non-linear system using the FAS (Full Approximation Scheme) approach. The scalability and convergence rate of the multigrid algorithm are examined on the SGI Origin 2000 and the Cray T3E. An argument is given which indicates that the asymptotic scalability of the multigrid algorithm should be similar to that of its underlying single grid smoothing scheme. For medium size problems involving several million grid points, near perfect scalability is obtained for the single grid algorithm, while only a slight drop-off in parallel efficiency is observed for the multigrid V- and W-cycles, using up to 128 processors on the SGI Origin 2000, and up to 512 processors on the Cray T3E. For a large problem using 25 million grid points, good scalability is observed for the multigrid algorithm using up to 1450 processors on a Cray T3E, even when the coarsest grid level contains fewer points than the total number of processors.
Exploration Of Deep Learning Algorithms Using OpenACC Parallel Programming Model

KAUST Repository

Hamam, Alwaleed A.

2017-03-13

Deep learning is based on a set of algorithms that attempt to model high-level abstractions in data. Specifically, the Restricted Boltzmann Machine (RBM) is the deep learning algorithm used in this project, whose aim is to improve its time performance through an efficient parallel implementation with OpenACC, applying the best possible optimizations to harness the massively parallel power of NVIDIA GPUs. GPU development in the last few years has contributed to the growth of deep learning. OpenACC is a directive-based approach to computing in which directives provide compiler hints to accelerate code. The traditional Restricted Boltzmann Machine is a stochastic neural network that essentially performs a binary version of factor analysis. The RBM is a useful neural network building block for larger modern deep learning models, such as the Deep Belief Network. RBM parameters are estimated using an efficient training method called Contrastive Divergence. Parallel implementations of RBM are available using different models such as OpenMP and CUDA, but this project is the first attempt to apply the OpenACC model to RBM.
Exploration Of Deep Learning Algorithms Using OpenACC Parallel Programming Model

KAUST Repository

Hamam, Alwaleed A.; Khan, Ayaz H.

2017-01-01

Deep learning is based on a set of algorithms that attempt to model high-level abstractions in data. Specifically, the Restricted Boltzmann Machine (RBM) is the deep learning algorithm used in this project, whose aim is to improve its time performance through an efficient parallel implementation with OpenACC, applying the best possible optimizations to harness the massively parallel power of NVIDIA GPUs. GPU development in the last few years has contributed to the growth of deep learning. OpenACC is a directive-based approach to computing in which directives provide compiler hints to accelerate code. The traditional Restricted Boltzmann Machine is a stochastic neural network that essentially performs a binary version of factor analysis. The RBM is a useful neural network building block for larger modern deep learning models, such as the Deep Belief Network. RBM parameters are estimated using an efficient training method called Contrastive Divergence. Parallel implementations of RBM are available using different models such as OpenMP and CUDA, but this project is the first attempt to apply the OpenACC model to RBM.
Optimization Algorithms for Calculation of the Joint Design Point in Parallel Systems

DEFF Research Database (Denmark)

Enevoldsen, I.; Sørensen, John Dalsgaard

1992-01-01

In large structures it is often necessary to estimate the reliability of the system by use of parallel systems. Optimality criteria-based algorithms for calculation of the joint design point in a parallel system are described and efficient active set strategies are developed. Three possible...
Eigenvalues calculation algorithms for λ-modes determination. Parallelization approach

Energy Technology Data Exchange (ETDEWEB)

Vidal, V. [Universidad Politecnica de Valencia (Spain). Departamento de Sistemas Informaticos y Computacion; Verdu, G.; Munoz-Cobo, J.L. [Universidad Politecnica de Valencia (Spain). Departamento de Ingenieria Quimica y Nuclear; Ginestart, D. [Universidad Politecnica de Valencia (Spain). Departamento de Matematica Aplicada

1997-03-01

In this paper, we review two methods to obtain the λ-modes of a nuclear reactor, the Subspace Iteration method and Arnoldi's method, which are popular methods to solve the partial eigenvalue problem for a given matrix. In the developed application for the neutron diffusion equation we include improved acceleration techniques for both methods. Also, we propose two parallelization approaches for these methods, a coarse grain parallelization and a fine grain one. We have tested the developed algorithms with two realistic problems, focusing on the efficiency of the methods according to the CPU times. (author).
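
To make the Subspace Iteration method named in this abstract concrete, here is a minimal pure-Python sketch for the k dominant eigenvalues of a small symmetric matrix. It is an illustration under simplifying assumptions (a standard rather than generalized eigenproblem, a dense symmetric matrix, no acceleration), not the authors' reactor code; the helper names and the test matrix are hypothetical.

```python
import math


def matvec(A, v):
    """Dense matrix-vector product."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def orthonormalize(vectors):
    """Modified Gram-Schmidt: keeps the iterated vectors from collapsing
    onto the single dominant eigenvector."""
    basis = []
    for v in vectors:
        w = list(v)
        for q in basis:
            c = dot(q, w)
            w = [wi - c * qi for wi, qi in zip(w, q)]
        norm = math.sqrt(dot(w, w))
        basis.append([wi / norm for wi in w])
    return basis


def subspace_iteration(A, k, iters=200):
    """Approximate the k largest-magnitude eigenvalues of symmetric A."""
    n = len(A)
    # Arbitrary, linearly independent starting block (an assumption).
    Q = orthonormalize(
        [[1.0 if i == j else 0.1 for i in range(n)] for j in range(k)]
    )
    for _ in range(iters):
        # The k products A*q are independent of each other, which is one
        # natural place for coarse-grain parallelism.
        Q = orthonormalize([matvec(A, q) for q in Q])
    # Rayleigh quotients of the converged orthonormal vectors.
    return [dot(q, matvec(A, q)) for q in Q]


A = [[2.0, 1.0, 0.0],
     [1.0, 2.0, 1.0],
     [0.0, 1.0, 2.0]]
print(subspace_iteration(A, 2))  # eigenvalues of A are 2 + sqrt(2), 2, 2 - sqrt(2)
```

Arnoldi's method, the other approach reviewed, builds a Krylov basis from a single starting vector instead of iterating a whole block, so it is not shown here.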
Analysis of Shrinkage on Thick Plate Part using Genetic Algorithm

Directory of Open Access Journals (Sweden)

Najihah S.N.

2016-01-01

Injection moulding is one of the most widely used processes in manufacturing plastic products. Since the quality of injection-moulded plastic parts is mostly influenced by process conditions, the method used to determine the optimum process conditions becomes the key to improving part quality. This paper presents a systematic methodology to analyse the shrinkage of the thick plate part during the injection moulding process. A Genetic Algorithm (GA) method was proposed to optimise the process parameters that would result in optimal solutions of the optimisation goals. Using the GA, the shrinkage of the thick plate part was improved by 39.1% in the parallel direction and 17.21% in the normal direction of melt flow.
An Adaptive Filtering Algorithm Based on Genetic Algorithm-Backpropagation Network

Directory of Open Access Journals (Sweden)

Kai Hu

2013-01-01

A new image filtering algorithm is proposed. The GA-BPN algorithm uses a genetic algorithm (GA) to decide the weights in a back-propagation neural network (BPN). It has better global optimization characteristics than traditional optimization algorithms. In this paper, we use GA-BPN for image noise filtering. First, training samples are used to train GA-BPN as the noise detector. Then, the well-trained GA-BPN is used to recognize noise pixels in the target image. Finally, an adaptive weighted average algorithm is used to recover the noise pixels recognized by GA-BPN. Experimental data show that this algorithm performs better than other filters.
Implementation of a parallel algorithm for spherical SN calculations on the IBM 3090

International Nuclear Information System (INIS)

Haghighat, A.; Lawrence, R.D.

1989-01-01

Parallel S_N algorithms based on domain decomposition in angle are straightforward to develop in Cartesian geometry because the computation of the angular fluxes for a specific discrete ordinate can be performed independently of all other angles. This is not the case for curvilinear geometries, where the angular redistribution component of the discretized streaming operator results in coupling between angular fluxes along adjacent discrete ordinates. Previously, the authors developed a parallel algorithm for S_N calculations in spherical geometry and examined its iterative convergence for criticality and detector problems with differing scattering/absorption ratios. In this paper, the authors describe the implementation of the algorithm on an IBM 3090 Model 400 (four processors) and present computational results illustrating the efficiency of the algorithm relative to serial execution.
Using Genetic Algorithms in Secured Business Intelligence Mobile Applications

Directory of Open Access Journals (Sweden)

Silvia TRIF

2011-01-01

The paper aims to assess the use of genetic algorithms for training neural networks used in secured Business Intelligence Mobile Applications. A comparison is made between the classic back-propagation method and genetic-algorithm-based training. The design of these algorithms is presented. A comparative study is carried out to determine the better way of training neural networks from the point of view of time and memory usage. The results show that genetic-algorithm-based training offers better performance and memory usage than back-propagation and is fit to be implemented on mobile devices.
A structured representation for parallel algorithm design on multicomputers

International Nuclear Information System (INIS)

Sun, Xian-He; Ni, L.M.

1991-01-01

Traditionally, parallel algorithms have been designed by brute force methods and fine-tuned on each architecture to achieve high performance. Rather than studying the design case by case, a systematic approach is proposed. A notation is first developed. Using this notation, most of the frequently used scientific and engineering applications can be presented by simple formulas. The formulas constitute the structured representation of the corresponding applications. The structured representation is simple, adequate and easy to understand. It also contains sufficient information about uneven allocation and communication latency degradations. With the structured representation, applications can be compared, classified and partitioned. Some of the basic building blocks, called computation models, of frequently used applications are identified and studied. Most applications are combinations of some computation models. The structured representation relates general applications to computation models. Studying computation models leads to a guideline for efficient parallel algorithm design for general applications. 6 refs., 7 figs
Algorithm comparison and benchmarking using a parallel spectral transform shallow water model

Energy Technology Data Exchange (ETDEWEB)

Worley, P.H. [Oak Ridge National Lab., TN (United States); Foster, I.T.; Toonen, B. [Argonne National Lab., IL (United States)

1995-04-01

In recent years, a number of computer vendors have produced supercomputers based on a massively parallel processing (MPP) architecture. These computers have been shown to be competitive in performance with conventional vector supercomputers for some applications. As spectral weather and climate models are heavy users of vector supercomputers, it is interesting to determine how these models perform on MPPs, and which MPPs are best suited to the execution of spectral models. The benchmarking of MPPs is complicated by the fact that different algorithms may be more efficient on different architectures. Hence, a comprehensive benchmarking effort must answer two related questions: which algorithm is most efficient on each computer and how do the most efficient algorithms compare on different computers. In general, these are difficult questions to answer because of the high cost associated with implementing and evaluating a range of different parallel algorithms on each MPP platform.
Parallel algorithm for determining motion vectors in ice floe images by matching edge features

Science.gov (United States)

Manohar, M.; Ramapriyan, H. K.; Strong, J. P.

1988-01-01

A parallel algorithm is described to determine motion vectors of ice floes using time sequences of images of the Arctic Ocean obtained from the Synthetic Aperture Radar (SAR) instrument flown on board the SEASAT spacecraft. Researchers describe a parallel algorithm which is implemented on the MPP for locating corresponding objects based on their translationally and rotationally invariant features. The algorithm first approximates the edges in the images by polygons or sets of connected straight-line segments. Each such edge structure is then reduced to a seed point. Associated with each seed point are the descriptions (lengths, orientations and sequence numbers) of the lines constituting the corresponding edge structure. A parallel matching algorithm is used to match packed arrays of such descriptions to identify corresponding seed points in the two images. The matching algorithm is designed such that fragmentation and merging of ice floes are taken into account by accepting partial matches. The technique has been demonstrated to work on synthetic test patterns and real image pairs from SEASAT in times ranging from 0.5 to 0.7 seconds for 128 x 128 images.
Parallel algorithms for 2-D cylindrical transport equations of Eigenvalue problem

International Nuclear Information System (INIS)

Wei, J.; Yang, S.

2013-01-01

This paper addresses the eigenvalue problem for the neutron transport equations in 2-D cylindrical geometry on unstructured grids. A discrete scheme combining Sn discrete ordinates with discontinuous finite elements is built, and parallel computation for the scheme is realized on MPI systems. Numerical experiments indicate that the designed parallel algorithm can reach perfect speedup and that it has good practicality and scalability. (authors)
Design of multiple sequence alignment algorithms on parallel, distributed memory supercomputers.

Science.gov (United States)

Church, Philip C; Goscinski, Andrzej; Holt, Kathryn; Inouye, Michael; Ghoting, Amol; Makarychev, Konstantin; Reumann, Matthias

2011-01-01

The challenge of comparing two or more genomes that have undergone recombination and substantial amounts of segmental loss and gain has recently been addressed for small numbers of genomes. However, datasets of hundreds of genomes are now common and their sizes will only increase in the future. Multiple sequence alignment of hundreds of genomes remains an intractable problem due to quadratic increases in compute time and memory footprint. To date, most alignment algorithms are designed for commodity clusters without parallelism. Hence, we propose the design of a multiple sequence alignment algorithm on massively parallel, distributed memory supercomputers to enable research into comparative genomics on large data sets. Following the methodology of the sequential progressiveMauve algorithm, we design data structures including sequences and sorted k-mer lists on the IBM Blue Gene/P supercomputer (BG/P). Preliminary results show that we can reduce the memory footprint so that we can potentially align over 250 bacterial genomes on a single BG/P compute node. We verify our results on a dataset of E. coli, Shigella and S. pneumoniae genomes. Our implementation returns results matching those of the original algorithm but in half the time and with one quarter of the memory footprint for scaffold building. In this study, we have laid the basis for multiple sequence alignment of large-scale datasets on a massively parallel, distributed memory supercomputer, thus enabling comparison of hundreds instead of a few genome sequences within reasonable time.
A parallel stereo reconstruction algorithm with applications in entomology (APSRA)

Science.gov (United States)

Bhasin, Rajesh; Jang, Won Jun; Hart, John C.

2012-03-01

We propose a fast parallel algorithm for the reconstruction of three-dimensional point clouds of insects from binocular stereo image pairs using a hierarchical approach for disparity estimation. Entomologists study various features of insects to classify them, build their distribution maps, and discover genetic links between specimens, among various other essential tasks. This information is important to the pesticide and pharmaceutical industries, among others. Given the large collections of insects that entomologists analyze, it becomes difficult to physically handle the entire collection and share the data with researchers across the world. With the method presented in our work, entomologists can create an image database for their collections and use the 3D models for studying the shape and structure of the insects, making the collections easier to maintain and share. Initial feedback shows that the reconstructed 3D models preserve the shape and size of the specimen. We further optimize our results to incorporate multiview stereo, which produces a better overall structure of the insects. Our main contribution is applying stereoscopic vision techniques to entomology to solve the problems faced by entomologists.

A multithreaded parallel implementation of a dynamic programming algorithm for sequence comparison.

Science.gov (United States)

Martins, W S; Del Cuvillo, J B; Useche, F J; Theobald, K B; Gao, G R

2001-01-01

This paper discusses the issues involved in implementing a dynamic programming algorithm for biological sequence comparison on a general-purpose parallel computing platform based on a fine-grain event-driven multithreaded program execution model. Fine-grain multithreading permits efficient parallelism exploitation in this application both by taking advantage of asynchronous point-to-point synchronizations and communication with low overheads and by effectively tolerating latency through the overlapping of computation and communication. We have implemented our scheme on EARTH, a fine-grain event-driven multithreaded execution and architecture model which has been ported to a number of parallel machines with off-the-shelf processors. Our experimental results show that the dynamic programming algorithm can be efficiently implemented on EARTH systems with high performance (e.g., speedup of 90 on 120 nodes), good programmability and reasonable cost.
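
The parallelism available in a dynamic programming table of this kind can be sketched by filling the table along anti-diagonals: every cell on one anti-diagonal depends only on cells from the two previous ones, so those cells could be computed concurrently. The sketch below uses longest-common-subsequence length as a simple stand-in for the paper's comparison score, which the abstract does not specify, and it runs sequentially; the function name is hypothetical and nothing here reproduces the EARTH implementation.

```python
def lcs_length_wavefront(a, b):
    """Longest-common-subsequence length, filled anti-diagonal by anti-diagonal."""
    n, m = len(a), len(b)
    H = [[0] * (m + 1) for _ in range(n + 1)]
    # Cells with i + j == d form one anti-diagonal (a "wavefront").
    for d in range(2, n + m + 1):
        lo = max(1, d - m)
        hi = min(n, d - 1)
        # All cells in this inner loop are mutually independent, so a
        # multithreaded runtime could assign them to different threads.
        for i in range(lo, hi + 1):
            j = d - i
            if a[i - 1] == b[j - 1]:
                H[i][j] = H[i - 1][j - 1] + 1
            else:
                H[i][j] = max(H[i - 1][j], H[i][j - 1])
    return H[n][m]


print(lcs_length_wavefront("AGGTAB", "GXTXAYB"))
```

In practice the table is usually partitioned into blocks rather than single cells, so that each thread does enough work to amortize synchronization; the dependency pattern between blocks is the same as between cells.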
An efficient parallel algorithm for the calculation of canonical MP2 energies.

Science.gov (United States)

Baker, Jon; Pulay, Peter

2002-09-01

We present the parallel version of a previous serial algorithm for the efficient calculation of canonical MP2 energies (Pulay, P.; Saebo, S.; Wolinski, K. Chem Phys Lett 2001, 344, 543). It is based on the Saebo-Almlöf direct-integral transformation, coupled with an efficient prescreening of the AO integrals. The parallel algorithm avoids synchronization delays by spawning a second set of slaves during the bin-sort prior to the second half-transformation. Results are presented for systems with up to 2000 basis functions. MP2 energies for molecules with 400-500 basis functions can be routinely calculated to microhartree accuracy on a small number of processors (6-8) in a matter of minutes with modern PC-based parallel computers. Copyright 2002 Wiley Periodicals, Inc. J Comput Chem 23: 1150-1156, 2002
An Optimization Algorithm for Multipath Parallel Allocation for Service Resource in the Simulation Task Workflow

Directory of Open Access Journals (Sweden)

Zhiteng Wang

2014-01-01

Service-oriented modeling and simulation is an active topic in the field of modeling and simulation, and service resources need to be called while a simulation task workflow is running. How to optimize service resource allocation so that the task is completed effectively is an important issue in this area. In the military modeling and simulation field, it is important to improve the probability of success and the timeliness of the simulation task workflow. Therefore, this paper proposes an optimization algorithm for multipath service resource parallel allocation, in which a multipath service resource parallel allocation model is built and a multiple-chain coding scheme quantum optimization algorithm is used to solve it. The multiple-chain coding scheme quantum optimization algorithm extends the parallel search space to improve search efficiency. Through simulation experiments, this paper investigates how the probability of success in the simulation task workflow is affected by different optimization algorithms, service allocation strategies, and numbers of paths. The simulation results show that the optimization algorithm for multipath service resource parallel allocation is an effective method to improve the probability of success and the timeliness of the simulation task workflow.
Hybrid Modeling KMeans – Genetic Algorithms in the Health Care Data

Directory of Open Access Journals (Sweden)

Tessy Badriyah

2013-06-01

K-Means is one of the major algorithms widely used in clustering due to its good computational performance. However, K-Means is very sensitive to the initially selected points, which are chosen randomly, and therefore it does not always generate optimum solutions. A genetic algorithm approach can be applied to solve this problem. In this research we examine the potential of applying hybrid GA-KMeans with a focus on the area of health care data. We propose a new technique using a hybrid method combining K-Means clustering and genetic algorithms, called the "Hybrid K-Means Genetic Algorithm" (HKGA). HKGA combines the power of genetic algorithms and the efficiency of K-Means clustering. We compare our results with other conventional algorithms and with other published research. Our results demonstrate that HKGA achieves very good results, in some cases superior to other methods. Keywords: Machine Learning, K-Means, Genetic Algorithms, Hybrid K-Means Genetic Algorithm (HKGA).
An efficient parallel algorithm for the solution of a tridiagonal linear system of equations

Science.gov (United States)

Stone, H. S.

1971-01-01

Tridiagonal linear systems of equations are solved on conventional serial machines in a time proportional to N, where N is the number of equations. The conventional algorithms do not lend themselves directly to parallel computations on computers of the ILLIAC IV class, in the sense that they appear to be inherently serial. An efficient parallel algorithm is presented in which computation time grows as log_2 N. The algorithm is based on recursive doubling solutions of linear recurrence relations, and can be used to solve recurrence relations of all orders.
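
Recursive doubling can be illustrated on the simplest case, a first-order linear recurrence x_i = a_i x_{i-1} + b_i with x_0 = b_0. Each equation is an affine map, composing affine maps is associative, and so the span covered by each equation can be doubled in every round. The sketch below is a sequential simulation of that idea and covers only this first-order case, not the full tridiagonal solver described in the abstract; the function name and the example coefficients are assumptions.

```python
def solve_recurrence_doubling(a, b):
    """Solve x[i] = a[i] * x[i-1] + b[i], x[0] = b[0], by recursive doubling.

    Returns (x, rounds). a[0] is ignored. Within one round every update is
    independent, so with one processor per index each round takes constant
    time and about log2(N) rounds suffice.
    """
    n = len(b)
    A = [0.0] + list(a[1:])  # x[0] has no predecessor
    B = list(b)
    step, rounds = 1, 0
    while step < n:
        new_A, new_B = A[:], B[:]
        for i in range(step, n):  # conceptually a parallel loop
            # Compose equation i with the equation `step` positions earlier.
            new_A[i] = A[i] * A[i - step]
            new_B[i] = A[i] * B[i - step] + B[i]
        A, B = new_A, new_B
        step *= 2
        rounds += 1
    return B, rounds


x, rounds = solve_recurrence_doubling([0, 2, 2, 2, 2], [1, 1, 1, 1, 1])
print(x, rounds)
```

The total arithmetic is larger than in the serial loop (order N log N rather than N), which is the usual trade made by recursive doubling: more work in exchange for a logarithmic number of dependent steps.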
Parallel Algorithm for Incremental Betweenness Centrality on Large Graphs

KAUST Repository

Jamour, Fuad Tarek

2017-10-17

Betweenness centrality quantifies the importance of nodes in a graph in many applications, including network analysis, community detection and identification of influential users. Typically, graphs in such applications evolve over time. Thus, the computation of betweenness centrality should be performed incrementally. This is challenging because updating even a single edge may trigger the computation of all-pairs shortest paths in the entire graph. Existing approaches cannot scale to large graphs: they either require excessive memory (i.e., quadratic in the size of the input graph) or perform unnecessary computations, rendering them prohibitively slow. We propose iCentral, a novel incremental algorithm for computing betweenness centrality in evolving graphs. We decompose the graph into biconnected components and prove that processing can be localized within the affected components. iCentral is the first algorithm to support incremental betweenness centrality computation within a graph component. This is done efficiently, in linear space; consequently, iCentral scales to large graphs. We demonstrate with real datasets that the serial implementation of iCentral is up to 3.7 times faster than existing serial methods. Our parallel implementation, which scales to large graphs, is an order of magnitude faster than the state-of-the-art parallel algorithm, while using an order of magnitude less computational resources.
Genetic Algorithm Based Economic Dispatch with Valve Point Effect

Energy Technology Data Exchange (ETDEWEB)

Park, Jong Nam; Park, Kyung Won; Kim, Ji Hong; Kim, Jin O [Hanyang University (Korea, Republic of)

1999-03-01

This paper presents a new genetic algorithm approach to the economic dispatch problem with valve point discontinuities. The proposed approach improves performance on this problem through an improved death penalty method, generation-apart elitism, atavism and sexual selection with sexual distinction. Numerical results on a test system consisting of 13 thermal units show that the proposed approach is faster, more robust and more powerful than conventional genetic algorithms. (author). 8 refs., 10 figs.
NETRA: A parallel architecture for integrated vision systems 2: Algorithms and performance evaluation

Science.gov (United States)

Choudhary, Alok N.; Patel, Janak H.; Ahuja, Narendra

1989-01-01

In part 1 architecture of NETRA is presented. A performance evaluation of NETRA using several common vision algorithms is also presented. Performance of algorithms when they are mapped on one cluster is described. It is shown that SIMD, MIMD, and systolic algorithms can be easily mapped onto processor clusters, and almost linear speedups are possible. For some algorithms, analytical performance results are compared with implementation performance results. It is observed that the analysis is very accurate. Performance analysis of parallel algorithms when mapped across clusters is presented. Mappings across clusters illustrate the importance and use of shared as well as distributed memory in achieving high performance. The parameters for evaluation are derived from the characteristics of the parallel algorithms, and these parameters are used to evaluate the alternative communication strategies in NETRA. Furthermore, the effect of communication interference from other processors in the system on the execution of an algorithm is studied. Using the analysis, performance of many algorithms with different characteristics is presented. It is observed that if communication speeds are matched with the computation speeds, good speedups are possible when algorithms are mapped across clusters.
Speeding Up the String Comparison of the IDS Snort using Parallel Programming: A Systematic Literature Review on the Parallelized Aho-Corasick Algorithm

Directory of Open Access Journals (Sweden)

SILVA JUNIOR,J. B.

2016-12-01

The Intrusion Detection System (IDS) needs to compare the contents of all packets arriving at the network interface with a set of signatures indicating possible attacks, a task that consumes much CPU processing time. In order to alleviate this problem, some researchers have tried to parallelize the IDS's comparison engine, transferring execution from the CPU to the GPU. This paper identifies and maps the parallelization features of the Aho-Corasick algorithm, which is used in Snort to compare patterns, in order to show this algorithm's implementation and execution issues, as well as optimization techniques for the Aho-Corasick machine. We found 147 papers in important computer science publication databases and mapped them. We selected 22 and analyzed them to obtain our results. Our analysis of the papers showed, among other results, that parallelization of the AC algorithm is a new task and that authors have focused on the State Transition Table as the most common way to implement the algorithm on the GPU. Furthermore, we found that some techniques that speed up the algorithm and reduce the required machine storage space are widely used, such as running the algorithm in the fastest memories and mechanisms for reducing the number of nodes and for bitmapping.
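
The State Transition Table mentioned in this abstract can be sketched as follows: the Aho-Corasick trie and its failure links are folded into one table that gives the next state for every (state, character) pair, so matching needs exactly one lookup per input character. This is a minimal CPU-side illustration, not Snort's implementation or any of the surveyed GPU versions; the function names and the dictionary-based table layout are assumptions.

```python
from collections import deque


def build_stt(patterns):
    """Build a full state transition table (stt) and per-state output sets."""
    alphabet = sorted({ch for p in patterns for ch in p})
    goto, out = [{}], [set()]
    for p in patterns:  # 1. insert every pattern into a trie
        s = 0
        for ch in p:
            if ch not in goto[s]:
                goto.append({})
                out.append(set())
                goto[s][ch] = len(goto) - 1
            s = goto[s][ch]
        out[s].add(p)

    fail = [0] * len(goto)
    stt = [{} for _ in goto]
    queue = deque()
    for ch in alphabet:  # 2. root row: missing edges loop back to the root
        t = goto[0].get(ch, 0)
        stt[0][ch] = t
        if t:
            queue.append(t)
    while queue:  # 3. breadth-first fill of the remaining rows
        s = queue.popleft()
        for ch in alphabet:
            if ch in goto[s]:
                t = goto[s][ch]
                fail[t] = stt[fail[s]][ch]
                out[t] |= out[fail[t]]  # inherit matches ending at the suffix
                stt[s][ch] = t
                queue.append(t)
            else:
                stt[s][ch] = stt[fail[s]][ch]  # failure link folded in
    return stt, out


def search(text, stt, out):
    """Return sorted (start_index, pattern) pairs for every match in text."""
    s, hits = 0, []
    for i, ch in enumerate(text):
        s = stt[s].get(ch, 0)  # characters outside the alphabet reset to root
        hits.extend((i - len(p) + 1, p) for p in out[s])
    return sorted(hits)


stt, out = build_stt(["he", "she", "his", "hers"])
print(search("ushers", stt, out))
```

Because the table removes the data-dependent failure-link loop, every input position costs the same amount of work, which is one reason a table of this shape suits GPU threads that each scan a chunk of the packet data; the cost is memory proportional to states times alphabet size.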
Parallel Quasi Newton Algorithms for Large Scale Non Linear Unconstrained Optimization

International Nuclear Information System (INIS)

Rahman, M. A.; Basarudin, T.

1997-01-01

This paper discusses Quasi-Newton (QN) methods for solving nonlinear unconstrained minimization problems. An important aspect of the QN method is the choice of the matrix H_k, which must be positive definite and satisfy the QN condition. Our interest here is in parallel QN methods suited to the solution of large-scale optimization problems. QN methods become less attractive for large-scale problems because of their storage and computational requirements. However, it is often the case that the Hessian is a sparse matrix. In this paper we include a mechanism for reducing the cost of the Hessian update while preserving the Hessian properties. One major motivation for our research is that the QN method may be good at solving certain types of minimization problems, but its efficiency degenerates when it is applied to other categories of problems. For this reason, we use an algorithm containing several direction strategies that are processed in parallel. We attempt to parallelize the algorithm by exploring different search directions, generated by various QN updates, during the minimization process. Different line search strategies are employed simultaneously in the process of locating the minimum along each direction. The code for the algorithm will be written in the Occam 2 language and run on a transputer machine.
Parallelizing Gene Expression Programming Algorithm in Enabling Large-Scale Classification

Directory of Open Access Journals (Sweden)

Lixiong Xu

2017-01-01

As one of the most effective function mining algorithms, the Gene Expression Programming (GEP) algorithm has been widely used in classification, pattern recognition, prediction, and other research fields. Based on self-evolution, GEP is able to mine an optimal function for dealing with further complicated tasks. However, in big data research, GEP suffers from low efficiency due to its long mining processes. To improve the efficiency of GEP in big data research, especially for processing large-scale classification tasks, this paper presents a parallelized GEP algorithm using the MapReduce computing model. The experimental results show that the presented algorithm is scalable and efficient for processing large-scale classification tasks.
Parallel-Vector Algorithm For Rapid Structural Analysis

Science.gov (United States)

Agarwal, Tarun R.; Nguyen, Duc T.; Storaasli, Olaf O.

1993-01-01

New algorithm developed to overcome deficiency of skyline storage scheme by use of variable-band storage scheme. Exploits both parallel and vector capabilities of modern high-performance computers. Gives engineers and designers opportunity to include more design variables and constraints during optimization of structures. Enables use of more refined finite-element meshes to obtain improved understanding of complex behaviors of aerospace structures leading to better, safer designs. Not only attractive for current supercomputers but also for next generation of shared-memory supercomputers.
Implementation of a Monte Carlo algorithm for neutron transport on a massively parallel SIMD machine

International Nuclear Information System (INIS)

Baker, R.S.

1992-01-01

We present some results from the recent adaptation of a vectorized Monte Carlo algorithm to a massively parallel architecture. The performance of the algorithm on a single-processor Cray Y-MP and on the Thinking Machines Corporation CM-2 and CM-200 is compared for several test problems. The results show that significant speedups are obtainable for vectorized Monte Carlo algorithms on massively parallel machines, even when the algorithms are applied to realistic problems which require extensive variance reduction. However, the architecture of the Connection Machine does place some limitations on the regime in which the Monte Carlo algorithm may be expected to perform well.
Implementation of a Monte Carlo algorithm for neutron transport on a massively parallel SIMD machine

International Nuclear Information System (INIS)

Baker, R.S.

1993-01-01

We present some results from the recent adaptation of a vectorized Monte Carlo algorithm to a massively parallel architecture. The performance of the algorithm on a single-processor Cray Y-MP and on the Thinking Machines Corporation CM-2 and CM-200 is compared for several test problems. The results show that significant speedups are obtainable for vectorized Monte Carlo algorithms on massively parallel machines, even when the algorithms are applied to realistic problems which require extensive variance reduction. However, the architecture of the Connection Machine does place some limitations on the regime in which the Monte Carlo algorithm may be expected to perform well. (orig.)
Development of a Framework for Genetic Algorithms

OpenAIRE

Wååg, Håkan

2009-01-01

Genetic algorithms are a method of optimization that can be used to solve many different kinds of problems. This thesis focuses on developing a framework for genetic algorithms that is capable of solving at least the two problems explored in the work. Other problems are supported by allowing user-made extensions. The purpose of this thesis is to explore the possibilities of genetic algorithms for optimization problems and artificial intelligence applications. To test the framework two applications are...
Evolving temporal association rules with genetic algorithms

OpenAIRE

Matthews, Stephen G.; Gongora, Mario A.; Hopgood, Adrian A.

2010-01-01

A novel framework for mining temporal association rules by discovering itemsets with a genetic algorithm is introduced. Metaheuristics have been applied to association rule mining; we show the efficacy of extending this to another variant, temporal association rule mining. Our framework is an enhancement to existing temporal association rule mining methods as it employs a genetic algorithm to simultaneously search the rule space and temporal space. A methodology for validating the ability of...
Genetic Algorithms in Wind Turbine Airfoil Design

Energy Technology Data Exchange (ETDEWEB)

Grasso, F. [ECN Wind Energy, Petten (Netherlands)]; Bizzarrini, N.; Coiro, D.P. [Department of Aerospace Engineering, University of Napoli 'Federico II', Napoli (Italy)]

2011-03-15

One key element in the aerodynamic design of wind turbines is the use of specially tailored airfoils to increase the ratio of energy capture to the loading and thereby to reduce cost of energy. This work is focused on the design of a wind turbine airfoil by using numerical optimization. Firstly, the optimization approach is presented; a genetic algorithm is used, coupled with RFOIL solver and a composite Bezier geometrical parameterization. A particularly sensitive point is the choice and implementation of constraints; in order to formalize in the most complete and effective way the design requirements, the effects of activating specific constraints are discussed. A numerical example regarding the design of a high efficiency airfoil for the outer part of a blade by using genetic algorithms is illustrated and the results are compared with existing wind turbine airfoils. Finally a new hybrid design strategy is illustrated and discussed, in which the genetic algorithms are used at the beginning of the design process to explore a wide domain. Then, the gradient based algorithms are used in order to improve the first stage optimum.
High-speed parallel implementation of a modified PBR algorithm on DSP-based EH topology

Science.gov (United States)

Rajan, K.; Patnaik, L. M.; Ramakrishna, J.

1997-08-01

Algebraic Reconstruction Technique (ART) is an age-old method used for solving the problem of three-dimensional (3-D) reconstruction from projections in electron microscopy and radiology. In medical applications, direct 3-D reconstruction is at the forefront of investigation. The simultaneous iterative reconstruction technique (SIRT) is an ART-type algorithm with the potential of generating in a few iterations tomographic images of a quality comparable to that of convolution backprojection (CBP) methods. Pixel-based reconstruction (PBR) is similar to SIRT reconstruction, and it has been shown that PBR algorithms give better quality pictures compared to those produced by SIRT algorithms. In this work, we propose a few modifications to the PBR algorithms. The modified algorithms are shown to give better quality pictures compared to PBR algorithms. The PBR algorithm and the modified PBR algorithms are highly compute intensive. Not many attempts have been made to reconstruct objects in the true 3-D sense because of the high computational overhead. In this study, we have developed parallel two-dimensional (2-D) and 3-D reconstruction algorithms based on modified PBR. We attempt to solve the two problems encountered by the PBR and modified PBR algorithms, i.e., the long computational time and the large memory requirements, by parallelizing the algorithm on a multiprocessor system. We investigate the possible task and data partitioning schemes by exploiting the potential parallelism in the PBR algorithm subject to minimizing the memory requirement. We have implemented an extended hypercube (EH) architecture for the high-speed execution of the 3-D reconstruction algorithm using the commercially available fast floating point digital signal processor (DSP) chips as the processing elements (PEs) and dual-port random access memories (DPR) as channels between the PEs. We discuss and compare the performances of the PBR algorithm on an IBM 6000 RISC workstation, on a Silicon
Solving the Flood Propagation Problem with Newton Algorithm on Parallel Systems

Directory of Open Access Journals (Sweden)

Chefi Triki

2012-04-01

In this paper we propose a parallel implementation for the flood propagation method Flo2DH. The model is built on a finite element spatial approximation combined with a Newton algorithm that uses a direct LU linear solver. The parallel implementation has been developed by using the standard MPI protocol and has been tested on a set of real world problems.
Optimization of Multiple Traveling Salesman Problem Based on Simulated Annealing Genetic Algorithm

Directory of Open Access Journals (Sweden)

Xu Mingji

2017-01-01

It is very effective to solve the multi-variable optimization problem by using a hierarchical genetic algorithm. This thesis analyzes both the advantages and disadvantages of the hierarchical genetic algorithm and puts forward an improved simulated annealing genetic algorithm. The new algorithm is applied to solve the multiple traveling salesman problem, which can improve the performance of the solution. First, it improves the design of the chromosomes' hierarchical structure in terms of the redundant hierarchical algorithm, and it suggests a suffix design of chromosomes. Second, concerning the premature convergence problems of the genetic algorithm, it proposes a self-identify crossover operator and mutation. Third, to address the weak local search ability of the genetic algorithm, it stretches the fitness by mixing the genetic algorithm with the simulated annealing algorithm. Fourth, it simulates problems with N traveling salesmen and M cities so as to verify its feasibility. The simulation and calculation show that this improved algorithm can converge quickly to a global best solution, which means the algorithm is encouraging for practical use.

Introduction to genetic algorithms as a modeling tool

International Nuclear Information System (INIS)

Wildberger, A.M.; Hickok, K.A.

1990-01-01

Genetic algorithms are search and classification techniques modeled on natural adaptive systems. This is an introduction to their use as a modeling tool with emphasis on prospects for their application in the power industry. It is intended to provide enough background information for its audience to begin to follow technical developments in genetic algorithms and to recognize those which might impact on electric power engineering. Beginning with a discussion of genetic algorithms and their origin as a model of biological adaptation, their advantages and disadvantages are described in comparison with other modeling tools such as simulation and neural networks in order to provide guidance in selecting appropriate applications. In particular, their use is described for improving expert systems from actual data and they are suggested as an aid in building mathematical models. Using the Thermal Performance Advisor as an example, it is suggested how genetic algorithms might be used to make a conventional expert system and mathematical model of a power plant adapt automatically to changes in the plant's characteristics
An efficient implementation of a backpropagation learning algorithm on quadrics parallel supercomputer

International Nuclear Information System (INIS)

Taraglio, S.; Massaioli, F.

1995-08-01

A parallel implementation of a library to build and train Multi Layer Perceptrons via the Back Propagation algorithm is presented. The target machine is the SIMD massively parallel supercomputer Quadrics. Performance measures are provided on three different machines with different numbers of processors, for two network examples. A sample source code is given.
An Improved Parallel DNA Algorithm of 3-SAT

Directory of Open Access Journals (Sweden)

Wei Liu

2007-09-01

There are many large-size and difficult computational problems in mathematics and computer science. For many of these problems, traditional computers cannot handle the mass of data in acceptable timeframes, which we call an NP problem. DNA computing is a means of solving a class of intractable computational problems in which the computing time grows exponentially with problem size. This paper proposes a parallel algorithm model for the universal 3-SAT problem based on the Adleman-Lipton model and applies biological operations to handling the mass of data in solution space. In this manner, we can control the run time of the algorithm to be finite and approximately constant.
Robust reactor power control system design by genetic algorithm

Energy Technology Data Exchange (ETDEWEB)

Lee, Yoon Joon; Cho, Kyung Ho; Kim, Sin [Cheju National University, Cheju (Korea, Republic of)

1998-12-31

The H∞ robust controller for the reactor power control system is designed by use of the mixed weight sensitivity. The system is configured into the typical two-port model with which the weight functions are augmented. Since the solution depends on the weighting functions and the problem is nonconvex, the genetic algorithm is used to determine the weighting functions. The cost function applied in the genetic algorithm permits the direct control of the power tracking performances. In addition, the actual operating constraints such as rod velocity and acceleration can be treated as design parameters. Compared with the conventional approach, the controller designed by the genetic algorithm achieves better performance under the realistic constraints. Also, it is found that the genetic algorithm could be used as an effective tool in the robust design. 4 refs., 6 figs. (Author)
Robust reactor power control system design by genetic algorithm

Energy Technology Data Exchange (ETDEWEB)

Lee, Yoon Joon; Cho, Kyung Ho; Kim, Sin [Cheju National University, Cheju (Korea, Republic of)

1997-12-31

The H∞ robust controller for the reactor power control system is designed by use of the mixed weight sensitivity. The system is configured into the typical two-port model with which the weight functions are augmented. Since the solution depends on the weighting functions and the problem is nonconvex, the genetic algorithm is used to determine the weighting functions. The cost function applied in the genetic algorithm permits the direct control of the power tracking performances. In addition, the actual operating constraints such as rod velocity and acceleration can be treated as design parameters. Compared with the conventional approach, the controller designed by the genetic algorithm achieves better performance under the realistic constraints. Also, it is found that the genetic algorithm could be used as an effective tool in the robust design. 4 refs., 6 figs. (Author)
Research and Applications of Shop Scheduling Based on Genetic Algorithms

Directory of Open Access Journals (Sweden)

Hang ZHAO

Shop scheduling is an important factor affecting the efficiency of production. Efficient scheduling methods, together with research on and application of optimization technology, play an important role for manufacturing enterprises in improving production efficiency, reducing production costs and many other aspects. Existing studies have shown that improved genetic algorithms have overcome the limitations of the basic genetic algorithm, that the objective function is able to meet customers' needs for shop scheduling, and that future research should focus on combining the genetic algorithm with other optimization algorithms. In this paper, in order to overcome the early convergence of the genetic algorithm and resolve the local minimization problem in the search process, an improved cyclic search genetic algorithm is put forward for the mixed flow shop scheduling problem, and the chromosome coding method and corresponding operations are given. The operation inherits the optimal individual of the previous generation and is able to avoid the emergence of local minima; the cyclic and crossover operations and the mutation operation can enhance the diversity of the population and then quickly find the optimal individual. The effectiveness of the algorithm is validated. Experimental results show that the improved algorithm avoids local minima well and converges rapidly.
PARALLEL ADAPTIVE MULTILEVEL SAMPLING ALGORITHMS FOR THE BAYESIAN ANALYSIS OF MATHEMATICAL MODELS

KAUST Repository

Prudencio, Ernesto; Cheung, Sai Hung

2012-01-01

In recent years, Bayesian model updating techniques based on measured data have been applied to many engineering and applied science problems. At the same time, parallel computational platforms are becoming increasingly more powerful and are being used more frequently by the engineering and scientific communities. Bayesian techniques usually require the evaluation of multi-dimensional integrals related to the posterior probability density function (PDF) of uncertain model parameters. The fact that such integrals cannot be computed analytically motivates the research of stochastic simulation methods for sampling posterior PDFs. One such algorithm is the adaptive multilevel stochastic simulation algorithm (AMSSA). In this paper we discuss the parallelization of AMSSA, formulating the necessary load balancing step as a binary integer programming problem. We present a variety of results showing the effectiveness of load balancing on the overall performance of AMSSA in a parallel computational environment.
Genetic algorithm essentials

CERN Document Server

Kramer, Oliver

2017-01-01

This book introduces readers to genetic algorithms (GAs) with an emphasis on making the concepts, algorithms, and applications discussed as easy to understand as possible. Further, it avoids a great deal of formalisms and thus opens the subject to a broader audience in comparison to manuscripts overloaded by notations and equations. The book is divided into three parts, the first of which provides an introduction to GAs, starting with basic concepts like evolutionary operators and continuing with an overview of strategies for tuning and controlling parameters. In turn, the second part focuses on solution space variants like multimodal, constrained, and multi-objective solution spaces. Lastly, the third part briefly introduces theoretical tools for GAs, the intersections and hybridizations with machine learning, and highlights selected promising applications.
A Hybrid Genetic Algorithm Approach for Optimal Power Flow

Directory of Open Access Journals (Sweden)

Sydulu Maheswarapu

2011-08-01

This paper puts forward a reformed hybrid genetic algorithm (GA) based approach to the optimal power flow. In the approach followed here, continuous variables are designed using real-coded GA and discrete variables are processed as binary strings. The outcomes are compared with many other methods such as the simple genetic algorithm (GA), adaptive genetic algorithm (AGA), differential evolution (DE), particle swarm optimization (PSO) and music based harmony search (MBHS) on an IEEE 30-bus test bed with a total load of 283.4 MW. The proposed algorithm is found to offer the lowest fuel cost. The proposed method is also found to be computationally faster, robust, superior and promising from its convergence characteristics.
Genetic Algorithm Applied to the Eigenvalue Equalization Filtered-x LMS Algorithm (EE-FXLMS)

Directory of Open Access Journals (Sweden)

Stephan P. Lovstedt

2008-01-01

The FXLMS algorithm, used extensively in active noise control (ANC), exhibits frequency-dependent convergence behavior. This leads to degraded performance for time-varying tonal noise and noise with multiple stationary tones. Previous work by the authors proposed the eigenvalue equalization filtered-x least mean squares (EE-FXLMS) algorithm. For that algorithm, magnitude coefficients of the secondary path transfer function are modified to decrease variation in the eigenvalues of the filtered-x autocorrelation matrix, while preserving the phase, giving faster convergence and increasing overall attenuation. This paper revisits the EE-FXLMS algorithm, using a genetic algorithm to find magnitude coefficients that give the least variation in eigenvalues. This method overcomes some of the problems with implementing the EE-FXLMS algorithm arising from finite resolution of sampled systems. Experimental control results using the original secondary path model, and a modified secondary path model for both the previous implementation of EE-FXLMS and the genetic algorithm implementation are compared.
Field Programmable Gate Array Based Parallel Strapdown Algorithm Design for Strapdown Inertial Navigation Systems

Directory of Open Access Journals (Sweden)

Long-Hua Ma

2011-08-01

A new generalized optimum strapdown algorithm with coning and sculling compensation is presented, in which the position, velocity and attitude updating operations are carried out based on the single-speed structure in which all computations are executed at a single updating rate that is sufficiently high to accurately account for high frequency angular rate and acceleration rectification effects. Different from existing algorithms, the updating rates of the coning and sculling compensations are unrelated to the number of gyro incremental angle samples and the number of accelerometer incremental velocity samples. When the output sampling rate of the inertial sensors remains constant, this algorithm allows increasing the updating rate of the coning and sculling compensation, yet with more gyro incremental angle and accelerometer incremental velocity samples, in order to improve the accuracy of the system. Then, in order to implement the new strapdown algorithm in a single FPGA chip, the parallelization of the algorithm is designed and its computational complexity is analyzed. The performance of the proposed parallel strapdown algorithm is tested on the Xilinx ISE 12.3 software platform and the FPGA device XC6VLX550T hardware platform on the basis of some fighter data. It is shown that this parallel strapdown algorithm on the FPGA platform can greatly decrease the execution time of the algorithm to meet the real-time and high precision requirements of the system in a high dynamic environment, relative to the existing implementation on the DSP platform.
A parallel algorithm for transient solid dynamics simulations with contact detection

International Nuclear Information System (INIS)

Attaway, S.; Hendrickson, B.; Plimpton, S.; Gardner, D.; Vaughan, C.; Heinstein, M.; Peery, J.

1996-01-01

Solid dynamics simulations with Lagrangian finite elements are used to model a wide variety of problems, such as the calculation of impact damage to shipping containers for nuclear waste and the analysis of vehicular crashes. Using parallel computers for these simulations has been hindered by the difficulty of searching efficiently for material surface contacts in parallel. A new parallel algorithm for calculation of arbitrary material contacts in finite element simulations has been developed and implemented in the PRONTO3D transient solid dynamics code. This paper will explore some of the issues involved in developing efficient, portable, parallel finite element models for nonlinear transient solid dynamics simulations. The contact-detection problem poses interesting challenges for efficient implementation of a solid dynamics simulation on a parallel computer. The finite element mesh is typically partitioned so that each processor owns a localized region of the finite element mesh. This mesh partitioning is optimal for the finite element portion of the calculation since each processor must communicate only with the few connected neighboring processors that share boundaries with the decomposed mesh. However, contacts can occur between surfaces that may be owned by any two arbitrary processors. Hence, a global search across all processors is required at every time step to search for these contacts. Load-imbalance can become a problem since the finite element decomposition divides the volumetric mesh evenly across processors but typically leaves the surface elements unevenly distributed. In practice, these complications have been limiting factors in the performance and scalability of transient solid dynamics on massively parallel computers. In this paper the authors present a new parallel algorithm for contact detection that overcomes many of these limitations
Portfolio selection using genetic algorithms | Yahaya | International ...

African Journals Online (AJOL)

In this paper, one of the nature-inspired evolutionary algorithms, a Genetic Algorithm (GA), was used in solving the portfolio selection problem (PSP). Based on a real dataset from a popular stock market, the performance of the algorithm in relation to those obtained from one of the popular quadratic programming (QP) ...
Parallelization of MCNP4 code by using simple FORTRAN algorithms

International Nuclear Information System (INIS)

Yazid, P.I.; Takano, Makoto; Masukawa, Fumihiro; Naito, Yoshitaka.

1993-12-01

Simple FORTRAN algorithms that rely only on open, close, read and write statements, together with disk files and some UNIX commands, have been applied to the parallelization of MCNP4. The code, named MCNPNFS, maintains almost all capabilities of MCNP4 in solving shielding problems. It is able to perform parallel computing on any set of UNIX workstations connected by a network, regardless of the heterogeneity in hardware system, provided that all processors produce a binary file in the same format. Further, it is confirmed that MCNPNFS can also be executed on the Monte-4 vector-parallel computer. MCNPNFS has been tested intensively by executing 5 photon-neutron benchmark problems, a spent fuel cask problem and 17 sample problems included in the original code package of MCNP4. Three different workstations, connected by a network, have been used to execute MCNPNFS in parallel. By measuring CPU time, the parallel efficiency is determined to be 58% to 99%, and 86% on average. On Monte-4, MCNPNFS has been executed using 4 processors concurrently and has achieved a parallel efficiency of 79% on average. (author)
Parallel O(log n) algorithms for open- and closed-chain rigid multibody systems based on a new mass matrix factorization technique

Science.gov (United States)

Fijany, Amir

1993-01-01

In this paper, parallel O(log n) algorithms for computation of rigid multibody dynamics are developed. These parallel algorithms are derived by parallelization of new O(n) algorithms for the problem. The underlying feature of these O(n) algorithms is a drastically different strategy for decomposition of interbody force which leads to a new factorization of the mass matrix (M). Specifically, it is shown that a factorization of the inverse of the mass matrix in the form of the Schur Complement is derived as $M^{-1} = C - B^{*} A^{-1} B$, wherein matrices C, A, and B are block tridiagonal matrices. The new O(n) algorithm is then derived as a recursive implementation of this factorization of $M^{-1}$. For the closed-chain systems, similar factorizations and O(n) algorithms for computation of the Operational Space Mass Matrix $\Lambda$ and its inverse $\Lambda^{-1}$ are also derived. It is shown that these O(n) algorithms are strictly parallel, that is, they are less efficient than other algorithms for serial computation of the problem. But, to our knowledge, they are the only known algorithms that can be parallelized and that lead to both time- and processor-optimal parallel algorithms for the problem, i.e., parallel O(log n) algorithms with O(n) processors. The developed parallel algorithms, in addition to their theoretical significance, are also practical from an implementation point of view due to their simple architectural requirements.
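For reference, the factorization quoted in this abstract has the shape of the standard Schur complement. The block matrix K below is notation introduced here for illustration only; the abstract does not say how A, B, and C are assembled.

```latex
K = \begin{pmatrix} A & B \\ B^{*} & C \end{pmatrix},
\qquad
K / A \;=\; C - B^{*} A^{-1} B
\quad (A \text{ invertible}),
\qquad\text{so the abstract's claim reads}\qquad
M^{-1} = K / A .
```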
Parallel algorithm for dominant points correspondences in robot binocular stereo vision

Science.gov (United States)

Al-Tammami, A.; Singh, B.

1993-01-01

This paper presents an algorithm to find the correspondences of points representing dominant features in robot stereo vision. The algorithm consists of two main steps: dominant point extraction and dominant point matching. In the feature extraction phase, the algorithm utilizes the widely used Moravec Interest Operator and two other operators: the Prewitt Operator and a new operator called the Gradient Angle Variance Operator. The Interest Operator in the Moravec algorithm was used to exclude featureless areas and simple edges which are oriented in the vertical, horizontal, and two diagonal directions. It was incorrectly detecting points on edges which are not in the four main directions (vertical, horizontal, and two diagonals). The new algorithm uses the Prewitt operator to exclude featureless areas, so that the Interest Operator is applied only on the edges to exclude simple edges and to leave interesting points. This modification speeds up the extraction process by approximately 5 times. The Gradient Angle Variance (GAV), an operator which calculates the variance of the gradient angle in a window around the point under concern, is then applied on the interesting points to exclude the redundant ones and leave the actual dominant ones. The matching phase is performed after the extraction of the dominant points in both stereo images. The matching starts with dominant points in the left image and does a local search, looking for corresponding dominant points in the right image. The search is geometrically constrained by the epipolar line of the parallel-axes stereo geometry and the maximum disparity of the application environment. If only one dominant point in the right image lies in the search area, then it is the corresponding point of the reference dominant point in the left image. A parameter provided by the GAV is thresholded and used as a rough similarity measure to select the corresponding dominant point if there is more than one point in the search area. The correlation is used as
Parallel Algorithm for Solving TOV Equations for Sequence of Cold and Dense Nuclear Matter Models

Science.gov (United States)

Ayriyan, Alexander; Buša, Ján; Grigorian, Hovik; Poghosyan, Gevorg

2018-04-01

We introduce a parallel algorithm for simulating neutron star configurations for a set of equation of state (EoS) models. The performance of the parallel algorithm has been investigated for a test set of EoS models on two computational systems. It scales when used with MPI on modern CPUs, and this investigation also allowed us to compare two different types of computational nodes.
Algorithms for a parallel implementation of Hidden Markov Models with a small state space

DEFF Research Database (Denmark)

Nielsen, Jesper; Sand, Andreas

2011-01-01

Two of the most important algorithms for Hidden Markov Models are the forward and the Viterbi algorithms. We show how formulating these using linear algebra naturally lends itself to parallelization. Although the obtained algorithms are slow for Hidden Markov Models with large state spaces...
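The linear-algebra view of the forward algorithm mentioned in this abstract can be sketched as follows. This is a minimal illustration in plain Python, not the authors' implementation: the names `mat_vec` and `forward_likelihood` and the list-of-lists matrix layout are assumptions made here, and no scaling against numerical underflow is applied. Each time step is one matrix-vector product followed by an element-wise multiplication, and the rows of that product are independent of one another, which is where parallelism can enter.

```python
def mat_vec(matrix, vector):
    """Return matrix @ vector for plain nested lists."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def forward_likelihood(initial, transition, emission, observations):
    """Likelihood of an observation sequence under an HMM.

    initial[i]       : P(state_0 = i)
    transition[i][j] : P(state_{t+1} = j | state_t = i)
    emission[i][o]   : P(observation = o | state = i)
    """
    n = len(initial)
    # Transpose once so that each step is alpha_t = e_o * (T^T @ alpha_{t-1}).
    transposed = [[transition[i][j] for i in range(n)] for j in range(n)]
    first = observations[0]
    alpha = [initial[i] * emission[i][first] for i in range(n)]
    for obs in observations[1:]:
        predicted = mat_vec(transposed, alpha)  # rows are independent
        alpha = [predicted[j] * emission[j][obs] for j in range(n)]
    return sum(alpha)
```

For long sequences a practical version would rescale `alpha` at every step (or work in log space) to avoid underflow; that is omitted here to keep the matrix structure visible.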
Tag SNP selection via a genetic algorithm.

Science.gov (United States)

Mahdevar, Ghasem; Zahiri, Javad; Sadeghi, Mehdi; Nowzari-Dalini, Abbas; Ahrabian, Hayedeh

2010-10-01

Single Nucleotide Polymorphisms (SNPs) provide valuable information on human evolutionary history and may lead us to identify genetic variants responsible for human complex diseases. Unfortunately, molecular haplotyping methods are costly, laborious, and time consuming; therefore, algorithms for constructing full haplotype patterns from small available data through computational methods, Tag SNP selection problem, are convenient and attractive. This problem is proved to be an NP-hard problem, so heuristic methods may be useful. In this paper we present a heuristic method based on genetic algorithm to find reasonable solution within acceptable time. The algorithm was tested on a variety of simulated and experimental data. In comparison with the exact algorithm, based on brute force approach, results show that our method can obtain optimal solutions in almost all cases and runs much faster than exact algorithm when the number of SNP sites is large. Our software is available upon request to the corresponding author.
Towards a HPC-oriented parallel implementation of a learning algorithm for bioinformatics applications.

Science.gov (United States)

D'Angelo, Gianni; Rampone, Salvatore

2014-01-01

The huge quantity of data produced in Biomedical research needs sophisticated algorithmic methodologies for its storage, analysis, and processing. High Performance Computing (HPC) appears as a magic bullet in this challenge. However, several hard to solve parallelization and load balancing problems arise in this context. Here we discuss the HPC-oriented implementation of a general purpose learning algorithm, originally conceived for DNA analysis and recently extended to treat uncertainty on data (U-BRAIN). The U-BRAIN algorithm is a learning algorithm that finds a Boolean formula in disjunctive normal form (DNF), of approximately minimum complexity, that is consistent with a set of data (instances) which may have missing bits. The conjunctive terms of the formula are computed in an iterative way by identifying, from the given data, a family of sets of conditions that must be satisfied by all the positive instances and violated by all the negative ones; such conditions allow the computation of a set of coefficients (relevances) for each attribute (literal), that form a probability distribution, allowing the selection of the term literals. The great versatility that characterizes it makes U-BRAIN applicable in many of the fields in which there are data to be analyzed. However, the memory and the execution time required by the running are of order $O(n^3)$ and $O(n^5)$, respectively, and so the algorithm is unaffordable for huge data sets. We find mathematical and programming solutions able to lead us towards the implementation of the algorithm U-BRAIN on parallel computers. First we give a Dynamic Programming model of the U-BRAIN algorithm, then we minimize the representation of the relevances. When the data are of great size we are forced to use the mass memory, and depending on where the data are actually stored, the access times can be quite different. According to the evaluation of algorithmic efficiency based on the Disk Model, in order to reduce the costs of

Pose estimation for augmented reality applications using genetic algorithm.

Science.gov (United States)

Yu, Ying Kin; Wong, Kin Hong; Chang, Michael Ming Yuen

2005-12-01

This paper describes a genetic algorithm that tackles the pose-estimation problem in computer vision. Our genetic algorithm can find the rotation and translation of an object accurately when the three-dimensional structure of the object is given. In our implementation, each chromosome encodes both the pose and the indexes to the selected point features of the object. Instead of only searching for the pose as in the existing work, our algorithm, at the same time, searches for a set containing the most reliable feature points in the process. This mismatch filtering strategy successfully makes the algorithm more robust in the presence of point mismatches and outliers in the images. Our algorithm has been tested with both synthetic and real data with good results. The accuracy of the recovered pose is compared to the existing algorithms. Our approach outperformed Lowe's method and the other two genetic algorithms in the presence of point mismatches and outliers. In addition, it has been used to estimate the pose of a real object. It is shown that the proposed method is applicable to augmented reality applications.
Solving the SAT problem using Genetic Algorithm

Directory of Open Access Journals (Sweden)

Arunava Bhattacharjee

2017-08-01

In this paper we propose a genetic algorithm for solving the SAT problem. We introduce various crossover and mutation techniques and then make a comparative analysis between them in order to find out which techniques are best suited for solving a SAT instance. Before the genetic algorithm is applied to an instance, it is better to look for unit and pure literals in the given formula and eliminate them. This can considerably reduce the search space, and to demonstrate this we tested our algorithm on some random SAT instances. However, to analyse the various crossover and mutation techniques and also to evaluate the optimality of our algorithm we performed extensive experiments on benchmark instances of the SAT problem. We also estimated the ideal crossover length that would maximise the chances of solving a given SAT instance.
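The preprocessing step described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the paper's code: the clause representation (DIMACS-style lists of signed integers) and the function name simplify are assumptions made for the example.

```python
def simplify(clauses):
    """Repeatedly eliminate unit and pure literals from a CNF formula.

    clauses: list of clauses, each a list of non-zero ints (assumed
    DIMACS-style encoding: 3 means x3, -3 means NOT x3).
    Returns (remaining_clauses, assignment), or (None, assignment) if an
    empty clause (a conflict) is derived.
    """
    clauses = [set(c) for c in clauses]
    assignment = {}
    while True:
        if any(len(c) == 0 for c in clauses):
            return None, assignment
        literals = {lit for c in clauses for lit in c}
        # Unit literal: the only literal left in some clause.
        unit = next((next(iter(c)) for c in clauses if len(c) == 1), None)
        # Pure literal: its negation never occurs in the formula.
        pure = next((lit for lit in sorted(literals) if -lit not in literals), None)
        chosen = unit if unit is not None else pure
        if chosen is None:
            return [sorted(c) for c in clauses], assignment
        assignment[abs(chosen)] = chosen > 0
        # Drop satisfied clauses; remove the falsified literal from the rest.
        clauses = [c - {-chosen} for c in clauses if chosen not in c]
```

Only the clauses that survive this pass would need to be handed to the genetic search, which is one way the search space could shrink before evolution starts.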
A parallel implementation of a maximum entropy reconstruction algorithm for PET images in a visual language

International Nuclear Information System (INIS)

Bastiens, K.; Lemahieu, I.

1994-01-01

The application of a maximum entropy reconstruction algorithm to PET images requires a lot of computing resources. A parallel implementation could seriously reduce the execution time. However, programming a parallel application is still a non trivial task, needing specialized people. In this paper a programming environment based on a visual programming language is used for a parallel implementation of the reconstruction algorithm. This programming environment allows less experienced programmers to use the performance of multiprocessor systems. (authors)
Parallel Algorithm of Geometrical Hashing Based on NumPy Package and Processes Pool

Directory of Open Access Journals (Sweden)

Klyachin Vladimir Aleksandrovich

2015-10-01

The article considers the problem of multi-dimensional geometric hashing. The paper describes a mathematical model of geometric hashing and considers an example of its use in point localization problems. A method of constructing the corresponding hash matrix by a parallel algorithm is considered. In this paper an algorithm of parallel geometric hashing using the «process pool» development pattern is proposed. The algorithm is implemented in the Python programming language with the NumPy package for manipulating multidimensional data. To implement the process pool, it is proposed to use the class ProcessPoolExecutor imported from the module concurrent.futures, which has been included in the Python distribution since version 3.2. All the solutions are presented in the paper by corresponding UML class diagrams. The designed GeomNash package includes the classes Data, Result, GeomHash, and Job. The results of the developed program are presented in the corresponding graphs. The article also presents a theoretical justification for applying a process pool to the implementation of parallel algorithms. A condition for the appropriateness of a process pool is obtained: t2 > (p/(p-1))*t1. Here t1 is the time to transmit a unit of data between processes, and t2 is the time for one processor to process a unit of data.
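The appropriateness condition quoted in this abstract can be checked numerically with a few lines of code. The sketch below assumes that p denotes the number of worker processes (the abstract does not define it), and the function name is hypothetical.

```python
def pool_is_worthwhile(t1, t2, p):
    """Evaluate the condition t2 > (p / (p - 1)) * t1 from the abstract.

    t1: time to transmit one unit of data between processes
    t2: time for one processor to process one unit of data
    p:  number of worker processes (assumed meaning of p)
    """
    if p < 2:
        # The bound is undefined for p = 1; treat a single process as no gain.
        return False
    return t2 > (p / (p - 1)) * t1
```

For example, with t1 = 1.0 and t2 = 1.5 the condition holds for p = 4 (bound about 1.33) but not for p = 2 (bound 2.0): under this condition, the fewer the processes, the more the per-unit processing time must exceed the transfer time.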
Implementing a Parallel Image Edge Detection Algorithm Based on the Otsu-Canny Operator on the Hadoop Platform.

Science.gov (United States)

Cao, Jianfang; Chen, Lichao; Wang, Min; Tian, Yun

2018-01-01

The Canny operator is widely used to detect edges in images. However, as the size of the image dataset increases, the edge detection performance of the Canny operator decreases and its runtime becomes excessive. To improve the runtime and edge detection performance of the Canny operator, in this paper, we propose a parallel design and implementation for an Otsu-optimized Canny operator using a MapReduce parallel programming model that runs on the Hadoop platform. The Otsu algorithm is used to optimize the Canny operator's dual threshold and improve the edge detection performance, while the MapReduce parallel programming model facilitates parallel processing for the Canny operator to solve the processing speed and communication cost problems that occur when the Canny edge detection algorithm is applied to big data. For the experiments, we constructed datasets of different scales from the Pascal VOC2012 image database. The proposed parallel Otsu-Canny edge detection algorithm performs better than other traditional edge detection algorithms. The parallel approach reduced the running time by approximately 67.2% on a Hadoop cluster architecture consisting of 5 nodes with a dataset of 60,000 images. Overall, our approach speeds up the system by approximately 3.4 times when processing large-scale datasets, which demonstrates the superiority of our method. The proposed algorithm demonstrates both better edge detection performance and improved time performance.
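To make the idea of an Otsu-derived dual threshold concrete, here is a minimal single-machine sketch of Otsu's method in plain Python. How the paper maps the Otsu value onto Canny's two thresholds is not stated in the abstract; the low_ratio = 0.5 rule below is a common heuristic assumed for illustration, and both function names are hypothetical.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level t that maximizes the between-class variance.

    Pixels with value <= t form one class, the rest the other class.
    """
    hist = [0] * levels
    for v in pixels:
        hist[v] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0    # pixel count of the lower class so far
    sum0 = 0  # sum of gray levels of the lower class so far
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue
        w1 = total - w0
        if w1 == 0:
            break
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def canny_thresholds(pixels, low_ratio=0.5):
    """Assumed heuristic: high = Otsu threshold, low = low_ratio * high."""
    high = otsu_threshold(pixels)
    return low_ratio * high, high
```

In a MapReduce setting, a function like this could plausibly run inside each map task on one image at a time, since the threshold of one image does not depend on any other image.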
Dynamic route guidance algorithm based on artificial immune system

Institute of Scientific and Technical Information of China (English)

(none)

2007-01-01

To improve the performance of the K-shortest paths search in intelligent traffic guidance systems, this paper proposes an optimal search algorithm based on intelligent optimization search theory and the metaphor mechanism of vertebrate immune systems. This algorithm, applied to the urban traffic network model established by the node-expanding method, can expediently realize K-shortest paths search in urban traffic guidance systems. Because of the immune memory and global parallel search ability of artificial immune systems, K shortest paths can be found without any repeat, which clearly indicates the superiority of the algorithm to the conventional ones. Not only does the algorithm offer better parallelism, it also prevents the premature convergence that often occurs in genetic algorithms. Thus, it is especially suitable for the real-time requirements of traffic guidance systems and other engineering optimization applications. A case study verifies the efficiency and practicability of the algorithm.
Parallel pipeline algorithm of real time star map preprocessing

Science.gov (United States)

Wang, Hai-yong; Qin, Tian-mu; Liu, Jia-qi; Li, Zhi-feng; Li, Jian-hua

2016-03-01

To improve the preprocessing speed of star maps and reduce the resource consumption of the star tracker's embedded system, a parallel pipeline real-time preprocessing algorithm is presented. Two characteristics of the background gray level of a star map, the mean and the noise standard deviation, are first obtained dynamically, with the interference of the star image itself with the background removed in advance. A criterion for whether the subsequent noise filtering is needed is established, and the extraction threshold is then assigned according to the level of background noise, so that the centroiding accuracy is guaranteed. In the processing algorithm, as few as two lines of pixel data are buffered, and only 100 shift registers are used to record the connected domain label, which solves the problems of resource waste and connected domain overflow. The simulation results show that the necessary data of the selected bright stars can be accessed after a delay as short as 10 µs once the pipeline processing of a 496×496 star map at 50 Mb/s is finished, and the memory and register resources needed total less than 80 kb. To verify the accuracy of the proposed algorithm, different levels of background noise are added to the processed ideal star map, and the statistical centroiding error is smaller than 1/23 pixel when the signal-to-noise ratio is greater than 1. The parallel pipeline algorithm for real-time star map preprocessing helps to increase the data output speed and the anti-dynamic performance of the star tracker.
Genetic algorithms with memory- and elitism-based immigrants in dynamic environments.

Science.gov (United States)

Yang, Shengxiang

2008-01-01

In recent years the genetic algorithm community has shown a growing interest in studying dynamic optimization problems. Several approaches have been devised. The random immigrants and memory schemes are two major ones. The random immigrants scheme addresses dynamic environments by maintaining the population diversity while the memory scheme aims to adapt genetic algorithms quickly to new environments by reusing historical information. This paper investigates a hybrid memory and random immigrants scheme, called memory-based immigrants, and a hybrid elitism and random immigrants scheme, called elitism-based immigrants, for genetic algorithms in dynamic environments. In these schemes, the best individual from memory or the elite from the previous generation is retrieved as the base to create immigrants into the population by mutation. This way, not only can diversity be maintained but it is done more efficiently to adapt genetic algorithms to the current environment. Based on a series of systematically constructed dynamic problems, experiments are carried out to compare genetic algorithms with the memory-based and elitism-based immigrants schemes against genetic algorithms with traditional memory and random immigrants schemes and a hybrid memory and multi-population scheme. The sensitivity analysis regarding some key parameters is also carried out. Experimental results show that the memory-based and elitism-based immigrants schemes efficiently improve the performance of genetic algorithms in dynamic environments.
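The elitism-based immigrants idea described in this abstract can be sketched as below. This is an illustrative reading of the abstract only: the bit-list encoding, the replacement of the worst individuals, and the default values of ratio and rate are assumptions, and the function names are hypothetical.

```python
import random


def mutate(bits, rate, rng):
    """Flip each bit independently with probability `rate`."""
    return [b ^ 1 if rng.random() < rate else b for b in bits]


def elitism_based_immigrants(population, fitness, elite, ratio=0.2, rate=0.01, rng=random):
    """Replace the worst individuals with mutated copies of the previous elite.

    population: list of bit lists for the current generation
    fitness:    callable scoring an individual (higher is better)
    elite:      best individual of the previous generation
    ratio:      fraction of the population replaced by immigrants (assumed value)
    rate:       per-bit mutation probability for immigrants (assumed value)
    """
    n_imm = int(ratio * len(population))
    if n_imm == 0:
        return list(population)
    ranked = sorted(population, key=fitness, reverse=True)  # best first
    immigrants = [mutate(elite, rate, rng) for _ in range(n_imm)]
    return ranked[:len(ranked) - n_imm] + immigrants
```

A memory-based variant would differ only in where the base individual comes from: the best entry of a memory of past solutions, re-evaluated in the current environment, instead of the previous generation's elite.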
A pipelined FPGA implementation of an encryption algorithm based on genetic algorithm

Science.gov (United States)

Thirer, Nonel

2013-05-01

With the evolution of digital data storage and exchange, it is essential to protect confidential information from unauthorized access. High performance encryption algorithms have been developed and implemented in software and hardware. Many methods to attack the cipher text have also been developed. In recent years, the genetic algorithm has gained much interest in the cryptanalysis of cipher texts and also in encryption ciphers. This paper analyses the possibility of using the genetic algorithm as a multiple key sequence generator for an AES (Advanced Encryption Standard) cryptographic system, and also of using a three-stage pipeline (with four main blocks: Input data, AES Core, Key generator, Output data) to provide fast encryption and storage/transmission of a large amount of data.
Bayesian Network Constraint-Based Structure Learning Algorithms: Parallel and Optimized Implementations in the bnlearn R Package

Directory of Open Access Journals (Sweden)

Marco Scutari

2017-03-01

It is well known in the literature that the problem of learning the structure of Bayesian networks is very hard to tackle: its computational complexity is super-exponential in the number of nodes in the worst case and polynomial in most real-world scenarios. Efficient implementations of score-based structure learning benefit from past and current research in optimization theory, which can be adapted to the task by using the network score as the objective function to maximize. This is not true for approaches based on conditional independence tests, called constraint-based learning algorithms. The only optimization in widespread use, backtracking, leverages the symmetries implied by the definitions of neighborhood and Markov blanket. In this paper we illustrate how backtracking is implemented in recent versions of the bnlearn R package, and how it degrades the stability of Bayesian network structure learning for little gain in terms of speed. As an alternative, we describe a software architecture and framework that can be used to parallelize constraint-based structure learning algorithms (also implemented in bnlearn) and we demonstrate its performance using four reference networks and two real-world data sets from genetics and systems biology. We show that on modern multi-core or multiprocessor hardware parallel implementations are preferable over backtracking, which was developed when single-processor machines were the norm.
Evacuation route planning during nuclear emergency using genetic algorithm

International Nuclear Information System (INIS)

Suman, Vitisha; Sarkar, P.K.

2012-01-01

In the nuclear industry, routing in case of an emergency is a matter of concern and of great importance. Even the smallest amount of time saved in the affected region avoids a large dose that would otherwise be received. The genetic algorithm, an optimization technique, has great ability to search for the optimal path from the affected region to a destination station in a spatially addressed problem. Usually heuristic algorithms are used to carry out this type of search, but due to the lack of global sampling in the feasible solution space, these algorithms have a considerable possibility of being trapped in local optima. Routing problems are mainly search problems for finding the shortest distance within a time limit to cover the required number of stations, taking into account traffic, road quality, population size, etc. Decision-makers lack formal mechanisms to help them explore the solution space of their problem and thereby challenge their assumptions about the number and range of options available. The genetic algorithm provides a way to optimize a multi-parameter constrained problem with ease. The ability of genetic algorithms to generate a range of available options, to search a solution space, and to focus selectively on promising combinations of criteria makes them ideally suited to such complex spatial decision problems. Emergency response and routing can be made efficient in accessing the closest facilities and determining the shortest route using a genetic algorithm. Accuracy and care in creating the database improve the final output. The genetic algorithm can be used to improve the accuracy of the result on the basis of distance where other algorithms cannot. The search space can be utilized to its great extent
PARALLEL ALGORITHM FOR THREE-DIMENSIONAL STOKES FLOW SIMULATION USING BOUNDARY ELEMENT METHOD

Directory of Open Access Journals (Sweden)

D. G. Pribytok

2016-01-01

A parallel computing technique for modeling three-dimensional viscous flow (Stokes flow) using the direct boundary element method is presented. The problem is solved in three phases: sampling and construction of a system of linear algebraic equations (SLAE), its solution, and finding the velocity of the liquid at predetermined points. For construction of the system and finding the velocity, parallel algorithms using CUDA graphics card programming technology have been developed and implemented. To solve the system of linear algebraic equations, the implemented software libraries are used. A comparison of time consumption for the three main algorithms is performed on the example of calculating viscous fluid motion in a three-dimensional cavity.
A parallel implementation of a maximum entropy reconstruction algorithm for PET images in a visual language

Energy Technology Data Exchange (ETDEWEB)

Bastiens, K; Lemahieu, I [University of Ghent - ELIS Department, St. Pietersnieuwstraat 41, B-9000 Ghent (Belgium)

1994-12-31

The application of a maximum entropy reconstruction algorithm to PET images requires a lot of computing resources. A parallel implementation could seriously reduce the execution time. However, programming a parallel application is still a non trivial task, needing specialized people. In this paper a programming environment based on a visual programming language is used for a parallel implementation of the reconstruction algorithm. This programming environment allows less experienced programmers to use the performance of multiprocessor systems. (authors). 8 refs, 3 figs, 1 tab.
Time-Delay System Identification Using Genetic Algorithm

DEFF Research Database (Denmark)

Yang, Zhenyu; Seested, Glen Thane

2013-01-01

Due to the unknown dead-time coefficient, time-delay system identification becomes a non-convex optimization problem. This paper investigates the identification of a simple time-delay system, named First-Order-Plus-Dead-Time (FOPDT), by using the Genetic Algorithm (GA) technique...
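For orientation, a FOPDT model has the standard transfer function G(s) = K e^(-θs) / (τs + 1), with gain K, time constant τ and dead time θ. The sketch below shows its unit-step response and a sum-of-squared-errors cost that a GA could minimize over (K, τ, θ). The cost function and all names are assumptions for illustration; the truncated abstract does not say which fitness the authors used.

```python
import math


def fopdt_step(t, gain, tau, dead_time):
    """Unit-step response of G(s) = gain * exp(-dead_time * s) / (tau * s + 1)."""
    if t < dead_time:
        return 0.0  # the output does not react until the dead time has elapsed
    return gain * (1.0 - math.exp(-(t - dead_time) / tau))


def sse(params, times, measured):
    """Sum of squared errors between model and data (lower is better).

    params: (gain, tau, dead_time), a candidate chromosome in this sketch.
    """
    gain, tau, dead_time = params
    return sum((fopdt_step(t, gain, tau, dead_time) - y) ** 2
               for t, y in zip(times, measured))
```

The dead time enters through the branch at t < dead_time and inside the exponential, so the cost is not a convex function of the parameters, which is consistent with the abstract's motivation for a global search method.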
Medical image segmentation using genetic algorithms.

Science.gov (United States)

Maulik, Ujjwal

2009-03-01

Genetic algorithms (GAs) have been found to be effective in the domain of medical image segmentation, since the problem can often be mapped to one of search in a complex and multimodal landscape. The challenges in medical image segmentation arise due to poor image contrast and artifacts that result in missing or diffuse organ/tissue boundaries. The resulting search space is therefore often noisy with a multitude of local optima. Not only does the genetic algorithmic framework prove to be effective in coming out of local optima, it also brings considerable flexibility into the segmentation procedure. In this paper, an attempt has been made to review the major applications of GAs to the domain of medical image segmentation.
SequenceL: Automated Parallel Algorithms Derived from CSP-NT Computational Laws

Science.gov (United States)

Cooke, Daniel; Rushton, Nelson

2013-01-01

With the introduction of new parallel architectures like the cell and multicore chips from IBM, Intel, AMD, and ARM, as well as the petascale processing available for high-end computing, a larger number of programmers will need to write parallel codes. Adding the parallel control structure to the sequence, selection, and iterative control constructs increases the complexity of code development, which often results in increased development costs and decreased reliability. SequenceL is a high-level programming language, that is, a programming language that is closer to a human's way of thinking than to a machine's. Historically, high-level languages have resulted in decreased development costs and increased reliability, at the expense of performance. In recent applications at JSC and in industry, SequenceL has demonstrated the usual advantages of high-level programming in terms of low cost and high reliability. SequenceL programs, however, have run at speeds typically comparable with, and in many cases faster than, their counterparts written in C and C++ when run on single-core processors. Moreover, SequenceL is able to generate parallel executables automatically for multicore hardware, gaining parallel speedups without any extra effort from the programmer beyond what is required to write the sequential/single-core code. A SequenceL-to-C++ translator has been developed that automatically renders readable multithreaded C++ from a combination of a SequenceL program and sample data input. The SequenceL language is based on two fundamental computational laws, Consume-Simplify-Produce (CSP) and Normalize-Transpose (NT), which enable it to automate the creation of parallel algorithms from high-level code that has no annotations of parallelism whatsoever. In our anecdotal experience, SequenceL development has been in every case less costly than development of the same algorithm in sequential (that is, single-core, single process) C or C++, and an order of magnitude less
Genetic Optimization Algorithm for Metabolic Engineering Revisited

Directory of Open Access Journals (Sweden)

Tobias B. Alter

2018-05-01

To date, several independent methods and algorithms exist for exploiting constraint-based stoichiometric models to find metabolic engineering strategies that optimize microbial production performance. Optimization procedures based on metaheuristics facilitate a straightforward adaption and expansion of engineering objectives, as well as fitness functions, while being particularly suited for solving problems of high complexity. With the increasing interest in multi-scale models and a need for solving advanced engineering problems, we strive to advance genetic algorithms, which stand out due to their intuitive optimization principles and the proven usefulness in this field of research. A drawback of genetic algorithms is that premature convergence to sub-optimal solutions easily occurs if the optimization parameters are not adapted to the specific problem. Here, we conducted comprehensive parameter sensitivity analyses to study their impact on finding optimal strain designs. We further demonstrate the capability of genetic algorithms to simultaneously handle (i) multiple, non-linear engineering objectives; (ii) the identification of gene target-sets according to logical gene-protein-reaction associations; (iii) minimization of the number of network perturbations; and (iv) the insertion of non-native reactions, while employing genome-scale metabolic models. This framework adds a level of sophistication in terms of strain design robustness, which is exemplarily tested on succinate overproduction in Escherichia coli.
A parallel row-based algorithm with error control for standard-cell replacement on a hypercube multiprocessor

Science.gov (United States)

Sargent, Jeff Scott

1988-01-01

A new row-based parallel algorithm for standard-cell placement targeted for execution on a hypercube multiprocessor is presented. Key features of this implementation include a dynamic simulated-annealing schedule, row-partitioning of the VLSI chip image, and two novel approaches to controlling error in parallel cell-placement algorithms: Heuristic Cell-Coloring and Adaptive (Parallel Move) Sequence Control. Heuristic Cell-Coloring identifies sets of noninteracting cells that can be moved repeatedly, and in parallel, with no buildup of error in the placement cost. Adaptive Sequence Control allows multiple parallel cell moves to take place between global cell-position updates. This feedback mechanism is based on an error bound derived analytically from the traditional annealing move-acceptance profile. Placement results are presented for real industry circuits, and the performance of an implementation on the Intel iPSC/2 Hypercube is summarized. This algorithm runs 5 to 16 times faster than a previous program developed for the Hypercube, while producing placements of equivalent quality. An integrated place and route program for the Intel iPSC/2 Hypercube is currently being developed.
Optimization-Based Image Segmentation by Genetic Algorithms

Directory of Open Access Journals (Sweden)

Rosenberger C

2008-01-01

Many works in the literature focus on the definition of evaluation metrics and criteria that make it possible to quantify the performance of an image processing algorithm. These evaluation criteria can be used to define new image processing algorithms by optimizing them. In this paper, we propose a general scheme to segment images by a genetic algorithm. The developed method uses an evaluation criterion which quantifies the quality of an image segmentation result. The proposed segmentation method can integrate a local ground truth when it is available in order to set the desired level of precision of the final result. A genetic algorithm is then used in order to determine the best combination of information extracted by the selected criterion. Then, we show that this approach can be applied to either gray-level or multicomponent images, in a supervised context or in an unsupervised one. Last, we show the efficiency of the proposed method through some experimental results on several gray-level and multicomponent images.
Optimization-Based Image Segmentation by Genetic Algorithms

Directory of Open Access Journals (Sweden)

H. Laurent

2008-05-01

Many works in the literature focus on the definition of evaluation metrics and criteria that make it possible to quantify the performance of an image processing algorithm. These evaluation criteria can be used to define new image processing algorithms by optimizing them. In this paper, we propose a general scheme to segment images by a genetic algorithm. The developed method uses an evaluation criterion which quantifies the quality of an image segmentation result. The proposed segmentation method can integrate a local ground truth when it is available in order to set the desired level of precision of the final result. A genetic algorithm is then used in order to determine the best combination of information extracted by the selected criterion. Then, we show that this approach can be applied to either gray-level or multicomponent images, in a supervised context or in an unsupervised one. Last, we show the efficiency of the proposed method through some experimental results on several gray-level and multicomponent images.

First results of genetic algorithm application in ML image reconstruction in emission tomography

International Nuclear Information System (INIS)

Smolik, W.

1999-01-01

This paper concerns the application of a genetic algorithm to maximum likelihood image reconstruction in emission tomography. An example of a genetic algorithm for image reconstruction is presented. The genetic algorithm was based on the typical genetic scheme, modified to suit the nature of the problem being solved. The convergence of the algorithm was examined. Different adaptation functions, selection methods and crossover methods were verified. The algorithm was tested on simulated SPECT data. The obtained results of image reconstruction are discussed. (author)
Variation in efficiency of parallel algorithms. [for study of stiffness matrices in planar trusses

Science.gov (United States)

Hayashi, A.; Melosh, R. J.; Utku, S.; Salama, M.

1985-01-01

The present study has the objective to investigate some iterative parallel-processor linear equation solving algorithms with respect to efficiency for analyses of typical linear engineering systems. Attention is given to a set of n linear equations, Ku = p, where K = an n x n positive definite, sparsely populated, symmetric matrix, u = an n x 1 vector of unknown responses, and p = an n x 1 vector of prescribed constants. This study is concerned with a hybrid method in which iteration is used to solve the problem, while a direct method is used on the local processor level. Variations in the efficiency of parallel algorithms are explored. Measures of the efficiency are based on computer experiments regarding the algorithms. For all the algorithms, the wall clock time is found to decrease as the number of processors increases.
Genetic algorithm approach to thin film optical parameters determination

International Nuclear Information System (INIS)

Jurecka, S.; Jureckova, M.; Muellerova, J.

2003-01-01

Optical parameters of thin films are important for several optical and optoelectronic applications. In this work a genetic algorithm is proposed to determine the values of the optical parameters of a thin film. The experimental reflectance is modelled by the Forouhi-Bloomer dispersion relations. The refractive index, the extinction coefficient and the film thickness are the unknown parameters in this model. The genetic algorithm uses probabilistic examination of promising areas of the parameter space. It creates a population of solutions based on the reflectance model and then operates on the population to evolve the best solution by applying selection, crossover and mutation operators to the individuals of the population. The implementation of the genetic algorithm method and the experimental results are also described. (Authors)
An Intrinsic Algorithm for Parallel Poisson Disk Sampling on Arbitrary Surfaces.

Science.gov (United States)

Ying, Xiang; Xin, Shi-Qing; Sun, Qian; He, Ying

2013-03-08

Poisson disk sampling plays an important role in a variety of visual computing applications, due to its useful statistical property in distribution and the absence of aliasing artifacts. While many effective techniques have been proposed to generate Poisson disk distributions in Euclidean space, relatively little work has been reported for the surface counterpart. This paper presents an intrinsic algorithm for parallel Poisson disk sampling on arbitrary surfaces. We propose a new technique for parallelizing the dart throwing. Rather than the conventional approaches that explicitly partition the spatial domain to generate the samples in parallel, our approach assigns each sample candidate a random and unique priority that is unbiased with regard to the distribution. Hence, multiple threads can process the candidates simultaneously and resolve conflicts by checking the given priority values. It is worth noting that our algorithm is accurate as the generated Poisson disks are uniformly and randomly distributed without bias. Our method is intrinsic in that all the computations are based on the intrinsic metric and are independent of the embedding space. This intrinsic feature allows us to generate Poisson disk distributions on arbitrary surfaces. Furthermore, by manipulating the spatially varying density function, we can obtain adaptive sampling easily.
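The priority-based conflict resolution described in this abstract can be emulated in a few lines. The sketch below is a simplified stand-in, not the paper's algorithm: it works in the Euclidean plane rather than with an intrinsic surface metric, it emulates threads with synchronous rounds over a shared snapshot, and the function name and state labels are hypothetical.

```python
import math
import random


def priority_dart_throwing(points, radius, rng):
    """Round-based emulation of priority-driven parallel dart throwing (2D)."""
    n = len(points)
    priority = list(range(n))
    rng.shuffle(priority)  # one unique random priority per candidate
    # Conflict lists: candidates closer than `radius` to each other.
    neighbors = [[j for j in range(n)
                  if j != i and math.dist(points[i], points[j]) < radius]
                 for i in range(n)]
    state = ["undecided"] * n
    while "undecided" in state:
        new_state = list(state)  # every candidate reads the same snapshot
        for i in range(n):       # each iteration could run on its own thread
            if state[i] != "undecided":
                continue
            if any(state[j] == "accepted" for j in neighbors[i]):
                new_state[i] = "rejected"
            elif all(priority[i] > priority[j]
                     for j in neighbors[i] if state[j] == "undecided"):
                new_state[i] = "accepted"
        state = new_state
    return [points[i] for i in range(n) if state[i] == "accepted"]
```

Because priorities are unique, two conflicting candidates can never both be accepted in the same round, and the outcome matches what a sequential pass in decreasing priority order would produce, so the result does not depend on thread scheduling.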
An Improved Chaos Genetic Algorithm for T-Shaped MIMO Radar Antenna Array Optimization

Directory of Open Access Journals (Sweden)

Xin Fu

2014-01-01

In view of the fact that the traditional genetic algorithm easily falls into local optima in late iterations, an improved chaos genetic algorithm combining chaos theory and the genetic algorithm is presented to optimize the low side-lobe level of a T-shaped MIMO radar antenna array. A novel two-dimensional Cat chaotic map is put forward to produce the initial population, improving the diversity of individuals. An improved Tent map is presented to apply chaos disturbance to groups of individuals within a generation. An improved chaotic genetic algorithm optimization model is established. The algorithm presented in this paper not only improves the search precision but also effectively avoids the problems of local convergence and prematurity. For MIMO radar, the improved chaos genetic algorithm proposed in this paper obtains a lower side-lobe level by optimizing the exciting current amplitude. Simulation results show that the algorithm is feasible and effective. Its performance is superior to that of the traditional genetic algorithm.
Multi-objective optimization algorithms for mixed model assembly line balancing problem with parallel workstations

Directory of Open Access Journals (Sweden)

Masoud Rabbani

2016-12-01

This paper deals with the mixed model assembly line (MMAL) balancing problem of type-I. In MMALs, several products are made on one assembly line when the similarity of these products is high. As a result, it is possible to assemble several types of products simultaneously without any additional setup times. The problem has some particular features such as parallel workstations and precedence constraints in dynamic periods, in which each period also affects the next period. The research intends to reduce the number of workstations and maximize the workload smoothness between workstations. Dynamic periods are used to determine all variables in different periods to achieve efficient solutions. A non-dominated sorting genetic algorithm (NSGA-II) and multi-objective particle swarm optimization (MOPSO) are used to solve the problem. The proposed model is validated with GAMS software for small-size problems, and the performance of the two algorithms is compared based on some comparison metrics. NSGA-II outperforms MOPSO with respect to some of the comparison metrics used in this paper, but on other metrics MOPSO is better than NSGA-II. Finally, conclusions and future research are provided.
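The non-dominated sorting at the heart of NSGA-II can be illustrated with a short sketch. It assumes both objectives are expressed as minimization (for example, number of workstations and a workload-unevenness score, the latter standing in for the smoothness the paper maximizes), and it uses a simple quadratic front-peeling loop rather than NSGA-II's faster bookkeeping. The function names are hypothetical.

```python
def dominates(a, b):
    """True if objective vector a Pareto-dominates b (all objectives minimized)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def non_dominated_fronts(objectives):
    """Split solution indices into successive Pareto fronts (best front first)."""
    remaining = set(range(len(objectives)))
    fronts = []
    while remaining:
        # A solution is in the current front if no remaining solution dominates it.
        front = sorted(i for i in remaining
                       if not any(dominates(objectives[j], objectives[i])
                                  for j in remaining if j != i))
        fronts.append(front)
        remaining -= set(front)
    return fronts
```

For instance, with candidate line balances scored as (workstations, unevenness) = (5, 0.30), (6, 0.10), (6, 0.40), (7, 0.10), (5, 0.50), the first two are mutually non-dominated and form the first front, while the other three fall into the second.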
A Parallel, Finite-Volume Algorithm for Large-Eddy Simulation of Turbulent Flows

Science.gov (United States)

Bui, Trong T.

1999-01-01

A parallel, finite-volume algorithm has been developed for large-eddy simulation (LES) of compressible turbulent flows. This algorithm includes piecewise linear least-square reconstruction, trilinear finite-element interpolation, Roe flux-difference splitting, and second-order MacCormack time marching. Parallel implementation is done using the message-passing programming model. In this paper, the numerical algorithm is described. To validate the numerical method for turbulence simulation, LES of fully developed turbulent flow in a square duct is performed for a Reynolds number of 320 based on the average friction velocity and the hydraulic diameter of the duct. Direct numerical simulation (DNS) results are available for this test case, and the accuracy of this algorithm for turbulence simulations can be ascertained by comparing the LES solutions with the DNS results. The effects of grid resolution, upwind numerical dissipation, and subgrid-scale dissipation on the accuracy of the LES are examined. Comparison with DNS results shows that the standard Roe flux-difference splitting dissipation adversely affects the accuracy of the turbulence simulation. For accurate turbulence simulations, only 3-5 percent of the standard Roe flux-difference splitting dissipation is needed.
Genetic algorithms and artificial neural networks for loading pattern optimisation of advanced gas-cooled reactors

Energy Technology Data Exchange (ETDEWEB)

Ziver, A.K.; Pain, C.C.; Carter, J.N.; Oliveira, C.R.E. de; Goddard, A.J.H.; Overton, R.S.

2004-03-01

A non-generational genetic algorithm (GA) has been developed for fuel management optimisation of Advanced Gas-Cooled Reactors, which are operated by British Energy and produce around 20% of the UK's electricity requirements. An evolutionary search is coded using the genetic operators, namely selection by tournament, two-point crossover, mutation and random assessment of population, for multi-cycle loading pattern (LP) optimisation. A detailed description of the chromosomes used in the genetic algorithm is presented. Artificial Neural Networks (ANNs) have been constructed and trained to accelerate the GA-based search during the optimisation process. The whole package, called GAOPT, is linked to the reactor analysis code PANTHER, which performs fresh fuel loading, burn-up and power shaping calculations for each reactor cycle by imposing station-specific safety and operational constraints. GAOPT has been verified by performing a number of tests, which are applied to the Hinkley Point B and Hartlepool reactors. The test results giving loading pattern (LP) scenarios obtained from single and multi-cycle optimisation calculations applied to realistic reactor states of the Hartlepool and Hinkley Point B reactors are discussed. The results have shown that the GA/ANN algorithms developed can help the fuel engineer to optimise loading patterns in an efficient and more profitable way than currently available for multi-cycle refuelling of AGRs. Research leading to parallel GAs applied to LP optimisation is outlined, which can be adapted to present day LWR fuel management problems.
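The two-point crossover operator named in the record above can be sketched generically as follows. This works on plain Python lists and does not attempt to reproduce the GAOPT chromosome encoding, which the abstract does not describe; the function name and the use of a caller-supplied `random.Random` instance are assumptions for this example.

```python
import random


def two_point_crossover(parent_a, parent_b, rng):
    """Swap the segment between two random cut points of two parents."""
    if len(parent_a) != len(parent_b):
        raise ValueError("parents must have the same length")
    n = len(parent_a)
    # Two distinct cut positions in 0..n, in increasing order.
    i, j = sorted(rng.sample(range(n + 1), 2))
    child_a = parent_a[:i] + parent_b[i:j] + parent_a[j:]
    child_b = parent_b[:i] + parent_a[i:j] + parent_b[j:]
    return child_a, child_b


if __name__ == "__main__":
    rng = random.Random(42)
    print(two_point_crossover([0] * 8, [1] * 8, rng))
```

For permutation-style encodings, a plain segment swap like this can duplicate genes, so a repair step or a permutation-aware operator would be needed; whether that applies here depends on the encoding actually used.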
Massively parallel algorithms for trace-driven cache simulations

Science.gov (United States)

Nicol, David M.; Greenberg, Albert G.; Lubachevsky, Boris D.

1991-01-01

Trace-driven cache simulation is central to computer design. A trace is a very long sequence of reference lines from main memory. At the t-th instant, reference x_t is hashed into a set of cache locations, the contents of which are then compared with x_t. If at the t-th instant x_t is not present in the cache, then it is said to be a miss and is loaded into the cache set, possibly forcing the replacement of some other memory line and making x_t present for the (t+1)-st instant. The problem of parallel simulation of a subtrace of N references directed to a C-line cache set is considered, with the aim of determining which references are misses and related statistics. A simulation method is presented for the Least Recently Used (LRU) policy which, regardless of the set size C, runs in time O(log N) using N processors on the exclusive read, exclusive write (EREW) parallel model. A simpler LRU simulation algorithm is given that runs in O(C log N) time using N/log N processors. Timings are presented of the second algorithm's implementation on the MasPar MP-1, a machine with 16384 processors. A broad class of reference-based line replacement policies is considered, which includes LRU as well as the Least Frequently Used and Random replacement policies. A simulation method is presented for any such policy that, on any trace of length N directed to a C-line set, runs in O(C log N) time with high probability using N processors on the EREW model. The algorithms are simple, have very little space overhead, and are well suited for SIMD implementation.
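To make the problem statement of the record above concrete, here is a minimal serial baseline that marks which references of a subtrace are misses for one LRU cache set. It is the straightforward sequential simulation, not the parallel EREW algorithms of the paper; the function name `lru_misses` is an assumption for this example.

```python
from collections import OrderedDict


def lru_misses(trace, capacity):
    """Return one boolean per reference: True if that reference misses.

    `trace` is the subtrace directed to a single cache set and
    `capacity` is the number of lines C in that set.
    """
    cache = OrderedDict()  # keys ordered from least to most recently used
    misses = []
    for ref in trace:
        if ref in cache:
            cache.move_to_end(ref)         # hit: mark as most recently used
            misses.append(False)
        else:
            misses.append(True)            # miss: load the line
            cache[ref] = None
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict the least recently used line
    return misses


if __name__ == "__main__":
    print(lru_misses(["a", "b", "a", "c", "b", "d", "a"], capacity=2))
    # [True, True, False, True, True, True, True]
```

This baseline takes O(N) sequential steps for N references, which is the cost the parallel methods aim to reduce to logarithmic time.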
A highly efficient parallel algorithm for solving the neutron diffusion nodal equations on shared-memory computers

International Nuclear Information System (INIS)

Azmy, Y.Y.; Kirk, B.L.

1990-01-01

Modern parallel computer architectures offer an enormous potential for reducing CPU and wall-clock execution times of large-scale computations commonly performed in various applications in science and engineering. Recently, several authors have reported their efforts in developing and implementing parallel algorithms for solving the neutron diffusion equation on a variety of shared- and distributed-memory parallel computers. Testing of these algorithms for a variety of two- and three-dimensional meshes showed significant speedup of the computation. Even for very large problems (i.e., three-dimensional fine meshes) executed concurrently on a few nodes in serial (nonvector) mode, however, the measured computational efficiency is very low (40 to 86%). In this paper, the authors present a highly efficient (∼85 to 99.9%) algorithm for solving the two-dimensional nodal diffusion equations on the Sequent Balance 8000 parallel computer. Also presented is a model for the performance, represented by the efficiency, as a function of problem size and the number of participating processors. The model is validated through several tests and then extrapolated to larger problems and more processors to predict the performance of the algorithm in more computationally demanding situations
Portfolio optimization by using linear programing models based on genetic algorithm

Science.gov (United States)

Sukono; Hidayat, Y.; Lesmana, E.; Putra, A. S.; Napitupulu, H.; Supian, S.

2018-01-01

In this paper, we discussed the investment portfolio optimization using linear programming model based on genetic algorithms. It is assumed that the portfolio risk is measured by absolute standard deviation, and each investor has a risk tolerance on the investment portfolio. To complete the investment portfolio optimization problem, the issue is arranged into a linear programming model. Furthermore, determination of the optimum solution for linear programming is done by using a genetic algorithm. As a numerical illustration, we analyze some of the stocks traded on the capital market in Indonesia. Based on the analysis, it is shown that the portfolio optimization performed by genetic algorithm approach produces more optimal efficient portfolio, compared to the portfolio optimization performed by a linear programming algorithm approach. Therefore, genetic algorithms can be considered as an alternative on determining the investment portfolio optimization, particularly using linear programming models.
Parallel supercomputing: Advanced methods, algorithms, and software for large-scale linear and nonlinear problems

Energy Technology Data Exchange (ETDEWEB)

Carey, G.F.; Young, D.M.

1993-12-31

The program outlined here is directed to research on methods, algorithms, and software for distributed parallel supercomputers. Of particular interest are finite element methods and finite difference methods together with sparse iterative solution schemes for scientific and engineering computations of very large-scale systems. Both linear and nonlinear problems will be investigated. In the nonlinear case, applications with bifurcation to multiple solutions will be considered using continuation strategies. The parallelizable numerical methods of particular interest are a family of partitioning schemes embracing domain decomposition, element-by-element strategies, and multi-level techniques. The methods will be further developed incorporating parallel iterative solution algorithms with associated preconditioners in parallel computer software. The schemes will be implemented on distributed memory parallel architectures such as the CRAY MPP, Intel Paragon, the NCUBE3, and the Connection Machine. We will also consider other new architectures such as the Kendall-Square (KSQ) and proposed machines such as the TERA. The applications will focus on large-scale three-dimensional nonlinear flow and reservoir problems with strong convective transport contributions. These are legitimate grand challenge class computational fluid dynamics (CFD) problems of significant practical interest to DOE. The methods developed and algorithms will, however, be of wider interest.
First massively parallel algorithm to be implemented in Apollo-II code

International Nuclear Information System (INIS)

Stankovski, Z.

1994-01-01

The collision probability (CP) method in neutron transport, as applied to arbitrary 2D XY geometries, like the TDT module in APOLLO-II, is very time consuming. Consequently RZ or 3D extensions became prohibitive. Fortunately, this method is very suitable for parallelization. Massively parallel computer architectures, especially MIMD machines, bring a new breath to this method. In this paper we present a CM5 implementation of the CP method. Parallelization is applied to the energy groups, using the CMMD message passing library. In our case we use 32 processors for the standard 99-group APOLLIB-II library. The real advantage of this algorithm will appear in the calculation of the future fine multigroup library (about 8000 groups) of the SAPHYR project with a massively parallel computer (to the order of hundreds of processors). (author). 3 tabs., 4 figs., 4 refs
First massively parallel algorithm to be implemented in APOLLO-II code

International Nuclear Information System (INIS)

Stankovski, Z.

1994-01-01

The collision probability method in neutron transport, as applied to arbitrary 2-dimensional geometries, like the two-dimensional transport module in APOLLO-II, is very time consuming. Consequently 3-dimensional extension became prohibitive. Fortunately, this method is very suitable for parallelization. Massively parallel computer architectures, especially MIMD machines, bring a new breath to this method. In this paper we present a CM5 implementation of the collision probability method. Parallelization is applied to the energy groups, using the CMMD message passing library. In our case we used 32 processors for the standard 99-group APOLLIB-II library. The real advantage of this algorithm will appear in the calculation of the future multigroup library (about 8000 groups) of the SAPHYR project with a massively parallel computer (to the order of hundreds of processors). (author). 4 refs., 4 figs., 3 tabs
Genetic algorithms and their use in Geophysical Problems

Energy Technology Data Exchange (ETDEWEB)

Parker, Paul B. [Univ. of California, Berkeley, CA (United States)

1999-04-01

Genetic algorithms (GAs), global optimization methods that mimic Darwinian evolution, are well suited to the nonlinear inverse problems of geophysics. A standard genetic algorithm selects the best or "fittest" models from a "population" and then applies operators such as crossover and mutation in order to combine the most successful characteristics of each model and produce fitter models. More sophisticated operators have been developed, but the standard GA usually provides a robust and efficient search. Although the choice of parameter settings such as crossover and mutation rate may depend largely on the type of problem being solved, numerous results show that certain parameter settings produce optimal performance for a wide range of problems and difficulties. In particular, a low (about half of the inverse of the population size) mutation rate is crucial for optimal results, but the choice of crossover method and rate does not seem to affect performance appreciably. Optimal efficiency is usually achieved with smaller (< 50) populations. Lastly, tournament selection appears to be the best choice of selection methods due to its simplicity and its autoscaling properties. However, if a proportional selection method such as roulette wheel selection is used, fitness scaling is a necessity, and a high scaling factor (> 2.0) should be used for the best performance. Three case studies are presented in which genetic algorithms are used to invert for crustal parameters. The first is an inversion for basement depth at Yucca Mountain using gravity data, the second an inversion for velocity structure in the crust of the South Island of New Zealand using receiver functions derived from teleseismic events, and the third is a similar receiver function inversion for crustal velocities beneath the Mendocino Triple Junction region of Northern California. The inversions demonstrate that genetic algorithms are effective in solving problems
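Two of the parameter recommendations in the record above lend themselves to a short sketch: a mutation rate of about half the inverse of the population size, and tournament selection. The function names, the tournament size `k = 2` and the convention that larger fitness is better are assumptions for this example, not details taken from the report.

```python
import random


def mutation_rate(pop_size):
    """Mutation rate of about half the inverse of the population size."""
    return 0.5 / pop_size


def tournament_select(population, fitness, rng, k=2):
    """Pick k distinct contenders at random and return the fittest one.

    Only the ordering of fitness values matters, so no fitness scaling
    is needed; this is the autoscaling property of tournaments.
    """
    contenders = rng.sample(range(len(population)), k)
    best = max(contenders, key=lambda i: fitness[i])
    return population[best]


if __name__ == "__main__":
    rng = random.Random(0)
    models = ["m0", "m1", "m2", "m3"]
    scores = [0.2, 0.9, 0.5, 0.1]
    print(mutation_rate(40))  # 0.0125
    print(tournament_select(models, scores, rng))
```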
The genetic architecture of parallel armor plate reduction in threespine sticklebacks.

Directory of Open Access Journals (Sweden)

Pamela F Colosimo

2004-05-01

How many genetic changes control the evolution of new traits in natural populations? Are the same genetic changes seen in cases of parallel evolution? Despite long-standing interest in these questions, they have been difficult to address, particularly in vertebrates. We have analyzed the genetic basis of natural variation in three different aspects of the skeletal armor of threespine sticklebacks (Gasterosteus aculeatus): the pattern, number, and size of the bony lateral plates. A few chromosomal regions can account for variation in all three aspects of the lateral plates, with one major locus contributing to most of the variation in lateral plate pattern and number. Genetic mapping and allelic complementation experiments show that the same major locus is responsible for the parallel evolution of armor plate reduction in two widely separated populations. These results suggest that a small number of genetic changes can produce major skeletal alterations in natural populations and that the same major locus is used repeatedly when similar traits evolve in different locations.
A parallel algorithm for solving the integral form of the discrete ordinates equations

International Nuclear Information System (INIS)

Zerr, R. J.; Azmy, Y. Y.

2009-01-01

The integral form of the discrete ordinates equations involves a system of equations that has a large, dense coefficient matrix. The serial construction methodology is presented and properties that affect the execution times to construct and solve the system are evaluated. Two approaches for massively parallel implementation of the solution algorithm are proposed and the current results of one of these are presented. The system of equations may be solved using two parallel solvers: block Jacobi and conjugate gradient. Results indicate that both methods can reduce overall wall-clock time for execution. The conjugate gradient solver exhibits better performance to compete with the traditional source iteration technique in terms of execution time and scalability. The parallel conjugate gradient method is synchronous, hence it does not increase the number of iterations for convergence compared to serial execution, and the efficiency of the algorithm demonstrates an apparent asymptotic decline. (authors)
Application of genetic algorithms for parameter estimation in liquid chromatography

International Nuclear Information System (INIS)

Hernandez Torres, Reynier; Irizar Mesa, Mirtha; Tavares Camara, Leoncio Diogenes

2012-01-01

In chromatography, complex inverse problems arise in parameter estimation and process optimization. Metaheuristic methods are general-purpose approximate algorithms that seek, and hopefully find, good solutions at a reasonable computational cost. These methods are iterative processes that perform a robust search of a solution space. Genetic algorithms are optimization techniques based on the principles of genetics and natural selection. They have demonstrated very good performance as global optimizers in many types of applications, including inverse problems. In this work, the effectiveness of genetic algorithms for estimating parameters in liquid chromatography is investigated.
Application of mapping crossover genetic algorithm in nuclear power equipment optimization design

International Nuclear Information System (INIS)

Li Guijiang; Yan Changqi; Wang Jianjun; Liu Chengyang

2013-01-01

Genetic algorithm (GA) has been widely applied in nuclear engineering. An improved method, named the mapping crossover genetic algorithm (MCGA), was developed aiming at improving the shortcomings of traditional genetic algorithm (TGA). The optimal results of benchmark problems show that MCGA has better optimizing performance than TGA. MCGA was applied to the reactor coolant pump optimization design. (authors)
Bio-Inspired Genetic Algorithms with Formalized Crossover Operators for Robotic Applications.

Science.gov (United States)

Zhang, Jie; Kang, Man; Li, Xiaojuan; Liu, Geng-Yang

2017-01-01

Genetic algorithms are widely adopted to solve optimization problems in robotic applications. In such safety-critical systems, it is vitally important to formally prove correctness when genetic algorithms are applied. This paper focuses on formal modeling of crossover operations, which are among the most important operations in genetic algorithms. Specifically, we formalize crossover operations for the first time with higher-order logic based on HOL4, which is easy to deploy thanks to its user-friendly programming environment. With correctness-guaranteed formalized crossover operations, we can safely apply them in robotic applications. We apply our technique to a path planning problem using a genetic algorithm with our formalized crossover operations, and the results show the effectiveness of our technique.

A Highly Parallel and Scalable Motion Estimation Algorithm with GPU for HEVC

Directory of Open Access Journals (Sweden)

Yun-gang Xue

2017-01-01

We propose a highly parallel and scalable motion estimation algorithm, named multilevel resolution motion estimation (MLRME for short), which combines the advantages of local full search and downsampling. Subsampling a video frame saves a large amount of computation, while the local full-search method exploits massive parallelism and makes full use of powerful modern many-core accelerators, such as GPUs and the Intel Xeon Phi. We integrated the proposed MLRME into HM12.0, and the experimental results showed that the encoding quality of the MLRME method is close to that of the fast motion estimation in HEVC, declining by less than 1.5%. We also implemented MLRME with CUDA, which obtained a 30–60x speed-up compared to the serial algorithm on a single CPU. Specifically, the parallel implementation of MLRME on a GTX 460 GPU can meet the real-time coding requirement with about 25 fps for the 2560×1600 video format, while, for 832×480, the performance is more than 100 fps.
Application of Hybrid Genetic Algorithm Routine in Optimizing Food and Bioengineering Processes

Directory of Open Access Journals (Sweden)

Jaya Shankar Tumuluru

2016-11-01

Optimization is a crucial step in the analysis of experimental results. Deterministic methods only converge on local optima and require exponentially more time as dimensionality increases. Stochastic algorithms are capable of efficiently searching the domain space; however, convergence is not guaranteed. This article demonstrates the novelty of the hybrid genetic algorithm (HGA), which combines both stochastic and deterministic routines for improved optimization results. The new hybrid genetic algorithm is applied to the Ackley benchmark function as well as case studies in food, biofuel, and biotechnology processes. For each case study, the hybrid genetic algorithm found a better optimum candidate than reported by the sources. In the case of food processing, the hybrid genetic algorithm improved the anthocyanin yield by 6.44%. Optimization of bio-oil production using HGA resulted in a 5.06% higher yield. In the enzyme production process, HGA predicted a 0.39% higher xylanase yield. Hybridization of the genetic algorithm with a deterministic algorithm resulted in an improved optimum compared to statistical methods.
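The Ackley benchmark mentioned in the record above, together with a deterministic refinement step of the kind a hybrid GA might apply to its best candidate, can be sketched as follows. The Ackley constants are the commonly used defaults; the pattern-search routine `local_refine` is a stand-in chosen for this example, since the abstract does not say which deterministic routine the HGA uses.

```python
import math


def ackley(x, a=20.0, b=0.2, c=2.0 * math.pi):
    """Ackley function; its global minimum is 0 at the origin."""
    n = len(x)
    sum_sq = sum(v * v for v in x)
    sum_cos = sum(math.cos(c * v) for v in x)
    return (-a * math.exp(-b * math.sqrt(sum_sq / n))
            - math.exp(sum_cos / n) + a + math.e)


def local_refine(f, x, step=0.1, tol=1e-6):
    """Deterministic coordinate pattern search (illustrative stand-in).

    Tries +/- step along each coordinate, keeps any improving move,
    and halves the step when no move improves the objective.
    """
    x = list(x)
    best = f(x)
    while step > tol:
        improved = False
        for i in range(len(x)):
            for delta in (step, -step):
                trial = x[:]
                trial[i] += delta
                value = f(trial)
                if value < best:
                    x, best, improved = trial, value, True
        if not improved:
            step *= 0.5
    return x, best


if __name__ == "__main__":
    candidate = [0.3, -0.2]  # e.g. the best individual from a GA run
    print(ackley(candidate), local_refine(ackley, candidate))
```

In a hybrid scheme, the stochastic GA stage would supply `candidate` and the deterministic stage would polish it; the refinement alone can only reach the local optimum of the basin it starts in.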
Fast parallel tracking algorithm for the muon detector of the CBM experiment at FAIR

International Nuclear Information System (INIS)

Lebedev, A.; Hoehne, C.; Kisel', I.; Ososkov, G.

2010-01-01

Particle trajectory recognition is an important and challenging task in the Compressed Baryonic Matter (CBM) experiment at the future FAIR accelerator at Darmstadt. The tracking algorithms have to process terabytes of input data produced in particle collisions. Therefore, the speed of the tracking software is extremely important for data analysis. In this contribution, a fast parallel track reconstruction algorithm, which uses available features of modern processors, is presented. These features comprise a SIMD instruction set (SSE) and multithreading. The first allows one to pack several data items into one register and to operate on all of them in parallel, thus achieving more operations per cycle. The second feature enables the routines to exploit all available CPU cores and hardware threads. This parallel version of the tracking algorithm has been compared to the initial serial scalar version, which uses a similar approach for tracking. A speed-up factor of 487 was achieved (from 730 to 1.5 ms/event) for a computer with 2 x Intel Core i7 processors at 2.66 GHz.
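The speed-up quoted in the record above follows directly from the two per-event timings. A short check, with the function name `speedup` chosen for this example:

```python
def speedup(serial_ms, parallel_ms):
    """Speed-up factor: serial time per event over parallel time per event."""
    return serial_ms / parallel_ms


if __name__ == "__main__":
    factor = speedup(730.0, 1.5)
    print(round(factor, 1))  # 486.7, which rounds to the reported 487
```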
A proposal simulated annealing algorithm for proportional parallel flow shops with separated setup times

Directory of Open Access Journals (Sweden)

Helio Yochihiro Fuchigami

2014-08-01

This article addresses the problem of minimizing makespan on two parallel flow shops with proportional processing and setup times. The setup times are separated and sequence-independent. The parallel flow shop scheduling problem is a specific case of the well-known hybrid flow shop, characterized by a multistage production system with more than one machine working in parallel at each stage. This situation is very common in various kinds of companies, such as those in the chemical, electronics, automotive, pharmaceutical and food industries. This work proposes six Simulated Annealing algorithms, their perturbation schemes and an algorithm for initial sequence generation. The study can be classified as "applied research" in nature, "exploratory" in its objectives and "experimental" in its procedures, with a "quantitative" approach. The proposed algorithms were effective in terms of solution quality and computationally efficient. Results of Analysis of Variance (ANOVA) revealed no significant difference between the schemes in terms of makespan. The PS4 scheme, which moves a subsequence of jobs, is suggested because it provides the best percentage of success. It was also found that there is a significant difference between the results of the algorithms for each value of the proportionality factor of the processing and setup times of the flow shops.
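A perturbation that moves a subsequence of jobs, as the record above describes for the PS4 scheme, might look like the sketch below, paired with the standard Metropolis acceptance rule of simulated annealing. The abstract gives no further detail on PS4, so the exact move, the function names and the acceptance rule are assumptions for this example.

```python
import math
import random


def move_subsequence(seq, start, length, new_pos):
    """Cut seq[start:start+length] and reinsert it at index new_pos
    of the remaining sequence."""
    block = seq[start:start + length]
    rest = seq[:start] + seq[start + length:]
    return rest[:new_pos] + block + rest[new_pos:]


def accept(delta, temperature, rng):
    """Metropolis rule: always accept improvements (delta <= 0);
    accept a worse makespan with probability exp(-delta / temperature)."""
    if delta <= 0:
        return True
    return rng.random() < math.exp(-delta / temperature)


if __name__ == "__main__":
    jobs = [1, 2, 3, 4, 5, 6]
    print(move_subsequence(jobs, start=1, length=2, new_pos=3))
    # [1, 4, 5, 2, 3, 6]
    print(accept(delta=-2.0, temperature=10.0, rng=random.Random(0)))
    # True
```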
From Massively Parallel Algorithms and Fluctuating Time Horizons to Nonequilibrium Surface Growth

International Nuclear Information System (INIS)

Korniss, G.; Toroczkai, Z.; Novotny, M. A.; Rikvold, P. A.

2000-01-01

We study the asymptotic scaling properties of a massively parallel algorithm for discrete-event simulations where the discrete events are Poisson arrivals. The evolution of the simulated time horizon is analogous to a nonequilibrium surface. Monte Carlo simulations and a coarse-grained approximation indicate that the macroscopic landscape in the steady state is governed by the Edwards-Wilkinson Hamiltonian. Since the efficiency of the algorithm corresponds to the density of local minima in the associated surface, our results imply that the algorithm is asymptotically scalable. (c) 2000 The American Physical Society
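The link between efficiency and the density of local minima stated in the record above can be illustrated with a toy model of a simulated time horizon. The sketch assumes a one-dimensional ring of processing elements with nearest-neighbor dependencies, where a site may advance only if its local virtual time does not exceed that of either neighbor; this update rule and the names used are assumptions for this example, not details given in the abstract.

```python
import random


def simulate_horizon(n_sites, n_steps, seed=0):
    """Evolve local virtual times tau on a ring and record utilization.

    A site advances only when its time is a local minimum of the horizon;
    its time then grows by an exponentially distributed increment, as for
    Poisson arrivals. Utilization is the fraction of sites advancing.
    """
    rng = random.Random(seed)
    tau = [0.0] * n_sites
    utilization = []
    for _ in range(n_steps):
        # Decide which sites may advance from the current horizon
        # (tau[i - 1] wraps around to the last site when i == 0).
        active = [i for i in range(n_sites)
                  if tau[i] <= tau[i - 1] and tau[i] <= tau[(i + 1) % n_sites]]
        for i in active:
            tau[i] += rng.expovariate(1.0)
        utilization.append(len(active) / n_sites)
    return tau, utilization


if __name__ == "__main__":
    tau, util = simulate_horizon(n_sites=1000, n_steps=500)
    print("final utilization:", util[-1])
```

The site holding the global minimum can always advance, so the utilization never drops to zero; whether it stays bounded away from zero as the number of sites grows is the scalability question the paper addresses.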
A modified probabilistic genetic algorithm for the solution of complex constrained optimization problems

OpenAIRE

Vorozheikin, A.; Gonchar, T.; Panfilov, I.; Sopov, E.; Sopov, S.

2009-01-01

A new algorithm for the solution of complex constrained optimization problems based on the probabilistic genetic algorithm with optimal solution prediction is proposed. The efficiency investigation results in comparison with standard genetic algorithm are presented.
Genetic algorithm based on qubits and quantum gates

International Nuclear Information System (INIS)

Silva, Joao Batista Rosa; Ramos, Rubens Viana

2003-01-01

The genetic algorithm, a computational technique based on the evolution of species in which a possible solution of the problem is coded in a binary string, called a chromosome, has been used successfully in several kinds of problems where the search for a minimal or maximal value is necessary, even when local minima are present. A natural generalization of a binary string is a qubit string. Hence, it is possible to use the structure of a genetic algorithm having a sequence of qubits as a chromosome and using quantum operations in the reproduction in order to find the best solution in some problems of quantum information. For example, given a unitary matrix U, what is the pair of qubits that, when applied at the input, provides the output state with maximal entanglement? In order to solve this problem, a population of chromosomes of two qubits was created. The crossover was performed by applying the quantum gates CNOT and SWAP to the pair of qubits, while the mutation was performed by applying the quantum gates Hadamard, Z and NOT to a single qubit. The result was compared with a classical genetic algorithm used to solve the same problem. A hundred simulations using the same U matrix were performed. Both algorithms, hereafter named CGA (classical) and QGA (using qubits), reached good results close to 1; however, the number of generations needed to find the best result was lower for the QGA. Another problem where the QGA can be useful is in the calculation of the relative entropy of entanglement. We have tested our algorithm using 100 pure states chosen randomly. The stop criterion used was an error lower than 0.01. The main advantages of the QGA are its good precision, robustness and very easy implementation. The main disadvantage is its low speed, as happens for all kinds of genetic algorithms. (author)
Optimization of the Compensation of a Meshed MV Network by a Modified Genetic Algorithm

DEFF Research Database (Denmark)

Nielsen, Hans; Paar, M.; Toman, P.

2007-01-01

The article discusses the utilization of a modified genetic algorithm (GA) for the optimization of the shunt compensation in meshed and radial MV distribution networks. The algorithm looks for minimum costs of the network power losses and minimum capital and operating costs of applied capacitors......, all of this under limitations specified by a multicriteria penalization function. The parallel evolution branches in the GA are used to accelerate the optimization. The application of this GA has been implemented in Matlab. The evaluation part of the GA implementation is based...... on the steady-state analysis using a linear one-line diagram model of a power network. The results of steady-state solutions are compared with the results from the DIgSILENT PowerFactory program. Its practical applicability is demonstrated on examples of 22 kV and meshed overhead distribution networks....
Spatial updating grand canonical Monte Carlo algorithms for fluid simulation: generalization to continuous potentials and parallel implementation.

Science.gov (United States)

O'Keeffe, C J; Ren, Ruichao; Orkoulas, G

2007-11-21

Spatial updating grand canonical Monte Carlo algorithms are generalizations of random and sequential updating algorithms for lattice systems to continuum fluid models. The elementary steps, insertions or removals, are constructed by generating points in space either at random (random updating) or in a prescribed order (sequential updating). These algorithms have previously been developed only for systems of impenetrable spheres for which no particle overlap occurs. In this work, spatial updating grand canonical algorithms are generalized to continuous, soft-core potentials to account for overlapping configurations. Results on two- and three-dimensional Lennard-Jones fluids indicate that spatial updating grand canonical algorithms, both random and sequential, converge faster than standard grand canonical algorithms. Spatial algorithms based on sequential updating not only exhibit the fastest convergence but also are ideal for parallel implementation due to the absence of strict detailed balance and the nature of the updating that minimizes interprocessor communication. Parallel simulation results for three-dimensional Lennard-Jones fluids show a substantial reduction of simulation time for systems of moderate and large size. The efficiency improvement by parallel processing through domain decomposition is always in addition to the efficiency improvement by sequential updating.
A "Hands on" Strategy for Teaching Genetic Algorithms to Undergraduates

Science.gov (United States)

Venables, Anne; Tan, Grace

2007-01-01

Genetic algorithms (GAs) are a problem solving strategy that uses stochastic search. Since their introduction (Holland, 1975), GAs have proven to be particularly useful for solving problems that are "intractable" using classical methods. The language of genetic algorithms (GAs) is heavily laced with biological metaphors from evolutionary…
Genetic algorithms applied to the nuclear power plant operation

International Nuclear Information System (INIS)

Schirru, R.; Martinez, A.S.; Pereira, C.M.N.A.

2000-01-01

Nuclear power plant operation often involves very important human decisions, such as actions to be taken after a nuclear accident/transient, or finding the best core reload pattern, a complex combinatorial optimization problem which requires expert knowledge. Due to the complexity involved in the decisions to be taken, computerized systems have been intensely explored in order to aid the operator. Following hardware advances, soft computing has been improved and, nowadays, intelligent technologies, such as genetic algorithms, neural networks and fuzzy systems, are being used to support operator decisions. In this chapter two main problems are explored: transient diagnosis and nuclear core refueling. Here, solutions to such kind of problems, based on genetic algorithms, are described. A genetic algorithm was designed to optimize the nuclear fuel reload of Angra-1 nuclear power plant. Results compared to those obtained by an expert reveal a gain in the burn-up cycle. Two other genetic algorithm approaches were used to optimize real time diagnosis systems. The first one learns partitions in the time series that represents the transients, generating a set of classification centroids. The other one involves the optimization of an adaptive vector quantization neural network. Results are shown and commented. (orig.)
Genetic Algorithm Optimizes Q-LAW Control Parameters

Science.gov (United States)

Lee, Seungwon; von Allmen, Paul; Petropoulos, Anastassios; Terrile, Richard

2008-01-01

A document discusses a multi-objective genetic algorithm designed to optimize Lyapunov feedback control law (Q-law) parameters in order to efficiently find Pareto-optimal solutions for low-thrust trajectories for electronic propulsion systems. These would be propellant-optimal solutions for a given flight time, or flight-time-optimal solutions for a given propellant requirement. The approximate solutions are used as good initial solutions for high-fidelity optimization tools. When the good initial solutions are used, the high-fidelity optimization tools quickly converge to a locally optimal solution near the initial solution. Q-law control parameters are represented as real-valued genes in the genetic algorithm. The performance of the Q-law control parameters is evaluated in the multi-objective space (flight time vs. propellant mass) and sorted by the non-dominated sorting method, which assigns a better fitness value to solutions that are dominated by fewer other solutions. With the ranking result, the genetic algorithm encourages the solutions with higher fitness values to participate in the reproduction process, improving the solutions in the evolution process. The population of solutions converges to the Pareto front that is permitted within the Q-law control parameter space.
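The ranking idea in this abstract (a solution is better the fewer other solutions dominate it) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names and the sample (flight time, propellant mass) pairs are hypothetical, and both objectives are assumed to be minimized.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and strictly better in one.

    Both objectives are assumed to be minimized (hypothetical convention).
    """
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def domination_counts(points):
    """For each point, count how many other points dominate it."""
    return [sum(dominates(q, p) for q in points) for p in points]


def pareto_front(points):
    """Points dominated by no other point."""
    counts = domination_counts(points)
    return [p for p, c in zip(points, counts) if c == 0]


# Hypothetical (flight time, propellant mass) pairs, for illustration only.
candidates = [(10, 5), (8, 7), (12, 4), (11, 6), (9, 9)]
ranks = domination_counts(candidates)
front = pareto_front(candidates)
```

A selection step would then favor candidates with a lower domination count, which is one simple way to turn the ranking into a fitness value.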
Decentralized diagnostics based on a distributed micro-genetic algorithm for transducer networks monitoring large experimental systems.

Science.gov (United States)

Arpaia, P; Cimmino, P; Girone, M; La Commara, G; Maisto, D; Manna, C; Pezzetti, M

2014-09-01

Evolutionary approach to centralized multiple-faults diagnostics is extended to distributed transducer networks monitoring large experimental systems. Given a set of anomalies detected by the transducers, each instance of the multiple-fault problem is formulated as several parallel communicating sub-tasks running on different transducers, and thus solved one-by-one on spatially separated parallel processes. A micro-genetic algorithm merges evaluation time efficiency, arising from a small-size population distributed on parallel-synchronized processors, with the effectiveness of centralized evolutionary techniques due to optimal mix of exploitation and exploration. In this way, holistic view and effectiveness advantages of evolutionary global diagnostics are combined with reliability and efficiency benefits of distributed parallel architectures. The proposed approach was validated both (i) by simulation at CERN, on a case study of a cold box for enhancing the cryogeny diagnostics of the Large Hadron Collider, and (ii) by experiments, under the framework of the industrial research project MONDIEVOB (Building Remote Monitoring and Evolutionary Diagnostics), co-funded by EU and the company Del Bo srl, Napoli, Italy.
Finite element analysis and genetic algorithm optimization design for the actuator placement on a large adaptive structure

Science.gov (United States)

Sheng, Lizeng

, GA Version 1, 2 and 3, were developed to find the optimal locations of piezoelectric actuators from the order of 10^21 to 10^56 candidate placements. Introducing a variable population approach, we improve the flexibility of the selection operation in genetic algorithms. Incorporating mutation and hill climbing into micro-genetic algorithms, we are able to develop a more efficient genetic algorithm. Through extensive numerical experiments, we find that the design search space for the optimal placements of a large number of actuators is highly multi-modal and that the most distinct nature of genetic algorithms is their robustness. They give results that are random but with only a slight variability. The genetic algorithms can be used to get an adequate solution using a limited number of evaluations. To get the highest quality solution, multiple runs including different random seed generators are necessary. The investigation time can be significantly reduced using very coarse-grain parallel computing. Overall, the methodology of using finite element analysis and genetic algorithm optimization provides a robust solution approach for the challenging problem of optimal placements of a large number of actuators in the design of the next generation of adaptive structures.
Multiple-algorithm parallel fusion of infrared polarization and intensity images based on algorithmic complementarity and synergy

Science.gov (United States)

Zhang, Lei; Yang, Fengbao; Ji, Linna; Lv, Sheng

2018-01-01

Diverse image fusion methods perform differently. Each method has advantages and disadvantages compared with others. One notion is that the advantages of different image methods can be effectively combined. A multiple-algorithm parallel fusion method based on algorithmic complementarity and synergy is proposed. First, in view of the characteristics of the different algorithms and difference-features among images, an index vector-based feature-similarity is proposed to define the degree of complementarity and synergy. This proposed index vector is a reliable evidence indicator for algorithm selection. Second, the algorithms with a high degree of complementarity and synergy are selected. Then, the different degrees of various features and infrared intensity images are used as the initial weights for the nonnegative matrix factorization (NMF). This avoids randomness of the NMF initialization parameter. Finally, the fused images of different algorithms are integrated using the NMF because of its excellent data fusing performance on independent features. Experimental results demonstrate that the visual effect and objective evaluation index of the fused images obtained using the proposed method are better than those obtained using traditional methods. The proposed method retains all the advantages that individual fusion algorithms have.
Warehouse stocking optimization based on dynamic ant colony genetic algorithm

Science.gov (United States)

Xiao, Xiaoxu

2018-04-01

In view of the various orders of FAW (First Automotive Works) International Logistics Co., Ltd., the SLP method is used to optimize the layout of the warehousing units in the enterprise, thus the warehouse logistics is optimized and the external processing speed of the order is improved. In addition, the relevant intelligent algorithms for optimizing the stocking route problem are analyzed. The ant colony algorithm and genetic algorithm which have good applicability are emphatically studied. The parameters of ant colony algorithm are optimized by genetic algorithm, which improves the performance of ant colony algorithm. A typical path optimization problem model is taken as an example to prove the effectiveness of parameter optimization.
Parallel Algorithm for Adaptive Numerical Integration

International Nuclear Information System (INIS)

Sujatmiko, M.; Basarudin, T.

1997-01-01

This paper presents an automatic algorithm for integration using the adaptive trapezoidal method. The interval is divided adaptively, so that the sub-intervals have different widths that fit the behavior of the function. For a function f, the integral over the interval [a, b] can be obtained, with maximum tolerance ε, using estimation (f, a, b, ε). The estimated solution is valid if the error is within a reasonable range, i.e. fulfils certain criteria. If the error is large, however, the problem is solved by dividing it into two similar and independent sub-problems on the separate intervals [a, (a+b)/2] and [(a+b)/2, b], i.e. the estimations (f, a, (a+b)/2, ε/2) and (f, (a+b)/2, b, ε/2). The problems are solved on two different kinds of processor, the root processor and the worker processors. The root processor's function is to divide the main problem into sub-problems and distribute them to the worker processors. The division may continue until all of the sub-problems are resolved. The solution of each sub-problem is then submitted to the root processor so that the solution of the main problem can be obtained. The algorithm is implemented in C on a distributed computer network under the Parallel Virtual Machine platform.
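The recursive splitting described in this abstract can be sketched serially in Python. The abstract does not state the acceptance criterion, so the test used here (comparing a one-panel and a two-panel trapezoid estimate) is an assumption, as are the function names and the depth limit. The two recursive calls are independent of each other, which is the property that allows a root process to hand them to workers.

```python
import math


def trapezoid(f, a, b):
    """Single-panel trapezoid rule on [a, b]."""
    return 0.5 * (b - a) * (f(a) + f(b))


def estimate(f, a, b, eps, depth=0, max_depth=50):
    """Adaptive trapezoid estimate of the integral of f over [a, b].

    Assumed acceptance test: the two-panel result is accepted when
    |fine - coarse| / 3 <= eps, the usual error estimate for the
    trapezoid rule. Otherwise the interval is halved and each half
    is solved with tolerance eps / 2.
    """
    m = 0.5 * (a + b)
    coarse = trapezoid(f, a, b)
    fine = trapezoid(f, a, m) + trapezoid(f, m, b)
    if abs(fine - coarse) / 3.0 <= eps or depth >= max_depth:
        return fine
    # The two halves are independent sub-problems.
    left = estimate(f, a, m, eps / 2.0, depth + 1, max_depth)
    right = estimate(f, m, b, eps / 2.0, depth + 1, max_depth)
    return left + right


# Example: the integral of sin over [0, pi] is exactly 2.
approx = estimate(math.sin, 0.0, math.pi, 1e-6)
```

In a distributed version, the root would keep a queue of (a, b, eps) triples and sum the partial results returned by the workers; that bookkeeping is omitted here.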
Dynamic traffic assignment : genetic algorithms approach

Science.gov (United States)

1997-01-01

Real-time route guidance is a promising approach to alleviating congestion on the nation's highways. A dynamic traffic assignment model is central to the development of guidance strategies. The artificial intelligence technique of genetic algorithm...
Massively parallel performance of neutron transport response matrix algorithms

International Nuclear Information System (INIS)

Hanebutte, U.R.; Lewis, E.E.

1993-01-01

Massively parallel red/black response matrix algorithms for the solution of within-group neutron transport problems are implemented on the Connection Machines-2, 200 and 5. The response matrices are derived from the diamond-difference and linear-linear nodal discrete ordinate and variational nodal P3 approximations. The unaccelerated performance of the iterative procedure is examined relative to the maximum rated performances of the machines. The effects of processor partition size, of virtual processor ratio and of problem size are examined in detail. For the red/black algorithm, the ratio of inter-node communication to computing times is found to be quite small, normally of the order of ten percent or less. Performance increases with problem size and with virtual processor ratio, within the memory per physical processor limitation. Algorithm adaptation to coarser grain machines is straightforward, with total computing time being virtually inversely proportional to the number of physical processors. (orig.)
A new parallelization algorithm of ocean model with explicit scheme

Science.gov (United States)

Fu, X. D.

2017-08-01

This paper focuses on the parallelization of ocean models with an explicit scheme, which is one of the most commonly used schemes in the discretization of the governing equations of ocean models. The characteristics of the explicit scheme are that the calculation is simple and that the value at a given grid point of the ocean model depends on grid-point values at the previous time step, which means that one does not need to solve sparse linear equations in the process of solving the governing equations of the ocean model. Exploiting these characteristics, this paper designs a parallel algorithm, named halo cells update, that requires only minor modification of the original ocean model and little change to its space step and time step; it parallelizes the ocean model by adding a transmission module between sub-domains. This paper takes GRGO (Global Reduced Gravity Ocean model) as an example and implements its parallelization with halo update. The results demonstrate that a higher speedup can be achieved at different problem sizes.
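The halo-cell idea can be illustrated on a toy one-dimensional explicit stencil. This is only a sketch under stated assumptions: the three-point update, the coefficient alpha, the split into exactly two sub-domains, and all function names are hypothetical and have nothing to do with the GRGO code itself. Each sub-domain carries one extra cell copied from its neighbour, and the copies are refreshed after every time step.

```python
def step(u, alpha=0.25):
    """One explicit update of the interior points; the two end cells are left unchanged."""
    new = u[:]
    for i in range(1, len(u) - 1):
        new[i] = u[i] + alpha * (u[i - 1] - 2.0 * u[i] + u[i + 1])
    return new


def run_serial(u, n_steps):
    """Reference run on the undivided grid."""
    for _ in range(n_steps):
        u = step(u)
    return u


def run_with_halos(u, n_steps):
    """Same computation on two sub-domains that exchange halo cells."""
    mid = len(u) // 2
    left = u[:mid] + [u[mid]]        # owned cells plus a halo cell on the right
    right = [u[mid - 1]] + u[mid:]   # a halo cell on the left plus owned cells
    for _ in range(n_steps):
        left = step(left)            # in a parallel run these two updates
        right = step(right)          # would execute on different processes
        # Halo update: each side receives its neighbour's new boundary value.
        left[-1] = right[1]
        right[0] = left[-2]
    return left[:-1] + right[1:]     # drop the halos and stitch the grid together


initial = [0.0] * 4 + [1.0] * 4 + [0.0] * 4
serial = run_serial(initial, 10)
split = run_with_halos(initial, 10)
```

Because an explicit step needs only values from the previous time step, the two sub-domains can advance independently between exchanges, and the stitched result matches the undivided run.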

A novel highly parallel algorithm for linearly unmixing hyperspectral images

Science.gov (United States)

Guerra, Raúl; López, Sebastián.; Callico, Gustavo M.; López, Jose F.; Sarmiento, Roberto

2014-10-01

Endmember extraction and abundance calculation represent critical steps within the process of linearly unmixing a given hyperspectral image, for two main reasons. The first is the need to compute a set of accurate endmembers in order to obtain confident abundance maps. The second is the huge number of operations involved in these time-consuming processes. This work proposes an algorithm to estimate the endmembers of a hyperspectral image under analysis and its abundances at the same time. The main advantages of this algorithm are its high degree of parallelization and the mathematical simplicity of the operations implemented. This algorithm estimates the endmembers as virtual pixels. In particular, the proposed algorithm performs the gradient descent method to iteratively refine the endmembers and the abundances, reducing the mean square error, in accordance with the linear unmixing model. Some mathematical restrictions must be added so that the method converges to a unique and realistic solution. Given the nature of the algorithm, these restrictions can be easily implemented. The results obtained with synthetic images demonstrate the good behavior of the proposed algorithm. Moreover, the results obtained with the well-known Cuprite dataset also corroborate the benefits of our proposal.
Development of imaging and reconstructions algorithms on parallel processing architectures for applications in non-destructive testing

International Nuclear Information System (INIS)

Pedron, Antoine

2013-01-01

This thesis work is placed between the scientific domain of ultrasound non-destructive testing and algorithm-architecture adequation. Ultrasound non-destructive testing includes a group of analysis techniques used in science and industry to evaluate the properties of a material, component, or system without causing damage. In order to characterise possible defects, determining their position, size and shape, imaging and reconstruction tools have been developed at CEA-LIST, within the CIVA software platform. Evolution of acquisition sensors implies a continuous growth of datasets and consequently more and more computing power is needed to maintain interactive reconstructions. General purpose processors (GPP) evolving towards parallelism and emerging architectures such as GPU allow large acceleration possibilities that can be applied to these algorithms. The main goal of the thesis is to evaluate the acceleration that can be obtained for two reconstruction algorithms on these architectures. These two algorithms differ in their parallelization scheme. The first one can be properly parallelized on GPP whereas on GPU, an intensive use of atomic instructions is required. Within the second algorithm, parallelism is easier to express, but loop ordering on GPP, as well as thread scheduling and a good use of shared memory on GPU are necessary in order to obtain efficient results. Different API or libraries, such as OpenMP, CUDA and OpenCL are evaluated through chosen benchmarks. An integration of both algorithms in the CIVA software platform is proposed and different issues related to code maintenance and durability are discussed. (author)
Advanced optimization of permanent magnet wigglers using a genetic algorithm

Energy Technology Data Exchange (ETDEWEB)

Hajima, Ryoichi [Univ. of Tokyo (Japan)

1995-12-31

In permanent magnet wigglers, magnetic imperfection of each magnet piece causes field error. This field error can be reduced or compensated by sorting the magnet pieces in a proper order. We showed that a genetic algorithm has good properties for this sorting scheme. In this paper, this optimization scheme is applied to the case of permanent magnets which have errors in the direction of the field. The result shows that the genetic algorithm is superior to other algorithms.
Advanced optimization of permanent magnet wigglers using a genetic algorithm

International Nuclear Information System (INIS)

Hajima, Ryoichi

1995-01-01

In permanent magnet wigglers, magnetic imperfection of each magnet piece causes field error. This field error can be reduced or compensated by sorting the magnet pieces in a proper order. We showed that a genetic algorithm has good properties for this sorting scheme. In this paper, this optimization scheme is applied to the case of permanent magnets which have errors in the direction of the field. The result shows that the genetic algorithm is superior to other algorithms.
Experimental Performance of a Genetic Algorithm for Airborne Strategic Conflict Resolution

Science.gov (United States)

Karr, David A.; Vivona, Robert A.; Roscoe, David A.; DePascale, Stephen M.; Consiglio, Maria

2009-01-01

The Autonomous Operations Planner, a research prototype flight-deck decision support tool to enable airborne self-separation, uses a pattern-based genetic algorithm to resolve predicted conflicts between the ownship and traffic aircraft. Conflicts are resolved by modifying the active route within the ownship's flight management system according to a predefined set of maneuver pattern templates. The performance of this pattern-based genetic algorithm was evaluated in the context of batch-mode Monte Carlo simulations running over 3600 flight hours of autonomous aircraft in en-route airspace under conditions ranging from typical current traffic densities to several times that level. Encountering over 8900 conflicts during two simulation experiments, the genetic algorithm was able to resolve all but three conflicts, while maintaining a required time of arrival constraint for most aircraft. Actual elapsed running time for the algorithm was consistent with conflict resolution in real time. The paper presents details of the genetic algorithm's design, along with mathematical models of the algorithm's performance and observations regarding the effectiveness of using complementary maneuver patterns when multiple resolutions by the same aircraft were required.
Application of Genetic Algorithms in Seismic Tomography

Science.gov (United States)

Soupios, Pantelis; Akca, Irfan; Mpogiatzis, Petros; Basokur, Ahmet; Papazachos, Constantinos

2010-05-01

In the earth sciences several inverse problems that require data fitting and parameter estimation are nonlinear and can involve a large number of unknown parameters. Consequently, the application of analytical inversion or optimization techniques may be quite restrictive. In practice, most analytical methods are local in nature and rely on a linearized form of the problem in question, adopting an iterative procedure using partial derivatives to improve an initial model. This approach can lead to a dependence of the final model solution on the starting model and is prone to entrapment in local misfit minima. Moreover, the calculation of derivatives can be computationally inefficient and create instabilities when numerical approximations are used. In contrast to these local minimization methods, global techniques do not rely on partial derivatives, are independent of the form of the data misfit criterion, and are computationally robust. Such methods often use random processes to sample a selected wider span of the model space. In this situation, randomly generated models are assessed in terms of their data-fitting quality and the process may be stopped after a certain number of acceptable models is identified or continued until a satisfactory data fit is achieved. A new class of methods known as genetic algorithms achieves the aforementioned approximation through novel model representation and manipulations. Genetic algorithms (GAs) were originally developed in the field of artificial intelligence by John Holland more than 20 years ago, but even in this field it is less than a decade since the methodology became more generally applied, and only recently did the methodology attract the attention of the earth sciences community. Applications have been generally concentrated in geophysics and in particular seismology. As awareness of genetic algorithms grows there surely will be many more and varied applications to earth science problems. In the present work, the
Parallel SN algorithms in shared- and distributed-memory environments

International Nuclear Information System (INIS)

Haghighat, Alireza; Hunter, Melissa A.; Mattis, Ronald E.

1995-01-01

Different 2-D spatial domain partitioning Sn transport theory algorithms have been developed on the basis of the Block-Jacobi iterative scheme. These algorithms have been incorporated into TWOTRAN-II, and tested on a shared-memory CRAY Y-MP C90 and a distributed-memory IBM SP1. For a series of fixed source r-z geometry homogeneous problems, parallel efficiencies in a range of 50-90% are achieved on the C90 with 6 processors, and lower values (20-60%) are obtained on the SP1. It is demonstrated that better performance is attainable if one addresses issues such as convergence rate, load-balancing, and granularity for both architectures, as well as message passing (network bandwidth and latency) for SP1. (author). 17 refs, 4 figs
Genetic algorithm for nuclear data evaluation

Energy Technology Data Exchange (ETDEWEB)

Arthur, Jennifer Ann [Los Alamos National Lab. (LANL), Los Alamos, NM (United States)

2018-02-02

These are slides on genetic algorithm for nuclear data evaluation. The following is covered: initial population, fitness (outer loop), calculate fitness, selection (first part of inner loop), reproduction (second part of inner loop), solution, and examples.
Parallel algorithms for interactive manipulation of digital terrain models

Science.gov (United States)

Davis, E. W.; Mcallister, D. F.; Nagaraj, V.

1988-01-01

Interactive three-dimensional graphics applications, such as terrain data representation and manipulation, require extensive arithmetic processing. Massively parallel machines are attractive for this application since they offer high computational rates, and grid connected architectures provide a natural mapping for grid based terrain models. Presented here are algorithms for data movement on the massive parallel processor (MPP) in support of pan and zoom functions over large data grids. It is an extension of earlier work that demonstrated real-time performance of graphics functions on grids that were equal in size to the physical dimensions of the MPP. When the dimensions of a data grid exceed the processing array size, data is packed in the array memory. Windows of the total data grid are interactively selected for processing. Movement of packed data is needed to distribute items across the array for efficient parallel processing. Execution time for data movement was found to exceed that for arithmetic aspects of graphics functions. Performance figures are given for routines written in MPP Pascal.
Parallel algorithms for finding cliques in a graph

International Nuclear Information System (INIS)

Szabo, S

2011-01-01

A clique is a subgraph in a graph that is complete in the sense that every two of its nodes are connected by an edge. Finding cliques in a given graph is an important procedure in discrete mathematical modeling. The paper will show how concepts such as splitting partitions, quasi coloring, node and edge dominance are related to clique search problems. In particular we will discuss the connection with parallel clique search algorithms. These concepts also suggest practical guidelines to inspect a given graph before starting a large scale search.
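The definition of a clique given in this abstract translates directly into a short check. The sketch below is illustrative only: the adjacency-set representation, the function name, and the small example graph are assumptions, and it does not implement any of the search techniques the paper discusses.

```python
from itertools import combinations


def is_clique(adj, nodes):
    """True if every two of the given nodes are connected by an edge.

    adj maps each node to the set of its neighbours (undirected graph).
    """
    return all(v in adj[u] for u, v in combinations(nodes, 2))


# Hypothetical graph: a triangle 1-2-3 with a pendant node 4 attached to 3.
graph = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
triangle_ok = is_clique(graph, [1, 2, 3])
path_ok = is_clique(graph, [2, 3, 4])
```

Checking one candidate set costs a number of lookups quadratic in its size; the difficulty of clique search lies in the number of candidate sets, which is why independent parts of the search are attractive to run in parallel.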
Optimal data replication: A new approach to optimizing parallel EM algorithms on a mesh-connected multiprocessor for 3D PET image reconstruction

International Nuclear Information System (INIS)

Chen, C.M.; Lee, S.Y.

1995-01-01

The EM algorithm promises an estimated image with the maximal likelihood for 3D PET image reconstruction. However, due to its long computation time, the EM algorithm has not been widely used in practice. While several parallel implementations of the EM algorithm have been developed to make the EM algorithm feasible, they do not guarantee an optimal parallelization efficiency. In this paper, the authors propose a new parallel EM algorithm which maximizes the performance by optimizing data replication on a mesh-connected message-passing multiprocessor. To optimize data replication, the authors have formally derived the optimal allocation of shared data, group sizes, integration and broadcasting of replicated data as well as the scheduling of shared data accesses. The proposed parallel EM algorithm has been implemented on an iPSC/860 with 16 PEs. The experimental and theoretical results, which are consistent with each other, have shown that the proposed parallel EM algorithm could improve performance substantially over those using unoptimized data replication
Parallel Implementation and Scaling of an Adaptive Mesh Discrete Ordinates Algorithm for Transport

International Nuclear Information System (INIS)

Howell, L H

2004-01-01

Block-structured adaptive mesh refinement (AMR) uses a mesh structure built up out of locally-uniform rectangular grids. In the BoxLib parallel framework used by the Raptor code, each processor operates on one or more of these grids at each refinement level. The decomposition of the mesh into grids and the distribution of these grids among processors may change every few timesteps as a calculation proceeds. Finer grids use smaller timesteps than coarser grids, requiring additional work to keep the system synchronized and ensure conservation between different refinement levels. In a paper for NECDC 2002 I presented preliminary results on implementation of parallel transport sweeps on the AMR mesh, conjugate gradient acceleration, accuracy of the AMR solution, and scalar speedup of the AMR algorithm compared to a uniform fully-refined mesh. This paper continues with a more in-depth examination of the parallel scaling properties of the scheme, both in single-level and multi-level calculations. Both sweeping and setup costs are considered. The algorithm scales with acceptable performance to several hundred processors. Trends suggest, however, that this is the limit for efficient calculations with traditional transport sweeps, and that modifications to the sweep algorithm will be increasingly needed as job sizes in the thousands of processors become common
Nonlinear inversion of potential-field data using a hybrid-encoding genetic algorithm

Science.gov (United States)

Chen, C.; Xia, J.; Liu, J.; Feng, G.

2006-01-01

Using a genetic algorithm to solve an inverse problem of complex nonlinear geophysical equations is advantageous because it does not require computing gradients of models or "good" initial models. The multi-point search of a genetic algorithm makes it easier to find the globally optimal solution while avoiding falling into a local extremum. As is the case in other optimization approaches, the search efficiency for a genetic algorithm is vital in finding desired solutions successfully in a multi-dimensional model space. A binary-encoding genetic algorithm is hardly ever used to resolve an optimization problem such as a simple geophysical inversion with only three unknowns. The encoding mechanism, genetic operators, and population size of the genetic algorithm greatly affect search processes in the evolution. It is clear that improved operators and proper population size promote the convergence. Nevertheless, not all genetic operations perform perfectly while searching under either a uniform binary or a decimal encoding system. With the binary encoding mechanism, the crossover scheme may produce more new individuals than with the decimal encoding. On the other hand, the mutation scheme in a decimal encoding system will create new genes larger in scope than those in the binary encoding. This paper discusses approaches of exploiting the search potential of genetic operations in the two encoding systems and presents an approach with a hybrid-encoding mechanism, multi-point crossover, and dynamic population size for geophysical inversion. We present a method that is based on the routine in which the mutation operation is conducted in the decimal code and multi-point crossover operation in the binary code. The mix-encoding algorithm is called the hybrid-encoding genetic algorithm (HEGA). HEGA provides better genes with a higher probability by a mutation operator and improves genetic algorithms in resolving complicated geophysical inverse problems. Another significant
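The routine summarized in this abstract, multi-point crossover in the binary code and mutation in the decimal code, can be sketched for a single real-valued parameter as follows. This is a loose illustration and not the HEGA implementation: the 16-bit resolution, the Gaussian mutation, the mutation scale, and every function name are assumptions.

```python
import random

N_BITS = 16  # assumed binary resolution per parameter


def encode(x, lo, hi):
    """Map a real value in [lo, hi] to a list of N_BITS bits (most significant first)."""
    level = round((x - lo) / (hi - lo) * (2 ** N_BITS - 1))
    return [(level >> i) & 1 for i in reversed(range(N_BITS))]


def decode(bits, lo, hi):
    """Inverse of encode, up to quantization error."""
    level = 0
    for b in bits:
        level = (level << 1) | b
    return lo + level / (2 ** N_BITS - 1) * (hi - lo)


def multipoint_crossover(bits_a, bits_b, n_points, rng):
    """Alternate segments of the two parents between n_points random cut positions."""
    cuts = sorted(rng.sample(range(1, len(bits_a)), n_points))
    child, take_a, prev = [], True, 0
    for cut in cuts + [len(bits_a)]:
        child.extend((bits_a if take_a else bits_b)[prev:cut])
        take_a = not take_a
        prev = cut
    return child


def decimal_mutation(x, lo, hi, scale, rng):
    """Perturb the real value directly (assumed Gaussian step), clipped to [lo, hi]."""
    return min(hi, max(lo, x + rng.gauss(0.0, scale * (hi - lo))))


def make_child(xa, xb, lo, hi, rng):
    """Crossover in the binary code, then mutation in the decimal code."""
    bits = multipoint_crossover(encode(xa, lo, hi), encode(xb, lo, hi), 2, rng)
    return decimal_mutation(decode(bits, lo, hi), lo, hi, 0.05, rng)


rng = random.Random(0)
child = make_child(2.0, 8.0, 0.0, 10.0, rng)
```

Switching representation between the two operators is the point of the sketch: recombination acts on bit patterns, while mutation moves the decoded value by a controllable real-valued step.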
Optimal groundwater remediation using artificial neural networks and the genetic algorithm

Energy Technology Data Exchange (ETDEWEB)

Rogers, Leah L. [Stanford Univ., CA (United States)

1992-08-01

An innovative computational approach for the optimization of groundwater remediation is presented which uses artificial neural networks (ANNs) and the genetic algorithm (GA). In this approach, the ANN is trained to predict an aspect of the outcome of a flow and transport simulation. Then the GA searches through realizations or patterns of pumping and uses the trained network to predict the outcome of the realizations. This approach has advantages of parallel processing of the groundwater simulations and the ability to "recycle" or reuse the base of knowledge formed by these simulations. These advantages offer reduction of computational burden of the groundwater simulations relative to a more conventional approach which uses nonlinear programming (NLP) with a quasi-Newtonian search. Also the modular nature of this approach facilitates substitution of different groundwater simulation models.
Optimal groundwater remediation using artificial neural networks and the genetic algorithm

International Nuclear Information System (INIS)

Rogers, L.L.

1992-08-01

An innovative computational approach for the optimization of groundwater remediation is presented which uses artificial neural networks (ANNs) and the genetic algorithm (GA). In this approach, the ANN is trained to predict an aspect of the outcome of a flow and transport simulation. Then the GA searches through realizations or patterns of pumping and uses the trained network to predict the outcome of the realizations. This approach has advantages of parallel processing of the groundwater simulations and the ability to "recycle" or reuse the base of knowledge formed by these simulations. These advantages offer reduction of computational burden of the groundwater simulations relative to a more conventional approach which uses nonlinear programming (NLP) with a quasi-Newtonian search. Also the modular nature of this approach facilitates substitution of different groundwater simulation models.
Improved multilayer OLED architecture using evolutionary genetic algorithm

International Nuclear Information System (INIS)

Quirino, W.G.; Teixeira, K.C.; Legnani, C.; Calil, V.L.; Messer, B.; Neto, O.P. Vilela; Pacheco, M.A.C.; Cremona, M.

2009-01-01

Organic light-emitting diodes (OLEDs) constitute a new class of emissive devices, which present high efficiency and low voltage operation, among other advantages over current technology. Multilayer architecture (M-OLED) is generally used to optimize these devices, especially to overcome the suppression of light emission due to exciton recombination near the metal layers. However, improvement in recombination, transport and charge injection can also be achieved by blending electron and hole transporting layers into the same one. Graded emissive region devices can provide promising results regarding quantum and power efficiency and brightness, as well. The massive number of possible model configurations, however, suggests that a search algorithm would be more suitable for this matter. In this work, multilayer OLEDs were simulated and fabricated using Genetic Algorithms (GAs) as an evolutionary strategy to improve their efficiency. Genetic Algorithms are stochastic algorithms based on genetic inheritance and the Darwinian struggle for survival. In our simulations, a 50 nm wide graded region was assumed, divided into five equally sized layers. The relative concentrations of the materials within each layer were optimized to obtain the lowest V/J^0.5 ratio, where V is the applied voltage and J the current density. The best M-OLED architecture obtained by the genetic algorithm presented a V/J^0.5 ratio nearly 7% lower than the value reported in the literature. In order to check the experimental validity of the improved results obtained in the simulations, two M-OLEDs with different architectures were fabricated by thermal deposition in a high vacuum environment. The results of the comparison between simulation and some experiments are presented and discussed.
Genetic algorithms at UC Davis/LLNL

Energy Technology Data Exchange (ETDEWEB)

Vemuri, V.R. [comp.

1993-12-31

A tutorial introduction to genetic algorithms is given. This brief tutorial should serve the purpose of introducing the subject to the novice. The tutorial is followed by a brief commentary on the term project reports that follow.
Examination of Speed Contribution of Parallelization for Several Fingerprint Pre-Processing Algorithms

Directory of Open Access Journals (Sweden)

GORGUNOGLU, S.

2014-05-01

In the analysis of minutiae-based fingerprint systems, fingerprints need to be pre-processed. Pre-processing is carried out to enhance the quality of the fingerprint and to obtain more accurate minutiae points. Reducing the pre-processing time is important for identification and verification in real-time systems, especially for databases holding a large amount of fingerprint information. Parallel processing on CPUs can be regarded as the distribution of processes over a multi-core processor, which is done using parallel programming techniques. Reducing the execution time is the main objective of parallel processing. In this study, the pre-processing of a minutiae-based fingerprint system is implemented by parallel processing on multi-core computers using OpenMP and on a graphics processor using CUDA to improve execution time. The execution times and speedup ratios are compared with those of a single-core processor. The results show that parallel processing substantially improves execution time. The improvement ratios obtained for the different pre-processing algorithms allowed us to make suggestions on the more suitable approaches for parallelization.
Resizing Technique-Based Hybrid Genetic Algorithm for Optimal Drift Design of Multistory Steel Frame Buildings

Directory of Open Access Journals (Sweden)

Hyo Seon Park

2014-01-01

Since genetic algorithm-based optimization methods are computationally expensive for practical use in the field of structural optimization, a resizing technique-based hybrid genetic algorithm for the drift design of multistory steel frame buildings is proposed to increase the convergence speed of genetic algorithms. To reduce the number of structural analyses required for convergence, a genetic algorithm is combined with a resizing technique, an efficient optimization technique that controls the drift of buildings without repetitive structural analysis. The resizing technique-based hybrid genetic algorithm proposed in this paper is applied to the minimum weight design of three steel frame buildings. To evaluate the performance of the algorithm, optimum weights, computational times, and generation numbers from the proposed algorithm are compared with those from a genetic algorithm. Based on the comparisons, it is concluded that the hybrid genetic algorithm shows clear improvements in convergence properties.
Parallel implementation of DNA sequences matching algorithms using PWM on GPU architecture.

Science.gov (United States)

Sharma, Rahul; Gupta, Nitin; Narang, Vipin; Mittal, Ankush

2011-01-01

Positional Weight Matrices (PWMs) are widely used in the representation and detection of Transcription Factor Binding Sites (TFBSs) on DNA. We implement an online PWM search algorithm on a parallel architecture. Large PWM data sets can be processed in parallel on Graphics Processing Unit (GPU) systems, which helps match sequences at a faster rate. Our method makes extensive use of the highly multithreaded architecture and shared memory of multi-core GPUs. Efficient use of shared memory is required to optimise parallel reduction in CUDA. Our optimised method achieves a speedup of 230-280x over a linear implementation, running on a GeForce GTX 280 GPU.
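The kind of search described in this abstract can be illustrated with a minimal sequential sketch in Python. It is not the authors' CUDA implementation: the function names `pwm_from_counts` and `scan`, the pseudocount and the uniform background frequency are all assumptions made for illustration. The point it shows is that each window score depends only on its own slice of the sequence, which is why the work maps naturally onto one GPU thread per window.

```python
import math

def pwm_from_counts(counts, pseudocount=1.0, background=0.25):
    """Convert per-position base counts into log2-odds scores.

    The pseudocount and uniform background are illustrative assumptions.
    """
    pwm = []
    for column in counts:
        total = sum(column.values()) + 4 * pseudocount
        pwm.append({
            base: math.log2(((column.get(base, 0) + pseudocount) / total) / background)
            for base in "ACGT"
        })
    return pwm

def scan(sequence, pwm):
    """Score every window of len(pwm) bases in the sequence.

    Windows are independent of each other, so a parallel version could
    assign one window to each thread.
    """
    width = len(pwm)
    return [
        sum(pwm[j][sequence[i + j]] for j in range(width))
        for i in range(len(sequence) - width + 1)
    ]

# Toy motif "AGC" built from eight aligned sites (hypothetical data).
counts = [
    {"A": 8, "C": 0, "G": 0, "T": 0},
    {"A": 0, "C": 0, "G": 8, "T": 0},
    {"A": 0, "C": 8, "G": 0, "T": 0},
]
pwm = pwm_from_counts(counts)
scores = scan("TTAGCAA", pwm)
best_offset = max(range(len(scores)), key=scores.__getitem__)
```

In this toy run the highest-scoring window starts at offset 2, where the sequence contains "AGC".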

Genetic algorithms for adaptive real-time control in space systems

Science.gov (United States)

Vanderzijp, J.; Choudry, A.

1988-01-01

Genetic Algorithms used for learning are discussed as one way to control the combinatorial explosion associated with the generation of new rules. The Genetic Algorithm approach tends to work best when it can be applied to a domain-independent knowledge representation. Applications to real-time control in space systems are discussed.
Indoor high precision three-dimensional positioning system based on visible light communication using modified genetic algorithm

Science.gov (United States)

Chen, Hao; Guan, Weipeng; Li, Simin; Wu, Yuxiang

2018-04-01

To improve the precision of indoor positioning and achieve three-dimensional positioning, a reversed indoor positioning system based on visible light communication (VLC) using a genetic algorithm (GA) is proposed. To solve the problem of interference between signal sources, CDMA modulation is used. Each light-emitting diode (LED) in the system broadcasts a unique identity (ID) code using CDMA modulation. The receiver receives a mixed signal from every LED reference point; owing to the orthogonality of the spreading codes in CDMA modulation, the ID information and intensity attenuation information from every LED can be recovered. According to the positioning principle of received signal strength (RSS), the coordinates of the receiver can then be determined. Because of system noise and imperfections of the devices used in the system, the estimated distance between receiver and transmitters deviates from the real value, resulting in positioning error. By introducing error correction factors into the global parallel search of the genetic algorithm, the coordinates of the receiver in three-dimensional space can be determined precisely. Both simulation and experimental results show that, in practical application scenarios, the proposed positioning system can provide a high-precision positioning service.
Applying genetic algorithms for programming manufacturing cell tasks

Directory of Open Access Journals (Sweden)

Efredy Delgado

2005-05-01

This work was aimed at developing computational intelligence for scheduling a manufacturing cell's tasks, based mainly on genetic algorithms. The manufacturing cell was modelled as a production line; the makespan was calculated using heuristics adapted from several genetic algorithm libraries, implemented in C++ Builder. Several problems dealing with small, medium and large lists of jobs and machinery were solved. The results were compared with other heuristics. The approach developed here seems promising for future research on scheduling manufacturing cell tasks involving mixed batches.
Fast parallel algorithms that compute transitive closure of a fuzzy relation

Science.gov (United States)

Kreinovich, Vladik YA.

1993-01-01

The notion of a transitive closure of a fuzzy relation is very useful for clustering in pattern recognition, for fuzzy databases, etc. The original algorithm proposed by L. Zadeh (1971) requires computation time O(n^4), where n is the number of elements in the relation. In 1974, J. C. Dunn proposed an O(n^2) algorithm. Since we must compute n(n-1)/2 different values s(a, b) (a not equal to b) that represent the fuzzy relation, and we need at least one computational step to compute each of these values, we cannot compute all of them in fewer than O(n^2) steps. So Dunn's algorithm is, in this sense, optimal. For small n, this is acceptable. However, for large n (e.g., for big databases), it is still a lot, so it would be desirable to decrease the computation time (this problem was formulated by J. Bezdek). Since this decrease cannot be achieved on a sequential computer, the only way to do it is to use a computer with several processors working in parallel. We show that on a parallel computer, transitive closure can be computed in time O((log_2 n)^2).
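To make the notion concrete, here is a minimal sequential sketch of the max-min transitive closure computed by repeated squaring. This is the textbook construction, not necessarily the algorithm of the paper, and the names `maxmin_compose` and `transitive_closure` are hypothetical. Each squaring doubles the path length covered, so about log_2 n squarings suffice; on a parallel machine every entry of a squaring can be computed independently, with the inner maximum done as a logarithmic-depth reduction, which is one way to arrive at a bound of the form O((log_2 n)^2).

```python
def maxmin_compose(r, s):
    """Max-min composition: result[i][j] = max over k of min(r[i][k], s[k][j])."""
    n = len(r)
    return [
        [max(min(r[i][k], s[k][j]) for k in range(n)) for j in range(n)]
        for i in range(n)
    ]

def transitive_closure(relation):
    """Repeat T <- max(T, T o T) until nothing changes.

    Every pass doubles the length of the chains taken into account,
    so the loop runs roughly log2(n) times.
    """
    closure = [row[:] for row in relation]
    while True:
        squared = maxmin_compose(closure, closure)
        merged = [
            [max(a, b) for a, b in zip(row_c, row_s)]
            for row_c, row_s in zip(closure, squared)
        ]
        if merged == closure:
            return closure
        closure = merged

# Toy symmetric, reflexive fuzzy relation on three elements (hypothetical data).
relation = [
    [1.0, 0.8, 0.0],
    [0.8, 1.0, 0.4],
    [0.0, 0.4, 1.0],
]
closure = transitive_closure(relation)
```

In the toy relation, elements 0 and 2 are not directly related, but the chain through element 1 gives them a closure value of min(0.8, 0.4) = 0.4.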
Computational experience with a parallel algorithm for tetrangle inequality bound smoothing.

Science.gov (United States)

Rajan, K; Deo, N

1999-09-01

Determining molecular structure from interatomic distances is an important and challenging problem. Given a molecule with n atoms, lower and upper bounds on interatomic distances can usually be obtained only for a small subset of the C(n,2) = n(n-1)/2 atom pairs, using NMR. Given the bounds so obtained on the distances between some of the atom pairs, it is often useful to compute tighter bounds on all the C(n,2) pairwise distances. This process is referred to as bound smoothing. The initial lower and upper bounds for the pairwise distances not measured are usually assumed to be 0 and infinity. One method for bound smoothing is to use the limits imposed by the triangle inequality. The distance bounds so obtained can often be tightened further by applying the tetrangle inequality--the limits imposed on the six pairwise distances among a set of four atoms (instead of three for the triangle inequalities). The tetrangle inequality is expressed by the Cayley-Menger determinants. For every quadruple of atoms, each pass of the tetrangle inequality bound smoothing procedure finds upper and lower limits on each of the six distances in the quadruple. Applying the tetrangle inequalities to each of the C(n,4) quadruples requires O(n^4) time. Here, we propose a parallel algorithm for bound smoothing employing the tetrangle inequality. Each pass of our algorithm requires O(n^3 log n) time on a CREW PRAM (Concurrent Read Exclusive Write Parallel Random Access Machine) with O(n/log n) processors. An implementation of this parallel algorithm on the Intel Paragon XP/S and its performance are also discussed.
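The simpler triangle-inequality step mentioned in this abstract can be sketched as follows. This is a minimal sequential illustration and not the tetrangle algorithm of the paper; the function name `triangle_smooth` and the toy bounds are assumptions. Upper bounds are tightened with u_ij <= u_ik + u_kj (an all-pairs shortest-path pass), and lower bounds are then raised with l_ij >= l_ik - u_kj.

```python
import math

def triangle_smooth(lower, upper):
    """Tighten distance bounds using the triangle inequality.

    lower and upper are symmetric n x n matrices; unmeasured pairs
    start at 0 and math.inf respectively.
    """
    n = len(upper)
    lo = [row[:] for row in lower]
    up = [row[:] for row in upper]

    # Upper bounds: u_ij <= u_ik + u_kj (Floyd-Warshall style pass).
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if up[i][k] + up[k][j] < up[i][j]:
                    up[i][j] = up[i][k] + up[k][j]

    # Lower bounds: l_ij >= l_ik - u_kj and l_ij >= l_kj - u_ik,
    # using the already smoothed upper bounds.
    for k in range(n):
        for i in range(n):
            for j in range(n):
                candidate = max(lo[i][k] - up[k][j], lo[k][j] - up[i][k])
                if candidate > lo[i][j]:
                    lo[i][j] = candidate
    return lo, up

# Three atoms; the 0-2 distance was not measured (hypothetical data).
lower = [[0, 1, 0], [1, 0, 5], [0, 5, 0]]
upper = [[0, 2, math.inf], [2, 0, 6], [math.inf, 6, 0]]
smoothed_lower, smoothed_upper = triangle_smooth(lower, upper)
```

For the unmeasured pair (0, 2), the upper bound drops from infinity to 2 + 6 = 8 and the lower bound rises from 0 to 5 - 2 = 3.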
Multi-user cognitive radio network resource allocation based on the adaptive niche immune genetic algorithm

International Nuclear Information System (INIS)

Zu Yun-Xiao; Zhou Jie

2012-01-01

Multi-user cognitive radio network resource allocation based on the adaptive niche immune genetic algorithm is proposed, and a fitness function is provided. Simulations are conducted using the adaptive niche immune genetic algorithm, the simulated annealing algorithm, the quantum genetic algorithm and the simple genetic algorithm, respectively. The results show that the adaptive niche immune genetic algorithm performs better than the other three algorithms in terms of the multi-user cognitive radio network resource allocation, and has quick convergence speed and strong global searching capability, which effectively reduces the system power consumption and bit error rate. (geophysics, astronomy, and astrophysics)
Genetic Algorithms for a Parameter Estimation of a Fermentation Process Model: A Comparison

Directory of Open Access Journals (Sweden)

Olympia Roeva

2005-12-01

In this paper the problem of parameter estimation using genetic algorithms is examined. A case study considering the estimation of 6 parameters of a nonlinear dynamic model of E. coli fermentation is presented as a test problem. The parameter estimation problem is stated as a nonlinear programming problem subject to nonlinear differential-algebraic constraints. This problem is known to be frequently ill-conditioned and multimodal. Thus, traditional (gradient-based) local optimization methods fail to arrive at satisfactory solutions. To overcome their limitations, the use of different genetic algorithms as stochastic global optimization methods is explored. These algorithms have proved to be very suitable for the optimization of highly nonlinear problems with many variables. Genetic algorithms can guarantee global optimality and robustness. These facts make them advantageous for parameter identification of fermentation models. A comparison between simple, modified and multi-population genetic algorithms is presented. The best result is obtained using the modified genetic algorithm. The considered algorithms converged to very similar cost values, but the modified algorithm is several times faster than the other two.
Parallel algorithm of real-time infrared image restoration based on total variation theory

Science.gov (United States)

Zhu, Ran; Li, Miao; Long, Yunli; Zeng, Yaoyuan; An, Wei

2015-10-01

Image restoration is a necessary preprocessing step for infrared remote sensing applications. Traditional methods remove the noise but penalize the gradients corresponding to edges too heavily. Image restoration techniques based on variational approaches can solve this over-smoothing problem thanks to their well-defined mathematical modeling of the restoration procedure. The total variation (TV) of the infrared image is introduced as an L1 regularization term added to the objective energy functional. This converts the restoration process into an optimization problem for a functional consisting of a fidelity term to the image data plus a regularization term. Infrared image restoration with the TV-L1 model makes full use of the acquired remote sensing data and preserves information at edges caused by clouds. The numerical implementation of the algorithm is presented in detail. Analysis indicates that the structure of this algorithm can easily be parallelized. Therefore, a parallel implementation of the TV-L1 filter based on a multicore architecture with shared memory is proposed for real-time infrared remote sensing systems. The massive computation on image data is performed in parallel by cooperating threads running simultaneously on multiple cores. Several groups of synthetic infrared image data are used to validate the feasibility and effectiveness of the proposed parallel algorithm. A quantitative analysis measuring the quality of the restored image against the input image is presented. Experimental results show that the TV-L1 filter can restore the varying background image reasonably well, and that its performance meets the requirements of real-time image processing.
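The functional described in this abstract (a fidelity term plus a total variation regularization term) can be written out as follows. This is the generic textbook form, given here as an assumption: the abstract does not state the authors' exact weights, fidelity norm or discretization. The symbols f (observed image), u (restored image), \Omega (image domain), \lambda (weight) and p (fidelity exponent) are notation introduced for this sketch.

```latex
E(u) \;=\; \underbrace{\int_{\Omega} \lvert \nabla u \rvert \, dx}_{\text{total variation (regularization)}}
\;+\; \underbrace{\frac{\lambda}{p} \int_{\Omega} \lvert u - f \rvert^{p} \, dx}_{\text{fidelity to the data}},
\qquad \lambda > 0 .
```

With p = 1 this is the form usually called TV-L1, and with p = 2 it is the classical Rudin-Osher-Fatemi model. In either case the regularizer is the L1 norm of the gradient, which penalizes edges less severely than a quadratic smoothness term.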
Genetic Algorithm and its Application in Optimal Sensor Layout

Directory of Open Access Journals (Sweden)

Xiang-Yang Chen

2015-05-01

This paper addresses the problem of multi-sensor station distribution, taking multi-sensor systems of different types as the research object. Based on an analysis of the application backgrounds, performance requirements and constraints of the various sensor types, station layouts for each kind of multi-sensor system are studied. Genetic algorithms are applied as the tool for optimizing the objective functions of the models, yielding optimal station distribution plans for the various types of multi-sensor systems that improve system performance and have achieved good military effect. For sensor applications such as radar, track measuring instruments, satellites and passive positioning equipment, a mathematical model relating the performance indicators of interest to the station geometry is built for each specific problem, and a genetic algorithm is used to obtain the optimized station distribution. This provides useful help in solving a variety of practical problems and also reflects the applicability and effectiveness of the improved genetic algorithm for multi-sensor station distribution optimization in electronic weapon systems. Finally, the genetic algorithm for integrated optimization of multi-sensor station distribution was applied to actual training exercise tasks, where it achieved good military effect.
Microwave tomography global optimization, parallelization and performance evaluation

CERN Document Server

Noghanian, Sima; Desell, Travis; Ashtari, Ali

2014-01-01

This book provides a detailed overview on the use of global optimization and parallel computing in microwave tomography techniques. The book focuses on techniques that are based on global optimization and electromagnetic numerical methods. The authors provide parallelization techniques on homogeneous and heterogeneous computing architectures on high performance and general purpose futuristic computers. The book also discusses the multi-level optimization technique, hybrid genetic algorithm and its application in breast cancer imaging.
Global Optimization of a Periodic System using a Genetic Algorithm

Science.gov (United States)

Stucke, David; Crespi, Vincent

2001-03-01

We use a novel application of a genetic algorithm global optimization technique to find the lowest-energy structures for periodic systems. We apply this technique to colloidal crystals for several different stoichiometries of binary and trinary colloidal crystals. This application of a genetic algorithm is described and results of likely candidate structures are presented.
Direct and iterative algorithms for the parallel solution of the one-dimensional macroscopic Navier-Stokes equations

International Nuclear Information System (INIS)

Doster, J.M.; Sills, E.D.

1986-01-01

Current efforts are under way to develop and evaluate numerical algorithms for the parallel solution of the large sparse matrix equations associated with the finite difference representation of the macroscopic Navier-Stokes equations. Previous work has shown that these equations can be cast into smaller coupled matrix equations suitable for solution utilizing multiple computer processors operating in parallel. The individual processors themselves may exhibit parallelism through the use of vector pipelines. This work has concentrated on the one-dimensional drift flux form of the Navier-Stokes equations. Direct and iterative algorithms that may be suitable for implementation on parallel computer architectures are evaluated in terms of accuracy and overall execution speed. This work has application to engineering and training simulations, on-line process control systems, and engineering workstations where increased computational speeds are required.
A parallel graded-mesh FDTD algorithm for human-antenna interaction problems.

Science.gov (United States)

Catarinucci, Luca; Tarricone, Luciano

2009-01-01

The finite difference time domain method (FDTD) is frequently used for the numerical solution of a wide variety of electromagnetic (EM) problems and, among them, those concerning human exposure to EM fields. In many practical cases related to the assessment of occupational EM exposure, large simulation domains are modeled and high space resolution adopted, so that strong memory and central processing unit power requirements have to be satisfied. To better afford the computational effort, the use of parallel computing is a winning approach; alternatively, subgridding techniques are often implemented. However, the simultaneous use of subgridding schemes and parallel algorithms is very new. In this paper, an easy-to-implement and highly-efficient parallel graded-mesh (GM) FDTD scheme is proposed and applied to human-antenna interaction problems, demonstrating its appropriateness in dealing with complex occupational tasks and showing its capability to guarantee the advantages of a traditional subgridding technique without affecting the parallel FDTD performance.
Multiscale Architectures and Parallel Algorithms for Video Object Tracking

Science.gov (United States)

2011-10-01

larger number of cores using the IBM QS22 Blade for handling higher video processing workloads (but at higher cost per core), low power consumption and...Cell/B.E. Blade processors which have a lot more main memory but also higher power consumption . More detailed performance figures for HD and SD video...Parallelism in Algorithms and Architectures, pages 289–298, 2007. [3] S. Ali and M. Shah. COCOA - Tracking in aerial imagery. In Daniel J. Henry
An Enhanced Genetic Algorithm for the Generalized Traveling Salesman Problem

Directory of Open Access Journals (Sweden)

H. Jafarzadeh

2017-12-01

The generalized traveling salesman problem (GTSP) deals with finding the minimum-cost tour in a clustered set of cities. In this problem, the traveler is interested in finding the best path that goes through all clusters. As this problem is NP-hard, a metaheuristic algorithm is needed to solve large-scale instances. The performance of these algorithms can be greatly improved by combining them with other heuristic algorithms. In this study, a search method is developed that considerably improves solution quality and computation time in comparison with the Genetic Algorithm. In the proposed algorithm, genetic algorithms are combined with the Nearest Neighbor Search (NNS) and a heuristic mutation operator is applied. According to the experimental results on a set of standard test problems with symmetric distances, the proposed algorithm finds the best solutions in most cases with the least computational time. The proposed algorithm is highly competitive with previously published algorithms in both solution quality and running time.
Design Optimization of Tilting-Pad Journal Bearing Using a Genetic Algorithm

Directory of Open Access Journals (Sweden)

Hamit Saruhan

2004-01-01

This article focuses on the use of genetic algorithms in developing an efficient optimum design method for tilting pad bearings. The approach optimizes based on minimum film thickness, power loss, maximum film temperature, and a global objective. Results for a five tilting-pad preloaded bearing are presented to provide a comparison with more traditional optimum design methods such as the gradient-based global criterion method, and also to provide insight into the potential of genetic algorithms in the design of rotor bearings. Genetic algorithms are efficient search techniques based on the idea of natural selection and genetics. These robust methods have gained recognition as general problem solving techniques in many applications.
Efficient Serial and Parallel Algorithms for Selection of Unique Oligos in EST Databases.

Science.gov (United States)

Mata-Montero, Manrique; Shalaby, Nabil; Sheppard, Bradley

2013-01-01

Obtaining unique oligos from an EST database is a problem of great importance in bioinformatics, particularly in the discovery of new genes and the mapping of the human genome. Many algorithms have been developed to find unique oligos, many of which are much less time consuming than the traditional brute force approach. An algorithm was presented by Zheng et al. (2004) which finds the solution of the unique oligos search problem efficiently. We implement this algorithm as well as several new algorithms based on some theorems included in this paper. We demonstrate how, with these new algorithms, we can obtain unique oligos much faster than with previous ones. We parallelize these new algorithms to further improve the time of finding unique oligos. All algorithms are run on ESTs obtained from a Barley EST database.
Development Modules for Specification of Requirements for a System of Verification of Parallel Algorithms

Directory of Open Access Journals (Sweden)

Vasiliy Yu. Meltsov

2012-05-01

This paper presents the results of developing one of the modules of a system for the verification of parallel algorithms that is used to verify the inference engine. This module is designed to build the specification of the requirements whose feasibility on the algorithm must be proved (tested).
Genetic Algorithm for Traveling Salesman Problem with Modified Cycle Crossover Operator

Directory of Open Access Journals (Sweden)

Abid Hussain

2017-01-01

Genetic algorithms are evolutionary techniques used for optimization according to the survival-of-the-fittest idea. These methods do not guarantee optimal solutions; however, they usually give good approximations in reasonable time. Genetic algorithms are useful for NP-hard problems, especially the traveling salesman problem. A genetic algorithm depends on its selection criteria, crossover, and mutation operators. To tackle the traveling salesman problem using genetic algorithms, there are various representations such as binary, path, adjacency, ordinal, and matrix representations. In this article, we propose a new crossover operator for the traveling salesman problem to minimize the total distance. This approach is linked with path representation, which is the most natural way to represent a legal tour. Computational results on some benchmark TSPLIB instances are reported for traditional path-representation operators, such as partially mapped and order crossovers, alongside the new cycle crossover operator, and improvements are found.
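For orientation, the classic cycle crossover (CX) for path representation can be sketched as below. This is the standard textbook operator, not the modified operator proposed in the article, whose details are not given in the abstract; the function name `cycle_crossover` is an assumption. CX splits the positions into cycles and copies alternate cycles from each parent, so every city keeps a position it held in one of the parents and the children are always legal tours.

```python
def cycle_crossover(parent1, parent2):
    """Classic cycle crossover (CX) on two permutations of the same cities."""
    n = len(parent1)
    position_in_p1 = {city: i for i, city in enumerate(parent1)}
    child1, child2 = [None] * n, [None] * n
    visited = [False] * n
    take_from_first = True  # alternate the donor parent from cycle to cycle

    for start in range(n):
        if visited[start]:
            continue
        i = start
        # Follow one cycle: jump to where parent2's city sits in parent1.
        while not visited[i]:
            visited[i] = True
            if take_from_first:
                child1[i], child2[i] = parent1[i], parent2[i]
            else:
                child1[i], child2[i] = parent2[i], parent1[i]
            i = position_in_p1[parent2[i]]
        take_from_first = not take_from_first
    return child1, child2

p1 = [1, 2, 3, 4, 5, 6, 7, 8]
p2 = [8, 5, 2, 1, 3, 6, 4, 7]
c1, c2 = cycle_crossover(p1, p2)
```

With these parents the positions fall into three cycles, {0, 3, 6, 7}, {1, 2, 4} and {5}, giving the children [1, 5, 2, 4, 3, 6, 7, 8] and [8, 2, 3, 1, 5, 6, 4, 7].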
Hybridizing Differential Evolution with a Genetic Algorithm for Color Image Segmentation

Directory of Open Access Journals (Sweden)

R. V. V. Krishna

2016-10-01

This paper proposes a hybrid of differential evolution and genetic algorithms to solve the color image segmentation problem. Clustering-based color image segmentation algorithms segment an image by clustering the features of color and texture, thereby obtaining accurate prototype cluster centers. In the proposed algorithm, the color features are obtained using the homogeneity model. A new texture feature named Power Law Descriptor (PLD), which is a modification of the Weber Local Descriptor (WLD), is proposed and further used as a texture feature for clustering. Genetic algorithms are competent in handling binary variables, while differential evolution is more efficient in handling real parameters. The obtained texture feature is binary in nature and the color feature is real-valued, which suits the hybrid cluster center optimization problem in image segmentation very well. Thus, in the proposed algorithm, the optimum texture feature centers are evolved using genetic algorithms, whereas the optimum color feature centers are evolved using differential evolution.

A new parallel algorithm and its simulation on hypercube simulator for low pass digital image filtering using systolic array

International Nuclear Information System (INIS)

Al-Hallaq, A.; Amin, S.

1998-01-01

This paper introduces a new parallel algorithm and its simulation on a hypercube simulator for low-pass digital image filtering using a systolic array. This new algorithm is faster than the old one (Amin, 1988), because the old algorithm carries out the addition operations sequentially. In our new design, these addition operations are divided into two groups, which can be performed in parallel. One group is performed on one half of the systolic array and the other on the second half, that is, by folding. This parallelism reduces the time required for the whole process by almost a quarter of the time of the old algorithm. (authors). 18 refs., 3 figs
A novel progressively swarmed mixed integer genetic algorithm for ...

African Journals Online (AJOL)

MIGA) which inherits the advantages of binary and real coded Genetic Algorithm approach. The proposed algorithm is applied for the conventional generation cost minimization Optimal Power Flow (OPF) problem and for the Security ...
Evolving aerodynamic airfoils for wind turbines through a genetic algorithm

Science.gov (United States)

Hernández, J. J.; Gómez, E.; Grageda, J. I.; Couder, C.; Solís, A.; Hanotel, C. L.; Ledesma, JI

2017-01-01

Nowadays, genetic algorithms stand out for airfoil optimisation, due to the virtues of mutation and crossing-over techniques. In this work we propose a genetic algorithm with arithmetic crossover rules. The optimisation criteria are taken to be the maximisation of both aerodynamic efficiency and lift coefficient, while minimising drag coefficient. The algorithm shows great improvements in computational cost, as well as high performance, obtaining optimised airfoils for Mexico City's specific wind conditions from generic wind turbines designed for higher Reynolds numbers in few iterations.
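Arithmetic crossover, as named in this abstract, is commonly defined as a weighted average of two real-valued parents. The sketch below shows that common form. The authors' exact rules are not given in the abstract, so the function name `arithmetic_crossover`, the blending weight `alpha` and the two-gene example vectors are assumptions made for illustration.

```python
import random

def arithmetic_crossover(parent_a, parent_b, alpha=None, rng=random):
    """Blend two real-valued chromosomes gene by gene.

    child_a = alpha * a + (1 - alpha) * b
    child_b = (1 - alpha) * a + alpha * b
    If alpha is not supplied, it is drawn uniformly from [0, 1).
    """
    if alpha is None:
        alpha = rng.random()
    child_a = [alpha * a + (1 - alpha) * b for a, b in zip(parent_a, parent_b)]
    child_b = [(1 - alpha) * a + alpha * b for a, b in zip(parent_a, parent_b)]
    return child_a, child_b

# Hypothetical shape parameters of two parent airfoils.
child_a, child_b = arithmetic_crossover([0.0, 4.0], [8.0, 0.0], alpha=0.25)
```

With alpha = 0.25 the children are [6.0, 1.0] and [2.0, 3.0]. Both lie on the line segment between the parents, which is why this operator suits continuous design variables such as shape coefficients.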
An iterative algorithm for solving the multidimensional neutron diffusion nodal method equations on parallel computers

International Nuclear Information System (INIS)

Kirk, B.L.; Azmy, Y.Y.

1992-01-01

In this paper the one-group, steady-state neutron diffusion equation in two-dimensional Cartesian geometry is solved using the nodal integral method. The discrete variable equations comprise loosely coupled sets of equations representing the nodal balance of neutrons, as well as neutron current continuity along rows or columns of computational cells. An iterative algorithm that is more suitable for solving large problems concurrently is derived based on the decomposition of the spatial domain and is accelerated using successive overrelaxation. This algorithm is very well suited for parallel computers, especially since the spatial domain decomposition occurs naturally, so that the number of iterations required for convergence does not depend on the number of processors participating in the calculation. Implementation of the authors' algorithm on the Intel iPSC/2 hypercube and Sequent Balance 8000 parallel computer is presented, and measured speedup and efficiency for test problems are reported. The results suggest that the efficiency of the hypercube quickly deteriorates when many processors are used, while the Sequent Balance retains very high efficiency for a comparable number of participating processors. This leads to the conjecture that message-passing parallel computers are not as well suited for this algorithm as shared-memory machines
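Successive overrelaxation (SOR), the acceleration named in this abstract, can be illustrated on a small linear system. This is a generic SOR sketch and not the authors' nodal, domain-decomposed algorithm; the function name `sor`, the relaxation factor of 1.2, the tolerance and the 3 x 3 test matrix are all assumptions chosen for illustration.

```python
def sor(a, b, omega=1.2, tol=1e-10, max_iter=10_000):
    """Solve a x = b by successive overrelaxation.

    Each sweep updates x in place, so later rows already use the new
    values from earlier rows (Gauss-Seidel), blended with the old value
    through the relaxation factor omega.
    """
    n = len(b)
    x = [0.0] * n
    for iteration in range(1, max_iter + 1):
        largest_change = 0.0
        for i in range(n):
            sigma = sum(a[i][j] * x[j] for j in range(n) if j != i)
            new_value = (1 - omega) * x[i] + omega * (b[i] - sigma) / a[i][i]
            largest_change = max(largest_change, abs(new_value - x[i]))
            x[i] = new_value
        if largest_change < tol:
            return x, iteration
    raise RuntimeError("SOR did not converge within max_iter sweeps")

# Small diagonally dominant system whose exact solution is [1, 2, 3].
matrix = [[4.0, -1.0, 0.0], [-1.0, 4.0, -1.0], [0.0, -1.0, 4.0]]
rhs = [2.0, 4.0, 10.0]
solution, sweeps = sor(matrix, rhs)
```

In a domain-decomposed setting, each processor would run sweeps of this kind on its own block of unknowns and exchange interface values between sweeps.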
Micro-seismic waveform matching inversion based on gravitational search algorithm and parallel computation

Science.gov (United States)

Jiang, Y.; Xing, H. L.

2016-12-01

Micro-seismic events induced by water injection, mining activity or oil/gas extraction are quite informative; their interpretation can be applied to the reconstruction of underground stress and the monitoring of hydraulic fracturing progress in oil/gas reservoirs. The source characteristics and locations are crucial parameters required for these purposes, and can be obtained through the waveform matching inversion (WMI) method. It is therefore imperative to develop a WMI algorithm with high accuracy and convergence speed. Heuristic algorithms, as a category of nonlinear methods, possess a very high convergence speed and a good capacity to overcome local minima, and have been successfully applied in many areas (e.g. image processing, artificial intelligence). However, their effectiveness for micro-seismic WMI is still poorly investigated; very little literature exists that addresses this subject. In this research an advanced heuristic algorithm, the gravitational search algorithm (GSA), is proposed to estimate the focal mechanism (angles of strike, dip and rake) and source locations in three dimensions. Unlike traditional inversion methods, the heuristic algorithm inversion does not require the approximation of the Green's function. The method directly interacts with a CPU-parallelized finite difference forward modelling engine, updating the model parameters under GSA criteria. The effectiveness of this method is tested with synthetic data from a multi-layered elastic model; the results indicate that GSA can be successfully applied to WMI and has its unique advantages. Keywords: Micro-seismicity, Waveform matching inversion, gravitational search algorithm, parallel computation
Optimization of multicast optical networks with genetic algorithm

Science.gov (United States)

Lv, Bo; Mao, Xiangqiao; Zhang, Feng; Qin, Xi; Lu, Dan; Chen, Ming; Chen, Yong; Cao, Jihong; Jian, Shuisheng

2007-11-01

In this letter, aiming to obtain the best multicast performance of an optical network in which video conference information is carried by a specified wavelength, we extend the solutions of matrix games with network coding theory and devise a new method to solve the complex problems of multicast network switching. In addition, an experimental optical network has been tested with the best switching strategies by employing a novel numerical solution based on a genetic algorithm. The results show that the optimal solutions obtained with the genetic algorithm are in accordance with those obtained with the traditional fictitious play method.
Air data system optimization using a genetic algorithm

Science.gov (United States)

Deshpande, Samir M.; Kumar, Renjith R.; Seywald, Hans; Siemers, Paul M., III

1992-01-01

An optimization method for flush-orifice air data system design has been developed using the Genetic Algorithm approach. The optimization of the orifice array minimizes the effect of normally distributed random noise in the pressure readings on the calculation of air data parameters, namely, angle of attack, sideslip angle and freestream dynamic pressure. The optimization method is applied to the design of Pressure Distribution/Air Data System experiment (PD/ADS) proposed for inclusion in the Aeroassist Flight Experiment (AFE). Results obtained by the Genetic Algorithm method are compared to the results obtained by conventional gradient search method.
Genetic algorithm based optimization of advanced solar cell designs modeled in Silvaco Atlas™

OpenAIRE

Utsler, James

2006-01-01

A genetic algorithm was used to optimize the power output of multi-junction solar cells. Solar cell operation was modeled using the Silvaco ATLAS™ software. The output of the ATLAS™ simulation runs served as the input to the genetic algorithm. The genetic algorithm was run as a diffusing computation on a network of eighteen dual processor nodes. Results showed that the genetic algorithm produced better power output optimizations when compared with the results obtained using the hill cli...
Parallel algorithms for online trackfinding at PANDA

Energy Technology Data Exchange (ETDEWEB)

Bianchi, Ludovico; Ritman, James; Stockmanns, Tobias [IKP, Forschungszentrum Juelich GmbH (Germany); Herten, Andreas [JSC, Forschungszentrum Juelich GmbH (Germany); Collaboration: PANDA-Collaboration

2016-07-01

The PANDA experiment, one of the four scientific pillars of the FAIR facility currently under construction in Darmstadt, is a next-generation particle detector that will study collisions of antiprotons with beam momenta of 1.5-15 GeV/c on a fixed proton target. Because of the broad physics scope and the similar signature of signal and background events, PANDA's strategy for data acquisition is to continuously record data from the whole detector and use this global information to perform online event reconstruction and filtering. A real-time rejection factor of up to 1000 must be achieved to match the incoming data rate for offline storage, making all components of the data processing system computationally very challenging. Online particle track identification and reconstruction is an essential step, since track information is used as input in all following phases. Online tracking algorithms must ensure a delicate balance between high tracking efficiency and quality, and minimal computational footprint. For this reason, a massively parallel solution exploiting multiple Graphics Processing Units (GPUs) is under investigation. The talk presents the core concepts of the algorithms being developed for primary trackfinding, along with details of their implementation on GPUs.
Automatic mesh refinement and parallel load balancing for Fokker-Planck-DSMC algorithm

Science.gov (United States)

Küchlin, Stephan; Jenny, Patrick

2018-06-01

Recently, a parallel Fokker-Planck-DSMC algorithm for rarefied gas flow simulation in complex domains at all Knudsen numbers was developed by the authors. Fokker-Planck-DSMC (FP-DSMC) is an augmentation of the classical DSMC algorithm, which mitigates the near-continuum deficiencies in terms of computational cost of pure DSMC. At each time step, based on a local Knudsen number criterion, the discrete DSMC collision operator is dynamically switched to the Fokker-Planck operator, which is based on the integration of continuous stochastic processes in time, and has fixed computational cost per particle, rather than per collision. In this contribution, we present an extension of the previous implementation with automatic local mesh refinement and parallel load-balancing. In particular, we show how the properties of discrete approximations to space-filling curves enable an efficient implementation. Exemplary numerical studies highlight the capabilities of the new code.
Development of Variational Guiding Center Algorithms for Parallel Calculations in Experimental Magnetic Equilibria

Energy Technology Data Exchange (ETDEWEB)

Ellison, C. Leland [PPPL; Finn, J. M. [LANL; Qin, H. [PPPL; Tang, William M. [PPPL

2014-10-01

Structure-preserving algorithms obtained via discrete variational principles exhibit strong promise for the calculation of guiding center test particle trajectories. The non-canonical Hamiltonian structure of the guiding center equations forms a novel and challenging context for geometric integration. To demonstrate the practical relevance of these methods, a prototypical variational midpoint algorithm is applied to an experimental magnetic equilibrium. The stability characteristics, conservation properties, and implementation requirements associated with the variational algorithms are addressed. Furthermore, computational run time is reduced for large numbers of particles by parallelizing the calculation on GPU hardware.
A genetic algorithm-based job scheduling model for big data analytics.

Science.gov (United States)

Lu, Qinghua; Li, Shanshan; Zhang, Weishan; Zhang, Lei

Big data analytics (BDA) applications are a new category of software applications that process large amounts of data using scalable parallel processing infrastructure to obtain hidden value. Hadoop is the most mature open-source big data analytics framework, which implements the MapReduce programming model to process big data with MapReduce jobs. Big data analytics jobs are often continuous and not mutually separated. The existing work mainly focuses on executing jobs in sequence, which is often inefficient and consumes high energy. In this paper, we propose a genetic algorithm-based job scheduling model for big data analytics applications to improve the efficiency of big data analytics. To implement the job scheduling model, we leverage an estimation module to predict the performance of clusters when executing analytics jobs. We have evaluated the proposed job scheduling model in terms of feasibility and accuracy.
Rendezvous maneuvers using Genetic Algorithm

International Nuclear Information System (INIS)

Dos Santos, Denílson Paulo Souza; De Almeida Prado, Antônio F Bertachini; Teodoro, Anderson Rodrigo Barretto

2013-01-01

The present paper studies orbital rendezvous maneuvers, that is, orbital transfers in which a spacecraft has to change its orbit to meet another spacecraft that is travelling in a different orbit. This transfer is accomplished by using a multi-impulsive control. A genetic algorithm is used to find the transfers that have minimum fuel consumption.
The multi-niche crowding genetic algorithm: Analysis and applications

Energy Technology Data Exchange (ETDEWEB)

Cedeno, Walter [Univ. of California, Davis, CA (United States)

1995-09-01

The ability of organisms to evolve and adapt to the environment has provided mother nature with a rich and diverse set of species. Only organisms well adapted to their environment can survive from one generation to the next, passing on the traits that made them successful to their offspring. Competition for resources and the ever changing environment drive some species to extinction, and at the same time others evolve to maintain the delicate balance in nature. In this dissertation we present the multi-niche crowding genetic algorithm, a computational metaphor for the survival of species in ecological niches in the face of competition. The multi-niche crowding genetic algorithm maintains stable subpopulations of solutions in multiple niches in multimodal landscapes. The algorithm introduces the concept of crowding selection to promote mating among members with similar traits while allowing many members of the population to participate in mating. The algorithm uses a worst-among-most-similar replacement policy to promote competition among members with similar traits while allowing competition among members of different niches as well. We present empirical and theoretical results for the success of the multi-niche crowding genetic algorithm for multimodal function optimization. The properties of the algorithm using different parameters are examined. We test the performance of the algorithm on problems of DNA Mapping, Aquifer Management, and the File Design Problem. These applications combine the use of heuristics and special operators to solve problems in the areas of combinatorial optimization, grouping, and multi-objective optimization. We conclude by presenting the advantages and disadvantages of the algorithm and describing avenues for future investigation to answer other questions raised by this study.
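The worst-among-most-similar replacement policy named in the abstract above can be sketched roughly as follows. The abstract does not spell out the procedure, so the group sampling scheme, the parameter names `n_groups` and `group_size`, the squared Euclidean distance and the real-valued encoding are all assumptions made for illustration, not the dissertation's definition.

```python
import random


def wams_replace(population, fitness, offspring, rng, n_groups=3, group_size=4):
    """Insert `offspring` by replacing the worst among the most similar members.

    Assumed scheme: draw `n_groups` random groups of `group_size` members,
    take from each group the member closest to the offspring, then replace
    the least fit of those candidates. Returns the index that was replaced.
    """
    def distance(a, b):
        # Squared Euclidean distance between two real-valued genomes (assumption).
        return sum((u - v) ** 2 for u, v in zip(a, b))

    candidates = []
    for _ in range(n_groups):
        group = rng.sample(range(len(population)), group_size)
        candidates.append(min(group, key=lambda i: distance(population[i], offspring)))

    # Higher fitness is better here, so the victim is the lowest-fitness candidate.
    victim = min(candidates, key=lambda i: fitness(population[i]))
    population[victim] = offspring
    return victim
```

Because the offspring only competes with members that resemble it, distant niches are left undisturbed, which is the behaviour the abstract attributes to the policy.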
Mathematical Methods and Algorithms of Mobile Parallel Computing on the Base of Multi-core Processors

Directory of Open Access Journals (Sweden)

Alexander B. Bakulev

2012-11-01

This article deals with mathematical models and algorithms that provide mobility of the parallel representation of sequential programs in a high-level language. It presents a formal model of operating-environment process management, based on the proposed model of parallel program representation, which describes the computation process on multi-core processors.
General-purpose parallel algorithm based on CUDA for source pencils' deployment of large γ irradiator

International Nuclear Information System (INIS)

Yang Lei; Gong Xueyu; Wang Ling

2013-01-01

Combined with a standard mathematical model for evaluating the quality of deployment results, a new high-performance parallel algorithm for source pencils' deployment was obtained by using a parallel plant growth simulation algorithm that was completely parallelized with the CUDA execution model, so that the corresponding code can run on a GPU. Based on this work, several instances at various scales were used to test the new version of the algorithm. The results show that, while retaining the advantages of the old versions, the performance of the new one is improved more than 500 times compared with the CPU version, and 30 times compared with the CPU plus GPU hybrid version. The computation time of the new version is less than ten minutes for an irradiator whose activity is less than 111 PBq. For a single GTX275 GPU, the maximum computing power of the new version is no more than 167 PBq, with a computation time of no more than 25 minutes; with multiple GPUs, the power can be improved further. Overall, the new version of the algorithm running on a GPU can satisfy the requirements of source pencils' deployment for any domestic irradiator, and it is highly competitive. (authors)
Parameter estimation of fractional-order chaotic systems by using quantum parallel particle swarm optimization algorithm.

Directory of Open Access Journals (Sweden)

Yu Huang

Parameter estimation for fractional-order chaotic systems is an important issue in fractional-order chaotic control and synchronization and can essentially be formulated as a multidimensional optimization problem. A novel algorithm called quantum parallel particle swarm optimization (QPPSO) is proposed to solve the parameter estimation for fractional-order chaotic systems. The parallel characteristic of quantum computing is used in QPPSO. This characteristic increases the calculation of each generation exponentially. The behavior of particles in quantum space is restrained by the quantum evolution equation, which consists of the current rotation angle, individual optimal quantum rotation angle, and global optimal quantum rotation angle. Numerical simulation based on several typical fractional-order systems and comparisons with some typical existing algorithms show the effectiveness and efficiency of the proposed algorithm.
Machine Learning in Production Systems Design Using Genetic Algorithms

OpenAIRE

Abu Qudeiri Jaber; Yamamoto Hidehiko; Rizauddin Ramli

2008-01-01

To create a solution for a specific problem in machine learning, the solution is constructed from the data or by using a search method. Genetic algorithms are a model of machine learning that can be used to find a near-optimal solution. While the great advantage of genetic algorithms is the fact that they find a solution through evolution, this is also the biggest disadvantage. Evolution is inductive; in nature, life does not evolve towards a good solution but it evolves aw...
Steam condenser optimization using Real-parameter Genetic Algorithm for Prototype Fast Breeder Reactor

Energy Technology Data Exchange (ETDEWEB)

Jayalal, M.L., E-mail: jayalal@igcar.gov.in [Indira Gandhi Centre for Atomic Research, Kalpakkam 603102, Tamil Nadu (India); Kumar, L. Satish, E-mail: satish@igcar.gov.in [Indira Gandhi Centre for Atomic Research, Kalpakkam 603102, Tamil Nadu (India); Jehadeesan, R., E-mail: jeha@igcar.gov.in [Indira Gandhi Centre for Atomic Research, Kalpakkam 603102, Tamil Nadu (India); Rajeswari, S., E-mail: raj@igcar.gov.in [Indira Gandhi Centre for Atomic Research, Kalpakkam 603102, Tamil Nadu (India); Satya Murty, S.A.V., E-mail: satya@igcar.gov.in [Indira Gandhi Centre for Atomic Research, Kalpakkam 603102, Tamil Nadu (India); Balasubramaniyan, V.; Chetal, S.C. [Indira Gandhi Centre for Atomic Research, Kalpakkam 603102, Tamil Nadu (India)

2011-10-15

Highlights: → We model design optimization of a vital reactor component using Genetic Algorithm. → Real-parameter Genetic Algorithm is used for steam condenser optimization study. → Comparison analysis done with various Genetic Algorithm related mechanisms. → The results obtained are validated with the reference study results. - Abstract: This work explores the use of Real-parameter Genetic Algorithm and analyses its performance in the steam condenser (or Circulating Water System) optimization study of a 500 MW fast breeder nuclear reactor. Choosing optimum design parameters for the condenser of a power plant from among a large number of technically viable combinations is a complex task. This is primarily due to the conflicting nature of the economic implications of the different system parameters for maximizing the capitalized profit. In order to find the optimum design parameters, a Real-parameter Genetic Algorithm model is developed and applied. The results obtained are validated with the reference study results.
Cultural-Based Genetic Tabu Algorithm for Multiobjective Job Shop Scheduling

Directory of Open Access Journals (Sweden)

Yuzhen Yang

2014-01-01

The job shop scheduling problem, which has been dealt with by various traditional optimization methods over the decades, has proved to be NP-hard and difficult to solve, especially in the multiobjective field. In this paper, we propose a novel quadspace cultural genetic tabu algorithm (QSCGTA) to solve this problem. This algorithm provides a different structure from the original cultural algorithm in that it contains double belief spaces and population spaces. These spaces deal with different levels of populations globally and locally by applying genetic and tabu searches separately, and exchange information regularly to steer the process more effectively towards promising areas, along with modified multiobjective domination and transform functions. Moreover, we present a bidirectional shifting for the decoding process of job shop scheduling. The computational results demonstrate the effectiveness and efficiency of the cultural-based genetic tabu algorithm for the multiobjective job shop scheduling problem.

Design optimization of brushed permanent magnet D C motor by genetic algorithm

CERN Document Server

Amini, S

2002-01-01

Because the field winding is replaced with a permanent magnet in brushed permanent magnet DC (PMDC) motors, field losses are eliminated and the structure of the motor is simpler. The efficiency of these motors is therefore increased and the manufacturing process is simplified. Hence, these motors are commonly used in low power applications, and their design and optimization is an important consideration. Genetic algorithms are proposed for design optimization of PMDC motors because of their independence from the structure of the objective function and its derivative. In this paper genetic algorithms are evaluated for PMDC motor design optimization. An introduction is first presented about PMDC motors, the general design procedure and the elements of their optimization. Genetic algorithms are then briefly described. Finally, the results of optimization by genetic algorithms are compared with those obtained using a conventional method.
Design optimization of brushed permanent magnet D C motor by genetic algorithm

International Nuclear Information System (INIS)

Amini, S.; Oraee, H.

2002-01-01

Because the field winding is replaced with a permanent magnet in brushed permanent magnet DC (PMDC) motors, field losses are eliminated and the structure of the motor is simpler. The efficiency of these motors is therefore increased and the manufacturing process is simplified. Hence, these motors are commonly used in low power applications, and their design and optimization is an important consideration. Genetic algorithms are proposed for design optimization of PMDC motors because of their independence from the structure of the objective function and its derivative. In this paper genetic algorithms are evaluated for PMDC motor design optimization. An introduction is first presented about PMDC motors, the general design procedure and the elements of their optimization. Genetic algorithms are then briefly described. Finally, the results of optimization by genetic algorithms are compared with those obtained using a conventional method.
Efficient parallel algorithms for string editing and related problems

Science.gov (United States)

Apostolico, Alberto; Atallah, Mikhail J.; Larmore, Lawrence; Mcfaddin, H. S.

1988-01-01

The string editing problem for input strings x and y consists of transforming x into y by performing a series of weighted edit operations on x of overall minimum cost. An edit operation on x can be the deletion of a symbol from x, the insertion of a symbol in x, or the substitution of a symbol of x with another symbol. This problem has a well-known O(|x||y|) time sequential solution (25). Efficient parallel random-access machine (PRAM) algorithms for the string editing problem are given. If m = min(|x|, |y|) and n = max(|x|, |y|), then the CREW bound is O(log m log n) time with O(mn/log m) processors. In all algorithms, space is O(mn).
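For orientation, the well-known O(|x||y|) sequential solution mentioned in the abstract above is the classical dynamic program over prefixes of x and y. The sketch below shows that sequential baseline only, not the parallel PRAM algorithms of the paper; the function name and the default unit costs are illustrative assumptions.

```python
def edit_distance(x, y, ins_cost=1, del_cost=1, sub_cost=1):
    """Minimum total cost of edit operations that transform x into y."""
    m, n = len(x), len(y)
    # cost[i][j] = cheapest way to turn the prefix x[:i] into the prefix y[:j].
    cost = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        cost[i][0] = i * del_cost          # delete every symbol of x[:i]
    for j in range(1, n + 1):
        cost[0][j] = j * ins_cost          # insert every symbol of y[:j]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            substitute = 0 if x[i - 1] == y[j - 1] else sub_cost
            cost[i][j] = min(
                cost[i - 1][j] + del_cost,         # delete x[i-1]
                cost[i][j - 1] + ins_cost,         # insert y[j-1]
                cost[i - 1][j - 1] + substitute,   # match or substitute
            )
    return cost[m][n]
```

The table has (|x|+1)(|y|+1) entries and each is filled in constant time, which gives the O(|x||y|) time and O(mn) space quoted above.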
Efficient Dual Domain Decoding of Linear Block Codes Using Genetic Algorithms

Directory of Open Access Journals (Sweden)

Ahmed Azouaoui

2012-01-01

A computationally efficient algorithm for decoding block codes is developed using a genetic algorithm (GA). The proposed algorithm uses the dual code, in contrast to the existing genetic decoders in the literature, which use the code itself. Hence, this new approach reduces the complexity of decoding high-rate codes. We simulated our algorithm in various transmission channels. The performance of this algorithm is investigated and compared with competing decoding algorithms, including those of Maini and Shakeel. The results show that the proposed algorithm gives large gains over the Chase-2 decoding algorithm and reaches the performance of OSD-3 for some quadratic residue (QR) codes. Further, we define a new crossover operator that exploits domain-specific information and compare it with uniform and two-point crossover. The complexity of this algorithm is also discussed and compared to that of other algorithms.
Acoustic Impedance Inversion of Seismic Data Using Genetic Algorithm

Science.gov (United States)

Eladj, Said; Djarfour, Noureddine; Ferahtia, Djalal; Ouadfeul, Sid-Ali

2013-04-01

The inversion of seismic data can be used to constrain estimates of the Earth's acoustic impedance structure. This kind of problem is usually known to be non-linear and high-dimensional, with a complex search space that may be riddled with many local minima, and it results in irregular objective functions. We investigate here the performance and the application of a genetic algorithm in the inversion of seismic data. The proposed algorithm has the advantage of being easily implemented without getting stuck in local minima. The effects of population size, elitism strategy, uniform cross-over and lower mutation rates are examined. The optimum solution parameters and performance were decided as a function of the testing error convergence with respect to the generation number. To calculate the fitness function, we used the L2 norm of the sample-to-sample difference between the reference and the inverted trace. The cross-over probability is in the range 0.9-0.95, and mutation has been tested at a probability of 0.01. The application of such a genetic algorithm to synthetic data shows that the acoustic impedance section was inverted efficiently. Keywords: Seismic, Inversion, acoustic impedance, genetic algorithm, fitness functions, cross-over, mutation.
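The ingredients listed in the abstract above (an L2-norm misfit between reference and inverted traces, uniform cross-over at probability 0.9-0.95, mutation at probability 0.01) could be sketched as below. This is a minimal illustration under stated assumptions: traces are plain lists of floats, and the function names, the Gaussian mutation and its `step` size are hypothetical choices that the abstract does not specify.

```python
import math
import random


def l2_misfit(reference, inverted):
    """L2 norm of the sample-to-sample difference between two traces."""
    return math.sqrt(sum((r - s) ** 2 for r, s in zip(reference, inverted)))


def uniform_crossover(parent_a, parent_b, rng, p_cross=0.9):
    """With probability p_cross, swap each gene between the parents at random."""
    if rng.random() >= p_cross:
        return list(parent_a), list(parent_b)   # no cross-over: plain copies
    child_a, child_b = [], []
    for a, b in zip(parent_a, parent_b):
        if rng.random() < 0.5:
            a, b = b, a
        child_a.append(a)
        child_b.append(b)
    return child_a, child_b


def mutate(individual, rng, p_mut=0.01, step=0.1):
    """Perturb each gene with probability p_mut (Gaussian step is an assumption)."""
    return [g + rng.gauss(0.0, step) if rng.random() < p_mut else g
            for g in individual]
```

A lower misfit means a better individual, so a selection step would rank candidates by ascending `l2_misfit` against the reference trace.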
Genetic engineering versus natural evolution: Genetic algorithms with deterministic operators

NARCIS (Netherlands)

Jozwiak, L.; Postula, A.

2002-01-01

Genetic algorithms (GA) have several important features that predestine them to solve design problems. Their main disadvantage however is the excessively long run-time that is needed to deliver satisfactory results for large instances of complex design problems. The main aims of this paper are (1)
Massively Parallel and Scalable Implicit Time Integration Algorithms for Structural Dynamics

Science.gov (United States)

Farhat, Charbel

1997-01-01

Explicit codes are often used to simulate the nonlinear dynamics of large-scale structural systems, even for low frequency response, because the storage and CPU requirements entailed by the repeated factorizations traditionally found in implicit codes rapidly overwhelm the available computing resources. With the advent of parallel processing, this trend is accelerating because of the following additional facts: (a) explicit schemes are easier to parallelize than implicit ones, and (b) explicit schemes induce short range interprocessor communications that are relatively inexpensive, while the factorization methods used in most implicit schemes induce long range interprocessor communications that often ruin the sought-after speed-up. However, the time step restriction imposed by the Courant stability condition on all explicit schemes cannot yet be offset by the speed of the currently available parallel hardware. Therefore, it is essential to develop efficient alternatives to direct methods that are also amenable to massively parallel processing because implicit codes using unconditionally stable time-integration algorithms are computationally more efficient when simulating the low-frequency dynamics of aerospace structures.
OpenCL Implementation of a Parallel Universal Kriging Algorithm for Massive Spatial Data Interpolation on Heterogeneous Systems

Directory of Open Access Journals (Sweden)

Fang Huang

2016-06-01

In some digital Earth engineering applications, spatial interpolation algorithms are required to process and analyze large amounts of data. Due to its powerful computing capacity, heterogeneous computing has been used in many applications for data processing in various fields. In this study, we explore the design and implementation of a parallel universal kriging spatial interpolation algorithm using the OpenCL programming model on heterogeneous computing platforms for massive geo-spatial data processing. This study focuses primarily on transforming the hotspots in serial algorithms, i.e., the universal kriging interpolation function, into the corresponding kernel function in OpenCL. We also employ parallelization and optimization techniques in our implementation to improve the code performance. Finally, based on the results of experiments performed on two different high performance heterogeneous platforms, i.e., an NVIDIA graphics processing unit system and an Intel Xeon Phi system (MIC), we show that the parallel universal kriging algorithm can achieve a speedup of up to 40× with a single computing device and of up to 80× with multiple devices.
Multi-GPU parallel algorithm design and analysis for improved inversion of probability tomography with gravity gradiometry data

Science.gov (United States)

Hou, Zhenlong; Huang, Danian

2017-09-01

In this paper, we first study the inversion of probability tomography (IPT) with gravity gradiometry data. The spatial resolution of the results is improved by multi-tensor joint inversion, a depth weighting matrix and other methods. To solve the problems brought by big data in exploration, we present a parallel algorithm and its performance analysis, combining Compute Unified Device Architecture (CUDA) with Open Multi-Processing (OpenMP) on the basis of Graphics Processing Unit (GPU) acceleration. In tests on a synthetic model and real data from Vinton Dome, we obtain improved results, which also shows that the improved inversion algorithm is effective and feasible. The performance of the parallel algorithm we designed is better than that of other CUDA-based ones. The maximum speedup can exceed 200. In the performance analysis, multi-GPU speedup and multi-GPU efficiency are used to analyze the scalability of the multi-GPU programs. The designed parallel algorithm is shown to be able to process larger-scale data, and the new analysis method is practical.
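The abstract above does not define its multi-GPU speedup and efficiency metrics. The sketch below uses the conventional definitions (speedup as a ratio of run times, efficiency as speedup divided by the number of devices); these definitions and the function names are assumptions, and the paper's own metrics may differ.

```python
def speedup(t_reference, t_parallel):
    """Ratio of the reference run time to the parallel run time."""
    return t_reference / t_parallel


def multi_gpu_efficiency(t_single_gpu, t_multi_gpu, n_gpus):
    """Multi-GPU speedup over one GPU, normalized by the number of GPUs."""
    return speedup(t_single_gpu, t_multi_gpu) / n_gpus
```

Under these definitions an efficiency close to 1 means that adding GPUs shortens the run time almost proportionally, while a value well below 1 points to communication or load-balancing overhead.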
Application of the NSGA-II algorithm to a multi-period inventory-redundancy allocation problem in a series-parallel system

International Nuclear Information System (INIS)

Alikar, Najmeh; Mousavi, Seyed Mohsen; Raja Ghazilla, Raja Ariffin; Tavana, Madjid; Olugu, Ezutah Udoncy

2017-01-01

In this paper, we formulate a mixed-integer binary non-linear programming model to study a series-parallel multi-component multi-periodic inventory-redundancy allocation problem (IRAP). This IRAP is a novel redundancy allocation problem (RAP) because components (products) are purchased under an all-unit discount (AUD) policy and then installed on a series-parallel system. The total budget available for purchasing the components, the storage space, the vehicle capacities, and the total weight of the system are limited. Moreover, a penalty function is used to penalize infeasible solutions, generated randomly. The overall goal is to find the optimal number of components purchased for each subsystem so that the total costs, including ordering, holding, and purchasing costs, are minimized while the system reliability is maximized simultaneously. A non-dominated sorting genetic algorithm-II (NSGA-II), a multi-objective particle swarm optimization (MOPSO), and a multi-objective harmony search (MOHS) algorithm are applied to obtain the optimal Pareto solutions. Since no benchmark is available in the literature, some numerical examples are generated randomly to evaluate the results of NSGA-II on the proposed IRAP. The results are in favor of NSGA-II. - Highlights: • An inventory control system employing an all-unit discount policy is considered in the proposed model. • The proposed model considers limited total budget, storage space, transportation capacity, and total weight. Moreover, a penalty function is used to penalize infeasible solutions. • The overall goal is to find the optimal number of components purchased for each subsystem so that the total costs, including ordering, holding, and purchasing costs, are minimized and the system reliability is maximized simultaneously. • An NSGA-II algorithm is derived, and a multi-objective particle swarm optimization and a multi-objective harmony search algorithm are used to evaluate the NSGA-II results.
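The two-objective setting described in the abstract above (minimize total cost, maximize reliability, penalize infeasible solutions) rests on Pareto dominance, which is the comparison at the core of NSGA-II's non-dominated sorting. The sketch below illustrates only that comparison; the additive cost penalty, its `weight` and the function names are assumptions, since the abstract does not say how the penalty is applied.

```python
def penalized(cost, reliability, violation, weight=1000.0):
    """Add a cost penalty proportional to the constraint violation (assumed form)."""
    return (cost + weight * violation, reliability)


def dominates(a, b):
    """True if a = (cost, reliability) is no worse than b in both objectives
    and strictly better in at least one (cost minimized, reliability maximized)."""
    no_worse = a[0] <= b[0] and a[1] >= b[1]
    strictly_better = a[0] < b[0] or a[1] > b[1]
    return no_worse and strictly_better


def pareto_front(points):
    """Keep the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]
```

For example, among the points (10, 0.90), (12, 0.95), (11, 0.85) and (15, 0.95), only the first two are non-dominated, because each of the other two is beaten on cost at equal or better reliability.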
Algorithmic Trading with Developmental and Linear Genetic Programming

Science.gov (United States)

Wilson, Garnett; Banzhaf, Wolfgang

A developmental co-evolutionary genetic programming approach (PAM DGP) and a standard linear genetic programming (LGP) stock trading system are applied to a number of stocks across market sectors. Both GP techniques were found to be robust to market fluctuations and reactive to opportunities associated with stock price rise and fall, with PAM DGP generating notably greater profit in some stock trend scenarios. Both algorithms were very accurate at buying to achieve profit and selling to protect assets, while exhibiting both moderate trading activity and the ability to maximize or minimize investment as appropriate. The content of the trading rules produced by both algorithms is also examined in relation to stock price trend scenarios.
HPC-NMF: A High-Performance Parallel Algorithm for Nonnegative Matrix Factorization

Energy Technology Data Exchange (ETDEWEB)

2016-08-22

NMF is a useful tool for many applications in different domains such as topic modeling in text mining, background separation in video analysis, and community detection in social networks. Despite its popularity in the data mining community, there is a lack of efficient distributed algorithms to solve the problem for big data sets. We propose a high-performance distributed-memory parallel algorithm that computes the factorization by iteratively solving alternating non-negative least squares (NLS) subproblems for $W$ and $H$. It maintains the data and factor matrices in memory (distributed across processors), uses MPI for interprocessor communication, and, in the dense case, provably minimizes communication costs (under mild assumptions). As opposed to previous implementations, our algorithm is also flexible: it performs well for both dense and sparse matrices, and allows the user to choose any one of multiple algorithms for solving the updates to the low rank factors $W$ and $H$ within the alternating iterations.
Steam condenser optimization using Real-parameter Genetic Algorithm for Prototype Fast Breeder Reactor

International Nuclear Information System (INIS)

Jayalal, M.L.; Kumar, L. Satish; Jehadeesan, R.; Rajeswari, S.; Satya Murty, S.A.V.; Balasubramaniyan, V.; Chetal, S.C.

2011-01-01

Highlights: → We model design optimization of a vital reactor component using Genetic Algorithm. → Real-parameter Genetic Algorithm is used for steam condenser optimization study. → Comparison analysis done with various Genetic Algorithm related mechanisms. → The results obtained are validated with the reference study results. - Abstract: This work explores the use of Real-parameter Genetic Algorithm and analyses its performance in the steam condenser (or Circulating Water System) optimization study of a 500 MW fast breeder nuclear reactor. Choosing optimum design parameters for the condenser of a power plant from among a large number of technically viable combinations is a complex task. This is primarily due to the conflicting nature of the economic implications of the different system parameters for maximizing the capitalized profit. In order to find the optimum design parameters, a Real-parameter Genetic Algorithm model is developed and applied. The results obtained are validated with the reference study results.
Parallel Algorithms for Monte Carlo Particle Transport Simulation on Exascale Computing Architectures

Science.gov (United States)

Romano, Paul Kollath

Monte Carlo particle transport methods are being considered as a viable option for high-fidelity simulation of nuclear reactors. While Monte Carlo methods offer several potential advantages over deterministic methods, there are a number of algorithmic shortcomings that would prevent their immediate adoption for full-core analyses. In this thesis, algorithms are proposed both to ameliorate the degradation in parallel efficiency typically observed for large numbers of processors and to offer a means of decomposing large tally data that will be needed for reactor analysis. A nearest-neighbor fission bank algorithm was proposed and subsequently implemented in the OpenMC Monte Carlo code. A theoretical analysis of the communication pattern shows that the expected cost is O(√N) whereas traditional fission bank algorithms are O(N) at best. The algorithm was tested on two supercomputers, the Intrepid Blue Gene/P and the Titan Cray XK7, and demonstrated nearly linear parallel scaling up to 163,840 processor cores on a full-core benchmark problem. An algorithm for reducing network communication arising from tally reduction was analyzed and implemented in OpenMC. The proposed algorithm groups only particle histories on a single processor into batches for tally purposes; in doing so it prevents all network communication for tallies until the very end of the simulation. The algorithm was tested, again on a full-core benchmark, and shown to reduce network communication substantially. A model was developed to predict the impact of load imbalances on the performance of domain decomposed simulations. The analysis demonstrated that load imbalances in domain decomposed simulations arise from two distinct phenomena: non-uniform particle densities and non-uniform spatial leakage. The dominant performance penalty for domain decomposition was shown to come from these physical effects rather than insufficient network bandwidth or high latency. The model predictions were verified with
Genetic local search algorithm for optimization design of diffractive optical elements.

Science.gov (United States)

Zhou, G; Chen, Y; Wang, Z; Song, H

1999-07-10

We propose a genetic local search algorithm (GLSA) for the optimization design of diffractive optical elements (DOEs). This hybrid algorithm incorporates advantages of both genetic algorithm (GA) and local search techniques. It appears better able to locate the global minimum than a canonical GA. Sample cases investigated here include the optimization design of binary-phase Dammann gratings, continuous surface-relief grating array generators, and a uniform top-hat focal plane intensity profile generator. Two GLSAs are investigated, one incorporating the hill-climbing method and the other the simulated annealing algorithm as its local search technique. Numerical experimental results demonstrate that the proposed algorithm is highly efficient and robust. DOEs that have high diffraction efficiency and excellent uniformity can be achieved by use of the algorithm we propose.
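The hybrid idea in the abstract above (a GA whose individuals are refined by a local search) can be illustrated with a minimal Python sketch. This is not the authors' GLSA: the bit-string encoding, truncation selection, one-point crossover, bit-flip mutation, and the toy `mismatch` cost against a made-up `TARGET` pattern are all assumptions chosen only to keep the example short and runnable. Hill climbing is used as the local search; simulated annealing could be substituted in the same place.

```python
import random

# Hypothetical toy problem: recover a fixed binary pattern (not a real DOE design).
TARGET = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1]

def mismatch(bits):
    """Toy cost: number of positions that differ from TARGET (lower is better)."""
    return sum(b != t for b, t in zip(bits, TARGET))

def hill_climb(bits, cost):
    """Greedy local search: keep flipping single bits while a flip lowers the cost."""
    bits = list(bits)
    best = cost(bits)
    improved = True
    while improved:
        improved = False
        for i in range(len(bits)):
            bits[i] ^= 1                  # try flipping bit i
            c = cost(bits)
            if c < best:
                best, improved = c, True  # keep the improving flip
            else:
                bits[i] ^= 1              # undo the flip
    return bits, best

def genetic_local_search(cost, n_bits, pop_size=20, generations=30, p_mut=0.05, seed=0):
    """GA loop in which every new individual is refined by hill climbing."""
    rng = random.Random(seed)
    pop = [hill_climb([rng.randint(0, 1) for _ in range(n_bits)], cost)
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda ind: ind[1])
        parents = pop[: pop_size // 2]            # truncation selection (assumed)
        children = []
        while len(parents) + len(children) < pop_size:
            (a, _), (b, _) = rng.sample(parents, 2)
            cut = rng.randrange(1, n_bits)        # one-point crossover (assumed)
            child = a[:cut] + b[cut:]
            child = [bit ^ (rng.random() < p_mut) for bit in child]  # bit-flip mutation
            children.append(hill_climb(child, cost))                 # local refinement
        pop = parents + children
    return min(pop, key=lambda ind: ind[1])
```

Calling `genetic_local_search(mismatch, len(TARGET))` returns a `(bits, cost)` pair. On this toy cost the local search alone already reaches the optimum, so the example shows only where the refinement step sits in the loop, not the benefit reported in the paper.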
Genetic algorithms - A new technique for solving the neutron spectrum unfolding problem

International Nuclear Information System (INIS)

Freeman, David W.; Edwards, D. Ray; Bolon, Albert E.

1999-01-01

A new technique utilizing genetic algorithms has been applied to the Bonner sphere neutron spectrum unfolding problem. Genetic algorithms are part of a relatively new field of 'evolutionary' solution techniques that mimic living systems with computer-simulated 'chromosome' solutions. Solutions mate and mutate to create better solutions. Several benchmark problems, considered representative of radiation protection environments, have been evaluated using the newly developed UMRGA code which implements the genetic algorithm unfolding technique. The results are compared with results from other well-established unfolding codes. The genetic algorithm technique works remarkably well and produces solutions with relatively high spectral qualities. UMRGA appears to be a superior technique in the absence of a priori data; it does not rely on 'lucky' guesses of input spectra. Calculated personnel doses associated with the unfolded spectra match benchmark values within a few percent.
A modified gravitational search algorithm based on a non-dominated sorting genetic approach for hydro-thermal-wind economic emission dispatching

International Nuclear Information System (INIS)

Chen, Fang; Zhou, Jianzhong; Wang, Chao; Li, Chunlong; Lu, Peng

2017-01-01

Wind power is a type of clean and renewable energy, and reasonable utilization of wind power is beneficial to environmental protection and economic development. Therefore, a short-term hydro-thermal-wind economic emission dispatching (SHTW-EED) problem is presented in this paper. The proposed problem aims to distribute the load among hydro, thermal and wind power units to simultaneously minimize economic cost and pollutant emission. To solve the SHTW-EED problem with complex constraints, a modified gravitational search algorithm based on the non-dominated sorting genetic algorithm-III (MGSA-NSGA-III) is proposed. In the proposed MGSA-NSGA-III, a non-dominated sorting approach, reference-point based selection mechanism and chaotic mutation strategy are applied to improve the evolutionary process of the original gravitational search algorithm (GSA) and maintain the distribution diversity of Pareto optimal solutions. Moreover, a parallel computing strategy is introduced to improve the computational efficiency. Finally, the proposed MGSA-NSGA-III is applied to a typical hydro-thermal-wind system to verify its feasibility and effectiveness. The simulation results indicate that the proposed algorithm can obtain low economic cost and small pollutant emission when dealing with the SHTW-EED problem.

Highlights:
• A hybrid algorithm is proposed to handle hydro-thermal-wind power dispatching.
• Several improvement strategies are applied to the algorithm.
• A parallel computing strategy is applied to improve computational efficiency.
• Two cases are analyzed to verify the efficiency of the optimize mode.
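The non-dominated sorting step named in this abstract can be sketched in a few lines of Python for a two-objective minimization such as (economic cost, pollutant emission). The sketch below is a naive front-peeling version written for clarity, not the fast sorting procedure used in NSGA implementations, and it does not cover the reference-point selection or chaotic mutation of MGSA-NSGA-III. The sample points are invented for illustration.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and strictly better in one.

    All objectives are assumed to be minimized.
    """
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))

def non_dominated_sort(points):
    """Peel off successive Pareto fronts; each front is a sorted list of indices."""
    remaining = set(range(len(points)))
    fronts = []
    while remaining:
        # A point belongs to the current front if nothing still remaining dominates it.
        front = sorted(
            i for i in remaining
            if not any(dominates(points[j], points[i]) for j in remaining if j != i)
        )
        fronts.append(front)
        remaining -= set(front)
    return fronts

# Hypothetical (cost, emission) pairs for five candidate dispatch schedules.
schedules = [(1, 5), (2, 3), (4, 1), (3, 4), (5, 5)]
fronts = non_dominated_sort(schedules)
```

Here `fronts[0]` holds the indices of schedules that no other schedule beats on both objectives at once; later fronts hold progressively dominated schedules.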
Simulating Evolution of Drosophila melanogaster Ebony Mutants Using a Genetic Algorithm

DEFF Research Database (Denmark)

Helles, Glennie

2009-01-01

Genetic algorithms are generally quite easy to understand and work with, and they are a popular choice in many cases. One area in which genetic algorithms are widely and successfully used is artificial life where they are used to simulate evolution of artificial creatures. However, despite...... their suggestive name, simplicity and popularity in artificial life, they do not seem to have gained a footing within the field of population genetics to simulate evolution of real organisms --- possibly because genetic algorithms are based on a rather crude simplification of the evolutionary mechanisms known...
A case study of a multiobjective recombinative genetic algorithm with coevolutionary sharing

NARCIS (Netherlands)

Neef, R.M.; Thierens, D.; Arciszewski, H.F.R.

1999-01-01

We present a multiobjective genetic algorithm that incorporates various genetic algorithm techniques that have been proven to be efficient and robust in their problem domain. More specifically, we integrate rank based selection, adaptive niching through coevolutionary sharing, elitist recombination,
A MapReduce-Based Parallel Frequent Pattern Growth Algorithm for Spatiotemporal Association Analysis of Mobile Trajectory Big Data

Directory of Open Access Journals (Sweden)

Dawen Xia

2018-01-01

Frequent pattern mining is an effective approach for spatiotemporal association analysis of mobile trajectory big data in data-driven intelligent transportation systems. While existing parallel algorithms have been successfully applied to frequent pattern mining of large-scale trajectory data, two major challenges are how to overcome the inherent defects of Hadoop to cope with taxi trajectory big data including massive small files and how to discover the implicit spatiotemporal frequent patterns with MapReduce. To conquer these challenges, this paper presents a MapReduce-based Parallel Frequent Pattern growth (MR-PFP) algorithm to analyze the spatiotemporal characteristics of taxi operations using large-scale taxi trajectories with massive small file processing strategies on a Hadoop platform. More specifically, we first implement three methods, that is, Hadoop Archives (HAR), CombineFileInputFormat (CFIF), and Sequence Files (SF), to overcome the existing defects of Hadoop and then propose two strategies based on their performance evaluations. Next, we incorporate SF into the Frequent Pattern growth (FP-growth) algorithm and then implement the optimized FP-growth algorithm on a MapReduce framework. Finally, we analyze the characteristics of taxi operations in both spatial and temporal dimensions by MR-PFP in parallel. The results demonstrate that MR-PFP is superior to the existing Parallel FP-growth (PFP) algorithm in efficiency and scalability.
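As a rough illustration of how frequent pattern mining maps onto MapReduce, the Python sketch below mimics only the first pass that FP-growth needs: counting item supports with a map step and a reduce step, then dropping items below a minimum support. It runs in a single process and does not use Hadoop; the function names and the sample trajectory records (made-up road-segment labels) are assumptions for illustration, not part of MR-PFP.

```python
from collections import defaultdict
from itertools import chain

def map_phase(transaction):
    """Map: emit (item, 1) once per distinct item in a transaction."""
    return [(item, 1) for item in set(transaction)]

def reduce_phase(pairs):
    """Reduce: sum the emitted counts for each item."""
    counts = defaultdict(int)
    for item, n in pairs:
        counts[item] += n
    return dict(counts)

def frequent_items(transactions, min_support):
    """Support counting pass: keep items seen in at least min_support transactions."""
    pairs = chain.from_iterable(map(map_phase, transactions))
    counts = reduce_phase(pairs)
    return {item: n for item, n in counts.items() if n >= min_support}

# Hypothetical trips, each listed as the road segments it passed through.
trips = [["A", "B", "C"], ["A", "C"], ["B", "C"], ["A", "D"]]
supports = frequent_items(trips, min_support=2)
```

In a real MapReduce job the map calls would run on separate input splits and the framework would group pairs by key before the reduce; the later steps of parallel FP-growth (building and mining conditional trees per item group) are not shown.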
