WorldWideScience

Sample records for intel fortran compiler

  1. OpenMP-accelerated SWAT simulation using Intel C and FORTRAN compilers: Development and benchmark

    Science.gov (United States)

    Ki, Seo Jin; Sugimura, Tak; Kim, Albert S.

    2015-02-01

    We developed a practical method to accelerate execution of Soil and Water Assessment Tool (SWAT) using open (free) computational resources. The SWAT source code (rev 622) was recompiled using a non-commercial Intel FORTRAN compiler in Ubuntu 12.04 LTS Linux platform, and newly named iOMP-SWAT in this study. GNU utilities of make, gprof, and diff were used to develop the iOMP-SWAT package, profile memory usage, and check identicalness of parallel and serial simulations. Among 302 SWAT subroutines, the slowest routines were identified using GNU gprof, and later modified using Open Multiple Processing (OpenMP) library in an 8-core shared memory system. In addition, a C wrapping function was used to rapidly set large arrays to zero by cross compiling with the original SWAT FORTRAN package. A universal speedup ratio of 2.3 was achieved using input data sets of a large number of hydrological response units. As we specifically focus on acceleration of a single SWAT run, the use of iOMP-SWAT for parameter calibrations will significantly improve the performance of SWAT optimization.

  2. A Note on Compiling Fortran

    Energy Technology Data Exchange (ETDEWEB)

    Busby, L. E. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)

    2017-09-01

    Fortran modules tend to serialize compilation of large Fortran projects, by introducing dependencies among the source files. If file A depends on file B, (A uses a module defined by B), you must finish compiling B before you can begin compiling A. Some Fortran compilers (Intel ifort, GNU gfortran and IBM xlf, at least) offer an option to ‘‘verify syntax’’, with the side effect of also producing any associated Fortran module files. As it happens, this option usually runs much faster than the object code generation and optimization phases. For some projects on some machines, it can be advantageous to compile in two passes: The first pass generates the module files, quickly; the second pass produces the object files, in parallel. We achieve a 3.8× speedup in the case study below.

  3. VFC: The Vienna Fortran Compiler

    Directory of Open Access Journals (Sweden)

    Siegfried Benkner

    1999-01-01

    Full Text Available High Performance Fortran (HPF offers an attractive high‐level language interface for programming scalable parallel architectures providing the user with directives for the specification of data distribution and delegating to the compiler the task of generating an explicitly parallel program. Available HPF compilers can handle regular codes quite efficiently, but dramatic performance losses may be encountered for applications which are based on highly irregular, dynamically changing data structures and access patterns. In this paper we introduce the Vienna Fortran Compiler (VFC, a new source‐to‐source parallelization system for HPF+, an optimized version of HPF, which addresses the requirements of irregular applications. In addition to extended data distribution and work distribution mechanisms, HPF+ provides the user with language features for specifying certain information that decisively influence a program’s performance. This comprises data locality assertions, non‐local access specifications and the possibility of reusing runtime‐generated communication schedules of irregular loops. Performance measurements of kernels from advanced applications demonstrate that with a high‐level data parallel language such as HPF+ a performance close to hand‐written message‐passing programs can be achieved even for highly irregular codes.

  4. JLAPACK – Compiling LAPACK FORTRAN to Java

    Directory of Open Access Journals (Sweden)

    David M. Doolin

    1999-01-01

    Full Text Available The JLAPACK project provides the LAPACK numerical subroutines translated from their subset Fortran 77 source into class files, executable by the Java Virtual Machine (JVM and suitable for use by Java programmers. This makes it possible for Java applications or applets, distributed on the World Wide Web (WWW to use established legacy numerical code that was originally written in Fortran. The translation is accomplished using a special purpose Fortran‐to‐Java (source‐to‐source compiler. The LAPACK API will be considerably simplified to take advantage of Java’s object‐oriented design. This report describes the research issues involved in the JLAPACK project, and its current implementation and status.

  5. SVM Support in the Vienna Fortran Compilation System

    OpenAIRE

    Brezany, Peter; Gerndt, Michael; Sipkova, Viera

    1994-01-01

    Vienna Fortran, a machine-independent language extension to Fortran which allows the user to write programs for distributed-memory systems using global addresses, provides the forall-loop construct for specifying irregular computations that do not cause inter-iteration dependences. Compilers for distributed-memory systems generate code that is based on runtime analysis techniques and is only efficient if, in addition, aggressive compile-time optimizations are applied. Since these optimization...

  6. Scientific Programming with High Performance Fortran: A Case Study Using the xHPF Compiler

    Directory of Open Access Journals (Sweden)

    Eric De Sturler

    1997-01-01

    Full Text Available Recently, the first commercial High Performance Fortran (HPF subset compilers have appeared. This article reports on our experiences with the xHPF compiler of Applied Parallel Research, version 1.2, for the Intel Paragon. At this stage, we do not expect very High Performance from our HPF programs, even though performance will eventually be of paramount importance for the acceptance of HPF. Instead, our primary objective is to study how to convert large Fortran 77 (F77 programs to HPF such that the compiler generates reasonably efficient parallel code. We report on a case study that identifies several problems when parallelizing code with HPF; most of these problems affect current HPF compiler technology in general, although some are specific for the xHPF compiler. We discuss our solutions from the perspective of the scientific programmer, and presenttiming results on the Intel Paragon. The case study comprises three programs of different complexity with respect to parallelization. We use the dense matrix-matrix product to show that the distribution of arrays and the order of nested loops significantly influence the performance of the parallel program. We use Gaussian elimination with partial pivoting to study the parallelization strategy of the compiler. There are various ways to structure this algorithm for a particular data distribution. This example shows how much effort may be demanded from the programmer to support the compiler in generating an efficient parallel implementation. Finally, we use a small application to show that the more complicated structure of a larger program may introduce problems for the parallelization, even though all subroutines of the application are easy to parallelize by themselves. The application consists of a finite volume discretization on a structured grid and a nested iterative solver. Our case study shows that it is possible to obtain reasonably efficient parallel programs with xHPF, although the compiler

  7. OpenMP GNU and Intel Fortran programs for solving the time-dependent Gross-Pitaevskii equation

    Science.gov (United States)

    Young-S., Luis E.; Muruganandam, Paulsamy; Adhikari, Sadhan K.; Lončar, Vladimir; Vudragović, Dušan; Balaž, Antun

    2017-11-01

    We present Open Multi-Processing (OpenMP) version of Fortran 90 programs for solving the Gross-Pitaevskii (GP) equation for a Bose-Einstein condensate in one, two, and three spatial dimensions, optimized for use with GNU and Intel compilers. We use the split-step Crank-Nicolson algorithm for imaginary- and real-time propagation, which enables efficient calculation of stationary and non-stationary solutions, respectively. The present OpenMP programs are designed for computers with multi-core processors and optimized for compiling with both commercially-licensed Intel Fortran and popular free open-source GNU Fortran compiler. The programs are easy to use and are elaborated with helpful comments for the users. All input parameters are listed at the beginning of each program. Different output files provide physical quantities such as energy, chemical potential, root-mean-square sizes, densities, etc. We also present speedup test results for new versions of the programs. Program files doi:http://dx.doi.org/10.17632/y8zk3jgn84.2 Licensing provisions: Apache License 2.0 Programming language: OpenMP GNU and Intel Fortran 90. Computer: Any multi-core personal computer or workstation with the appropriate OpenMP-capable Fortran compiler installed. Number of processors used: All available CPU cores on the executing computer. Journal reference of previous version: Comput. Phys. Commun. 180 (2009) 1888; ibid.204 (2016) 209. Does the new version supersede the previous version?: Not completely. It does supersede previous Fortran programs from both references above, but not OpenMP C programs from Comput. Phys. Commun. 204 (2016) 209. Nature of problem: The present Open Multi-Processing (OpenMP) Fortran programs, optimized for use with commercially-licensed Intel Fortran and free open-source GNU Fortran compilers, solve the time-dependent nonlinear partial differential (GP) equation for a trapped Bose-Einstein condensate in one (1d), two (2d), and three (3d) spatial dimensions for

  8. New compilers speed up applications for Intel-based systems; Intel Compilers pave the way for Intel's Hyper-threading technology

    CERN Multimedia

    2002-01-01

    "Intel Corporation today introduced updated tools to help software developers optimize applications for Intel's expanding family of architectures with key innovations such as Intel's Hyper Threading Technology (1 page).

  9. GRESS, FORTRAN Pre-compiler with Differentiation Enhancement

    International Nuclear Information System (INIS)

    1999-01-01

    1 - Description of program or function: The GRESS FORTRAN pre-compiler (SYMG) and run-time library are used to enhance conventional FORTRAN-77 programs with analytic differentiation of arithmetic statements for automatic differentiation in either forward or reverse mode. GRESS 3.0 is functionally equivalent to GRESS 2.1. GRESS 2.1 is an improved and updated version of the previous released GRESS 1.1. Improvements in the implementation of a the CHAIN option have resulted in a 70 to 85% reduction in execution time and up to a 50% reduction in memory required for forward chaining applications. 2 - Method of solution: GRESS uses a pre-compiler to analyze FORTRAN statements and determine the mathematical operations embodied in them. As each arithmetic assignment statement in a program is analyzed, SYMG generates the partial derivatives of the term on the left with respect to each floating-point variable on the right. The result of the pre-compilation step is a new FORTRAN program that can produce derivatives for any REAL (i.e., single or double precision) variable calculated by the model. Consequently, GRESS enhances FORTRAN programs or subprograms by adding the calculation of derivatives along with the original output. Derivatives from a GRESS enhanced model can be used internally (e.g., iteration acceleration) or externally (e.g., sensitivity studies). By calling GRESS run-time routines, derivatives can be propagated through the code via the chain rule (referred to as the CHAIN option) or accumulated to create an adjoint matrix (referred to as the ADGEN option). A third option, GENSUB, makes it possible to process a subset of a program (i.e., a do loop, subroutine, function, a sequence of subroutines, or a whole program) for calculating derivatives of dependent variables with respect to independent variables. A code enhanced with the GENSUB option can use forward mode, reverse mode, or a hybrid of the two modes. 3 - Restrictions on the complexity of the problem: GRESS

  10. The FORTRAN NALAP code adapted to a microcomputer compiler

    International Nuclear Information System (INIS)

    Lobo, Paulo David de Castro; Borges, Eduardo Madeira; Braz Filho, Francisco Antonio; Guimaraes, Lamartine Nogueira Frutuoso

    2010-01-01

    The Nuclear Energy Division of the Institute for Advanced Studies (IEAv) is conducting the TERRA project (TEcnologia de Reatores Rapidos Avancados), Technology for Advanced Fast Reactors project, aimed at a space reactor application. In this work, to attend the TERRA project, the NALAP code adapted to a microcomputer compiler called Compaq Visual Fortran (Version 6.6) is presented. This code, adapted from the light water reactor transient code RELAP 3B, simulates thermal-hydraulic responses for sodium cooled fast reactors. The strategy to run the code in a PC was divided in some steps mainly to remove unnecessary routines, to eliminate old statements, to introduce new ones and also to include extension precision mode. The source program was able to solve three sample cases under conditions of protected transients suggested in literature: the normal reactor shutdown, with a delay of 200 ms to start the control rod movement and a delay of 500 ms to stop the pumps; reactor scram after transient of loss of flow; and transients protected from overpower. Comparisons were made with results from the time when the NALAP code was acquired by the IEAv, back in the 80's. All the responses for these three simulations reproduced the calculations performed with the CDC compiler in 1985. Further modifications will include the usage of gas as coolant for the nuclear reactor to allow a Closed Brayton Cycle Loop - CBCL - to be used as a heat/electric converter. (author)

  11. The FORTRAN NALAP code adapted to a microcomputer compiler

    Energy Technology Data Exchange (ETDEWEB)

    Lobo, Paulo David de Castro; Borges, Eduardo Madeira; Braz Filho, Francisco Antonio; Guimaraes, Lamartine Nogueira Frutuoso, E-mail: plobo.a@uol.com.b, E-mail: eduardo@ieav.cta.b, E-mail: fbraz@ieav.cta.b, E-mail: guimarae@ieav.cta.b [Instituto de Estudos Avancados (IEAv/CTA), Sao Jose dos Campos, SP (Brazil)

    2010-07-01

    The Nuclear Energy Division of the Institute for Advanced Studies (IEAv) is conducting the TERRA project (TEcnologia de Reatores Rapidos Avancados), Technology for Advanced Fast Reactors project, aimed at a space reactor application. In this work, to attend the TERRA project, the NALAP code adapted to a microcomputer compiler called Compaq Visual Fortran (Version 6.6) is presented. This code, adapted from the light water reactor transient code RELAP 3B, simulates thermal-hydraulic responses for sodium cooled fast reactors. The strategy to run the code in a PC was divided in some steps mainly to remove unnecessary routines, to eliminate old statements, to introduce new ones and also to include extension precision mode. The source program was able to solve three sample cases under conditions of protected transients suggested in literature: the normal reactor shutdown, with a delay of 200 ms to start the control rod movement and a delay of 500 ms to stop the pumps; reactor scram after transient of loss of flow; and transients protected from overpower. Comparisons were made with results from the time when the NALAP code was acquired by the IEAv, back in the 80's. All the responses for these three simulations reproduced the calculations performed with the CDC compiler in 1985. Further modifications will include the usage of gas as coolant for the nuclear reactor to allow a Closed Brayton Cycle Loop - CBCL - to be used as a heat/electric converter. (author)

  12. PGHPF – An Optimizing High Performance Fortran Compiler for Distributed Memory Machines

    Directory of Open Access Journals (Sweden)

    Zeki Bozkus

    1997-01-01

    Full Text Available High Performance Fortran (HPF is the first widely supported, efficient, and portable parallel programming language for shared and distributed memory systems. HPF is realized through a set of directive-based extensions to Fortran 90. It enables application developers and Fortran end-users to write compact, portable, and efficient software that will compile and execute on workstations, shared memory servers, clusters, traditional supercomputers, or massively parallel processors. This article describes a production-quality HPF compiler for a set of parallel machines. Compilation techniques such as data and computation distribution, communication generation, run-time support, and optimization issues are elaborated as the basis for an HPF compiler implementation on distributed memory machines. The performance of this compiler on benchmark programs demonstrates that high efficiency can be achieved executing HPF code on parallel architectures.

  13. Effective SIMD Vectorization for Intel Xeon Phi Coprocessors

    OpenAIRE

    Tian, Xinmin; Saito, Hideki; Preis, Serguei V.; Garcia, Eric N.; Kozhukhov, Sergey S.; Masten, Matt; Cherkasov, Aleksei G.; Panchenko, Nikolay

    2015-01-01

    Efficiently exploiting SIMD vector units is one of the most important aspects in achieving high performance of the application code running on Intel Xeon Phi coprocessors. In this paper, we present several effective SIMD vectorization techniques such as less-than-full-vector loop vectorization, Intel MIC specific alignment optimization, and small matrix transpose/multiplication 2D vectorization implemented in the Intel C/C++ and Fortran production compilers for Intel Xeon Phi coprocessors. A ...

  14. Installation of a new Fortran compiler and effective programming method on the vector supercomputer

    International Nuclear Information System (INIS)

    Nemoto, Toshiyuki; Suzuki, Koichiro; Watanabe, Kenji; Machida, Masahiko; Osanai, Seiji; Isobe, Nobuo; Harada, Hiroo; Yokokawa, Mitsuo

    1992-07-01

    The Fortran compiler, version 10 has been replaced with the new one, version 12 (V12) on the Fujitsu Computer system at JAERI since May, 1992. The benchmark test for the performance of the V12 compiler is carried out with 16 representative nuclear codes in advance of the installation of the compiler. The performance of the compiler is achieved by the factor of 1.13 in average. The effect of the enhanced functions of the compiler and the compatibility to the nuclear codes are also examined. The assistant tool for vectorization TOP10EX is developed. In this report, the results of the evaluation of the V12 compiler and the usage of the tools for vectorization are presented. (author)

  15. Fortran

    CERN Document Server

    Marateck, Samuel L

    1977-01-01

    FORTRAN is written for students who have no prior knowledge of computers or programming. The book aims to teach students how to program using the FORTRAN language.The publication first elaborates on an introduction to computers and programming, introduction to FORTRAN, and calculations and the READ statement. Discussions focus on flow charts, rounding numbers, strings, executing the program, the WRITE and FORMAT statements, performing an addition, input and output devices, and algorithms. The text then takes a look at functions and the IF statement and the DO Loop, the IF-THEN-ELSE and the WHI

  16. Effective SIMD Vectorization for Intel Xeon Phi Coprocessors

    Directory of Open Access Journals (Sweden)

    Xinmin Tian

    2015-01-01

    Full Text Available Efficiently exploiting SIMD vector units is one of the most important aspects in achieving high performance of the application code running on Intel Xeon Phi coprocessors. In this paper, we present several effective SIMD vectorization techniques such as less-than-full-vector loop vectorization, Intel MIC specific alignment optimization, and small matrix transpose/multiplication 2D vectorization implemented in the Intel C/C++ and Fortran production compilers for Intel Xeon Phi coprocessors. A set of workloads from several application domains is employed to conduct the performance study of our SIMD vectorization techniques. The performance results show that we achieved up to 12.5x performance gain on the Intel Xeon Phi coprocessor. We also demonstrate a 2000x performance speedup from the seamless integration of SIMD vectorization and parallelization.

  17. Domain-Specific Acceleration and Auto-Parallelization of Legacy Scientific Code in FORTRAN 77 using Source-to-Source Compilation

    OpenAIRE

    Vanderbauwhede, Wim; Davidson, Gavin

    2017-01-01

    Massively parallel accelerators such as GPGPUs, manycores and FPGAs represent a powerful and affordable tool for scientists who look to speed up simulations of complex systems. However, porting code to such devices requires a detailed understanding of heterogeneous programming tools and effective strategies for parallelization. In this paper we present a source to source compilation approach with whole-program analysis to automatically transform single-threaded FORTRAN 77 legacy code into Ope...

  18. Programming in Fortran M

    Energy Technology Data Exchange (ETDEWEB)

    Foster, I.; Olson, R.; Tuecke, S.

    1993-08-01

    Fortran M is a small set of extensions to Fortran that supports a modular approach to the construction of sequential and parallel programs. Fortran M programs use channels to plug together processes which may be written in Fortran M or Fortran 77. Processes communicate by sending and receiving messages on channels. Channels and processes can be created dynamically, but programs remain deterministic unless specialized nondeterministic constructs are used. Fortran M programs can execute on a range of sequential, parallel, and networked computers. This report incorporates both a tutorial introduction to Fortran M and a users guide for the Fortran M compiler developed at Argonne National Laboratory. The Fortran M compiler, supporting software, and documentation are made available free of charge by Argonne National Laboratory, but are protected by a copyright which places certain restrictions on how they may be redistributed. See the software for details. The latest version of both the compiler and this manual can be obtained by anonymous ftp from Argonne National Laboratory in the directory pub/fortran-m at info.mcs.anl.gov.

  19. FPP: A Fortran preprocessor

    International Nuclear Information System (INIS)

    Boyarski, A.

    1992-11-01

    FPP is a preprocessor which aids in porting Fortran source code across differing platforms. It provides conditional compilation features to enable or disable sections of code, and can modify file names in INCLUDE statements to a syntax suitable for a target platform. FPP is written Fortran 77, and runs on VM/CMS, VAX/VMS, UNIX, and PC/DOS SYSTEMS

  20. Using Coarrays to Parallelize Legacy Fortran Applications: Strategy and Case Study

    Directory of Open Access Journals (Sweden)

    Hari Radhakrishnan

    2015-01-01

    Full Text Available This paper summarizes a strategy for parallelizing a legacy Fortran 77 program using the object-oriented (OO and coarray features that entered Fortran in the 2003 and 2008 standards, respectively. OO programming (OOP facilitates the construction of an extensible suite of model-verification and performance tests that drive the development. Coarray parallel programming facilitates a rapid evolution from a serial application to a parallel application capable of running on multicore processors and many-core accelerators in shared and distributed memory. We delineate 17 code modernization steps used to refactor and parallelize the program and study the resulting performance. Our initial studies were done using the Intel Fortran compiler on a 32-core shared memory server. Scaling behavior was very poor, and profile analysis using TAU showed that the bottleneck in the performance was due to our implementation of a collective, sequential summation procedure. We were able to improve the scalability and achieve nearly linear speedup by replacing the sequential summation with a parallel, binary tree algorithm. We also tested the Cray compiler, which provides its own collective summation procedure. Intel provides no collective reductions. With Cray, the program shows linear speedup even in distributed-memory execution. We anticipate similar results with other compilers once they support the new collective procedures proposed for Fortran 2015.

  1. Recent advances in PC-Linux systems for electronic structure computations by optimized compilers and numerical libraries.

    Science.gov (United States)

    Yu, Jen-Shiang K; Yu, Chin-Hui

    2002-01-01

    One of the most frequently used packages for electronic structure research, GAUSSIAN 98, is compiled on Linux systems with various hardware configurations, including AMD Athlon (with the "Thunderbird" core), AthlonMP, and AthlonXP (with the "Palomino" core) systems as well as the Intel Pentium 4 (with the "Willamette" core) machines. The default PGI FORTRAN compiler (pgf77) and the Intel FORTRAN compiler (ifc) are respectively employed with different architectural optimization options to compile GAUSSIAN 98 and test the performance improvement. In addition to the BLAS library included in revision A.11 of this package, the Automatically Tuned Linear Algebra Software (ATLAS) library is linked against the binary executables to improve the performance. Various Hartree-Fock, density-functional theories, and the MP2 calculations are done for benchmarking purposes. It is found that the combination of ifc with ATLAS library gives the best performance for GAUSSIAN 98 on all of these PC-Linux computers, including AMD and Intel CPUs. Even on AMD systems, the Intel FORTRAN compiler invariably produces binaries with better performance than pgf77. The enhancement provided by the ATLAS library is more significant for post-Hartree-Fock calculations. The performance on one single CPU is potentially as good as that on an Alpha 21264A workstation or an SGI supercomputer. The floating-point marks by SpecFP2000 have similar trends to the results of GAUSSIAN 98 package.

  2. Standard Fortran

    International Nuclear Information System (INIS)

    Marshall, N.H.

    1981-01-01

    Because of its vast software investment in Fortran programs, the nuclear community has an inherent interest in the evolution of Fortran. This paper reviews the impact of the new Fortran 77 standard and discusses the projected changes which can be expected in the future

  3. MPI to Coarray Fortran: Experiences with a CFD Solver for Unstructured Meshes

    Directory of Open Access Journals (Sweden)

    Anuj Sharma

    2017-01-01

    Full Text Available High-resolution numerical methods and unstructured meshes are required in many applications of Computational Fluid Dynamics (CFD. These methods are quite computationally expensive and hence benefit from being parallelized. Message Passing Interface (MPI has been utilized traditionally as a parallelization strategy. However, the inherent complexity of MPI contributes further to the existing complexity of the CFD scientific codes. The Partitioned Global Address Space (PGAS parallelization paradigm was introduced in an attempt to improve the clarity of the parallel implementation. We present our experiences of converting an unstructured high-resolution compressible Navier-Stokes CFD solver from MPI to PGAS Coarray Fortran. We present the challenges, methodology, and performance measurements of our approach using Coarray Fortran. With the Cray compiler, we observe Coarray Fortran as a viable alternative to MPI. We are hopeful that Intel and open-source implementations could be utilized in the future.

  4. Scientific Computing and Apple's Intel Transition

    CERN Document Server

    CERN. Geneva

    2006-01-01

    Intel's published processor roadmap and how it may affect the future of personal and scientific computing About the speaker: Eric Albert is Senior Software Engineer in Apple's Core Technologies group. During Mac OS X's transition to Intel processors he has worked on almost every part of the operating system, from the OS kernel and compiler tools to appli...

  5. Accessing Intel FPGAs for Acceleration

    CERN Multimedia

    CERN. Geneva

    2018-01-01

    In this presentation, we will discuss the latest tools and products from Intel that enables FPGAs to be deployed as Accelerators. We will first talk about the Acceleration Stack for Intel Xeon CPU with FPGAs which makes it easy to create, verify, and execute functions on the Intel Programmable Acceleration Card in a Data Center. We will then talk about the OpenCL flow which allows parallel software developers to create FPGA systems and deploy them using the OpenCL standard. Next, we will talk about the Intel High-Level Synthesis compiler which can convert C++ code into custom RTL code optimized for Intel FPGAs. Lastly, we will focus on the task of running Machine Learning inference on the FPGA leveraging some of the tools we discussed. About the speaker Karl Qi is Sr. Staff Applications Engineer, Technical Training. He has been with the Customer Training department at Altera/Intel for 8 years. Most recently, he is responsible for all training content relating to High-Level Design tools, including the OpenCL...

  6. Scientific Programming in Fortran

    Directory of Open Access Journals (Sweden)

    W. Van Snyder

    2007-01-01

    Full Text Available The Fortran programming language was designed by John Backus and his colleagues at IBM to reduce the cost of programming scientific applications. IBM delivered the first compiler for its model 704 in 1957. IBM's competitors soon offered incompatible versions. ANSI (ASA at the time developed a standard, largely based on IBM's Fortran IV in 1966. Revisions of the standard were produced in 1977, 1990, 1995 and 2003. Development of a revision, scheduled for 2008, is under way. Unlike most other programming languages, Fortran is periodically revised to keep pace with developments in language and processor design, while revisions largely preserve compatibility with previous versions. Throughout, the focus on scientific programming, and especially on efficient generated programs, has been maintained.

  7. Numerical performance and throughput benchmark for electronic structure calculations in PC-Linux systems with new architectures, updated compilers, and libraries.

    Science.gov (United States)

    Yu, Jen-Shiang K; Hwang, Jenn-Kang; Tang, Chuan Yi; Yu, Chin-Hui

    2004-01-01

    A number of recently released numerical libraries including Automatically Tuned Linear Algebra Subroutines (ATLAS) library, Intel Math Kernel Library (MKL), GOTO numerical library, and AMD Core Math Library (ACML) for AMD Opteron processors, are linked against the executables of the Gaussian 98 electronic structure calculation package, which is compiled by updated versions of Fortran compilers such as Intel Fortran compiler (ifc/efc) 7.1 and PGI Fortran compiler (pgf77/pgf90) 5.0. The ifc 7.1 delivers about 3% of improvement on 32-bit machines compared to the former version 6.0. Performance improved from pgf77 3.3 to 5.0 is also around 3% when utilizing the original unmodified optimization options of the compiler enclosed in the software. Nevertheless, if extensive compiler tuning options are used, the speed can be further accelerated to about 25%. The performances of these fully optimized numerical libraries are similar. The double-precision floating-point (FP) instruction sets (SSE2) are also functional on AMD Opteron processors operated in 32-bit compilation, and Intel Fortran compiler has performed better optimization. Hardware-level tuning is able to improve memory bandwidth by adjusting the DRAM timing, and the efficiency in the CL2 mode is further accelerated by 2.6% compared to that of the CL2.5 mode. The FP throughput is measured by simultaneous execution of two identical copies of each of the test jobs. Resultant performance impact suggests that IA64 and AMD64 architectures are able to fulfill significantly higher throughput than the IA32, which is consistent with the SpecFPrate2000 benchmarks.

  8. 75 FR 48338 - Intel Corporation; Analysis of Proposed Consent Order to Aid Public Comment

    Science.gov (United States)

    2010-08-10

    ... product road maps, its compilers, and product benchmarking (Sections VI, VII, and VIII). The Proposed... alleges that Intel's failure to fully disclose the changes it made to its compilers and libraries... benchmarking organizations the effects of its compiler redesign on non-Intel CPUs. Several benchmarking...

  9. Ada Compiler Validation Summary Report: Certificate Number: 940325S1. 11352 DDC-I DACS Sun SPARC/Solaries to Pentium PM Bare Ada Cross Compiler System, Version 4.6.4 Sun SPARCclassic = Intel Pentium (Operated as Bare Machine) Based in Xpress Desktop (Intel Product Number: XBASE6E4F-B)

    Science.gov (United States)

    1994-03-25

    Best Available Copy REPORT DOCUMENTATION PAGE _V -ONC Uft Xaf. WO -" Am u~~ ns~ 940325SI. 11352 , AVV: 94ddc5OO_3d. Compiler: DACS Sun SPARC/ aonais to...Manual for the Ada Proarammina Language, ANSI/MIL-STD-1815A, February 1983 and ISO 8652-1987. [Pro92] Ada Coupiler Validation Procedures, Version 3.1...objectives found to be irrelevant for the given Ada implementation. ISO International Organization for Standardization. LRM The Ada standard, or

  10. High Performance Fortran for Aerospace Applications

    National Research Council Canada - National Science Library

    Mehrotra, Piyush

    2000-01-01

    .... HPF is a set of Fortran extensions designed to provide users with a high-level interface for programming data parallel scientific applications while delegating to the compiler/runtime system the task...

  11. Intel Galileo essentials

    CERN Document Server

    Grimmett, Richard

    2015-01-01

    This book is for anyone who has ever been curious about using the Intel Galileo to create electronics projects. Some programming background is useful, but if you know how to use a personal computer, with the aid of the step-by-step instructions in this book, you can construct complex electronics projects that use the Intel Galileo.

  12. Home automation with Intel Galileo

    CERN Document Server

    Dundar, Onur

    2015-01-01

    This book is for anyone who wants to learn Intel Galileo for home automation and cross-platform software development. No knowledge of programming with Intel Galileo is assumed, but knowledge of the C programming language is essential.

  13. Programação orientada a objetos em FORTRAN

    OpenAIRE

    Beck, André Teófilo; Bazán, Felipe Alexander Vargas

    2011-01-01

    Este artigo apresenta conceitos fundamentais de programação orientada a objetos (OO) em FORTRAN. Em geral, os usuários de FORTRAN não estão familiarizados com estes conceitos, pois os compiladores desta linguagem não possuíam suporte para programação OO até o recente lançamento da versão 11.1 do compilador Intel Visual FORTRAN. Este compilador suporta a maioria das características de orientação a objetos do padrão FORTRAN 2003, permitindo a atualização de práticas de programaçã...

  14. Performance Issues in High Performance Fortran Implementations of Sensor-Based Applications

    Directory of Open Access Journals (Sweden)

    David R. O'hallaron

    1997-01-01

    Full Text Available Applications that get their inputs from sensors are an important and often overlooked application domain for High Performance Fortran (HPF. Such sensor-based applications typically perform regular operations on dense arrays, and often have latency and through put requirements that can only be achieved with parallel machines. This article describes a study of sensor-based applications, including the fast Fourier transform, synthetic aperture radar imaging, narrowband tracking radar processing, multibaseline stereo imaging, and medical magnetic resonance imaging. The applications are written in a dialect of HPF developed at Carnegie Mellon, and are compiled by the Fx compiler for the Intel Paragon. The main results of the study are that (1 it is possible to realize good performance for realistic sensor-based applications written in HPF and (2 the performance of the applications is determined by the performance of three core operations: independent loops (i.e., loops with no dependences between iterations, reductions, and index permutations. The article discusses the implications for HPF implementations and introduces some simple tests that implementers and users can use to measure the efficiency of the loops, reductions, and index permutations generated by an HPF compiler.

  15. Intel: High Throughput Computing Collaboration: A CERN openlab / Intel collaboration

    CERN Multimedia

    CERN. Geneva

    2015-01-01

    The Intel/CERN High Throughput Computing Collaboration studies the application of upcoming Intel technologies to the very challenging environment of the LHC trigger and data-acquisition systems. These systems will need to transport and process many terabits of data every second, in some cases with tight latency constraints. Parallelisation and tight integration of accelerators and classical CPU via Intel's OmniPath fabric are the key elements in this project.

  16. LFK, FORTRAN Application Performance Test

    International Nuclear Information System (INIS)

    McMahon, F.H.

    1991-01-01

    1 - Description of program or function: LFK, the Livermore FORTRAN Kernels, is a computer performance test that measures a realistic floating-point performance range for FORTRAN applications. Informally known as the Livermore Loops test, the LFK test may be used as a computer performance test, as a test of compiler accuracy (via checksums) and efficiency, or as a hardware endurance test. The LFK test, which focuses on FORTRAN as used in computational physics, measures the joint performance of the computer CPU, the compiler, and the computational structures in units of Mega-flops/sec or Mflops. A C language version of subroutine KERNEL is also included which executes 24 samples of C numerical computation. The 24 kernels are a hydrodynamics code fragment, a fragment from an incomplete Cholesky conjugate gradient code, the standard inner product function of linear algebra, a fragment from a banded linear equations routine, a segment of a tridiagonal elimination routine, an example of a general linear recurrence equation, an equation of state fragment, part of an alternating direction implicit integration code, an integrate predictor code, a difference predictor code, a first sum, a first difference, a fragment from a two-dimensional particle-in-cell code, a part of a one-dimensional particle-in-cell code, an example of how casually FORTRAN can be written, a Monte Carlo search loop, an example of an implicit conditional computation, a fragment of a two-dimensional explicit hydrodynamics code, a general linear recurrence equation, part of a discrete ordinates transport program, a simple matrix calculation, a segment of a Planck distribution procedure, a two-dimensional implicit hydrodynamics fragment, and determination of the location of the first minimum in an array. 2 - Method of solution: CPU performance rates depend strongly on the maturity of FORTRAN compiler machine code optimization. The LFK test-bed executes the set of 24 kernels three times, resetting the DO

  17. Modern Fortran in practice

    NARCIS (Netherlands)

    Markus, A.

    2012-01-01

    From its earliest days, the Fortran programming language has been designed with computing efficiency in mind. The latest standard, Fortran 2008, incorporates a host of modern features, including object-orientation, array operations, user-defined types, and provisions for parallel computing. This

  18. Windows for Intel Macs

    CERN Document Server

    Ogasawara, Todd

    2008-01-01

    Even the most devoted Mac OS X user may need to use Windows XP, or may just be curious about XP and its applications. This Short Cut is a concise guide for OS X users who need to quickly get comfortable and become productive with Windows XP basics on their Macs. It covers: Security Networking ApplicationsMac users can easily install and use Windows thanks to Boot Camp and Parallels Desktop for Mac. Boot Camp lets an Intel-based Mac install and boot Windows XP on its own hard drive partition. Parallels Desktop for Mac uses virtualization technology to run Windows XP (or other operating systems

  19. MORTRAN-2, FORTRAN Language Extension with User-Supplied Macros

    International Nuclear Information System (INIS)

    Cook, A. James; Shustek, L.J.

    1980-01-01

    1 - Description of problem or function: MORTRAN2 is a FORTRAN language extension that permits a relatively easy transition from FORTRAN to a more convenient and structured language. Its features include free-field format; alphanumeric statement labels; flexible comment convention; nested block structure; for-by-to, do, while, until, loop, if-then-else, if-else, exit, and next statements; multiple assignment statements; conditional compilation; and automatic listing indentation. The language is implemented by a macro-based pre-processor and is further extensible by user-defined macros. 2 - Method of solution: The MORTRAN2 pre-processor may be regarded as a compiler whose object code is ANSI Standard FORTRAN. The MORTRAN2 language is dynamically defined by macros which are input at each use of the pre-processor. 3 - Restrictions on the complexity of the problem: The pre-processor output must be accepted by a FORTRAN compiler

  20. INTEL: Intel based systems move up in supercomputing ranks

    CERN Multimedia

    2002-01-01

    "The TOP500 supercomputer rankings released today at the Supercomputing 2002 conference show a dramatic increase in the number of Intel-based systems being deployed in high-performance computing (HPC) or supercomputing areas" (1/2 page).

  1. Classical Fortran programming for engineering and scientific applications

    CERN Document Server

    Kupferschmid, Michael

    2009-01-01

    IntroductionWhy Study Programming?The Evolution of FORTRANWhy Study FORTRAN?Classical FORTRANAbout This BookAdvice to InstructorsAbout the AuthorAcknowledgmentsDisclaimersHello, World!Case Study: A First FORTRAN ProgramCompiling the ProgramRunning a Program in UNIXOmissionsExpressions and Assignment StatementsConstantsVariables and Variable NamesArithmetic OperatorsFunction ReferencesExpressionsA

  2. An INTEL 8080 microprocessor development system

    International Nuclear Information System (INIS)

    Horne, P.J.

    1977-01-01

    The INTEL 8080 has become one of the two most widely used microprocessors at CERN, the other being the MOTOROLA 6800. Even thouth this is the case, there have been, to date, only rudimentary facilities available for aiding the development of application programs for this microprocessor. An ideal development system is one which has a sophisticated editing and filing system, an assembler/compiler, and access to the microprocessor application. In many instances access to a PROM programmer is also required, as the application may utilize only PROMs for program storage. With these thoughts in mind, an INTEL 8080 microprocessor development system was implemented in the Proton Synchrotron (PS) Division. This system utilizes a PDP 11/45 as the editing and file-handling machine, and an MSC 8/MOD 80 microcomputer for assembling, PROM programming and debugging user programs at run time. The two machines are linked by an existing CAMAC crate system which will also provide the means of access to microprocessor applications in CAMAC and the interface of the development system to any other application. (Auth.)

  3. Towards Porting a Real-World Seismological Application to the Intel MIC Architecture

    OpenAIRE

    V. Weinberg

    2014-01-01

    This whitepaper aims to discuss first experiences with porting an MPI-based real-world geophysical application to the new Intel Many Integrated Core (MIC) architecture. The selected code SeisSol is an application written in Fortran that can be used to simulate earthquake rupture and radiating seismic wave propagation in complex 3-D heterogeneous materials. The PRACE prototype cluster EURORA at CINECA, Italy, was accessed to analyse the MPI-performance of SeisSol on Intel Xeon Phi on both sing...

  4. Multi-processing CTH: Porting legacy FORTRAN code to MP hardware

    Energy Technology Data Exchange (ETDEWEB)

    Bell, R.L.; Elrick, M.G.; Hertel, E.S. Jr.

    1996-12-31

    CTH is a family of codes developed at Sandia National Laboratories for use in modeling complex multi-dimensional, multi-material problems that are characterized by large deformations and/or strong shocks. A two-step, second-order accurate Eulerian solution algorithm is used to solve the mass, momentum, and energy conservation equations. CTH has historically been run on systems where the data are directly accessible to the cpu, such as workstations and vector supercomputers. Multiple cpus can be used if all data are accessible to all cpus. This is accomplished by placing compiler directives or subroutine calls within the source code. The CTH team has implemented this scheme for Cray shared memory machines under the Unicos operating system. This technique is effective, but difficult to port to other (similar) shared memory architectures because each vendor has a different format of directives or subroutine calls. A different model of high performance computing is one where many (> 1,000) cpus work on a portion of the entire problem and communicate by passing messages that contain boundary data. Most, if not all, codes that run effectively on parallel hardware were written with a parallel computing paradigm in mind. Modifying an existing code written for serial nodes poses a significantly different set of challenges that will be discussed. CTH, a legacy FORTRAN code, has been modified to allow for solutions on distributed memory parallel computers such as the IBM SP2, the Intel Paragon, Cray T3D, or a network of workstations. The message passing version of CTH will be discussed and example calculations will be presented along with performance data. Current timing studies indicate that CTH is 2--3 times faster than equivalent C++ code written specifically for parallel hardware. CTH on the Intel Paragon exhibits linear speed up with problems that are scaled (constant problem size per node) for the number of parallel nodes.

  5. An Initial Evaluation of the NAG f90 Compiler

    Directory of Open Access Journals (Sweden)

    Michael Metcalf

    1992-01-01

    Full Text Available A few weeks before the formal publication of the ISO Fortran 90 Standard, NAG announced the world's first f90 compiler. We have evaluated the compiler by using it to assess the impact of Fortran 90 on the CERN Program Library.

  6. Intel Xeon Phi coprocessor high performance programming

    CERN Document Server

    Jeffers, James

    2013-01-01

    Authors Jim Jeffers and James Reinders spent two years helping educate customers about the prototype and pre-production hardware before Intel introduced the first Intel Xeon Phi coprocessor. They have distilled their own experiences coupled with insights from many expert customers, Intel Field Engineers, Application Engineers and Technical Consulting Engineers, to create this authoritative first book on the essentials of programming for this new architecture and these new products. This book is useful even before you ever touch a system with an Intel Xeon Phi coprocessor. To ensure that your applications run at maximum efficiency, the authors emphasize key techniques for programming any modern parallel computing system whether based on Intel Xeon processors, Intel Xeon Phi coprocessors, or other high performance microprocessors. Applying these techniques will generally increase your program performance on any system, and better prepare you for Intel Xeon Phi coprocessors and the Intel MIC architecture. It off...

  7. Fortran code for SU(3) lattice gauge theory with and without MPI checkerboard parallelization

    Science.gov (United States)

    Berg, Bernd A.; Wu, Hao

    2012-10-01

    We document plain Fortran and Fortran MPI checkerboard code for Markov chain Monte Carlo simulations of pure SU(3) lattice gauge theory with the Wilson action in D dimensions. The Fortran code uses periodic boundary conditions and is suitable for pedagogical purposes and small scale simulations. For the Fortran MPI code two geometries are covered: the usual torus with periodic boundary conditions and the double-layered torus as defined in the paper. Parallel computing is performed on checkerboards of sublattices, which partition the full lattice in one, two, and so on, up to D directions (depending on the parameters set). For updating, the Cabibbo-Marinari heatbath algorithm is used. We present validations and test runs of the code. Performance is reported for a number of currently used Fortran compilers and, when applicable, MPI versions. For the parallelized code, performance is studied as a function of the number of processors. Program summary Program title: STMC2LSU3MPI Catalogue identifier: AEMJ_v1_0 Program summary URL:http://cpc.cs.qub.ac.uk/summaries/AEMJ_v1_0.html Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland Licensing provisions: Standard CPC licence, http://cpc.cs.qub.ac.uk/licence/licence.html No. of lines in distributed program, including test data, etc.: 26666 No. of bytes in distributed program, including test data, etc.: 233126 Distribution format: tar.gz Programming language: Fortran 77 compatible with the use of Fortran 90/95 compilers, in part with MPI extensions. Computer: Any capable of compiling and executing Fortran 77 or Fortran 90/95, when needed with MPI extensions. Operating system: Red Hat Enterprise Linux Server 6.1 with OpenMPI + pgf77 11.8-0, Centos 5.3 with OpenMPI + gfortran 4.1.2, Cray XT4 with MPICH2 + pgf90 11.2-0. Has the code been vectorised or parallelized?: Yes, parallelized using MPI extensions. Number of processors used: 2 to 11664 RAM: 200 Mega bytes per process. Classification: 11

  8. Roofline Analysis in the Intel® Advisor to Deliver Optimized Performance for applications on Intel® Xeon Phi™ Processor

    Energy Technology Data Exchange (ETDEWEB)

    Koskela, Tuomas S.; Lobet, Mathieu; Deslippe, Jack; Matveev, Zakhar

    2017-05-23

    recent optimizations to its collision kernel [4], most of the computing time is spent in the electron push (pushe) kernel, where these optimization efforts have been focused. The kernel code scaled well with MPI+OpenMP but had almost no automatic compiler vectorization, in part due to indirect memory addresses and in part due to low trip counts of low-level loops that would be candidates for vectorization. Particle blocking and sorting have been implemented to increase trip counts of low-level loops and improve memory locality, and OpenMP directives have been added to vectorize compute-intensive loops that were identified by Advisor. The optimizations have improved the performance of the pushe kernel 2x on Haswell processors and 1.7x on KNL. The KNL node-for-node performance has been brought to within 30% of a NERSC Cori phase I Haswell node and we expect to bridge this gap by reducing the memory footprint of compute intensive routines to improve cache reuse. ## PICSAR is a Fortran/Python high-performance Particle-In-Cell library targeting at MIC architectures first designed to be coupled with the PIC code WARP for the simulation of laser-matter interaction and particle accelerators. PICSAR also contains a FORTRAN stand-alone kernel for performance studies and benchmarks. A MPI domain decomposition is used between NUMA domains and a tile decomposition (cache-blocking) handled by OpenMP has been added for shared-memory parallelism and better cache management. The so-called current deposition and field gathering steps that compose the PIC time loop constitute major hotspots that have been rewritten to enable more efficient vectorization. Particle communications between tiles and MPI domain has been merged and parallelized. All considered, these improvements provide speedups of 3.1 for order 1 and 4.6 for order 3 interpolation shape factors on KNL configured in SNC4 quadrant flat mode. Performance is similar between a node of cori phase 1 and KNL at order 1 and better on

  9. Compiler issues associated with safety-related software

    International Nuclear Information System (INIS)

    Feinauer, L.R.

    1991-01-01

    A critical issue in the quality assurance of safety-related software is the ability of the software to produce identical results, independent of the host machine, operating system, or compiler version under which the software is installed. A study is performed using the VIPRE-0l, FREY-01, and RETRAN-02 safety-related codes. Results from an IBM 3083 computer are compared with results from a CYBER 860 computer. All three of the computer programs examined are written in FORTRAN; the VIPRE code uses the FORTRAN 66 compiler, whereas the FREY and RETRAN codes use the FORTRAN 77 compiler. Various compiler options are studied to determine their effect on the output between machines. Since the Control Data Corporation and IBM machines inherently represent numerical data differently, methods of producing equivalent accuracy of data representation were an important focus of the study. This paper identifies particular problems in the automatic double-precision option (AUTODBL) of the IBM FORTRAN 1.4.x series of compilers. The IBM FORTRAN version 2 compilers provide much more stable, reliable compilation for engineering software. Careful selection of compilers and compiler options can help guarantee identical results between different machines. To ensure reproducibility of results, the same compiler and compiler options should be used to install the program as were used in the development and testing of the program

  10. Portable parallel programming in a Fortran environment

    International Nuclear Information System (INIS)

    May, E.N.

    1989-01-01

    Experience using the Argonne-developed PARMACs macro package to implement a portable parallel programming environment is described. Fortran programs with intrinsic parallelism of coarse and medium granularity are easily converted to parallel programs which are portable among a number of commercially available parallel processors in the class of shared-memory bus-based and local-memory network based MIMD processors. The parallelism is implemented using standard UNIX (tm) tools and a small number of easily understood synchronization concepts (monitors and message-passing techniques) to construct and coordinate multiple cooperating processes on one or many processors. Benchmark results are presented for parallel computers such as the Alliant FX/8, the Encore MultiMax, the Sequent Balance, the Intel iPSC/2 Hypercube and a network of Sun 3 workstations. These parallel machines are typical MIMD types with from 8 to 30 processors, each rated at from 1 to 10 MIPS processing power. The demonstration code used for this work is a Monte Carlo simulation of the response to photons of a ''nearly realistic'' lead, iron and plastic electromagnetic and hadronic calorimeter, using the EGS4 code system. 6 refs., 2 figs., 2 tabs

  11. Strategies and Experiences Using High Performance Fortran

    National Research Council Canada - National Science Library

    Shires, Dale

    2001-01-01

    .... High performance Fortran (HPF) is a relative new addition to the Fortran dialect It is an attempt to provide an efficient high-level Fortran parallel programming language for the latest generation of been debatable...

  12. Optimizing Performance of Combustion Chemistry Solvers on Intel's Many Integrated Core (MIC) Architectures

    Energy Technology Data Exchange (ETDEWEB)

    Sitaraman, Hariswaran [National Renewable Energy Laboratory (NREL), Golden, CO (United States); Grout, Ray W [National Renewable Energy Laboratory (NREL), Golden, CO (United States)

    2017-06-09

    This work investigates novel algorithm designs and optimization techniques for restructuring chemistry integrators in zero and multidimensional combustion solvers, which can then be effectively used on the emerging generation of Intel's Many Integrated Core/Xeon Phi processors. These processors offer increased computing performance via large number of lightweight cores at relatively lower clock speeds compared to traditional processors (e.g. Intel Sandybridge/Ivybridge) used in current supercomputers. This style of processor can be productively used for chemistry integrators that form a costly part of computational combustion codes, in spite of their relatively lower clock speeds. Performance commensurate with traditional processors is achieved here through the combination of careful memory layout, exposing multiple levels of fine grain parallelism and through extensive use of vendor supported libraries (Cilk Plus and Math Kernel Libraries). Important optimization techniques for efficient memory usage and vectorization have been identified and quantified. These optimizations resulted in a factor of ~ 3 speed-up using Intel 2013 compiler and ~ 1.5 using Intel 2017 compiler for large chemical mechanisms compared to the unoptimized version on the Intel Xeon Phi. The strategies, especially with respect to memory usage and vectorization, should also be beneficial for general purpose computational fluid dynamics codes.

  13. An environment for parallel structuring of Fortran programs

    International Nuclear Information System (INIS)

    Sridharan, K.; McShea, M.; Denton, C.; Eventoff, B.; Browne, J.C.; Newton, P.; Ellis, M.; Grossbard, D.; Wise, T.; Clemmer, D.

    1990-01-01

    The paper describes and illustrates an environment for interactive support of the detection and implementation of macro-level parallelism in Fortran programs. The approach couples algorithms for dependence analysis with both innovative techniques for complexity management and capabilities for the measurement and analysis of the parallel computation structures generated through use of the environment. The resulting environment is complementary to the more common approach of seeking local parallelism by loop unrolling, either by an automatic compiler or manually. (orig.)

  14. A Case Study of Some Issues in the Optimization of Fortran 90 Array Notation

    Directory of Open Access Journals (Sweden)

    John D. McCalpin

    1996-01-01

    Full Text Available Some issues in the relationship of coding style and compiler optimization are discussed with regard to Fortran 90 array notation. A review of several important Fortran 90 array constructs and their performance on vector and scalar hardware sets the stage for a more detailed example based on the kernel of a finite difference computational fluid dynamics model, specifically the nonlinear shallow water equations. Special attention is paid to the optimization of memory use and memory traffic. It is shown that the style of coding interacts with the rules of Fortran 90 and the current state of the art of Fortran 90 compilers to produce a fairly wide range of performance levels. Although performance degradations are typically small, a few cases of more serious loss of effciency are identified and discussed.

  15. Development of the static analyzer ANALYSIS/EX for FORTRAN programs

    International Nuclear Information System (INIS)

    Osanai, Seiji; Yokokawa, Mitsuo

    1993-08-01

    The static analyzer 'ANALYSIS' is the software tool for analyzing tree structure and COMMON regions of a FORTRAN program statically. With the installation of the new FORTRAN compiler, FORTRAN77EX(V12), to the computer system at JAERI, a new version of ANALYSIS, 'ANALYSIS/EX', has been developed to enhance its analyzing functions. In addition to the conventional functions of ANALYSIS, the ANALYSIS/EX is capable of analyzing of FORTRAN programs written in the FORTRAN77EX(V12) language grammar such as large-scale nuclear codes. The analyzing function of COMMON regions are also improved so as to obtain the relation between variables in COMMON regions in more detail. In this report, results of improvement and enhanced functions of the static analyzer ANALYSIS/EX are presented. (author)

  16. Unlock performance secrets of next-gen Intel hardware

    CERN Multimedia

    CERN. Geneva

    2015-01-01

    Intel® Xeon Phi Product. About the speaker Zakhar is a software architect in Intel SSG group. His current role is Parallel Studio architect with focus on SIMD vector parallelism assistance tools. Before it he was working as Intel Advisor XE software architect and software development team-lead. Before joining Intel he was...

  17. Intel Corporation osaleb Eesti koolitusprogrammis / Raivo Juurak

    Index Scriptorium Estoniae

    Juurak, Raivo, 1949-

    2001-01-01

    Haridusministeeriumis tutvustati infotehnoloogiaalast koolitusprogrammi, milles osaleb maailma suuremaid arvutifirmasid Intel Corporation. Koolituskursuse käigus õpetatakse aineõpetajaid oma ainetundides interneti võimalusi kasutama. 50-tunnised kursused viiakse läbi kõigis maakondades

  18. Applications Performance on NAS Intel Paragon XP/S - 15#

    Science.gov (United States)

    Saini, Subhash; Simon, Horst D.; Copper, D. M. (Technical Monitor)

    1994-01-01

    The Numerical Aerodynamic Simulation (NAS) Systems Division received an Intel Touchstone Sigma prototype model Paragon XP/S- 15 in February, 1993. The i860 XP microprocessor with an integrated floating point unit and operating in dual -instruction mode gives peak performance of 75 million floating point operations (NIFLOPS) per second for 64 bit floating point arithmetic. It is used in the Paragon XP/S-15 which has been installed at NAS, NASA Ames Research Center. The NAS Paragon has 208 nodes and its peak performance is 15.6 GFLOPS. Here, we will report on early experience using the Paragon XP/S- 15. We have tested its performance using both kernels and applications of interest to NAS. We have measured the performance of BLAS 1, 2 and 3 both assembly-coded and Fortran coded on NAS Paragon XP/S- 15. Furthermore, we have investigated the performance of a single node one-dimensional FFT, a distributed two-dimensional FFT and a distributed three-dimensional FFT Finally, we measured the performance of NAS Parallel Benchmarks (NPB) on the Paragon and compare it with the performance obtained on other highly parallel machines, such as CM-5, CRAY T3D, IBM SP I, etc. In particular, we investigated the following issues, which can strongly affect the performance of the Paragon: a. Impact of the operating system: Intel currently uses as a default an operating system OSF/1 AD from the Open Software Foundation. The paging of Open Software Foundation (OSF) server at 22 MB to make more memory available for the application degrades the performance. We found that when the limit of 26 NIB per node out of 32 MB available is reached, the application is paged out of main memory using virtual memory. When the application starts paging, the performance is considerably reduced. We found that dynamic memory allocation can help applications performance under certain circumstances. b. Impact of data cache on the i860/XP: We measured the performance of the BLAS both assembly coded and Fortran

  19. Cloudy's Journey from FORTRAN to C, Why and How

    Science.gov (United States)

    Ferland, G. J.

    Cloudy is a large-scale plasma simulation code that is widely used across the astronomical community as an aid in the interpretation of spectroscopic data. The cover of the ADAS VI book featured predictions of the code. The FORTRAN 77 source code has always been freely available on the Internet, contributing to its widespread use. The coming of PCs and Linux has fundamentally changed the computing environment. Modern Fortran compilers (F90 and F95) are not freely available. A common-use code must be written in either FORTRAN 77 or C to be Open Source/GNU/Linux friendly. F77 has serious drawbacks - modern language constructs cannot be used, students do not have skills in this language, and it does not contribute to their future employability. It became clear that the code would have to be ported to C to have a viable future. I describe the approach I used to convert Cloudy from FORTRAN 77 with MILSPEC extensions to ANSI/ISO 89 C. Cloudy is now openly available as a C code, and will evolve to C++ as gcc and standard C++ mature. Cloudy looks to a bright future with a modern language.

  20. Replacing Fortran Namelists with JSON

    Science.gov (United States)

    Robinson, T. E., Jr.

    2017-12-01

    Maintaining a log of input parameters for a climate model is very important to understanding potential causes for answer changes during the development stages. Additionally, since modern Fortran is now interoperable with C, a more modern approach to software infrastructure to include code written in C is necessary. Merging these two separate facets of climate modeling requires a quality control for monitoring changes to input parameters and model defaults that can work with both Fortran and C. JSON will soon replace namelists as the preferred key/value pair input in the GFDL model. By adding a JSON parser written in C into the model, the input can be used by all functions and subroutines in the model, errors can be handled by the model instead of by the internal namelist parser, and the values can be output into a single file that is easily parsable by readily available tools. Input JSON files can handle all of the functionality of a namelist while being portable between C and Fortran. Fortran wrappers using unlimited polymorphism are crucial to allow for simple and compact code which avoids the need for many subroutines contained in an interface. Errors can be handled with more detail by providing information about location of syntax errors or typos. The output JSON provides a ground truth for values that the model actually uses by providing not only the values loaded through the input JSON, but also any default values that were not included. This kind of quality control on model input is crucial for maintaining reproducibility and understanding any answer changes resulting from changes in the input.

  1. Fortran 90 for scientists and engineers

    CERN Document Server

    Hahn, Brian

    1994-01-01

    The introduction of the Fortran 90 standard is the first significant change in the Fortran language in over 20 years. this book is designed for anyone wanting to learn Fortran for the first time or or a programmer who needs to upgrade from Fortran 77 to Fortran 90.Employing a practical, problem-based approach this book provides a comprehensive introduction to the language. More experienced programmers will find it a useful update to the new standard and will benefit from the emphasis on science and engineering applications.

  2. Extension of the AMBER molecular dynamics software to Intel's Many Integrated Core (MIC) architecture

    Science.gov (United States)

    Needham, Perri J.; Bhuiyan, Ashraf; Walker, Ross C.

    2016-04-01

    We present an implementation of explicit solvent particle mesh Ewald (PME) classical molecular dynamics (MD) within the PMEMD molecular dynamics engine, that forms part of the AMBER v14 MD software package, that makes use of Intel Xeon Phi coprocessors by offloading portions of the PME direct summation and neighbor list build to the coprocessor. We refer to this implementation as pmemd MIC offload and in this paper present the technical details of the algorithm, including basic models for MPI and OpenMP configuration, and analyze the resultant performance. The algorithm provides the best performance improvement for large systems (>400,000 atoms), achieving a ∼35% performance improvement for satellite tobacco mosaic virus (1,067,095 atoms) when 2 Intel E5-2697 v2 processors (2 ×12 cores, 30M cache, 2.7 GHz) are coupled to an Intel Xeon Phi coprocessor (Model 7120P-1.238/1.333 GHz, 61 cores). The implementation utilizes a two-fold decomposition strategy: spatial decomposition using an MPI library and thread-based decomposition using OpenMP. We also present compiler optimization settings that improve the performance on Intel Xeon processors, while retaining simulation accuracy.

  3. DTK C/Fortran Interface Development for NEAMS FSI Simulations

    Energy Technology Data Exchange (ETDEWEB)

    Slattery, Stuart R. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Lebrun-Grandie, Damien T. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)

    2016-09-19

    This report documents the development of DataTransferKit (DTK) C and Fortran interfaces for fluid-structure-interaction (FSI) simulations in NEAMS. In these simulations, the codes Nek5000 and Diablo are being coupled within the SHARP framework to study flow-induced vibration (FIV) in reactor steam generators. We will review the current Nek5000/Diablo coupling algorithm in SHARP and the current state of the solution transfer scheme used in this implementation. We will then present existing DTK algorithms which may be used instead to provide an improvement in both flexibility and scalability of the current SHARP implementation. We will show how these can be used within the current FSI scheme using a new set of interfaces to the algorithms developed by this work. These new interfaces currently expose the mesh-free solution transfer algorithms in DTK, a C++ library, and are written in C and Fortran to enable coupling of both Nek5000 and Diablo in their native Fortran language. They have been compiled and tested on Cooley, the test-bed machine for Mira at ALCF.

  4. Automatic Loop Parallelization via Compiler Guided Refactoring

    DEFF Research Database (Denmark)

    Larsen, Per; Ladelsky, Razya; Lidman, Jacob

    For many parallel applications, performance relies not on instruction-level parallelism, but on loop-level parallelism. Unfortunately, many modern applications are written in ways that obstruct automatic loop parallelization. Since we cannot identify sufficient parallelization opportunities...... for these codes in a static, off-line compiler, we developed an interactive compilation feedback system that guides the programmer in iteratively modifying application source, thereby improving the compiler’s ability to generate loop-parallel code. We use this compilation system to modify two sequential...... benchmarks, finding that the code parallelized in this way runs up to 8.3 times faster on an octo-core Intel Xeon 5570 system and up to 12.5 times faster on a quad-core IBM POWER6 system. Benchmark performance varies significantly between the systems. This suggests that semi-automatic parallelization should...

  5. Theorem Proving in Intel Hardware Design

    Science.gov (United States)

    O'Leary, John

    2009-01-01

    For the past decade, a framework combining model checking (symbolic trajectory evaluation) and higher-order logic theorem proving has been in production use at Intel. Our tools and methodology have been used to formally verify execution cluster functionality (including floating-point operations) for a number of Intel products, including the Pentium(Registered TradeMark)4 and Core(TradeMark)i7 processors. Hardware verification in 2009 is much more challenging than it was in 1999 - today s CPU chip designs contain many processor cores and significant firmware content. This talk will attempt to distill the lessons learned over the past ten years, discuss how they apply to today s problems, outline some future directions.

  6. CERN welcomes Intel Science Fair winners

    CERN Multimedia

    Katarina Anthony

    2012-01-01

    This June, CERN welcomed twelve gifted young scientists aged 15-18 for a week-long visit of the Laboratory. These talented students were the winners of a special award co-funded by CERN and Intel, given yearly at the Intel International Science and Engineering Fair (ISEF).   The CERN award winners at the Intel ISEF 2012 Special Awards Ceremony. © Society for Science & the Public (SSP). The CERN award was set up back in 2009 as an opportunity to bring some of the best and brightest young minds to the Laboratory. The award winners are selected from among 1,500 talented students participating in ISEF – the world's largest pre-university science competition, in which students compete for more than €3 million in awards. “CERN gave an award – which was obviously this trip – to students studying physics, maths, electrical engineering and computer science,” says Benjamin Craig Bartlett, 17, from South Carolina, USA, wh...

  7. An Introduction to High Performance Fortran

    Directory of Open Access Journals (Sweden)

    John Merlin

    1995-01-01

    Full Text Available High Performance Fortran (HPF is an informal standard for extensions to Fortran 90 to assist its implementation on parallel architectures, particularly for data-parallel computation. Among other things, it includes directives for specifying data distribution across multiple memories, and concurrent execution features. This article provides a tutorial introduction to the main features of HPF.

  8. READDATA: a FORTRAN 77 codeword input package

    International Nuclear Information System (INIS)

    Lander, P.A.

    1983-07-01

    A new codeword input package has been produced as a result of the incompatibility between different dialects of FORTRAN, especially when character variables are passed as parameters. This report is for those who wish to use a codeword input package with FORTRAN 77. The package, called ''Readdata'', attempts to combine the best features of its predecessors such as BINPUT and pseudo-BINPUT. (author)

  9. Kemari: A Portable High Performance Fortran System for Distributed Memory Parallel Processors

    Directory of Open Access Journals (Sweden)

    T. Kamachi

    1997-01-01

    Full Text Available We have developed a compilation system which extends High Performance Fortran (HPF in various aspects. We support the parallelization of well-structured problems with loop distribution and alignment directives similar to HPF's data distribution directives. Such directives give both additional control to the user and simplify the compilation process. For the support of unstructured problems, we provide directives for dynamic data distribution through user-defined mappings. The compiler also allows integration of message-passing interface (MPI primitives. The system is part of a complete programming environment which also comprises a parallel debugger and a performance monitor and analyzer. After an overview of the compiler, we describe the language extensions and related compilation mechanisms in detail. Performance measurements demonstrate the compiler's applicability to a variety of application classes.

  10. Visualization of Distributed Data Structures for High Performance Fortran-Like Languages

    Directory of Open Access Journals (Sweden)

    Rainer Koppler

    1997-01-01

    Full Text Available This article motivates the usage of graphics and visualization for efficient utilization of High Performance Fortran's (HPF's data distribution facilities. It proposes a graphical toolkit consisting of exploratory and estimation tools which allow the programmer to navigate through complex distributions and to obtain graphical ratings with respect to load distribution and communication. The toolkit has been implemented in a mapping design and visualization tool which is coupled with a compilation system for the HPF predecessor Vienna Fortran. Since this language covers a superset of HPF's facilities, the tool may also be used for visualization of HPF data structures.

  11. DB90: A Fortran Callable Relational Database Routine for Scientific and Engineering Computer Programs

    Science.gov (United States)

    Wrenn, Gregory A.

    2005-01-01

    This report describes a database routine called DB90 which is intended for use with scientific and engineering computer programs. The software is written in the Fortran 90/95 programming language standard with file input and output routines written in the C programming language. These routines should be completely portable to any computing platform and operating system that has Fortran 90/95 and C compilers. DB90 allows a program to supply relation names and up to 5 integer key values to uniquely identify each record of each relation. This permits the user to select records or retrieve data in any desired order.

  12. Experience with Intel's Many Integrated Core Architecture in ATLAS Software

    CERN Document Server

    Fleischmann, S; The ATLAS collaboration; Lavrijsen, W; Neumann, M; Vitillo, R

    2014-01-01

    Intel recently released the first commercial boards of its Many Integrated Core (MIC) Architecture. MIC is Intel's solution for the domain of throughput computing, currently dominated by general purpose programming on graphics processors (GPGPU). MIC allows the use of the more familiar x86 programming model and supports standard technologies such as OpenMP, MPI, and Intel's Threading Building Blocks. This should make it possible to develop for both throughput and latency devices using a single code base.\

  13. Experience with Intel's Many Integrated Core Architecture in ATLAS Software

    CERN Document Server

    Fleischmann, S; The ATLAS collaboration; Lavrijsen, W; Neumann, M; Vitillo, R

    2013-01-01

    Intel recently released the first commercial boards of its Many Integrated Core (MIC) Architecture. MIC is Intel's solution for the domain of throughput computing, currently dominated by general purpose programming on graphics processors (GPGPU). MIC allows the use of the more familiar x86 programming model and supports standard technologies such as OpenMP, MPI, and Intel's Threading Building Blocks. This should make it possible to develop for both throughput and latency devices using a single code base.\

  14. Parallel Programming with Intel Parallel Studio XE

    CERN Document Server

    Blair-Chappell , Stephen

    2012-01-01

    Optimize code for multi-core processors with Intel's Parallel Studio Parallel programming is rapidly becoming a "must-know" skill for developers. Yet, where to start? This teach-yourself tutorial is an ideal starting point for developers who already know Windows C and C++ and are eager to add parallelism to their code. With a focus on applying tools, techniques, and language extensions to implement parallelism, this essential resource teaches you how to write programs for multicore and leverage the power of multicore in your programs. Sharing hands-on case studies and real-world examples, the

  15. Fortran Testing and Refactoring Infrastructure, Phase I

    Data.gov (United States)

    National Aeronautics and Space Administration — Tech-X proposes to develop a comprehensive Fortran testing and refactoring infrastructure that allows developers and scientists to leverage the benefits of a...

  16. Fortran Testing and Refactoring Infrastructure, Phase II

    Data.gov (United States)

    National Aeronautics and Space Administration — Tech-X proposes to develop a comprehensive Fortran testing and refactoring infrastructure that allows developers and scientists to leverage the benefits of...

  17. Trusted Computing Technologies, Intel Trusted Execution Technology.

    Energy Technology Data Exchange (ETDEWEB)

    Guise, Max Joseph; Wendt, Jeremy Daniel

    2011-01-01

    We describe the current state-of-the-art in Trusted Computing Technologies - focusing mainly on Intel's Trusted Execution Technology (TXT). This document is based on existing documentation and tests of two existing TXT-based systems: Intel's Trusted Boot and Invisible Things Lab's Qubes OS. We describe what features are lacking in current implementations, describe what a mature system could provide, and present a list of developments to watch. Critical systems perform operation-critical computations on high importance data. In such systems, the inputs, computation steps, and outputs may be highly sensitive. Sensitive components must be protected from both unauthorized release, and unauthorized alteration: Unauthorized users should not access the sensitive input and sensitive output data, nor be able to alter them; the computation contains intermediate data with the same requirements, and executes algorithms that the unauthorized should not be able to know or alter. Due to various system requirements, such critical systems are frequently built from commercial hardware, employ commercial software, and require network access. These hardware, software, and network system components increase the risk that sensitive input data, computation, and output data may be compromised.

  18. Radiation Failures in Intel 14nm Microprocessors

    Science.gov (United States)

    Bossev, Dobrin P.; Duncan, Adam R.; Gadlage, Matthew J.; Roach, Austin H.; Kay, Matthew J.; Szabo, Carl; Berger, Tammy J.; York, Darin A.; Williams, Aaron; LaBel, K.; hide

    2016-01-01

    In this study the 14 nm Intel Broadwell 5th generation core series 5005U-i3 and 5200U-i5 was mounted on Dell Inspiron laptops, MSI Cubi and Gigabyte Brix barebones and tested with Windows 8 and CentOS7 at idle. Heavy-ion-induced hard- and catastrophic failures do not appear to be related to the Intel 14nm Tri-Gate FinFET process. They originate from a small (9 m 140 m) area on the 32nm planar PCH die (not the CPU) as initially speculated. The hard failures seem to be due to a SEE but the exact physical mechanism has yet to be identified. Some possibilities include latch-ups, charge ion trapping or implantation, ion channels, or a combination of those (in biased conditions). The mechanism of the catastrophic failures seems related to the presence of electric power (1.05V core voltage). The 1064 nm laser mimics ionization radiation and induces soft- and hard failures as a direct result of electron-hole pair production, not heat. The 14nm FinFET processes continue to look promising for space radiation environments.

  19. Exploiting first-class arrays in Fortran for accelerator programming

    International Nuclear Information System (INIS)

    Rasmussen, Craig E.; Weseloh, Wayne N.; Robey, Robert W.; Sottile, Matthew J.; Quinlan, Daniel; Overbey, Jeffrey

    2010-01-01

    Emerging architectures for high performance computing often are well suited to a data parallel programming model. This paper presents a simple programming methodology based on existing languages and compiler tools that allows programmers to take advantage of these systems. We will work with the array features of Fortran 90 to show how this infrequently exploited, standardized language feature is easily transformed to lower level accelerator code. Our transformations are based on a mapping from Fortran 90 to C++ code with OpenCL extensions. The sheer complexity of programming for clusters of many or multi-core processors with tens of millions threads of execution make the simplicity of the data parallel model attractive. Furthermore, the increasing complexity of todays applications (especially when convolved with the increasing complexity of the hardware) and the need for portability across hardware architectures make a higher-level and simpler programming model like data parallel attractive. The goal of this work has been to exploit source-to-source transformations that allow programmers to develop and maintain programs at a high-level of abstraction, without coding to a specific hardware architecture. Furthermore these transformations allow multiple hardware architectures to be targeted without changing the high-level source. It also removes the necessity for application programmers to understand details of the accelerator architecture or to know OpenCL.

  20. MULTITASKER, Multitasking Kernel for C and FORTRAN Under UNIX

    International Nuclear Information System (INIS)

    Brooks, E.D. III

    1988-01-01

    1 - Description of program or function: MULTITASKER implements a multitasking kernel for the C and FORTRAN programming languages that runs under UNIX. The kernel provides a multitasking environment which serves two purposes. The first is to provide an efficient portable environment for the development, debugging, and execution of production multiprocessor programs. The second is to provide a means of evaluating the performance of a multitasking program on model multiprocessor hardware. The performance evaluation features require no changes in the application program source and are implemented as a set of compile- and run-time options in the kernel. 2 - Method of solution: The FORTRAN interface to the kernel is identical in function to the CRI multitasking package provided for the Cray XMP. This provides a migration path to high speed (but small N) multiprocessors once the application has been coded and debugged. With use of the UNIX m4 macro preprocessor, source compatibility can be achieved between the UNIX code development system and the target Cray multiprocessor. The kernel also provides a means of evaluating a program's performance on model multiprocessors. Execution traces may be obtained which allow the user to determine kernel overhead, memory conflicts between various tasks, and the average concurrency being exploited. The kernel may also be made to switch tasks every cpu instruction with a random execution ordering. This allows the user to look for unprotected critical regions in the program. These features, implemented as a set of compile- and run-time options, cause extra execution overhead which is not present in the standard production version of the kernel

  1. FORTRAN program for calculating liquid-phase and gas-phase thermal diffusion column coefficients

    International Nuclear Information System (INIS)

    Rutherford, W.M.

    1980-01-01

    A computer program (COLCO) was developed for calculating thermal diffusion column coefficients from theory. The program, which is written in FORTRAN IV, can be used for both liquid-phase and gas-phase thermal diffusion columns. Column coefficients for the gas phase can be based on gas properties calculated from kinetic theory using tables of omega integrals or on tables of compiled physical properties as functions of temperature. Column coefficients for the liquid phase can be based on compiled physical property tables. Program listings, test data, sample output, and users manual are supplied for appendices

  2. C Versus Fortran-77 for Scientific Programming

    Directory of Open Access Journals (Sweden)

    Tom MacDonald

    1992-01-01

    Full Text Available The predominant programming language for numeric and scientific applications is Fortran-77 and supercomputers are primarily used to run large-scale numeric and scientific applications. Standard C* is not widely used for numerical and scientific programming, yet Standard C provides many desirable linguistic features not present in Fortran-77. Furthermore, the existence of a standard library and preprocessor eliminates the worst portability problems. A comparison of Standard C and Fortran-77 shows several key deficiencies in C that reduce its ability to adequately solve some numerical problems. Some of these problems have already been addressed by the C standard but others remain. Standard C with a few extensions and modifications could be suitable for all numerical applications and could become more popular in supercomputing environments.

  3. IRRIGOGRAPHY AFTER PREPARATIONS OF PATIENTS WITH FORTRANS

    Directory of Open Access Journals (Sweden)

    Irena Jankovic

    2006-01-01

    Full Text Available Fortrans® is a laxative in the form of the powder which is used for making solution for oral application. Laxative effects are achieved over long linear polymer (polyethylene-glikol - PEG 4000 which binds water molecules, increasing thus the volume of fluids in the intestinal tract.Material of study comprises 150 irrigographies made at the Institute of Radiology of the Clinical Center in Nis in the period from January 2004 to Jun 2005.The preparation in these cases was done by Fortrans®. The contrast medium used was barium sulfate.The results of study are presented in illustrations and irrigograms.In conclusion,we can say that Fortrans® provides reliable, effective and simple preparation of patients for irrigography as well as for fast, comfortable and efficient endographic examination (irrigography. The obtained irrigograms are of satisfactory quality, showing sharp contrasts.

  4. Alternatives to FORTRAN in control systems

    International Nuclear Information System (INIS)

    Howell, J.A.; Wright, R.M.

    1985-01-01

    Control system software has traditionally been written in assembly language, FORTRAN, or Basic. Today there exist several high-level languages with features that make them convenient and effective in control systems. These features include bit manipulation, user-defined data types, character manipulation, and high-level logical operations. Some of theses languages are quite different from FORTRAN and yet are easy to read and use. We discuss several languages, their features that make them convenient for control systems, and give examples of their use. We focus particular attention on the language C, developed by Bell Laboratories

  5. Basic linear algebra subprograms for FORTRAN usage

    Science.gov (United States)

    Lawson, C. L.; Hanson, R. J.; Kincaid, D. R.; Krogh, F. T.

    1977-01-01

    A package of 38 low level subprograms for many of the basic operations of numerical linear algebra is presented. The package is intended to be used with FORTRAN. The operations in the package are dot products, elementary vector operations, Givens transformations, vector copy and swap, vector norms, vector scaling, and the indices of components of largest magnitude. The subprograms and a test driver are available in portable FORTRAN. Versions of the subprograms are also provided in assembly language for the IBM 360/67, the CDC 6600 and CDC 7600, and the Univac 1108.

  6. How to Interface Fortran with Matlab

    OpenAIRE

    Sagastizábal , Claudia; Vige , Guillaume

    1995-01-01

    Projet PROMATH; We describe the general procedure for interfacing Fortran routines with Matlab. We explain how to write a mex-file and the associated gateway function. In particular, each different type of argument is considered in detail. We finish with an illustrative example

  7. Pattern recognition in molecular dynamics. [FORTRAN

    Energy Technology Data Exchange (ETDEWEB)

    Zurek, W H; Schieve, W C [Texas Univ., Austin (USA)

    1977-07-01

    An algorithm for the recognition of the formation of bound molecular states in the computer simulation of a dilute gas is presented. Applications to various related problems in physics and chemistry are pointed out. Data structure and decision processes are described. Performance of the FORTRAN program based on the algorithm in cooperation with the molecular dynamics program is described and the results are presented.

  8. Analysis of Intel IA-64 Processor Support for Secure Systems

    National Research Council Canada - National Science Library

    Unalmis, Bugra

    2001-01-01

    .... Systems could be constructed for which serious security threats would be eliminated. This thesis explores the Intel IA-64 processor's hardware support and its relationship to software for building a secure system...

  9. MILC staggered conjugate gradient performance on Intel KNL

    OpenAIRE

    DeTar, Carleton; Doerfler, Douglas; Gottlieb, Steven; Jha, Ashish; Kalamkar, Dhiraj; Li, Ruizi; Toussaint, Doug

    2016-01-01

    We review our work done to optimize the staggered conjugate gradient (CG) algorithm in the MILC code for use with the Intel Knights Landing (KNL) architecture. KNL is the second gener- ation Intel Xeon Phi processor. It is capable of massive thread parallelism, data parallelism, and high on-board memory bandwidth and is being adopted in supercomputing centers for scientific research. The CG solver consumes the majority of time in production running, so we have spent most of our effort on it. ...

  10. [Intel random number generator-based true random number generator].

    Science.gov (United States)

    Huang, Feng; Shen, Hong

    2004-09-01

    To establish a true random number generator on the basis of certain Intel chips. The random numbers were acquired by programming using Microsoft Visual C++ 6.0 via register reading from the random number generator (RNG) unit of an Intel 815 chipset-based computer with Intel Security Driver (ISD). We tested the generator with 500 random numbers in NIST FIPS 140-1 and X(2) R-Squared test, and the result showed that the random number it generated satisfied the demand of independence and uniform distribution. We also compared the random numbers generated by Intel RNG-based true random number generator and those from the random number table statistically, by using the same amount of 7500 random numbers in the same value domain, which showed that the SD, SE and CV of Intel RNG-based random number generator were less than those of the random number table. The result of u test of two CVs revealed no significant difference between the two methods. Intel RNG-based random number generator can produce high-quality random numbers with good independence and uniform distribution, and solves some problems with random number table in acquisition of the random numbers.

  11. Performance of Artificial Intelligence Workloads on the Intel Core 2 Duo Series Desktop Processors

    OpenAIRE

    Abdul Kareem PARCHUR; Kuppangari Krishna RAO; Fazal NOORBASHA; Ram Asaray SINGH

    2010-01-01

    As the processor architecture becomes more advanced, Intel introduced its Intel Core 2 Duo series processors. Performance impact on Intel Core 2 Duo processors are analyzed using SPEC CPU INT 2006 performance numbers. This paper studied the behavior of Artificial Intelligence (AI) benchmarks on Intel Core 2 Duo series processors. Moreover, we estimated the task completion time (TCT) @1 GHz, @2 GHz and @3 GHz Intel Core 2 Duo series processors frequency. Our results show the performance scalab...

  12. High Performance Object-Oriented Scientific Programming in Fortran 90

    Science.gov (United States)

    Norton, Charles D.; Decyk, Viktor K.; Szymanski, Boleslaw K.

    1997-01-01

    We illustrate how Fortran 90 supports object-oriented concepts by example of plasma particle computations on the IBM SP. Our experience shows that Fortran 90 and object-oriented methodology give high performance while providing a bridge from Fortran 77 legacy codes to modern programming principles. All of our object-oriented Fortran 90 codes execute more quickly thatn the equeivalent C++ versions, yet the abstraction modelling capabilities used for scentific programming are comparably powereful.

  13. Comparative VME Performance Tests for MEN A20 Intel-L865 and RIO-3 PPC-LynxOS platforms

    CERN Document Server

    Andersen, M; CERN. Geneva. BE Department

    2009-01-01

    This benchmark note presents test results from reading values over VME using different methods and different sizes of data registers, running on two different platforms Intel-L865 and PPC-LynxOS. We find that the PowerPC is a factor 3 faster in accessing an array of contiguous VME memory locations. Block transfer and DMA read accesses are also tested and compared with conventional single access reads.

  14. Running the EGS4 Monte Carlo code with Fortran 90 on a pentium computer

    International Nuclear Information System (INIS)

    Caon, M.; Bibbo, G.; Pattison, J.

    1996-01-01

    The possibility to run the EGS4 Monte Carlo code radiation transport system for medical radiation modelling on a microcomputer is discussed. This has been done using a Fortran 77 compiler with a 32-bit memory addressing system running under a memory extender operating system. In addition a virtual memory manager such as QEMM386 was required. It has successfully run on a SUN Sparcstation2. In 1995 faster Pentium-based microcomputers became available as did the Windows 95 operating system which can handle 32-bit programs, multitasking and provides its own virtual memory management. The paper describe how with simple modification to the batch files it was possible to run EGS4 on a Pentium under Fortran 90 and Windows 95. This combination of software and hardware is cheaper and faster than running it on a SUN Sparcstation2. 8 refs., 1 tab

  15. Running the EGS4 Monte Carlo code with Fortran 90 on a pentium computer

    Energy Technology Data Exchange (ETDEWEB)

    Caon, M. [Flinders Univ. of South Australia, Bedford Park, SA (Australia)]|[Univercity of South Australia, SA (Australia); Bibbo, G. [Womens and Childrens hospital, SA (Australia); Pattison, J. [Univercity of South Australia, SA (Australia)

    1996-09-01

    The possibility to run the EGS4 Monte Carlo code radiation transport system for medical radiation modelling on a microcomputer is discussed. This has been done using a Fortran 77 compiler with a 32-bit memory addressing system running under a memory extender operating system. In addition a virtual memory manager such as QEMM386 was required. It has successfully run on a SUN Sparcstation2. In 1995 faster Pentium-based microcomputers became available as did the Windows 95 operating system which can handle 32-bit programs, multitasking and provides its own virtual memory management. The paper describe how with simple modification to the batch files it was possible to run EGS4 on a Pentium under Fortran 90 and Windows 95. This combination of software and hardware is cheaper and faster than running it on a SUN Sparcstation2. 8 refs., 1 tab.

  16. Performance Evaluation of Multithreaded Geant4 Simulations Using an Intel Xeon Phi Cluster

    Directory of Open Access Journals (Sweden)

    P. Schweitzer

    2015-01-01

    Full Text Available The objective of this study is to evaluate the performances of Intel Xeon Phi hardware accelerators for Geant4 simulations, especially for multithreaded applications. We present the complete methodology to guide users for the compilation of their Geant4 applications on Phi processors. Then, we propose series of benchmarks to compare the performance of Xeon CPUs and Phi processors for a Geant4 example dedicated to the simulation of electron dose point kernels, the TestEm12 example. First, we compare a distributed execution of a sequential version of the Geant4 example on both architectures before evaluating the multithreaded version of the Geant4 example. If Phi processors demonstrated their ability to accelerate computing time (till a factor 3.83 when distributing sequential Geant4 simulations, we do not reach the same level of speedup when considering the multithreaded version of the Geant4 example.

  17. Engineering a compiler

    CERN Document Server

    Cooper, Keith D

    2012-01-01

    As computing has changed, so has the role of both the compiler and the compiler writer. The proliferation of processors, environments, and constraints demands an equally large number of compilers. To adapt, compiler writers retarget code generators, add optimizations, and work on issues such as code space or power consumption. Engineering a Compiler re-balances the curriculum for an introductory course in compiler construction to reflect the issues that arise in today's practice. Authors Keith Cooper and Linda Torczon convey both the art and the science of compiler construction and show best practice algorithms for the major problems inside a compiler. ·Focuses on the back end of the compiler-reflecting the focus of research and development over the last decade ·Applies the well-developed theory behind scanning and parsing to introduce concepts that play a critical role in optimization and code generation. ·Introduces the student to optimization through data-flow analysis, SSA form, and a selection of sc...

  18. Parallel spatial direct numerical simulations on the Intel iPSC/860 hypercube

    Science.gov (United States)

    Joslin, Ronald D.; Zubair, Mohammad

    1993-01-01

    The implementation and performance of a parallel spatial direct numerical simulation (PSDNS) approach on the Intel iPSC/860 hypercube is documented. The direct numerical simulation approach is used to compute spatially evolving disturbances associated with the laminar-to-turbulent transition in boundary-layer flows. The feasibility of using the PSDNS on the hypercube to perform transition studies is examined. The results indicate that the direct numerical simulation approach can effectively be parallelized on a distributed-memory parallel machine. By increasing the number of processors nearly ideal linear speedups are achieved with nonoptimized routines; slower than linear speedups are achieved with optimized (machine dependent library) routines. This slower than linear speedup results because the Fast Fourier Transform (FFT) routine dominates the computational cost and because the routine indicates less than ideal speedups. However with the machine-dependent routines the total computational cost decreases by a factor of 4 to 5 compared with standard FORTRAN routines. The computational cost increases linearly with spanwise wall-normal and streamwise grid refinements. The hypercube with 32 processors was estimated to require approximately twice the amount of Cray supercomputer single processor time to complete a comparable simulation; however it is estimated that a subgrid-scale model which reduces the required number of grid points and becomes a large-eddy simulation (PSLES) would reduce the computational cost and memory requirements by a factor of 10 over the PSDNS. This PSLES implementation would enable transition simulations on the hypercube at a reasonable computational cost.

  19. Initial results on computational performance of Intel Many Integrated Core (MIC) architecture: implementation of the Weather and Research Forecasting (WRF) Purdue-Lin microphysics scheme

    Science.gov (United States)

    Mielikainen, Jarno; Huang, Bormin; Huang, Allen H.

    2014-10-01

    Purdue-Lin scheme is a relatively sophisticated microphysics scheme in the Weather Research and Forecasting (WRF) model. The scheme includes six classes of hydro meteors: water vapor, cloud water, raid, cloud ice, snow and graupel. The scheme is very suitable for massively parallel computation as there are no interactions among horizontal grid points. In this paper, we accelerate the Purdue Lin scheme using Intel Many Integrated Core Architecture (MIC) hardware. The Intel Xeon Phi is a high performance coprocessor consists of up to 61 cores. The Xeon Phi is connected to a CPU via the PCI Express (PICe) bus. In this paper, we will discuss in detail the code optimization issues encountered while tuning the Purdue-Lin microphysics Fortran code for Xeon Phi. In particularly, getting a good performance required utilizing multiple cores, the wide vector operations and make efficient use of memory. The results show that the optimizations improved performance of the original code on Xeon Phi 5110P by a factor of 4.2x. Furthermore, the same optimizations improved performance on Intel Xeon E5-2603 CPU by a factor of 1.2x compared to the original code.

  20. Parallelization of particle transport using Intel® TBB

    International Nuclear Information System (INIS)

    Apostolakis, J; Brun, R; Carminati, F; Gheata, A; Wenzel, S; Belogurov, S; Ovcharenko, E

    2014-01-01

    One of the current challenges in HEP computing is the development of particle propagation algorithms capable of efficiently use all performance aspects of modern computing devices. The Geant-Vector project at CERN has recently introduced an approach in this direction. This paper describes the implementation of a similar workflow using the Intel(r) Threading Building Blocks (Intel(r) TBB) library. This approach is intended to overcome the potential bottleneck of having a single dispatcher on many-core architectures and to result in better scalability compared to the initial pthreads-based version.

  1. SPARQL compiler for Bobox

    OpenAIRE

    Čermák, Miroslav

    2013-01-01

    The goal of the work is to design and implement a SPARQL compiler for the Bobox system. In addition to lexical and syntactic analysis corresponding to W3C standard for SPARQL language, it performs semantic analysis and optimization of queries. Compiler will constuct an appropriate model for execution in Bobox, that depends on the physical database schema.

  2. Application of Modern Fortran to Spacecraft Trajectory Design and Optimization

    Science.gov (United States)

    Williams, Jacob; Falck, Robert D.; Beekman, Izaak B.

    2018-01-01

    In this paper, applications of the modern Fortran programming language to the field of spacecraft trajectory optimization and design are examined. Modern object-oriented Fortran has many advantages for scientific programming, although many legacy Fortran aerospace codes have not been upgraded to use the newer standards (or have been rewritten in other languages perceived to be more modern). NASA's Copernicus spacecraft trajectory optimization program, originally a combination of Fortran 77 and Fortran 95, has attempted to keep up with modern standards and makes significant use of the new language features. Various algorithms and methods are presented from trajectory tools such as Copernicus, as well as modern Fortran open source libraries and other projects.

  3. Intel Legend and CERN would build up high speed Internet

    CERN Multimedia

    2002-01-01

    Intel, Legend and China Education and Research Network jointly announced on the 25th of April that they will be cooperating with each other to build up the new generation high speed internet, over the next three years (1/2 page).

  4. Communication overhead on the Intel iPSC-860 hypercube

    Science.gov (United States)

    Bokhari, Shahid H.

    1990-01-01

    Experiments were conducted on the Intel iPSC-860 hypercube in order to evaluate the overhead of interprocessor communication. It is demonstrated that: (1) contrary to popular belief, the distance between two communicating processors has a significant impact on communication time, (2) edge contention can increase communication time by a factor of more than 7, and (3) node contention has no measurable impact.

  5. Connecting Effective Instruction and Technology. Intel-elebration: Safari.

    Science.gov (United States)

    Burton, Larry D.; Prest, Sharon

    Intel-ebration is an attempt to integrate the following research-based instructional frameworks and strategies: (1) dimensions of learning; (2) multiple intelligences; (3) thematic instruction; (4) cooperative learning; (5) project-based learning; and (6) instructional technology. This paper presents a thematic unit on safari, using the…

  6. Emulating Multiple Inheritance in Fortran 2003/2008

    Directory of Open Access Journals (Sweden)

    Karla Morris

    2015-01-01

    in Fortran 2003. The design unleashes the power of the associated class relationships for modeling complicated data structures yet avoids the ambiguities that plague some multiple inheritance scenarios.

  7. Using the Intel Math Kernel Library on Peregrine | High-Performance

    Science.gov (United States)

    Computing | NREL the Intel Math Kernel Library on Peregrine Using the Intel Math Kernel Library on Peregrine Learn how to use the Intel Math Kernel Library (MKL) with Peregrine system software. MKL architectures. Core math functions in MKL include BLAS, LAPACK, ScaLAPACK, sparse solvers, fast Fourier

  8. A Linear Algebra Framework for Static High Performance Fortran Code Distribution

    Directory of Open Access Journals (Sweden)

    Corinne Ancourt

    1997-01-01

    Full Text Available High Performance Fortran (HPF was developed to support data parallel programming for single-instruction multiple-data (SIMD and multiple-instruction multiple-data (MIMD machines with distributed memory. The programmer is provided a familiar uniform logical address space and specifies the data distribution by directives. The compiler then exploits these directives to allocate arrays in the local memories, to assign computations to elementary processors, and to migrate data between processors when required. We show here that linear algebra is a powerful framework to encode HPF directives and to synthesize distributed code with space-efficient array allocation, tight loop bounds, and vectorized communications for INDEPENDENT loops. The generated code includes traditional optimizations such as guard elimination, message vectorization and aggregation, and overlap analysis. The systematic use of an affine framework makes it possible to prove the compilation scheme correct.

  9. Roofline Analysis in the Intel® Advisor to Deliver Optimized Performance for applications on Intel® Xeon Phi™ Processor

    OpenAIRE

    Koskela, TS; Lobet, M

    2017-01-01

    In this session we show, in two case studies, how the roofline feature of Intel Advisor has been utilized to optimize the performance of kernels of the XGC1 and PICSAR codes in preparation for Intel Knights Landing architecture. The impact of the implemented optimizations and the benefits of using the automatic roofline feature of Intel Advisor to study performance of large applications will be presented. This demonstrates an effective optimization strategy that has enabled these science appl...

  10. DISPPAK SUBPAK, MS FORTRAN Extended Subroutine Library

    International Nuclear Information System (INIS)

    Langer, S.

    1991-01-01

    1 - Description of program or function: DISPPAK is a set of routines for use with Microsoft FORTRAN programs that allows the flexible display of information on the screen of an IBM PC in both text and graphics modes. The text mode routines allow the cursor to be placed at an arbitrary point on the screen and text to be displayed at the cursor location, making it possible to create menus and other structured displays. A routine to set the color of the characters that these routines display is also provided. A set of line drawing routines is included for use with IBM's Color Graphics Adapter or an equivalent board (such as the Enhanced Graphics Adapter in CGA emulation mode). These routines support both pixel coordinates and a user-specified set of real number coordinates. SUBPAK is a function library which allows Microsoft FORTRAN programs to calculate random numbers, issue calls to the operating system, read individual characters from the keyboard, perform Boolean and shift operations, and communicate with the I/O ports of the IBM PC. In addition, peek and poke routines, a routine that returns the address of any variable, and routines that can access the system time and date are included. 2 - Method of solution: For the DISPPAK line drawing routines, the user selects a fraction of the screen to use for plotting, chooses the coordinates that refer to the lower-left and upper-right corners, and decides whether the mapping should be linear or logarithmic. Lines are then drawn between endpoints defined in terms of the user coordinate system. Out-of-range coordinates are forced to the border of the window before the line is drawn. 3 - Restrictions on the complexity of the problem: No support is provided for filled areas or text

  11. ARBUS: A FORTRAN tool for generating tree structure diagrams

    International Nuclear Information System (INIS)

    Ferrero, C.; Zanger, M.

    1992-02-01

    The FORTRAN77 stand-alone code ARBUS has been designed to aid the user by providing a tree structure diagram generating utility for computer programs written in FORTRAN language. This report is intended to describe the main purpose and features of ARBUS and to highlight some additional applications of the code by means of practical test cases. (orig.) [de

  12. Object-Oriented Scientific Programming with Fortran 90

    Science.gov (United States)

    Norton, C.

    1998-01-01

    Fortran 90 is a modern language that introduces many important new features beneficial for scientific programming. We discuss our experiences in plasma particle simulation and unstructured adaptive mesh refinement on supercomputers, illustrating the features of Fortran 90 that support the object-oriented methodology.

  13. The Fortran-P Translator: Towards Automatic Translation of Fortran 77 Programs for Massively Parallel Processors

    Directory of Open Access Journals (Sweden)

    Matthew O'keefe

    1995-01-01

    Full Text Available Massively parallel processors (MPPs hold the promise of extremely high performance that, if realized, could be used to study problems of unprecedented size and complexity. One of the primary stumbling blocks to this promise has been the lack of tools to translate application codes to MPP form. In this article we show how applications codes written in a subset of Fortran 77, called Fortran-P, can be translated to achieve good performance on several massively parallel machines. This subset can express codes that are self-similar, where the algorithm applied to the global data domain is also applied to each subdomain. We have found many codes that match the Fortran-P programming style and have converted them using our tools. We believe a self-similar coding style will accomplish what a vectorizable style has accomplished for vector machines by allowing the construction of robust, user-friendly, automatic translation systems that increase programmer productivity and generate fast, efficient code for MPPs.

  14. A fast random number generator for the Intel Paragon supercomputer

    Science.gov (United States)

    Gutbrod, F.

    1995-06-01

    A pseudo-random number generator is presented which makes optimal use of the architecture of the i860-microprocessor and which is expected to have a very long period. It is therefore a good candidate for use on the parallel supercomputer Paragon XP. In the assembler version, it needs 6.4 cycles for a real∗4 random number. There is a FORTRAN routine which yields identical numbers up to rare and minor rounding discrepancies, and it needs 28 cycles. The FORTRAN performance on other microprocessors is somewhat better. Arguments for the quality of the generator and some numerical tests are given.

  15. MILC staggered conjugate gradient performance on Intel KNL

    Energy Technology Data Exchange (ETDEWEB)

    Li, Ruiz [Indiana Univ., Bloomington, IN (United States). Dept. of Physics; Detar, Carleton [Univ. of Utah, Salt Lake City, UT (United States). Dept. of Physics and Astronomy; Doerfler, Douglas W. [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). National Energy Research Scientific Computing Center (NERSC); Gottlieb, Steven [Indiana Univ., Bloomington, IN (United States). Dept. of Physics; Jha, Asish [Intel Corp., Hillsboro, OR (United States). Sofware and Services Group; Kalamkar, Dhiraj [Intel Labs., Bangalore (India). Parallel Computing Lab.; Toussaint, Doug [Univ. of Arizona, Tucson, AZ (United States). Physics Dept.

    2016-11-03

    We review our work done to optimize the staggered conjugate gradient (CG) algorithm in the MILC code for use with the Intel Knights Landing (KNL) architecture. KNL is the second gener- ation Intel Xeon Phi processor. It is capable of massive thread parallelism, data parallelism, and high on-board memory bandwidth and is being adopted in supercomputing centers for scientific research. The CG solver consumes the majority of time in production running, so we have spent most of our effort on it. We compare performance of an MPI+OpenMP baseline version of the MILC code with a version incorporating the QPhiX staggered CG solver, for both one-node and multi-node runs.

  16. Vectorization for Molecular Dynamics on Intel Xeon Phi Corpocessors

    Science.gov (United States)

    Yi, Hongsuk

    2014-03-01

    Many modern processors are capable of exploiting data-level parallelism through the use of single instruction multiple data (SIMD) execution. The new Intel Xeon Phi coprocessor supports 512 bit vector registers for the high performance computing. In this paper, we have developed a hierarchical parallelization scheme for accelerated molecular dynamics simulations with the Terfoff potentials for covalent bond solid crystals on Intel Xeon Phi coprocessor systems. The scheme exploits multi-level parallelism computing. We combine thread-level parallelism using a tightly coupled thread-level and task-level parallelism with 512-bit vector register. The simulation results show that the parallel performance of SIMD implementations on Xeon Phi is apparently superior to their x86 CPU architecture.

  17. Full cycle trigonometric function on Intel Quartus II Verilog

    Science.gov (United States)

    Mustapha, Muhazam; Zulkarnain, Nur Antasha

    2018-02-01

    This paper discusses about an improvement of a previous research on hardware based trigonometric calculations. Tangent function will also be implemented to get a complete set. The functions have been simulated using Quartus II where the result will be compared to the previous work. The number of bits has also been extended for each trigonometric function. The design is based on RTL due to its resource efficient nature. At earlier stage, a technology independent test bench simulation was conducted on ModelSim due to its convenience in capturing simulation data so that accuracy information can be obtained. On second stage, Intel/Altera Quartus II will be used to simulate on technology dependent platform, particularly on the one belonging to Intel/Altera itself. Real data on no. logic elements used and propagation delay have also been obtained.

  18. Compiler Technology for Parallel Scientific Computation

    Directory of Open Access Journals (Sweden)

    Can Özturan

    1994-01-01

    Full Text Available There is a need for compiler technology that, given the source program, will generate efficient parallel codes for different architectures with minimal user involvement. Parallel computation is becoming indispensable in solving large-scale problems in science and engineering. Yet, the use of parallel computation is limited by the high costs of developing the needed software. To overcome this difficulty we advocate a comprehensive approach to the development of scalable architecture-independent software for scientific computation based on our experience with equational programming language (EPL. Our approach is based on a program decomposition, parallel code synthesis, and run-time support for parallel scientific computation. The program decomposition is guided by the source program annotations provided by the user. The synthesis of parallel code is based on configurations that describe the overall computation as a set of interacting components. Run-time support is provided by the compiler-generated code that redistributes computation and data during object program execution. The generated parallel code is optimized using techniques of data alignment, operator placement, wavefront determination, and memory optimization. In this article we discuss annotations, configurations, parallel code generation, and run-time support suitable for parallel programs written in the functional parallel programming language EPL and in Fortran.

  19. Porting FEASTFLOW to the Intel Xeon Phi: Lessons Learned

    OpenAIRE

    Georgios Goumas

    2014-01-01

    In this paper we report our experiences in porting the FEASTFLOW software infrastructure to the Intel Xeon Phi coprocessor. Our efforts involved both the evaluation of programming models including OpenCL, POSIX threads and OpenMP and typical optimization strategies like parallelization and vectorization. Since the straightforward porting process of the already existing OpenCL version of the code encountered performance problems that require further analysis, we focused our efforts on the impl...

  20. Protein Alignment on the Intel Xeon Phi Coprocessor

    OpenAIRE

    Ramstad, Jorun

    2015-01-01

    There is an increasing need for sensitive, high perfomance sequence alignemnet tools. With the growing databases of scientificly analyzed protein sequences, more compute power is necessary. Specialized architectures arise, and a transition from serial to specialized implementationsis is required. This thesis is a study of whether Intel 60's cores Xeon Phi coprocessor is a suitable architecture for implementation of a sequence alignment tool. The performance relative to existing tools are eval...

  1. Staggered Dslash Performance on Intel Xeon Phi Architecture

    OpenAIRE

    Li, Ruizi; Gottlieb, Steven

    2014-01-01

    The conjugate gradient (CG) algorithm is among the most essential and time consuming parts of lattice calculations with staggered quarks. We test the performance of CG and dslash, the key step in the CG algorithm, on the Intel Xeon Phi, also known as the Many Integrated Core (MIC) architecture. We try different parallelization strategies using MPI, OpenMP, and the vector processing units (VPUs).

  2. Intel·ligència emocional a maternal

    OpenAIRE

    Missé Cortina, Jordi

    2015-01-01

    Inclusió d'activitats d'intel·ligència emocional a maternal A i B per al treball de l'adquisició de valors com l'autoestima, el respecte, la tolerància, etc. Inclusión de actividades de inteligencia emocional en maternal A y B para el trabajo de la adquisición de valores como la autoestima, el respeto, la tolerancia, etc. Practicum for the Psychology program on Educational Psychology.

  3. C to VHDL compiler

    Science.gov (United States)

    Berdychowski, Piotr P.; Zabolotny, Wojciech M.

    2010-09-01

    The main goal of C to VHDL compiler project is to make FPGA platform more accessible for scientists and software developers. FPGA platform offers unique ability to configure the hardware to implement virtually any dedicated architecture, and modern devices provide sufficient number of hardware resources to implement parallel execution platforms with complex processing units. All this makes the FPGA platform very attractive for those looking for efficient heterogeneous, computing environment. Current industry standard in development of digital systems on FPGA platform is based on HDLs. Although very effective and expressive in hands of hardware development specialists, these languages require specific knowledge and experience, unreachable for most scientists and software programmers. C to VHDL compiler project attempts to remedy that by creating an application, that derives initial VHDL description of a digital system (for further compilation and synthesis), from purely algorithmic description in C programming language. This idea itself is not new, and the C to VHDL compiler combines the best approaches from existing solutions developed over many previous years, with the introduction of some new unique improvements.

  4. Types: A data abstraction package in FORTRAN

    International Nuclear Information System (INIS)

    Youssef, S.

    1990-01-01

    TYPES is a collection of Fortran programs which allow the creation and manipulation of abstract ''data objects'' without the need for a preprocessor. Each data object is assigned a ''type'' as it is created which implies participation in a set of characteristic operations. Available types include scalars, logicals, ordered sets, stacks, queues, sequences, trees, arrays, character strings, block text, histograms, virtual and allocatable memories. A data object may contain integers, reals, or other data objects in any combination. In addition to the type specific operations, a set of universal utilities allows for copying input/output to disk, naming, editing, displaying, user input, interactive creation, tests for equality of contents or structure, machine to machine translation or source code creation for and data object. TYPES is available on VAX/VMS, SUN 3, SPARC, DEC/Ultrix, Silicon Graphics 4D and Cray/Unicos machines. The capabilities of the package are discussed together with characteristic applications and experience in writing the GVerify package

  5. FASTPLOT, Interface Routines to MS FORTRAN Graphics Library

    International Nuclear Information System (INIS)

    1999-01-01

    1 - Description of program or function: FASTPLOT is a library of routines that can be used to interface with the Microsoft FORTRAN Graphics library (GRAPHICS.LIB). The FASTPLOT routines simplify the development of graphics applications and add capabilities such as histograms, Splines, symbols, and error bars. FASTPLOT also includes routines that can be used to create menus. 2 - Methods: FASTPLOT is a library of routines which must be linked with a user's FORTRAN programs that call any FASTPLOT routines. In addition, the user must link with the Microsoft FORTRAN Graphics library (GRAPHICS.LIB). 3 - Restrictions on the complexity of the problem: None noted

  6. Compilation of results 1987

    International Nuclear Information System (INIS)

    1987-01-01

    A compilation is carried out which in concentrated form presents reports on research and development within the nuclear energy field covering a two and a half years period. The foregoing report was edited in December 1984. The projects are presendted with title, project number, responsible unit, person to contact and short result reports. The result reports consist of short summaries over each project. (L.F.)

  7. Performance of Artificial Intelligence Workloads on the Intel Core 2 Duo Series Desktop Processors

    Directory of Open Access Journals (Sweden)

    Abdul Kareem PARCHUR

    2010-12-01

    Full Text Available As the processor architecture becomes more advanced, Intel introduced its Intel Core 2 Duo series processors. Performance impact on Intel Core 2 Duo processors are analyzed using SPEC CPU INT 2006 performance numbers. This paper studied the behavior of Artificial Intelligence (AI benchmarks on Intel Core 2 Duo series processors. Moreover, we estimated the task completion time (TCT @1 GHz, @2 GHz and @3 GHz Intel Core 2 Duo series processors frequency. Our results show the performance scalability in Intel Core 2 Duo series processors. Even though AI benchmarks have similar execution time, they have dissimilar characteristics which are identified using principal component analysis and dendogram. As the processor frequency increased from 1.8 GHz to 3.167 GHz the execution time is decreased by ~370 sec for AI workloads. In the case of Physics/Quantum Computing programs it was ~940 sec.

  8. LISP software generative compilation within the frame of a SLIP system; La compilation generative de programmes LISP dans le cadre d'un systeme SLIP

    Energy Technology Data Exchange (ETDEWEB)

    Sitbon, Andre

    1968-04-24

    After having outlined the limitations associated with the use of some programming languages (Fortran, Algol, assembler, and so on), and the interest of the use of the LISP structure and its associated language, the author notices that some problems remain regarding the memorisation of the computing process obtained by interpretation. Thus, he introduces a generative compiler which produces an executable programme, and which is written in a language very close to the used machine language, i.e. the FAP assembler language.

  9. IFF, Full-Screen Input Menu Generator for FORTRAN Program

    International Nuclear Information System (INIS)

    Seidl, Albert

    1991-01-01

    1 - Description of program or function: The IFF-package contains input modules for use within FORTRAN programs. This package enables the programmer to easily include interactive menu-directed data input (module VTMEN1) and command-word processing (module INPCOM) into a FORTRAN program. 2 - Method of solution: No mathematical operations are performed. 3 - Restrictions on the complexity of the problem: Certain restrictions of use may arise from the dimensioning of arrays. Field lengths are defined via PARAMETER-statements

  10. Comparison of Processor Performance of SPECint2006 Benchmarks of some Intel Xeon Processors

    OpenAIRE

    Abdul Kareem PARCHUR; Ram Asaray SINGH

    2012-01-01

    High performance is a critical requirement to all microprocessors manufacturers. The present paper describes the comparison of performance in two main Intel Xeon series processors (Type A: Intel Xeon X5260, X5460, E5450 and L5320 and Type B: Intel Xeon X5140, 5130, 5120 and E5310). The microarchitecture of these processors is implemented using the basis of a new family of processors from Intel starting with the Pentium 4 processor. These processors can provide a performance boost for many ke...

  11. Does the Intel Xeon Phi processor fit HEP workloads?

    Science.gov (United States)

    Nowak, A.; Bitzes, G.; Dotti, A.; Lazzaro, A.; Jarp, S.; Szostek, P.; Valsan, L.; Botezatu, M.; Leduc, J.

    2014-06-01

    This paper summarizes the five years of CERN openlab's efforts focused on the Intel Xeon Phi co-processor, from the time of its inception to public release. We consider the architecture of the device vis a vis the characteristics of HEP software and identify key opportunities for HEP processing, as well as scaling limitations. We report on improvements and speedups linked to parallelization and vectorization on benchmarks involving software frameworks such as Geant4 and ROOT. Finally, we extrapolate current software and hardware trends and project them onto accelerators of the future, with the specifics of offline and online HEP processing in mind.

  12. Does the Intel Xeon Phi processor fit HEP workloads?

    International Nuclear Information System (INIS)

    Nowak, A; Bitzes, G; Dotti, A; Lazzaro, A; Jarp, S; Szostek, P; Valsan, L; Botezatu, M; Leduc, J

    2014-01-01

    This paper summarizes the five years of CERN openlab's efforts focused on the Intel Xeon Phi co-processor, from the time of its inception to public release. We consider the architecture of the device vis a vis the characteristics of HEP software and identify key opportunities for HEP processing, as well as scaling limitations. We report on improvements and speedups linked to parallelization and vectorization on benchmarks involving software frameworks such as Geant4 and ROOT. Finally, we extrapolate current software and hardware trends and project them onto accelerators of the future, with the specifics of offline and online HEP processing in mind.

  13. Computer and compiler effects on code results: status report

    International Nuclear Information System (INIS)

    1996-01-01

    Within the framework of the international effort on the assessment of computer codes, which are designed to describe the overall reactor coolant system (RCS) thermalhydraulic response, core damage progression, and fission product release and transport during severe accidents, there has been a continuous debate as to whether the code results are influenced by different code users or by different computers or compilers. The first aspect, the 'Code User Effect', has been investigated already. In this paper the other aspects will be discussed and proposals are given how to make large system codes insensitive to different computers and compilers. Hardware errors and memory problems are not considered in this report. The codes investigated herein are integrated code systems (e. g. ESTER, MELCOR) and thermalhydraulic system codes with extensions for severe accident simulation (e. g. SCDAP/RELAP, ICARE/CATHARE, ATHLET-CD), and codes to simulate fission product transport (e. g. TRAPMELT, SOPHAEROS). Since all of these codes are programmed in Fortran 77, the discussion herein is based on this programming language although some remarks are made about Fortran 90. Some observations about different code results by using different computers are reported and possible reasons for this unexpected behaviour are listed. Then methods are discussed how to avoid portability problems

  14. Exploring synchrotron radiation capabilities: The ALS-Intel CRADA

    International Nuclear Information System (INIS)

    Gozzo, F.; Cossy-Favre, A.; Padmore, H.

    1997-01-01

    Synchrotron radiation spectroscopy and spectromicroscopy were applied, at the Advanced Light Source, to the analysis of materials and problems of interest to the commercial semiconductor industry. The authors discuss some of the results obtained at the ALS using existing capabilities, in particular the small spot ultra-ESCA instrument on beamline 7.0 and the AMS (Applied Material Science) endstation on beamline 9.3.2. The continuing trend towards smaller feature size and increased performance for semiconductor components has driven the semiconductor industry to invest in the development of sophisticated and complex instrumentation for the characterization of microstructures. Among the crucial milestones established by the Semiconductor Industry Association are the needs for high quality, defect free and extremely clean silicon wafers, very thin gate oxides, lithographies near 0.1 micron and advanced material interconnect structures. The requirements of future generations cannot be met with current industrial technologies. The purpose of the ALS-Intel CRADA (Cooperative Research And Development Agreement) is to explore, compare and improve the utility of synchrotron-based techniques for practical analysis of substrates of interest to semiconductor chip manufacturing. The first phase of the CRADA project consisted in exploring existing ALS capabilities and techniques on some problems of interest. Some of the preliminary results obtained on Intel samples are discussed here

  15. Evaluation of the Intel Westmere-EX server processor

    CERN Document Server

    Jarp, S; Leduc, J; Nowak, A; CERN. Geneva. IT Department

    2011-01-01

    One year after the arrival of the Intel Xeon 7500 systems (“Nehalem-EX”), CERN openlab is presenting a set of benchmark results obtained when running on the new Xeon E7-4870 Processors, representing the “Westmere-EX” family. A modern 4-socket, 40-core system is confronted with the previous generation of expandable (“EX”) platforms, represented by a 4-socket, 32-core Intel Xeon X7560 based system – both being “top of the line” systems. Benchmarking of modern processors is a very complex affair. One has to control (at least) the following features: processor frequency, overclocking via Turbo mode, the number of physical cores in use, the use of logical cores via Symmetric MultiThreading (SMT), the cache sizes available, the configured memory topology, as well as the power configuration if throughput per watt is to be measured. As in previous activities, we have tried to do a good job of comparing like with like. In a “top of the line” comparison based on the HEPSPEC06 benchmark, the “We...

  16. Global synchronization algorithms for the Intel iPSC/860

    Science.gov (United States)

    Seidel, Steven R.; Davis, Mark A.

    1992-01-01

    In a distributed memory multicomputer that has no global clock, global processor synchronization can only be achieved through software. Global synchronization algorithms are used in tridiagonal systems solvers, CFD codes, sequence comparison algorithms, and sorting algorithms. They are also useful for event simulation, debugging, and for solving mutual exclusion problems. For the Intel iPSC/860 in particular, global synchronization can be used to ensure the most effective use of the communication network for operations such as the shift, where each processor in a one-dimensional array or ring concurrently sends a message to its right (or left) neighbor. Three global synchronization algorithms are considered for the iPSC/860: the gysnc() primitive provided by Intel, the PICL primitive sync0(), and a new recursive doubling synchronization (RDS) algorithm. The performance of these algorithms is compared to the performance predicted by communication models of both the long and forced message protocols. Measurements of the cost of shift operations preceded by global synchronization show that the RDS algorithm always synchronizes the nodes more precisely and costs only slightly more than the other two algorithms.

  17. Comparison of and conversion between different implementations of the FORTRAN programming language

    Science.gov (United States)

    Treinish, L.

    1980-01-01

    A guideline for computer programmers who may need to exchange FORTRAN programs between several computers is presented. The characteristics of the FORTRAN language available on three different types of computers are outlined, and procedures and other considerations for the transfer of programs from one type of FORTRAN to another are discussed. In addition, the variance of these different FORTRAN's from the FORTRAN 77 standard are discussed.

  18. TOOLPACK1, Tools for Development and Maintenance of FORTRAN 77 Program

    International Nuclear Information System (INIS)

    Cowell, Wayne R.

    1993-01-01

    1 - Description of program or function: TOOLPACK1 consists of the following categories of software; (1) an integrated collection of tools intended to support the development and maintenance of FORTRAN 77 programs, in particular moderate-sized collections of mathematical software; (2) several user/Toolpack interfaces, one of which is selected for use at any particular installation; (3) three implementations of the tool/system interface, called TIE (Tool Interface to the Environment). The tools are written in FORTRAN 77 and are portable among TIE installations. The source contains symbolic constants as macro names and must be expanded with a suitable macro expander before being compiled and loaded. A portable macro expander is supplied in TOOLPACK1. The tools may be divided into three functional areas: general, documentation, and processing. One tool, the macro processor, Can be used in any of these categories. ISTDC: data comparison tool is designed mainly for comparing files of numeric values, and files with embedded text. ISTET Expands tabs. ISTFI: finds all the include files that a file needs. ISTGP Searches multiple files for occurrences of a regular expression. ISTHP: will provide limited help information about tools. ISTMP: The macro processor may be used to pre-process a file. The processor provides macro replacement, inclusion, conditional replacement, and processing capabilities for complex file processing. ISTSP: TIE-conforming version of the SPLIT utility to split up the concatenated files used on the tape. ISTSV: save/restore utility to save and restore sub-trees of the Portable File Store (PFS). ISTTD: text comparison tool. ISTVC: simple text file version controller. ISTAL: aids is a preprocessor that can be used to generate specific information from intermediate files created by other tools. The information that can be generated includes call-graphs, cross reference listings, segment execution frequencies, and symbol information. ISTAL can also strip

  19. Lawrence Livermore National Laboratory selects Intel Itanium 2 processors for world's most powerful Linux cluster

    CERN Multimedia

    2003-01-01

    "Intel Corporation, system manufacturer California Digital and the University of California at Lawrence Livermore National Laboratory (LLNL) today announced they are building one of the world's most powerful supercomputers. The supercomputer project, codenamed "Thunder," uses nearly 4,000 Intel® Itanium® 2 processors... is expected to be complete in January 2004" (1 page).

  20. Advanced compiler design and implementation

    CERN Document Server

    Muchnick, Steven S

    1997-01-01

    From the Foreword by Susan L. Graham: This book takes on the challenges of contemporary languages and architectures, and prepares the reader for the new compiling problems that will inevitably arise in the future. The definitive book on advanced compiler design This comprehensive, up-to-date work examines advanced issues in the design and implementation of compilers for modern processors. Written for professionals and graduate students, the book guides readers in designing and implementing efficient structures for highly optimizing compilers for real-world languages. Covering advanced issues in fundamental areas of compiler design, this book discusses a wide array of possible code optimizations, determining the relative importance of optimizations, and selecting the most effective methods of implementation. * Lays the foundation for understanding the major issues of advanced compiler design * Treats optimization in-depth * Uses four case studies of commercial compiling suites to illustrate different approache...

  1. Efficient Implementation of Many-body Quantum Chemical Methods on the Intel Xeon Phi Coprocessor

    Energy Technology Data Exchange (ETDEWEB)

    Apra, Edoardo; Klemm, Michael; Kowalski, Karol

    2014-12-01

    This paper presents the implementation and performance of the highly accurate CCSD(T) quantum chemistry method on the Intel Xeon Phi coprocessor within the context of the NWChem computational chemistry package. The widespread use of highly correlated methods in electronic structure calculations is contingent upon the interplay between advances in theory and the possibility of utilizing the ever-growing computer power of emerging heterogeneous architectures. We discuss the design decisions of our implementation as well as the optimizations applied to the compute kernels and data transfers between host and coprocessor. We show the feasibility of adopting the Intel Many Integrated Core Architecture and the Intel Xeon Phi coprocessor for developing efficient computational chemistry modeling tools. Remarkable scalability is demonstrated by benchmarks. Our solution scales up to a total of 62560 cores with the concurrent utilization of Intel Xeon processors and Intel Xeon Phi coprocessors.

  2. Analysis OpenMP performance of AMD and Intel architecture for breaking waves simulation using MPS

    Science.gov (United States)

    Alamsyah, M. N. A.; Utomo, A.; Gunawan, P. H.

    2018-03-01

    Simulation of breaking waves by using Navier-Stokes equation via moving particle semi-implicit method (MPS) over close domain is given. The results show the parallel computing on multicore architecture using OpenMP platform can reduce the computational time almost half of the serial time. Here, the comparison using two computer architectures (AMD and Intel) are performed. The results using Intel architecture is shown better than AMD architecture in CPU time. However, in efficiency, the computer with AMD architecture gives slightly higher than the Intel. For the simulation by 1512 number of particles, the CPU time using Intel and AMD are 12662.47 and 28282.30 respectively. Moreover, the efficiency using similar number of particles, AMD obtains 50.09 % and Intel up to 49.42 %.

  3. An Extensible Open-Source Compiler Infrastructure for Testing

    Energy Technology Data Exchange (ETDEWEB)

    Quinlan, D; Ur, S; Vuduc, R

    2005-12-09

    Testing forms a critical part of the development process for large-scale software, and there is growing need for automated tools that can read, represent, analyze, and transform the application's source code to help carry out testing tasks. However, the support required to compile applications written in common general purpose languages is generally inaccessible to the testing research community. In this paper, we report on an extensible, open-source compiler infrastructure called ROSE, which is currently in development at Lawrence Livermore National Laboratory. ROSE specifically targets developers who wish to build source-based tools that implement customized analyses and optimizations for large-scale C, C++, and Fortran90 scientific computing applications (on the order of a million lines of code or more). However, much of this infrastructure can also be used to address problems in testing, and ROSE is by design broadly accessible to those without a formal compiler background. This paper details the interactions between testing of applications and the ways in which compiler technology can aid in the understanding of those applications. We emphasize the particular aspects of ROSE, such as support for the general analysis of whole programs, that are particularly well-suited to the testing research community and the scale of the problems that community solves.

  4. LISP software generative compilation within the frame of a SLIP system

    International Nuclear Information System (INIS)

    Sitbon, Andre

    1968-01-01

    After having outlined the limitations associated with the use of some programming languages (Fortran, Algol, assembler, and so on), and the interest of the use of the LISP structure and its associated language, the author notices that some problems remain regarding the memorisation of the computing process obtained by interpretation. Thus, he introduces a generative compiler which produces an executable programme, and which is written in a language very close to the used machine language, i.e. the FAP assembler language

  5. Aspects of FORTRAN in large-scale programming

    International Nuclear Information System (INIS)

    Metcalf, M.

    1983-01-01

    In these two lectures I examine the following three questions: i) Why did high-energy physicists begin to use FORTRAN. ii) Why do high-energy physicists continue to use FORTRAN. iii) Will high-energy physicists always use FORTRAN. In order to find answers to these questions, it is necessary to look at the history of the language, its present position, and its likely future, and also to consider its manner of use, the topic of portability, and the competition from other languages. Here we think especially of early competition from ALGOL, the more recent spread in the use of PASCAL, and the appearance of a completely new and ambitious language, ADA. (orig.)

  6. Introduction to modern Fortran for the Earth system sciences

    CERN Document Server

    Chirila, Dragos B

    2014-01-01

    This work provides a short "getting started" guide to Fortran 90/95. The main target audience consists of newcomers to the field of numerical computation within Earth system sciences (students, researchers or scientific programmers). Furthermore, readers accustomed to other programming languages may also benefit from this work, by discovering how some programming techniques they are familiar with map to Fortran 95. The main goal is to enable readers to quickly start using Fortran 95 for writing useful programs. It also introduces a gradual discussion of Input/Output facilities relevant for Earth system sciences, from the simplest ones to the more advanced netCDF library (which has become a de facto standard for handling the massive datasets used within Earth system sciences). While related works already treat these disciplines separately (each often providing much more information than needed by the beginning practitioner), the reader finds in this book a shorter guide which links them. Compared to other book...

  7. Aspects of FORTRAN in large-scale programming

    CERN Document Server

    Metcalf, M

    1983-01-01

    In these two lectures I shall try to examine the following three questions: i) Why did high-energy physicists begin to use FORTRAN? ii) Why do high-energy physicists continue to use FORTRAN? iii) Will high-energy physicists always use FORTRAN? In order to find answers to these questions, it is necessary to look at the history of the language, its present position, and its likely future, and also to consider its manner of use, the topic of portability, and the competition from other languages. Here we think especially of early competition from ALGOL, the more recent spread in the use of PASCAL, and the appearance of a completely new and ambitious language, ADA.

  8. Numerical methods of mathematical optimization with Algol and Fortran programs

    CERN Document Server

    Künzi, Hans P; Zehnder, C A; Rheinboldt, Werner

    1971-01-01

    Numerical Methods of Mathematical Optimization: With ALGOL and FORTRAN Programs reviews the theory and the practical application of the numerical methods of mathematical optimization. An ALGOL and a FORTRAN program was developed for each one of the algorithms described in the theoretical section. This should result in easy access to the application of the different optimization methods.Comprised of four chapters, this volume begins with a discussion on the theory of linear and nonlinear optimization, with the main stress on an easily understood, mathematically precise presentation. In addition

  9. 75 FR 21353 - Intel Corporation, Fab 20 Division, Including On-Site Leased Workers From Volt Technical...

    Science.gov (United States)

    2010-04-23

    ... DEPARTMENT OF LABOR Employment and Training Administration [TA-W-73,642] Intel Corporation, Fab 20... of Intel Corporation, Fab 20 Division, including on-site leased workers of Volt Technical Resources... Precision, Inc. were employed on-site at the Hillsboro, Oregon location of Intel Corporation, Fab 20...

  10. Expert Programmer versus Parallelizing Compiler: A Comparative Study of Two Approaches for Distributed Shared Memory

    Directory of Open Access Journals (Sweden)

    M. F. P. O'Boyle

    1996-01-01

    Full Text Available This article critically examines current parallel programming practice and optimizing compiler development. The general strategies employed by compiler and programmer to optimize a Fortran program are described, and then illustrated for a specific case by applying them to a well-known scientific program, TRED2, using the KSR-1 as the target architecture. Extensive measurement is applied to the resulting versions of the program, which are compared with a version produced by a commercial optimizing compiler, KAP. The compiler strategy significantly outperforms KAP and does not fall far short of the performance achieved by the programmer. Following the experimental section each approach is critiqued by the other. Perceived flaws, advantages, and common ground are outlined, with an eye to improving both schemes.

  11. Compiler Feedback using Continuous Dynamic Compilation during Development

    DEFF Research Database (Denmark)

    Jensen, Nicklas Bo; Karlsson, Sven; Probst, Christian W.

    2014-01-01

    to optimization. This tool can help programmers understand what the optimizing compiler has done and suggest automatic source code changes in cases where the compiler refrains from optimizing. We have integrated our tool into an integrated development environment, interactively giving feedback as part...

  12. Comparison of Processor Performance of SPECint2006 Benchmarks of some Intel Xeon Processors

    Directory of Open Access Journals (Sweden)

    Abdul Kareem PARCHUR

    2012-08-01

    Full Text Available High performance is a critical requirement to all microprocessors manufacturers. The present paper describes the comparison of performance in two main Intel Xeon series processors (Type A: Intel Xeon X5260, X5460, E5450 and L5320 and Type B: Intel Xeon X5140, 5130, 5120 and E5310. The microarchitecture of these processors is implemented using the basis of a new family of processors from Intel starting with the Pentium 4 processor. These processors can provide a performance boost for many key application areas in modern generation. The scaling of performance in two major series of Intel Xeon processors (Type A: Intel Xeon X5260, X5460, E5450 and L5320 and Type B: Intel Xeon X5140, 5130, 5120 and E5310 has been analyzed using the performance numbers of 12 CPU2006 integer benchmarks, performance numbers that exhibit significant differences in performance. The results and analysis can be used by performance engineers, scientists and developers to better understand the performance scaling in modern generation processors.

  13. Evaluation of CHO Benchmarks on the Arria 10 FPGA using Intel FPGA SDK for OpenCL

    Energy Technology Data Exchange (ETDEWEB)

    Jin, Zheming [Argonne National Lab. (ANL), Argonne, IL (United States); Yoshii, Kazutomo [Argonne National Lab. (ANL), Argonne, IL (United States); Finkel, Hal [Argonne National Lab. (ANL), Argonne, IL (United States); Cappello, Franck [Argonne National Lab. (ANL), Argonne, IL (United States)

    2017-05-23

    The OpenCL standard is an open programming model for accelerating algorithms on heterogeneous computing system. OpenCL extends the C-based programming language for developing portable codes on different platforms such as CPU, Graphics processing units (GPUs), Digital Signal Processors (DSPs) and Field Programmable Gate Arrays (FPGAs). The Intel FPGA SDK for OpenCL is a suite of tools that allows developers to abstract away the complex FPGA-based development flow for a high-level software development flow. Users can focus on the design of hardware-accelerated kernel functions in OpenCL and then direct the tools to generate the low-level FPGA implementations. The approach makes the FPGA-based development more accessible to software users as the needs for hybrid computing using CPUs and FPGAs are increasing. It can also significantly reduce the hardware development time as users can evaluate different ideas with high-level language without deep FPGA domain knowledge. Benchmarking of OpenCL-based framework is an effective way for analyzing the performance of system by studying the execution of the benchmark applications. CHO is a suite of benchmark applications that provides support for OpenCL [1]. The authors presented CHO as an OpenCL port of the CHStone benchmark. Using Altera OpenCL (AOCL) compiler to synthesize the benchmark applications, they listed the resource usage and performance of each kernel that can be successfully synthesized by the compiler. In this report, we evaluate the resource usage and performance of the CHO benchmark applications using the Intel FPGA SDK for OpenCL and Nallatech 385A FPGA board that features an Arria 10 FPGA device. The focus of the report is to have a better understanding of the resource usage and performance of the kernel implementations using Arria-10 FPGA devices compared to Stratix-5 FPGA devices. In addition, we also gain knowledge about the limitations of the current compiler when it fails to synthesize a benchmark

  14. Numerical integration subprogrammes in Fortran II-D

    Energy Technology Data Exchange (ETDEWEB)

    Fry, C. R.

    1966-12-15

    This note briefly describes some integration subprogrammes written in FORTRAN II-D for the IBM 1620-II at CARDE. These presented are two Newton-Cotes, Chebyshev polynomial summation, Filon's, Nordsieck's and optimum Runge-Kutta and predictor-corrector methods. A few miscellaneous numerical integration procedures are also mentioned covering statistical functions, oscillating integrands and functions occurring in electrical engineering.

  15. Multidimentional and Multi-Parameter Fortran-Based Curve Fitting ...

    African Journals Online (AJOL)

    This work briefly describes the mathematics behind the algorithm, and also elaborates how to implement it using FORTRAN 95 programming language. The advantage of this algorithm, when it is extended to surfaces and complex functions, is that it makes researchers to have a better trust during fitting. It also improves the ...

  16. An Introduction to Fortran Programming: An IPI Approach.

    Science.gov (United States)

    Fisher, D. D.; And Others

    This text is designed to give individually paced instruction in Fortran Programing. The text contains fifteen units. Unit titles include: Flowcharts, Input and Output, Loops, and Debugging. Also included is an extensive set of appendices. These were designed to contain a great deal of practical information necessary to the course. These appendices…

  17. Performance of a plasma fluid code on the Intel parallel computers

    International Nuclear Information System (INIS)

    Lynch, V.E.; Carreras, B.A.; Drake, J.B.; Leboeuf, J.N.; Liewer, P.

    1992-01-01

    One approach to improving the real-time efficiency of plasma turbulence calculations is to use a parallel algorithm. A parallel algorithm for plasma turbulence calculations was tested on the Intel iPSC/860 hypercube and the Touchtone Delta machine. Using the 128 processors of the Intel iPSC/860 hypercube, a factor of 5 improvement over a single-processor CRAY-2 is obtained. For the Touchtone Delta machine, the corresponding improvement factor is 16. For plasma edge turbulence calculations, an extrapolation of the present results to the Intel σ machine gives an improvement factor close to 64 over the single-processor CRAY-2

  18. Performance of a plasma fluid code on the Intel parallel computers

    International Nuclear Information System (INIS)

    Lynch, V.E.; Carreras, B.A.; Drake, J.B.; Leboeuf, J.N.; Liewer, P.

    1992-01-01

    One approach to improving the real-time efficiency of plasma turbulence calculations is to use a parallel algorithm. A parallel algorithm for plasma turbulence calculations was tested on the Intel iPSC/860 hypercube and the Touchtone Delta machine. Using the 128 processors of the Intel iPSC/860 hypercube, a factor of 5 improvement over a single-processor CRAY-2 is obtained. For the Touchtone Delta machine, the corresponding improvement factor is 16. For plasma edge turbulence calculations, an extrapolation of the present results to the Intel (sigma) machine gives an improvement factor close to 64 over the single-processor CRAY-2. 12 refs

  19. Performance of a plasma fluid code on the Intel parallel computers

    Science.gov (United States)

    Lynch, V. E.; Carreras, B. A.; Drake, J. B.; Leboeuf, J. N.; Liewer, P.

    1992-01-01

    One approach to improving the real-time efficiency of plasma turbulence calculations is to use a parallel algorithm. A parallel algorithm for plasma turbulence calculations was tested on the Intel iPSC/860 hypercube and the Touchtone Delta machine. Using the 128 processors of the Intel iPSC/860 hypercube, a factor of 5 improvement over a single-processor CRAY-2 is obtained. For the Touchtone Delta machine, the corresponding improvement factor is 16. For plasma edge turbulence calculations, an extrapolation of the present results to the Intel (sigma) machine gives an improvement factor close to 64 over the single-processor CRAY-2.

  20. Compiling a 50-year journey

    DEFF Research Database (Denmark)

    Hutton, Graham; Bahr, Patrick

    2017-01-01

    Fifty years ago, John McCarthy and James Painter published the first paper on compiler verification, in which they showed how to formally prove the correctness of a compiler that translates arithmetic expressions into code for a register-based machine. In this article, we revisit this example...

  1. Single event effect testing of the Intel 80386 family and the 80486 microprocessor

    International Nuclear Information System (INIS)

    Moran, A.; LaBel, K.; Gates, M.; Seidleck, C.; McGraw, R.; Broida, M.; Firer, J.; Sprehn, S.

    1996-01-01

    The authors present single event effect test results for the Intel 80386 microprocessor, the 80387 coprocessor, the 82380 peripheral device, and on the 80486 microprocessor. Both single event upset and latchup conditions were monitored

  2. CAMSHIFT Tracker Design Experiments With Intel OpenCV and SAI

    National Research Council Canada - National Science Library

    Francois, Alexandre R

    2004-01-01

    ... (including multi-modal) systems, must be specifically addressed. This report describes design and implementation experiments for CAMSHIFT-based tracking systems using Intel's Open Computer Vision library and SAI...

  3. Multi-threaded ATLAS simulation on Intel Knights Landing processors

    Science.gov (United States)

    Farrell, Steven; Calafiura, Paolo; Leggett, Charles; Tsulaia, Vakhtang; Dotti, Andrea; ATLAS Collaboration

    2017-10-01

    The Knights Landing (KNL) release of the Intel Many Integrated Core (MIC) Xeon Phi line of processors is a potential game changer for HEP computing. With 72 cores and deep vector registers, the KNL cards promise significant performance benefits for highly-parallel, compute-heavy applications. Cori, the newest supercomputer at the National Energy Research Scientific Computing Center (NERSC), was delivered to its users in two phases with the first phase online at the end of 2015 and the second phase now online at the end of 2016. Cori Phase 2 is based on the KNL architecture and contains over 9000 compute nodes with 96GB DDR4 memory. ATLAS simulation with the multithreaded Athena Framework (AthenaMT) is a good potential use-case for the KNL architecture and supercomputers like Cori. ATLAS simulation jobs have a high ratio of CPU computation to disk I/O and have been shown to scale well in multi-threading and across many nodes. In this paper we will give an overview of the ATLAS simulation application with details on its multi-threaded design. Then, we will present a performance analysis of the application on KNL devices and compare it to a traditional x86 platform to demonstrate the capabilities of the architecture and evaluate the benefits of utilizing KNL platforms like Cori for ATLAS production.

  4. Performance optimization of Qbox and WEST on Intel Knights Landing

    Science.gov (United States)

    Zheng, Huihuo; Knight, Christopher; Galli, Giulia; Govoni, Marco; Gygi, Francois

    We present the optimization of electronic structure codes Qbox and WEST targeting the Intel®Xeon Phi™processor, codenamed Knights Landing (KNL). Qbox is an ab-initio molecular dynamics code based on plane wave density functional theory (DFT) and WEST is a post-DFT code for excited state calculations within many-body perturbation theory. Both Qbox and WEST employ highly scalable algorithms which enable accurate large-scale electronic structure calculations on leadership class supercomputer platforms beyond 100,000 cores, such as Mira and Theta at the Argonne Leadership Computing Facility. In this work, features of the KNL architecture (e.g. hierarchical memory) are explored to achieve higher performance in key algorithms of the Qbox and WEST codes and to develop a road-map for further development targeting next-generation computing architectures. In particular, the optimizations of the Qbox and WEST codes on the KNL platform will target efficient large-scale electronic structure calculations of nanostructured materials exhibiting complex structures and prediction of their electronic and thermal properties for use in solar and thermal energy conversion device. This work was supported by MICCoM, as part of Comp. Mats. Sci. Program funded by the U.S. DOE, Office of Sci., BES, MSE Division. This research used resources of the ALCF, which is a DOE Office of Sci. User Facility under Contract DE-AC02-06CH11357.

  5. Multi-threaded ATLAS simulation on Intel Knights Landing processors

    CERN Document Server

    AUTHOR|(INSPIRE)INSPIRE-00014247; The ATLAS collaboration; Calafiura, Paolo; Leggett, Charles; Tsulaia, Vakhtang; Dotti, Andrea

    2017-01-01

    The Knights Landing (KNL) release of the Intel Many Integrated Core (MIC) Xeon Phi line of processors is a potential game changer for HEP computing. With 72 cores and deep vector registers, the KNL cards promise significant performance benefits for highly-parallel, compute-heavy applications. Cori, the newest supercomputer at the National Energy Research Scientific Computing Center (NERSC), was delivered to its users in two phases with the first phase online at the end of 2015 and the second phase now online at the end of 2016. Cori Phase 2 is based on the KNL architecture and contains over 9000 compute nodes with 96GB DDR4 memory. ATLAS simulation with the multithreaded Athena Framework (AthenaMT) is a good potential use-case for the KNL architecture and supercomputers like Cori. ATLAS simulation jobs have a high ratio of CPU computation to disk I/O and have been shown to scale well in multi-threading and across many nodes. In this paper we will give an overview of the ATLAS simulation application with detai...

  6. Multi-threaded ATLAS Simulation on Intel Knights Landing Processors

    CERN Document Server

    Farrell, Steven; The ATLAS collaboration; Calafiura, Paolo; Leggett, Charles

    2016-01-01

    The Knights Landing (KNL) release of the Intel Many Integrated Core (MIC) Xeon Phi line of processors is a potential game changer for HEP computing. With 72 cores and deep vector registers, the KNL cards promise significant performance benefits for highly-parallel, compute-heavy applications. Cori, the newest supercomputer at the National Energy Research Scientific Computing Center (NERSC), will be delivered to its users in two phases with the first phase online now and the second phase expected in mid-2016. Cori Phase 2 will be based on the KNL architecture and will contain over 9000 compute nodes with 96GB DDR4 memory. ATLAS simulation with the multithreaded Athena Framework (AthenaMT) is a great use-case for the KNL architecture and supercomputers like Cori. Simulation jobs have a high ratio of CPU computation to disk I/O and have been shown to scale well in multi-threading and across many nodes. In this presentation we will give an overview of the ATLAS simulation application with details on its multi-thr...

  7. Evaluation of the Intel Sandy Bridge-EP server processor

    CERN Document Server

    Jarp, S; Leduc, J; Nowak, A; CERN. Geneva. IT Department

    2012-01-01

    In this paper we report on a set of benchmark results recently obtained by CERN openlab when comparing an 8-core “Sandy Bridge-EP” processor with Intel’s previous microarchitecture, the “Westmere-EP”. The Intel marketing names for these processors are “Xeon E5-2600 processor series” and “Xeon 5600 processor series”, respectively. Both processors are produced in a 32nm process, and both platforms are dual-socket servers. Multiple benchmarks were used to get a good understanding of the performance of the new processor. We used both industry-standard benchmarks, such as SPEC2006, and specific High Energy Physics benchmarks, representing both simulation of physics detectors and data analysis of physics events. Before summarizing the results we must stress the fact that benchmarking of modern processors is a very complex affair. One has to control (at least) the following features: processor frequency, overclocking via Turbo mode, the number of physical cores in use, the use of logical cores ...

  8. Evaluation of the Intel Nehalem-EX server processor

    CERN Document Server

    Jarp, S; Leduc, J; Nowak, A; CERN. Geneva. IT Department

    2010-01-01

    In this paper we report on a set of benchmark results recently obtained by the CERN openlab by comparing the 4-socket, 32-core Intel Xeon X7560 server with the previous generation 4-socket server, based on the Xeon X7460 processor. The Xeon X7560 processor represents a major change in many respects, especially the memory sub-system, so it was important to make multiple comparisons. In most benchmarks the two 4-socket servers were compared. It should be underlined that both servers represent the “top of the line” in terms of frequency. However, in some cases, it was important to compare systems that integrated the latest processor features, such as QPI links, Symmetric multithreading and over-clocking via Turbo mode, and in such situations the X7560 server was compared to a dual socket L5520 based system with an identical frequency of 2.26 GHz. Before summarizing the results we must stress the fact that benchmarking of modern processors is a very complex affair. One has to control (at least) the following ...

  9. Implementation of the Next Generation Attenuation (NGA) ground-motion prediction equations in Fortran and R

    Science.gov (United States)

    Kaklamanos, James; Boore, David M.; Thompson, Eric M.; Campbell, Kenneth W.

    2010-01-01

    This report presents two methods for implementing the earthquake ground-motion prediction equations released in 2008 as part of the Next Generation Attenuation of Ground Motions (NGA-West, or NGA) project coordinated by the Pacific Earthquake Engineering Research Center (PEER). These models were developed for predicting ground-motion parameters for shallow crustal earthquakes in active tectonic regions (such as California). Of the five ground-motion prediction equations (GMPEs) developed during the NGA project, four models are implemented: the GMPEs of Abrahamson and Silva (2008), Boore and Atkinson (2008), Campbell and Bozorgnia (2008), and Chiou and Youngs (2008a); these models are abbreviated as AS08, BA08, CB08, and CY08, respectively. Since site response is widely recognized as an important influence of ground motions, engineering applications typically require that such effects be modeled. The model of Idriss (2008) is not implemented in our programs because it does not explicitly include site response, whereas the other four models include site response and use the same variable to describe the site condition (VS30). We do not intend to discourage the use of the Idriss (2008) model, but we have chosen to implement the other four NGA models in our programs for those users who require ground-motion estimates for various site conditions. We have implemented the NGA models by using two separate programming languages: Fortran and R (R Development Core Team, 2010). Fortran, a compiled programming language, has been used in the scientific community for decades. R is an object-oriented language and environment for statistical computing that is gaining popularity in the statistical and scientific community. Derived from the S language and environment developed at Bell Laboratories, R is an open-source language that is freely available at http://www.r-project.org/ (last accessed 11 January 2011). In R, the functions for computing the NGA equations can be loaded as an

  10. Performance Characterization of Multi-threaded Graph Processing Applications on Intel Many-Integrated-Core Architecture

    OpenAIRE

    Liu, Xu; Chen, Langshi; Firoz, Jesun S.; Qiu, Judy; Jiang, Lei

    2017-01-01

    Intel Xeon Phi many-integrated-core (MIC) architectures usher in a new era of terascale integration. Among emerging killer applications, parallel graph processing has been a critical technique to analyze connected data. In this paper, we empirically evaluate various computing platforms including an Intel Xeon E5 CPU, a Nvidia Geforce GTX1070 GPU and an Xeon Phi 7210 processor codenamed Knights Landing (KNL) in the domain of parallel graph processing. We show that the KNL gains encouraging per...

  11. Adaptation of MPDATA Heterogeneous Stencil Computation to Intel Xeon Phi Coprocessor

    Directory of Open Access Journals (Sweden)

    Lukasz Szustak

    2015-01-01

    Full Text Available The multidimensional positive definite advection transport algorithm (MPDATA belongs to the group of nonoscillatory forward-in-time algorithms and performs a sequence of stencil computations. MPDATA is one of the major parts of the dynamic core of the EULAG geophysical model. In this work, we outline an approach to adaptation of the 3D MPDATA algorithm to the Intel MIC architecture. In order to utilize available computing resources, we propose the (3 + 1D decomposition of MPDATA heterogeneous stencil computations. This approach is based on combination of the loop tiling and fusion techniques. It allows us to ease memory/communication bounds and better exploit the theoretical floating point efficiency of target computing platforms. An important method of improving the efficiency of the (3 + 1D decomposition is partitioning of available cores/threads into work teams. It permits for reducing inter-cache communication overheads. This method also increases opportunities for the efficient distribution of MPDATA computation onto available resources of the Intel MIC architecture, as well as Intel CPUs. We discuss preliminary performance results obtained on two hybrid platforms, containing two CPUs and Intel Xeon Phi. The top-of-the-line Intel Xeon Phi 7120P gives the best performance results, and executes MPDATA almost 2 times faster than two Intel Xeon E5-2697v2 CPUs.

  12. User manual for two simple postscript output FORTRAN plotting routines

    Science.gov (United States)

    Nguyen, T. X.

    1991-01-01

    Graphics is one of the important tools in engineering analysis and design. However, plotting routines that generate output on high quality laser printers normally come in graphics packages, which tend to be expensive and system dependent. These factors become important for small computer systems or desktop computers, especially when only some form of a simple plotting routine is sufficient. With the Postscript language becoming popular, there are more and more Postscript laser printers now available. Simple, versatile, low cost plotting routines that can generate output on high quality laser printers are needed and standard FORTRAN language plotting routines using output in Postscript language seems logical. The purpose here is to explain two simple FORTRAN plotting routines that generate output in Postscript language.

  13. Algorithmic synthesis using Python compiler

    Science.gov (United States)

    Cieszewski, Radoslaw; Romaniuk, Ryszard; Pozniak, Krzysztof; Linczuk, Maciej

    2015-09-01

    This paper presents a python to VHDL compiler. The compiler interprets an algorithmic description of a desired behavior written in Python and translate it to VHDL. FPGA combines many benefits of both software and ASIC implementations. Like software, the programmed circuit is flexible, and can be reconfigured over the lifetime of the system. FPGAs have the potential to achieve far greater performance than software as a result of bypassing the fetch-decode-execute operations of traditional processors, and possibly exploiting a greater level of parallelism. This can be achieved by using many computational resources at the same time. Creating parallel programs implemented in FPGAs in pure HDL is difficult and time consuming. Using higher level of abstraction and High-Level Synthesis compiler implementation time can be reduced. The compiler has been implemented using the Python language. This article describes design, implementation and results of created tools.

  14. A FORTRAN program for a least-square fitting

    International Nuclear Information System (INIS)

    Yamazaki, Tetsuo

    1978-01-01

    A practical FORTRAN program for a least-squares fitting is presented. Although the method is quite usual, the program calculates not only the most satisfactory set of values of unknowns but also the plausible errors associated with them. As an example, a measured lateral absorbed-dose distribution in water for a narrow 25-MeV electron beam is fitted to a Gaussian distribution. (auth.)

  15. A Fortran Program for Deep Space Sensor Analysis.

    Science.gov (United States)

    1984-12-14

    used to help p maintain currency to the deep space satellite catelog? Research Question Can a Fortran program be designed to evaluate the effectiveness ...Range ( AFETR ) Range p Measurements Laboratory (RML) is located in Malibar, .- Florida. Like GEODSS, Malibar uses a 48 inch telescope with a...phased out. This mode will evaluate the effect of the loss of the 3 Baker-Nunn sites to mode 3 Mode 5 through Mode 8 Modes 5 through 8 are identical to

  16. A compiler for variational forms

    OpenAIRE

    Kirby, Robert C.; Logg, Anders

    2011-01-01

    As a key step towards a complete automation of the finite element method, we present a new algorithm for automatic and efficient evaluation of multilinear variational forms. The algorithm has been implemented in the form of a compiler, the FEniCS Form Compiler FFC. We present benchmark results for a series of standard variational forms, including the incompressible Navier-Stokes equations and linear elasticity. The speedup compared to the standard quadrature-based approach is impressive; in s...

  17. MAPLIB, Thermodynamics Materials Property Generator for FORTRAN Program

    International Nuclear Information System (INIS)

    Schumann, U.; Zimmerer, W. and others

    1978-01-01

    1 - Nature of physical problem solved: MAPLIB is a program system which is able to incorporate the values of the properties of any material in a form suitable for use in other computer programs. The data are implemented in FORTRAN functions. A utility program is provided to assist in library management. 2 - Method of solution: MAPLIB consists of the following parts: 1) Conventions for the data format. 2) Some integrated data. 3) A data access system (FORTRAN subroutine). 4) An utility program for updating and documentation of the actual library content. The central part is a set of FORTRAN functions, e.g. WL H2O v(t,p) (heat conduction of water vapor as a function of temperature and pressure), which compute the required data and which can be called by the user program. The data content of MAPLIB has been delivered by many persons. There was no systematic evaluation of the material. It is the responsibility of every user to check the data for physical accuracy. MAPLIB only serves as a library system for manipulation and storing of such data. 3 - Restrictions on the complexity of the problem: a) See responsibility as explained above. b) Up to 1000 data functions could be implemented. c) If too many data functions are included in MAPLIB, the storage requirements become excessive for application in users programs

  18. Automatic generation of Fortran programs for algebraic simulation models

    International Nuclear Information System (INIS)

    Schopf, W.; Rexer, G.; Ruehle, R.

    1978-04-01

    This report documents a generator program by which econometric simulation models formulated in an application-orientated language can be transformed automatically in a Fortran program. Thus the model designer is able to build up, test and modify models without the need of a Fortran programmer. The development of a computer model is therefore simplified and shortened appreciably; in chapter 1-3 of this report all rules are presented for the application of the generator to the model design. Algebraic models including exogeneous and endogeneous time series variables, lead and lag function can be generated. In addition, to these language elements, Fortran sequences can be applied to the formulation of models in the case of complex model interrelations. Automatically the generated model is a module of the program system RSYST III and is therefore able to exchange input and output data with the central data bank of the system and in connection with the method library modules can be used to handle planning problems. (orig.) [de

  19. Final Report, Center for Programming Models for Scalable Parallel Computing: Co-Array Fortran, Grant Number DE-FC02-01ER25505

    Energy Technology Data Exchange (ETDEWEB)

    Robert W. Numrich

    2008-04-22

    The major accomplishment of this project is the production of CafLib, an 'object-oriented' parallel numerical library written in Co-Array Fortran. CafLib contains distributed objects such as block vectors and block matrices along with procedures, attached to each object, that perform basic linear algebra operations such as matrix multiplication, matrix transpose and LU decomposition. It also contains constructors and destructors for each object that hide the details of data decomposition from the programmer, and it contains collective operations that allow the programmer to calculate global reductions, such as global sums, global minima and global maxima, as well as vector and matrix norms of several kinds. CafLib is designed to be extensible in such a way that programmers can define distributed grid and field objects, based on vector and matrix objects from the library, for finite difference algorithms to solve partial differential equations. A very important extra benefit that resulted from the project is the inclusion of the co-array programming model in the next Fortran standard called Fortran 2008. It is the first parallel programming model ever included as a standard part of the language. Co-arrays will be a supported feature in all Fortran compilers, and the portability provided by standardization will encourage a large number of programmers to adopt it for new parallel application development. The combination of object-oriented programming in Fortran 2003 with co-arrays in Fortran 2008 provides a very powerful programming model for high-performance scientific computing. Additional benefits from the project, beyond the original goal, include a programto provide access to the co-array model through access to the Cray compiler as a resource for teaching and research. Several academics, for the first time, included the co-array model as a topic in their courses on parallel computing. A separate collaborative project with LANL and PNNL showed how to

  20. Balancing Contention and Synchronization on the Intel Paragon

    Science.gov (United States)

    Bokhari, Shahid H.; Nicol, David M.

    1996-01-01

    The Intel Paragon is a mesh-connected distributed memory parallel computer. It uses an oblivious and deterministic message routing algorithm: this permits us to develop highly optimized schedules for frequently needed communication patterns. The complete exchange is one such pattern. Several approaches are available for carrying it out on the mesh. We study an algorithm developed by Scott. This algorithm assumes that a communication link can carry one message at a time and that a node can only transmit one message at a time. It requires global synchronization to enforce a schedule of transmissions. Unfortunately global synchronization has substantial overhead on the Paragon. At the same time the powerful interconnection mechanism of this machine permits 2 or 3 messages to share a communication link with minor overhead. It can also overlap multiple message transmission from the same node to some extent. We develop a generalization of Scott's algorithm that executes complete exchange with a prescribed contention. Schedules that incur greater contention require fewer synchronization steps. This permits us to tradeoff contention against synchronization overhead. We describe the performance of this algorithm and compare it with Scott's original algorithm as well as with a naive algorithm that does not take interconnection structure into account. The Bounded contention algorithm is always better than Scott's algorithm and outperforms the naive algorithm for all but the smallest message sizes. The naive algorithm fails to work on meshes larger than 12 x 12. These results show that due consideration of processor interconnect and machine performance parameters is necessary to obtain peak performance from the Paragon and its successor mesh machines.

  1. Fortran interface layer of the framework for developing particle simulator FDPS

    Science.gov (United States)

    Namekata, Daisuke; Iwasawa, Masaki; Nitadori, Keigo; Tanikawa, Ataru; Muranushi, Takayuki; Wang, Long; Hosono, Natsuki; Nomura, Kentaro; Makino, Junichiro

    2018-06-01

    Numerical simulations based on particle methods have been widely used in various fields including astrophysics. To date, various versions of simulation software have been developed by individual researchers or research groups in each field, through a huge amount of time and effort, even though the numerical algorithms used are very similar. To improve the situation, we have developed a framework, called FDPS (Framework for Developing Particle Simulators), which enables researchers to develop massively parallel particle simulation codes for arbitrary particle methods easily. Until version 3.0, FDPS provided an API (application programming interface) for the C++ programming language only. This limitation comes from the fact that FDPS is developed using the template feature in C++, which is essential to support arbitrary data types of particle. However, there are many researchers who use Fortran to develop their codes. Thus, the previous versions of FDPS require such people to invest much time to learn C++. This is inefficient. To cope with this problem, we developed a Fortran interface layer in FDPS, which provides API for Fortran. In order to support arbitrary data types of particle in Fortran, we design the Fortran interface layer as follows. Based on a given derived data type in Fortran representing particle, a PYTHON script provided by us automatically generates a library that manipulates the C++ core part of FDPS. This library is seen as a Fortran module providing an API of FDPS from the Fortran side and uses C programs internally to interoperate Fortran with C++. In this way, we have overcome several technical issues when emulating a `template' in Fortran. Using the Fortran interface, users can develop all parts of their codes in Fortran. We show that the overhead of the Fortran interface part is sufficiently small and a code written in Fortran shows a performance practically identical to the one written in C++.

  2. Advanced C and C++ compiling

    CERN Document Server

    Stevanovic, Milan

    2014-01-01

    Learning how to write C/C++ code is only the first step. To be a serious programmer, you need to understand the structure and purpose of the binary files produced by the compiler: object files, static libraries, shared libraries, and, of course, executables.Advanced C and C++ Compiling explains the build process in detail and shows how to integrate code from other developers in the form of deployed libraries as well as how to resolve issues and potential mismatches between your own and external code trees.With the proliferation of open source, understanding these issues is increasingly the res

  3. Multi-Kepler GPU vs. multi-Intel MIC for spin systems simulations

    Science.gov (United States)

    Bernaschi, M.; Bisson, M.; Salvadore, F.

    2014-10-01

    We present and compare the performances of two many-core architectures: the Nvidia Kepler and the Intel MIC both in a single system and in cluster configuration for the simulation of spin systems. As a benchmark we consider the time required to update a single spin of the 3D Heisenberg spin glass model by using the Over-relaxation algorithm. We present data also for a traditional high-end multi-core architecture: the Intel Sandy Bridge. The results show that although on the two Intel architectures it is possible to use basically the same code, the performances of a Intel MIC change dramatically depending on (apparently) minor details. Another issue is that to obtain a reasonable scalability with the Intel Phi coprocessor (Phi is the coprocessor that implements the MIC architecture) in a cluster configuration it is necessary to use the so-called offload mode which reduces the performances of the single system. As to the GPU, the Kepler architecture offers a clear advantage with respect to the previous Fermi architecture maintaining exactly the same source code. Scalability of the multi-GPU implementation remains very good by using the CPU as a communication co-processor of the GPU. All source codes are provided for inspection and for double-checking the results.

  4. Formula Translation in Blitz++, NumPy and Modern Fortran: A Case Study of the Language Choice Tradeoffs

    Directory of Open Access Journals (Sweden)

    Sylwester Arabas

    2014-01-01

    Full Text Available Three object-oriented implementations of a prototype solver of the advection equation are introduced. The presented programs are based on Blitz++ (C++, NumPy (Python and Fortran's built-in array containers. The solvers constitute implementations of the Multidimensional Positive-Definite Advective Transport Algorithm (MPDATA. The introduced codes serve as examples for how the application of object-oriented programming (OOP techniques and new language constructs from C++11 and Fortran 2008 allow to reproduce the mathematical notation used in the literature within the program code. A discussion on the tradeoffs of the programming language choice is presented. The main angles of comparison are code brevity and syntax clarity (and hence maintainability and auditability as well as performance. All performance tests are carried out using free and open-source compilers. In the case of Python, a significant performance gain is observed when switching from the standard interpreter (CPython to the PyPy implementation of Python. Entire source code of all three implementations is embedded in the text and is licensed under the terms of the GNU GPL license.

  5. Dynamic Memory De-allocation in Fortran 95/2003 Derived Type Calculus

    Directory of Open Access Journals (Sweden)

    Damian W.I. Rouson

    2005-01-01

    Full Text Available Abstract data types developed for computational science and engineering are frequently modeled after physical objects whose state variables must satisfy governing differential equations. Generalizing the associated algebraic and differential operators to operate on the abstract data types facilitates high-level program constructs that mimic standard mathematical notation. For non-trivial expressions, multiple object instantiations must occur to hold intermediate results during the expression's evaluation. When the dimension of each object's state space is not specified at compile-time, the programmer becomes responsible for dynamically allocating and de-allocating memory for each instantiation. With the advent of allocatable components in Fortran 2003 derived types, the potential exists for these intermediate results to occupy a substantial fraction of a program's footprint in memory. This issue becomes particularly acute at the highest levels of abstraction where coarse-grained data structures predominate. This paper proposes a set of rules for de-allocating memory that has been dynamically allocated for intermediate results in derived type calculus, while distinguishing that memory from more persistent objects. The new rules are applied to the design of a polymorphic time integrator for integrating evolution equations governing dynamical systems. Associated issues of efficiency and design robustness are discussed.

  6. On the tradeoffs of programming language choice for numerical modelling in geoscience. A case study comparing modern Fortran, C++/Blitz++ and Python/NumPy.

    Science.gov (United States)

    Jarecka, D.; Arabas, S.; Fijalkowski, M.; Gaynor, A.

    2012-04-01

    The language of choice for numerical modelling in geoscience has long been Fortran. A choice of a particular language and coding paradigm comes with different set of tradeoffs such as that between performance, ease of use (and ease of abuse), code clarity, maintainability and reusability, availability of open source compilers, debugging tools, adequate external libraries and parallelisation mechanisms. The availability of trained personnel and the scale and activeness of the developer community is of importance as well. We present a short comparison study aimed at identification and quantification of these tradeoffs for a particular example of an object oriented implementation of a parallel 2D-advection-equation solver in Python/NumPy, C++/Blitz++ and modern Fortran. The main angles of comparison will be complexity of implementation, performance of various compilers or interpreters and characterisation of the "added value" gained by a particular choice of the language. The choice of the numerical problem is dictated by the aim to make the comparison useful and meaningful to geoscientists. Python is chosen as a language that traditionally is associated with ease of use, elegant syntax but limited performance. C++ is chosen for its traditional association with high performance but even higher complexity and syntax obscurity. Fortran is included in the comparison for its widespread use in geoscience often attributed to its performance. We confront the validity of these traditional views. We point out how the usability of a particular language in geoscience depends on the characteristics of the language itself and the availability of pre-existing software libraries (e.g. NumPy, SciPy, PyNGL, PyNIO, MPI4Py for Python and Blitz++, Boost.Units, Boost.MPI for C++). Having in mind the limited complexity of the considered numerical problem, we present a tentative comparison of performance of the three implementations with different open source compilers including CPython and

  7. Mass: Fortran program for calculating mass-absorption coefficients

    International Nuclear Information System (INIS)

    Nielsen, Aa.; Svane Petersen, T.

    1980-01-01

    Determinations of mass-absorption coefficients in the x-ray analysis of trace elements are an important and time consuming part of the arithmetic calculation. In the course of time different metods have been used. The program MASS calculates the mass-absorption coefficients from a given major element analysis at the x-ray wavelengths normally used in trace element determinations and lists the chemical analysis and the mass-absorption coefficients. The program is coded in FORTRAN IV, and is operational on the IBM 370/165 computer, on the UNIVAC 1110 and on PDP 11/05. (author)

  8. Numerical computation of molecular integrals via optimized (vectorized) FORTRAN code

    International Nuclear Information System (INIS)

    Scott, T.C.; Grant, I.P.; Saunders, V.R.

    1997-01-01

    The calculation of molecular properties based on quantum mechanics is an area of fundamental research whose horizons have always been determined by the power of state-of-the-art computers. A computational bottleneck is the numerical calculation of the required molecular integrals to sufficient precision. Herein, we present a method for the rapid numerical evaluation of molecular integrals using optimized FORTRAN code generated by Maple. The method is based on the exploitation of common intermediates and the optimization can be adjusted to both serial and vectorized computations. (orig.)

  9. A FORTRAN realization of the block adjustment of CCD frames

    Science.gov (United States)

    Yu, Yong; Tang, Zhenghong; Li, Jinling; Zhao, Ming

    A FORTRAN version realization of the block adjustment (BA) of overlapping CCD frames is developed. The flowchart is introduced including (a) data collection, (b) preprocessing, and (c) BA and object positioning. The subroutines and their functions are also demonstrated. The program package is tested by simulated data with/without the application of white noises. It is also preliminarily applied to the reduction of optical positions of four extragalactic radio sources. The results show that because of the increase in the sky coverage and number of reference stars, the precision of deducted positions is improved compared with single plate adjustment.

  10. The FORTRAN-77 version of the Karlsruhe program system KAPROS

    International Nuclear Information System (INIS)

    Moritz, N.

    1985-02-01

    The FORTRAN-77 KAPROS-kernel includes some major changes compared with the version, which is described in the KfK-report 2254. The changes are documented in this report from the point of view of the system-programmer. This report is meant to be a supplement to the KfK-report 2254, assuming that the reader of this report is familiar with the KfK-report 2254. He also should be familiar with the IBM operating system MVS SP1.3.2 and the usual terms of data processing. (orig.) [de

  11. 1988 Bulletin compilation and index

    International Nuclear Information System (INIS)

    1989-02-01

    This document is published to provide current information about the national program for managing spent fuel and high-level radioactive waste. This document is a compilation of issues from the 1988 calendar year. A table of contents and one index have been provided to assist in finding information

  12. Compilation of solar abundance data

    International Nuclear Information System (INIS)

    Hauge, Oe.; Engvold, O.

    1977-01-01

    Interest in the previous compilations of solar abundance data by the same authors (ITA--31 and ITA--39) has led to this third, revised edition. Solar abundance data of 67 elements are tabulated and in addition upper limits for the abundances of 5 elements are listed. References are made to 167 papers. A recommended abundance value is given for each element. (JIW)

  13. 1988 Bulletin compilation and index

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    1989-02-01

    This document is published to provide current information about the national program for managing spent fuel and high-level radioactive waste. This document is a compilation of issues from the 1988 calendar year. A table of contents and one index have been provided to assist in finding information.

  14. Intel Xeon Phi accelerated Weather Research and Forecasting (WRF) Goddard microphysics scheme

    Science.gov (United States)

    Mielikainen, J.; Huang, B.; Huang, A. H.-L.

    2014-12-01

    The Weather Research and Forecasting (WRF) model is a numerical weather prediction system designed to serve both atmospheric research and operational forecasting needs. The WRF development is a done in collaboration around the globe. Furthermore, the WRF is used by academic atmospheric scientists, weather forecasters at the operational centers and so on. The WRF contains several physics components. The most time consuming one is the microphysics. One microphysics scheme is the Goddard cloud microphysics scheme. It is a sophisticated cloud microphysics scheme in the Weather Research and Forecasting (WRF) model. The Goddard microphysics scheme is very suitable for massively parallel computation as there are no interactions among horizontal grid points. Compared to the earlier microphysics schemes, the Goddard scheme incorporates a large number of improvements. Thus, we have optimized the Goddard scheme code. In this paper, we present our results of optimizing the Goddard microphysics scheme on Intel Many Integrated Core Architecture (MIC) hardware. The Intel Xeon Phi coprocessor is the first product based on Intel MIC architecture, and it consists of up to 61 cores connected by a high performance on-die bidirectional interconnect. The Intel MIC is capable of executing a full operating system and entire programs rather than just kernels as the GPU does. The MIC coprocessor supports all important Intel development tools. Thus, the development environment is one familiar to a vast number of CPU developers. Although, getting a maximum performance out of MICs will require using some novel optimization techniques. Those optimization techniques are discussed in this paper. The results show that the optimizations improved performance of Goddard microphysics scheme on Xeon Phi 7120P by a factor of 4.7×. In addition, the optimizations reduced the Goddard microphysics scheme's share of the total WRF processing time from 20.0 to 7.5%. Furthermore, the same optimizations

  15. IMAGEP - A FORTRAN ALGORITHM FOR DIGITAL IMAGE PROCESSING

    Science.gov (United States)

    Roth, D. J.

    1994-01-01

    IMAGEP is a FORTRAN computer algorithm containing various image processing, analysis, and enhancement functions. It is a keyboard-driven program organized into nine subroutines. Within the subroutines are other routines, also, selected via keyboard. Some of the functions performed by IMAGEP include digitization, storage and retrieval of images; image enhancement by contrast expansion, addition and subtraction, magnification, inversion, and bit shifting; display and movement of cursor; display of grey level histogram of image; and display of the variation of grey level intensity as a function of image position. This algorithm has possible scientific, industrial, and biomedical applications in material flaw studies, steel and ore analysis, and pathology, respectively. IMAGEP is written in VAX FORTRAN for DEC VAX series computers running VMS. The program requires the use of a Grinnell 274 image processor which can be obtained from Mark McCloud Associates, Campbell, CA. An object library of the required GMR series software is included on the distribution media. IMAGEP requires 1Mb of RAM for execution. The standard distribution medium for this program is a 1600 BPI 9track magnetic tape in VAX FILES-11 format. It is also available on a TK50 tape cartridge in VAX FILES-11 format. This program was developed in 1991. DEC, VAX, VMS, and TK50 are trademarks of Digital Equipment Corporation.

  16. High-performance computing on the Intel Xeon Phi how to fully exploit MIC architectures

    CERN Document Server

    Wang, Endong; Shen, Bo; Zhang, Guangyong; Lu, Xiaowei; Wu, Qing; Wang, Yajuan

    2014-01-01

    The aim of this book is to explain to high-performance computing (HPC) developers how to utilize the Intel® Xeon Phi™ series products efficiently. To that end, it introduces some computing grammar, programming technology and optimization methods for using many-integrated-core (MIC) platforms and also offers tips and tricks for actual use, based on the authors' first-hand optimization experience.The material is organized in three sections. The first section, "Basics of MIC", introduces the fundamentals of MIC architecture and programming, including the specific Intel MIC programming environment

  17. Evaluating the transport layer of the ALFA framework for the Intel® Xeon Phi™ Coprocessor

    Science.gov (United States)

    Santogidis, Aram; Hirstius, Andreas; Lalis, Spyros

    2015-12-01

    The ALFA framework supports the software development of major High Energy Physics experiments. As part of our research effort to optimize the transport layer of ALFA, we focus on profiling its data transfer performance for inter-node communication on the Intel Xeon Phi Coprocessor. In this article we present the collected performance measurements with the related analysis of the results. The optimization opportunities that are discovered, help us to formulate the future plans of enabling high performance data transfer for ALFA on the Intel Xeon Phi architecture.

  18. Implementation of High-Order Multireference Coupled-Cluster Methods on Intel Many Integrated Core Architecture.

    Science.gov (United States)

    Aprà, E; Kowalski, K

    2016-03-08

    In this paper we discuss the implementation of multireference coupled-cluster formalism with singles, doubles, and noniterative triples (MRCCSD(T)), which is capable of taking advantage of the processing power of the Intel Xeon Phi coprocessor. We discuss the integration of two levels of parallelism underlying the MRCCSD(T) implementation with computational kernels designed to offload the computationally intensive parts of the MRCCSD(T) formalism to Intel Xeon Phi coprocessors. Special attention is given to the enhancement of the parallel performance by task reordering that has improved load balancing in the noniterative part of the MRCCSD(T) calculations. We also discuss aspects regarding efficient optimization and vectorization strategies.

  19. Evaluation of the Intel Xeon Phi Co-processor to accelerate the sensitivity map calculation for PET imaging

    Science.gov (United States)

    Dey, T.; Rodrigue, P.

    2015-07-01

    We aim to evaluate the Intel Xeon Phi coprocessor for acceleration of 3D Positron Emission Tomography (PET) image reconstruction. We focus on the sensitivity map calculation as one computational intensive part of PET image reconstruction, since it is a promising candidate for acceleration with the Many Integrated Core (MIC) architecture of the Xeon Phi. The computation of the voxels in the field of view (FoV) can be done in parallel and the 103 to 104 samples needed to calculate the detection probability of each voxel can take advantage of vectorization. We use the ray tracing kernels of the Embree project to calculate the hit points of the sample rays with the detector and in a second step the sum of the radiological path taking into account attenuation is determined. The core components are implemented using the Intel single instruction multiple data compiler (ISPC) to enable a portable implementation showing efficient vectorization either on the Xeon Phi and the Host platform. On the Xeon Phi, the calculation of the radiological path is also implemented in hardware specific intrinsic instructions (so-called `intrinsics') to allow manually-optimized vectorization. For parallelization either OpenMP and ISPC tasking (based on pthreads) are evaluated.Our implementation achieved a scalability factor of 0.90 on the Xeon Phi coprocessor (model 5110P) with 60 cores at 1 GHz. Only minor differences were found between parallelization with OpenMP and the ISPC tasking feature. The implementation using intrinsics was found to be about 12% faster than the portable ISPC version. With this version, a speedup of 1.43 was achieved on the Xeon Phi coprocessor compared to the host system (HP SL250s Gen8) equipped with two Xeon (E5-2670) CPUs, with 8 cores at 2.6 to 3.3 GHz each. Using a second Xeon Phi card the speedup could be further increased to 2.77. No significant differences were found between the results of the different Xeon Phi and the Host implementations. The examination

  20. Evaluation of the Intel Xeon Phi Co-processor to accelerate the sensitivity map calculation for PET imaging

    International Nuclear Information System (INIS)

    Dey, T.; Rodrigue, P.

    2015-01-01

    We aim to evaluate the Intel Xeon Phi coprocessor for acceleration of 3D Positron Emission Tomography (PET) image reconstruction. We focus on the sensitivity map calculation as one computational intensive part of PET image reconstruction, since it is a promising candidate for acceleration with the Many Integrated Core (MIC) architecture of the Xeon Phi. The computation of the voxels in the field of view (FoV) can be done in parallel and the 10 3 to 10 4 samples needed to calculate the detection probability of each voxel can take advantage of vectorization. We use the ray tracing kernels of the Embree project to calculate the hit points of the sample rays with the detector and in a second step the sum of the radiological path taking into account attenuation is determined. The core components are implemented using the Intel single instruction multiple data compiler (ISPC) to enable a portable implementation showing efficient vectorization either on the Xeon Phi and the Host platform. On the Xeon Phi, the calculation of the radiological path is also implemented in hardware specific intrinsic instructions (so-called 'intrinsics') to allow manually-optimized vectorization. For parallelization either OpenMP and ISPC tasking (based on pthreads) are evaluated.Our implementation achieved a scalability factor of 0.90 on the Xeon Phi coprocessor (model 5110P) with 60 cores at 1 GHz. Only minor differences were found between parallelization with OpenMP and the ISPC tasking feature. The implementation using intrinsics was found to be about 12% faster than the portable ISPC version. With this version, a speedup of 1.43 was achieved on the Xeon Phi coprocessor compared to the host system (HP SL250s Gen8) equipped with two Xeon (E5-2670) CPUs, with 8 cores at 2.6 to 3.3 GHz each. Using a second Xeon Phi card the speedup could be further increased to 2.77. No significant differences were found between the results of the different Xeon Phi and the Host implementations. The

  1. HAL/S-FC compiler system specifications

    Science.gov (United States)

    1976-01-01

    This document specifies the informational interfaces within the HAL/S-FC compiler, and between the compiler and the external environment. This Compiler System Specification is for the HAL/S-FC compiler and its associated run time facilities which implement the full HAL/S language. The HAL/S-FC compiler is designed to operate stand-alone on any compatible IBM 360/370 computer and within the Software Development Laboratory (SDL) at NASA/JSC, Houston, Texas.

  2. Computer applications in physics with FORTRAN, BASIC and C

    CERN Document Server

    Chandra, Suresh

    2014-01-01

    Because of encouraging response for first two editions of the book and for taking into account valuable suggestion from teachers as well as students, the text for Interpolation, Differentiation, Integration, Roots of an Equation, Solution of Simultaneous Equations, Eigenvalues and Eigenvectors of Matrix, Solution of Differential Equations, Solution of Partial Differential Equations, Monte Carlo Method and Simulation, Computation of some Functions is improved throughout and presented in a more systematic manner by using simple language. These techniques have vast applications in Science, Engineering and Technology. The C language is becoming popular in universities, colleges and engineering institutions. Besides the C language, programs are written in FORTRAN and BASIC languages. Consequently, this book has rather wide scope for its use. Each of the topics are developed in a systematic manner; thus making this book useful for graduate, postgraduate and engineering students. KEY FEATURES: Each topic is self exp...

  3. Accelerating the Pace of Protein Functional Annotation With Intel Xeon Phi Coprocessors.

    Science.gov (United States)

    Feinstein, Wei P; Moreno, Juana; Jarrell, Mark; Brylinski, Michal

    2015-06-01

    Intel Xeon Phi is a new addition to the family of powerful parallel accelerators. The range of its potential applications in computationally driven research is broad; however, at present, the repository of scientific codes is still relatively limited. In this study, we describe the development and benchmarking of a parallel version of eFindSite, a structural bioinformatics algorithm for the prediction of ligand-binding sites in proteins. Implemented for the Intel Xeon Phi platform, the parallelization of the structure alignment portion of eFindSite using pragma-based OpenMP brings about the desired performance improvements, which scale well with the number of computing cores. Compared to a serial version, the parallel code runs 11.8 and 10.1 times faster on the CPU and the coprocessor, respectively; when both resources are utilized simultaneously, the speedup is 17.6. For example, ligand-binding predictions for 501 benchmarking proteins are completed in 2.1 hours on a single Stampede node equipped with the Intel Xeon Phi card compared to 3.1 hours without the accelerator and 36.8 hours required by a serial version. In addition to the satisfactory parallel performance, porting existing scientific codes to the Intel Xeon Phi architecture is relatively straightforward with a short development time due to the support of common parallel programming models by the coprocessor. The parallel version of eFindSite is freely available to the academic community at www.brylinski.org/efindsite.

  4. Game-Based Experiential Learning in Online Management Information Systems Classes Using Intel's IT Manager 3

    Science.gov (United States)

    Bliemel, Michael; Ali-Hassan, Hossam

    2014-01-01

    For several years, we used Intel's flash-based game "IT Manager 3: Unseen Forces" as an experiential learning tool, where students had to act as a manager making real-time prioritization decisions about repairing computer problems, training and upgrading systems with better technologies as well as managing increasing numbers of technical…

  5. Newsgroups, Activist Publics, and Corporate Apologia: The Case of Intel and Its Pentium Chip.

    Science.gov (United States)

    Hearit, Keith Michael

    1999-01-01

    Applies J. Grunig's theory of publics to the phenomenon of Internet newsgroups using the case of the flawed Intel Pentium chip. Argues that technology facilitates the rapid movement of publics from the theoretical construct stage to the active stage. Illustrates some of the difficulties companies face in establishing their identity in cyberspace.…

  6. Why K-12 IT Managers and Administrators Are Embracing the Intel-Based Mac

    Science.gov (United States)

    Technology & Learning, 2007

    2007-01-01

    Over the past year, Apple has dramatically increased its share of the school computer marketplace--especially in the category of notebook computers. A recent study conducted by Grunwald Associates and Rockman et al. reports that one of the major reasons for this growth is Apple's introduction of the Intel processor to the entire line of Mac…

  7. Performance tuning Weather Research and Forecasting (WRF) Goddard longwave radiative transfer scheme on Intel Xeon Phi

    Science.gov (United States)

    Mielikainen, Jarno; Huang, Bormin; Huang, Allen H.

    2015-10-01

    Next-generation mesoscale numerical weather prediction system, the Weather Research and Forecasting (WRF) model, is a designed for dual use for forecasting and research. WRF offers multiple physics options that can be combined in any way. One of the physics options is radiance computation. The major source for energy for the earth's climate is solar radiation. Thus, it is imperative to accurately model horizontal and vertical distribution of the heating. Goddard solar radiative transfer model includes the absorption duo to water vapor,ozone, ozygen, carbon dioxide, clouds and aerosols. The model computes the interactions among the absorption and scattering by clouds, aerosols, molecules and surface. Finally, fluxes are integrated over the entire longwave spectrum.In this paper, we present our results of optimizing the Goddard longwave radiative transfer scheme on Intel Many Integrated Core Architecture (MIC) hardware. The Intel Xeon Phi coprocessor is the first product based on Intel MIC architecture, and it consists of up to 61 cores connected by a high performance on-die bidirectional interconnect. The coprocessor supports all important Intel development tools. Thus, the development environment is familiar one to a vast number of CPU developers. Although, getting a maximum performance out of MICs will require using some novel optimization techniques. Those optimization techniques are discusses in this paper. The optimizations improved the performance of the original Goddard longwave radiative transfer scheme on Xeon Phi 7120P by a factor of 2.2x. Furthermore, the same optimizations improved the performance of the Goddard longwave radiative transfer scheme on a dual socket configuration of eight core Intel Xeon E5-2670 CPUs by a factor of 2.1x compared to the original Goddard longwave radiative transfer scheme code.

  8. Nonalgebraic symbol manipulators for use in scientific and engineering modeling: introducing the FORSE (FORtran Symobol Expander)

    International Nuclear Information System (INIS)

    Schultz, J.H.; Lettvin, J.D.

    1982-02-01

    A system of nonalgebraic symbol manipulators, called The FORSE (FORtran Symbol Expanders) has been developed to document and prepare input-output for Fortran programs. The use of documentation at the level of the individual equation is defended. The operation of The FORSE is described, along with user instructions and a worked example

  9. PIG 3 - A simple compiler for mercury

    International Nuclear Information System (INIS)

    Bindon, D.C.

    1961-06-01

    A short machine language compilation scheme is described; which will read programmes from paper tape, punched cards, or magnetic tape. The compiler occupies pages 8-15 of the ferrite store during translation. (author)

  10. PIG 3 - A simple compiler for mercury

    Energy Technology Data Exchange (ETDEWEB)

    Bindon, D C [Computer Branch, Technical Assessments and Services Division, Atomic Energy Establishment, Winfrith, Dorchester, Dorset (United Kingdom)

    1961-06-15

    A short machine language compilation scheme is described; which will read programmes from paper tape, punched cards, or magnetic tape. The compiler occupies pages 8-15 of the ferrite store during translation. (author)

  11. Proving correctness of compilers using structured graphs

    DEFF Research Database (Denmark)

    Bahr, Patrick

    2014-01-01

    it into a compiler implementation using a graph type along with a correctness proof. The implementation and correctness proof of a compiler using a tree type without explicit jumps is simple, but yields code duplication. Our method provides a convenient way of improving such a compiler without giving up the benefits...

  12. Scalability of Parallel Spatial Direct Numerical Simulations on Intel Hypercube and IBM SP1 and SP2

    Science.gov (United States)

    Joslin, Ronald D.; Hanebutte, Ulf R.; Zubair, Mohammad

    1995-01-01

    The implementation and performance of a parallel spatial direct numerical simulation (PSDNS) approach on the Intel iPSC/860 hypercube and IBM SP1 and SP2 parallel computers is documented. Spatially evolving disturbances associated with the laminar-to-turbulent transition in boundary-layer flows are computed with the PSDNS code. The feasibility of using the PSDNS to perform transition studies on these computers is examined. The results indicate that PSDNS approach can effectively be parallelized on a distributed-memory parallel machine by remapping the distributed data structure during the course of the calculation. Scalability information is provided to estimate computational costs to match the actual costs relative to changes in the number of grid points. By increasing the number of processors, slower than linear speedups are achieved with optimized (machine-dependent library) routines. This slower than linear speedup results because the computational cost is dominated by FFT routine, which yields less than ideal speedups. By using appropriate compile options and optimized library routines on the SP1, the serial code achieves 52-56 M ops on a single node of the SP1 (45 percent of theoretical peak performance). The actual performance of the PSDNS code on the SP1 is evaluated with a "real world" simulation that consists of 1.7 million grid points. One time step of this simulation is calculated on eight nodes of the SP1 in the same time as required by a Cray Y/MP supercomputer. For the same simulation, 32-nodes of the SP1 and SP2 are required to reach the performance of a Cray C-90. A 32 node SP1 (SP2) configuration is 2.9 (4.6) times faster than a Cray Y/MP for this simulation, while the hypercube is roughly 2 times slower than the Y/MP for this application. KEY WORDS: Spatial direct numerical simulations; incompressible viscous flows; spectral methods; finite differences; parallel computing.

  13. Investigating the Use of the Intel Xeon Phi for Event Reconstruction

    Science.gov (United States)

    Sherman, Keegan; Gilfoyle, Gerard

    2014-09-01

    The physics goal of Jefferson Lab is to understand how quarks and gluons form nuclei and it is being upgraded to a higher, 12-GeV beam energy. The new CLAS12 detector in Hall B will collect 5-10 terabytes of data per day and will require considerable computing resources. We are investigating tools, such as the Intel Xeon Phi, to speed up the event reconstruction. The Kalman Filter is one of the methods being studied. It is a linear algebra algorithm that estimates the state of a system by combining existing data and predictions of those measurements. The tools required to apply this technique (i.e. matrix multiplication, matrix inversion) are being written using C++ intrinsics for Intel's Xeon Phi Coprocessor, which uses the Many Integrated Cores (MIC) architecture. The Intel MIC is a new high-performance chip that connects to a host machine through the PCIe bus and is built to run highly vectorized and parallelized code making it a well-suited device for applications such as the Kalman Filter. Our tests of the MIC optimized algorithms needed for the filter show significant increases in speed. For example, matrix multiplication of 5x5 matrices on the MIC was able to run up to 69 times faster than the host core. The physics goal of Jefferson Lab is to understand how quarks and gluons form nuclei and it is being upgraded to a higher, 12-GeV beam energy. The new CLAS12 detector in Hall B will collect 5-10 terabytes of data per day and will require considerable computing resources. We are investigating tools, such as the Intel Xeon Phi, to speed up the event reconstruction. The Kalman Filter is one of the methods being studied. It is a linear algebra algorithm that estimates the state of a system by combining existing data and predictions of those measurements. The tools required to apply this technique (i.e. matrix multiplication, matrix inversion) are being written using C++ intrinsics for Intel's Xeon Phi Coprocessor, which uses the Many Integrated Cores (MIC

  14. On the Automatic Parallelization of Sparse and Irregular Fortran Programs

    Directory of Open Access Journals (Sweden)

    Yuan Lin

    1999-01-01

    Full Text Available Automatic parallelization is usually believed to be less effective at exploiting implicit parallelism in sparse/irregular programs than in their dense/regular counterparts. However, not much is really known because there have been few research reports on this topic. In this work, we have studied the possibility of using an automatic parallelizing compiler to detect the parallelism in sparse/irregular programs. The study with a collection of sparse/irregular programs led us to some common loop patterns. Based on these patterns new techniques were derived that produced good speedups when manually applied to our benchmark codes. More importantly, these parallelization methods can be implemented in a parallelizing compiler and can be applied automatically.

  15. GKS-EZ programming manual for FORTRAN-77

    Energy Technology Data Exchange (ETDEWEB)

    Beach, R.C.

    1992-01-01

    A standard has now been adopted for subroutine packages that drive graphic devices. It is known as the Graphical Kernel system (GKS), and many commercial implementations of it are available. Unfortunately, it is a difficult system to learn, and certain functions that are important for scientific use are not provided. Although GKS can be used to achieve portability of graphic applications between graphic devices, computers, and operating systems, it can also be misused in this respect. In addition, it introduces the very real problem of portability between the various implementations of GKS. This document describes a set of FORTRAN-77 subroutines that may be used to control a wide variety of graphic devices and overcome most of these problems. Some of these subroutines are from GKS itself, while others are higher-level subroutines that call GKS subroutines. These subroutines are collectively known as GKS-EZ. The purpose is to supply someone who is not a specialist in computer graphics with a flexible, robust, and easy to learn graphics system. Users of GKS-EZ should not have much need for a full GKS manual; this document will supply all of the information to use GKS-EZ except for a few items. These missing items include the numeric identification of the supported graphic devices and the procedure for linking the GKS subroutines into a executable module.

  16. SPECFUN1, Portable Special FORTRAN Routines with Test Drivers

    International Nuclear Information System (INIS)

    Cody, W.J.

    1993-01-01

    1 - Description of program or function: SPECFUN is a collection of transportable FORTRAN subroutines and test drivers to evaluate certain special functions. The individual subroutines are - Name/Description: ALGAMA: Log gamma function, DAW: Dawson's integral, EI: Exponential integrals, ERF: Error function, ERFC: Complementary error function, GAMMA: Gamma function, I0: Bessel function I-sub-0, I1: Bessel function I-sub-1, J0Y0: Bessel functions J-sub-0 and Y-sub-0, J1Y1: Bessel functions J-sub-1 and Y-sub-1, K0: Bessel function K-sub-0, K1: Bessel function K-sub-1, PSI: Logarithmic derivative of the gamma function, REN: Random number generator, RIBESL: Bessel function I-sub-(N,ALPHA), RJBESL: Bessel function J-sub-(N,ALPHA), RKBESL: Bessel function K-sub-(N,ALPHA), RYBESL: Bessel function Y-sub-(N,ALPHA), MACHAR: Machine-dependent constants. 2 - Method of solution: SPECFUN generally uses rational mini-max approximations for functions of one variable and recurrence relations for functions of two or more variables. 3 - Restrictions on the complexity of the problem: Accuracy is targeted at between 18 and 20 significant decimal digits

  17. VIDEO-PC, SVGA Routines for FORTRAN on PC

    International Nuclear Information System (INIS)

    1994-01-01

    1 - Description of program or function: These primitives are the lowest level routines needed to perform super VGA graphics on a PC. A sample main program is included that exercises the primitives. Both Lahey and Microsoft FORTRAN's have graphics libraries. However, the libraries do not support 256 color graphics at resolutions greater than 320x200. The primitives bypass these libraries while still conforming to standard usage of BIOS. The supported graphics modes depend upon the PC graphics card and its memory. Super VGA resolutions of 640x480 and 800x600 have been tested on an ATI VGA Wonder card with 512 K memory and on several 80486 PC's (unknown manufacturers) at retail stores. 2 - Method of solution: Both Lahey and Microsoft primitives depend upon sending the correct parameters to the PC BIOS (basic input output system) as discussed in the references. 3 - Restrictions on the complexity of the problem: The primitives were developed for 256 color VGA applications. It is known, for instance, that some CGA and 16 color VGA video modes will not operate correctly using these primitives. Potential users should try the test program on their PC/video card combination to determine applicability

  18. RELABEL2007, Labels FORTRAN Statements in ENDF Format Processing Programs

    International Nuclear Information System (INIS)

    2007-01-01

    1 - Description of program or function: RELABEL labels a ENDF/B pre-processing program so that statement labels are in increasing order in increments of 10 within each routine, and cards are identified in columns 73-80 by three alphanumeric characters in columns 73-75 and sequence numbers in columns 76-80 in increments of 10. IAEA1314/10: This version include the updates up to January 30, 2007. Changes in ENDF/B-VII Format and procedures, as well as the evaluations themselves, make it impossible for versions of the ENDF/B pre-processing codes earlier than PREPRO 2007 (2007 Version) to accurately process current ENDF/B-VII evaluations. The present code can handle all existing ENDF/B-VI evaluations through release 8, which will be the last release of ENDF/B-VI. Modifications from previous versions: Relabel VERS. 2007-1 (JAN. 2007): No change since March 2004 version 2 - Method of solution: 3 - Restrictions on the complexity of the problem: RELABEL is designed to maintain ENDF/B processing programs which use a restricted set of FORTRAN statements. As such, this program is not completely general

  19. MODLIB, library of Fortran modules for nuclear reaction codes

    International Nuclear Information System (INIS)

    Talou, Patrick

    2006-01-01

    1 - Description of program or function: ModLib is a library of Fortran (90-compatible) modules to be used in existing and future nuclear reaction codes. The development of the library is an international effort being undertaken under the auspices of the long-term Subgroup A of the OECD/NEA Working Party on Evaluation and Cooperation. The aim is to constitute a library of well-tested and well-documented pieces of codes that can be used with confidence in all our coding efforts. This effort will undoubtedly help avoid the duplication of work, and most certainly facilitate the very important inter-comparisons between existing codes. 2 - Methods: - Width f luctuations [Talou, Chadwick]: calculates width fluctuation correction factors (output) for a set of transmission coefficients (input). Three methods are available: HRTW, Moldauer, and Verbaarschot (also called GOE approach). So far, no distinction is made according to the type of the coefficients channels (particle emission, gamma-ray emission, fission). - Gamma s trength [Herman]: calculates gamma-ray transmission coefficients using a Giant Resonance formalism. - Level d ensity [Koning]: computes the Gilbert-Cameron-Ignatyuk formalism for the continuum nuclear level density. - CHECKR, FIZCON, INTER, PSYCHE, STANEF [Dunford]: these modules are used in the MODLIB project but are not included in this package. They are available from the NEA Data Bank Computer Program Service under Package Ids: CHECKR (USCD1208), FIZCON (USCD1209), INTER (USCD1212), PSYCHE (USCD1216), STANEF (USCD1218)

  20. Parallelization of MCNP4 code by using simple FORTRAN algorithms

    International Nuclear Information System (INIS)

    Yazid, P.I.; Takano, Makoto; Masukawa, Fumihiro; Naito, Yoshitaka.

    1993-12-01

    Simple FORTRAN algorithms, that rely only on open, close, read and write statements, together with disk files and some UNIX commands have been applied to parallelization of MCNP4. The code, named MCNPNFS, maintains almost all capabilities of MCNP4 in solving shielding problems. It is able to perform parallel computing on a set of any UNIX workstations connected by a network, regardless of the heterogeneity in hardware system, provided that all processors produce a binary file in the same format. Further, it is confirmed that MCNPNFS can be executed also on Monte-4 vector-parallel computer. MCNPNFS has been tested intensively by executing 5 photon-neutron benchmark problems, a spent fuel cask problem and 17 sample problems included in the original code package of MCNP4. Three different workstations, connected by a network, have been used to execute MCNPNFS in parallel. By measuring CPU time, the parallel efficiency is determined to be 58% to 99% and 86% in average. On Monte-4, MCNPNFS has been executed using 4 processors concurrently and has achieved the parallel efficiency of 79% in average. (author)

  1. NLEdit: A generic graphical user interface for Fortran programs

    Science.gov (United States)

    Curlett, Brian P.

    1994-01-01

    NLEdit is a generic graphical user interface for the preprocessing of Fortran namelist input files. The interface consists of a menu system, a message window, a help system, and data entry forms. A form is generated for each namelist. The form has an input field for each namelist variable along with a one-line description of that variable. Detailed help information, default values, and minimum and maximum allowable values can all be displayed via menu picks. Inputs are processed through a scientific calculator program that allows complex equations to be used instead of simple numeric inputs. A custom user interface is generated simply by entering information about the namelist input variables into an ASCII file. There is no need to learn a new graphics system or programming language. NLEdit can be used as a stand-alone program or as part of a larger graphical user interface. Although NLEdit is intended for files using namelist format, it can be easily modified to handle other file formats.

  2. FORTRAN text correction with the CDC-1604-A console typewriter during reading a punched card program

    International Nuclear Information System (INIS)

    Kotorobaj, F.; Ruzhichka, Ya.; Stolyarskij, Yu.V.

    1977-01-01

    The paper describes FORTRAN text correction with the CDC 1604-A console typewriter during reading a punched card program. This method gives one more possibility of FORTRAN program correction during program's input to the CDC 1604-A computer. This essentially reduced the time necessary for punched card correction with other methods. Possibility of inputting desired number of punched cards one after another allows one writing small FORTRAN programs to computer core storage with simultaneous punching of the cards. The correction program has been written to the CDC 1604 COOP monitor

  3. Blocked All-Pairs Shortest Paths Algorithm on Intel Xeon Phi KNL Processor: A Case Study

    OpenAIRE

    Rucci, Enzo; De Giusti, Armando Eduardo; Naiouf, Marcelo

    2017-01-01

    Manycores are consolidating in HPC community as a way of improving performance while keeping power efficiency. Knights Landing is the recently released second generation of Intel Xeon Phi architec- ture.While optimizing applications on CPUs, GPUs and first Xeon Phi’s has been largely studied in the last years, the new features in Knights Landing processors require the revision of programming and optimization techniques for these devices. In this work, we selected the Floyd-Warshall algorithm ...

  4. Benchmarking Data Analysis and Machine Learning Applications on the Intel KNL Many-Core Processor

    OpenAIRE

    Byun, Chansup; Kepner, Jeremy; Arcand, William; Bestor, David; Bergeron, Bill; Gadepally, Vijay; Houle, Michael; Hubbell, Matthew; Jones, Michael; Klein, Anna; Michaleas, Peter; Milechin, Lauren; Mullen, Julie; Prout, Andrew; Rosa, Antonio

    2017-01-01

    Knights Landing (KNL) is the code name for the second-generation Intel Xeon Phi product family. KNL has generated significant interest in the data analysis and machine learning communities because its new many-core architecture targets both of these workloads. The KNL many-core vector processor design enables it to exploit much higher levels of parallelism. At the Lincoln Laboratory Supercomputing Center (LLSC), the majority of users are running data analysis applications such as MATLAB and O...

  5. Accelerating the Global Nested Air Quality Prediction Modeling System (GNAQPMS) model on Intel Xeon Phi processors

    OpenAIRE

    Wang, Hui; Chen, Huansheng; Wu, Qizhong; Lin, Junming; Chen, Xueshun; Xie, Xinwei; Wang, Rongrong; Tang, Xiao; Wang, Zifa

    2017-01-01

    The GNAQPMS model is the global version of the Nested Air Quality Prediction Modelling System (NAQPMS), which is a multi-scale chemical transport model used for air quality forecast and atmospheric environmental research. In this study, we present our work of porting and optimizing the GNAQPMS model on the second generation Intel Xeon Phi processor codename “Knights Landing” (KNL). Compared with the first generation Xeon Phi coprocessor, KNL introduced many new hardware features such as a boo...

  6. Applying the roofline performance model to the intel xeon phi knights landing processor

    OpenAIRE

    Doerfler, D; Deslippe, J; Williams, S; Oliker, L; Cook, B; Kurth, T; Lobet, M; Malas, T; Vay, JL; Vincenti, H

    2016-01-01

    � Springer International Publishing AG 2016. The Roofline Performance Model is a visually intuitive method used to bound the sustained peak floating-point performance of any given arithmetic kernel on any given processor architecture. In the Roofline, performance is nominally measured in floating-point operations per second as a function of arithmetic intensity (operations per byte of data). In this study we determine the Roofline for the Intel Knights Landing (KNL) processor, determining t...

  7. Efficient irregular wavefront propagation algorithms on Intel® Xeon Phi™

    OpenAIRE

    Gomes, Jeremias M.; Teodoro, George; de Melo, Alba; Kong, Jun; Kurc, Tahsin; Saltz, Joel H.

    2015-01-01

    We investigate the execution of the Irregular Wavefront Propagation Pattern (IWPP), a fundamental computing structure used in several image analysis operations, on the Intel® Xeon Phi™ co-processor. An efficient implementation of IWPP on the Xeon Phi is a challenging problem because of IWPP’s irregularity and the use of atomic instructions in the original IWPP algorithm to resolve race conditions. On the Xeon Phi, the use of SIMD and vectorization instructions is critical to attain high perfo...

  8. Performance Engineering for a Medical Imaging Application on the Intel Xeon Phi Accelerator

    OpenAIRE

    Hofmann, Johannes; Treibig, Jan; Hager, Georg; Wellein, Gerhard

    2013-01-01

    We examine the Xeon Phi, which is based on Intel's Many Integrated Cores architecture, for its suitability to run the FDK algorithm--the most commonly used algorithm to perform the 3D image reconstruction in cone-beam computed tomography. We study the challenges of efficiently parallelizing the application and means to enable sensible data sharing between threads despite the lack of a shared last level cache. Apart from parallelization, SIMD vectorization is critical for good performance on t...

  9. DBPQL: A view-oriented query language for the Intel Data Base Processor

    Science.gov (United States)

    Fishwick, P. A.

    1983-01-01

    An interactive query language (BDPQL) for the Intel Data Base Processor (DBP) is defined. DBPQL includes a parser generator package which permits the analyst to easily create and manipulate the query statement syntax and semantics. The prototype language, DBPQL, includes trace and performance commands to aid the analyst when implementing new commands and analyzing the execution characteristics of the DBP. The DBPQL grammar file and associated key procedures are included as an appendix to this report.

  10. Autonomous controller (JCAM 10) for CAMAC crate with 8080 (INTEL) microprocessor

    International Nuclear Information System (INIS)

    Gallice, P.; Mathis, M.

    1975-01-01

    The CAMAC crate autonomous controller JCAM-10 is designed around an INTEL 8080 microprocessor in association with a 5K RAM and 4K REPROM memory. The concept of the module is described, in which data transfers between CAMAC modules and the memory are optimised from software point of view as well as from execution time. In fact, the JCAM-10 is a microcomputer with a set of 1000 peripheral units represented by the CAMAC modules commercially available

  11. Practical Implementation of Lattice QCD Simulation on Intel Xeon Phi Knights Landing

    OpenAIRE

    Kanamori, Issaku; Matsufuru, Hideo

    2017-01-01

    We investigate implementation of lattice Quantum Chromodynamics (QCD) code on the Intel Xeon Phi Knights Landing (KNL). The most time consuming part of the numerical simulations of lattice QCD is a solver of linear equation for a large sparse matrix that represents the strong interaction among quarks. To establish widely applicable prescriptions, we examine rather general methods for the SIMD architecture of KNL, such as using intrinsics and manual prefetching, to the matrix multiplication an...

  12. Acceleration of Blender Cycles Path-Tracing Engine Using Intel Many Integrated Core Architecture

    OpenAIRE

    Jaroš , Milan; Říha , Lubomír; Strakoš , Petr; Karásek , Tomáš; Vašatová , Alena; Jarošová , Marta; Kozubek , Tomáš

    2015-01-01

    Part 2: Algorithms; International audience; This paper describes the acceleration of the most computationally intensive kernels of the Blender rendering engine, Blender Cycles, using Intel Many Integrated Core architecture (MIC). The proposed parallelization, which uses OpenMP technology, also improves the performance of the rendering engine when running on multi-core CPUs and multi-socket servers. Although the GPU acceleration is already implemented in Cycles, its functionality is limited. O...

  13. Real-time data acquisition and feedback control using Linux Intel computers

    International Nuclear Information System (INIS)

    Penaflor, B.G.; Ferron, J.R.; Piglowski, D.A.; Johnson, R.D.; Walker, M.L.

    2006-01-01

    This paper describes the experiences of the DIII-D programming staff in adapting Linux based Intel computing hardware for use in real-time data acquisition and feedback control systems. Due to the highly dynamic and unstable nature of magnetically confined plasmas in tokamak fusion experiments, real-time data acquisition and feedback control systems are in routine use with all major tokamaks. At DIII-D, plasmas are created and sustained using a real-time application known as the digital plasma control system (PCS). During each experiment, the PCS periodically samples data from hundreds of diagnostic signals and provides these data to control algorithms implemented in software. These algorithms compute the necessary commands to send to various actuators that affect plasma performance. The PCS consists of a group of rack mounted Intel Xeon computer systems running an in-house customized version of the Linux operating system tailored specifically to meet the real-time performance needs of the plasma experiments. This paper provides a more detailed description of the real-time computing hardware and custom developed software, including recent work to utilize dual Intel Xeon equipped computers within the PCS

  14. Implementation of an Agent-Based Parallel Tissue Modelling Framework for the Intel MIC Architecture

    Directory of Open Access Journals (Sweden)

    Maciej Cytowski

    2017-01-01

    Full Text Available Timothy is a novel large scale modelling framework that allows simulating of biological processes involving different cellular colonies growing and interacting with variable environment. Timothy was designed for execution on massively parallel High Performance Computing (HPC systems. The high parallel scalability of the implementation allows for simulations of up to 109 individual cells (i.e., simulations at tissue spatial scales of up to 1 cm3 in size. With the recent advancements of the Timothy model, it has become critical to ensure appropriate performance level on emerging HPC architectures. For instance, the introduction of blood vessels supplying nutrients to the tissue is a very important step towards realistic simulations of complex biological processes, but it greatly increased the computational complexity of the model. In this paper, we describe the process of modernization of the application in order to achieve high computational performance on HPC hybrid systems based on modern Intel® MIC architecture. Experimental results on the Intel Xeon Phi™ coprocessor x100 and the Intel Xeon Phi processor x200 are presented.

  15. Experience with Intel's many integrated core architecture in ATLAS software

    International Nuclear Information System (INIS)

    Fleischmann, S; Neumann, M; Kama, S; Lavrijsen, W; Vitillo, R

    2014-01-01

    Intel recently released the first commercial boards of its Many Integrated Core (MIC) Architecture. MIC is Intel's solution for the domain of throughput computing, currently dominated by general purpose programming on graphics processors (GPGPU). MIC allows the use of the more familiar x86 programming model and supports standard technologies such as OpenMP, MPI, and Intel's Threading Building Blocks (TBB). This should make it possible to develop for both throughput and latency devices using a single code base. In ATLAS Software, track reconstruction has been shown to be a good candidate for throughput computing on GPGPU devices. In addition, the newly proposed offline parallel event-processing framework, GaudiHive, uses TBB for task scheduling. The MIC is thus, in principle, a good fit for this domain. In this paper, we report our experiences of porting to and optimizing ATLAS tracking algorithms for the MIC, comparing the programmability and relative cost/performance of the MIC against those of current GPGPUs and latency-optimized CPUs.

  16. Scaling deep learning workloads: NVIDIA DGX-1/Pascal and Intel Knights Landing

    Energy Technology Data Exchange (ETDEWEB)

    Gawande, Nitin A.; Landwehr, Joshua B.; Daily, Jeffrey A.; Tallent, Nathan R.; Vishnu, Abhinav; Kerbyson, Darren J.

    2017-08-24

    Deep Learning (DL) algorithms have become ubiquitous in data analytics. As a result, major computing vendors --- including NVIDIA, Intel, AMD, and IBM --- have architectural road-maps influenced by DL workloads. Furthermore, several vendors have recently advertised new computing products as accelerating large DL workloads. Unfortunately, it is difficult for data scientists to quantify the potential of these different products. This paper provides a performance and power analysis of important DL workloads on two major parallel architectures: NVIDIA DGX-1 (eight Pascal P100 GPUs interconnected with NVLink) and Intel Knights Landing (KNL) CPUs interconnected with Intel Omni-Path or Cray Aries. Our evaluation consists of a cross section of convolutional neural net workloads: CifarNet, AlexNet, GoogLeNet, and ResNet50 topologies using the Cifar10 and ImageNet datasets. The workloads are vendor-optimized for each architecture. Our analysis indicates that although GPUs provide the highest overall performance, the gap can close for some convolutional networks; and the KNL can be competitive in performance/watt. We find that NVLink facilitates scaling efficiency on GPUs. However, its importance is heavily dependent on neural network architecture. Furthermore, for weak-scaling --- sometimes encouraged by restricted GPU memory --- NVLink is less important.

  17. Scaling Deep Learning Workloads: NVIDIA DGX-1/Pascal and Intel Knights Landing

    Energy Technology Data Exchange (ETDEWEB)

    Gawande, Nitin A.; Landwehr, Joshua B.; Daily, Jeffrey A.; Tallent, Nathan R.; Vishnu, Abhinav; Kerbyson, Darren J.

    2017-07-03

    Deep Learning (DL) algorithms have become ubiquitous in data analytics. As a result, major computing vendors --- including NVIDIA, Intel, AMD and IBM --- have architectural road-maps influenced by DL workloads. Furthermore, several vendors have recently advertised new computing products as accelerating DL workloads. Unfortunately, it is difficult for data scientists to quantify the potential of these different products. This paper provides a performance and power analysis of important DL workloads on two major parallel architectures: NVIDIA DGX-1 (eight Pascal P100 GPUs interconnected with NVLink) and Intel Knights Landing (KNL) CPUs interconnected with Intel Omni-Path. Our evaluation consists of a cross section of convolutional neural net workloads: CifarNet, CaffeNet, AlexNet and GoogleNet topologies using the Cifar10 and ImageNet datasets. The workloads are vendor optimized for each architecture. GPUs provide the highest overall raw performance. Our analysis indicates that although GPUs provide the highest overall performance, the gap can close for some convolutional networks; and KNL can be competitive when considering performance/watt. Furthermore, NVLink is critical to GPU scaling.

  18. FORTRAN data files transference from VAX/VMS to ALPHA/UNIX; Traspaso de ficheros FORTRAN de datos de VAX/VMS a ALPHA/UNIX

    Energy Technology Data Exchange (ETDEWEB)

    Sanchez, E.; Milligen, B. Ph van [CIEMAT (Spain)

    1997-09-01

    Several tools have been developed to access the TJ-IU databases, which currently reside in VAX/VMS servers, from the TJ-II Data Acquisition System DEC ALPHA 8400 server. The TJ-I/TJ-IU databases are not homogeneous and contain several types of data files, namely, SADE, CAMAC and FORTRAN unformatted files. The tools presented in this report allow one to transfer CAMAC and those FORTRAN unformatted files defined herein, from a VAX/VMS server, for data manipulation on the ALPHA/Digital UNIX server. (Author)

  19. Comparison of PASCAL and FORTRAN for solving problems in the physical sciences

    Science.gov (United States)

    Watson, V. R.

    1981-01-01

    The paper compares PASCAL and FORTRAN for problem solving in the physical sciences, due to requests NASA has received to make PASCAL available on the Numerical Aerodynamic Simulator (scheduled to be operational in 1986). PASCAL disadvantages include the lack of scientific utility procedures equivalent to the IBM scientific subroutine package or the IMSL package which are available in FORTRAN. Advantages include a well-organized, easy to read and maintain writing code, range checking to prevent errors, and a broad selection of data types. It is concluded that FORTRAN may be the better language, although ADA (patterned after PASCAL) may surpass FORTRAN due to its ability to add complex and vector math, and the specify the precision and range of variables.

  20. FORTRAN programs for transient eddy current calculations using a perturbation-polynomial expansion technique

    International Nuclear Information System (INIS)

    Carpenter, K.H.

    1976-11-01

    A description is given of FORTRAN programs for transient eddy current calculations in thin, non-magnetic conductors using a perturbation-polynomial expansion technique. Basic equations are presented as well as flow charts for the programs implementing them. The implementation is in two steps--a batch program to produce an intermediate data file and interactive programs to produce graphical output. FORTRAN source listings are included for all program elements, and sample inputs and outputs are given for the major programs

  1. A Class-Specific Optimizing Compiler

    Directory of Open Access Journals (Sweden)

    Michael D. Sharp

    1993-01-01

    Full Text Available Class-specific optimizations are compiler optimizations specified by the class implementor to the compiler. They allow the compiler to take advantage of the semantics of the particular class so as to produce better code. Optimizations of interest include the strength reduction of class:: array address calculations, elimination of large temporaries, and the placement of asynchronous send/recv calls so as to achieve computation/communication overlap. We will outline our progress towards the implementation of a C++ compiler capable of incorporating class-specific optimizations.

  2. A Symmetric Approach to Compilation and Decompilation

    DEFF Research Database (Denmark)

    Ager, Mads Sig; Danvy, Olivier; Goldberg, Mayer

    2002-01-01

    Just as an interpreter for a source language can be turned into a compiler from the source language to a target language, we observe that an interpreter for a target language can be turned into a compiler from the target language to a source language. In both cases, the key issue is the choice of...

  3. An exploratory discussion on business files compilation

    International Nuclear Information System (INIS)

    Gao Chunying

    2014-01-01

    Business files compilation for an enterprise is a distillation and recreation of its spiritual wealth, from which the applicable information can be available to those who want to use it in a fast, extensive and precise way. Proceeding from the effects of business files compilation on scientific researches, productive constructions and developments, this paper in five points discusses the way how to define topics, analyze historical materials, search or select data and process it to an enterprise archives collection. Firstly, to expound the importance and necessity of business files compilation in production, operation and development of an company; secondly, to present processing methods from topic definition, material searching and data selection to final examination and correction; thirdly, to define principle and classification in order to make different categories and levels of processing methods available to business files compilation; fourthly, to discuss the specific method how to implement a file compilation through a documentation collection upon principle of topic definition gearing with demand; fifthly, to address application of information technology to business files compilation in view point of widely needs for business files so as to level up enterprise archives management. The present discussion focuses on the examination and correction principle of enterprise historical material compilation and the basic classifications as well as the major forms of business files compilation achievements. (author)

  4. Evaluation of vectorization potential of Graph500 on Intel's Xeon Phi

    OpenAIRE

    Stanic, Milan; Palomar, Oscar; Ratkovic, Ivan; Duric, Milovan; Unsal, Osman; Cristal, Adrian; Valero, Mateo

    2014-01-01

    Graph500 is a data intensive application for high performance computing and it is an increasingly important workload because graphs are a core part of most analytic applications. So far there is no work that examines if Graph500 is suitable for vectorization mostly due a lack of vector memory instructions for irregular memory accesses. The Xeon Phi is a massively parallel processor recently released by Intel with new features such as a wide 512-bit vector unit and vector scatter/gather instru...

  5. Performance Analysis of an Astrophysical Simulation Code on the Intel Xeon Phi Architecture

    OpenAIRE

    Noormofidi, Vahid; Atlas, Susan R.; Duan, Huaiyu

    2015-01-01

    We have developed the astrophysical simulation code XFLAT to study neutrino oscillations in supernovae. XFLAT is designed to utilize multiple levels of parallelism through MPI, OpenMP, and SIMD instructions (vectorization). It can run on both CPU and Xeon Phi co-processors based on the Intel Many Integrated Core Architecture (MIC). We analyze the performance of XFLAT on configurations with CPU only, Xeon Phi only and both CPU and Xeon Phi. We also investigate the impact of I/O and the multi-n...

  6. Optimizing the MapReduce Framework on Intel Xeon Phi Coprocessor

    OpenAIRE

    Lu, Mian; Zhang, Lei; Huynh, Huynh Phung; Ong, Zhongliang; Liang, Yun; He, Bingsheng; Goh, Rick Siow Mong; Huynh, Richard

    2013-01-01

    With the ease-of-programming, flexibility and yet efficiency, MapReduce has become one of the most popular frameworks for building big-data applications. MapReduce was originally designed for distributed-computing, and has been extended to various architectures, e,g, multi-core CPUs, GPUs and FPGAs. In this work, we focus on optimizing the MapReduce framework on Xeon Phi, which is the latest product released by Intel based on the Many Integrated Core Architecture. To the best of our knowledge...

  7. Mashup d'aplicacions basat en un buscador intel·ligent

    OpenAIRE

    Sancho Piqueras, Javier

    2010-01-01

    Mashup de funcionalitats, basat en un cercador intel·ligent, en aquest cas pensat per a cursos, carreres màsters, etc. La finalitat és adjuntar diverses aplicacions amb l'únic propòsit que en aquest cas és un buscador però que també ens permet utilitzar eines per a la connectivitat mitjançant web Services, o xarxes socials. Mashup de funcionalidades, basado en un buscador inteligente, en este caso pensado para cursos, carreras másters, etc. La finalidad es juntar diversas aplicaciones con ...

  8. Profiling CPU-bound workloads on Intel Haswell-EP platforms

    CERN Document Server

    Guerri, Marco; Cristovao, Cordeiro; CERN. Geneva. IT Department

    2017-01-01

    With the increasing adoption of public and private cloud resources to support the demands in terms of computing capacity of the WLCG, the HEP community has begun studying several benchmarking applications aimed at continuously assessing the performance of virtual machines procured from commercial providers. In order to characterise the behaviour of these benchmarks, in-depth profiling activities have been carried out. In this document we outline our experience in profiling one specific application, the ATLAS Kit Validation, in an attempt to explain an unexpected distribution in the performance samples obtained on systems based on Intel Haswell-EP processors.

  9. A new shared-memory programming paradigm for molecular dynamics simulations on the Intel Paragon

    International Nuclear Information System (INIS)

    D'Azevedo, E.F.; Romine, C.H.

    1994-12-01

    This report describes the use of shared memory emulation with DOLIB (Distributed Object Library) to simplify parallel programming on the Intel Paragon. A molecular dynamics application is used as an example to illustrate the use of the DOLIB shared memory library. SOTON-PAR, a parallel molecular dynamics code with explicit message-passing using a Lennard-Jones 6-12 potential, is rewritten using DOLIB primitives. The resulting code has no explicit message primitives and resembles a serial code. The new code can perform dynamic load balancing and achieves better performance than the original parallel code with explicit message-passing

  10. Optimizing the updated Goddard shortwave radiation Weather Research and Forecasting (WRF) scheme for Intel Many Integrated Core (MIC) architecture

    Science.gov (United States)

    Mielikainen, Jarno; Huang, Bormin; Huang, Allen H.-L.

    2015-05-01

    Intel Many Integrated Core (MIC) ushers in a new era of supercomputing speed, performance, and compatibility. It allows the developers to run code at trillions of calculations per second using the familiar programming model. In this paper, we present our results of optimizing the updated Goddard shortwave radiation Weather Research and Forecasting (WRF) scheme on Intel Many Integrated Core Architecture (MIC) hardware. The Intel Xeon Phi coprocessor is the first product based on Intel MIC architecture, and it consists of up to 61 cores connected by a high performance on-die bidirectional interconnect. The co-processor supports all important Intel development tools. Thus, the development environment is familiar one to a vast number of CPU developers. Although, getting a maximum performance out of Xeon Phi will require using some novel optimization techniques. Those optimization techniques are discusses in this paper. The results show that the optimizations improved performance of the original code on Xeon Phi 7120P by a factor of 1.3x.

  11. Extending R packages to support 64-bit compiled code: An illustration with spam64 and GIMMS NDVI3g data

    Science.gov (United States)

    Gerber, Florian; Mösinger, Kaspar; Furrer, Reinhard

    2017-07-01

    Software packages for spatial data often implement a hybrid approach of interpreted and compiled programming languages. The compiled parts are usually written in C, C++, or Fortran, and are efficient in terms of computational speed and memory usage. Conversely, the interpreted part serves as a convenient user-interface and calls the compiled code for computationally demanding operations. The price paid for the user friendliness of the interpreted component is-besides performance-the limited access to low level and optimized code. An example of such a restriction is the 64-bit vector support of the widely used statistical language R. On the R side, users do not need to change existing code and may not even notice the extension. On the other hand, interfacing 64-bit compiled code efficiently is challenging. Since many R packages for spatial data could benefit from 64-bit vectors, we investigate strategies to efficiently pass 64-bit vectors to compiled languages. More precisely, we show how to simply extend existing R packages using the foreign function interface to seamlessly support 64-bit vectors. This extension is shown with the sparse matrix algebra R package spam. The new capabilities are illustrated with an example of GIMMS NDVI3g data featuring a parametric modeling approach for a non-stationary covariance matrix.

  12. Reactive Geochemical Transport Modeling of Concentrated AqueousSolutions: Supplement to TOUGHREACT User's Guide for the PitzerIon-Interaction Model

    Energy Technology Data Exchange (ETDEWEB)

    Zhang, Guoxiang; Spycher, Nicolas; Xu, Tianfu; Sonnenthal, Eric; Steefel , Carl

    2006-12-15

    In this report, we present: -- The Pitzer ion-interactiontheory and models -- Input file requirements for using the TOUGHREACTPitzer ion-interaction model and associated databases -- Run-time errormessages -- Verification test cases and application examples. For themain code structure, features, overall solution methods, description ofinput/output files for parameters other than those specific to theimplemented Pitzer model, and error messages, see the TOUGHREACT User'sGuide (Xu et al., 2005). The TOUGHREACT Pitzer version runs on aDEC-alpha architecture CPU, under OSF1 V5.1, with Compaq Digital FortranCompiler. The compiler run-time libraries are required for execution aswell as compilation. The code also runs on Intel Pentium IV andhigher-version CPU-based machines with Compaq Visual Fortran Compiler orIntel Fortran Compiler (integrated with the Microsoft DevelopmentEnvironment). The minimum hardware configuration should include 1 GB RAMand 1 GB (2 GB recommended) of available disk space.

  13. Regulatory and technical reports compilation for 1980

    International Nuclear Information System (INIS)

    Oliu, W.E.; McKenzi, L.

    1981-04-01

    This compilation lists formal regulatory and technical reports and conference proceedings issued in 1980 by the US Nuclear Regulatory Commission. The compilation is divided into four major sections. The first major section consists of a sequential listing of all NRC reports in report-number order. The second major section of this compilation consists of a key-word index to report titles. The third major section contains an alphabetically arranged listing of contractor report numbers cross-referenced to their corresponding NRC report numbers. Finally, the fourth section is an errata supplement

  14. Python based high-level synthesis compiler

    Science.gov (United States)

    Cieszewski, Radosław; Pozniak, Krzysztof; Romaniuk, Ryszard

    2014-11-01

    This paper presents a python based High-Level synthesis (HLS) compiler. The compiler interprets an algorithmic description of a desired behavior written in Python and map it to VHDL. FPGA combines many benefits of both software and ASIC implementations. Like software, the mapped circuit is flexible, and can be reconfigured over the lifetime of the system. FPGAs therefore have the potential to achieve far greater performance than software as a result of bypassing the fetch-decode-execute operations of traditional processors, and possibly exploiting a greater level of parallelism. Creating parallel programs implemented in FPGAs is not trivial. This article describes design, implementation and first results of created Python based compiler.

  15. Application of Intel Many Integrated Core (MIC) accelerators to the Pleim-Xiu land surface scheme

    Science.gov (United States)

    Huang, Melin; Huang, Bormin; Huang, Allen H.

    2015-10-01

    The land-surface model (LSM) is one physics process in the weather research and forecast (WRF) model. The LSM includes atmospheric information from the surface layer scheme, radiative forcing from the radiation scheme, and precipitation forcing from the microphysics and convective schemes, together with internal information on the land's state variables and land-surface properties. The LSM is to provide heat and moisture fluxes over land points and sea-ice points. The Pleim-Xiu (PX) scheme is one LSM. The PX LSM features three pathways for moisture fluxes: evapotranspiration, soil evaporation, and evaporation from wet canopies. To accelerate the computation process of this scheme, we employ Intel Xeon Phi Many Integrated Core (MIC) Architecture as it is a multiprocessor computer structure with merits of efficient parallelization and vectorization essentials. Our results show that the MIC-based optimization of this scheme running on Xeon Phi coprocessor 7120P improves the performance by 2.3x and 11.7x as compared to the original code respectively running on one CPU socket (eight cores) and on one CPU core with Intel Xeon E5-2670.

  16. Optimizing the Betts-Miller-Janjic cumulus parameterization with Intel Many Integrated Core (MIC) architecture

    Science.gov (United States)

    Huang, Melin; Huang, Bormin; Huang, Allen H.-L.

    2015-10-01

    The schemes of cumulus parameterization are responsible for the sub-grid-scale effects of convective and/or shallow clouds, and intended to represent vertical fluxes due to unresolved updrafts and downdrafts and compensating motion outside the clouds. Some schemes additionally provide cloud and precipitation field tendencies in the convective column, and momentum tendencies due to convective transport of momentum. The schemes all provide the convective component of surface rainfall. Betts-Miller-Janjic (BMJ) is one scheme to fulfill such purposes in the weather research and forecast (WRF) model. National Centers for Environmental Prediction (NCEP) has tried to optimize the BMJ scheme for operational application. As there are no interactions among horizontal grid points, this scheme is very suitable for parallel computation. With the advantage of Intel Xeon Phi Many Integrated Core (MIC) architecture, efficient parallelization and vectorization essentials, it allows us to optimize the BMJ scheme. If compared to the original code respectively running on one CPU socket (eight cores) and on one CPU core with Intel Xeon E5-2670, the MIC-based optimization of this scheme running on Xeon Phi coprocessor 7120P improves the performance by 2.4x and 17.0x, respectively.

  17. Optimizing zonal advection of the Advanced Research WRF (ARW) dynamics for Intel MIC

    Science.gov (United States)

    Mielikainen, Jarno; Huang, Bormin; Huang, Allen H.

    2014-10-01

    The Weather Research and Forecast (WRF) model is the most widely used community weather forecast and research model in the world. There are two distinct varieties of WRF. The Advanced Research WRF (ARW) is an experimental, advanced research version featuring very high resolution. The WRF Nonhydrostatic Mesoscale Model (WRF-NMM) has been designed for forecasting operations. WRF consists of dynamics code and several physics modules. The WRF-ARW core is based on an Eulerian solver for the fully compressible nonhydrostatic equations. In the paper, we will use Intel Intel Many Integrated Core (MIC) architecture to substantially increase the performance of a zonal advection subroutine for optimization. It is of the most time consuming routines in the ARW dynamics core. Advection advances the explicit perturbation horizontal momentum equations by adding in the large-timestep tendency along with the small timestep pressure gradient tendency. We will describe the challenges we met during the development of a high-speed dynamics code subroutine for MIC architecture. Furthermore, lessons learned from the code optimization process will be discussed. The results show that the optimizations improved performance of the original code on Xeon Phi 5110P by a factor of 2.4x.

  18. Implementation of a 3-D nonlinear MHD [magnetohydrodynamics] calculation on the Intel hypercube

    International Nuclear Information System (INIS)

    Lynch, V.E.; Carreras, B.A.; Drake, J.B.; Hicks, H.R.; Lawkins, W.F.

    1987-01-01

    The optimization of numerical schemes and increasing computer capabilities in the last ten years have improved the efficiency of 3-D nonlinear resistive MHD calculations by about two to three orders of magnitude. However, we are still very limited in performing these types of calculations. Hypercubes have a large number of processors with only local memory and bidirectional links among neighbors. The Intel Hypercube at Oak Ridge has 64 processors with 0.5 megabytes of memory per processor. The multiplicity of processors opens new possibilities for the treatment of such computations. The constraint on time and resources favored the approach of using the existing RSF code which solves as an initial value problem the reduced set of MHD equations for a periodic cylindrical geometry. This code includes minimal physics and geometry, but contains the basic three dimensionality and nonlinear structure of the equations. The code solves the reduced set of MHD equations by Fourier expansion in two angular coordinates and finite differences in the radial one. Due to the continuing interest in these calculations and the likelihood that future supercomputers will take greater advantage of parallelism, the present study was initiated by the ORNL Exploratory Studies Committee and funded entirely by Laboratory Discretionary Funds. The objectives of the study were: to ascertain the suitability of MHD calculation for parallel computation, to design and implement a parallel algorithm to perform the computations, and to evaluate the hypercube, and in particular, ORNL's Intel iPSC, for use in MHD computations

  19. 3-D electromagnetic plasma particle simulations on the Intel Delta parallel computer

    International Nuclear Information System (INIS)

    Wang, J.; Liewer, P.C.

    1994-01-01

    A three-dimensional electromagnetic PIC code has been developed on the 512 node Intel Touchstone Delta MIMD parallel computer. This code is based on the General Concurrent PIC algorithm which uses a domain decomposition to divide the computation among the processors. The 3D simulation domain can be partitioned into 1-, 2-, or 3-dimensional sub-domains. Particles must be exchanged between processors as they move among the subdomains. The Intel Delta allows one to use this code for very-large-scale simulations (i.e. over 10 8 particles and 10 6 grid cells). The parallel efficiency of this code is measured, and the overall code performance on the Delta is compared with that on Cray supercomputers. It is shown that their code runs with a high parallel efficiency of ≥ 95% for large size problems. The particle push time achieved is 115 nsecs/particle/time step for 162 million particles on 512 nodes. Comparing with the performance on a single processor Cray C90, this represents a factor of 58 speedup. The code uses a finite-difference leap frog method for field solve which is significantly more efficient than fast fourier transforms on parallel computers. The performance of this code on the 128 node Cray T3D will also be discussed

  20. Vectorization vs. compilation in query execution

    NARCIS (Netherlands)

    J. Sompolski (Juliusz); M. Zukowski (Marcin); P.A. Boncz (Peter)

    2011-01-01

    textabstractCompiling database queries into executable (sub-) programs provides substantial benefits comparing to traditional interpreted execution. Many of these benefits, such as reduced interpretation overhead, better instruction code locality, and providing opportunities to use SIMD

  1. Gravity Data for Indiana (300 records compiled)

    Data.gov (United States)

    National Oceanic and Atmospheric Administration, Department of Commerce — The gravity data (300 records) were compiled by Purdue University. This data base was received in February 1993. Principal gravity parameters include Free-air...

  2. A Compilation of Internship Reports - 2012

    Energy Technology Data Exchange (ETDEWEB)

    Stegman M.; Morris, M.; Blackburn, N.

    2012-08-08

    This compilation documents all research project undertaken by the 2012 summer Department of Energy - Workforce Development for Teachers and Scientists interns during their internship program at Brookhaven National Laboratory.

  3. ALGOL compiler. Syntax and semantic analysis

    International Nuclear Information System (INIS)

    Tarbouriech, Robert

    1971-01-01

    In this research thesis, the author reports the development of an ALGOL compiler which performs the main following tasks: systematic scan of the origin-programme to recognise the different components (identifiers, reserved words, constants, separators), analysis of the origin-programme structure to build up its statements and arithmetic expressions, processing of symbolic names (identifiers) to associate them with values they represent, and memory allocation for data and programme. Several issues are thus addressed: characteristics of the machine for which the compiler is developed, exact definition of the language (grammar, identifier and constant formation), syntax processing programme to provide the compiler with necessary elements (language vocabulary, precedence matrix), description of the first two phases of compilation: lexicographic analysis, and syntax analysis. The last phase (machine-code generation) is not addressed

  4. A software methodology for compiling quantum programs

    Science.gov (United States)

    Häner, Thomas; Steiger, Damian S.; Svore, Krysta; Troyer, Matthias

    2018-04-01

    Quantum computers promise to transform our notions of computation by offering a completely new paradigm. To achieve scalable quantum computation, optimizing compilers and a corresponding software design flow will be essential. We present a software architecture for compiling quantum programs from a high-level language program to hardware-specific instructions. We describe the necessary layers of abstraction and their differences and similarities to classical layers of a computer-aided design flow. For each layer of the stack, we discuss the underlying methods for compilation and optimization. Our software methodology facilitates more rapid innovation among quantum algorithm designers, quantum hardware engineers, and experimentalists. It enables scalable compilation of complex quantum algorithms and can be targeted to any specific quantum hardware implementation.

  5. Compilation of Instantaneous Source Functions for Varying ...

    African Journals Online (AJOL)

    Compilation of Instantaneous Source Functions for Varying Architecture of a Layered Reservoir with Mixed Boundaries and Horizontal Well Completion Part III: B-Shaped Architecture with Vertical Well in the Upper Layer.

  6. Compilation of Instantaneous Source Functions for Varying ...

    African Journals Online (AJOL)

    Compilation of Instantaneous Source Functions for Varying Architecture of a Layered Reservoir with Mixed Boundaries and Horizontal Well Completion Part IV: Normal and Inverted Letter 'h' and 'H' Architecture.

  7. Revisiting Intel Xeon Phi optimization of Thompson cloud microphysics scheme in Weather Research and Forecasting (WRF) model

    Science.gov (United States)

    Mielikainen, Jarno; Huang, Bormin; Huang, Allen

    2015-10-01

    The Thompson cloud microphysics scheme is a sophisticated cloud microphysics scheme in the Weather Research and Forecasting (WRF) model. The scheme is very suitable for massively parallel computation as there are no interactions among horizontal grid points. Compared to the earlier microphysics schemes, the Thompson scheme incorporates a large number of improvements. Thus, we have optimized the speed of this important part of WRF. Intel Many Integrated Core (MIC) ushers in a new era of supercomputing speed, performance, and compatibility. It allows the developers to run code at trillions of calculations per second using the familiar programming model. In this paper, we present our results of optimizing the Thompson microphysics scheme on Intel Many Integrated Core Architecture (MIC) hardware. The Intel Xeon Phi coprocessor is the first product based on Intel MIC architecture, and it consists of up to 61 cores connected by a high performance on-die bidirectional interconnect. The coprocessor supports all important Intel development tools. Thus, the development environment is familiar one to a vast number of CPU developers. Although, getting a maximum performance out of MICs will require using some novel optimization techniques. New optimizations for an updated Thompson scheme are discusses in this paper. The optimizations improved the performance of the original Thompson code on Xeon Phi 7120P by a factor of 1.8x. Furthermore, the same optimizations improved the performance of the Thompson on a dual socket configuration of eight core Intel Xeon E5-2670 CPUs by a factor of 1.8x compared to the original Thompson code.

  8. The Transition and Adoption to Modern Programming Concepts for Scientific Computing in Fortran

    Directory of Open Access Journals (Sweden)

    Charles D. Norton

    2007-01-01

    Full Text Available This paper describes our experiences in the early exploration of modern concepts introduced in Fortran90 for large-scale scientific programming. We review our early work in expressing object-oriented concepts based on the new Fortran90 constructs – foreign to most programmers at the time – our experimental work in applying them to various applications, the impact on the WG5/J3 standards committees to consider formalizing object-oriented constructs for later versions of Fortran, and work in exploring how other modern programming techniques such as Design Patterns can and have impacted our software development. Applications will be drawn from plasma particle simulation and finite element adaptive mesh refinement for solid earth crustal deformation modeling.

  9. An Efficient Compiler for Weighted Rewrite Rules

    OpenAIRE

    Mohri, Mehryar; Sproat, Richard

    1996-01-01

    Context-dependent rewrite rules are used in many areas of natural language and speech processing. Work in computational phonology has demonstrated that, given certain conditions, such rewrite rules can be represented as finite-state transducers (FSTs). We describe a new algorithm for compiling rewrite rules into FSTs. We show the algorithm to be simpler and more efficient than existing algorithms. Further, many of our applications demand the ability to compile weighted rules into weighted FST...

  10. Compilation of Sandia Laboratories technical capabilities

    International Nuclear Information System (INIS)

    Lundergan, C.D.; Mead, P.L.

    1975-11-01

    This report is a compilation of 17 individual documents that together summarize the technical capabilities of Sandia Laboratories. Each document in this compilation contains details about a specific area of capability. Examples of application of the capability to research and development problems are provided. An eighteenth document summarizes the content of the other seventeen. Each of these documents was issued with a separate report number (SAND 74-0073A through SAND 74-0091, except -0078)

  11. Compilation of Sandia Laboratories technical capabilities

    Energy Technology Data Exchange (ETDEWEB)

    Lundergan, C. D.; Mead, P. L. [eds.

    1975-11-01

    This report is a compilation of 17 individual documents that together summarize the technical capabilities of Sandia Laboratories. Each document in this compilation contains details about a specific area of capability. Examples of application of the capability to research and development problems are provided. An eighteenth document summarizes the content of the other seventeen. Each of these documents was issued with a separate report number (SAND 74-0073A through SAND 74-0091, except -0078). (RWR)

  12. Electronic circuits for communications systems: A compilation

    Science.gov (United States)

    1972-01-01

    The compilation of electronic circuits for communications systems is divided into thirteen basic categories, each representing an area of circuit design and application. The compilation items are moderately complex and, as such, would appeal to the applications engineer. However, the rationale for the selection criteria was tailored so that the circuits would reflect fundamental design principles and applications, with an additional requirement for simplicity whenever possible.

  13. "SMART": A Compact and Handy FORTRAN Code for the Physics of Stellar Atmospheres

    Science.gov (United States)

    Sapar, A.; Poolamäe, R.

    2003-01-01

    A new computer code SMART (Spectra from Model Atmospheres by Radiative Transfer) for computing the stellar spectra, forming in plane-parallel atmospheres, has been compiled by us and A. Aret. To guarantee wide compatibility of the code with shell environment, we chose FORTRAN-77 as programming language and tried to confine ourselves to common part of its numerous versions both in WINDOWS and LINUX. SMART can be used for studies of several processes in stellar atmospheres. The current version of the programme is undergoing rapid changes due to our goal to elaborate a simple, handy and compact code. Instead of linearisation (being a mathematical method of recurrent approximations) we propose to use the physical evolutionary changes or in other words relaxation of quantum state populations rates from LTE to NLTE has been studied using small number of NLTE states. This computational scheme is essentially simpler and more compact than the linearisation. This relaxation scheme enables using instead of the Λ-iteration procedure a physically changing emissivity (or the source function) which incorporates in itself changing Menzel coefficients for NLTE quantum state populations. However, the light scattering on free electrons is in the terms of Feynman graphs a real second-order quantum process and cannot be reduced to consequent processes of absorption and emission as in the case of radiative transfer in spectral lines. With duly chosen input parameters the code SMART enables computing radiative acceleration to the matter of stellar atmosphere in turbulence clumps. This also enables to connect the model atmosphere in more detail with the problem of the stellar wind triggering. Another problem, which has been incorporated into the computer code SMART, is diffusion of chemical elements and their isotopes in the atmospheres of chemically peculiar (CP) stars due to usual radiative acceleration and the essential additional acceleration generated by the light-induced drift. As

  14. Bridging FORTRAN to object oriented paradigm for HEP data modeling task

    International Nuclear Information System (INIS)

    Huang, J.

    1993-12-01

    Object oriented (OO) technology appears to offer tangible benefits to the high energy physics (HEP) community. Yet many physicists view this newest software development used with much reservation and reluctance. Facing the reality of having to support the existing physics applications, which are written in FORTRAN, the software engineers in the Computer Engineering Group of the Physics Research Division at the Superconducting Super Collider Laboratory have accepted the challenge of mixing an old language with the new technology. This paper describes the experience and the techniques devised to fit FORTRAN into the OO paradigm (OOP)

  15. FORTRAN data files transference from VAX/VMS to ALPHA/UNIX

    International Nuclear Information System (INIS)

    Sanchez, E.; Milligen, B.Ph. van

    1997-01-01

    Several tools have been developed to access the TJ-I and TJ-IU databases, which currently reside in VAX/VMS servers, from the TJ-II Data Acquisition System DEC ALPHA 8400 server. The TJ-I/TJ-IU databases are not homogeneous and contain several types of data files, namely, SADE. CAMAC and FORTRAN un formatted files. The tools presented in this report allow one to transfer CAMAC and those FORTRAN un formatted files defined herein. from a VAX/VMS server, for data manipulation on the ALPHA/Digital UNIX server. (Author) 5 refs

  16. Fortran code for generating random probability vectors, unitaries, and quantum states

    Directory of Open Access Journals (Sweden)

    Jonas eMaziero

    2016-03-01

    Full Text Available The usefulness of generating random configurations is recognized in many areas of knowledge. Fortran was born for scientific computing and has been one of the main programming languages in this area since then. And several ongoing projects targeting towards its betterment indicate that it will keep this status in the decades to come. In this article, we describe Fortran codes produced, or organized, for the generation of the following random objects: numbers, probability vectors, unitary matrices, and quantum state vectors and density matrices. Some matrix functions are also included and may be of independent interest.

  17. Transitioning to Intel-based Linux Servers in the Payload Operations Integration Center

    Science.gov (United States)

    Guillebeau, P. L.

    2004-01-01

    The MSFC Payload Operations Integration Center (POIC) is the focal point for International Space Station (ISS) payload operations. The POIC contains the facilities, hardware, software and communication interface necessary to support payload operations. ISS ground system support for processing and display of real-time spacecraft and telemetry and command data has been operational for several years. The hardware components were reaching end of life and vendor costs were increasing while ISS budgets were becoming severely constrained. Therefore it has been necessary to migrate the Unix portions of our ground systems to commodity priced Intel-based Linux servers. hardware architecture including networks, data storage, and highly available resources. This paper will concentrate on the Linux migration implementation for the software portion of our ground system. The migration began with 3.5 million lines of code running on Unix platforms with separate servers for telemetry, command, Payload information management systems, web, system control, remote server interface and databases. The Intel-based system is scheduled to be available for initial operational use by August 2004 The overall migration to Intel-based Linux servers in the control center involves changes to the This paper will address the Linux migration study approach including the proof of concept, criticality of customer buy-in and importance of beginning with POSlX compliant code. It will focus on the development approach explaining the software lifecycle. Other aspects of development will be covered including phased implementation, interim milestones and metrics measurements and reporting mechanisms. This paper will also address the testing approach covering all levels of testing including development, development integration, IV&V, user beta testing and acceptance testing. Test results including performance numbers compared with Unix servers will be included. need for a smooth transition while maintaining

  18. Compiling software for a hierarchical distributed processing system

    Science.gov (United States)

    Archer, Charles J; Blocksome, Michael A; Ratterman, Joseph D; Smith, Brian E

    2013-12-31

    Compiling software for a hierarchical distributed processing system including providing to one or more compiling nodes software to be compiled, wherein at least a portion of the software to be compiled is to be executed by one or more nodes; compiling, by the compiling node, the software; maintaining, by the compiling node, any compiled software to be executed on the compiling node; selecting, by the compiling node, one or more nodes in a next tier of the hierarchy of the distributed processing system in dependence upon whether any compiled software is for the selected node or the selected node's descendents; sending to the selected node only the compiled software to be executed by the selected node or selected node's descendent.

  19. Real time interrupt handling using FORTRAN IV plus under RSX-11M

    International Nuclear Information System (INIS)

    Schultz, D.E.

    1981-01-01

    A real-time data acquisition application for a linear accelerator is described. The important programming features of this application are use of connect to interrupt, a shared library, map to I/O page, and a shared data area. How you can provide rapid interrupt handling using these tools from FORTRAN IV PLUS is explained

  20. Pseudo-BINPUT, a free formal input package for Fortran programmes

    International Nuclear Information System (INIS)

    Gubbins, M.E.

    1977-11-01

    Pseudo - BINPUT is an input package for reading free format data in codeword control in a FORTRAN programme. To a large degree it mimics in function the Winfrith Subroutine Library routine BINPUT. By using calls of the data input package DECIN to mimic the input routine BINPUT, Pseudo - BINPUT combines some of the advantages of both systems. (U.K.)

  1. A FORTRAN program for an IBM PC compatible computer for calculating kinematical electron diffraction patterns

    International Nuclear Information System (INIS)

    Skjerpe, P.

    1989-01-01

    This report describes a computer program which is useful in transmission electron microscopy. The program is written in FORTRAN and calculates kinematical electron diffraction patterns in any zone axis from a given crystal structure. Quite large unit cells, containing up to 2250 atoms, can be handled by the program. The program runs on both the Helcules graphic card and the standard IBM CGA card

  2. Programs in Fortran language for reporting the results of the analyses by ICP emission spectroscopy

    International Nuclear Information System (INIS)

    Roca, M.

    1985-01-01

    Three programs, written in FORTRAN IV language, for reporting the results of the analyses by ICP emission spectroscopy from data stored in files on floppy disks have been developed. They are intended, respectively, for the analyses of: 1) waters, 2) granites and slates, and 3) different kinds of geological materials. (Author) 8 refs

  3. SLACINPT - a FORTRAN program that generates boundary data for the SLAC gun code

    International Nuclear Information System (INIS)

    Michel, W.L.; Hepburn, J.D.

    1982-03-01

    The FORTRAN program SLACINPT was written to simplify the preparation of boundary data for the SLAC gun code. In SLACINPT, the boundary is described by a sequence of straight line or arc segments. From these, the program generates the individual boundary mesh point data, required as input by the SLAC gun code

  4. Extracting UML Class Diagrams from Object-Oriented Fortran: ForUML

    Directory of Open Access Journals (Sweden)

    Aziz Nanthaamornphong

    2015-01-01

    Full Text Available Many scientists who implement computational science and engineering software have adopted the object-oriented (OO Fortran paradigm. One of the challenges faced by OO Fortran developers is the inability to obtain high level software design descriptions of existing applications. Knowledge of the overall software design is not only valuable in the absence of documentation, it can also serve to assist developers with accomplishing different tasks during the software development process, especially maintenance and refactoring. The software engineering community commonly uses reverse engineering techniques to deal with this challenge. A number of reverse engineering-based tools have been proposed, but few of them can be applied to OO Fortran applications. In this paper, we propose a software tool to extract unified modeling language (UML class diagrams from Fortran code. The UML class diagram facilitates the developers' ability to examine the entities and their relationships in the software system. The extracted diagrams enhance software maintenance and evolution. The experiments carried out to evaluate the proposed tool show its accuracy and a few of the limitations.

  5. Autogenic Feedback Training (Body Fortran) with Biofeedback and the Computer for Self-Improvement and Change.

    Science.gov (United States)

    Cassel, Russell N.; Sumintardja, Elmira Nasrudin

    1983-01-01

    Describes autogenic feedback training, which provides the basis whereby an individual is able to improve on well being through use of a technique described as "body fortran," implying that you program self as one programs a computer. Necessary requisites are described including relaxation training and the management of stress. (JAC)

  6. Innovative Language-Based & Object-Oriented Structured AMR Using Fortran 90 and OpenMP

    Science.gov (United States)

    Norton, C.; Balsara, D.

    1999-01-01

    Parallel adaptive mesh refinement (AMR) is an important numerical technique that leads to the efficient solution of many physical and engineering problems. In this paper, we describe how AMR programing can be performed in an object-oreinted way using the modern aspects of Fortran 90 combined with the parallelization features of OpenMP.

  7. High-throughput sockets over RDMA for the Intel Xeon Phi coprocessor

    CERN Document Server

    Santogidis, Aram

    2017-01-01

    In this paper we describe the design, implementation and performance of Trans4SCIF, a user-level socket-like transport library for the Intel Xeon Phi coprocessor. Trans4SCIF library is primarily intended for high-throughput applications. It uses RDMA transfers over the native SCIF support, in a way that is transparent for the application, which has the illusion of using conventional stream sockets. We also discuss the integration of Trans4SCIF with the ZeroMQ messaging library, used extensively by several applications running at CERN. We show that this can lead to a substantial, up to 3x, increase of application throughput compared to the default TCP/IP transport option.

  8. GW Calculations of Materials on the Intel Xeon-Phi Architecture

    Science.gov (United States)

    Deslippe, Jack; da Jornada, Felipe H.; Vigil-Fowler, Derek; Biller, Ariel; Chelikowsky, James R.; Louie, Steven G.

    Intel Xeon-Phi processors are expected to power a large number of High-Performance Computing (HPC) systems around the United States and the world in the near future. We evaluate the ability of GW and pre-requisite Density Functional Theory (DFT) calculations for materials on utilizing the Xeon-Phi architecture. We describe the optimization process and performance improvements achieved. We find that the GW method, like other higher level Many-Body methods beyond standard local/semilocal approximations to Kohn-Sham DFT, is particularly well suited for many-core architectures due to the ability to exploit a large amount of parallelism over plane-waves, band-pairs and frequencies. Support provided by the SCIDAC program, Department of Energy, Office of Science, Advanced Scientic Computing Research and Basic Energy Sciences. Grant Numbers DE-SC0008877 (Austin) and DE-AC02-05CH11231 (LBNL).

  9. Heterogeneous High Throughput Scientific Computing with APM X-Gene and Intel Xeon Phi

    CERN Document Server

    Abdurachmanov, David; Elmer, Peter; Eulisse, Giulio; Knight, Robert; Muzaffar, Shahzad

    2014-01-01

    Electrical power requirements will be a constraint on the future growth of Distributed High Throughput Computing (DHTC) as used by High Energy Physics. Performance-per-watt is a critical metric for the evaluation of computer architectures for cost- efficient computing. Additionally, future performance growth will come from heterogeneous, many-core, and high computing density platforms with specialized processors. In this paper, we examine the Intel Xeon Phi Many Integrated Cores (MIC) co-processor and Applied Micro X-Gene ARMv8 64-bit low-power server system-on-a-chip (SoC) solutions for scientific computing applications. We report our experience on software porting, performance and energy efficiency and evaluate the potential for use of such technologies in the context of distributed computing systems such as the Worldwide LHC Computing Grid (WLCG).

  10. Heterogeneous High Throughput Scientific Computing with APM X-Gene and Intel Xeon Phi

    Science.gov (United States)

    Abdurachmanov, David; Bockelman, Brian; Elmer, Peter; Eulisse, Giulio; Knight, Robert; Muzaffar, Shahzad

    2015-05-01

    Electrical power requirements will be a constraint on the future growth of Distributed High Throughput Computing (DHTC) as used by High Energy Physics. Performance-per-watt is a critical metric for the evaluation of computer architectures for cost- efficient computing. Additionally, future performance growth will come from heterogeneous, many-core, and high computing density platforms with specialized processors. In this paper, we examine the Intel Xeon Phi Many Integrated Cores (MIC) co-processor and Applied Micro X-Gene ARMv8 64-bit low-power server system-on-a-chip (SoC) solutions for scientific computing applications. We report our experience on software porting, performance and energy efficiency and evaluate the potential for use of such technologies in the context of distributed computing systems such as the Worldwide LHC Computing Grid (WLCG).

  11. Heterogeneous High Throughput Scientific Computing with APM X-Gene and Intel Xeon Phi

    International Nuclear Information System (INIS)

    Abdurachmanov, David; Bockelman, Brian; Elmer, Peter; Eulisse, Giulio; Muzaffar, Shahzad; Knight, Robert

    2015-01-01

    Electrical power requirements will be a constraint on the future growth of Distributed High Throughput Computing (DHTC) as used by High Energy Physics. Performance-per-watt is a critical metric for the evaluation of computer architectures for cost- efficient computing. Additionally, future performance growth will come from heterogeneous, many-core, and high computing density platforms with specialized processors. In this paper, we examine the Intel Xeon Phi Many Integrated Cores (MIC) co-processor and Applied Micro X-Gene ARMv8 64-bit low-power server system-on-a-chip (SoC) solutions for scientific computing applications. We report our experience on software porting, performance and energy efficiency and evaluate the potential for use of such technologies in the context of distributed computing systems such as the Worldwide LHC Computing Grid (WLCG). (paper)

  12. Communication overhead on the Intel Paragon, IBM SP2 and Meiko CS-2

    Science.gov (United States)

    Bokhari, Shahid H.

    1995-01-01

    Interprocessor communication overhead is a crucial measure of the power of parallel computing systems-its impact can severely limit the performance of parallel programs. This report presents measurements of communication overhead on three contemporary commercial multicomputer systems: the Intel Paragon, the IBM SP2 and the Meiko CS-2. In each case the time to communicate between processors is presented as a function of message length. The time for global synchronization and memory access is discussed. The performance of these machines in emulating hypercubes and executing random pairwise exchanges is also investigated. It is shown that the interprocessor communication time depends heavily on the specific communication pattern required. These observations contradict the commonly held belief that communication overhead on contemporary machines is independent of the placement of tasks on processors. The information presented in this report permits the evaluation of the efficiency of parallel algorithm implementations against standard baselines.

  13. A performance study of sparse Cholesky factorization on INTEL iPSC/860

    Science.gov (United States)

    Zubair, M.; Ghose, M.

    1992-01-01

    The problem of Cholesky factorization of a sparse matrix has been very well investigated on sequential machines. A number of efficient codes exist for factorizing large unstructured sparse matrices. However, there is a lack of such efficient codes on parallel machines in general, and distributed machines in particular. Some of the issues that are critical to the implementation of sparse Cholesky factorization on a distributed memory parallel machine are ordering, partitioning and mapping, load balancing, and ordering of various tasks within a processor. Here, we focus on the effect of various partitioning schemes on the performance of sparse Cholesky factorization on the Intel iPSC/860. Also, a new partitioning heuristic for structured as well as unstructured sparse matrices is proposed, and its performance is compared with other schemes.

  14. Plasma turbulence calculations on the Intel iPSC/860 (rx) hypercube

    International Nuclear Information System (INIS)

    Lynch, V.E.; Ruiter, J.R.

    1990-01-01

    One approach to improving the real-time efficiency of plasma turbulence calculations is to use a parallel algorithm. A serial algorithm used for plasma turbulence calculations was modified to allocate a radial region in each node. In this way, convolutions at a fixed radius are performed in parallel, and communication is limited to boundary values for each radial region. For a semi-implicity numerical scheme (tridiagonal matrix solver), there is a factor of 3 improvement in efficiency with the Intel iPSC/860 machine using 64 processors over a single-processor Cray-II. For block-tridiagonal matrix cases (fully implicit code), a second parallelization takes place. The Fourier components are distributed in nodes. In each node, the block-tridiagonal matrix is inverted for each of allocated Fourier components. The algorithm for this second case has not yet been optimized. 10 refs., 4 figs

  15. Evaluation of the Intel iWarp parallel processor for space flight applications

    Science.gov (United States)

    Hine, Butler P., III; Fong, Terrence W.

    1993-01-01

    The potential of a DARPA-sponsored advanced processor, the Intel iWarp, for use in future SSF Data Management Systems (DMS) upgrades is evaluated through integration into the Ames DMS testbed and applications testing. The iWarp is a distributed, parallel computing system well suited for high performance computing applications such as matrix operations and image processing. The system architecture is modular, supports systolic and message-based computation, and is capable of providing massive computational power in a low-cost, low-power package. As a consequence, the iWarp offers significant potential for advanced space-based computing. This research seeks to determine the iWarp's suitability as a processing device for space missions. In particular, the project focuses on evaluating the ease of integrating the iWarp into the SSF DMS baseline architecture and the iWarp's ability to support computationally stressing applications representative of SSF tasks.

  16. Lattice QCD with Domain Decomposition on Intel Xeon Phi Co-Processors

    Energy Technology Data Exchange (ETDEWEB)

    Heybrock, Simon; Joo, Balint; Kalamkar, Dhiraj D; Smelyanskiy, Mikhail; Vaidyanathan, Karthikeyan; Wettig, Tilo; Dubey, Pradeep

    2014-12-01

    The gap between the cost of moving data and the cost of computing continues to grow, making it ever harder to design iterative solvers on extreme-scale architectures. This problem can be alleviated by alternative algorithms that reduce the amount of data movement. We investigate this in the context of Lattice Quantum Chromodynamics and implement such an alternative solver algorithm, based on domain decomposition, on Intel Xeon Phi co-processor (KNC) clusters. We demonstrate close-to-linear on-chip scaling to all 60 cores of the KNC. With a mix of single- and half-precision the domain-decomposition method sustains 400-500 Gflop/s per chip. Compared to an optimized KNC implementation of a standard solver [1], our full multi-node domain-decomposition solver strong-scales to more nodes and reduces the time-to-solution by a factor of 5.

  17. Compilations and evaluations of nuclear structure and decay data

    International Nuclear Information System (INIS)

    Lorenz, A.

    1977-10-01

    This is the third issue of a report series on published and to-be-published compilations and evaluations of nuclear structure and decay (NSD) data. This compilation is published and distributed by the IAEA Nuclear Data Section approximately every six months. This compilation of compilations and evaluations is designed to keep the nuclear scientific community informed of the availability of compiled or evaluated NSD data, and contains references to laboratory reports, journal articles and books containing selected compilations and evaluations

  18. Using Intel Xeon Phi to accelerate the WRF TEMF planetary boundary layer scheme

    Science.gov (United States)

    Mielikainen, Jarno; Huang, Bormin; Huang, Allen

    2014-05-01

    The Weather Research and Forecasting (WRF) model is designed for numerical weather prediction and atmospheric research. The WRF software infrastructure consists of several components such as dynamic solvers and physics schemes. Numerical models are used to resolve the large-scale flow. However, subgrid-scale parameterizations are for an estimation of small-scale properties (e.g., boundary layer turbulence and convection, clouds, radiation). Those have a significant influence on the resolved scale due to the complex nonlinear nature of the atmosphere. For the cloudy planetary boundary layer (PBL), it is fundamental to parameterize vertical turbulent fluxes and subgrid-scale condensation in a realistic manner. A parameterization based on the Total Energy - Mass Flux (TEMF) that unifies turbulence and moist convection components produces a better result that the other PBL schemes. For that reason, the TEMF scheme is chosen as the PBL scheme we optimized for Intel Many Integrated Core (MIC), which ushers in a new era of supercomputing speed, performance, and compatibility. It allows the developers to run code at trillions of calculations per second using the familiar programming model. In this paper, we present our optimization results for TEMF planetary boundary layer scheme. The optimizations that were performed were quite generic in nature. Those optimizations included vectorization of the code to utilize vector units inside each CPU. Furthermore, memory access was improved by scalarizing some of the intermediate arrays. The results show that the optimization improved MIC performance by 14.8x. Furthermore, the optimizations increased CPU performance by 2.6x compared to the original multi-threaded code on quad core Intel Xeon E5-2603 running at 1.8 GHz. Compared to the optimized code running on a single CPU socket the optimized MIC code is 6.2x faster.

  19. Compilation of data on elementary particles

    International Nuclear Information System (INIS)

    Trippe, T.G.

    1984-09-01

    The most widely used data compilation in the field of elementary particle physics is the Review of Particle Properties. The origin, development and current state of this compilation are described with emphasis on the features which have contributed to its success: active involvement of particle physicists; critical evaluation and review of the data; completeness of coverage; regular distribution of reliable summaries including a pocket edition; heavy involvement of expert consultants; and international collaboration. The current state of the Review and new developments such as providing interactive access to the Review's database are described. Problems and solutions related to maintaining a strong and supportive relationship between compilation groups and the researchers who produce and use the data are discussed

  20. Compilation of current high energy physics experiments

    International Nuclear Information System (INIS)

    1978-09-01

    This compilation of current high-energy physics experiments is a collaborative effort of the Berkeley Particle Data Group, the SLAC library, and the nine participating laboratories: Argonne (ANL), Brookhaven (BNL), CERN, DESY, Fermilab (FNAL), KEK, Rutherford (RHEL), Serpukhov (SERP), and SLAC. Nominally, the compilation includes summaries of all high-energy physics experiments at the above laboratories that were approved (and not subsequently withdrawn) before about June 1978, and had not completed taking of data by 1 January 1975. The experimental summaries are supplemented with three indexes to the compilation, several vocabulary lists giving names or abbreviations used, and a short summary of the beams at each of the laboratories (except Rutherford). The summaries themselves are included on microfiche

  1. Extension of Alvis compiler front-end

    Energy Technology Data Exchange (ETDEWEB)

    Wypych, Michał; Szpyrka, Marcin; Matyasik, Piotr, E-mail: mwypych@agh.edu.pl, E-mail: mszpyrka@agh.edu.pl, E-mail: ptm@agh.edu.pl [AGH University of Science and Technology, Department of Applied Computer Science, Al. Mickiewicza 30, 30-059 Krakow (Poland)

    2015-12-31

    Alvis is a formal modelling language that enables possibility of verification of distributed concurrent systems. An Alvis model semantics finds expression in an LTS graph (labelled transition system). Execution of any language statement is expressed as a transition between formally defined states of such a model. An LTS graph is generated using a middle-stage Haskell representation of an Alvis model. Moreover, Haskell is used as a part of the Alvis language and is used to define parameters’ types and operations on them. Thanks to the compiler’s modular construction many aspects of compilation of an Alvis model may be modified. Providing new plugins for Alvis Compiler that support languages like Java or C makes possible using these languages as a part of Alvis instead of Haskell. The paper presents the compiler internal model and describes how the default specification language can be altered by new plugins.

  2. Asian collaboration on nuclear reaction data compilation

    International Nuclear Information System (INIS)

    Aikawa, Masayuki; Furutachi, Naoya; Kato, Kiyoshi; Makinaga, Ayano; Devi, Vidya; Ichinkhorloo, Dagvadorj; Odsuren, Myagmarjav; Tsubakihara, Kohsuke; Katayama, Toshiyuki; Otuka, Naohiko

    2013-01-01

    Nuclear reaction data are essential for research and development in nuclear engineering, radiation therapy, nuclear physics and astrophysics. Experimental data must be compiled in a database and be accessible to nuclear data users. One of the nuclear reaction databases is the EXFOR database maintained by the International Network of Nuclear Reaction Data Centres (NRDC) under the auspices of the International Atomic Energy Agency. Recently, collaboration among the Asian NRDC members is being further developed under the support of the Asia-Africa Science Platform Program of the Japan Society for the Promotion of Science. We report the activity for three years to develop the Asian collaboration on nuclear reaction data compilation. (author)

  3. Promising Compilation to ARMv8 POP

    OpenAIRE

    Podkopaev, Anton; Lahav, Ori; Vafeiadis, Viktor

    2017-01-01

    We prove the correctness of compilation of relaxed memory accesses and release-acquire fences from the "promising" semantics of [Kang et al. POPL'17] to the ARMv8 POP machine of [Flur et al. POPL'16]. The proof is highly non-trivial because both the ARMv8 POP and the promising semantics provide some extremely weak consistency guarantees for normal memory accesses; however, they do so in rather different ways. Our proof of compilation correctness to ARMv8 POP strengthens the results of the Kan...

  4. Run-Time and Compiler Support for Programming in Adaptive Parallel Environments

    Directory of Open Access Journals (Sweden)

    Guy Edjlali

    1997-01-01

    Full Text Available For better utilization of computing resources, it is important to consider parallel programming environments in which the number of available processors varies at run-time. In this article, we discuss run-time support for data-parallel programming in such an adaptive environment. Executing programs in an adaptive environment requires redistributing data when the number of processors changes, and also requires determining new loop bounds and communication patterns for the new set of processors. We have developed a run-time library to provide this support. We discuss how the run-time library can be used by compilers of high-performance Fortran (HPF-like languages to generate code for an adaptive environment. We present performance results for a Navier-Stokes solver and a multigrid template run on a network of workstations and an IBM SP-2. Our experiments show that if the number of processors is not varied frequently, the cost of data redistribution is not significant compared to the time required for the actual computation. Overall, our work establishes the feasibility of compiling HPF for a network of nondedicated workstations, which are likely to be an important resource for parallel programming in the future.

  5. Data compilation of single pion photoproduction below 2 GeV

    International Nuclear Information System (INIS)

    Inagaki, Y.; Nakamura, T.; Ukai, K.

    1976-01-01

    The compilation of data of single pion photoproduction experiment below 2 GeV is presented with the keywords which specify the experiment. These data are written on a magnetic tape. Data format and the indices for the keywords are given. Various programs of using this tape are also presented. The results of the compilation are divided into two types. The one is the reference card on which the information of the experiment is given. The other is the data card. These reference and data cards are written using all A-type format on an original tape. The copy tapes are available, which are written by various types on request. There are two kinds of the copy tapes. The one is same as the original tape, and the other is the one different in the data card. Namely, this card is written by F-type following the data type. One experiment on this tape is represented by 3 kinds of the cards. One reference card with A-type format, many data cards with F-type format and one identifying card. Various programs which are written by FORTRAN are ready for these original and copy tapes. (Kato, T.)

  6. Parallelizing More Loops with Compiler Guided Refactoring

    DEFF Research Database (Denmark)

    Larsen, Per; Ladelsky, Razya; Lidman, Jacob

    2012-01-01

    an interactive compilation feedback system that guides programmers in iteratively modifying their application source code. This helps leverage the compiler’s ability to generate loop-parallel code. We employ our system to modify two sequential benchmarks dealing with image processing and edge detection...

  7. Safety and maintenance engineering: A compilation

    Science.gov (United States)

    1974-01-01

    A compilation is presented for the dissemination of information on technological developments which have potential utility outside the aerospace and nuclear communities. Safety of personnel engaged in the handling of hazardous materials and equipment, protection of equipment from fire, high wind, or careless handling by personnel, and techniques for the maintenance of operating equipment are reported.

  8. Compilation of information on melter modeling

    International Nuclear Information System (INIS)

    Eyler, L.L.

    1996-03-01

    The objective of the task described in this report is to compile information on modeling capabilities for the High-Temperature Melter and the Cold Crucible Melter and issue a modeling capabilities letter report summarizing existing modeling capabilities. The report is to include strategy recommendations for future modeling efforts to support the High Level Waste (BLW) melter development

  9. Design of methodology for incremental compiler construction

    Directory of Open Access Journals (Sweden)

    Pavel Haluza

    2011-01-01

    Full Text Available The paper deals with possibilities of the incremental compiler construction. It represents the compiler construction possibilities for languages with a fixed set of lexical units and for languages with a variable set of lexical units, too. The methodology design for the incremental compiler construction is based on the known algorithms for standard compiler construction and derived for both groups of languages. Under the group of languages with a fixed set of lexical units there belong languages, where each lexical unit has its constant meaning, e.g., common programming languages. For this group of languages the paper tries to solve the problem of the incremental semantic analysis, which is based on incremental parsing. In the group of languages with a variable set of lexical units (e.g., professional typographic system TEX, it is possible to change arbitrarily the meaning of each character on the input file at any time during processing. The change takes effect immediately and its validity can be somehow limited or is given by the end of the input. For this group of languages this paper tries to solve the problem case when we use macros temporarily changing the category of arbitrary characters.

  10. Compilation of cross-sections. Pt. 1

    International Nuclear Information System (INIS)

    Flaminio, V.; Moorhead, W.G.; Morrison, D.R.O.; Rivoire, N.

    1983-01-01

    A compilation of integral cross-sections for hadronic reactions is presented. This is an updated version of CERN/HERA 79-1, 79-2, 79-3. It contains all data published up to the beginning of 1982, but some more recent data have also been included. Plots of the cross-sections versus incident laboratory momentum are also given. (orig.)

  11. Verified compilation of Concurrent Managed Languages

    Science.gov (United States)

    2017-11-01

    Communications Division Information Directorate This report is published in the interest of scientific and technical information exchange, and its...271, 2007. [85] Viktor Vafeiadis. Modular fine-grained concurrency verification. Technical Report UCAM-CL-TR- 726, University of Cambridge, Computer...VERIFIED COMPILATION OF CONCURRENT MANAGED LANGUAGES PURDUE UNIVERSITY NOVEMBER 2017 FINAL TECHNICAL REPORT APPROVED FOR PUBLIC RELEASE

  12. Nuclear power plant operational data compilation system

    International Nuclear Information System (INIS)

    Silberberg, S.

    1980-01-01

    Electricite de France R and D Division has set up a nuclear power plant operational data compilation system. This data bank, created through American documents allows results about plant operation and operational material behaviour to be given. At present, French units at commercial operation are taken into account. Results obtained after five years of data bank operation are given. (author)

  13. Compiler-Agnostic Function Detection in Binaries

    NARCIS (Netherlands)

    Andriesse, D.A.; Slowinska, J.M.; Bos, H.J.

    2017-01-01

    We propose Nucleus, a novel function detection algorithm for binaries. In contrast to prior work, Nucleus is compiler-agnostic, and does not require any learning phase or signature information. Instead of scanning for signatures, Nucleus detects functions at the Control Flow Graph-level, making it

  14. Compiler Driven Code Comments and Refactoring

    DEFF Research Database (Denmark)

    Larsen, Per; Ladelsky, Razya; Karlsson, Sven

    2011-01-01

    . We demonstrate the ability of our tool to trans- form code, and suggest code refactoring that increase its amenability to optimization. The preliminary results shows that, with our tool-set, au- tomatic loop parallelization with the GNU C compiler, gcc, yields 8.6x best-case speedup over...

  15. Expectation Levels in Dictionary Consultation and Compilation ...

    African Journals Online (AJOL)

    Dictionary consultation and compilation is a two-way engagement between two parties, namely a dictionary user and a lexicographer. How well users cope with looking up words in a Bantu language dictionary and to what extent their expectations are met, depends on their consultation skills, their knowledge of the structure ...

  16. Expectation Levels in Dictionary Consultation and Compilation*

    African Journals Online (AJOL)

    Abstract: Dictionary consultation and compilation is a two-way engagement between two par- ties, namely a dictionary user and a lexicographer. How well users cope with looking up words in a Bantu language dictionary and to what extent their expectations are met, depends on their con- sultation skills, their knowledge of ...

  17. Compilation of cross-sections. Pt. 4

    International Nuclear Information System (INIS)

    Alekhin, S.I.; Ezhela, V.V.; Lugovsky, S.B.; Tolstenkov, A.N.; Yushchenko, O.P.; Baldini, A.; Cobal, M.; Flaminio, V.; Capiluppi, P.; Giacomelli, G.; Mandrioli, G.; Rossi, A.M.; Serra, P.; Moorhead, W.G.; Morrison, D.R.O.; Rivoire, N.

    1987-01-01

    This is the fourth volume in our series of data compilations on integrated cross-sections for weak, electromagnetic, and strong interaction processes. This volume covers data on reactions induced by photons, neutrinos, hyperons, and K L 0 . It contains all data published up to June 1986. Plots of the cross-sections versus incident laboratory momentum are also given. (orig.)

  18. 1991 OCRWM bulletin compilation and index

    International Nuclear Information System (INIS)

    1992-05-01

    The OCRWM Bulletin is published by the Department of Energy, Office of Civilian Radioactive Waste Management, to provide current information about the national program for managing spent fuel and high-level radioactive waste. The document is a compilation of issues from the 1991 calendar year. A table of contents and an index have been provided to reference information contained in this year's Bulletins

  19. Compiling Relational Bayesian Networks for Exact Inference

    DEFF Research Database (Denmark)

    Jaeger, Manfred; Darwiche, Adnan; Chavira, Mark

    2006-01-01

    We describe in this paper a system for exact inference with relational Bayesian networks as defined in the publicly available PRIMULA tool. The system is based on compiling propositional instances of relational Bayesian networks into arithmetic circuits and then performing online inference...

  20. Compiling Relational Bayesian Networks for Exact Inference

    DEFF Research Database (Denmark)

    Jaeger, Manfred; Chavira, Mark; Darwiche, Adnan

    2004-01-01

    We describe a system for exact inference with relational Bayesian networks as defined in the publicly available \\primula\\ tool. The system is based on compiling propositional instances of relational Bayesian networks into arithmetic circuits and then performing online inference by evaluating...

  1. A Computer Program to Compile a Flander-Amidon Interaction Analysis Matrix

    Science.gov (United States)

    Hardy, Robert C.

    1970-01-01

    A program was written in FORTRAN IV for an IBM 3600 to produce the Flanders-Amidon Interaction Analysis Matrix and to also produce percentages of certain p FORTRAN IV and V for the Univac 1108. (Editor/RT)

  2. HAL/S-FC compiler system functional specification

    Science.gov (United States)

    1974-01-01

    Compiler organization is discussed, including overall compiler structure, internal data transfer, compiler development, and code optimization. The user, system, and SDL interfaces are described, along with compiler system requirements. Run-time software support package and restrictions and dependencies are also considered of the HAL/S-FC system.

  3. Optimization of Grillages Using Genetic Algorithms for Integrating Matlab and Fortran Environments

    Directory of Open Access Journals (Sweden)

    Darius Mačiūnas

    2012-12-01

    Full Text Available The purpose of the paper is to present technology applied for the global optimization of grillage-type pile foundations (further grillages. The goal of optimization is to obtain the optimal layout of pile placement in the grillages. The problem can be categorized as a topology optimization problem. The objective function is comprised of maximum reactive force emerging in a pile. The reactive force is minimized during the procedure of optimization during which variables enclose the positions of piles beneath connecting beams. Reactive forces in all piles are computed utilizing an original algorithm implemented in the Fortran programming language. The algorithm is integrated into the MatLab environment where the optimization procedure is executed utilizing a genetic algorithm. The article also describes technology enabling the integration of MatLab and Fortran environments. The authors seek to evaluate the quality of a solution to the problem analyzing experimental results obtained applying the proposed technology.

  4. Optimization of Grillages Using Genetic Algorithms for Integrating Matlab and Fortran Environments

    Directory of Open Access Journals (Sweden)

    Darius Mačiūnas

    2013-02-01

    Full Text Available The purpose of the paper is to present technology applied for the global optimization of grillage-type pile foundations (further grillages. The goal of optimization is to obtain the optimal layout of pile placement in the grillages. The problem can be categorized as a topology optimization problem. The objective function is comprised of maximum reactive force emerging in a pile. The reactive force is minimized during the procedure of optimization during which variables enclose the positions of piles beneath connecting beams. Reactive forces in all piles are computed utilizing an original algorithm implemented in the Fortran programming language. The algorithm is integrated into the MatLab environment where the optimization procedure is executed utilizing a genetic algorithm. The article also describes technology enabling the integration of MatLab and Fortran environments. The authors seek to evaluate the quality of a solution to the problem analyzing experimental results obtained applying the proposed technology.

  5. Perspex machine: V. Compilation of C programs

    Science.gov (United States)

    Spanner, Matthew P.; Anderson, James A. D. W.

    2006-01-01

    The perspex machine arose from the unification of the Turing machine with projective geometry. The original, constructive proof used four special, perspective transformations to implement the Turing machine in projective geometry. These four transformations are now generalised and applied in a compiler, implemented in Pop11, that converts a subset of the C programming language into perspexes. This is interesting both from a geometrical and a computational point of view. Geometrically, it is interesting that program source can be converted automatically to a sequence of perspective transformations and conditional jumps, though we find that the product of homogeneous transformations with normalisation can be non-associative. Computationally, it is interesting that program source can be compiled for a Reduced Instruction Set Computer (RISC), the perspex machine, that is a Single Instruction, Zero Exception (SIZE) computer.

  6. A browser-based tool for conversion between Fortran NAMELIST and XML/HTML

    Science.gov (United States)

    Naito, O.

    A browser-based tool for conversion between Fortran NAMELIST and XML/HTML is presented. It runs on an HTML5 compliant browser and generates reusable XML files to aid interoperability. It also provides a graphical interface for editing and annotating variables in NAMELIST, hence serves as a primitive code documentation environment. Although the tool is not comprehensive, it could be viewed as a test bed for integrating legacy codes into modern systems.

  7. FORTRAN subroutine for computing the optimal estimate of f(x)

    International Nuclear Information System (INIS)

    Gaffney, P.W.

    1980-10-01

    A FORTRAN subroutine called RANGE is presented that is designed to compute the optimal estimate of a function f given values of the function at n distinct points x 1 2 < ... < x/sub n/ and given a bound on one of the derivatives of f. We donate this estimate by Ω. It is optimal in the sense that the error abs value (f - Ω) has the smallest possible error bound

  8. A browser-based tool for conversion between Fortran NAMELIST and XML/HTML

    Directory of Open Access Journals (Sweden)

    O. Naito

    2017-01-01

    Full Text Available A browser-based tool for conversion between Fortran NAMELIST and XML/HTML is presented. It runs on an HTML5 compliant browser and generates reusable XML files to aid interoperability. It also provides a graphical interface for editing and annotating variables in NAMELIST, hence serves as a primitive code documentation environment. Although the tool is not comprehensive, it could be viewed as a test bed for integrating legacy codes into modern systems.

  9. The inverse of winnowing: a FORTRAN subroutine and discussion of unwinnowing discrete data

    Science.gov (United States)

    Bracken, Robert E.

    2004-01-01

    This report describes an unwinnowing algorithm that utilizes a discrete Fourier transform, and a resulting Fortran subroutine that winnows or unwinnows a 1-dimensional stream of discrete data; the source code is included. The unwinnowing algorithm effectively increases (by integral factors) the number of available data points while maintaining the original frequency spectrum of a data stream. This has utility when an increased data density is required together with an availability of higher order derivatives that honor the original data.

  10. A FORTRAN version implementation of block adjustment of CCD frames and its preliminary application

    Science.gov (United States)

    Yu, Y.; Tang, Z.-H.; Li, J.-L.; Zhao, M.

    2005-09-01

    A FORTRAN version implementation of the block adjustment (BA) of overlapping CCD frames is developed and its flowchart is shown. The program is preliminarily applied to obtain the optical positions of four extragalactic radio sources. The results show that because of the increase in the number and sky coverage of reference stars the precision of optical positions with BA is improved compared with the single CCD frame adjustment.

  11. Compilation of data from hadronic atoms

    International Nuclear Information System (INIS)

    Poth, H.

    1979-01-01

    This compilation is a survey of the existing data of hadronic atoms (pionic-atoms, kaonic-atoms, antiprotonic-atoms, sigmonic-atoms). It collects measurements of the energies, intensities and line width of X-rays from hadronic atoms. Averaged values for each hadronic atom are given and the data are summarized. The listing contains data on 58 pionic-atoms, on 54 kaonic-atoms, on 23 antiprotonic-atoms and on 20 sigmonic-atoms. (orig./HB) [de

  12. A compilation of subsurface hydrogeologic data

    International Nuclear Information System (INIS)

    1986-03-01

    This report presents a compilation of both fracture properties and hydrogeological parameters relevant to the flow of groundwater in fractured rock systems. Methods of data acquisition as well as the scale of and conditions during the measurement are recorded. Measurements and analytical techniques for each of the parameters under consideration have been reviewed with respect to their methodology, assumptions and accuracy. Both the rock type and geologic setting associated with these measurements have also been recorded. 373 refs

  13. Compilation of requests for nuclear data

    International Nuclear Information System (INIS)

    1981-03-01

    A request list for nuclear data which was produced from a computerized data file by the National Nuclear Data Center is presented. The request list is given by target nucleus (isotope) and then reaction type. The purpose of the compilation is to summarize the current needs of US Nuclear Energy programs and other applied technologies for nuclear data. Requesters are identified by laboratory, last name, and sponsoring US government agency

  14. Existing Fortran interfaces to Trilinos in preparation for exascale ForTrilinos development

    Energy Technology Data Exchange (ETDEWEB)

    Evans, Katherine J. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Young, Mitchell T. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Collins, Benjamin S. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Johnson, Seth R. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Prokopenko, Andrey V. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Heroux, Michael A. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)

    2017-03-01

    This report summarizes the current state of Fortran interfaces to the Trilinos library within several key applications of the Exascale Computing Program (ECP), with the aim of informing developers about strategies to develop ForTrilinos, an exascale-ready, Fortran interface software package within Trilinos. The two software projects assessed within are the DOE Office of Science's Accelerated Climate Model for Energy (ACME) atmosphere component, CAM, and the DOE Office of Nuclear Energy's core-simulator portion of VERA, a nuclear reactor simulation code. Trilinos is an object-oriented, C++ based software project, and spans a collection of algorithms and other enabling technologies such as uncertainty quantification and mesh generation. To date, Trilinos has enabled these codes to achieve large-scale simulation results, however the simulation needs of CAM and VERA-CS will approach exascale over the next five years. A Fortran interface to Trilinos that enables efficient use of programming models and more advanced algorithms is necessary. Where appropriate, the needs of the CAM and VERA-CS software to achieve their simulation goals are called out specifically. With this report, a design document and execution plan for ForTrilinos development can proceed.

  15. Acceleration of Monte Carlo simulation of photon migration in complex heterogeneous media using Intel many-integrated core architecture.

    Science.gov (United States)

    Gorshkov, Anton V; Kirillin, Mikhail Yu

    2015-08-01

    Over two decades, the Monte Carlo technique has become a gold standard in simulation of light propagation in turbid media, including biotissues. Technological solutions provide further advances of this technique. The Intel Xeon Phi coprocessor is a new type of accelerator for highly parallel general purpose computing, which allows execution of a wide range of applications without substantial code modification. We present a technical approach of porting our previously developed Monte Carlo (MC) code for simulation of light transport in tissues to the Intel Xeon Phi coprocessor. We show that employing the accelerator allows reducing computational time of MC simulation and obtaining simulation speed-up comparable to GPU. We demonstrate the performance of the developed code for simulation of light transport in the human head and determination of the measurement volume in near-infrared spectroscopy brain sensing.

  16. Performance Evaluation of an Intel Haswell- and Ivy Bridge-Based Supercomputer Using Scientific and Engineering Applications

    Science.gov (United States)

    Saini, Subhash; Hood, Robert T.; Chang, Johnny; Baron, John

    2016-01-01

    We present a performance evaluation conducted on a production supercomputer of the Intel Xeon Processor E5- 2680v3, a twelve-core implementation of the fourth-generation Haswell architecture, and compare it with Intel Xeon Processor E5-2680v2, an Ivy Bridge implementation of the third-generation Sandy Bridge architecture. Several new architectural features have been incorporated in Haswell including improvements in all levels of the memory hierarchy as well as improvements to vector instructions and power management. We critically evaluate these new features of Haswell and compare with Ivy Bridge using several low-level benchmarks including subset of HPCC, HPCG and four full-scale scientific and engineering applications. We also present a model to predict the performance of HPCG and Cart3D within 5%, and Overflow within 10% accuracy.

  17. GNAQPMS v1.1: accelerating the Global Nested Air Quality Prediction Modeling System (GNAQPMS) on Intel Xeon Phi processors

    OpenAIRE

    H. Wang; H. Wang; H. Wang; H. Wang; H. Chen; H. Chen; Q. Wu; Q. Wu; J. Lin; X. Chen; X. Xie; R. Wang; R. Wang; X. Tang; Z. Wang

    2017-01-01

    The Global Nested Air Quality Prediction Modeling System (GNAQPMS) is the global version of the Nested Air Quality Prediction Modeling System (NAQPMS), which is a multi-scale chemical transport model used for air quality forecast and atmospheric environmental research. In this study, we present the porting and optimisation of GNAQPMS on a second-generation Intel Xeon Phi processor, codenamed Knights Landing (KNL). Compared with the first-generation Xeon Phi coprocessor (code...

  18. Student Intern Ben Freed Competes as Finalist in Intel STS Competition, Three Other Interns Named Semifinalists | Poster

    Science.gov (United States)

    By Ashley DeVine, Staff Writer Werner H. Kirstin (WHK) student intern Ben Freed was one of 40 finalists to compete in the Intel Science Talent Search (STS) in Washington, DC, in March. “It was seven intense days of interacting with amazing judges and incredibly smart and interesting students. We met President Obama, and then the MIT astronomy lab named minor planets after each

  19. Evaluating the transport layer of the ALFA framework for the Intel(®) Xeon Phi(™) Coprocessor

    OpenAIRE

    Santogidis, Aram; Hirstius, Andreas; Lalis, Spyros

    2015-01-01

    The ALFA framework supports the software development of major High Energy Physics experiments. As part of our research effort to optimize the transport layer of ALFA, we focus on profiling its data transfer performance for inter-node communication on the Intel Xeon Phi Coprocessor. In this article we present the collected performance measurements with the related analysis of the results. The optimization opportunities that are discovered, help us to formulate the future plans of enabling high...

  20. Computationally efficient implementation of sarse-tap FIR adaptive filters with tap-position control on intel IA-32 processors

    OpenAIRE

    Hirano, Akihiro; Nakayama, Kenji

    2008-01-01

    This paper presents an computationally ef cient implementation of sparse-tap FIR adaptive lters with tapposition control on Intel IA-32 processors with single-instruction multiple-data (SIMD) capability. In order to overcome randomorder memory access which prevents a ectorization, a blockbased processing and a re-ordering buffer are introduced. A dynamic register allocation and the use of memory-to-register operations help the maximization of the loop-unrolling level. Up to 66percent speedup ...

  1. Efficient irregular wavefront propagation algorithms on Intel® Xeon Phi™.

    Science.gov (United States)

    Gomes, Jeremias M; Teodoro, George; de Melo, Alba; Kong, Jun; Kurc, Tahsin; Saltz, Joel H

    2015-10-01

    We investigate the execution of the Irregular Wavefront Propagation Pattern (IWPP), a fundamental computing structure used in several image analysis operations, on the Intel ® Xeon Phi ™ co-processor. An efficient implementation of IWPP on the Xeon Phi is a challenging problem because of IWPP's irregularity and the use of atomic instructions in the original IWPP algorithm to resolve race conditions. On the Xeon Phi, the use of SIMD and vectorization instructions is critical to attain high performance. However, SIMD atomic instructions are not supported. Therefore, we propose a new IWPP algorithm that can take advantage of the supported SIMD instruction set. We also evaluate an alternate storage container (priority queue) to track active elements in the wavefront in an effort to improve the parallel algorithm efficiency. The new IWPP algorithm is evaluated with Morphological Reconstruction and Imfill operations as use cases. Our results show performance improvements of up to 5.63 × on top of the original IWPP due to vectorization. Moreover, the new IWPP achieves speedups of 45.7 × and 1.62 × , respectively, as compared to efficient CPU and GPU implementations.

  2. Optimizing meridional advection of the Advanced Research WRF (ARW) dynamics for Intel Xeon Phi coprocessor

    Science.gov (United States)

    Mielikainen, Jarno; Huang, Bormin; Huang, Allen H.-L.

    2015-05-01

    The most widely used community weather forecast and research model in the world is the Weather Research and Forecast (WRF) model. Two distinct varieties of WRF exist. The one we are interested is the Advanced Research WRF (ARW) is an experimental, advanced research version featuring very high resolution. The WRF Nonhydrostatic Mesoscale Model (WRF-NMM) has been designed for forecasting operations. WRF consists of dynamics code and several physics modules. The WRF-ARW core is based on an Eulerian solver for the fully compressible nonhydrostatic equations. In the paper, we optimize a meridional (north-south direction) advection subroutine for Intel Xeon Phi coprocessor. Advection is of the most time consuming routines in the ARW dynamics core. It advances the explicit perturbation horizontal momentum equations by adding in the large-timestep tendency along with the small timestep pressure gradient tendency. We will describe the challenges we met during the development of a high-speed dynamics code subroutine for MIC architecture. Furthermore, lessons learned from the code optimization process will be discussed. The results show that the optimizations improved performance of the original code on Xeon Phi 7120P by a factor of 1.2x.

  3. Software and DVFS Tuning for Performance and Energy-Efficiency on Intel KNL Processors

    Directory of Open Access Journals (Sweden)

    Enrico Calore

    2018-06-01

    Full Text Available Energy consumption of processors and memories is quickly becoming a limiting factor in the deployment of large computing systems. For this reason, it is important to understand the energy performance of these processors and to study strategies allowing their use in the most efficient way. In this work, we focus on the computing and energy performance of the Knights Landing Xeon Phi, the latest Intel many-core architecture processor for HPC applications. We consider the 64-core Xeon Phi 7230 and profile its performance and energy efficiency using both its on-chip MCDRAM and the off-chip DDR4 memory as the main storage for application data. As a benchmark application, we use a lattice Boltzmann code heavily optimized for this architecture and implemented using several different arrangements of the application data in memory (data-layouts, in short. We also assess the dependence of energy consumption on data-layouts, memory configurations (DDR4 or MCDRAM and the number of threads per core. We finally consider possible trade-offs between computing performance and energy efficiency, tuning the clock frequency of the processor using the Dynamic Voltage and Frequency Scaling (DVFS technique.

  4. A comparison of SuperLU solvers on the intel MIC architecture

    Science.gov (United States)

    Tuncel, Mehmet; Duran, Ahmet; Celebi, M. Serdar; Akaydin, Bora; Topkaya, Figen O.

    2016-10-01

    In many science and engineering applications, problems may result in solving a sparse linear system AX=B. For example, SuperLU_MCDT, a linear solver, was used for the large penta-diagonal matrices for 2D problems and hepta-diagonal matrices for 3D problems, coming from the incompressible blood flow simulation (see [1]). It is important to test the status and potential improvements of state-of-the-art solvers on new technologies. In this work, sequential, multithreaded and distributed versions of SuperLU solvers (see [2]) are examined on the Intel Xeon Phi coprocessors using offload programming model at the EURORA cluster of CINECA in Italy. We consider a portfolio of test matrices containing patterned matrices from UFMM ([3]) and randomly located matrices. This architecture can benefit from high parallelism and large vectors. We find that the sequential SuperLU benefited up to 45 % performance improvement from the offload programming depending on the sparse matrix type and the size of transferred and processed data.

  5. Modeling high-temperature superconductors and metallic alloys on the Intel IPSC/860

    Science.gov (United States)

    Geist, G. A.; Peyton, B. W.; Shelton, W. A.; Stocks, G. M.

    Oak Ridge National Laboratory has embarked on several computational Grand Challenges, which require the close cooperation of physicists, mathematicians, and computer scientists. One of these projects is the determination of the material properties of alloys from first principles and, in particular, the electronic structure of high-temperature superconductors. While the present focus of the project is on superconductivity, the approach is general enough to permit study of other properties of metallic alloys such as strength and magnetic properties. This paper describes the progress to date on this project. We include a description of a self-consistent KKR-CPA method, parallelization of the model, and the incorporation of a dynamic load balancing scheme into the algorithm. We also describe the development and performance of a consolidated KKR-CPA code capable of running on CRAYs, workstations, and several parallel computers without source code modification. Performance of this code on the Intel iPSC/860 is also compared to a CRAY 2, CRAY YMP, and several workstations. Finally, some density of state calculations of two perovskite superconductors are given.

  6. Efficient irregular wavefront propagation algorithms on Intel® Xeon Phi™

    Science.gov (United States)

    Gomes, Jeremias M.; Teodoro, George; de Melo, Alba; Kong, Jun; Kurc, Tahsin; Saltz, Joel H.

    2016-01-01

    We investigate the execution of the Irregular Wavefront Propagation Pattern (IWPP), a fundamental computing structure used in several image analysis operations, on the Intel® Xeon Phi™ co-processor. An efficient implementation of IWPP on the Xeon Phi is a challenging problem because of IWPP’s irregularity and the use of atomic instructions in the original IWPP algorithm to resolve race conditions. On the Xeon Phi, the use of SIMD and vectorization instructions is critical to attain high performance. However, SIMD atomic instructions are not supported. Therefore, we propose a new IWPP algorithm that can take advantage of the supported SIMD instruction set. We also evaluate an alternate storage container (priority queue) to track active elements in the wavefront in an effort to improve the parallel algorithm efficiency. The new IWPP algorithm is evaluated with Morphological Reconstruction and Imfill operations as use cases. Our results show performance improvements of up to 5.63× on top of the original IWPP due to vectorization. Moreover, the new IWPP achieves speedups of 45.7× and 1.62×, respectively, as compared to efficient CPU and GPU implementations. PMID:27298591

  7. Deployment of the OSIRIS EM-PIC code on the Intel Knights Landing architecture

    Science.gov (United States)

    Fonseca, Ricardo

    2017-10-01

    Electromagnetic particle-in-cell (EM-PIC) codes such as OSIRIS have found widespread use in modelling the highly nonlinear and kinetic processes that occur in several relevant plasma physics scenarios, ranging from astrophysical settings to high-intensity laser plasma interaction. Being computationally intensive, these codes require large scale HPC systems, and a continuous effort in adapting the algorithm to new hardware and computing paradigms. In this work, we report on our efforts on deploying the OSIRIS code on the new Intel Knights Landing (KNL) architecture. Unlike the previous generation (Knights Corner), these boards are standalone systems, and introduce several new features, include the new AVX-512 instructions and on-package MCDRAM. We will focus on the parallelization and vectorization strategies followed, as well as memory management, and present a detailed performance evaluation of code performance in comparison with the CPU code. This work was partially supported by Fundaçã para a Ciência e Tecnologia (FCT), Portugal, through Grant No. PTDC/FIS-PLA/2940/2014.

  8. Plasma Science and Applications at the Intel Science Fair: A Retrospective

    Science.gov (United States)

    Berry, Lee

    2009-11-01

    For the past five years, the Coalition for Plasma Science (CPS) has presented an award for a plasma project at the Intel International Science and Engineering Fair (ISEF). Eligible projects have ranged from grape-based plasma production in a microwave oven to observation of the effects of viscosity in a fluid model of quark-gluon plasma. Most projects have been aimed at applications, including fusion, thrusters, lighting, materials processing, and GPS improvements. However diagnostics (spectroscopy), technology (magnets), and theory (quark-gluon plasmas) have also been represented. All of the CPS award-winning projects so far have been based on experiments, with two awards going to women students and three to men. Since the award was initiated, both the number and quality of plasma projects has increased. The CPS expects this trend to continue, and looks forward to continuing its work with students who are excited about the possibilities of plasma. You too can share this excitement by judging at the 2010 fair in San Jose on May 11-12.

  9. Implementation of 5-layer thermal diffusion scheme in weather research and forecasting model with Intel Many Integrated Cores

    Science.gov (United States)

    Huang, Melin; Huang, Bormin; Huang, Allen H.

    2014-10-01

    For weather forecasting and research, the Weather Research and Forecasting (WRF) model has been developed, consisting of several components such as dynamic solvers and physical simulation modules. WRF includes several Land- Surface Models (LSMs). The LSMs use atmospheric information, the radiative and precipitation forcing from the surface layer scheme, the radiation scheme, and the microphysics/convective scheme all together with the land's state variables and land-surface properties, to provide heat and moisture fluxes over land and sea-ice points. The WRF 5-layer thermal diffusion simulation is an LSM based on the MM5 5-layer soil temperature model with an energy budget that includes radiation, sensible, and latent heat flux. The WRF LSMs are very suitable for massively parallel computation as there are no interactions among horizontal grid points. The features, efficient parallelization and vectorization essentials, of Intel Many Integrated Core (MIC) architecture allow us to optimize this WRF 5-layer thermal diffusion scheme. In this work, we present the results of the computing performance on this scheme with Intel MIC architecture. Our results show that the MIC-based optimization improved the performance of the first version of multi-threaded code on Xeon Phi 5110P by a factor of 2.1x. Accordingly, the same CPU-based optimizations improved the performance on Intel Xeon E5- 2603 by a factor of 1.6x as compared to the first version of multi-threaded code.

  10. Evaluation of the Intel Xeon Phi 7120 and NVIDIA K80 as accelerators for two-dimensional panel codes.

    Science.gov (United States)

    Einkemmer, Lukas

    2017-01-01

    To optimize the geometry of airfoils for a specific application is an important engineering problem. In this context genetic algorithms have enjoyed some success as they are able to explore the search space without getting stuck in local optima. However, these algorithms require the computation of aerodynamic properties for a significant number of airfoil geometries. Consequently, for low-speed aerodynamics, panel methods are most often used as the inner solver. In this paper we evaluate the performance of such an optimization algorithm on modern accelerators (more specifically, the Intel Xeon Phi 7120 and the NVIDIA K80). For that purpose, we have implemented an optimized version of the algorithm on the CPU and Xeon Phi (based on OpenMP, vectorization, and the Intel MKL library) and on the GPU (based on CUDA and the MAGMA library). We present timing results for all codes and discuss the similarities and differences between the three implementations. Overall, we observe a speedup of approximately 2.5 for adding an Intel Xeon Phi 7120 to a dual socket workstation and a speedup between 3.4 and 3.8 for adding a NVIDIA K80 to a dual socket workstation.

  11. Notes on Compiling a Corpus- Based Dictionary

    Directory of Open Access Journals (Sweden)

    František Čermák

    2011-10-01

    Full Text Available

    ABSTRACT: On the basis of sample analysis of a Czech adjective, a definition based on the data drawn from the Czech National Corpus (cf. Čermák and Schmiedtová 2003 is gradually compiled and finally offered, pointing at the drawbacks of definitions found in traditional dictionaries. Steps undertaken here are then generalized and used, in an ordered sequence (similar to a work-flow ordering, as topics, briefly discussed in the second part to which lexicographers of monolingual dictionaries should pay attention. These are supplemented by additional remarks and caveats useful in the compilation of a dictionary. Thus, a brief survey of some of the major steps of dictionary compilation is presented here, supplemented by the original Czech data, analyzed in their raw, though semiotically classified form.

    OPSOMMING: Aantekeninge oor die samestelling van 'n korpusgebaseerde woordeboek. Op grond van 'n steekproefontleding van 'n Tsjeggiese adjektief, word 'n definisie gebaseer op data ontleen aan die Tsjeggiese Nasionale Korpus (cf. Čermák en Schmiedtová 2003 geleidelik saamgestel en uiteindelik aangebied wat wys op die gebreke van definisies aangetref in tradisionele woordeboeke. Stappe wat hier onderneem word, word dan veralgemeen en gebruik in 'n geordende reeks (soortgelyk aan 'n werkvloeiordening, as onderwerpe, kortliks bespreek in die tweede deel, waaraan leksikograwe van eentalige woordeboeke aandag behoort te gee. Hulle word aangevul deur bykomende opmerkings en waarskuwings wat nuttig is vir die samestelling van 'n woordeboek. Op dié manier word 'n kort oorsig van sommige van die hoofstappe van woordeboeksamestelling hier aangebied, aangevul deur die oorspronklike Tsjeggiese data, ontleed in hul onbewerkte, alhoewel semioties geklassifiseerde vorm.

    Sleutelwoorde: EENTALIGE WOORDEBOEKE, KORPUSLEKSIKOGRAFIE, SINTAGMATIEK EN PARADIGMATIEK IN WOORDEBOEKE, WOORDEBOEKINSKRYWING, SOORTE LEMMAS, PRAGMATIEK, BEHANDELING VAN

  12. A Performance Tuning Methodology with Compiler Support

    Directory of Open Access Journals (Sweden)

    Oscar Hernandez

    2008-01-01

    Full Text Available We have developed an environment, based upon robust, existing, open source software, for tuning applications written using MPI, OpenMP or both. The goal of this effort, which integrates the OpenUH compiler and several popular performance tools, is to increase user productivity by providing an automated, scalable performance measurement and optimization system. In this paper we describe our environment, show how these complementary tools can work together, and illustrate the synergies possible by exploiting their individual strengths and combined interactions. We also present a methodology for performance tuning that is enabled by this environment. One of the benefits of using compiler technology in this context is that it can direct the performance measurements to capture events at different levels of granularity and help assess their importance, which we have shown to significantly reduce the measurement overheads. The compiler can also help when attempting to understand the performance results: it can supply information on how a code was translated and whether optimizations were applied. Our methodology combines two performance views of the application to find bottlenecks. The first is a high level view that focuses on OpenMP/MPI performance problems such as synchronization cost and load imbalances; the second is a low level view that focuses on hardware counter analysis with derived metrics that assess the efficiency of the code. Our experiments have shown that our approach can significantly reduce overheads for both profiling and tracing to acceptable levels and limit the number of times the application needs to be run with selected hardware counters. In this paper, we demonstrate the workings of this methodology by illustrating its use with selected NAS Parallel Benchmarks and a cloud resolving code.

  13. Evaluation of HAL/S language compilability using SAMSO's Compiler Writing System (CWS)

    Science.gov (United States)

    Feliciano, M.; Anderson, H. D.; Bond, J. W., III

    1976-01-01

    NASA/Langley is engaged in a program to develop an adaptable guidance and control software concept for spacecraft such as shuttle-launched payloads. It is envisioned that this flight software be written in a higher-order language, such as HAL/S, to facilitate changes or additions. To make this adaptable software transferable to various onboard computers, a compiler writing system capability is necessary. A joint program with the Air Force Space and Missile Systems Organization was initiated to determine if the Compiler Writing System (CWS) owned by the Air Force could be utilized for this purpose. The present study explores the feasibility of including the HAL/S language constructs in CWS and the effort required to implement these constructs. This will determine the compilability of HAL/S using CWS and permit NASA/Langley to identify the HAL/S constructs desired for their applications. The study consisted of comparing the implementation of the Space Programming Language using CWS with the requirements for the implementation of HAL/S. It is the conclusion of the study that CWS already contains many of the language features of HAL/S and that it can be expanded for compiling part or all of HAL/S. It is assumed that persons reading and evaluating this report have a basic familiarity with (1) the principles of compiler construction and operation, and (2) the logical structure and applications characteristics of HAL/S and SPL.

  14. Compilation of accident statistics in PSE

    International Nuclear Information System (INIS)

    Jobst, C.

    1983-04-01

    The objective of the investigations on transportation carried out within the framework of the 'Project - Studies on Safety in Waste Management (PSE II)' is the determination of the risk of accidents in the transportation of radioactive materials by rail. The fault tree analysis is used for the determination of risks in the transportation system. This method offers a possibility for the determination of frequency and consequences of accidents which could lead to an unintended release of radionuclides. The study presented compiles all data obtained from the accident statistics of the Federal German Railways. (orig./RB) [de

  15. HAL/S-360 compiler system specification

    Science.gov (United States)

    Johnson, A. E.; Newbold, P. N.; Schulenberg, C. W.; Avakian, A. E.; Varga, S.; Helmers, P. H.; Helmers, C. T., Jr.; Hotz, R. L.

    1974-01-01

    A three phase language compiler is described which produces IBM 360/370 compatible object modules and a set of simulation tables to aid in run time verification. A link edit step augments the standard OS linkage editor. A comprehensive run time system and library provide the HAL/S operating environment, error handling, a pseudo real time executive, and an extensive set of mathematical, conversion, I/O, and diagnostic routines. The specifications of the information flow and content for this system are also considered.

  16. Compiling the First Monolingual Lusoga Dictionary

    Directory of Open Access Journals (Sweden)

    Minah Nabirye

    2011-10-01

    Full Text Available

    Abstract: In this research article a study is made of the approach followed to compile the first-ever monolingual dictionary for Lusoga. Lusoga is a Bantu language spoken in Uganda by slightly over two mil-lion people. Being an under-resourced language, the Lusoga orthography had to be designed, a grammar written, and a corpus built, before embarking on the compilation of the dictionary. This compilation was aimed at attaining an academic degree, hence requiring a rigorous research methodology. Firstly, the prevail-ing methods for compiling dictionaries were mainly practical and insufficient in explaining the theoretical linguistic basis for dictionary compilation. Since dictionaries are based on meaning, the theory of meaning was used to account for all linguistic data considered in dictionaries. However, meaning is considered at a very abstract level, far removed from the process of compiling dictionaries. Another theory, the theory of modularity, was used to bridge the gap between the theory of meaning and the compilation process. The modular theory explains how the different modules of a language contribute information to the different parts of the dictionary article or dictionary information in general. Secondly, the research also had to contend with the different approaches for analysing Bantu languages for Bantu and European audiences. A descrip-tion of the Bantu- and European-centred approaches to Bantu studies was undertaken in respect of (a the classification of Lusoga words, and (b the specification of their citations. As a result, Lusoga lexicography deviates from the prevailing Bantu classification and citation of nouns, adjectives and verbs in particular. The dictionary was tested on two separate occasions and all the feedback was considered in the compilation pro-cess. This article, then, gives an overall summary of all the steps involved in the compilation of the Eiwanika ly'Olusoga, i.e. the Monolingual Lusoga Dictionary

  17. abc: An extensible AspectJ compiler

    DEFF Research Database (Denmark)

    Avgustinov, Pavel; Christensen, Aske Simon; Hendren, Laurie

    2005-01-01

    checking and code generation, as well as data flow and control flow analyses. The AspectBench Compiler (abc) is an implementation of such a workbench. The base version of abc implements the full AspectJ language. Its frontend is built, using the Polyglot framework, as a modular extension of the Java...... language. The use of Polyglot gives flexibility of syntax and type checking. The backend is built using the Soot framework, to give modular code generation and analyses. In this paper, we outline the design of abc, focusing mostly on how the design supports extensibility. We then provide a general overview...

  18. Compilation of actinide neutron nuclear data

    International Nuclear Information System (INIS)

    1979-01-01

    The Swedish nuclear data committee has compiled a selected set of neutron cross section data for the 16 most important actinide isotopes. The aim of the report is to present available data in a comprehensible way to allow a comparison between different evaluated libraries and to judge about the reliability of these libraries from the experimental data. The data are given in graphical form below about 1 ev and above about 10 keV shile the 2200 m/s cross sections and resonance integrals are given in numerical form. (G.B.)

  19. Data compilation for particle impact desorption

    International Nuclear Information System (INIS)

    Oshiyama, Takashi; Nagai, Siro; Ozawa, Kunio; Takeuchi, Fujio.

    1984-05-01

    The desorption of gases from solid surfaces by incident electrons, ions and photons is one of the important processes of hydrogen recycling in the controlled thermonuclear reactors. We have surveyed the literature concerning the particle impact desorption published through 1983 and compiled the data on the desorption cross sections and desorption yields with the aid of a computer. This report presents the results obtained for electron stimulated desorption, the desorption cross sections and yields being given in graphs and tables as functions of incident electron energy, surface temperature and gas exposure. (author)

  20. Performance Evaluation of Computation and Communication Kernels of the Fast Multipole Method on Intel Manycore Architecture

    KAUST Repository

    AbdulJabbar, Mustafa Abdulmajeed

    2017-07-31

    Manycore optimizations are essential for achieving performance worthy of anticipated exascale systems. Utilization of manycore chips is inevitable to attain the desired floating point performance of these energy-austere systems. In this work, we revisit ExaFMM, the open source Fast Multiple Method (FMM) library, in light of highly tuned shared-memory parallelization and detailed performance analysis on the new highly parallel Intel manycore architecture, Knights Landing (KNL). We assess scalability and performance gain using task-based parallelism of the FMM tree traversal. We also provide an in-depth analysis of the most computationally intensive part of the traversal kernel (i.e., the particle-to-particle (P2P) kernel), by comparing its performance across KNL and Broadwell architectures. We quantify different configurations that exploit the on-chip 512-bit vector units within different task-based threading paradigms. MPI communication-reducing and NUMA-aware approaches for the FMM’s global tree data exchange are examined with different cluster modes of KNL. By applying several algorithm- and architecture-aware optimizations for FMM, we show that the N-Body kernel on 256 threads of KNL achieves on average 2.8× speedup compared to the non-vectorized version, whereas on 56 threads of Broadwell, it achieves on average 2.9× speedup. In addition, the tree traversal kernel on KNL scales monotonically up to 256 threads with task-based programming models. The MPI-based communication-reducing algorithms show expected improvements of the data locality across the KNL on-chip network.

  1. Performance Evaluation of Computation and Communication Kernels of the Fast Multipole Method on Intel Manycore Architecture

    KAUST Repository

    AbdulJabbar, Mustafa Abdulmajeed; Al Farhan, Mohammed; Yokota, Rio; Keyes, David E.

    2017-01-01

    Manycore optimizations are essential for achieving performance worthy of anticipated exascale systems. Utilization of manycore chips is inevitable to attain the desired floating point performance of these energy-austere systems. In this work, we revisit ExaFMM, the open source Fast Multiple Method (FMM) library, in light of highly tuned shared-memory parallelization and detailed performance analysis on the new highly parallel Intel manycore architecture, Knights Landing (KNL). We assess scalability and performance gain using task-based parallelism of the FMM tree traversal. We also provide an in-depth analysis of the most computationally intensive part of the traversal kernel (i.e., the particle-to-particle (P2P) kernel), by comparing its performance across KNL and Broadwell architectures. We quantify different configurations that exploit the on-chip 512-bit vector units within different task-based threading paradigms. MPI communication-reducing and NUMA-aware approaches for the FMM’s global tree data exchange are examined with different cluster modes of KNL. By applying several algorithm- and architecture-aware optimizations for FMM, we show that the N-Body kernel on 256 threads of KNL achieves on average 2.8× speedup compared to the non-vectorized version, whereas on 56 threads of Broadwell, it achieves on average 2.9× speedup. In addition, the tree traversal kernel on KNL scales monotonically up to 256 threads with task-based programming models. The MPI-based communication-reducing algorithms show expected improvements of the data locality across the KNL on-chip network.

  2. Thread-level parallelization and optimization of NWChem for the Intel MIC architecture

    Energy Technology Data Exchange (ETDEWEB)

    Shan, Hongzhang [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Williams, Samuel [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); de Jong, Wibe [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Oliker, Leonid [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)

    2015-01-01

    In the multicore era it was possible to exploit the increase in on-chip parallelism by simply running multiple MPI processes per chip. Unfortunately, manycore processors' greatly increased thread- and data-level parallelism coupled with a reduced memory capacity demand an altogether different approach. In this paper we explore augmenting two NWChem modules, triples correction of the CCSD(T) and Fock matrix construction, with OpenMP in order that they might run efficiently on future manycore architectures. As the next NERSC machine will be a self-hosted Intel MIC (Xeon Phi) based supercomputer, we leverage an existing MIC testbed at NERSC to evaluate our experiments. In order to proxy the fact that future MIC machines will not have a host processor, we run all of our experiments in native mode. We found that while straightforward application of OpenMP to the deep loop nests associated with the tensor contractions of CCSD(T) was sufficient in attaining high performance, significant e ort was required to safely and efeciently thread the TEXAS integral package when constructing the Fock matrix. Ultimately, our new MPI+OpenMP hybrid implementations attain up to 65× better performance for the triples part of the CCSD(T) due in large part to the fact that the limited on-card memory limits the existing MPI implementation to a single process per card. Additionally, we obtain up to 1.6× better performance on Fock matrix constructions when compared with the best MPI implementations running multiple processes per card.

  3. Using Intel's Knight Landing Processor to Accelerate Global Nested Air Quality Prediction Modeling System (GNAQPMS) Model

    Science.gov (United States)

    Wang, H.; Chen, H.; Chen, X.; Wu, Q.; Wang, Z.

    2016-12-01

    The Global Nested Air Quality Prediction Modeling System for Hg (GNAQPMS-Hg) is a global chemical transport model coupled Hg transport module to investigate the mercury pollution. In this study, we present our work of transplanting the GNAQPMS model on Intel Xeon Phi processor, Knights Landing (KNL) to accelerate the model. KNL is the second-generation product adopting Many Integrated Core Architecture (MIC) architecture. Compared with the first generation Knight Corner (KNC), KNL has more new hardware features, that it can be used as unique processor as well as coprocessor with other CPU. According to the Vtune tool, the high overhead modules in GNAQPMS model have been addressed, including CBMZ gas chemistry, advection and convection module, and wet deposition module. These high overhead modules were accelerated by optimizing code and using new techniques of KNL. The following optimized measures was done: 1) Changing the pure MPI parallel mode to hybrid parallel mode with MPI and OpenMP; 2.Vectorizing the code to using the 512-bit wide vector computation unit. 3. Reducing unnecessary memory access and calculation. 4. Reducing Thread Local Storage (TLS) for common variables with each OpenMP thread in CBMZ. 5. Changing the way of global communication from files writing and reading to MPI functions. After optimization, the performance of GNAQPMS is greatly increased both on CPU and KNL platform, the single-node test showed that optimized version has 2.6x speedup on two sockets CPU platform and 3.3x speedup on one socket KNL platform compared with the baseline version code, which means the KNL has 1.29x speedup when compared with 2 sockets CPU platform.

  4. Navier-Stokes Aerodynamic Simulation of the V-22 Osprey on the Intel Paragon MPP

    Science.gov (United States)

    Vadyak, Joseph; Shrewsbury, George E.; Narramore, Jim C.; Montry, Gary; Holst, Terry; Kwak, Dochan (Technical Monitor)

    1995-01-01

    The paper will describe the Development of a general three-dimensional multiple grid zone Navier-Stokes flowfield simulation program (ENS3D-MPP) designed for efficient execution on the Intel Paragon Massively Parallel Processor (MPP) supercomputer, and the subsequent application of this method to the prediction of the viscous flowfield about the V-22 Osprey tiltrotor vehicle. The flowfield simulation code solves the thin Layer or full Navier-Stoke's equation - for viscous flow modeling, or the Euler equations for inviscid flow modeling on a structured multi-zone mesh. In the present paper only viscous simulations will be shown. The governing difference equations are solved using a time marching implicit approximate factorization method with either TVD upwind or central differencing used for the convective terms and central differencing used for the viscous diffusion terms. Steady state or Lime accurate solutions can be calculated. The present paper will focus on steady state applications, although time accurate solution analysis is the ultimate goal of this effort. Laminar viscosity is calculated using Sutherland's law and the Baldwin-Lomax two layer algebraic turbulence model is used to compute the eddy viscosity. The Simulation method uses an arbitrary block, curvilinear grid topology. An automatic grid adaption scheme is incorporated which concentrates grid points in high density gradient regions. A variety of user-specified boundary conditions are available. This paper will present the application of the scalable and superscalable versions to the steady state viscous flow analysis of the V-22 Osprey using a multiple zone global mesh. The mesh consists of a series of sheared cartesian grid blocks with polar grids embedded within to better simulate the wing tip mounted nacelle. MPP solutions will be shown in comparison to equivalent Cray C-90 results and also in comparison to experimental data. Discussions on meshing considerations, wall clock execution time

  5. Thread-Level Parallelization and Optimization of NWChem for the Intel MIC Architecture

    Energy Technology Data Exchange (ETDEWEB)

    Shan, Hongzhang; Williams, Samuel; Jong, Wibe de; Oliker, Leonid

    2014-10-10

    In the multicore era it was possible to exploit the increase in on-chip parallelism by simply running multiple MPI processes per chip. Unfortunately, manycore processors' greatly increased thread- and data-level parallelism coupled with a reduced memory capacity demand an altogether different approach. In this paper we explore augmenting two NWChem modules, triples correction of the CCSD(T) and Fock matrix construction, with OpenMP in order that they might run efficiently on future manycore architectures. As the next NERSC machine will be a self-hosted Intel MIC (Xeon Phi) based supercomputer, we leverage an existing MIC testbed at NERSC to evaluate our experiments. In order to proxy the fact that future MIC machines will not have a host processor, we run all of our experiments in tt native mode. We found that while straightforward application of OpenMP to the deep loop nests associated with the tensor contractions of CCSD(T) was sufficient in attaining high performance, significant effort was required to safely and efficiently thread the TEXAS integral package when constructing the Fock matrix. Ultimately, our new MPI OpenMP hybrid implementations attain up to 65x better performance for the triples part of the CCSD(T) due in large part to the fact that the limited on-card memory limits the existing MPI implementation to a single process per card. Additionally, we obtain up to 1.6x better performance on Fock matrix constructions when compared with the best MPI implementations running multiple processes per card.

  6. Compiling models into real-time systems

    International Nuclear Information System (INIS)

    Dormoy, J.L.; Cherriaux, F.; Ancelin, J.

    1992-08-01

    This paper presents an architecture for building real-time systems from models, and model-compiling techniques. This has been applied for building a real-time model-based monitoring system for nuclear plants, called KSE, which is currently being used in two plants in France. We describe how we used various artificial intelligence techniques for building it: a model-based approach, a logical model of its operation, a declarative implementation of these models, and original knowledge-compiling techniques for automatically generating the real-time expert system from those models. Some of those techniques have just been borrowed from the literature, but we had to modify or invent other techniques which simply did not exist. We also discuss two important problems, which are often underestimated in the artificial intelligence literature: size, and errors. Our architecture, which could be used in other applications, combines the advantages of the model-based approach with the efficiency requirements of real-time applications, while in general model-based approaches present serious drawbacks on this point

  7. Regular expressions compiler and some applications

    International Nuclear Information System (INIS)

    Saldana A, H.

    1978-01-01

    We deal with high level programming language of a Regular Expressions Compiler (REC). The first chapter is an introduction in which the history of the REC development and the problems related to its numerous applicatons are described. The syntactic and sematic rules as well as the language features are discussed just after the introduction. Concerning the applicatons as examples, an adaptation is given in order to solve numerical problems and another for the data manipulation. The last chapter is an exposition of ideas and techniques about the compiler construction. Examples of the adaptation to numerical problems show the applications to education, vector analysis, quantum mechanics, physics, mathematics and other sciences. The rudiments of an operating system for a minicomputer are the examples of the adaptation to symbolic data manipulaton. REC is a programming language that could be applied to solve problems in almost any human activity. Handling of computer graphics, control equipment, research on languages, microprocessors and general research are some of the fields in which this programming language can be applied and developed. (author)

  8. Compiling models into real-time systems

    International Nuclear Information System (INIS)

    Dormoy, J.L.; Cherriaux, F.; Ancelin, J.

    1992-08-01

    This paper presents an architecture for building real-time systems from models, and model-compiling techniques. This has been applied for building a real-time model-base monitoring system for nuclear plants, called KSE, which is currently being used in two plants in France. We describe how we used various artificial intelligence techniques for building it: a model-based approach, a logical model of its operation, a declarative implementation of these models, and original knowledge-compiling techniques for automatically generating the real-time expert system from those models. Some of those techniques have just been borrowed from the literature, but we had to modify or invent other techniques which simply did not exist. We also discuss two important problems, which are often underestimated in the artificial intelligence literature: size, and errors. Our architecture, which could be used in other applications, combines the advantages of the model-based approach with the efficiency requirements of real-time applications, while in general model-based approaches present serious drawbacks on this point

  9. FAIL-SAFE: Fault Aware IntelLigent Software for Exascale

    Science.gov (United States)

    2016-06-13

    and that these programs can continue to correct solutions. To broaden the impact of this research, we also needed to be able to ameliorate errors...designing an interface between the application and an introspection framework for resilience ( IFR ) based on the inference engine SHINE; (4) using...the ROSE compiler to translate annotations into reasoning rules for the IFR ; and (5) designing a Knowledge/Experience Database, which will store

  10. 12 CFR 411.600 - Semi-annual compilation.

    Science.gov (United States)

    2010-01-01

    ... 12 Banks and Banking 4 2010-01-01 2010-01-01 false Semi-annual compilation. 411.600 Section 411.600 Banks and Banking EXPORT-IMPORT BANK OF THE UNITED STATES NEW RESTRICTIONS ON LOBBYING Agency Reports § 411.600 Semi-annual compilation. (a) The head of each agency shall collect and compile the...

  11. Compilation of functional languages using flow graph analysis

    NARCIS (Netherlands)

    Hartel, Pieter H.; Glaser, Hugh; Wild, John M.

    A system based on the notion of a flow graph is used to specify formally and to implement a compiler for a lazy functional language. The compiler takes a simple functional language as input and generates C. The generated C program can then be compiled, and loaded with an extensive run-time system to

  12. A Fortran program (RELAX3D) to solve the 3 dimensional Poisson (Laplace) equation

    International Nuclear Information System (INIS)

    Houtman, H.; Kost, C.J.

    1983-09-01

    RELAX3D is an efficient, user friendly, interactive FORTRAN program which solves the Poisson (Laplace) equation Λ 2 =p for a general 3 dimensional geometry consisting of Dirichlet and Neumann boundaries approximated to lie on a regular 3 dimensional mesh. The finite difference equations at these nodes are solved using a successive point-iterative over-relaxation method. A menu of commands, supplemented by HELP facility, controls the dynamic loading of the subroutine describing the problem case, the iterations to converge to a solution, and the contour plotting of any desired slices, etc

  13. CWF and TABLE - Two Fortran programmes for the calculation of Coulomb penetration and shift factors

    International Nuclear Information System (INIS)

    Norton, D.S.; James, M.F.

    1965-12-01

    CWF and TABLE are Fortran programmes, written for the IBM 7090 and English-Electric Leo Marconi KDF9 computers, that calculate the penetration and shift factors for a charged particle in a Coulomb field. The numerical methods used are those of Lutz and Karvelis. The two programmes are very similar. Input to TABLE is in the form of the centre-of-mass co-ordinates. CWF is intended for use in calculating cross-sections for neutron-induced reactions which result in charged particle emission, and the input is in the form of the neutron energy in the laboratory frame of reference, together with other necessary reaction data. (author)

  14. LIONS: a new set of Fortran 90 codes for the SPIRAL project at GANIL

    International Nuclear Information System (INIS)

    Bertrand, P.

    1994-01-01

    A set of new computer programs developed at GANIL is presented; these codes are used to study different parts of the SPIRAL project (a new radioactive ion beam facility), and particularly the dynamics in the CIME cyclotron, its injection inflector, and the new extraction system of the ECR ion sources. Three important modules are described: CHA3D for the evaluation of 3D electric fields with or without space charge effects, LIONS for the motion of ions and EXTRACT for the ECRIS extraction. These modules are written in Fortran 90 in a ''data parallel scheme''. They work either on UNIX workstations or parallel and vectorial computers. (author). 5 figs., 5 refs

  15. MONTEC, an interactive fortran program to simulate radiation dose and dose-rate responses of populations

    International Nuclear Information System (INIS)

    Perry, K.A.; Szekely, J.G.

    1983-09-01

    The computer program MONTEC was written to simulate the distribution of responses in a population whose members are exposed to multiple radiation doses at variable dose rates. These doses and dose rates are randomly selected from lognormal distributions. The individual radiation responses are calculated from three equations, which include dose and dose-rate terms. Other response-dose/rate relationships or distributions can be incorporated by the user as the need arises. The purpose of this documentation is to provide a complete operating manual for the program. This version is written in FORTRAN-10 for the DEC system PDP-10

  16. QEDMOD: Fortran program for calculating the model Lamb-shift operator

    Science.gov (United States)

    Shabaev, V. M.; Tupitsyn, I. I.; Yerokhin, V. A.

    2018-02-01

    We present Fortran package QEDMOD for computing the model QED operator hQED that can be used to account for the Lamb shift in accurate atomic-structure calculations. The package routines calculate the matrix elements of hQED with the user-specified one-electron wave functions. The operator can be used to calculate Lamb shift in many-electron atomic systems with a typical accuracy of few percent, either by evaluating the matrix element of hQED with the many-electron wave function, or by adding hQED to the Dirac-Coulomb-Breit Hamiltonian.

  17. LIONS: a new set of Fortran90 codes for the SPIRAL project at GANIL

    International Nuclear Information System (INIS)

    Bertrand, P.

    1995-01-01

    In this paper a set of new computer programs developed at GANIL is presented. These codes are used to study different parts of the SPIRAL project, in particular the dynamics in the CIME cyclotron and the new extraction system of the ECR ion sources. Three important modules are described: CHA3D for the evaluation of 3D electric fields with or without space charge effects, LIONS for the motion of ions and EXTRACT for the ECRIS extraction. These modules are written in Fortran90 in a ''data parallel scheme''. They work either on UNIX workstations or parallel and vectorial computers. (orig.)

  18. JTpack90: A parallel, object-based, Fortran 90 linear algebra package

    Energy Technology Data Exchange (ETDEWEB)

    Turner, J.A.; Kothe, D.B. [Los Alamos National Lab., NM (United States); Ferrell, R.C. [Cambridge Power Computing Associates, Ltd., Brookline, MA (United States)

    1997-03-01

    The authors have developed an object-based linear algebra package, currently with emphasis on sparse Krylov methods, driven primarily by needs of the Los Alamos National Laboratory parallel unstructured-mesh casting simulation tool Telluride. Support for a number of sparse storage formats, methods, and preconditioners have been implemented, driven primarily by application needs. They describe the object-based Fortran 90 approach, which enhances maintainability, performance, and extensibility, the parallelization approach using a new portable gather/scatter library (PGSLib), current capabilities and future plans, and present preliminary performance results on a variety of platforms.

  19. FORTRAN computer programs to process Savannah River Laboratory hydrogeochemical and stream-sediment reconnaissance data

    International Nuclear Information System (INIS)

    Zinkl, R.J.; Shettel, D.L. Jr.; D'Andrea, R.F. Jr.

    1980-03-01

    FORTRAN computer programs have been written to read, edit, and reformat the hydrogeochemical and stream-sediment reconnaissance data produced by Savannah River Laboratory for the National Uranium Resource Evaluation program. The data are presorted by Savannah River Laboratory into stream sediment, ground water, and stream water for each 1 0 x 2 0 quadrangle. Extraneous information is eliminated, and missing analyses are assigned a specific value (-99999.0). Negative analyses are below the detection limit; the absolute value of a negative analysis is assumed to be the detection limit

  20. A new Fortran 90 program to compute regular and irregular associated Legendre functions (new version announcement)

    Science.gov (United States)

    Schneider, Barry I.; Segura, Javier; Gil, Amparo; Guan, Xiaoxu; Bartschat, Klaus

    2018-04-01

    This is a revised and updated version of a modern Fortran 90 code to compute the regular Plm (x) and irregular Qlm (x) associated Legendre functions for all x ∈(- 1 , + 1) (on the cut) and | x | > 1 and integer degree (l) and order (m). The necessity to revise the code comes as a consequence of some comments of Prof. James Bremer of the UC//Davis Mathematics Department, who discovered that there were errors in the code for large integer degree and order for the normalized regular Legendre functions on the cut.

  1. specsim: A Fortran-77 program for conditional spectral simulation in 3D

    Science.gov (United States)

    Yao, Tingting

    1998-12-01

    A Fortran 77 program, specsim, is presented for conditional spectral simulation in 3D domains. The traditional Fourier integral method allows generating random fields with a given covariance spectrum. Conditioning to local data is achieved by an iterative identification of the conditional phase information. A flowchart of the program is given to illustrate the implementation procedures of the program. A 3D case study is presented to demonstrate application of the program. A comparison with the traditional sequential Gaussian simulation algorithm emphasizes the advantages and drawbacks of the proposed algorithm.

  2. WHO GLOBAL TUBERCULOSIS REPORTS: COMPILATION AND INTERPRETATION

    Directory of Open Access Journals (Sweden)

    I. A. Vаsilyevа

    2017-01-01

    Full Text Available The purpose of the article is to inform national specialists involved in tuberculosis control about methods for compilation of WHO global tuberculosis statistics, which are used when developing strategies and programmes for tuberculosis control and evaluation of their efficiency.  The article explains in detail the notions of main WHO epidemiological rates, used in the international publications on tuberculosis along with the data on their registered values, new approaches to making the list of country with the highest burden of tuberculosis, drug resistant tuberculosis and tuberculosis with concurrent HIV infection. The article compares the rates in the Russian Federation with global data as well as data from countries within WHO European Regions and countries with highest TB burden. It presents materials on the achievement of Global goals in tuberculosis control and main provisions of WHO End TB Strategy for 2015-2035 adopted as a part of UNO Sustainable Development Goals.  

  3. Molecular dynamics and diffusion a compilation

    CERN Document Server

    Fisher, David

    2013-01-01

    The molecular dynamics technique was developed in the 1960s as the outgrowth of attempts to model complicated systems by using either a) direct physical simulation or (following the great success of Monte Carlo methods) by b) using computer techniques. Computer simulation soon won out over clumsy physical simulation, and the ever-increasing speed and sophistication of computers has naturally made molecular dynamics simulation into a more and more successful technique. One of its most popular applications is the study of diffusion, and some experts now even claim that molecular dynamics simulation is, in the case of situations involving well-characterised elements and structures, more accurate than experimental measurement. The present double volume includes a compilation (over 600 items) of predicted solid-state diffusion data, for all of the major materials groups, dating back nearly four decades. The double volume also includes some original papers: "Determination of the Activation Energy for Formation and ...

  4. Compilation of Existing Neutron Screen Technology

    Directory of Open Access Journals (Sweden)

    N. Chrysanthopoulou

    2014-01-01

    Full Text Available The presence of fast neutron spectra in new reactors is expected to induce a strong impact on the contained materials, including structural materials, nuclear fuels, neutron reflecting materials, and tritium breeding materials. Therefore, introduction of these reactors into operation will require extensive testing of their components, which must be performed under neutronic conditions representative of those expected to prevail inside the reactor cores when in operation. Due to limited availability of fast reactors, testing of future reactor materials will mostly take place in water cooled material test reactors (MTRs by tailoring the neutron spectrum via neutron screens. The latter rely on the utilization of materials capable of absorbing neutrons at specific energy. A large but fragmented experience is available on that topic. In this work a comprehensive compilation of the existing neutron screen technology is attempted, focusing on neutron screens developed in order to locally enhance the fast over thermal neutron flux ratio in a reactor core.

  5. Compilation of requests for nuclear data

    International Nuclear Information System (INIS)

    1983-01-01

    The purpose of this compilation is to summarize the current needs of US Nuclear Energy programs and other applied technolgies for nuclear data. It is the result of a biennial review in which the Department of Energy (DOE) and contractors, Department of Defense Laboratories and contractors, and other interested groups have been asked to review and revise their requests for nuclear data. It was felt that the evaluators of cross section data and the users of these evaluations should be involved in the review of the data requests to make this compilation more useful. This request list is ordered by target nucleus (Isotope) and then reaction type (Quantity). Each request is assigned a unique identifying number. The first two digits of this number give the year the request was initiated. All requests for a given Isotope and Quantity are grouped (or blocked) together. The requests in a block are followed by any status comments. Each request has a unique Isotope, Quantity and Requester. The requester is identified by laboratory, last name, and sponsoring US government agency, e.g., BET, DEI, DNR. All requesters, together with their addresses and phone numbers, are given in appendix B. A list of the evaluators responsible for ENDF/B-V evaluations with their affiliation appears in appendix C. All requests must give the energy (or range of energy) for the incident particle when appropriate. The accuracy needed in percent is also given. The error quoted is assumed to be 1-sigma at each measured point in the energy range requested unless a comment specifies otherwise. Sometimes a range of accuracy indicated by two values is given or some statement is given in the free text comments. An incident particle energy resolution in percent is sometimes given

  6. Compilation of requests for nuclear data

    International Nuclear Information System (INIS)

    Weston, L.W.; Larson, D.C.

    1993-02-01

    This compilation represents the current needs for nuclear data measurements and evaluations as expressed by interested fission and fusion reactor designers, medical users of nuclear data, nuclear data evaluators, CSEWG members and other interested parties. The requests and justifications are reviewed by the Data Request and Status Subcommittee of CSEWG as well as most of the general CSEWG membership. The basic format and computer programs for the Request List were produced by the National Nuclear Data Center (NNDC) at Brookhaven National Laboratory. The NNDC produced the Request List for many years. The Request List is compiled from a computerized data file. Each request has a unique isotope, reaction type, requestor and identifying number. The first two digits of the identifying number are the year in which the request was initiated. Every effort has been made to restrict the notations to those used in common nuclear physics textbooks. Most requests are for individual isotopes as are most ENDF evaluations, however, there are some requests for elemental measurements. Each request gives a priority rating which will be discussed in Section 2, the neutron energy range for which the request is made, the accuracy requested in terms of one standard deviation, and the requested energy resolution in terms of one standard deviation. Also given is the requestor with the comments which were furnished with the request. The addresses and telephone numbers of the requestors are given in Appendix 1. ENDF evaluators who may be contacted concerning evaluations are given in Appendix 2. Experimentalists contemplating making one of the requested measurements are encouraged to contact both the requestor and evaluator who may provide valuable information. This is a working document in that it will change with time. New requests or comments may be submitted to the editors or a regular CSEWG member at any time

  7. Research and Practice of the News Map Compilation Service

    Science.gov (United States)

    Zhao, T.; Liu, W.; Ma, W.

    2018-04-01

    Based on the needs of the news media on the map, this paper researches on the news map compilation service, conducts demand research on the service of compiling news maps, designs and compiles the public authority base map suitable for media publication, and constructs the news base map material library. It studies the compilation of domestic and international news maps with timeliness and strong pertinence and cross-regional characteristics, constructs the hot news thematic gallery and news map customization services, conducts research on types of news maps, establish closer liaison and cooperation methods with news media, and guides news media to use correct maps. Through the practice of the news map compilation service, this paper lists two cases of news map preparation services used by different media, compares and analyses cases, summarizes the research situation of news map compilation service, and at the same time puts forward outstanding problems and development suggestions in the service of news map compilation service.

  8. Compilations and evaluations of nuclear structure and decay data

    International Nuclear Information System (INIS)

    Lorenz, A.

    1977-03-01

    This is the second issue of a report series on published and to-be-published compilations and evaluations of nuclear structure and decay (NSD) data. This compilation of compilations and evaluations is designed to keep the nuclear scientific community informed of the availability of compiled or evaluated NSD data, and contains references to laboratory reports, journal articles and books containing selected compilations and evaluations. It excludes references to ''mass-chain'' evaluations normally published in the ''Nuclear Data Sheets'' and ''Nuclear Physics''. The material contained in this compilation is sorted according to eight subject categories: general compilations; basic isotopic properties; nuclear structure properties; nuclear decay processes; half-lives, energies and spectra; nuclear decay processes: gamma-rays; nuclear decay processes: fission products; nuclear decay processes: (others); atomic processes

  9. The formal specification of abstract data types and their implementation in Fortran 90: implementation issues concerning the use of pointers

    Science.gov (United States)

    Maley, D.; Kilpatrick, P. L.; Schreiner, E. W.; Scott, N. S.; Diercksen, G. H. F.

    1996-10-01

    In this paper we continue our investigation into the development of computational-science software based on the identification and formal specification of Abstract Data Types (ADTs) and their implementation in Fortran 90. In particular, we consider the consequences of using pointers when implementing a formally specified ADT in Fortran 90. Our aim is to highlight the resulting conflict between the goal of information hiding, which is central to the ADT methodology, and the space efficiency of the implementation. We show that the issue of storage recovery cannot be avoided by the ADT user, and present a range of implementations of a simple ADT to illustrate various approaches towards satisfactory storage management. Finally, we propose a set of guidelines for implementing ADTs using pointers in Fortran 90. These guidelines offer a way gracefully to provide disposal operations in Fortran 90. Such an approach is desirable since Fortran 90 does not provide automatic garbage collection which is offered by many object-oriented languages including Eiffel, Java, Smalltalk, and Simula.

  10. DDT: A Research Tool for Automatic Data Distribution in High Performance Fortran

    Directory of Open Access Journals (Sweden)

    Eduard AyguadÉ

    1997-01-01

    Full Text Available This article describes the main features and implementation of our automatic data distribution research tool. The tool (DDT accepts programs written in Fortran 77 and generates High Performance Fortran (HPF directives to map arrays onto the memories of the processors and parallelize loops, and executable statements to remap these arrays. DDT works by identifying a set of computational phases (procedures and loops. The algorithm builds a search space of candidate solutions for these phases which is explored looking for the combination that minimizes the overall cost; this cost includes data movement cost and computation cost. The movement cost reflects the cost of accessing remote data during the execution of a phase and the remapping costs that have to be paid in order to execute the phase with the selected mapping. The computation cost includes the cost of executing a phase in parallel according to the selected mapping and the owner computes rule. The tool supports interprocedural analysis and uses control flow information to identify how phases are sequenced during the execution of the application.

  11. Reformulation RELAP5-3D in FORTRAN 95 and Results

    International Nuclear Information System (INIS)

    Mesina, George L.

    2010-01-01

    RELAP5-3D is a nuclear power plant code used worldwide for safety analysis, design, and operator training. In keeping with ongoing developments in the computing industry, we have re-architected the code in the FORTRAN 95 language, the current, fully-available, FORTRAN language. These changes include a complete reworking of the database and conversion of the source code to take advantage of new constructs. The improvements and impacts to the code are manifold. It is a completely machine-independent code that produces machine independent fluid property and plot files and expands to the exact size needed to accommodate the user's input. Runtime is generally better for larger input models. Other impacts of code conversion are improved code readability, reduced maintenance and development time, increased adaptability to new computing platforms, and increased code longevity. The conversion methodology, code improvements and testing upgrades are presented in a manner that will be useful to future conversion projects for other such large codes. Comparison between the pre- and post-conversion code are made on the basis of code metrics and code performance.

  12. The development of GPU-based parallel PRNG for Monte Carlo applications in CUDA Fortran

    Directory of Open Access Journals (Sweden)

    Hamed Kargaran

    2016-04-01

    Full Text Available The implementation of Monte Carlo simulation on the CUDA Fortran requires a fast random number generation with good statistical properties on GPU. In this study, a GPU-based parallel pseudo random number generator (GPPRNG have been proposed to use in high performance computing systems. According to the type of GPU memory usage, GPU scheme is divided into two work modes including GLOBAL_MODE and SHARED_MODE. To generate parallel random numbers based on the independent sequence method, the combination of middle-square method and chaotic map along with the Xorshift PRNG have been employed. Implementation of our developed PPRNG on a single GPU showed a speedup of 150x and 470x (with respect to the speed of PRNG on a single CPU core for GLOBAL_MODE and SHARED_MODE, respectively. To evaluate the accuracy of our developed GPPRNG, its performance was compared to that of some other commercially available PPRNGs such as MATLAB, FORTRAN and Miller-Park algorithm through employing the specific standard tests. The results of this comparison showed that the developed GPPRNG in this study can be used as a fast and accurate tool for computational science applications.

  13. The development of GPU-based parallel PRNG for Monte Carlo applications in CUDA Fortran

    Energy Technology Data Exchange (ETDEWEB)

    Kargaran, Hamed, E-mail: h-kargaran@sbu.ac.ir; Minuchehr, Abdolhamid; Zolfaghari, Ahmad [Department of nuclear engineering, Shahid Behesti University, Tehran, 1983969411 (Iran, Islamic Republic of)

    2016-04-15

    The implementation of Monte Carlo simulation on the CUDA Fortran requires a fast random number generation with good statistical properties on GPU. In this study, a GPU-based parallel pseudo random number generator (GPPRNG) have been proposed to use in high performance computing systems. According to the type of GPU memory usage, GPU scheme is divided into two work modes including GLOBAL-MODE and SHARED-MODE. To generate parallel random numbers based on the independent sequence method, the combination of middle-square method and chaotic map along with the Xorshift PRNG have been employed. Implementation of our developed PPRNG on a single GPU showed a speedup of 150x and 470x (with respect to the speed of PRNG on a single CPU core) for GLOBAL-MODE and SHARED-MODE, respectively. To evaluate the accuracy of our developed GPPRNG, its performance was compared to that of some other commercially available PPRNGs such as MATLAB, FORTRAN and Miller-Park algorithm through employing the specific standard tests. The results of this comparison showed that the developed GPPRNG in this study can be used as a fast and accurate tool for computational science applications.

  14. GNAQPMS v1.1: accelerating the Global Nested Air Quality Prediction Modeling System (GNAQPMS) on Intel Xeon Phi processors

    Science.gov (United States)

    Wang, Hui; Chen, Huansheng; Wu, Qizhong; Lin, Junmin; Chen, Xueshun; Xie, Xinwei; Wang, Rongrong; Tang, Xiao; Wang, Zifa

    2017-08-01

    The Global Nested Air Quality Prediction Modeling System (GNAQPMS) is the global version of the Nested Air Quality Prediction Modeling System (NAQPMS), which is a multi-scale chemical transport model used for air quality forecast and atmospheric environmental research. In this study, we present the porting and optimisation of GNAQPMS on a second-generation Intel Xeon Phi processor, codenamed Knights Landing (KNL). Compared with the first-generation Xeon Phi coprocessor (codenamed Knights Corner, KNC), KNL has many new hardware features such as a bootable processor, high-performance in-package memory and ISA compatibility with Intel Xeon processors. In particular, we describe the five optimisations we applied to the key modules of GNAQPMS, including the CBM-Z gas-phase chemistry, advection, convection and wet deposition modules. These optimisations work well on both the KNL 7250 processor and the Intel Xeon E5-2697 V4 processor. They include (1) updating the pure Message Passing Interface (MPI) parallel mode to the hybrid parallel mode with MPI and OpenMP in the emission, advection, convection and gas-phase chemistry modules; (2) fully employing the 512 bit wide vector processing units (VPUs) on the KNL platform; (3) reducing unnecessary memory access to improve cache efficiency; (4) reducing the thread local storage (TLS) in the CBM-Z gas-phase chemistry module to improve its OpenMP performance; and (5) changing the global communication from writing/reading interface files to MPI functions to improve the performance and the parallel scalability. These optimisations greatly improved the GNAQPMS performance. The same optimisations also work well for the Intel Xeon Broadwell processor, specifically E5-2697 v4. Compared with the baseline version of GNAQPMS, the optimised version was 3.51 × faster on KNL and 2.77 × faster on the CPU. Moreover, the optimised version ran at 26 % lower average power on KNL than on the CPU. With the combined performance and energy

  15. GNAQPMS v1.1: accelerating the Global Nested Air Quality Prediction Modeling System (GNAQPMS on Intel Xeon Phi processors

    Directory of Open Access Journals (Sweden)

    H. Wang

    2017-08-01

    Full Text Available The Global Nested Air Quality Prediction Modeling System (GNAQPMS is the global version of the Nested Air Quality Prediction Modeling System (NAQPMS, which is a multi-scale chemical transport model used for air quality forecast and atmospheric environmental research. In this study, we present the porting and optimisation of GNAQPMS on a second-generation Intel Xeon Phi processor, codenamed Knights Landing (KNL. Compared with the first-generation Xeon Phi coprocessor (codenamed Knights Corner, KNC, KNL has many new hardware features such as a bootable processor, high-performance in-package memory and ISA compatibility with Intel Xeon processors. In particular, we describe the five optimisations we applied to the key modules of GNAQPMS, including the CBM-Z gas-phase chemistry, advection, convection and wet deposition modules. These optimisations work well on both the KNL 7250 processor and the Intel Xeon E5-2697 V4 processor. They include (1 updating the pure Message Passing Interface (MPI parallel mode to the hybrid parallel mode with MPI and OpenMP in the emission, advection, convection and gas-phase chemistry modules; (2 fully employing the 512 bit wide vector processing units (VPUs on the KNL platform; (3 reducing unnecessary memory access to improve cache efficiency; (4 reducing the thread local storage (TLS in the CBM-Z gas-phase chemistry module to improve its OpenMP performance; and (5 changing the global communication from writing/reading interface files to MPI functions to improve the performance and the parallel scalability. These optimisations greatly improved the GNAQPMS performance. The same optimisations also work well for the Intel Xeon Broadwell processor, specifically E5-2697 v4. Compared with the baseline version of GNAQPMS, the optimised version was 3.51 × faster on KNL and 2.77 × faster on the CPU. Moreover, the optimised version ran at 26 % lower average power on KNL than on the CPU. With the combined

  16. Emmarcar el debat: Lliure expressió contra propietat intel·lectual, els propers cinquanta anys

    Directory of Open Access Journals (Sweden)

    Eben Moglen

    2007-02-01

    Full Text Available

    El Prof. Moglen explica i analitza, des d'una perspectiva històrica, la profunda revolució social i legal que resulta de la tecnologia digital quan aquesta s'aplica a tots els camps: programari, música i tot tipus de creacions. En concret, explica la manera en què la tecnologia digital està forçant una modificació substancial (desaparició dels sistemes de propietat intel·lectual i fa prediccions per al futur pròxim dels mercats de la PI.

  17. A brief description and comparison of programming languages FORTRAN, ALGOL, COBOL, PL/1, and LISP 1.5 from a critical standpoint

    Science.gov (United States)

    Mathur, F. P.

    1972-01-01

    Several common higher level program languages are described. FORTRAN, ALGOL, COBOL, PL/1, and LISP 1.5 are summarized and compared. FORTRAN is the most widely used scientific programming language. ALGOL is a more powerful language for scientific programming. COBOL is used for most commercial programming applications. LISP 1.5 is primarily a list-processing language. PL/1 attempts to combine the desirable features of FORTRAN, ALGOL, and COBOL into a single language.

  18. Implementation of Neutronics Analysis Code using the Features of Object Oriented Programming via Fortran90/95

    Energy Technology Data Exchange (ETDEWEB)

    Han, Tae Young; Cho, Beom Jin [KEPCO Nuclear Fuel, Daejeon (Korea, Republic of)

    2011-05-15

    The object-oriented programming (OOP) concept was radically established after 1990s and successfully involved in Fortran 90/95. The features of OOP are such as the information hiding, encapsulation, modularity and inheritance, which lead to producing code that satisfy three R's: reusability, reliability and readability. The major OOP concepts, however, except Module are not mainly used in neutronics analysis codes even though the code was written by Fortran 90/95. In this work, we show that the OOP concept can be employed to develop the neutronics analysis code, ASTRA1D (Advanced Static and Transient Reactor Analyzer for 1-Dimension), via Fortran90/95 and those can be more efficient and reasonable programming methods

  19. Compiling quantum circuits to realistic hardware architectures using temporal planners

    Science.gov (United States)

    Venturelli, Davide; Do, Minh; Rieffel, Eleanor; Frank, Jeremy

    2018-04-01

    To run quantum algorithms on emerging gate-model quantum hardware, quantum circuits must be compiled to take into account constraints on the hardware. For near-term hardware, with only limited means to mitigate decoherence, it is critical to minimize the duration of the circuit. We investigate the application of temporal planners to the problem of compiling quantum circuits to newly emerging quantum hardware. While our approach is general, we focus on compiling to superconducting hardware architectures with nearest neighbor constraints. Our initial experiments focus on compiling Quantum Alternating Operator Ansatz (QAOA) circuits whose high number of commuting gates allow great flexibility in the order in which the gates can be applied. That freedom makes it more challenging to find optimal compilations but also means there is a greater potential win from more optimized compilation than for less flexible circuits. We map this quantum circuit compilation problem to a temporal planning problem, and generated a test suite of compilation problems for QAOA circuits of various sizes to a realistic hardware architecture. We report compilation results from several state-of-the-art temporal planners on this test set. This early empirical evaluation demonstrates that temporal planning is a viable approach to quantum circuit compilation.

  20. Evaluation of the OpenCL AES Kernel using the Intel FPGA SDK for OpenCL

    Energy Technology Data Exchange (ETDEWEB)

    Jin, Zheming [Argonne National Lab. (ANL), Argonne, IL (United States); Yoshii, Kazutomo [Argonne National Lab. (ANL), Argonne, IL (United States); Finkel, Hal [Argonne National Lab. (ANL), Argonne, IL (United States); Cappello, Franck [Argonne National Lab. (ANL), Argonne, IL (United States)

    2017-04-20

    The OpenCL standard is an open programming model for accelerating algorithms on heterogeneous computing system. OpenCL extends the C-based programming language for developing portable codes on different platforms such as CPU, Graphics processing units (GPUs), Digital Signal Processors (DSPs) and Field Programmable Gate Arrays (FPGAs). The Intel FPGA SDK for OpenCL is a suite of tools that allows developers to abstract away the complex FPGA-based development flow for a high-level software development flow. Users can focus on the design of hardware-accelerated kernel functions in OpenCL and then direct the tools to generate the low-level FPGA implementations. The approach makes the FPGA-based development more accessible to software users as the needs for hybrid computing using CPUs and FPGAs are increasing. It can also significantly reduce the hardware development time as users can evaluate different ideas with high-level language without deep FPGA domain knowledge. In this report, we evaluate the performance of the kernel using the Intel FPGA SDK for OpenCL and Nallatech 385A FPGA board. Compared to the M506 module, the board provides more hardware resources for a larger design exploration space. The kernel performance is measured with the compute kernel throughput, an upper bound to the FPGA throughput. The report presents the experimental results in details. The Appendix lists the kernel source code.

  1. Evaluation of the Single-precision Floatingpoint Vector Add Kernel Using the Intel FPGA SDK for OpenCL

    Energy Technology Data Exchange (ETDEWEB)

    Jin, Zheming [Argonne National Lab. (ANL), Argonne, IL (United States); Yoshii, Kazutomo [Argonne National Lab. (ANL), Argonne, IL (United States); Finkel, Hal [Argonne National Lab. (ANL), Argonne, IL (United States); Cappello, Franck [Argonne National Lab. (ANL), Argonne, IL (United States)

    2017-04-20

    Open Computing Language (OpenCL) is a high-level language that enables software programmers to explore Field Programmable Gate Arrays (FPGAs) for application acceleration. The Intel FPGA software development kit (SDK) for OpenCL allows a user to specify applications at a high level and explore the performance of low-level hardware acceleration. In this report, we present the FPGA performance and power consumption results of the single-precision floating-point vector add OpenCL kernel using the Intel FPGA SDK for OpenCL on the Nallatech 385A FPGA board. The board features an Arria 10 FPGA. We evaluate the FPGA implementations using the compute unit duplication and kernel vectorization optimization techniques. On the Nallatech 385A FPGA board, the maximum compute kernel bandwidth we achieve is 25.8 GB/s, approximately 76% of the peak memory bandwidth. The power consumption of the FPGA device when running the kernels ranges from 29W to 42W.

  2. Efficient sparse matrix-matrix multiplication for computing periodic responses by shooting method on Intel Xeon Phi

    Science.gov (United States)

    Stoykov, S.; Atanassov, E.; Margenov, S.

    2016-10-01

    Many of the scientific applications involve sparse or dense matrix operations, such as solving linear systems, matrix-matrix products, eigensolvers, etc. In what concerns structural nonlinear dynamics, the computations of periodic responses and the determination of stability of the solution are of primary interest. Shooting method iswidely used for obtaining periodic responses of nonlinear systems. The method involves simultaneously operations with sparse and dense matrices. One of the computationally expensive operations in the method is multiplication of sparse by dense matrices. In the current work, a new algorithm for sparse matrix by dense matrix products is presented. The algorithm takes into account the structure of the sparse matrix, which is obtained by space discretization of the nonlinear Mindlin's plate equation of motion by the finite element method. The algorithm is developed to use the vector engine of Intel Xeon Phi coprocessors. It is compared with the standard sparse matrix by dense matrix algorithm and the one developed by Intel MKL and it is shown that by considering the properties of the sparse matrix better algorithms can be developed.

  3. Time-efficient simulations of tight-binding electronic structures with Intel Xeon PhiTM many-core processors

    Science.gov (United States)

    Ryu, Hoon; Jeong, Yosang; Kang, Ji-Hoon; Cho, Kyu Nam

    2016-12-01

    Modelling of multi-million atomic semiconductor structures is important as it not only predicts properties of physically realizable novel materials, but can accelerate advanced device designs. This work elaborates a new Technology-Computer-Aided-Design (TCAD) tool for nanoelectronics modelling, which uses a sp3d5s∗ tight-binding approach to describe multi-million atomic structures, and simulate electronic structures with high performance computing (HPC), including atomic effects such as alloy and dopant disorders. Being named as Quantum simulation tool for Advanced Nanoscale Devices (Q-AND), the tool shows nice scalability on traditional multi-core HPC clusters implying the strong capability of large-scale electronic structure simulations, particularly with remarkable performance enhancement on latest clusters of Intel Xeon PhiTM coprocessors. A review of the recent modelling study conducted to understand an experimental work of highly phosphorus-doped silicon nanowires, is presented to demonstrate the utility of Q-AND. Having been developed via Intel Parallel Computing Center project, Q-AND will be open to public to establish a sound framework of nanoelectronics modelling with advanced HPC clusters of a many-core base. With details of the development methodology and exemplary study of dopant electronics, this work will present a practical guideline for TCAD development to researchers in the field of computational nanoelectronics.

  4. A parallel implementation of particle tracking with space charge effects on an INTEL iPSC/860

    International Nuclear Information System (INIS)

    Chang, L.; Bourianoff, G.; Cole, B.; Machida, S.

    1993-05-01

    Particle-tracking simulation is one of the scientific applications that is well-suited to parallel computations. At the Superconducting Super Collider, it has been theoretically and empirically demonstrated that particle tracking on a designed lattice can achieve very high parallel efficiency on a MIMD Intel iPSC/860 machine. The key to such success is the realization that the particles can be tracked independently without considering their interaction. The perfectly parallel nature of particle tracking is broken if the interaction effects between particles are included. The space charge introduces an electromagnetic force that will affect the motion of tracked particles in 3-D space. For accurate modeling of the beam dynamics with space charge effects, one needs to solve three-dimensional Maxwell field equations, usually by a particle-in-cell (PIC) algorithm. This will require each particle to communicate with its neighbor grids to compute the momentum changes at each time step. It is expected that the 3-D PIC method will degrade parallel efficiency of particle-tracking implementation on any parallel computer. In this paper, we describe an efficient scheme for implementing particle tracking with space charge effects on an INTEL iPSC/860 machine. Experimental results show that a parallel efficiency of 75% can be obtained

  5. Compiling a Monolingual Dictionary for Native Speakers

    Directory of Open Access Journals (Sweden)

    Patrick Hanks

    2011-10-01

    Full Text Available

    ABSTRACT: This article gives a survey of the main issues confronting the compilers of monolingual dictionaries in the age of the Internet. Among others, it discusses the relationship between a lexical database and a monolingual dictionary, the role of corpus evidence, historical principles in lexicography vs. synchronic principles, the instability of word meaning, the need for full vocabulary coverage, principles of definition writing, the role of dictionaries in society, and the need for dictionaries to give guidance on matters of disputed word usage. It concludes with some questions about the future of dictionary publishing.

    OPSOMMING: Die samestelling van 'n eentalige woordeboek vir moedertaalsprekers. Hierdie artikel gee 'n oorsig van die hoofkwessies waarmee die samestellers van eentalige woordeboeke in die eeu van die Internet te kampe het. Dit bespreek onder andere die verhouding tussen 'n leksikale databasis en 'n eentalige woordeboek, die rol van korpusgetuienis, historiese beginsels vs sinchroniese beginsels in die leksikografie, die onstabiliteit van woordbetekenis, die noodsaak van 'n volledige woordeskatdekking, beginsels van die skryf van definisies, die rol van woordeboeke in die maatskappy, en die noodsaak vir woordeboeke om leiding te gee oor sake van betwiste woordgebruik. Dit sluit af met 'n aantal vrae oor die toekoms van die publikasie van woordeboeke.

    Sleutelwoorde: EENTALIGE WOORDEBOEKE, LEKSIKALE DATABASIS, WOORDEBOEKSTRUKTUUR, WOORDBETEKENIS, BETEKENISVERANDERING, GEBRUIK, GEBRUIKSAANTEKENINGE, HISTORIESE BEGINSELS VAN DIE LEKSIKOGRAFIE, SINCHRONIESE BEGINSELS VAN DIE LEKSIKOGRAFIE, REGISTER, SLANG, STANDAARDENGELS, WOORDESKATDEKKING, KONSEKWENSIE VAN VERSAMELINGS, FRASEOLOGIE, SINTAGMATIESE PATRONE, PROBLEME VAN KOMPOSISIONALITEIT, LINGUISTIESE PRESKRIPTIVISME, LEKSIKALE GETUIENIS

  6. Sharing analysis in the Pawns compiler

    Directory of Open Access Journals (Sweden)

    Lee Naish

    2015-09-01

    Full Text Available Pawns is a programming language under development that supports algebraic data types, polymorphism, higher order functions and “pure” declarative programming. It also supports impure imperative features including destructive update of shared data structures via pointers, allowing significantly increased efficiency for some operations. A novelty of Pawns is that all impure “effects” must be made obvious in the source code and they can be safely encapsulated in pure functions in a way that is checked by the compiler. Execution of a pure function can perform destructive updates on data structures that are local to or eventually returned from the function without risking modification of the data structures passed to the function. This paper describes the sharing analysis which allows impurity to be encapsulated. Aspects of the analysis are similar to other published work, but in addition it handles explicit pointers and destructive update, higher order functions including closures and pre- and post-conditions concerning sharing for functions.

  7. Reflective memory recorder upgrade: an opportunity to benchmark PowerPC and Intel architectures for real time

    Science.gov (United States)

    Abuter, Roberto; Tischer, Helmut; Frahm, Robert

    2014-07-01

    Several high frequency loops are required to run the VLTI (Very Large Telescope Interferometer) 2, e.g. for fringe tracking11, 5, angle tracking, vibration cancellation, data capture. All these loops rely on low latency real time computers based on the VME bus, Motorola PowerPC14 hardware architecture. In this context, one highly demanding application in terms of cycle time, latency and data transfer volume is the VLTI centralized recording facility, so called, RMN recorder1 (Reflective Memory Recorder). This application captures and transfers data flowing through the distributed memory of the system in real time. Some of the VLTI data producers are running with frequencies up to 8 KHz. With the evolution from first generation instruments like MIDI3, PRIMA5, and AMBER4 which use one or two baselines, to second generation instruments like MATISSE10 and GRAVITY9 which will use all six baselines simultaneously, the quantity of signals has increased by, at least, a factor of six. This has led to a significant overload of the RMN recorder1 which has reached the natural limits imposed by the underlying hardware. At the same time, new, more powerful computers, based on the Intel multicore families of CPUs and PCI buses have become available. With the purpose of improving the performance of the RMN recorder1 application and in order to make it capable of coping with the demands of the new generation instruments, a slightly modified implementation has been developed and integrated into an Intel based multicore computer15 running the VxWorks17 real time operating system. The core of the application is based on the standard VLT software framework for instruments13. The real time task reads from the reflective memory using the onboard DMA access12 and captured data is transferred to the outside world via a TCP socket on a dedicated Ethernet connection. The diversity of the software and hardware that are involved makes this application suitable as a benchmarking platform. A

  8. Intel Many Integrated Core (MIC) architecture optimization strategies for a memory-bound Weather Research and Forecasting (WRF) Goddard microphysics scheme

    Science.gov (United States)

    Mielikainen, Jarno; Huang, Bormin; Huang, Allen H.

    2014-10-01

    The Goddard cloud microphysics scheme is a sophisticated cloud microphysics scheme in the Weather Research and Forecasting (WRF) model. The WRF is a widely used weather prediction system in the world. It development is a done in collaborative around the globe. The Goddard microphysics scheme is very suitable for massively parallel computation as there are no interactions among horizontal grid points. Compared to the earlier microphysics schemes, the Goddard scheme incorporates a large number of improvements. Thus, we have optimized the code of this important part of WRF. In this paper, we present our results of optimizing the Goddard microphysics scheme on Intel Many Integrated Core Architecture (MIC) hardware. The Intel Xeon Phi coprocessor is the first product based on Intel MIC architecture, and it consists of up to 61 cores connected by a high performance on-die bidirectional interconnect. The Intel MIC is capable of executing a full operating system and entire programs rather than just kernels as the GPU do. The MIC coprocessor supports all important Intel development tools. Thus, the development environment is familiar one to a vast number of CPU developers. Although, getting a maximum performance out of MICs will require using some novel optimization techniques. Those optimization techniques are discusses in this paper. The results show that the optimizations improved performance of the original code on Xeon Phi 7120P by a factor of 4.7x. Furthermore, the same optimizations improved performance on a dual socket Intel Xeon E5-2670 system by a factor of 2.8x compared to the original code.

  9. HAL/S-360 compiler test activity report

    Science.gov (United States)

    Helmers, C. T.

    1974-01-01

    The levels of testing employed in verifying the HAL/S-360 compiler were as follows: (1) typical applications program case testing; (2) functional testing of the compiler system and its generated code; and (3) machine oriented testing of compiler implementation on operational computers. Details of the initial test plan and subsequent adaptation are reported, along with complete test results for each phase which examined the production of object codes for every possible source statement.

  10. Regulatory and technical reports: compilation for 1975-1978

    International Nuclear Information System (INIS)

    1982-04-01

    This brief compilation lists formal reports issued by the US Nuclear Regulatory Commission in 1975 through 1978 that were not listed in the Regulatory and Technical Reports Compilation for 1975 to 1978, NUREG-0304, Vol. 3. This compilation is divided into two sections. The first consists of a sequential listing of all reports in report-number order. The second section consists of an index developed from keywords in report titles and abstracts

  11. HOPE: Just-in-time Python compiler for astrophysical computations

    Science.gov (United States)

    Akeret, Joel; Gamper, Lukas; Amara, Adam; Refregier, Alexandre

    2014-11-01

    HOPE is a specialized Python just-in-time (JIT) compiler designed for numerical astrophysical applications. HOPE focuses on a subset of the language and is able to translate Python code into C++ while performing numerical optimization on mathematical expressions at runtime. To enable the JIT compilation, the user only needs to add a decorator to the function definition. By using HOPE, the user benefits from being able to write common numerical code in Python while getting the performance of compiled implementation.

  12. EFFDOS - a FORTRAN-77-code for the calculation of the effective dose equivalent

    International Nuclear Information System (INIS)

    Baer, M.; Honcu, S.; Huebschmann, W.

    1984-01-01

    The FORTRAN-77-code EFFDOS calculates the effective dose equivalent according to ICRP 26 due to the longterm emission of radionuclides into the atmosphere for the following exposure pathways: inhalation, ingestion, γ-ground irradiation (γ-irradiation by radionuclides deposited on the ground) and β- or γ-submersion (irradiation by the passing radioactive cloud). For calculating the effective dose equivalent at a single spot it is necessary to put in the diffusion factor and - if need be - the washout factor; otherwise EFFDOS calculates the input data for the computer codes ISOLA III and WOLGA-1, which then are enabled to compute the atmospheric diffusion, ground deposition and local dose equivalent distribution for the requested exposure pathway. Atmospheric diffusion, deposition and radionuclide transfer are calculated according to the ''Allgemeine Berechnungsgrundlage ....'' recommended by the German Fed. Ministry of Interior. A sample calculated is added. (orig.) [de

  13. Fortran programs for the time-dependent Gross-Pitaevskii equation in a fully anisotropic trap

    Science.gov (United States)

    Muruganandam, P.; Adhikari, S. K.

    2009-10-01

    Here we develop simple numerical algorithms for both stationary and non-stationary solutions of the time-dependent Gross-Pitaevskii (GP) equation describing the properties of Bose-Einstein condensates at ultra low temperatures. In particular, we consider algorithms involving real- and imaginary-time propagation based on a split-step Crank-Nicolson method. In a one-space-variable form of the GP equation we consider the one-dimensional, two-dimensional circularly-symmetric, and the three-dimensional spherically-symmetric harmonic-oscillator traps. In the two-space-variable form we consider the GP equation in two-dimensional anisotropic and three-dimensional axially-symmetric traps. The fully-anisotropic three-dimensional GP equation is also considered. Numerical results for the chemical potential and root-mean-square size of stationary states are reported using imaginary-time propagation programs for all the cases and compared with previously obtained results. Also presented are numerical results of non-stationary oscillation for different trap symmetries using real-time propagation programs. A set of convenient working codes developed in Fortran 77 are also provided for all these cases (twelve programs in all). In the case of two or three space variables, Fortran 90/95 versions provide some simplification over the Fortran 77 programs, and these programs are also included (six programs in all). Program summaryProgram title: (i) imagetime1d, (ii) imagetime2d, (iii) imagetime3d, (iv) imagetimecir, (v) imagetimesph, (vi) imagetimeaxial, (vii) realtime1d, (viii) realtime2d, (ix) realtime3d, (x) realtimecir, (xi) realtimesph, (xii) realtimeaxial Catalogue identifier: AEDU_v1_0 Program summary URL:http://cpc.cs.qub.ac.uk/summaries/AEDU_v1_0.html Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland Licensing provisions: Standard CPC licence, http://cpc.cs.qub.ac.uk/licence/licence.html No. of lines in distributed program, including test data

  14. ptchg: A FORTRAN program for point-charge calculations of electric field gradients (EFGs)

    Science.gov (United States)

    Spearing, Dane R.

    1994-05-01

    ptchg, a FORTRAN program, has been developed to calculate electric field gradients (EFG) around an atomic site in crystalline solids using the point-charge direct-lattice summation method. It uses output from the crystal structure generation program Atoms as its input. As an application of ptchg, a point-charge calculation of the EFG quadrupolar parameters around the oxygen site in SiO 2 cristobalite is demonstrated. Although point-charge calculations of electric field gradients generally are limited to ionic compounds, the computed quadrupolar parameters around the oxygen site in SiO 2 cristobalite, a highly covalent material, are in good agreement with the experimentally determined values from nuclear magnetic resonance (NMR) spectroscopy.

  15. Perbandingan Bubble Sort dengan Insertion Sort pada Bahasa Pemrograman C dan Fortran

    Directory of Open Access Journals (Sweden)

    Reina Reina

    2013-12-01

    Full Text Available Sorting is a basic algorithm studied by students of computer science major. Sorting algorithm is the basis of other algorithms such as searching algorithm, pattern matching algorithm. Bubble sort is a popular basic sorting algorithm due to its easiness to be implemented. Besides bubble sort, there is insertion sort. It is lesspopular than bubble sort because it has more difficult algorithm. This paper discusses about process time between insertion sort and bubble sort with two kinds of data. First is randomized data, and the second is data of descending list. Comparison of process time has been done in two kinds of programming language that is C programming language and FORTRAN programming language. The result shows that bubble sort needs more time than insertion sort does.

  16. Program NICOLET to integrate energy loss in superconducting coils. [In FORTRAN for CDC-6600

    Energy Technology Data Exchange (ETDEWEB)

    Vogel, H.F.

    1978-08-01

    A voltage pickup coil, inductively coupled to the magnetic field of the superconducting coil under test, is connected so its output may be compared with the terminal voltage of the coil under test. The integrated voltage difference is indicative of the resistive volt-seconds. When multiplied with the main coil current, the volt-seconds yield the loss. In other words, a hysteresis loop is obtained if the integrated voltage difference phi = ..integral delta..Vdt is plotted as a function of the coil current, i. First, time functions of the two signals phi(t) and i(t) are recorded on a dual-trace digital oscilloscope, and these signals are then recorded on magnetic tape. On a CDC-6600, the recorded information is decoded and plotted, and the hysteresis loops are integrated by the set of FORTRAN programs NICOLET described in this report. 4 figures.

  17. RODDRP - A FORTRAN program for use in control rod calibration by the rod drop method

    International Nuclear Information System (INIS)

    Wilson, W.E.

    1972-01-01

    The different methods to measure reactivity which are applicable to control rod calibration are discussed. They include: 1) the positive period method, 2) the rod drop method, 3) the source-jerk method, 4) the rod oscillation method, and 5) the pulsed neutron method. The instrument setup used at WSU for rod drop measurements is presented. To speed up the analysis of power fall-off trace, a FORTRAN IV program called RODDRP was written to simultaneously solve the in-hour equation and relative neutron flux. The procedure for calculating the worth of the rod that produced the power trace is given. The reactivity for each time relative flux point is obtained. Conclusions about the status of the equipment are made

  18. A FORTRAN program for numerical solution of the Altarelli-Parisi equations by the Laguerre method

    International Nuclear Information System (INIS)

    Kumano, S.; Londergan, J.T.

    1992-01-01

    We review the Laguerre method for solving the Altarelli-Parisi equations. The Laguerre method allows one to expand quark/parton distributions and splitting functions in orthonormal polynomials. The desired quark distributions are themselves expanded in terms of evolution operators, and we derive the integrodifferential equations satisfied by the evolution operators. We give relevant equations for both flavor nonsinglet and singlet distributions, for both spin-independent and spin-dependent distributions. We discuss stability and accuracy of the results using this method. For intermediate values of Bjorken x (0.03< x<0.7), one can obtain accurate results with a modest number of Laguerre polynomials (N≅20); we discuss requirements for convergence also for the regions of large or small x. A FORTRAN program is provided which implements the Laguerre method; test results are given for both the spin-independent and spin-dependent cases. (orig.)

  19. Algorithm 589. SICEDR: a FORTRAN subroutine for improving the accuracy of computed matrix eigenvalues

    International Nuclear Information System (INIS)

    Dongarra, J.J.

    1982-01-01

    SICEDR is a FORTRAN subroutine for improving the accuracy of a computed real eigenvalue and improving or computing the associated eigenvector. It is first used to generate information during the determination of the eigenvalues by the Schur decomposition technique. In particular, the Schur decomposition technique results in an orthogonal matrix Q and an upper quasi-triangular matrix T, such that A = QTQ/sup T/. Matrices A, Q, and T and the approximate eigenvalue, say lambda, are then used in the improvement phase. SICEDR uses an iterative method similar to iterative improvement for linear systems to improve the accuracy of lambda and improve or compute the eigenvector x in O(n 2 ) work, where n is the order of the matrix A

  20. Compiler Construction Using Java, JavaCC, and Yacc

    CERN Document Server

    Dos Reis, Anthony J

    2012-01-01

    Broad in scope, involving theory, the application of that theory, and programming technology, compiler construction is a moving target, with constant advances in compiler technology taking place. Today, a renewed focus on do-it-yourself programming makes a quality textbook on compilers, that both students and instructors will enjoy using, of even more vital importance. This book covers every topic essential to learning compilers from the ground up and is accompanied by a powerful and flexible software package for evaluating projects, as well as several tutorials, well-defined projects, and tes

  1. Compilations and evaluations of nuclear structure and decay data

    International Nuclear Information System (INIS)

    Lorenz, A.

    1978-10-01

    This is the fourth issue of a report series on published and to-be-published compilations and evaluations of nuclear structure and decay (NSD) data. This compilation is published and distributed by the IAEA Nuclear Data Section every year. The material contained in this compilation is sorted according to eight subject categories: General compilations; basic isotopic properties; nuclear structure properties; nuclear decay processes, half-lives, energies and spectra; nuclear decay processes, gamma-rays; nuclear decay processes, fission products; nuclear decay processes (others); atomic processes

  2. Automatic Parallelization An Overview of Fundamental Compiler Techniques

    CERN Document Server

    Midkiff, Samuel P

    2012-01-01

    Compiling for parallelism is a longstanding topic of compiler research. This book describes the fundamental principles of compiling "regular" numerical programs for parallelism. We begin with an explanation of analyses that allow a compiler to understand the interaction of data reads and writes in different statements and loop iterations during program execution. These analyses include dependence analysis, use-def analysis and pointer analysis. Next, we describe how the results of these analyses are used to enable transformations that make loops more amenable to parallelization, and

  3. Programs in Fortran language for reporting the results of the analyses by ICP emission spectroscopy; Programas en lenguaje Fortran para la informacion de los resultados de los analisis efectuados mediante Espectroscopia Optica de emision con fuente de plasma

    Energy Technology Data Exchange (ETDEWEB)

    Roca, M

    1985-07-01

    Three programs, written in FORTRAN IV language, for reporting the results of the analyses by ICP emission spectroscopy from data stored in files on floppy disks have been developed. They are intended, respectively, for the analyses of: 1) waters, 2) granites and slates, and 3) different kinds of geological materials. (Author) 8 refs.

  4. SMMP v. 3.0—Simulating proteins and protein interactions in Python and Fortran

    Science.gov (United States)

    Meinke, Jan H.; Mohanty, Sandipan; Eisenmenger, Frank; Hansmann, Ulrich H. E.

    2008-03-01

    We describe a revised and updated version of the program package SMMP. SMMP is an open-source FORTRAN package for molecular simulation of proteins within the standard geometry model. It is designed as a simple and inexpensive tool for researchers and students to become familiar with protein simulation techniques. SMMP 3.0 sports a revised API increasing its flexibility, an implementation of the Lund force field, multi-molecule simulations, a parallel implementation of the energy function, Python bindings, and more. Program summaryTitle of program:SMMP Catalogue identifier:ADOJ_v3_0 Program summary URL:http://cpc.cs.qub.ac.uk/summaries/ADOJ_v3_0.html Program obtainable from: CPC Program Library, Queen's University of Belfast, N. Ireland Licensing provisions:Standard CPC licence, http://cpc.cs.qub.ac.uk/licence/licence.html Programming language used:FORTRAN, Python No. of lines in distributed program, including test data, etc.:52 105 No. of bytes in distributed program, including test data, etc.:599 150 Distribution format:tar.gz Computer:Platform independent Operating system:OS independent RAM:2 Mbytes Classification:3 Does the new version supersede the previous version?:Yes Nature of problem:Molecular mechanics computations and Monte Carlo simulation of proteins. Solution method:Utilizes ECEPP2/3, FLEX, and Lund potentials. Includes Monte Carlo simulation algorithms for canonical, as well as for generalized ensembles. Reasons for new version:API changes and increased functionality. Summary of revisions:Added Lund potential; parameters used in subroutines are now passed as arguments; multi-molecule simulations; parallelized energy calculation for ECEPP; Python bindings. Restrictions:The consumed CPU time increases with the size of protein molecule. Running time:Depends on the size of the simulated molecule.

  5. Compilation of data for radionuclide transport analysis

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    2001-11-01

    This report is one of the supporting documents to the updated safety assessment (project SAFE) of the Swedish repository for low and intermediate level waste, SFR 1. A number of calculation cases for quantitative analysis of radionuclide release and dose to man are defined based on the expected evolution of the repository, geosphere and biosphere in the Base Scenario and other scenarios selected. The data required by the selected near field, geosphere and biosphere models are given and the values selected for the calculations are compiled in tables. The main sources for the selected values of the migration parameters in the repository and geosphere models are the safety assessment of a deep repository for spent fuel, SR 97, and the preliminary safety assessment of a repository for long-lived, low- and intermediate level waste, SFL 3-5. For the biosphere models, both site-specific data and generic values of the parameters are selected. The applicability of the selected parameter values is discussed and the uncertainty is qualitatively addressed for data to the repository and geosphere migration models. Parameter values selected for these models are in general pessimistic in order not to underestimate the radionuclide release rates. It is judged that this approach combined with the selected calculation cases will illustrate the effects of uncertainties in processes and events that affects the evolution of the system as well as in quantitative data that describes this. The biosphere model allows for probabilistic calculations and the uncertainty in input data are quantified by giving minimum, maximum and mean values as well as the type of probability distribution function.

  6. Compilation of data for radionuclide transport analysis

    International Nuclear Information System (INIS)

    2001-11-01

    This report is one of the supporting documents to the updated safety assessment (project SAFE) of the Swedish repository for low and intermediate level waste, SFR 1. A number of calculation cases for quantitative analysis of radionuclide release and dose to man are defined based on the expected evolution of the repository, geosphere and biosphere in the Base Scenario and other scenarios selected. The data required by the selected near field, geosphere and biosphere models are given and the values selected for the calculations are compiled in tables. The main sources for the selected values of the migration parameters in the repository and geosphere models are the safety assessment of a deep repository for spent fuel, SR 97, and the preliminary safety assessment of a repository for long-lived, low- and intermediate level waste, SFL 3-5. For the biosphere models, both site-specific data and generic values of the parameters are selected. The applicability of the selected parameter values is discussed and the uncertainty is qualitatively addressed for data to the repository and geosphere migration models. Parameter values selected for these models are in general pessimistic in order not to underestimate the radionuclide release rates. It is judged that this approach combined with the selected calculation cases will illustrate the effects of uncertainties in processes and events that affects the evolution of the system as well as in quantitative data that describes this. The biosphere model allows for probabilistic calculations and the uncertainty in input data are quantified by giving minimum, maximum and mean values as well as the type of probability distribution function

  7. BGSUB and BGFIX: FORTRAN programs to correct Ge(Li) gamma-ray spectra for photopeaks from radionuclides in background

    International Nuclear Information System (INIS)

    Cutshall, N.H.; Larsen, I.L.

    1980-03-01

    Two FORTRAN programs which provide correction and error analysis for background photopeak contributions to low-level gamma-ray spectra are discussed. A peak-by-peak background subtraction approach is used instead of channel-by-channel correction. The accuracy of corrected results near background levels is substantially improved over uncorrected values

  8. Global compilation of marine varve records

    Science.gov (United States)

    Schimmelmann, Arndt; Lange, Carina B.; Schieber, Juergen; Francus, Pierre; Ojala, Antti E. K.; Zolitschka, Bernd

    2017-04-01

    Marine varves contain highly resolved records of geochemical and other paleoceanographic and paleoenvironmental proxies with annual to seasonal resolution. We present a global compilation of marine varved sedimentary records throughout the Holocene and Quaternary covering more than 50 sites worldwide. Marine varve deposition and preservation typically depend on environmental and sedimentological conditions, such as a sufficiently high sedimentation rate, severe depletion of dissolved oxygen in bottom water to exclude bioturbation by macrobenthos, and a seasonally varying sedimentary input to yield a recognizable rhythmic varve pattern. Additional oceanographic factors may include the strength and depth range of the Oxygen Minimum Zone (OMZ) and regional anthropogenic eutrophication. Modern to Quaternary marine varves are not only found in those parts of the open ocean that comply with these conditions, but also in fjords, embayments and estuaries with thermohaline density stratification, and nearshore 'marine lakes' with strong hydrologic connections to ocean water. Marine varves have also been postulated in pre-Quaternary rocks. In the case of non-evaporitic laminations in fine-grained ancient marine rocks, such as banded iron formations and black shales, laminations may not be varves but instead may have multiple alternative origins such as event beds or formation via bottom currents that transported and sorted silt-sized particles, clay floccules, and organic-mineral aggregates in the form of migrating bedload ripples. Modern marine ecosystems on continental shelves and slopes, in coastal zones and in estuaries are susceptible to stress by anthropogenic pressures, for example in the form of eutrophication, enhanced OMZs, and expanding ranges of oxygen-depletion in bottom waters. Sensitive laminated sites may play the important role of a 'canary in the coal mine' where monitoring the character and geographical extent of laminations/varves serves as a diagnostic

  9. Compiling the First Monolingual Lusoga Dictionary | Nabirye | Lexikos

    African Journals Online (AJOL)

    Another theory, the theory of modularity, was used to bridge the gap between the theory of meaning and the compilation process. The modular ... This article, then, gives an overall summary of all the steps involved in the compilation of the Eiwanika ly'Olusoga, i.e. the Monolingual Lusoga Dictionary. Keywords: lexicography ...

  10. Compilations and evaluations of nuclear structure and decay date

    International Nuclear Information System (INIS)

    Lorenz, A.

    The material contained in this compilation is sorted according to eight subject categories: 1. General Compilations; 2. Basic Isotopic Properties; 3. Nuclear Structure Properties; 4. Nuclear Decay Processes: Half-lives, Energies and Spectra; 5. Nuclear Decay Processes: Gamma-rays; 6. Nuclear Decay Processes: Fission Products; 7. Nuclear Decay Processes: (Others); 8. Atomic Processes

  11. Production compilation : A simple mechanism to model complex skill acquisition

    NARCIS (Netherlands)

    Taatgen, N.A.; Lee, F.J.

    2003-01-01

    In this article we describe production compilation, a mechanism for modeling skill acquisition. Production compilation has been developed within the ACT-Rational (ACT-R; J. R. Anderson, D. Bothell, M. D. Byrne, & C. Lebiere, 2002) cognitive architecture and consists of combining and specializing

  12. Parallel computation for biological sequence comparison: comparing a portable model to the native model for the Intel Hypercube.

    Science.gov (United States)

    Nadkarni, P M; Miller, P L

    1991-01-01

    A parallel program for inter-database sequence comparison was developed on the Intel Hypercube using two models of parallel programming. One version was built using machine-specific Hypercube parallel programming commands. The other version was built using Linda, a machine-independent parallel programming language. The two versions of the program provide a case study comparing these two approaches to parallelization in an important biological application area. Benchmark tests with both programs gave comparable results with a small number of processors. As the number of processors was increased, the Linda version was somewhat less efficient. The Linda version was also run without change on Network Linda, a virtual parallel machine running on a network of desktop workstations.

  13. Analysis of the Intel 386 and i486 microprocessors for the Space Station Freedom Data Management System

    Science.gov (United States)

    Liu, Yuan-Kwei

    1991-01-01

    The feasibility is analyzed of upgrading the Intel 386 microprocessor, which has been proposed as the baseline processor for the Space Station Freedom (SSF) Data Management System (DMS), to the more advanced i486 microprocessors. The items compared between the two processors include the instruction set architecture, power consumption, the MIL-STD-883C Class S (Space) qualification schedule, and performance. The advantages of the i486 over the 386 are (1) lower power consumption; and (2) higher floating point performance. The i486 on-chip cache does not have parity check or error detection and correction circuitry. The i486 with on-chip cache disabled, however, has lower integer performance than the 386 without cache, which is the current DMS design choice. Adding cache to the 386/386 DX memory hierachy appears to be the most beneficial change to the current DMS design at this time.

  14. La responsabilitat davant la intel·ligència artificial en el comerç electrònic

    OpenAIRE

    Martín i Palomas, Elisabet

    2015-01-01

    Es planteja en aquesta tesi l'efecte produït sobre la responsabilitat derivada de les accions realitzades autònomament per sistemes dotats d'intel·ligència artificial, sense la participació directa de cap ésser humà, en els temes més directament relacionats amb el comerç electrònic. Per a això s'analitzen les activitats realitzades per algunes de les principals empreses internacionals de comerç electrònic, com el grup nord-americà eBay o el grup xinès Alibaba. Després de desenvolupar els prin...

  15. Evaluating the networking characteristics of the Cray XC-40 Intel Knights Landing-based Cori supercomputer at NERSC

    Energy Technology Data Exchange (ETDEWEB)

    Doerfler, Douglas [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Austin, Brian [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Cook, Brandon [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Deslippe, Jack [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Kandalla, Krishna [Cray Inc, Bloomington, MN (United States); Mendygral, Peter [Cray Inc, Bloomington, MN (United States)

    2017-09-12

    There are many potential issues associated with deploying the Intel Xeon Phi™ (code named Knights Landing [KNL]) manycore processor in a large-scale supercomputer. One in particular is the ability to fully utilize the high-speed communications network, given that the serial performance of a Xeon Phi TM core is a fraction of a Xeon®core. In this paper, we take a look at the trade-offs associated with allocating enough cores to fully utilize the Aries high-speed network versus cores dedicated to computation, e.g., the trade-off between MPI and OpenMP. In addition, we evaluate new features of Cray MPI in support of KNL, such as internode optimizations. We also evaluate one-sided programming models such as Unified Parallel C. We quantify the impact of the above trade-offs and features using a suite of National Energy Research Scientific Computing Center applications.

  16. Compilation of kinetic data for geochemical calculations

    International Nuclear Information System (INIS)

    Arthur, R.C.; Savage, D.; Sasamoto, Hiroshi; Shibata, Masahiro; Yui, Mikazu

    2000-01-01

    Kinetic data, including rate constants, reaction orders and activation energies, are compiled for 34 hydrolysis reactions involving feldspars, sheet silicates, zeolites, oxides, pyroxenes and amphiboles, and for similar reactions involving calcite and pyrite. The data are compatible with a rate law consistent with surface reaction control and transition-state theory, which is incorporated in the geochemical software package EQ3/6 and GWB. Kinetic data for the reactions noted above are strictly compatible with the transition-state rate law only under far-from-equilibrium conditions. It is possible that the data are conceptually consistent with this rate law under both far-from-equilibrium and near-to-equilibrium conditions, but this should be confirmed whenever possible through analysis of original experimental results. Due to limitations in the availability of kinetic data for mine-water reactions, and in order to simplify evaluations of geochemical models of groundwater evolution, it is convenient to assume local-equilibrium in such models whenever possible. To assess whether this assumption is reasonable, a modeling approach accounting for couple fluid flow and water-rock interaction is described that can be use to estimate spatial and temporal scale of local equilibrium. The approach is demonstrated for conditions involving groundwater flow in fractures at JNC's Kamaishi in-situ tests site, and is also used to estimate the travel time necessary for oxidizing surface waters to migrate to the level of a HLW repository in crystalline rock. The question of whether local equilibrium is a reasonable assumption must be addressed using an appropriate modeling approach. To be appropriate for conditions at the Kamaishi site using the modeling approach noted above, the fracture fill must closely approximate a porous mine, groundwater flow must be purely advective and diffusion of solutes across the fracture-host rock boundary must not occur. Moreover, the mineralogical and

  17. MATH77 - A LIBRARY OF MATHEMATICAL SUBPROGRAMS FOR FORTRAN 77, RELEASE 4.0

    Science.gov (United States)

    Lawson, C. L.

    1994-01-01

    MATH77 is a high quality library of ANSI FORTRAN 77 subprograms implementing contemporary algorithms for the basic computational processes of science and engineering. The portability of MATH77 meets the needs of present-day scientists and engineers who typically use a variety of computing environments. Release 4.0 of MATH77 contains 454 user-callable and 136 lower-level subprograms. Usage of the user-callable subprograms is described in 69 sections of the 416 page users' manual. The topics covered by MATH77 are indicated by the following list of chapter titles in the users' manual: Mathematical Functions, Pseudo-random Number Generation, Linear Systems of Equations and Linear Least Squares, Matrix Eigenvalues and Eigenvectors, Matrix Vector Utilities, Nonlinear Equation Solving, Curve Fitting, Table Look-Up and Interpolation, Definite Integrals (Quadrature), Ordinary Differential Equations, Minimization, Polynomial Rootfinding, Finite Fourier Transforms, Special Arithmetic , Sorting, Library Utilities, Character-based Graphics, and Statistics. Besides subprograms that are adaptations of public domain software, MATH77 contains a number of unique packages developed by the authors of MATH77. Instances of the latter type include (1) adaptive quadrature, allowing for exceptional generality in multidimensional cases, (2) the ordinary differential equations solver used in spacecraft trajectory computation for JPL missions, (3) univariate and multivariate table look-up and interpolation, allowing for "ragged" tables, and providing error estimates, and (4) univariate and multivariate derivative-propagation arithmetic. MATH77 release 4.0 is a subroutine library which has been carefully designed to be usable on any computer system that supports the full ANSI standard FORTRAN 77 language. It has been successfully implemented on a CRAY Y/MP computer running UNICOS, a UNISYS 1100 computer running EXEC 8, a DEC VAX series computer running VMS, a Sun4 series computer running Sun

  18. A compilation of energy costs of physical activities.

    Science.gov (United States)

    Vaz, Mario; Karaolis, Nadine; Draper, Alizon; Shetty, Prakash

    2005-10-01

    There were two objectives: first, to review the existing data on energy costs of specified activities in the light of the recommendations made by the Joint Food and Agriculture Organization/World Health Organization/United Nations University (FAO/WHO/UNU) Expert Consultation of 1985. Second, to compile existing data on the energy costs of physical activities for an updated annexure of the current Expert Consultation on Energy and Protein Requirements. Electronic and manual search of the literature (predominantly English) to obtain published data on the energy costs of physical activities. The majority of the data prior to 1955 were obtained using an earlier compilation of Passmore and Durnin. Energy costs were expressed as physical activity ratio (PAR); the energy cost of the activity divided by either the measured or predicted basal metabolic rate (BMR). The compilation provides PARs for an expanded range of activities that include general personal activities, transport, domestic chores, occupational activities, sports and other recreational activities for men and women, separately, where available. The present compilation is largely in agreement with the 1985 compilation, for activities that are common to both compilations. The present compilation has been based on the need to provide data on adults for a wide spectrum of human activity. There are, however, lacunae in the available data for many activities, between genders, across age groups and in various physiological states.

  19. RESEARCH AND PRACTICE OF THE NEWS MAP COMPILATION SERVICE

    Directory of Open Access Journals (Sweden)

    T. Zhao

    2018-04-01

    Full Text Available Based on the needs of the news media on the map, this paper researches on the news map compilation service, conducts demand research on the service of compiling news maps, designs and compiles the public authority base map suitable for media publication, and constructs the news base map material library. It studies the compilation of domestic and international news maps with timeliness and strong pertinence and cross-regional characteristics, constructs the hot news thematic gallery and news map customization services, conducts research on types of news maps, establish closer liaison and cooperation methods with news media, and guides news media to use correct maps. Through the practice of the news map compilation service, this paper lists two cases of news map preparation services used by different media, compares and analyses cases, summarizes the research situation of news map compilation service, and at the same time puts forward outstanding problems and development suggestions in the service of news map compilation service.

  20. Writing Compilers and Interpreters A Software Engineering Approach

    CERN Document Server

    Mak, Ronald

    2011-01-01

    Long-awaited revision to a unique guide that covers both compilers and interpreters Revised, updated, and now focusing on Java instead of C++, this long-awaited, latest edition of this popular book teaches programmers and software engineering students how to write compilers and interpreters using Java. You?ll write compilers and interpreters as case studies, generating general assembly code for a Java Virtual Machine that takes advantage of the Java Collections Framework to shorten and simplify the code. In addition, coverage includes Java Collections Framework, UML modeling, object-oriented p

  1. Compiler design handbook optimizations and machine code generation

    CERN Document Server

    Srikant, YN

    2003-01-01

    The widespread use of object-oriented languages and Internet security concerns are just the beginning. Add embedded systems, multiple memory banks, highly pipelined units operating in parallel, and a host of other advances and it becomes clear that current and future computer architectures pose immense challenges to compiler designers-challenges that already exceed the capabilities of traditional compilation techniques. The Compiler Design Handbook: Optimizations and Machine Code Generation is designed to help you meet those challenges. Written by top researchers and designers from around the

  2. Thermal Hydraulic Fortran Program for Steady State Calculations of Plate Type Fuel Research Reactors

    International Nuclear Information System (INIS)

    Khedr, H.

    2008-01-01

    The safety assessment of Research and Power Reactors is a continuous process over their life and that requires verified and validated codes. Power Reactor codes all over the world are well established and qualified against a real measuring data and qualified experimental facilities. These codes are usually sophisticated, require special skills and consume much more running time. On the other hand, most of the Research Reactor codes still requiring more data for validation and qualification. Therefore it is benefit for a regulatory body and the companies working in the area of Research Reactor assessment and design to have their own program that give them a quick judgment. The present paper introduces a simple one dimensional Fortran program called THDSN for steady state best estimate Thermal Hydraulic (TH) calculations of plate type fuel RRs. Beside calculating the fuel and coolant temperature distribution and pressure gradient in an average and hot channel the program calculates the safety limits and margins against the critical phenomena encountered in RR such as the burnout heat flux and the onset of flow instability. Well known TH correlations for calculating the safety parameters are used. THDSN program is verified by comparing its results for 2 and 10 MW benchmark reactors with that published in IAEA publications and good agreement is found. Also the program results are compared with those published for other programs such as PARET and TERMIC. An extension for this program is underway to cover the transient TH calculations

  3. A fortran program for elastic scattering of deuterons with an optical model containing tensorial potentials

    International Nuclear Information System (INIS)

    Raynal, J.

    1963-01-01

    The optical model has been applied with success to the elastic scattering of particles of spin 0 and 1/2 and to a lesser degree to that of deuterons. For particles of spin l/2, an LS coupling term is ordinarily used; this term is necessary to obtain a polarization; for deuterons, this coupling has been already introduced, but the possible forms of potentials are more numerous (in this case, scalar products of a second rank spin tensor with a tensor of the same rank in space or momentum can occur). These terms which may be necessary are primarily important for the tensor polarization. This problem is of particular interest at Saclay since a beam of polarized deuterons has become available. The FORTRAN program SPM 037 permits the study of the effect of tensorial potentials constructed from the distance of the deuteron from the target and its angular momentum with respect to it. This report should make possible the use and even the modification of the program. It consists of: a description of the problem and of the notation employed, a presentation of the methods adopted, an indication of the necessary data and how they should be introduced, and finally tables of symbols which are in equivalence or common statements: these tables must be considered when making any modification. (author) [fr

  4. Users manual for an expert system (HSPEXP) for calibration of the hydrological simulation program; Fortran

    Science.gov (United States)

    Lumb, A.M.; McCammon, R.B.; Kittle, J.L.

    1994-01-01

    Expert system software was developed to assist less experienced modelers with calibration of a watershed model and to facilitate the interaction between the modeler and the modeling process not provided by mathematical optimization. A prototype was developed with artificial intelligence software tools, a knowledge engineer, and two domain experts. The manual procedures used by the domain experts were identified and the prototype was then coded by the knowledge engineer. The expert system consists of a set of hierarchical rules designed to guide the calibration of the model through a systematic evaluation of model parameters. When the prototype was completed and tested, it was rewritten for portability and operational use and was named HSPEXP. The watershed model Hydrological Simulation Program--Fortran (HSPF) is used in the expert system. This report is the users manual for HSPEXP and contains a discussion of the concepts and detailed steps and examples for using the software. The system has been tested on watersheds in the States of Washington and Maryland, and the system correctly identified the model parameters to be adjusted and the adjustments led to improved calibration.

  5. PUBG; purex solvent extraction process model. [IBM3033; CDC CYBER175; FORTRAN IV

    Energy Technology Data Exchange (ETDEWEB)

    Geldard, J.F.; Beyerlein, A.L.

    PUBG is a chemical model of the Purex solvent extraction system, by which plutonium and uranium are recovered from spent nuclear fuel rods. The system comprises a number of mixer-settler banks. This discrete stage structure is the basis of the algorithms used in PUBG. The stages are connected to provide for countercurrent flow of the aqueous and organic phases. PUBG uses the common convention that has the aqueous phase enter at the lowest numbered stage and exit at the highest one; the organic phase flows oppositely. The volumes of the mixers are smaller than those of the settlers. The mixers generate a fine dispersion of one phase in the other. The high interfacial area is intended to provide for rapid mass transfer of the plutonium and uranium from one phase to the other. The separation of this dispersion back into the two phases occurs in the settlers. The species considered by PUBG are Hydrogen (1+), Plutonium (4+), Uranyl Oxide (2+), Plutonium (3+), Nitrate Anion, and reductant in the aqueous phase and Hydrogen (1+), Uranyl Oxide (2+), Plutonium (4+), and TBP (tri-n-butylphosphate) in the organic phase. The reductant used in the Purex process is either Uranium (4+) or HAN (hydroxylamine nitrate).IBM3033;CDC CYBER175; FORTRAN IV; OS/MVS or OS/MVT (IBM3033), NOS 1.3 (CDC CYBER175); The IBM3033 version requires 150K bytes of memory for execution; 62,000 (octal) words are required by the CDC CYBER175 version..

  6. FORTRAN routines for calculating water thermodynamic properties for use in transient thermal-hydraulics codes

    International Nuclear Information System (INIS)

    Green, C.

    1979-12-01

    A set of FORTRAN subroutines is described for calculating water thermodynamic properties. These were written for use in a transient thermal-hydraulics program, where speed of execution is paramount. The choice of which subroutines to optimise depends on the primary variables in the thermal-hydraulics code. In this particular case the subroutine which has been optimised is the one which calculates pressure and specific enthalpy given the specific volume and the specific internal energy. Another two subroutines are described which complete a self-consistent set. These calculate the specific volume and the temperature given the pressure and the specific enthalpy, and the specific enthalpy and the specific volume given the pressure and the temperature (or the quality). The accuracy is high near the saturation lines, typically less than 1% relative error, and decreases as the fluid becomes more subcooled in the liquid region or more superheated in the steam region. This behaviour is inherent in the method which uses quantities defined on the saturation lines and assumes that certain derivatives are constant for excursions away from these saturation lines. The accuracy and speed of the subroutines are discussed in detail in this report. (author)

  7. Les multituds intel·ligents com a generadores de dades massives : la intel·ligència col·lectiva al servei de la innovació social

    Directory of Open Access Journals (Sweden)

    Sanz, Sandra

    2015-06-01

    Full Text Available Les últimes dècades es registra un increment de mobilitzacions socials organitzades, intervingudes, narrades i coordinades a través de les TIC. Són mostra de multituds intel·ligents (smart mobs que s'aprofiten dels nous mitjans de comunicació per organitzar-se. Tant pel nombre de missatges intercanviats i generats com per les pròpies interaccions generades, aquestes multituds intel·ligents es converteixen en objecte de les dades massives. La seva anàlisi a partir de les possibilitats que brinda l'enginyeria de dades pot contribuir a detectar idees construïdes com també sabers compartits fruit de la intel·ligència col·lectiva. Aquest fet afavoriria la reutilització d'aquesta informació per incrementar el coneixement del col·lectiu i contribuir al desenvolupament de la innovació social. És per això que en aquest article s'assenyalen els interrogants i les limitacions que encara presenten aquestes anàlisis i es posa en relleu la necessitat d'aprofundir en el desenvolupament de nous mètodes i tècniques d'anàlisi.En las últimas décadas se registra un incremento de movilizaciones sociales organizadas, mediadas, narradas y coordinadas a través de TICs. Son muestra de smart mobs o multitudes inteligentes que se aprovechan de los nuevos medios de comunicación para organizarse. Tanto por el número de mensajes intercambiados y generados como por las propias interacciones generadas, estas multitudes inteligentes se convierten en objeto del big data. Su análisis a partir de las posibilidades que brinda la ingeniería de datos puede contribuir a detectar ideas construidas así como saberes compartidos fruto de la inteligencia colectiva. Ello favorecería la reutilización de esta información para incrementar el conocimiento del colectivo y contribuir al desarrollo de la innovación social. Es por ello que en este artículo se señalan los interrogantes y limitaciones que todavía presentan estos análisis y se pone de relieve la

  8. Fusing a Transformation Language with an Open Compiler

    NARCIS (Netherlands)

    Kalleberg, K.T.; Visser, E.

    2007-01-01

    Program transformation systems provide powerful analysis and transformation frameworks as well as concise languages for language processing, but instantiating them for every subject language is an arduous task, most often resulting in halfcompleted frontends. Compilers provide mature frontends with

  9. Digital compilation bedrock geologic map of the Warren quadrangle, Vermont

    Data.gov (United States)

    Vermont Center for Geographic Information — Digital Data from VG95-4A Walsh, GJ, Haydock, S, Prewitt, J, Kraus, J, Lapp, E, O'Loughlin, S, and Stanley, RS, 1995, Digital compilation bedrock geologic map of the...

  10. AICPA allows low-cost options for compiled financial statements.

    Science.gov (United States)

    Reinstein, Alan; Luecke, Randall W

    2002-02-01

    The AICPA Accounting and Review Services Committee's (ARSC) SSARS No. 8, Amendment to Statement on Standards for Accounting and Review Services No. 1, Compilation and Review of Financial Statements, issued in October 2000, allows financial managers to provide plain-paper, compiled financial statements for the exclusive use of management. Such financial statements were disallowed in 1979 when the AICPA issued SSARS No. 1, Compilation and Review of Financial Statements. With the issuance of SSARS No. 8, financial managers can prepare plain-paper, compiled financial statements when third parties are not expected to rely on the financial statements, management acknowledges such restrictions in writing, and management acknowledges its primary responsibility for the adequacy of the financial statements.

  11. Specification and Compilation of Real-Time Stream Processing Applications

    NARCIS (Netherlands)

    Geuns, S.J.

    2015-01-01

    This thesis is concerned with the specification, compilation and corresponding temporal analysis of real-time stream processing applications that are executed on embedded multiprocessor systems. An example of such applications are software defined radio applications. These applications typically

  12. Gravity Data for Southwestern Alaska (1294 records compiled)

    Data.gov (United States)

    National Oceanic and Atmospheric Administration, Department of Commerce — The gravity station data (1294 records) were compiled by the Alaska Geological Survey and the U.S. Geological Survey, Menlo Park, California. This data base was...

  13. Borrowing and Dictionary Compilation: The Case of the Indigenous ...

    African Journals Online (AJOL)

    rbr

    Keywords: BORROWING, DICTIONARY COMPILATION, INDIGENOUS LANGUAGES,. LEXICON, MORPHEME, VOCABULARY, DEVELOPING LANGUAGES, LOAN WORDS, TER-. MINOLOGY, ETYMOLOGY, LEXICOGRAPHY. Opsomming: Ontlening en woordeboeksamestelling: Die geval van in- heemse Suid-Afrikaanse ...

  14. Digital compilation bedrock geologic map of the Milton quadrangle, Vermont

    Data.gov (United States)

    Vermont Center for Geographic Information — Digital Data from VG95-8A Dorsey, R, Doolan, B, Agnew, PC, Carter, CM, Rosencrantz, EJ, and Stanley, RS, 1995, Digital compilation bedrock geologic map of the Milton...

  15. Digital compilation bedrock geologic map of the Lincoln quadrangle, Vermont

    Data.gov (United States)

    Vermont Center for Geographic Information — Digital Data from VG95-5A Stanley, R, DelloRusso, V, Haydock, S, Lapp, E, O'Loughlin, S, Prewitt, J,and Tauvers, PR, 1995, Digital compilation bedrock geologic map...

  16. Source list of nuclear data bibliographies, compilations, and evaluations

    International Nuclear Information System (INIS)

    Burrows, T.W.; Holden, N.E.

    1978-10-01

    To aid the user of nuclear data, many specialized bibliographies, compilations, and evaluations have been published. This document is an attempt to bring together a list of such publications with an indication of their availability and cost

  17. Solidify, An LLVM pass to compile LLVM IR into Solidity

    Energy Technology Data Exchange (ETDEWEB)

    2017-07-12

    The software currently compiles LLVM IR into Solidity (Ethereum’s dominant programming language) using LLVM’s pass library. Specifically, his compiler allows us to convert an arbitrary DSL into Solidity. We focus specifically on converting Domain Specific Languages into Solidity due to their ease of use, and provable properties. By creating a toolchain to compile lightweight domain-specific languages into Ethereum's dominant language, Solidity, we allow non-specialists to effectively develop safe and useful smart contracts. For example lawyers from a certain firm can have a proprietary DSL that codifies basic laws safely converted to Solidity to be securely executed on the blockchain. In another example, a simple provenance tracking language can be compiled and securely executed on the blockchain.

  18. ASSIST - a package of Fortran routines for handling input under specified syntax rules and for management of data structures

    International Nuclear Information System (INIS)

    Sinclair, J.E.

    1991-02-01

    The ASSIST package (A Structured Storage and Input Syntax Tool) provides for Fortran programs a means for handling data structures more general than those provided by the Fortran language, and for obtaining input to the program from a file or terminal according to specified syntax rules. The syntax-controlled input can be interactive, with automatic generation of prompts, and dialogue to correct any input errors. The range of syntax rules possible is sufficient to handle lists of numbers and character strings, keywords, commands with optional clauses, and many kinds of variable-format constructions, such as algebraic expressions. ASSIST was developed for use in two large programs for the analysis of safety of radioactive waste disposal facilities, but it should prove useful for a wide variety of applications. (author)

  19. Data compilation for particle-impact desorption, 2

    International Nuclear Information System (INIS)

    Oshiyama, Takashi; Nagai, Siro; Ozawa, Kunio; Takeutchi, Fujio.

    1985-07-01

    The particle impact desorption is one of the elementary processes of hydrogen recycling in controlled thermonuclear fusion reactors. We have surveyed the literature concerning the ion impact desorption and photon stimulated desorption published through the end of 1984 and compiled the data on the desorption cross sections and yields with the aid of a computer. This report presents the results of the compilation in graphs and tables as functions of incident energy, surface temperature and surface coverage. (author)

  20. DLVM: A modern compiler infrastructure for deep learning systems

    OpenAIRE

    Wei, Richard; Schwartz, Lane; Adve, Vikram

    2017-01-01

    Deep learning software demands reliability and performance. However, many of the existing deep learning frameworks are software libraries that act as an unsafe DSL in Python and a computation graph interpreter. We present DLVM, a design and implementation of a compiler infrastructure with a linear algebra intermediate representation, algorithmic differentiation by adjoint code generation, domain-specific optimizations and a code generator targeting GPU via LLVM. Designed as a modern compiler ...

  1. Compilation and analysis of Escherichia coli promoter DNA sequences.

    OpenAIRE

    Hawley, D K; McClure, W R

    1983-01-01

    The DNA sequence of 168 promoter regions (-50 to +10) for Escherichia coli RNA polymerase were compiled. The complete listing was divided into two groups depending upon whether or not the promoter had been defined by genetic (promoter mutations) or biochemical (5' end determination) criteria. A consensus promoter sequence based on homologies among 112 well-defined promoters was determined that was in substantial agreement with previous compilations. In addition, we have tabulated 98 promoter ...

  2. A Fortran program for the numerical integration of the one-dimensional Schroedinger equation using exponential and Bessel fitting methods

    International Nuclear Information System (INIS)

    Cash, J.R.; Raptis, A.D.; Simos, T.E.

    1990-01-01

    An efficient algorithm is described for the accurate numerical integration of the one-dimensional Schroedinger equation. This algorithm uses a high-order, variable step Runge-Kutta like method in the region where the potential term dominates, and an exponential or Bessel fitted method in the asymptotic region. This approach can be used to compute scattering phase shifts in an efficient and reliable manner. A Fortran program which implements this algorithm is provided and some test results are given. (orig.)

  3. Application of Pfortran and Co-Array Fortran in the Parallelization of the GROMOS96 Molecular Dynamics Module

    Directory of Open Access Journals (Sweden)

    Piotr Bała

    2001-01-01

    Full Text Available After at least a decade of parallel tool development, parallelization of scientific applications remains a significant undertaking. Typically parallelization is a specialized activity supported only partially by the programming tool set, with the programmer involved with parallel issues in addition to sequential ones. The details of concern range from algorithm design down to low-level data movement details. The aim of parallel programming tools is to automate the latter without sacrificing performance and portability, allowing the programmer to focus on algorithm specification and development. We present our use of two similar parallelization tools, Pfortran and Cray's Co-Array Fortran, in the parallelization of the GROMOS96 molecular dynamics module. Our parallelization started from the GROMOS96 distribution's shared-memory implementation of the replicated algorithm, but used little of that existing parallel structure. Consequently, our parallelization was close to starting with the sequential version. We found the intuitive extensions to Pfortran and Co-Array Fortran helpful in the rapid parallelization of the project. We present performance figures for both the Pfortran and Co-Array Fortran parallelizations showing linear speedup within the range expected by these parallelization methods.

  4. Fiscal 2000 report on advanced parallelized compiler technology. Outlines; 2000 nendo advanced heiretsuka compiler gijutsu hokokusho (Gaiyo hen)

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    2001-03-01

    Research and development was carried out concerning the automatic parallelized compiler technology which improves on the practical performance, cost/performance ratio, and ease of operation of the multiprocessor system now used for constructing supercomputers and expected to provide a fundamental architecture for microprocessors for the 21st century. Efforts were made to develop an automatic multigrain parallelization technology for extracting multigrain as parallelized from a program and for making full use of the same and a parallelizing tuning technology for accelerating parallelization by feeding back to the compiler the dynamic information and user knowledge to be acquired during execution. Moreover, a benchmark program was selected and studies were made to set execution rules and evaluation indexes for the establishment of technologies for subjectively evaluating the performance of parallelizing compilers for the existing commercial parallel processing computers, which was achieved through the implementation and evaluation of the 'Advanced parallelizing compiler technology research and development project.' (NEDO)

  5. BADGER v1.0: A Fortran equation of state library

    Science.gov (United States)

    Heltemes, T. A.; Moses, G. A.

    2012-12-01

    The BADGER equation of state library was developed to enable inertial confinement fusion plasma codes to more accurately model plasmas in the high-density, low-temperature regime. The code had the capability to calculate 1- and 2-T plasmas using the Thomas-Fermi model and an individual electron accounting model. Ion equation of state data can be calculated using an ideal gas model or via a quotidian equation of state with scaled binding energies. Electron equation of state data can be calculated via the ideal gas model or with an adaptation of the screened hydrogenic model with ℓ-splitting. The ionization and equation of state calculations can be done in local thermodynamic equilibrium or in a non-LTE mode using a variant of the Busquet equivalent temperature method. The code was written as a stand-alone Fortran library for ease of implementation by external codes. EOS results for aluminum are presented that show good agreement with the SESAME library and ionization calculations show good agreement with the FLYCHK code. Program summaryProgram title: BADGERLIB v1.0 Catalogue identifier: AEND_v1_0 Program summary URL:http://cpc.cs.qub.ac.uk/summaries/AEND_v1_0.html Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland Licensing provisions: Standard CPC licence, http://cpc.cs.qub.ac.uk/licence/licence.html No. of lines in distributed program, including test data, etc.: 41 480 No. of bytes in distributed program, including test data, etc.: 2 904 451 Distribution format: tar.gz Programming language: Fortran 90. Computer: 32- or 64-bit PC, or Mac. Operating system: Windows, Linux, MacOS X. RAM: 249.496 kB plus 195.630 kB per isotope record in memory Classification: 19.1, 19.7. Nature of problem: Equation of State (EOS) calculations are necessary for the accurate simulation of high energy density plasmas. Historically, most EOS codes used in these simulations have relied on an ideal gas model. This model is inadequate for low

  6. KUEBEL. A Fortran program for computation of cooling-agent-distribution within reactor fuel-elements

    International Nuclear Information System (INIS)

    Inhoven, H.

    1984-12-01

    KUEBEL is a Fortran-program for computation of cooling-agent-distribution within reactor fuel-elements or -zones of theirs. They may be assembled of max. 40 cooling-channels with laminar up to turbulent type of flow (respecting Reynolds' coefficients up to 2.0E+06) at equal pressure loss. Flow-velocity, dynamic flow-, contraction- and friction-losses will be calculated for each channel and for the total zone. Other computations will present mean heat-up of cooling-agent, mean outlet-temperature of the core, boiling-temperature and absolute pressure at flow-outlet. All characteristic coolant-values, including the factor of safety for flow-instability of the most-loaded cooling gap are computed by 'KUEBEL' too. Absolute pressure at flow-outlet or is-factor may be defined as dependent or independent variables of the program alternatively. In latter case 3 variations of solution will be available: Adapted flow of cooling-agent, inlet-temperature of the core and thermal power. All calculations can be done alternatively with variation of parameters: flow of cooling-agent, inlet-temperature of the core and thermal power, which are managed by the program itself. 'KUEBEL' is able to distinguish light- and heavy-water coolant, flow-direction of coolant and fuel elements with parallel, rectangular, respectively concentric, cylindrical shape of their gaps. Required material specifics are generated by the program. Segments of fuel elements or constructively unconnected gaps can also be computed by means of interposition of S.C. 'phantom channels'. (orig.) [de

  7. RAGBEEF: a FORTRAN IV implementation of a time-dependent model for radionuclide contamination of beef

    Energy Technology Data Exchange (ETDEWEB)

    Pleasant, J C; McDowell-Boyer, L M; Killough, G G

    1982-06-01

    RAGBEEF is a FORTRAN IV program that calculates radionuclide concentrations in beef as a result of ingestion of contaminated feeds, pasture, and pasture soil by beef cattle. The model implemented by RAGBEEF is dynamic in nature, allowing the user to consider age- and season-dependent aspects of beef cattle management in estimating concentrations in beef. It serves as an auxiliary code to RAGTIME, previously documented by the authors, which calculates radionuclide concentrations in agricultural crops in a dynamic manner, but evaluates concentrations in beef for steady-state conditions only. The time-dependent concentrations in feeds, pasture, and pasture soil generated by RAGTIME are used as input to the RAGBEEF code. RAGBEEF, as presently implemented, calculates radionuclide concentrations in the muscle of age-based cohorts in a beef cattle herd. Concentrations in the milk of lactating cows are also calculated, but are assumed age-dependent as in RAGTIME. Radionuclide concentrations in beef and milk are described in RAGBEEF by a system of ordinary linear differential equations in which the transfer rate of radioactivity between compartments is proportional to the inventory of radioactivity in the source compartment. This system is solved by use of the GEAR package for solution of systems of ordinary differential equations. The accuracy of this solution is monitored at various check points by comparison with explicit solutions of Bateman-type equations. This report describes the age- and season-dependent considerations making up the RAGBEEF model, as well as presenting the equations which describe the model and a documentation of the associated computer code. Listings of the RAGBEEF and updated RAGTIME codes are provided in appendices, as are the results of a sample run of RAGBEEF and a description of recent modifications to RAGTIME.

  8. RAGBEEF: a FORTRAN IV implementation of a time-dependent model for radionuclide contamination of beef

    International Nuclear Information System (INIS)

    Pleasant, J.C.; McDowell-Boyer, L.M; Killough, G.G.

    1982-06-01

    RAGBEEF is a FORTRAN IV program that calculates radionuclide concentrations in beef as a result of ingestion of contaminated feeds, pasture, and pasture soil by beef cattle. The model implemented by RAGBEEF is dynamic in nature, allowing the user to consider age- and season-dependent aspects of beef cattle management in estimating concentrations in beef. It serves as an auxiliary code to RAGTIME, previously documented by the authors, which calculates radionuclide concentrations in agricultural crops in a dynamic manner, but evaluates concentrations in beef for steady-state conditions only. The time-dependent concentrations in feeds, pasture, and pasture soil generated by RAGTIME are used as input to the RAGBEEF code. RAGBEEF, as presently implemented, calculates radionuclide concentrations in the muscle of age-based cohorts in a beef cattle herd. Concentrations in the milk of lactating cows are also calculated, but are assumed age-dependent as in RAGTIME. Radionuclide concentrations in beef and milk are described in RAGBEEF by a system of ordinary linear differential equations in which the transfer rate of radioactivity between compartments is proportional to the inventory of radioactivity in the source compartment. This system is solved by use of the GEAR package for solution of systems of ordinary differential equations. The accuracy of this solution is monitored at various check points by comparison with explicit solutions of Bateman-type equations. This report describes the age- and season-dependent considerations making up the RAGBEEF model, as well as presenting the equations which describe the model and a documentation of the associated computer code. Listings of the RAGBEEF and updated RAGTIME codes are provided in appendices, as are the results of a sample run of RAGBEEF and a description of recent modifications to RAGTIME

  9. An efficient MPI/OpenMP parallelization of the Hartree–Fock–Roothaan method for the first generation of Intel® Xeon Phi™ processor architecture

    International Nuclear Information System (INIS)

    Mironov, Vladimir; Moskovsky, Alexander; D’Mello, Michael; Alexeev, Yuri

    2017-01-01

    The Hartree-Fock (HF) method in the quantum chemistry package GAMESS represents one of the most irregular algorithms in computation today. Major steps in the calculation are the irregular computation of electron repulsion integrals (ERIs) and the building of the Fock matrix. These are the central components of the main Self Consistent Field (SCF) loop, the key hotspot in Electronic Structure (ES) codes. By threading the MPI ranks in the official release of the GAMESS code, we not only speed up the main SCF loop (4x to 6x for large systems), but also achieve a significant (>2x) reduction in the overall memory footprint. These improvements are a direct consequence of memory access optimizations within the MPI ranks. We benchmark our implementation against the official release of the GAMESS code on the Intel R Xeon PhiTM supercomputer. Here, scaling numbers are reported on up to 7,680 cores on Intel Xeon Phi coprocessors.

  10. Experience with low-power x86 processors (Atom) for HEP usage. An initial analysis of the Intel® dual core Atom™ N330 processor

    CERN Document Server

    Balazs, G; Nowak, A; CERN. Geneva. IT Department

    2009-01-01

    In this paper we compare a system based on an Intel Atom N330 low-power processor to a modern Intel Xeon® dual-socket server using CERN IT’s standard criteria for comparing price-performance and performance per watt. The Xeon server corresponds to what is typically acquired as servers in the LHC Computing Grid. The comparisons used public pricing information from November 2008. After the introduction in section 1, section 2 describes the hardware and software setup. In section 3 we describe the power measurements we did and in section 4 we discuss the throughput performance results. In section 5 we summarize our initial conclusions. We then go on to describe our long term vision and possible future scenarios for using such low-power processors, and finally we list interesting development directions.

  11. ELT-scale Adaptive Optics real-time control with thes Intel Xeon Phi Many Integrated Core Architecture

    Science.gov (United States)

    Jenkins, David R.; Basden, Alastair; Myers, Richard M.

    2018-05-01

    We propose a solution to the increased computational demands of Extremely Large Telescope (ELT) scale adaptive optics (AO) real-time control with the Intel Xeon Phi Knights Landing (KNL) Many Integrated Core (MIC) Architecture. The computational demands of an AO real-time controller (RTC) scale with the fourth power of telescope diameter and so the next generation ELTs require orders of magnitude more processing power for the RTC pipeline than existing systems. The Xeon Phi contains a large number (≥64) of low power x86 CPU cores and high bandwidth memory integrated into a single socketed server CPU package. The increased parallelism and memory bandwidth are crucial to providing the performance for reconstructing wavefronts with the required precision for ELT scale AO. Here, we demonstrate that the Xeon Phi KNL is capable of performing ELT scale single conjugate AO real-time control computation at over 1.0kHz with less than 20μs RMS jitter. We have also shown that with a wavefront sensor camera attached the KNL can process the real-time control loop at up to 966Hz, the maximum frame-rate of the camera, with jitter remaining below 20μs RMS. Future studies will involve exploring the use of a cluster of Xeon Phis for the real-time control of the MCAO and MOAO regimes of AO. We find that the Xeon Phi is highly suitable for ELT AO real time control.

  12. Heat dissipation for the Intel Core i5 processor using multiwalled carbon-nanotube-based ethylene glycol

    International Nuclear Information System (INIS)

    Thang, Bui Hung; Trinh, Pham Van; Quang, Le Dinh; Khoi, Phan Hong; Minh, Phan Ngoc; Huong, Nguyen Thi

    2014-01-01

    Carbon nanotubes (CNTs) are some of the most valuable materials with high thermal conductivity. The thermal conductivity of individual multiwalled carbon nanotubes (MWCNTs) grown by using chemical vapor deposition is 600 ± 100 Wm -1 K -1 compared with the thermal conductivity 419 Wm -1 K -1 of Ag. Carbon-nanotube-based liquids - a new class of nanomaterials, have shown many interesting properties and distinctive features offering potential in heat dissipation applications for electronic devices, such as computer microprocessor, high power LED, etc. In this work, a multiwalled carbon-nanotube-based liquid was made of well-dispersed hydroxyl-functional multiwalled carbon nanotubes (MWCNT-OH) in ethylene glycol (EG)/distilled water (DW) solutions by using Tween-80 surfactant and an ultrasonication method. The concentration of MWCNT-OH in EG/DW solutions ranged from 0.1 to 1.2 gram/liter. The dispersion of the MWCNT-OH-based EG/DW solutions was evaluated by using a Zeta-Sizer analyzer. The MWCNT-OH-based EG/DW solutions were used as coolants in the liquid cooling system for the Intel Core i5 processor. The thermal dissipation efficiency and the thermal response of the system were evaluated by directly measuring the temperature of the micro-processor using the Core Temp software and the temperature sensors built inside the micro-processor. The results confirmed the advantages of CNTs in thermal dissipation systems for computer processors and other high-power electronic devices.

  13. Quantum Chemical Calculations Using Accelerators: Migrating Matrix Operations to the NVIDIA Kepler GPU and the Intel Xeon Phi.

    Science.gov (United States)

    Leang, Sarom S; Rendell, Alistair P; Gordon, Mark S

    2014-03-11

    Increasingly, modern computer systems comprise a multicore general-purpose processor augmented with a number of special purpose devices or accelerators connected via an external interface such as a PCI bus. The NVIDIA Kepler Graphical Processing Unit (GPU) and the Intel Phi are two examples of such accelerators. Accelerators offer peak performances that can be well above those of the host processor. How to exploit this heterogeneous environment for legacy application codes is not, however, straightforward. This paper considers how matrix operations in typical quantum chemical calculations can be migrated to the GPU and Phi systems. Double precision general matrix multiply operations are endemic in electronic structure calculations, especially methods that include electron correlation, such as density functional theory, second order perturbation theory, and coupled cluster theory. The use of approaches that automatically determine whether to use the host or an accelerator, based on problem size, is explored, with computations that are occurring on the accelerator and/or the host. For data-transfers over PCI-e, the GPU provides the best overall performance for data sizes up to 4096 MB with consistent upload and download rates between 5-5.6 GB/s and 5.4-6.3 GB/s, respectively. The GPU outperforms the Phi for both square and nonsquare matrix multiplications.

  14. Stereoscopic-3D display design: a new paradigm with Intel Adaptive Stable Image Technology [IA-SIT

    Science.gov (United States)

    Jain, Sunil

    2012-03-01

    Stereoscopic-3D (S3D) proliferation on personal computers (PC) is mired by several technical and business challenges: a) viewing discomfort due to cross-talk amongst stereo images; b) high system cost; and c) restricted content availability. Users expect S3D visual quality to be better than, or at least equal to, what they are used to enjoying on 2D in terms of resolution, pixel density, color, and interactivity. Intel Adaptive Stable Image Technology (IA-SIT) is a foundational technology, successfully developed to resolve S3D system design challenges and deliver high quality 3D visualization at PC price points. Optimizations in display driver, panel timing firmware, backlight hardware, eyewear optical stack, and synch mechanism combined can help accomplish this goal. Agnostic to refresh rate, IA-SIT will scale with shrinking of display transistors and improvements in liquid crystal and LED materials. Industry could profusely benefit from the following calls to action:- 1) Adopt 'IA-SIT S3D Mode' in panel specs (via VESA) to help panel makers monetize S3D; 2) Adopt 'IA-SIT Eyewear Universal Optical Stack' and algorithm (via CEA) to help PC peripheral makers develop stylish glasses; 3) Adopt 'IA-SIT Real Time Profile' for sub-100uS latency control (via BT Sig) to extend BT into S3D; and 4) Adopt 'IA-SIT Architecture' for Monitors and TVs to monetize via PC attach.

  15. CAPS OpenACC Compilers: Performance and Portability

    CERN Multimedia

    CERN. Geneva

    2013-01-01

    The announcement late 2011 of the new OpenACC directive-based programming standard supported by CAPS, CRAY and PGI compilers has open up the door to more scientific applications that can be ported on many-core systems. Following a porting methodology, this talk will first review the principles of programming with OpenACC and then the advanced features available in the CAPS compilers to further optimize OpenACC applications: library integration, tuning directives with auto-tune mechanisms to build applications adaptive to different GPUs. CAPS compilers use hardware vendors' backends such as NVIDIA CUDA and OpenCL making them the only OpenACC compilers supporting various many-core architectures. About the speaker Stéphane Bihan is co-funder and currently Director of Sales and Marketing at CAPS enterprise. He has held several R&D positions in companies such as ARC international plc in London, Canon Research Center France, ACE compiler experts in Amsterdam and the INRIA r...

  16. The Katydid system for compiling KEE applications to Ada

    Science.gov (United States)

    Filman, Robert E.; Bock, Conrad; Feldman, Roy

    1990-01-01

    Components of a system known as Katydid are developed in an effort to compile knowledge-based systems developed in a multimechanism integrated environment (KEE) to Ada. The Katydid core is an Ada library supporting KEE object functionality, and the other elements include a rule compiler, a LISP-to-Ada translator, and a knowledge-base dumper. Katydid employs translation mechanisms that convert LISP knowledge structures and rules to Ada and utilizes basic prototypes of a run-time KEE object-structure library module for Ada. Preliminary results include the semiautomatic compilation of portions of a simple expert system to run in an Ada environment with the described algorithms. It is suggested that Ada can be employed for AI programming and implementation, and the Katydid system is being developed to include concurrency and synchronization mechanisms.

  17. Compilation of current high energy physics experiments - Sept. 1978

    Energy Technology Data Exchange (ETDEWEB)

    Addis, L.; Odian, A.; Row, G. M.; Ward, C. E. W.; Wanderer, P.; Armenteros, R.; Joos, P.; Groves, T. H.; Oyanagi, Y.; Arnison, G. T. J.; Antipov, Yu; Barinov, N.

    1978-09-01

    This compilation of current high-energy physics experiments is a collaborative effort of the Berkeley Particle Data Group, the SLAC library, and the nine participating laboratories: Argonne (ANL), Brookhaven (BNL), CERN, DESY, Fermilab (FNAL), KEK, Rutherford (RHEL), Serpukhov (SERP), and SLAC. Nominally, the compilation includes summaries of all high-energy physics experiments at the above laboratories that were approved (and not subsequently withdrawn) before about June 1978, and had not completed taking of data by 1 January 1975. The experimental summaries are supplemented with three indexes to the compilation, several vocabulary lists giving names or abbreviations used, and a short summary of the beams at each of the laboratories (except Rutherford). The summaries themselves are included on microfiche. (RWR)

  18. Charged particle induced thermonuclear reaction rates: a compilation for astrophysics

    International Nuclear Information System (INIS)

    Grama, C.

    1999-01-01

    We report on the results of the European network NACRE (Nuclear Astrophysics Compilation of REaction rates). The principal reason for setting up the NACRE network has been the necessity of building up a well-documented and detailed compilation of rates for charged-particle induced reactions on stable targets up to Si and on unstable nuclei of special significance in astrophysics. This work is meant to supersede the only existing compilation of reaction rates issued by Fowler and collaborators. The main goal of NACRE network was the transparency in the procedure of calculating the rates. More specifically this compilation aims at: 1. updating the experimental and theoretical data; 2. distinctly identifying the sources of the data used in rate calculation; 3. evaluating the uncertainties and errors; 4. providing numerically integrated reaction rates; 5. providing reverse reaction rates and analytical approximations of the adopted rates. The cross section data and/or resonance parameters for a total of 86 charged-particle induced reactions are given and the corresponding reaction rates are calculated and given in tabular form. Uncertainties are analyzed and realistic upper and lower bounds of the rates are determined. The compilation is concerned with the reaction rates that are large enough for the target lifetimes shorter than the age of the Universe, taken equal to 15 x 10 9 y. The reaction rates are provided for temperatures lower than T = 10 10 K. In parallel with the rate compilation a cross section data base has been created and located at the site http://pntpm.ulb.ac.be/nacre..htm. (authors)

  19. TSPP - A Collection of FORTRAN Programs for Processing and Manipulating Time Series

    Science.gov (United States)

    Boore, David M.

    2008-01-01

    This report lists a number of FORTRAN programs that I have developed over the years for processing and manipulating strong-motion accelerograms. The collection is titled TSPP, which stands for Time Series Processing Programs. I have excluded 'strong-motion accelerograms' from the title, however, as the boundary between 'strong' and 'weak' motion has become blurred with the advent of broadband sensors and high-dynamic range dataloggers, and many of the programs can be used with any evenly spaced time series, not just acceleration time series. This version of the report is relatively brief, consisting primarily of an annotated list of the programs, with two examples of processing, and a few comments on usage. I do not include a parameter-by-parameter guide to the programs. Future versions might include more examples of processing, illustrating the various parameter choices in the programs. Although these programs have been used by the U.S. Geological Survey, no warranty, expressed or implied, is made by the USGS as to the accuracy or functioning of the programs and related program material, nor shall the fact of distribution constitute any such warranty, and no responsibility is assumed by the USGS in connection therewith. The programs are distributed on an 'as is' basis, with no warranty of support from me. These programs were written for my use and are being publically distributed in the hope that others might find them as useful as I have. I would, however, appreciate being informed about bugs, and I always welcome suggestions for improvements to the codes. Please note that I have made little effort to optimize the coding of the programs or to include a user-friendly interface (many of the programs in this collection have been included in the software usdp (Utility Software for Data Processing), being developed by Akkar et al. (personal communication, 2008); usdp includes a graphical user interface). Speed of execution has been sacrificed in favor of a code that

  20. FORTRAN Code for Glandular Dose Calculation in Mammography Using Sobol-Wu Parameters

    Directory of Open Access Journals (Sweden)

    Mowlavi A A

    2007-07-01

    Full Text Available Background: Accurate computation of the radiation dose to the breast is essential to mammography. Various the thicknesses of breast, the composition of the breast tissue and other variables affect the optimal breast dose. Furthermore, the glandular fraction, which refers to the composition of the breasts, as partitioned between radiation-sensitive glandular tissue and the adipose tissue, also has an effect on this calculation. Fatty or fibrous breasts would have a lower value for the glandular fraction than dense breasts. Breast tissue composed of half glandular and half adipose tissue would have a glandular fraction in between that of fatty and dense breasts. Therefore, the use of a computational code for average glandular dose calculation in mammography is a more effective means of estimating the dose of radiation, and is accurate and fast. Methods: In the present work, the Sobol-Wu beam quality parameters are used to write a FORTRAN code for glandular dose calculation in molybdenum anode-molybdenum filter (Mo-Mo, molybdenum anode-rhodium filter (Mo-Rh and rhodium anode-rhodium filter (Rh-Rh target-filter combinations in mammograms. The input parameters of code are: tube voltage in kV, half-value layer (HVL of the incident x-ray spectrum in mm, breast thickness in cm (d, and glandular tissue fraction (g. Results: The average glandular dose (AGD variation against the voltage of the mammogram X-ray tube for d = 4 cm, HVL = 0.34 mm Al and g=0.5 for the three filter-target combinations, as well as its variation against the glandular fraction of breast tissue for kV=25, HVL=0.34, and d=4 cm has been calculated. The results related to the average glandular absorbed dose variation against HVL for kV = 28, d=4 cm and g= 0.6 are also presented. The results of this code are in good agreement with those previously reported in the literature. Conclusion: The code developed in this study calculates the glandular dose quickly, and it is complete and

  1. Compiling gate networks on an Ising quantum computer

    International Nuclear Information System (INIS)

    Bowdrey, M.D.; Jones, J.A.; Knill, E.; Laflamme, R.

    2005-01-01

    Here we describe a simple mechanical procedure for compiling a quantum gate network into the natural gates (pulses and delays) for an Ising quantum computer. The aim is not necessarily to generate the most efficient pulse sequence, but rather to develop an efficient compilation algorithm that can be easily implemented in large spin systems. The key observation is that it is not always necessary to refocus all the undesired couplings in a spin system. Instead, the coupling evolution can simply be tracked and then corrected at some later time. Although described within the language of NMR, the algorithm is applicable to any design of quantum computer based on Ising couplings

  2. Compilation of current high-energy-physics experiments

    International Nuclear Information System (INIS)

    Wohl, C.G.; Kelly, R.L.; Armstrong, F.E.

    1980-04-01

    This is the third edition of a compilation of current high energy physics experiments. It is a collaborative effort of the Berkeley Particle Data Group, the SLAC library, and ten participating laboratories: Argonne (ANL), Brookhaven (BNL), CERN, DESY, Fermilab (FNAL), the Institute for Nuclear Study, Tokyo (INS), KEK, Rutherford (RHEL), Serpukhov (SERP), and SLAC. The compilation includes summaries of all high energy physics experiments at the above laboratories that (1) were approved (and not subsequently withdrawn) before about January 1980, and (2) had not completed taking of data by 1 January 1976

  3. Acceleration of Cherenkov angle reconstruction with the new Intel Xeon/FPGA compute platform for the particle identification in the LHCb Upgrade

    Science.gov (United States)

    Faerber, Christian

    2017-10-01

    The LHCb experiment at the LHC will upgrade its detector by 2018/2019 to a ‘triggerless’ readout scheme, where all the readout electronics and several sub-detector parts will be replaced. The new readout electronics will be able to readout the detector at 40 MHz. This increases the data bandwidth from the detector down to the Event Filter farm to 40 TBit/s, which also has to be processed to select the interesting proton-proton collision for later storage. The architecture of such a computing farm, which can process this amount of data as efficiently as possible, is a challenging task and several compute accelerator technologies are being considered for use inside the new Event Filter farm. In the high performance computing sector more and more FPGA compute accelerators are used to improve the compute performance and reduce the power consumption (e.g. in the Microsoft Catapult project and Bing search engine). Also for the LHCb upgrade the usage of an experimental FPGA accelerated computing platform in the Event Building or in the Event Filter farm is being considered and therefore tested. This platform from Intel hosts a general CPU and a high performance FPGA linked via a high speed link which is for this platform a QPI link. On the FPGA an accelerator is implemented. The used system is a two socket platform from Intel with a Xeon CPU and an FPGA. The FPGA has cache-coherent memory access to the main memory of the server and can collaborate with the CPU. As a first step, a computing intensive algorithm to reconstruct Cherenkov angles for the LHCb RICH particle identification was successfully ported in Verilog to the Intel Xeon/FPGA platform and accelerated by a factor of 35. The same algorithm was ported to the Intel Xeon/FPGA platform with OpenCL. The implementation work and the performance will be compared. Also another FPGA accelerator the Nallatech 385 PCIe accelerator with the same Stratix V FPGA were tested for performance. The results show that the Intel

  4. Emulation of MS DOS Operational System on the Autonomous Crate-Controller with I8086 microprocessor

    International Nuclear Information System (INIS)

    Hons, Z.; Cizek, P.; Streit, V.

    1988-01-01

    KM-DOS operating system for CAMAC autonomous crate-controller based on Intel 8086/8087 microprocessor connected with Pravec-16 IBM PC is described. The KM-DOS system fully emulates the MS DOS environment on the CAMAC controller. Thus ASSEMBLER, FORTRAN, C and PASCAL programs compiled and linked on IBM PC and compatible can be run on the CAMAC controller and parall work of both computers is enabled

  5. METHUSELAH II - A Fortran program and nuclear data library for the physics assessment of liquid-moderated reactors

    International Nuclear Information System (INIS)

    Brinkworth, M.J.; Griffiths, J.A.

    1966-03-01

    METHUSELAH II is a Fortran program with a nuclear data library, used to calculate cell reactivity and burn-up in liquid-moderated reactors. It has been developed from METHUSELAH I by revising the nuclear data library, and by introducing into the program improvements relating to nuclear data, improvements in efficiency and accuracy, and additional facilities which include a neutron balance edit, specialised outputs, fuel cycling, and fuel costing. These developments are described and information is given on the coding and usage of versions of METHUSELAH II for the IBM 7030 (STRETCH), IBM 7090, and KDF9 computers. (author)

  6. HAUFES : a FORTRAN code for the calculation of compound nuclear cross-sections by Hauser-Feshbach theory

    International Nuclear Information System (INIS)

    Viyogi, Y.P.; Ganguly, N.K.

    1975-01-01

    The FORTRAN code described in the report has been developed for the BESM-6 computer with a view to calculate the cross-section of reactions proceeding via the formation of compound nucleus for all open two-body reaction channels using Hauser-Feshbach theory with Moldauer's correction for the fluctuation of level widths. The code can also be used to analyse data from 'crystal blocking' experiments to obtain nuclear level densities. The report describes the input-output specifications along with a short account of the algorithm of the program. (author)

  7. DATA-ENTRY-3: some observations and pragmatics of a structured design. [In FORTRAN for PDP-11/10

    Energy Technology Data Exchange (ETDEWEB)

    Sparks, D.

    1977-08-01

    The FORTRAN program DATA-ENTRY-3 was developed from the COBOL program DATA-ENTRY-1, which solves a large class of elementary data-capture, data-formating, and data-editing problems of managerial accounting. Most of the work involved finding methods to make DATA-ENTRY-3, which is written for a small-machine environment (PDP-11/10 under the RT-11 operating system), logically equivalent to DATA-ENTRY-1, which is written for a large-machine environment (CDC 6600 under a time-sharing operating system). This report explains how structured programing helped, and briefly describes the function of each subroutine.

  8. EDDY - a FORTRAN program to extract significant features from eddy-current test data - the basis of the CANSCAN system

    International Nuclear Information System (INIS)

    Jarvis, R.G.; Cranston, R.J.

    1982-09-01

    The FORTRAN program EDDY is designed to analyse data: from eddy-current scans of steam generator tubes. It is written in modular form, for future development, and it uses signal-recognition techniques that the authors developed in the profilometry of irradiated fuel elements. During a scan, significant signals are detected and extracted for immediate attention or more detailed analysis later. A version of the program was used in the CANSCAN system 'for automated eddy-current in-service inspection of nuclear steam generator tubing'

  9. Compilation of historical information of 300 Area facilities and activities

    International Nuclear Information System (INIS)

    Gerber, M.S.

    1992-12-01

    This document is a compilation of historical information of the 300 Area activities and facilities since the beginning. The 300 Area is shown as it looked in 1945, and also a more recent (1985) look at the 300 Area is provided

  10. Statistical Compilation of the ICT Sector and Policy Analysis | CRDI ...

    International Development Research Centre (IDRC) Digital Library (Canada)

    Statistical Compilation of the ICT Sector and Policy Analysis. As the presence and influence of information and communication technologies (ICTs) continues to widen and deepen, so too does its impact on economic development. However, much work needs to be done before the linkages between economic development ...

  11. The Compilation of a Shona Children's Dictionary: Challenges and Solutions

    Directory of Open Access Journals (Sweden)

    Peniah Mabaso

    2011-10-01

    Full Text Available Abstract: This article outlines the challenges encountered by the African Languages Research Institute (ALRI team members in the compilation of the monolingual Shona Children's Dictionary. The focus is mainly on the problems met in headword selection. Solutions by the team members when dealing with these problems are also presented.

  12. Compilation of a global inventory of emissions of nitrous oxide

    NARCIS (Netherlands)

    Bouwman, A.F.

    1995-01-01

    A global inventory with 1°x1° resolution was compiled of emissions of nitrous oxide (N 2 O) to the atmosphere, including emissions from soils under natural vegetation, fertilized agricultural land, grasslands and animal excreta, biomass burning, forest clearing,

  13. A Journey from Interpreters to Compilers and Virtual Machines

    DEFF Research Database (Denmark)

    Danvy, Olivier

    2003-01-01

    We review a simple sequence of steps to stage a programming-language interpreter into a compiler and virtual machine. We illustrate the applicability of this derivation with a number of existing virtual machines, mostly for functional languages. We then outline its relevance for todays language...

  14. Indexed compilation of experimental high energy physics literature. [Synopsis

    Energy Technology Data Exchange (ETDEWEB)

    Horne, C.P.; Yost, G.P.; Rittenberg, A.

    1978-09-01

    An indexed compilation of approximately 12,000 experimental high energy physics documents is presented. A synopsis of each document is presented, and the documenta are indexed according to beam/target/momentum, reaction/momentum, final-state-particle, particle/particle-property, accelerator/detector, and (for a limited set of the documents) experiment. No data are given.

  15. Final report: Compiled MPI. Cost-Effective Exascale Application Development

    Energy Technology Data Exchange (ETDEWEB)

    Gropp, William Douglas [Univ. of Illinois, Urbana-Champaign, IL (United States)

    2015-12-21

    This is the final report on Compiled MPI: Cost-Effective Exascale Application Development, and summarizes the results under this project. The project investigated runtime enviroments that improve the performance of MPI (Message-Passing Interface) programs; work at Illinois in the last period of this project looked at optimizing data access optimizations expressed with MPI datatypes.

  16. Not mere lexicographic cosmetics: the compilation and structural ...

    African Journals Online (AJOL)

    This article offers a brief overview of the compilation of the Ndebele music terms dictionary, Isichazamazwi SezoMculo (henceforth the ISM), paying particular attention to its struc-tural features. It emphasises that the reference needs of the users as well as their reference skills should be given a determining role in all ...

  17. Compilation of the nuclear codes available in CTA

    International Nuclear Information System (INIS)

    D'Oliveira, A.B.; Moura Neto, C. de; Amorim, E.S. do; Ferreira, W.J.

    1979-07-01

    The present work is a compilation of some nuclear codes available in the Divisao de Estudos Avancados of the Instituto de Atividades Espaciais, (EAV/IAE/CTA). The codes are organized as the classification given by the Argonne National Laboratory. In each code are given: author, institution of origin, abstract, programming language and existent bibliography. (Author) [pt

  18. Compilation of historical information of 300 Area facilities and activities

    Energy Technology Data Exchange (ETDEWEB)

    Gerber, M.S.

    1992-12-01

    This document is a compilation of historical information of the 300 Area activities and facilities since the beginning. The 300 Area is shown as it looked in 1945, and also a more recent (1985) look at the 300 Area is provided.

  19. DJ Prinsloo and BP Sathekge (compil- ers — revised edition).

    African Journals Online (AJOL)

    The compilers of this new edition have successfully highlighted the important additions to the last edition of the dictionary. It is important to inform pro- spective users about new information. It is also a marketing strategy to announce the contents of a new product in both the preface and at the back of the cover page, as is the ...

  20. Statistical Compilation of the ICT Sector and Policy Analysis | IDRC ...

    International Development Research Centre (IDRC) Digital Library (Canada)

    Statistical Compilation of the ICT Sector and Policy Analysis. As the presence and influence of information and communication technologies (ICTs) continues to widen and deepen, so too does its impact on economic development. However, much work needs to be done before the linkages between economic development ...